A Robust and High-Dimensional Clustering Algorithm Based on Feature Weight and Entropy

Since the Fuzzy C-Means algorithm is incapable of considering the influence of different features and exponential constraints on high-dimensional and complex data, a fuzzy clustering algorithm based on non-Euclidean distance combining feature weights and entropy weights is proposed. The proposed algorithm is based on the Fuzzy C-Means soft clustering algorithm to deal with high-dimensional and complex data. The objective function of the new algorithm is modified with the help of two different entropy terms and a non-Euclidean way of computing the distance. The distance calculation formula enhances the efficiency of extracting the contribution of different features. The first entropy term helps to minimize the clusters’ dispersion and maximize the negative entropy to control the clustering process, which also promotes the association between the samples. The second entropy term helps to control the weights of features since different features have different weights in the clustering process. Experiments on real-world datasets indicate that the proposed algorithm gives better clustering results than other algorithms. The experiments demonstrate the proposed algorithm’s robustness by analyzing the parameters’ sensitivity and comparing the computational distance formulas. In summary, the improved algorithm improves classification performance under noisy interference and high-dimensional datasets, increases computational efficiency, performs well in real-world high-dimensional datasets, and encourages the development of robust noise-resistant high-dimensional fuzzy clustering algorithms.


Introduction
In the field of machine learning and data mining, research on clustering has always attracted extensive attention [1][2][3][4]. Clustering methods are primarily categorized as partitionbased, density-based, and hierarchical clustering methods [5][6][7]. Partition-based clustering methods classify different samples on the basis of features. In general, density-based clustering methods classify samples on the basis of the number of samples at each location. The samples are considered related (contained and included) and classified by the hierarchy of different samples in hierarchical clustering methods. Hierarchical clustering is suitable for small datasets but not large datasets. Partition-based clustering is one of the most used clustering methods for large dataset. Density-based clustering divides data points into high-density and low-density regions and is suitable for large datasets. In recent years, three-way soft clustering has been a new direction in clustering research, which attributes samples in the positive region as belonging to the cluster, samples in the boundary region as partially belonging to the cluster, and samples in the negative region as not belonging to the cluster [8][9][10]. Clustering divides the objects in the set into different clusters based on a certain standard (distance) to increase the intra-cluster similarity and reduce the inter-cluster similarity. Clustering algorithms are divided into soft and hard clustering on the basis of clustering research. Hard clustering specifies that a sample can be divided into only one cluster, while soft clustering allows a sample to be divided into different clusters. For two primary clustering methods, K-means and FCM clustering methods are the most widely used [11,12]. These two algorithms have driven the research on clustering, and much research has been performed to improve them. However, noise and initial cluster centers often influence the clustering results. Moreover, the clustering result is often significantly reduced when the clustering algorithm faces high-dimensional and complex data. A good clustering algorithm is supposed to be high-performance, robust, and scalable.
To resolve these difficulties, several improved clustering algorithms have been proposed [13][14][15][16]. The K-means method divides each sample into a specific cluster. However, these clusters often have overlapping and fuzzy divisions in practical applications. To solve the uncertain data objects, soft clustering algorithms are introduced. Bezdek [17] introduced the fuzzy set theory into K-means and proposed the FCM algorithm, which uses the membership function to verify the membership relationship between objects and clusters. Furthermore, it has demonstrated efficient performance in different applications [18,19]. However, the degree of the membership function does not always correspond to the cluster to which it belongs. The FCM algorithm is partition-based. The advantage of the partitionbased clustering method is that the convergence is fast, and the disadvantage is that it requires that the number of clusters can be reasonably estimated and that the choice of initial centers and noise can have a significant impact on the clustering results. In these two traditional methods, all features are given the same weight and are easily influenced by noise [20,21]. These methods are also susceptible to random initial cluster centers, and poor initialization will likely result in local optimal solution generation [22,23]. In the face of high-dimensional and complex data, high-dimensional data are usually very sparse in space, and the sample size always seems very small compared with the dimensionality of the space. The features of clusters are not obvious. Traditional clustering algorithms cannot guarantee robustness; hence, it is extremely important to fully use the features' properties.
To address these issues, it is critical to remember that different weights should be assigned to different features. In previous studies, the weights corresponding to the features were assigned in one of two ways. The first method is to assign weights to the features on a global scale. Throughout the process, a specific feature is given only one weight. The second method is to assign features local weights, which means that features in a dataset have different weights in different clusters. Numerous studies have demonstrated that the second method outperforms global weighting [24]. Therefore, the SCAD algorithm considers different feature weights in the different clusters and simultaneously clusters the data with feature discrimination [25]. However, the conventional FCM algorithms, including improved algorithms, constrain related variables through exponential regularization, which may lead to consistent results and low precision when dealing with sparse and noisy data (the denominator is 0). Entropy is proposed to go for better-constrained features in the clustering process. Entropy-Weighting K-Means is particularly prominent method among local weight-based and entropy-weight algorithms [26]. EWKM pioneered the form of entropy weight and applied it to membership to better constrain the objective function. The algorithm introduced the form of entropy weight in response to this defect to incentivize more features that contribute to identifying clusters. The newly introduced approach is more efficient in dealing with various data types, particularly high-dimensional sparse data. Entropy weights can be used to assign relevant feature weights to a cluster, whereas fuzzy partitions can help identify the best partitions. This method, however, ignores the limitations of Euclidean distance and may result in inconsistent classification [24,27]. Moreover, the weights of this algorithm are very dependent on the initial clustering centers and are sensitive to changes related to the centers. If the initial cluster center varies, the algorithm will not be robust and will not converge. Therefore, improving the clustering algorithm should focus on these aspects. To address the effect of noise and features with different weights in different clusters, certainly improved algorithms for classification criteria based on non-Euclidean distances have been proposed [28][29][30]. Some algorithms using entropy weights have also been proposed in unsupervised clustering studies. Both Entropy K-Means [31] and U-K-Means [32] algorithms use feature weights and entropy weights in unsupervised learning. However, the distance formula used in their objective functions is Euclidean distance, which does not perform well in the face of high-dimensional and noisy data. This can eventually lead to noise and irrelevant features influencing the overall clustering results. In recent years, there has been extensive interest in research on feature weights and entropy weights for fuzzy clustering. However, these algorithms frequently only consider the Euclidean distance and do not constrain features or use fuzzy partitioning [33][34][35].
This paper proposes an advanced FCM algorithm to handle high-dimensional sparse and noisy data to solve the flaws mentioned above. The proposed algorithm's new findings use entropy control features and partitions, and adding non-Euclidean distance division eliminates noise interference. Moreover, the entropy weight is introduced to the membership and weight variables to enhance its efficiency in processing different datasets. In improving clustering in the face of high-dimensional and complex data, the intra-cluster dispersion is minimized, and the negative weight entropy is maximized to motivate features to help identify clusters. Furthermore, the proposed method updates the membership degrees of samples in different clusters and the feature weights in different clusters during each iteration so that the objective function converges rapidly. This algorithm avoids this issue by efficiently handling high-dimensional data and noise with feature weights and non-Euclidean distance formulas. Furthermore, entropy weights are used to constrain variables, which can be more advantageous than exponential constraints in some cases. Extensive experimental results on real datasets suggest that the proposed algorithm performs better in clustering. The proposed algorithms on high-dimensional and complex datasets also exhibit high performance. Furthermore, the algorithm exhibits robustness and stability in the experiment.
The remainder of the paper comprises the following sections: Section 2 introduces several of the most classic clustering methods. In Section 3, the proposed algorithm, its convergence proof, and its complexity analysis are provided. The performance of the proposed algorithm and other clustering algorithms is compared and evaluated using different clustering metrics in Section 4. Lastly, Section 5 presents a summary of the paper.

The K-Means Algorithms
For a given dataset X N×M , N denotes the number of samples, M denotes the number of features, K represents the number of clusters, x ij denotes the jth feature in the ith sample, c ij denotes the cluster center of the jth feature in the ith cluster, and u ij denotes whether the ith sample belongs to the kth cluster. (x ij − c lj ) 2 is the Euclidean distance between the ith sample and the jth cluster at the lth feature. The objective function can be defined as subject to The K-means can be minimized by continuously iterating the following equations: where z lj denotes the value of the jth feature in the lth clustering center, and t denotes all the clustering centers in the clustering process. The proposed K-means algorithm has significantly contributed to the study of clustering. However, when noise is encountered, the noise dimension is also considered in the results while computing the distance between the samples, leading to decreased clustering accuracy. Furthermore, as the dimensionality of the dataset increases, so does the number of outlier points and the dispersion between samples, affecting the clustering center and changes the clustering results. K-means clustering is often inefficient when dealing with high-dimensional sparse and noisy data. Moreover, it is susceptible to the initial clustering center.

The Weighting K-Means Algorithms
WK-Means generalizes the K-means and introduces a new algorithm to solve the noise data [30] skillfully. It considers that different features must have different weights so that the effect of the noise dimension can be ignored as much as possible when evaluating the distance between samples. Therefore, noise far from the cluster centroid is given a smaller weight and has less influence on the cluster centroid. Therefore, the clustering accuracy is improved. The objective function of WK-Means is as follows: The WK-means can be minimized by continuously iterating the following equations: where In Equation (5), w ij indicates whether the ith sample belongs to the kth cluster, w j denotes the weight of the jth feature, and β represents a fuzzy constant, usually taken as 2.
Although the WK-means algorithm was the first to use weights, targeting global weights performs poorly in some data, and hard clustering may result in results being fixed. It can be deduced that the weight difference between features is not always visible when the dataset is high-dimensional. The FCM algorithm based on fuzzy ideas significantly reduces the singularity of clustering and has several advantages in terms of high-latitude data.

Fuzzy C-Means Algorithm
As the most representative soft clustering method, FCM defines a new method for dividing the clusters. Each sample has a different degree of membership for each cluster, and its cluster assignment is determined by the degree of membership. The objective function can be defined as The FCM is minimized by continuously iterating the following equations: where u ij represents the degree of the ith data's membership to the jth cluster, and m is the number of fuzzy factors. Although the FCM algorithm is the most representative soft clustering algorithm, it still has numerous flaws and shortcomings. For example, the algorithm only counts the "nearest" neighbor samples, and the number of samples in a category is very large. The algorithm prioritizes that sample, which affects the clustering results. Moreover, fuzzy factors as exponential forms often do not constrain the features well, leading to the fact that it tends to perform much worse when combined with the Euclidean distance formula.

The Proposed Algorithm
In this section, a novel Fuzzy-C-Means-based entropy weighting algorithm is proposed. Motivated by the shortcomings of traditional Fuzzy-C-Means clustering algorithm, a new algorithm is presented that includes local feature weighting, the use of entropy weights acting on features, and the degree of membership to improve clustering's sensitivity and accuracy to random class centers. Furthermore, because the Euclidean distance is susceptible to noise and outliers [29], a non-Euclidean distance is introduced. The new distance formula makes the algorithm more robust and fully uses the dataset's features to get the clustering result. The objective function can be defined as In Equation (15), U = [u ij ] is an N by K matrix, in which u ij denotes the degree of the ith sample's membership to the center of the jth cluster; C = [c jl ] is a K by M matrix, where c jl represents the center of the jth cluster and is defined by u ij . Moreover, W = [w jl ] is a K by M matrix, where w jl denotes the weight of the lth feature in the jth cluster. U is the membership matrix of each sample to cluster, containing N samples and K clusters. C is the feature center matrix of each cluster, containing K clusters and M features. W is the feature weight matrix of each sample to cluster, containing K clusters and M features. The term (1 − exp(−δ l (x il − c jl ) 2 ) denotes a non-Euclidean distance metric between the ith sample and the jth cluster in the lth feature and is defined as follows: where δ l denotes the inverse of the variance of the lth feature of the data, subject to Minimizing F in Equation (15) with the constraints forms a class of constrained nonlinear optimization problems. The usual approach toward optimization of F is to introduce partial optimization for U, C, and W. First, U and C are fixed, and the reduced F is minimized with respect to W. Next, U and W are fixed, and the reduced F is minimized with respect to C. Followed by this, W and C are fixed, and the reduced F is minimized to solve U. After the results are calculated iteratively, the solution can be drawn.
The Lagrange multiplier technique is used to solve the following unconstrained minimization problem: where α and β are Lagrange multipliers. By setting the gradient of F with respect to α, β, u ij , c jl , and w jl to zero, From Equations (21) and (22), where From Equations (19) and (23), where it follows that exp( α λ which can be substituted into Equation (23), From Equations (20) and (24), where it follows that which can be substituted into Equation (24), For the clustering centers, where it follows that which gives It is evident that Equation (35) is independent of the parameters λ and γ. The interdependence of both terms promotes the detection of a better partition during the clustering process. The proposed algorithm minimizes Equation (15), using Equations (29), (32), and (35).

Parameter Selection
The values of λ and γ are essential for the proposed algorithm since they affect the significance of the second and third terms in Equation (15) relative to the first term. Initially, λ plays two roles in the clustering process; when λ is large, this results in a smaller value of u ij in Equation (29) and, thus, the second term has a more significant influence to minimize Equation (15). Therefore, it tries to assign more than one sample cluster to make the second term more negative while clustering. The membership entropy value becomes larger when the membership value u ij of a sample to all clusters is equal. After the position of the samples is fixed, all the clustering centers move to the same position for an enormous entropy value. Second, when λ is large, this results in an immense value of u ij in Equation (29). Therefore, the first term plays a key role in minimizing Equation (15). The local feature weights are controlled by γ. Since γ is positive, the value of w jl is inversely proportional to ∑ M i=1 D ij . A smaller value of this term results in a larger w jl . If γ is large, the third parameter controls the partitions, and all feature weights are assigned 1/M in different clusters.
Assuming that F t denotes the value of Equation (15) after the run, F t+1 denotes the value after the next completion. The proposed algorithm is summarized below (Algorithm 1).

Algorithm 1. Proposed Clustering Algorithm.
Input: Dataset X N×M , the number of clusters K, the values of parameters λ and γ. Randomly set K cluster centers, generate a set of initial weights, set t = 0, the maximum iteration is MAX, and local minimum value = F t+1 − F t . Output: U = [u ij ] N×K . Repeat 1: Compute the non-Euclidean distance matrix D N×M . 2: Update the partition matrix U N×K using Equation (29). 3: Update the weight matrix W K×M using Equation (32). 4: Update the cluster center matrix C K×M using Equation (35). 5: Until the objective function is less than or equal to the local minimum or reaches the maximum iterations.

Convergence Analysis
It is important to note that the proposed algorithm will converge in the iterations. It can be observed that different partition U occurs only once during the algorithm process. Therefore, it is assumed that U i = U j where i = j. It must be noted that given U t . The minimizer W t should be calculated. For U i and U j , the minimizers are W i and W j , respectively. It is evident that W i = W j since U i = U j . Using U i , W i , U j , and W j , the minimizers C i and C j can be calculated, respectively. It is obvious that C i = C j . Therefore, the following equation can be obtained: However, the function F(·, ·, ·) monotonically decreases. Therefore, different partition U occurs only once during the algorithm process. u ij in Equation (29) can be calculated after taking the derivative of Equation (18) and setting it equal to zero. Therefore, u ij can be minimal or maximal. If the second partial derivative of Equation (18) is positive, it can prove that u ij defined by Equation (29) is a local minimum of Equation (18). The second partial derivative of Equation (18) with respect to u ij is Since u ij > 0, λ > 0, it can prove that Equation (37) is positive. Therefore, K in Equation (29) is a local minimum of Equation (18). The proposed algorithm converges in a finite number of iterations.

Computational Complexity
As shown in Table 1, the computational complexity of the proposed algorithm is high when compared to other clustering algorithms. However, by adding new terms, the clustering performance is improved. The computational complexity of the algorithm is based on four update processes: updating the distance matrix D, cluster center matrix C, membership matrix U, and weight matrix W. The computational complexity of each process is equal to NKM. Each process executes independently; hence, the total computational complexity is 4NKM. Furthermore, N denotes the number of samples, K denotes the number of clusters, and M denotes the number of features. Each iteration updates D, C, U, and W using Equations (29), (32), and (35) and finally classifies the different samples according to the matrix U.

Method
Computational Complexity

Experiments
In the experimental section, to test the performance of the proposed algorithm, the performance of other clustering algorithms is evaluated and compared on real-world datasets. These algorithms are the standard K-Means [11], the standard FCM [12], WK-Means [30], RLWHCM [28], SCAD [25], EWK-Means [31], and UK-Means [32]. These algorithms have specific standard parameters. If the parameter values are uniform, the influence of the parameters can be removed to observe the clustering results. Therefore, various parameters were equalized to avoid inconsistency in algorithm performance. In the experiments, the maximum number of restarts was set to 100. For λ = 0.3 and γ = 1.4, the clustering centroids were randomly selected from the original datasets. In practical applications, choosing the appropriate threshold value is a crucial issue. If the threshold is too small, the algorithm may converge very slowly or not even converge; if the threshold is too large, the algorithm may stop prematurely, resulting in less accurate clustering results. Therefore, experiments and adjustments are needed to determine an appropriate threshold size. Considering that the threshold value is not the same for different datasets, to get the best performance, we reduced from 0.1 to 0.00001 by 10 times in each case to get the best clustering result. After testing, value = 10 −5 was set to get accurate clustering results for different datasets. Each algorithm was iterated 100 times, and the best result was recorded.
Seven real-world datasets from UCI [36] were used to assess the performance of the proposed approach and compare its results to other approaches. Furthermore, the text, face image, and biological datasets [37] highlight the proposed algorithm's performance in high-dimensional and noisy datasets. These datasets are mentioned in Table 2. The best clustered values for each dataset in Tables 3 and 4 are bolded.

Evaluation Indicators
The clustering accuracy is defined as In Equation (38), D l denotes the number of samples correctly classified into the lth cluster, and N represents the number of points in the dataset. A considerable value of ACC [38] suggests a better clustering performance.
Since the sample labels of the real-world dataset are known, RI [38] was used to evaluate the similarity between the clustering partitions and real partitions.
In Equation ( . In Equation (40), A and B are the two partitions in the dataset comprising clusters I and J, respectively. P(i) indicates the probability that a randomly selected sample is allocated to a cluster A i , P(i, j) represents the probability that a sample belongs to both clusters A i and B i , and H(A) is the entropy associated with all the probabilities P(i) in partition A [39]. A larger value of N MI leads to more consistent clustering results.

Clustering Results on Real-World Datasets
As demonstrated in Tables 3 and 4, the clustering performance of the proposed algorithm in terms of ACC, RI, and NMI was much better on different real-world datasets than other clustering methods. The clustering results suggest that the proposed algorithm significantly improved the clustering performance and provided the best results in most real-world and txt datasets. It should also be noted that the proposed algorithm's performance was not exceptional in the Wine dataset. After careful examination of this dataset, it was discovered that there were only three clusters in this dataset, and there was little variation in the values of the features. The value of a feature in different clusters may differ by as little as 0.1, which negatively impacts any clustering algorithm and leads to poor performance of entropy-weight terms and non-Euclidean distances. Most classical clustering algorithms had ACC and NMI values of nearly 0.5 in the high-dimensional datasets because the number of clusters was two, indicating that these algorithms failed due to high dimensionality and noise. The reason is that there are often many outliers in high-dimensional data, and the data distribution could be sparser. Traditional clustering algorithms have difficulty clustering these sparse data. As the dimensionality increases, the calculation based on distance makes the clustering centers progressively more complex and is affected by outliers leading to shifts. At the same time, because the different feature weights of different clusters are not considered, the feature of sparse high-dimensional data cannot be exploited, making the clustering results much less accurate. However, using the entropy constraint feature in the proposed algorithm improves the algorithm's performance in three performance metrics. On the other hand, the clustering algorithm with exponential constraints equalizes sample membership degrees, resulting in an inaccurate ACC value of 0.5. The results also show that the proposed algorithm has the advantage of assisting in the detection of noise and the main classification features in large datasets. The clustering results of the proposed algorithm performed better in the Face Image Dataset and Biological Dataset. This shows that, with random initialization of clustering centers, our algorithm was better and more stable in handling high and noisy datasets. From the clustering results, we can find that the K-Means algorithm performed better in the face dataset because it did not introduce the feature weights, which could lead to unsatisfactory results for high-dimensional data and when the number of clusters is large. Furthermore, this demonstrates that the algorithm can be used for classification in both supervised and unsupervised learning.
Numerous points justify the performance of the proposed algorithm. First, this algorithm has good performance even with terrible initial centers, while the other algorithms are severely sensitive to initialization. Second, the introduction of entropy weighting allows different features to be well added to the clustering process. Third, the non-Euclidean distance makes it possible to encounter noisy and sparse data in the calculation and does not affect the clustering results.
The value range of the parameters was discussed in detail in the previous section. To further examine the sensitivity of λ and γ, the algorithm's sensitivity was analyzed on the Iris and Zoo datasets. λ = 0.3 was fixed, increasing γ from 0 by 0.1 to 2 each time. The sensitivity of λ can be inferred from Figure 1. After that, γ = 0.8 was fixed, increasing λ from 0 by 0.1 to 2 each time. The sensitivity of γ can be observed in Figure 2. It can be observed from the figures that ACC, RI, and NMI did not fluctuate much when the two parameters were varied, which highlights the excellent performance and robustness of our algorithm. Higher values for the three metrics also directly reflect the proposed algorithm's efficiency and robustness. The ACC changes of the proposed algorithm were analyzed on the basis of the non-Euclidean distance and the Euclidean distance tested on the Iris and Zoo datasets, to better demonstrate this advancement of the proposed non-Euclidean distance in computing sample point distances. Figure 3 shows that clustering results based on non-Euclidean distances were more accurate than those based on Euclidean distances. In the non-Euclidean distance, the average ACC values were 0.95 and 0.93, respectively, while in the Euclidean distance, the average ACC values were 0.79 and 0.80, respectively. The new distance formula increased the ACC of the algorithm by 18%, which also indicates the advantage of the newly proposed non-Euclidean distance in dealing with high-dimensional sparse and noisy data. Furthermore, the ACC's variance of non-Euclidean distance was 0.025, while the ACC's variance of Euclidean distance was 0.074, indicating that the non-Euclidean distance makes the algorithm more robust. To better distinguish the algorithms in the experiment, Table 5 reflects the usage conditions associated with the compared algorithms. The experimental results show that the proposed algorithm had high clustering accuracy in most real-world datasets from various domains. The sensitivity analysis of the two parameters could be determined, proving that the algorithm is robust and performs well. Furthermore, when comparing the Euclidean distance and the non-Euclidean distance on the objective function, we can find that the non-Euclidean distance was more accurate and stable in clustering results.

Discussion of Noise
To demonstrate the performance of the proposed algorithm under the influence of noise, a new experiment was designed for the Iris dataset. In the Iris dataset, uniformly distributed data from 0 to 1 were randomly assigned as new noise features. Furthermore, to compare the effect with and without noise, we compared experiments for the original dataset and the dataset with noise. As shown in Figure 4, the noise only slightly affected the clustering results. This shows that the performance of the proposed algorithm remained good even though the noise affected the dataset. Moreover, as shown in Figure 5, the noise did not influence the assignment of feature weights. This confirms the high accuracy and stability of the algorithm.

Conclusions
This paper proposed a new algorithm for classifying high-dimensional and noisy data on the basis of non-Euclidean distance, combining feature weights and entropy weights. In this approach, two different entropy terms are added to the objective function, which helps better identify the clustering features of complex data. The performance was compared with state-of-the-art methods in terms of different clustering measures, reveal-

Conclusions
This paper proposed a new algorithm for classifying high-dimensional and noisy data on the basis of non-Euclidean distance, combining feature weights and entropy weights. In this approach, two different entropy terms are added to the objective function, which helps better identify the clustering features of complex data. The performance was compared with state-of-the-art methods in terms of different clustering measures, revealing that the proposed approach is a new clustering algorithm that can partition data with

Conclusions
This paper proposed a new algorithm for classifying high-dimensional and noisy data on the basis of non-Euclidean distance, combining feature weights and entropy weights. In this approach, two different entropy terms are added to the objective function, which helps better identify the clustering features of complex data. The performance was compared with state-of-the-art methods in terms of different clustering measures, revealing that the proposed approach is a new clustering algorithm that can partition data with improved performance. Considering the nature of the proposed algorithm and the results of extensive experiments on various datasets, it can be applied to medical research and textual information, facilitating the extraction of critical features, and obtaining clustering results in high-dimensional and complex data conditions. The proposed algorithm significantly improves on the following aspects: (1) The clustering result is consistent and stable, as it is not susceptible to the original cluster centers and assigns different feature weights to each cluster in the clustering process. (2) The entropy weights improve the algorithm's handling of partitioning during the clustering process and highlight the importance of distinguishing different features. (3) The introduction of non-Euclidean distance makes the algorithm more robust and efficient in handling high-dimensional sparse and noisy data in the real world. (4) The insensitivity to parameter changes ensures the flexibility of the algorithm.
In the future, EM and Gaussian mixture models will be used to improve the clustering algorithm, making it more useful in image processing.
Funding: This research received no external funding.