Article

A Novel Hybrid High-Dimensional PSO Clustering Algorithm Based on the Cloud Model and Entropy

School of Management, Guizhou University, Guiyang 550025, China
*
Author to whom correspondence should be addressed.
Appl. Sci. 2023, 13(3), 1246; https://doi.org/10.3390/app13031246
Submission received: 17 December 2022 / Revised: 11 January 2023 / Accepted: 11 January 2023 / Published: 17 January 2023

Abstract
As high-dimensional data proliferate, unbalanced class distributions appear in more and more big data applications. At the same time, most existing clustering and feature selection algorithms are designed only to maximize clustering accuracy, whereas hybrid approaches can effectively solve the clustering problem for unbalanced data. To address the shortcomings of existing unbalanced-data clustering algorithms, a hybrid high-dimensional multi-objective PSO clustering algorithm based on the cloud model and entropy (HHCE-MOPSO) is proposed. The feasibility of the hybrid PSO is verified by simulations on multi-objective test functions. The results not only broaden the theory and methods of clustering algorithms for unbalanced data, but also verify the accuracy and feasibility of the hybrid PSO. Furthermore, the clustering analysis method based on information entropy is a new method. As a result, the research results have both important scientific value and good practical significance.

1. Introduction

Big data applications contain a wide variety of high-dimensional unbalanced data, and high-dimensional multi-view data in particular are very difficult for classic clustering methods to handle. Cluster analysis, an unsupervised learning method and the most important branch of data mining technology, has great scientific value. It is a multivariate statistical analysis method that groups samples by maximizing the similarity of samples within a class and minimizing the similarity of samples between classes; usually, class division is achieved by minimizing the intra-class squared error. Cluster analysis can also be regarded as an optimization problem, and high-dimensional multi-objective clustering optimization is one of its main challenges. Therefore, how to efficiently process such massive data and extract valuable information from it has become a focus of enterprises, scientific research, and other fields. Data are growing explosively, and large amounts of high-dimensional data are generated in different fields every day. For such exponential, high-dimensional problems, a high-dimensional multi-objective optimization algorithm with additive structure has been proposed [1]. Identifying and mining useful information from massive data is particularly important, and multi-objective optimization algorithms have been widely used in many technical industries [2]. However, in many practical applications, data often exhibit class imbalance, which increases the difficulty of clustering. Because the proposed implementation remains efficient even as the problem dimension and population size increase, it can be widely used in real-world optimization problems [3].
At present, a lot of work has been carried out on the clustering of class imbalance data, which has achieved good results. However, there are still some problems that need to be solved urgently. Consequently, how to design a reasonable data balance processing method is a very meaningful problem.
As the most important branch of data mining technology, clustering has great scientific research value. In practical projects, data imbalance is an urgent problem in high-dimensional complex data clustering [4]. Against the background of highly unbalanced clustering data, some scholars have applied generalized cluster bootstrapping to Gaussian quasi-likelihood and robust estimation methods [5]. The critical factor affecting performance improvement on unbalanced data is the choice of clustering method. To address the limitations of existing unbalanced-data modeling techniques, a hybrid clustering algorithm based on boundary sparse samples has been proposed; experimental results show that it performs well on different data sets, with an average improvement of 3.5% [6]. Meanwhile, for clustering methods applied to severely class-imbalanced data, clustering performance declines significantly as the imbalance ratio increases. By extending a nonparametric estimation method, objects within the same cluster can be made more similar while the similarity between objects in different clusters is reduced [7]. Clustering methods are sensitive to the initial solution and easily fall into local optima; a fuzzy c-means clustering algorithm based on optimal selection has therefore been proposed [8]. The optimal supervised c-means clustering algorithm discusses supervised orthogonal linear local tangent space alignment [9]. Comparison results also show that a local search algorithm can outperform other advanced algorithms in both solution quality and running time [10].
The high-dimensional particle swarm optimization (PSO) algorithm presented by Kennedy and Eberhart is a bio-inspired optimization technique based on random search that can be used to solve clustering problems. The above research results further expand the application field of high-dimensional multi-objective optimization problems and improve the ability of multi-objective PSO to solve practical problems such as pattern recognition, data mining, and machine learning. Therefore, clustering is not only widely used in real life, but also has great theoretical significance.

2. Multi-Dimensional Cloud Model

The cloud model is an uncertainty artificial intelligence theory and method for transforming qualitative concepts into quantitative data. It combines characteristics of fuzzy theory and membership functions, and it can unify the fuzziness and uncertainty of cognitive concepts into joint qualitative and quantitative processing, with applications in data mining, intelligent control, and other fields. In practical applications, events are assumed to obey, or nearly obey, the normal distribution, so the normal cloud model is usually used to analyze problems. We use $Ex$ to denote the expectation of the cloud model, because the expectation best represents the average level of all sample points. We use $En$ to denote the entropy of the cloud model; $En$ is determined by the randomness and fuzziness of the sample points and describes the dispersion of the cloud droplets. We use $He$ to denote the hyper-entropy of the cloud model, which measures the uncertainty of the cloud model itself: the larger $He$ is, the greater the dispersion of the cloud droplets, the wider the range of their membership degrees, and the thicker the generated sample cloud. Specific cloud models can be applied through cloud reasoning, cloud computing, cloud clustering, and other methods, and a cloud model is made concrete by generating cloud droplets from its digital characteristics. Through the above analysis, the membership function can be defined as follows.
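As an illustration, the droplet-generation process described above can be sketched in Python (a minimal sketch: the function name and the handling of a degenerate zero-entropy sample are our own choices, not from the paper):

```python
import math
import random

def forward_cloud(ex, en, he, n):
    """One-dimensional forward normal cloud generator: produces n cloud
    drops (x, mu) from the digital characteristics (Ex, En, He)."""
    drops = []
    for _ in range(n):
        # En' is itself normally distributed with mean En and std He,
        # which is exactly what the hyper-entropy He controls.
        en_prime = random.gauss(en, he)
        if en_prime == 0:
            drops.append((ex, 1.0))  # degenerate sample at the expectation
            continue
        x = random.gauss(ex, abs(en_prime))
        mu = math.exp(-(x - ex) ** 2 / (2.0 * en_prime ** 2))
        drops.append((x, mu))
    return drops

random.seed(0)
drops = forward_cloud(0.0, 1.0, 0.1, 5000)
```

Each drop's membership degree follows the Gaussian membership function below, and the drops scatter more widely as $He$ grows.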
$\mu(x) = e^{-\frac{(x - Ex)^2}{2En^2}}$
In the normal cloud model, cloud droplet groups have different contributions to qualitative concepts. In this paper, we use multi-dimensional normal clouds to illustrate the contribution of cloud droplet groups to the concept. It can be concluded that the contribution of all elements of the normal cloud model to the qualitative concept is determined as follows.
$\Delta C \approx \frac{\mu_A(x)\,\Delta x}{\sqrt{2\pi}\,En}$
Through the above analysis of the normal cloud model, it can be concluded that the total contribution of all elements of the normal cloud model to the qualitative concept is expressed as follows.
$C = \int_{-\infty}^{+\infty} \frac{\mu_A(x)}{\sqrt{2\pi}\,En}\,dx = \int_{-\infty}^{+\infty} \frac{e^{-(x - Ex)^2/(2En^2)}}{\sqrt{2\pi}\,En}\,dx = 1$
The total contribution of all elements in the universe $[Ex - 3En,\ Ex + 3En]$ to the concept can be expressed as follows.
$C_{[Ex \pm 3En]} = \frac{1}{\sqrt{2\pi}\,En} \int_{Ex - 3En}^{Ex + 3En} \mu_A(x)\,dx = 99.74\%$
The sample cloud drops of the cloud model in the intervals $[Ex - 2En,\ Ex + 2En]$ and $[Ex - 3En,\ Ex + 3En]$ account for about 33.34%. At the same time, the contribution of these cloud model sample points to the qualitative concept is about 4.40%. Through the above analysis of the cloud model, the contribution of cloud droplet groups in different cloud model sample areas to qualitative concepts is described in Figure 1.
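The 99.74% figure above can be checked numerically: substituting $t = (x - Ex)/En$ turns the contribution integral into the standard normal mass within $k$ "entropies" of $Ex$, which `math.erf` gives directly (a quick sanity check of our own, not from the paper; the exact value is approximately 99.73%):

```python
import math

def interval_contribution(k: float) -> float:
    """Contribution of the cloud drops in [Ex - k*En, Ex + k*En] to the
    qualitative concept: the normal membership mass within k entropies."""
    return math.erf(k / math.sqrt(2.0))

print(round(interval_contribution(3.0), 4))  # 0.9973
print(round(interval_contribution(1.0), 4))  # 0.6827
```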
The cloud model is, in fact, a model that transforms qualitative concepts and quantitative data into each other; the forward (positive) cloud generator realizes the transformation from a qualitative concept to quantitative data. Through the above analysis, the probability density function of the cloud model's entropy can be expressed as follows.
$f_{En}(x) = \frac{1}{\sqrt{2\pi}\,He}\, e^{-\frac{(x - En)^2}{2He^2}}$
The probability density function of the variable can be determined as follows.
$f_x(x \mid En) = \frac{1}{\sqrt{2\pi}\,En}\, e^{-\frac{(x - Ex)^2}{2En^2}}$
where the variable follows a normal distribution with expectation $Ex$ and standard deviation $En$. From the conditional probability density formula, the probability density function can be expressed as follows.
$f_x(x) = \int_{-\infty}^{+\infty} f_{En}(y)\, f_x(x \mid y)\,dy = \int_{-\infty}^{+\infty} \frac{1}{2\pi\,He\,|y|}\, e^{-\frac{(x - Ex)^2}{2y^2} - \frac{(y - En)^2}{2He^2}}\,dy$
For any value of the variable, the corresponding density can be obtained through numerical integration. When the number of cloud drops is large, the Parzen window method can be used to estimate the probability density function. Consequently, when $He = 0$, the probability density function of the cloud model can be expressed as follows.
$f(x) = \frac{1}{\sqrt{2\pi}\,En}\, e^{-\frac{(x - Ex)^2}{2En^2}}$
Because all cloud droplets are generated from normal random variables, the expectation and variance satisfy $E[X] = Ex$ and $D[X] = En^2 + He^2$. It is well known that traditional algorithms easily fall into local optima and converge slowly; to overcome these shortcomings, a particle swarm optimization algorithm based on the cloud model is proposed. Assume that two two-dimensional clouds are represented by $C_{i1} = (Ex_{i1}, En_{i1}, He_{i1})$ and $C_{i2} = (Ex_{i2}, En_{i2}, He_{i2})$. Through the above analysis, the shape similarity based on the normal two-dimensional cloud is calculated as follows.
$Sim_{si}(C_{i1}, C_{i2}) = \frac{\min\left(En_{i1}^2 + He_{i1}^2,\ En_{i2}^2 + He_{i2}^2\right)}{\max\left(En_{i1}^2 + He_{i1}^2,\ En_{i2}^2 + He_{i2}^2\right)}$
Through the above analysis, the distance similarity based on the normal cloud model can be calculated as follows.
$Sim_{di} = 1 - \sqrt{(Ex_{i1} - Ex_{i2})^2 + (En_{i1} - En_{i2})^2 + (He_{i1} - He_{i2})^2}$
Therefore, the comprehensive similarity calculation of two-dimensional cloud model can be expressed as follows.
$\mu_i = Sim_{ci} = Sim_{si} \times Sim_{di}$
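The three similarity measures above can be sketched in a few lines of Python (a minimal sketch: the function names are ours, and the distance similarity assumes normalized digital characteristics, since the form $1 - d$ can go negative for distant clouds):

```python
import math

def shape_similarity(c1, c2):
    """Sim_si: ratio of the smaller to the larger shape term En^2 + He^2
    of the two clouds, each given as an (Ex, En, He) triple."""
    s1 = c1[1] ** 2 + c1[2] ** 2
    s2 = c2[1] ** 2 + c2[2] ** 2
    return min(s1, s2) / max(s1, s2)

def distance_similarity(c1, c2):
    """Sim_di = 1 - Euclidean distance between the (Ex, En, He) triples."""
    return 1.0 - math.dist(c1, c2)

def comprehensive_similarity(c1, c2):
    """Sim_ci = Sim_si * Sim_di, used as the membership degree mu_i."""
    return shape_similarity(c1, c2) * distance_similarity(c1, c2)
```

For two identical clouds both factors equal 1, so the comprehensive similarity is 1 as expected.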
Multi-dimensional data refers to a data set composed of a large number of samples with multi-dimensional attributes. Concurrently, the multi-dimensional cloud model is a cloud model generated according to various characteristics of data to establish an integrated cloud. Furthermore, the system can be evaluated by calculating the similarity between the multi-dimensional cloud model composed of new data and the integrated cloud. As a result, the multi-dimensional cloud model can effectively solve the problem of low clustering accuracy caused by the fuzziness and uncertainty of multi-sample, multi-dimensional data. In the meantime, the simulation results of the multi-dimensional cloud model can be represented in Figure 2 as follows.
Due to the uncertainty and multi-dimensional attributes of complex data, it is difficult to obtain high clustering accuracy by directly applying existing machine learning algorithms and statistical analysis methods. At the same time, experiments show that such clustering should also be computationally efficient and complete in as short a time as possible. In order to solve the two problems of similarity measurement and attribute reduction, we propose a high-dimensional, multi-objective particle swarm optimization hybrid clustering algorithm based on the cloud model and entropy.

3. Hybrid High-Dimensional, Multi-Objective PSO Clustering Algorithm Based on the Cloud Model and Entropy

Clustering is a critical technology and method in data mining, machine learning, pattern recognition, and artificial intelligence. Through a clustering algorithm, we can find the global distribution pattern of data and the relationships between data attributes. In particular, a clustering algorithm should be scalable and able to process different data types and high-dimensional data. The emergence of large-scale unbalanced data sets therefore poses special challenges to clustering analysis technology. Meanwhile, the hybrid PSO algorithm has global optimization ability and distributed random search characteristics that address the problems traditional clustering algorithms easily suffer from, namely local optima and sensitivity to initial values [11,12]. The convergence speed and global optimization ability of the algorithm can be effectively improved by combining PSO, with its excellent features, and traditional clustering methods [13,14,15,16,17,18,19]. To effectively prevent the algorithm from falling into local optima, a multi-mode cooperative multi-objective particle swarm optimization algorithm based on reinforcement learning has been proposed; the experimental results show that the hybrid PSO algorithm is more effective and robust than the other algorithms [20]. As a powerful optimization technique, a feature selection clustering method based on the particle swarm optimization (PSO) algorithm has been proposed [21]. PSO based on a multi-subgroup distributed architecture is very effective for static multi-objective optimization problems, but it has not yet been used to solve dynamic multi-objective problems (DMOPs) [22]. In the multi-objective particle swarm optimization (MOPSO) algorithm, the design of the updating mechanism and the population maintenance mechanism is the critical technology for obtaining the optimal solution [23].
In order to effectively solve multi-modal, multi-objective optimization problems with the same fitness value, an input-output fuzzy clustering approach was proposed [24]. Based on the feature selection algorithm, a new hybrid clustering algorithm for multi-objective PSO feature selection has been proposed [25]. Particle swarm optimizers have a built-in guidance strategy to improve their solutions [26]. For the automatic clustering problem, a hybrid of chaos game optimization and particle swarm optimization (CGOPSO) has been proposed [27]. The proposed clustering algorithm has been compared with spherical k-means and the PSO algorithm in terms of feasibility and convergence characteristics [28]. However, PSO can converge prematurely to a suboptimal clustering solution, and the learning coefficient values need to be adjusted to find a better solution [29]. Therefore, the particle swarm optimization (PSO) algorithm is widely used in clustering analysis.
By optimizing the algorithm to find the best cluster centers, hybrid ensemble clustering can be treated as an optimization problem [30]. Ensemble clustering has been used as the final clustering step to achieve high quality, and on the basis of data analysis it is a good way to improve performance and speed in machine learning [31]. It is well known that the fundamental reason for the decline of clustering performance on imbalanced sets lies in the high-dimensional characteristics of imbalanced data. Optimization problems are also becoming more diverse, with nonlinear, high-dimensional, and other characteristics; the objectives involved in these practical problems may be one or even several conflicting ones, often with harsh constraints. The PSO algorithm is an intelligent optimizer with few adjustable parameters and good robustness; thanks to its outstanding performance on single-objective optimization problems, it has been extended to the field of multi-objective optimization. Comparisons with baseline algorithms show that the superiority of PSO depends on parameter tuning and that it can converge prematurely. The clustering evaluation function of the data set is taken as the fitness function of the PSO algorithm, a well-known unsupervised clustering approach. As a result, each possible clustering partition is regarded as a particle of the population, so that the hybrid PSO can search efficiently for the optimal clustering partition and escape from local optima.
On the basis of the above analysis, we constructed a fitness function for the multi-objective PSO algorithm from the objective functions and constraints of the multi-objective optimization problem. In the multi-objective interval PSO algorithm, the optimization variables are encoded as particle positions in the search space, given the initial conditions and the fitness function. During the iterations of the multi-objective interval particle swarm optimization algorithm, each particle tracks its own optimal position ($pbest$) and the group optimal position ($gbest$). Consequently, the tracking equations of the multi-objective PSO algorithm can be expressed as follows.
$\begin{cases} v_{id}^{k+1} = v_{id}^{k} + c_1 r_1 \left( pbest_{id}^{k} - x_{id}^{k} \right) + c_2 r_2 \left( gbest_{id}^{k} - x_{id}^{k} \right) \\ x_{id}^{k+1} = x_{id}^{k} + t_0 \cdot v_{id}^{k+1} \quad (t_0 = 1) \end{cases}$
Among them, $c_1$ and $c_2$ are the acceleration constants of the multi-objective interval PSO algorithm, which adjust the maximum step size with which particles move toward the individual optimal position ($pbest$) and the group optimal position ($gbest$), and $x_{id}^{k+1} = x_{id}^{k} + v_{id}^{k+1}$. Each particle is assigned an initial position and initial velocity as its initial state. $r_1$ and $r_2$ are uniformly distributed random numbers that simulate the group behavior of the multi-objective interval PSO algorithm. In order to improve the convergence speed and solution quality of the multi-objective interval particle swarm optimization algorithm, the basic PSO iterative equation is improved as follows.
$v_{id}^{k+1} = w\,v_{id}^{k} + c_1 r_1 \left( pbest_{id}^{k} - x_{id}^{k} \right) + c_2 r_2 \left( gbest_{id}^{k} - x_{id}^{k} \right)$
Here, $w$ is the inertia weight coefficient. A larger inertia weight enhances the global search ability of the multi-objective particle swarm optimization algorithm; as the inertia weight decreases, the multi-objective PSO algorithm shifts to local search and finally obtains the optimal solution. Finally, we introduce a compression factor into the multi-objective PSO algorithm to guarantee the convergence of the algorithm.
Through the above analysis, the convergence equation of the algorithm can be expressed as follows.
$v_{id}^{k+1} = \chi \left\{ v_{id}^{k} + c_1 r_1 \left( pbest_{id}^{k} - x_{id}^{k} \right) + c_2 r_2 \left( gbest_{id}^{k} - x_{id}^{k} \right) \right\}$
where $\varphi = c_1 + c_2$ (with $\varphi > 4$) and $\chi = 2 / \left| 2 - \varphi - \sqrt{\varphi^2 - 4\varphi} \right|$ is the compression (constriction) factor of the multi-objective particle swarm optimization algorithm.
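The inertia-weight velocity update above can be sketched as a minimal single-objective PSO loop (a sketch under assumed parameter values $w = 0.7$, $c_1 = c_2 = 1.5$, which are not the paper's tuned settings):

```python
import random

def pso_minimize(f, dim, n_particles=30, iters=200,
                 w=0.7, c1=1.5, c2=1.5, bounds=(-5.0, 5.0)):
    """Minimal PSO using the inertia-weight velocity update:
    v <- w*v + c1*r1*(pbest - x) + c2*r2*(gbest - x)."""
    lo, hi = bounds
    xs = [[random.uniform(lo, hi) for _ in range(dim)]
          for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_f = [f(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vs[i][d] = (w * vs[i][d]
                            + c1 * r1 * (pbest[i][d] - xs[i][d])
                            + c2 * r2 * (gbest[d] - xs[i][d]))
                xs[i][d] += vs[i][d]          # x <- x + v  (t0 = 1)
            fx = f(xs[i])
            if fx < pbest_f[i]:               # update the personal best
                pbest[i], pbest_f[i] = xs[i][:], fx
                if fx < gbest_f:              # and the global best
                    gbest, gbest_f = xs[i][:], fx
    return gbest, gbest_f

random.seed(1)
best, val = pso_minimize(lambda x: sum(v * v for v in x), dim=2)
```

On the simple sphere function the swarm contracts quickly onto the origin, illustrating how the inertia weight balances global exploration against local refinement.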
Through research on the PSO and multi-objective PSO algorithms, the average focusing distance of the particles $\tilde{d}$, the maximum focusing distance between particles $d_{\max}$, and the change rate of the particle focusing distance $K$ are as follows.
$\begin{cases} \tilde{d} = \frac{1}{n} \sum_{i=1}^{n} \sqrt{\sum_{d=1}^{m} \left( p_{gd} - x_{id} \right)^2} \\ d_{\max} = \max_{i} \sqrt{\sum_{d=1}^{m} \left( p_{gd} - x_{id} \right)^2} \end{cases}$
$K = \frac{d_{\max} - \tilde{d}}{d_{\max}}$
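These three quantities are straightforward to compute for a concrete swarm (a minimal sketch; the helper name is ours):

```python
import math

def focusing_stats(positions, gbest):
    """Average focusing distance d~, maximum focusing distance d_max, and
    the change rate K = (d_max - d~) / d_max of the swarm around gbest."""
    dists = [math.dist(p, gbest) for p in positions]
    d_avg = sum(dists) / len(dists)
    d_max = max(dists)
    return d_avg, d_max, (d_max - d_avg) / d_max

d_avg, d_max, k = focusing_stats([[0.0, 0.0], [3.0, 4.0]], [0.0, 0.0])
```

A small $K$ indicates that most particles are about as far from $gbest$ as the farthest one, i.e. the swarm has not yet focused.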
Through in-depth research on the multi-objective PSO algorithm, the following functions are used to explore the properties and related mechanisms of the multi-objective PSO algorithm and to perform the following simulation experiments. The simulation results of its multi-objective function can be shown in Figure 3 as follows.
The multi-objective PSO algorithm often finds it difficult to escape local optima, its solution accuracy can be insufficient, and its calculation speed is not always fast enough. As the number of problem objectives increases, the dimension and scale of the solution increase accordingly, resulting in high problem complexity and greater difficulty in solving it. How to effectively guarantee the quality and efficiency of solving complex optimization problems with swarm intelligence algorithms has become a hotspot in this field. For multi-objective optimization problems, we must not only balance diversity and convergence but also handle the optimal particle selection strategy, the diversity maintenance mechanism, and other issues effectively. Through a series of experiments, the convergence of different PSO algorithms with different particle sizes is represented in Figure 4.
Through experiments, we found that the multi-objective PSO algorithm has the advantages of a small number of individuals, fast iterative convergence, simple operation, and easy implementation. At the same time, we found that PSO is a probabilistic random search algorithm with strong robustness and global optimization ability. In the process of generating initial particles, the strategy makes full use of the correlation between features to make the initial population more competitive. Therefore, the multi-objective PSO algorithm has a wide range of applications. However, when dealing with feature selection for high-dimensional data, most existing feature selection methods based on the PSO algorithm are prone to falling into local optima and suffer from high computational cost, premature convergence, and low search efficiency. Some scholars have also proposed intelligent clustering techniques to improve the convergence performance of clustering [16]. It is very difficult for classical clustering algorithms to process high-dimensional data, so hybrid algorithms should consider all features of the data and the correlations among them.
The PSO algorithm guides particles to move in the search space according to the individual optimal position and the global optimal position. This strategy is simple and efficient, but it easily leads to particle oscillation during the search, which reduces the search efficiency of the particle swarm optimization algorithm and misses some better-performing solutions. The core idea of the hybrid clustering algorithm is to generate a large number of high-quality feature subsets using the correlation information of features. It not only has a good clustering effect but also reduces dimensionality efficiently when facing unbalanced data, so it can be applied to the clustering of unbalanced data. Because categorical clustering data are unordered, their values cannot be compared numerically, and distance-based measures cannot be used to assess the similarity between objects. We therefore study a clustering algorithm based on information entropy for such clustering data: building on the PSO algorithm, the cloud model is applied to deal with uncertainty using the idea of information entropy.
The class with the lowest information entropy is selected as the classification class, the class with the next-higher information entropy is used as the next target object, and so on, until a clustering matching the value of k is found. The idea of this clustering algorithm is based on information entropy theory, and the average information gain rate is proposed according to the relevant knowledge of information entropy; the highest mean information gain rate is selected as the standard for equivalence classification. In addition, we introduce information entropy and mutual information to measure the importance of features and select features with high information entropy. The data can then be mapped from a high-dimensional feature space to a low-dimensional space through information entropy features. As a result, the joint entropy of two random variables can be expressed as follows.
$H(X, Y) = E\left[\log \frac{1}{p(x, y)}\right] = \sum_{x} \sum_{y} p(x, y) \log \frac{1}{p(x, y)}$
Under the condition that random variable X is determined, the conditional entropy of random variable Y can be expressed as follows.
$H(Y \mid X) = \sum_{x} \sum_{y} p(x, y) \log \frac{1}{p(y \mid x)} = \sum_{x} p(x) \sum_{y} p(y \mid x) \log \frac{1}{p(y \mid x)}$
Through the above analysis, it can be expressed as follows.
$H(Y \mid X) = \sum_{x} p(x) H(Y \mid X = x) = E\left[\log \frac{1}{p(y \mid x)}\right]$
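The joint and conditional entropies above can be estimated from observed (x, y) pairs via empirical frequencies (a minimal sketch using base-2 logarithms and the chain rule $H(Y \mid X) = H(X, Y) - H(X)$; the paper does not fix the logarithm base):

```python
import math
from collections import Counter

def joint_entropy(pairs):
    """H(X, Y) = sum over (x, y) of p(x, y) * log2(1 / p(x, y))."""
    n = len(pairs)
    return sum((c / n) * math.log2(n / c) for c in Counter(pairs).values())

def conditional_entropy(pairs):
    """H(Y | X) = H(X, Y) - H(X), by the chain rule for entropy."""
    n = len(pairs)
    h_x = sum((c / n) * math.log2(n / c)
              for c in Counter(x for x, _ in pairs).values())
    return joint_entropy(pairs) - h_x

pairs = [(0, 0), (0, 1), (1, 0), (1, 1)]  # two independent fair bits
```

For two independent fair bits the sketch gives $H(X, Y) = 2$ bits and $H(Y \mid X) = 1$ bit, as expected.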
To solve the above problems, we propose a hybrid, high-dimensional, multi-objective PSO clustering algorithm based on the multi-dimensional cloud model and entropy theory (HHCE-MOPSO), together with a feature selection algorithm based on correlation information entropy and particle swarm optimization. The particle position vector is used as the attribute weight vector and information entropy is used as the attribute weight evaluation function; the gradient descent method is used to minimize the attribute weight evaluation function. In the clustering process, the influence of intra-class entropy and inter-class entropy on the attribute weights is comprehensively considered, and a group of optimal attribute weight values is finally obtained through iteration, which improves the global search ability and convergence speed of the algorithm. The hybrid method first uses correlation-based information entropy to quickly reduce the high dimensionality of the features; the optimal features are then searched by the PSO algorithm. Similar or redundant features are grouped into one feature class by information entropy to reduce search time and space, and the particle swarm is prevented from falling into local optima so as to obtain the optimal solution for the final feature set. Finally, considering large-scale, high-dimensional data that are difficult for existing particle swarm feature selection algorithms to process, we propose a novel hybrid, high-dimensional, multi-objective PSO clustering algorithm based on the cloud model and information entropy.

4. Numerical Example

In the multi-objective particle swarm optimization algorithm, particles can be selected based on the Pareto-optimal individual extremum. The current position of each particle is compared with its best position in history, and a satisfactory solution consistent with the actual situation is selected as the individual extremum of the particle. We apply the following four test functions in the simulation experiments: $F_1 = (x_1 - 0.5)^4 - (x_2 - 0.2)^4 + 2$; $F_2 = 0.5 + \frac{\sin^4\left(x_1^2 - x_2^4\right) - 0.5}{\left(1 + 0.01\left(x_1^2 + x_2^4\right)\right)^4}$; $F_3 = (x_2 - 0.5)^4 - x_1 \sin(x_2 - 0.5)$; $F_4 = (x_1 - 0.7)^4 - 10\cos(2\pi x_2) + 10$. Through testing the different functions, the experimental results of the multi-objective function based on the multi-objective PSO algorithm are shown in Table 1.
The results in Table 1 show that the hybrid, high-dimensional, multi-objective PSO clustering algorithm is very efficient at creating well-separated, compact, and sustainable clusters. The multi-objective PSO algorithm was then applied to conduct Pareto vector simulation experiments on the above multi-objective functions; the experimental results are shown in Figure 5.
From the above analysis, it can be concluded that the HHCE-MOPSO algorithm performs better than the comparison optimization algorithms, which provides a scientific experimental basis for future numerical experiments. Metaheuristic and unsupervised PSO clustering algorithms require tuning parameters that control the balance of exploration and exploitation during the search. From Figure 5, it can be observed that the parameter settings are crucial for the hybrid algorithms to function effectively and can significantly affect the performance results.
In order to verify the feasibility of the hybrid PSO algorithms, we applied the following functions in a series of simulation experiments. Based on the above analysis, we used the same test function (Rosenbrock) with different hybrid PSO algorithms. The experimental results with the best performance on this test function are described in Figure 6, Figure 7, Figure 8, Figure 9 and Figure 10 as follows.
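For reference, the Rosenbrock benchmark used here is conventionally defined as $f(x) = \sum_i \left[ 100\left(x_{i+1} - x_i^2\right)^2 + (1 - x_i)^2 \right]$, with global minimum 0 at $(1, \ldots, 1)$; a direct Python transcription of this standard definition:

```python
def rosenbrock(x):
    """Rosenbrock 'banana' function; global minimum 0 at (1, ..., 1)."""
    return sum(100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
               for i in range(len(x) - 1))

print(rosenbrock([1.0, 1.0]))  # 0.0
print(rosenbrock([0.0, 0.0]))  # 1.0
```

Its narrow curved valley makes it a demanding convergence test for the hybrid PSO variants compared in these figures.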
Based on the above analysis, we used the same test function such as Ackleysfcn and different hybrid PSO algorithms. The experimental results of the same test function are described in Figure 11, Figure 12, Figure 13, Figure 14 and Figure 15 as follows.
For quantitative comparison of clustering performance, we used the same test function such as Dejongsfcn and different hybrid PSO algorithms. The experimental results of the same test function are described in Figure 16, Figure 17, Figure 18, Figure 19 and Figure 20 as follows.
Based on the above analysis, we will use the same test function such as Dropwavefcn and different hybrid PSO algorithms. In the meantime, the experimental results of the same test function are described in Figure 21, Figure 22, Figure 23, Figure 24 and Figure 25 as follows.
For quantitative comparison of clustering performance, we used the same test function such as Griewangksfcn and different hybrid PSO algorithms. The experimental results of the test function are described in Figure 26, Figure 27, Figure 28, Figure 29 and Figure 30 as follows.
Based on the above classical multi-objective test function analysis, we used the same test function, such as Nonlinearconstrdemo, and the different hybrid particle swarm optimization algorithms. The experimental results of the same test function are described in Figure 31, Figure 32, Figure 33, Figure 34 and Figure 35 as follows.
Based on the above classical multi-objective test function analysis, we used the same test function, such as Rastriginsfcn, and different hybrid PSO algorithms. The experimental results of the same test function are described in Figure 36, Figure 37, Figure 38, Figure 39 and Figure 40 as follows.
Based on the above analysis, we will use the same test function, such as Schwefelsfcn, and different hybrid PSO algorithms. The experimental results of the same test function are described in Figure 41, Figure 42, Figure 43, Figure 44 and Figure 45 as follows.
From the experimental results of the classical multi-objective test function analysis, it can be seen that the algorithm is simple and effective, with a fast convergence speed and high accuracy. On the basis of the theoretical analysis, we conducted effective experiments with the classic multi-objective test functions. Through experiments (all test function results were averaged over 50 independent runs), we found that the cloud model is an effective tool for the uncertain transformation between qualitative concepts and quantitative expressions. The experimental results also show that the algorithm can effectively solve most continuous complex test problems, which reflects the feasibility of the hybrid algorithm, and that the algorithm retains a certain degree of robustness. Therefore, the multi-objective PSO algorithm can effectively solve complex multi-objective problems and can be combined with various advanced constraint-handling techniques to deal with single-objective and multi-objective optimization problems.
To compare the performance of the hybrid algorithms, numerical results and several segmentation measures were examined. The results show that information entropy can be used to control the search, which strengthens the evolutionary pressure within the particle swarm and makes the particles converge to the optimal solution more quickly. Through a series of simulation experiments on UCI datasets, the rationality and feasibility of the hybrid algorithm were verified, and its results were found to be statistically different from those of the other algorithms. The performance parameter results of the HHCE-MOPSO clustering algorithm are shown in Table 2.
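As a sketch of the entropy quantity involved (the paper's exact entropy-guided control rule is not reproduced here; `partition_entropy` is an illustrative helper), the Shannon entropy of a candidate cluster assignment can be computed as follows, with lower values indicating a more concentrated, less uncertain partition:

```python
import numpy as np

def partition_entropy(labels):
    """Shannon entropy (in bits) of a cluster assignment; a search can
    prefer moves that reduce this uncertainty measure."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()          # cluster-size proportions
    return float(-(p * np.log2(p)).sum())
```

A balanced two-cluster assignment such as `[0, 0, 1, 1]` yields 1 bit, while a single-cluster assignment yields 0.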
In the experiments, the performance of the HHCE-MOPSO clustering algorithm was evaluated on synthetic and real-life datasets with several validity metrics (cluster numbers, intra-cluster distance, inter-cluster distance, G-mean, and MCC), and compared with several prevalent state-of-the-art automatic clustering algorithms. We found that big data clustering completes in a reasonable time and remains sustainable over dynamic data. The impact of well-defined distance functions on the objective functions of different clustering algorithms was also compared with the results of the proposed algorithm. To further verify the hybrid algorithm, we used UCI datasets such as Hayes-Roth for experimental simulation; the results are shown in Figure 46, Figure 47 and Figure 48.
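The G-mean and MCC metrics listed above follow directly from a binary confusion matrix; a minimal sketch (the function name and argument order are our own, not from the paper):

```python
import math

def gmean_mcc(tp, fp, fn, tn):
    """G-mean and Matthews correlation coefficient from a binary
    confusion matrix -- two imbalance-aware metrics used in Table 2."""
    sensitivity = tp / (tp + fn)       # true-positive rate
    specificity = tn / (tn + fp)       # true-negative rate
    g_mean = math.sqrt(sensitivity * specificity)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return g_mean, mcc
```

Because G-mean multiplies the per-class rates, a classifier that ignores the minority class scores near zero even when raw accuracy is high, which is why such metrics suit unbalanced data.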
The experimental analysis shows that the HHCE-MOPSO algorithm performs well in terms of optimization quality and convergence speed. For a given algorithm, the more complex the attributes and the higher the dimensionality of the dataset, the worse the clustering effect. Experiments on datasets of different scales verify the effectiveness and efficiency of the hybrid clustering algorithm, which also performs well on different feature subsets in terms of clustering accuracy and efficiency. The results show that the proposed hybrid clustering algorithm is superior to traditional clustering algorithms in clustering accuracy, and its running time under the same conditions is shorter. However, as the number of data samples increases, the algorithm's advantage in running time gradually decreases. Although the hybrid clustering algorithm based on entropy theory and the cloud model outperforms other traditional clustering techniques, it still has some unavoidable shortcomings. Therefore, how to ensure a reasonable setting of the relevant parameters needs to be studied further in the future.

5. Conclusions

A novel hybrid high-dimensional PSO clustering algorithm based on the cloud model and entropy theory is proposed, applying both theories to analyze the relationship between the similarity measure and the data distribution. The algorithm can effectively eliminate unimportant features in high-dimensional datasets, thus improving the performance of the hybrid clustering. Extensive experimental results on classical benchmark functions demonstrate that the proposed hybrid algorithm produces encouraging results compared with clustering techniques established in the literature. Moving from general to complex cases and from unconstrained to constrained problems, we propose different HHCE-MOPSO variants that achieve promising results by combining the strengths developed through the hybridization of PSO in this work. In-depth analysis shows that the clustering algorithm is prone to premature convergence or stagnation during optimization, mainly because the velocity-update ability of each particle is insufficient in the late stage of PSO. We therefore use cloud model theory to realize a dynamic, multi-rule uncertain adjustment of the inertia weight, and the test-function results show that this method has a fast convergence speed and a good optimization effect. Combined with frontier swarm intelligence optimization, the scope of multi-objective optimization problems is expanded. The experimental results demonstrate that the hybrid algorithm is suitable for both scientific research and engineering applications, although some shortcomings inevitably remain. Therefore, the proposed algorithm should be adapted to the characteristics of actual problems so that it can better solve engineering problems with specific characteristics.
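A cloud-model-driven inertia weight of the kind described above can be sketched as one PSO update step. The exact update rule is not given in this section, so the nominal weight `Ex_w`, its entropy `En_w`, hyper-entropy `He_w`, the clipping range, and the acceleration coefficients below are assumptions for illustration only.

```python
import numpy as np

def pso_step(x, v, pbest, gbest, rng, Ex_w=0.7, En_w=0.1, He_w=0.01,
             c1=1.5, c2=1.5):
    """One PSO velocity/position update where the inertia weight w is a
    cloud drop around Ex_w, so exploration varies stochastically
    (illustrative parameters, not the paper's tuned values)."""
    En_prime = rng.normal(En_w, He_w)                  # per-step entropy sample
    w = np.clip(rng.normal(Ex_w, abs(En_prime)), 0.4, 0.9)
    r1, r2 = rng.random(x.shape), rng.random(x.shape)  # cognitive/social noise
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    return x + v, v
```

The stochastic weight keeps some particles exploring while others exploit, which is the mechanism credited here for countering late-stage stagnation.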
In summary, we have presented the HHCE-MOPSO algorithm. Designing a clustering algorithm along these lines is a new way to improve clustering quality and speed effectively. A series of experiments also showed that the updating mechanism and the population parameter design of the multi-objective particle swarm optimization clustering algorithm are critical to obtaining the optimal solution. The similarity between objects is measured by information entropy, and the similarity threshold is calculated directly from the data; entropy-based clustering can merge similar factors and thereby reduce the complexity of the system. Furthermore, the experiments showed that the HHCE-MOPSO algorithm, which exploits the advantages of PSO, achieves better clustering results and is feasible and effective. The sustainability and dynamicity of big data handling have been enhanced by extending the HHCE-MOPSO algorithm to provide automatic clustering of the data. However, with the advent of the big data era, data increasingly exhibit high dimensionality and small sample sizes. Therefore, we need to further study new feature-selection clustering methods suitable for high-dimensional, small-sample, imbalanced data. Although we have made some research progress, unbalanced data clustering remains a complex frontier problem. In the future, we should design an unbalanced clustering algorithm suitable for high-dimensional, small-sample data and improve the clustering effect and performance by exploiting spatial information.

Author Contributions

Conceptualization, R.-L.Z. and X.-H.L.; methodology, R.-L.Z.; software, R.-L.Z.; validation, R.-L.Z. and X.-H.L.; writing—original draft preparation, X.-H.L.; writing—review and editing, X.-H.L.; project administration, X.-H.L.; funding acquisition, R.-L.Z. and X.-H.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by the National Natural Science Foundation of China (Grant No. 72261005), the Guizhou science and technology planning projects (Grant Nos. ZK2021G339 and ZK2022G080), the Guizhou philosophy and social science planning projects (Grant Nos. 21GZYB09 and 21GZYB10), the research base and critical special topics of think tanks (GDZX2021031, GDYB2021022, GDYB2021023), the Guizhou Provincial Education Department Foundation (Grant No. 2022JD004), the Talent Introduction Project of Guizhou University (Grant Nos. 2019016, 2019017, 20GLR001, 20GLR002), and the 2022 National Social Science Fund Cultivation Project (GDPY2021014, GDPY2021015).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Wang, H.Y.; Xu, H.; Yuan, Y. High-dimensional expensive multi-objective optimization via additive structure. Intell. Syst. Appl. 2022, 14, 200062. [Google Scholar] [CrossRef]
  2. Cao, Y.; Mao, H.Q. High-dimensional multi-objective optimization strategy based on directional search in decision space and sports training data simulation. Alex. Eng. J. 2022, 61, 159–173. [Google Scholar] [CrossRef]
  3. Hussain, M.M.; Fujimoto, N. GPU-based parallel multi-objective particle swarm optimization for large swarms and high dimensional problems. Parallel Comput. 2020, 92, 102589. [Google Scholar] [CrossRef]
  4. Li, X.Q.; Jiang, H.K.; Liu, S.W.; Zhang, J.J.; Xu, J. A unified framework incorporating predictive generative denoising autoencoder and deep Coral network for rolling bearing fault diagnosis with unbalanced data. Measurement 2021, 178, 109345. [Google Scholar] [CrossRef]
  5. Mayukh, S.; Welsh, A.H. Bootstrapping for highly unbalanced clustered data. Comput. Stat. Data Anal. 2013, 59, 80–81. [Google Scholar]
  6. Soohyun, A.; Wang, X.L.; Johan, L. On unbalanced group sizes in cluster randomized designs using balanced ranked set sampling. Stat. Probab. Lett. 2017, 123, 210–217. [Google Scholar]
  7. Wang, X.L.; Soohyun, A.; Johan, L. Unbalanced ranked set sampling in cluster randomized studies. J. Stat. Plan. Inference 2017, 187, 1–16. [Google Scholar] [CrossRef]
  8. Zhao, F.; Fan, J.L.; Liu, H.Q. Optimal-selection-based suppressed fuzzy c-means clustering algorithm with self-tuning non local spatial information for image segmentation. Expert Syst. Appl. 2014, 41, 4083–4093. [Google Scholar] [CrossRef]
  9. Li, F.; Minking, K.; Yu, C.; Wang, J.X.; Tang, B.Q. Life grade recognition of rotating machinery based on Supervised Orthogonal Linear Local Tangent Space Alignment and Optimal Supervised Fuzzy C-Means Clustering. Measurement 2015, 73, 384–400. [Google Scholar] [CrossRef]
  10. Gao, J.; Tao, X.X.; Cai, S.W. Towards More Efficient Local Search Algorithms for Constrained Clustering. Inf. Sci. 2022, 11, 107. [Google Scholar] [CrossRef]
  11. Zhang, Y.; Kong, X. A particle swarm optimization algorithm with empirical balance strategy. Chaos Solitons Fractals X 2023, 10, 100089. [Google Scholar] [CrossRef]
  12. Zhao, J.; Chen, D.; Xiao, R.; Cui, Z.; Wang, H.; Lee, I. Multi-strategy ensemble firefly algorithm with equilibrium of convergence and diversity. Appl. Soft Comput. 2022, 123, 108938. [Google Scholar] [CrossRef]
  13. Fontes, D.; Homayouni, S.; Gonçalves, J. A hybrid particle swarm optimization and simulated annealing algorithm for the job shop scheduling problem with transport resources. Eur. J. Oper. Res. 2023, 306, 1140–1157. [Google Scholar] [CrossRef]
  14. Li, X.; Xing, K.; Lu, Q. Hybrid particle swarm optimization algorithm for scheduling flexible assembly systems with blocking and deadlock constraints. Eng. Appl. Artif. Intell. 2021, 105, 104411. [Google Scholar] [CrossRef]
  15. Adamu, A.; Abdullahi, M.; Junaidu, S.; Hassan, I. An hybrid particle swarm optimization with crow search algorithm for feature selection. Mach. Learn. Appl. 2021, 6, 100108. [Google Scholar] [CrossRef]
  16. Khan, T.; Ling, S. A novel hybrid gravitational search particle swarm optimization algorithm. Eng. Appl. Artif. Intell. 2021, 102, 104263. [Google Scholar] [CrossRef]
  17. Jafari, M.; Salajegheh, E.; Salajegheh, J. Optimal design of truss structures using a hybrid method based on particle swarm optimizer and cultural algorithm. Structures 2021, 32, 391–405. [Google Scholar] [CrossRef]
  18. Cai, B.; Zhu, X.; Qin, Y. Parameters optimization of hybrid strategy recommendation based on particle swarm algorithm. Expert Syst. Appl. 2021, 168, 114388. [Google Scholar] [CrossRef]
  19. Liu, Z.; Qin, Z.; Zhu, P.; Li, H. An adaptive switchover hybrid particle swarm optimization algorithm with local search strategy for constrained optimization problems. Eng. Appl. Artif. Intell. 2020, 95, 103771. [Google Scholar] [CrossRef]
  20. Zhang, X.Y.; Xia, S.; Li, X.Z.; Zhang, T. Multi-objective particle swarm optimization with multi-mode collaboration based on reinforcement learning for path planning of unmanned air vehicles. Knowl. Based Syst. 2022, 250, 109075. [Google Scholar] [CrossRef]
  21. Abualigah, L.M.; Khader, A.T.; Hanandeh, E.S. A new feature selection method to improve the document clustering using particle swarm optimization algorithm. J. Comput. Sci. 2018, 25, 456–466. [Google Scholar] [CrossRef]
  22. Ahlem, A.; Nizar, R.; Raja, F.; Qahtani, M.A.; Almutiry, O.; Dhahri, H.; Hussain, A.; Alimi, M.A. DPb-MOPSO: A Dynamic Pareto bi-level Multi-objective Particle Swarm Optimization Algorithm. Appl. Soft Comput. 2022, 129, 109622. [Google Scholar]
  23. Yang, Y.; Liao, Q.F.; Wang, J.; Wang, Y. Application of multi-objective particle swarm optimization based on short-term memory and K-means clustering in multi-modal multi-objective optimization. Eng. Appl. Artif. Intell. 2022, 112, 104866. [Google Scholar] [CrossRef]
  24. Tsekouras, G.E. A simple and effective algorithm for implementing particle swarm optimization in RBF network’s design using input-output fuzzy clustering. Neurocomputing 2013, 108, 36–44. [Google Scholar] [CrossRef]
  25. Abdolreza, R.; Milad, S.; Sadegh, F. Particle ranking: An Efficient Method for Multi-Objective Particle Swarm Optimization Feature Selection. Knowl. Based Syst. 2022, 245, 108640. [Google Scholar]
  26. Niteesh, K.; Harendra, K. A fuzzy clustering technique for enhancing the convergence performance by using improved Fuzzy c-means and Particle Swarm Optimization algorithms. Data Knowl. Eng. 2022, 140, 102050. [Google Scholar]
  27. Ouertani, M.W.; Manita, G.; Korbaa, O. Automatic Data Clustering Using Hybrid Chaos Game Optimization with Particle Swarm Optimization Algorithm. Procedia Comput. Sci. 2022, 207, 2677–2687. [Google Scholar] [CrossRef]
  28. Li, H.Y.; He, H.Z.; Wen, Y.G. Dynamic particle swarm optimization and K-means clustering algorithm for image segmentation. Optik 2015, 126, 4817–4822. [Google Scholar] [CrossRef]
  29. Mohammed, A.; Mohanad, A.; NorAshidi, M.I. Density-based particle swarm optimization algorithm for data clustering. Expert Syst. Appl. 2018, 91, 170–186. [Google Scholar]
  30. Gowda, Y.; Lakshmikantha, B.R. On fly hybrid swarm optimization algorithms for clustering of streaming data. Results Control Optim. 2023, 10, 100190. [Google Scholar]
  31. Huang, Q.R.; Gao, R.; Akhavan, H. An ensemble hierarchical clustering algorithm based on merits at cluster and partition levels. Pattern Recognit. 2023, 136, 109255. [Google Scholar] [CrossRef]
Figure 1. The qualitative concept map of the cloud model.
Figure 2. The simulation diagram of the multi-dimensional cloud model.
Figure 3. The simulation results of multi-objective function.
Figure 4. The convergence of different algorithms with different sizes.
Figure 5. The Pareto optimal solution of test function.
Figure 6. The experimental results of Rosenbrock by HHCE-MOPSO.
Figure 7. The experimental results of Rosenbrock by C-MOPSO.
Figure 8. The experimental results of Rosenbrock by E-MOPSO.
Figure 9. The experimental results of Rosenbrock by MOPSO.
Figure 10. The experimental results of Rosenbrock by PSO.
Figure 11. The experimental results of Ackleysfcn by HHCE-MOPSO.
Figure 12. The experimental results of Ackleysfcn by C-MOPSO.
Figure 13. The experimental results of Ackleysfcn by E-MOPSO.
Figure 14. The experimental results of Ackleysfcn by MOPSO.
Figure 15. The experimental results of Ackleysfcn by PSO.
Figure 16. The experimental results of Dejongsfcn by HHCE-MOPSO.
Figure 17. The experimental results of Dejongsfcn by C-MOPSO.
Figure 18. The experimental results of Dejongsfcn by E-MOPSO.
Figure 19. The experimental results of Dejongsfcn by MOPSO.
Figure 20. The experimental results of Dejongsfcn by PSO.
Figure 21. The experimental results of Dropwavefcn by HHCE-MOPSO.
Figure 22. The experimental results of Dropwavefcn by C-MOPSO.
Figure 23. The experimental results of Dropwavefcn by E-MOPSO.
Figure 24. The experimental results of Dropwavefcn by MOPSO.
Figure 25. The experimental results of Dropwavefcn by PSO.
Figure 26. The experimental results of Griewangksfcn by HHCE-MOPSO.
Figure 27. The experimental results of Griewangksfcn by C-MOPSO.
Figure 28. The experimental results of Griewangksfcn by E-MOPSO.
Figure 29. The experimental results of Griewangksfcn by MOPSO.
Figure 30. The experimental results of Griewangksfcn by PSO.
Figure 31. The experimental results of Nonlinearconstrdemo by HHCE-MOPSO.
Figure 32. The experimental results of Nonlinearconstrdemo by C-MOPSO.
Figure 33. The experimental results of Nonlinearconstrdemo by E-MOPSO.
Figure 34. The experimental results of Nonlinearconstrdemo by MOPSO.
Figure 35. The experimental results of Nonlinearconstrdemo by PSO.
Figure 36. The experimental results of Rastriginsfcn by HHCE-MOPSO.
Figure 37. The experimental results of Rastriginsfcn by C-MOPSO.
Figure 38. The experimental results of Rastriginsfcn by E-MOPSO.
Figure 39. The experimental results of Rastriginsfcn by MOPSO.
Figure 40. The experimental results of Rastriginsfcn by PSO.
Figure 41. The experimental results of Schwefelsfcn by HHCE-MOPSO.
Figure 42. The experimental results of Schwefelsfcn by C-MOPSO.
Figure 43. The experimental results of Schwefelsfcn by E-MOPSO.
Figure 44. The experimental results of Schwefelsfcn by MOPSO.
Figure 45. The experimental results of Schwefelsfcn by PSO.
Figure 46. The simulation clustering results with PSO.
Figure 47. The simulation clustering results with MOPSO.
Figure 48. The simulation clustering results with HHCE-MOPSO.
Table 1. Function experiment results based on multi-objective PSO.
N   F1                  F2                  F3                  F4
1   0.106499808405461   0.067330182240415   0.124377116301462   0.499955546962077
2   0.049737153906336   0.086820395586198   0.152198389566321   0.498713173701964
3   0.012271391505579   0.063407876573236   0.118557185891584   0.500789090951651
4   0.059748779007947   0.079984601159047   0.181778700051687   0.500001701392795
5   0.007182350707380   0.044994628705243   0.106369496986030   0.496933444858318
6   0.028311613137651   0.074090886988584   0.275260280239408   0.498887957603687
7   0.035276749714561   0.068119108505511   0.174298597155842   0.498458132620899
8   0.018398812996123   0.059303292690439   0.164149889811481   0.498878446602397
9   0.017724730373242   0.061147877561301   0.229416799984821   0.498599468093871
10  0.087259472853428   0.110801090927873   0.268141848002487   0.500060922016868
11  0.025138533599246   0.052033696500143   0.296139928492164   0.499667735329447
12  0.015732190370921   0.074778835122926   0.210926317455325   0.500978252622088
13  0.053251800555326   0.090238137048483   0.237727894208217   0.499292970499502
14  0.006365915863766   0.032023220957725   0.263193322717373   0.501448338212031
15  0.019652879857307   0.053962701317858   0.334056520900436   0.497959864836247
16  0.048206842506823   0.063330248205566   0.295396187276877   0.500654771790587
17  0.069439335270074   0.044399734565123   0.265652503713111   0.500439821847981
18  0.013099803987706   0.057949862222682   0.244600657928145   0.499580800950720
19  0.014416740076318   0.056339654451476   0.266422324322419   0.500250196137362
20  0.012117224333988   0.052303264496226   0.305532364272770   0.499126078903273
21  0.042866757575654   0.089425849796620   0.312086205207292   0.500010829781512
22  0.014647138074965   0.058134128590683   −0.056408677384264  0.498794765432105
23  0.088041986812975   0.102084985512765   0.205532364272770   0.498975149753489
24  0.090890557660351   0.095408489660995   0.179540992049498   0.498666077341443
25  0.015508456658759   0.064955261142703   0.279540992049498   0.498220217186682
26  0.022130504550377   0.075503679089671   0.060569639705176   0.499393068016345
27  0.083906502098294   0.085684293312267   0.249392149808258   0.500079250249175
28  0.027295599615759   0.065948187669335   0.216639604236573   0.500225643158520
29  0.007686972446238   0.034829934566585   0.144233559593972   0.499288863741454
30  −0.000087959115083  −0.004312468629715  0.261281758581385   0.499781973533050
31  0.031525187898891   0.067696065331331   0.158604428509587   0.499187353748357
32  0.043495161666248   0.080722762768860   −0.008594010632467  0.498532431799683
33  0.066343902124794   0.069243776791827   0.341841583248901   0.499840866124779
34  0.102273774248894   0.095695989072027   0.252887495462082   0.499923364181269
35  0.097427741097806   0.098420780466412   0.186682617704863   0.499829479413832
36  0.110900310094521   0.085422503540809   0.336538863373088   0.500550587358501
37  0.063572037190324   0.061386120975582   0.291483641237078   0.500164235318542
38  0.163896033773297   0.088045190537832   0.278063228327557   0.497931277694565
39  0.076041876860540   0.086341096182584   0.366426024723916   0.499321870487254
40  0.036631149052383   0.030406015500888   0.343939793326859   0.500131940026728
41  0.153224698735289   0.105774572764509   0.285276499719910   0.499966015358706
42  0.036631149052383   0.030406015500888   0.287189676986890   0.499769791524706
43  0.059736827073072   0.067027874851685   0.347925259918005   0.499587528992349
44  0.022869907090576   0.062205089212256   0.387189676986890   0.499273538602641
45  0.026102441960279   0.069621787371245   0.414726237619834   0.500619836407797
46  0.004437754638153   0.020990335771239   0.484189676986890   0.498792173268238
47  0.115122646687255   0.081766041434314   0.504618724389848   0.499304770335719
48  0.074167751678719   0.071559016333562   0.404618724389848   0.499907527996732
49  0.009090076889129   0.050091362287326   0.412792112794205   0.498914307071991
50  0.057916908344494   0.067453349159717   0.406802282755005   0.499529080847996
Table 2. The performance parameter results of hybrid algorithms.
Dataset         Algorithm       Accuracy  Sensitivity  Specificity  G-mean  AUC     MCC
Lenses          K-mean          0.6200    0.5520       0.6215       0.4215  0.8525  0.8952
                K-Boosted       0.7520    0.6020       0.7010       0.6025  0.7980  0.8821
                K-SMOTE         0.8625    0.6250       0.8215       0.8520  0.8678  0.8150
                SMOTE-Boosted   0.8920    0.6525       0.8950       0.9010  0.8960  0.8320
                HH-PSO-K        0.9700    0.7020       0.9320       0.9250  0.9525  0.8980
Lymphography    K-mean          0.7150    0.6420       0.6315       0.5415  0.9025  0.9002
                K-Boosted       0.7620    0.6520       0.7216       0.6225  0.8180  0.8920
                K-SMOTE         0.8832    0.6750       0.8415       0.8820  0.8978  0.8255
                SMOTE-Boosted   0.8920    0.6820       0.9050       0.9100  0.9060  0.8610
                HH-PSO-K        0.9805    0.7120       0.9410       0.9360  0.9620  0.9080
Arrhythmia      K-mean          0.5600    0.5320       0.6015       0.5015  0.7525  0.8650
                K-Boosted       0.7120    0.6120       0.7210       0.5025  0.8080  0.8620
                K-SMOTE         0.8425    0.6040       0.8415       0.7520  0.8378  0.8050
                SMOTE-Boosted   0.8420    0.6425       0.8990       0.8080  0.8460  0.8220
                HH-PSO-K        0.9200    0.8020       0.9120       0.9150  0.9225  0.8900
Bach-Chorales   K-mean          0.8250    0.7520       0.8215       0.7200  0.8500  0.9052
                K-Boosted       0.8520    0.8080       0.8010       0.8024  0.8180  0.9021
                K-SMOTE         0.9025    0.8650       0.9215       0.8720  0.8878  0.8950
                SMOTE-Boosted   0.9120    0.8925       0.9250       0.9110  0.9060  0.8920
                HH-PSO-K        0.9800    0.9020       0.9420       0.9350  0.9625  0.9080
Connect-4       K-mean          0.7800    0.5620       0.6415       0.5415  0.8825  0.9052
                K-Boosted       0.7920    0.6120       0.7110       0.6428  0.8280  0.8925
                K-SMOTE         0.8925    0.6350       0.8455       0.8625  0.8808  0.8250
                SMOTE-Boosted   0.9020    0.6690       0.9050       0.9110  0.9065  0.8420
                HH-PSO-K        0.9800    0.7125       0.9520       0.9350  0.9625  0.9080
Covertype       K-mean          0.6400    0.5880       0.6450       0.4452  0.8700  0.9085
                K-Boosted       0.7720    0.6250       0.7250       0.6228  0.8140  0.9020
                K-SMOTE         0.8825    0.6487       0.8430       0.8720  0.8245  0.8250
                SMOTE-Boosted   0.9150    0.6789       0.9120       0.9120  0.9005  0.8520
                HH-PSO-K        0.9800    0.7283       0.9410       0.9350  0.9500  0.9250
Cylinder-Bands  K-mean          0.6700    0.6030       0.6710       0.5780  0.9000  0.9250
                K-Boosted       0.8020    0.6528       0.7523       0.6521  0.8452  0.9030
                K-SMOTE         0.9010    0.6728       0.8780       0.9010  0.9060  0.8650
                SMOTE-Boosted   0.6250    0.7055       0.9200       0.9420  0.9250  0.8920
                HH-PSO-K        0.9360    0.7560       0.9500       0.9526  0.9602  0.9360
Dermatology     K-mean          0.4200    0.5320       0.6015       0.4012  0.8325  0.8700
                K-Boosted       0.5550    0.5820       0.6701       0.5890  0.7780  0.8620
                K-SMOTE         0.7625    0.6030       0.8016       0.8368  0.8470  0.8010
                SMOTE-Boosted   0.8020    0.6328       0.8750       0.8950  0.8762  0.8120
                HH-PSO-K        0.9020    0.7020       0.9020       0.9010  0.9325  0.8750
Diabetes        K-mean          0.6800    0.5960       0.6815       0.4815  0.9025  0.9450
                K-Boosted       0.8125    0.6720       0.7615       0.6620  0.9380  0.9320
                K-SMOTE         0.9225    0.6854       0.8915       0.9120  0.9278  0.8850
                SMOTE-Boosted   0.9320    0.7025       0.9450       0.9510  0.9365  0.8920
                HH-PSO-K        0.9800    0.7820       0.9520       0.9750  0.9825  0.9580
Hayes-Roth      K-mean          0.7000    0.6320       0.7015       0.6015  0.7825  0.9552
                K-Boosted       0.8325    0.6824       0.7800       0.6828  0.8585  0.9021
                K-SMOTE         0.9225    0.6858       0.9015       0.9120  0.9178  0.8950
                SMOTE-Boosted   0.9620    0.7325       0.9750       0.9710  0.9460  0.8910
                HH-PSO-K        0.9805    0.7820       0.9790       0.9750  0.9520  0.9785