A Novel Semi-Supervised Fuzzy C-Means Clustering Algorithm Using Multiple Fuzzification Coefficients

Abstract: Clustering is an unsupervised machine learning technique with many practical applications that has gathered extensive research interest. It divides data elements into clusters such that elements in the same cluster are similar. Because clustering is unsupervised, no information about the labels of the elements is given. However, when some knowledge about the data points is available in advance, it is beneficial to use a semi-supervised algorithm. Among the many clustering techniques available, fuzzy C-means clustering (FCM) is a common one. To make the FCM algorithm a semi-supervised method, it was proposed in the literature to use an auxiliary matrix that adjusts the membership grades of the elements, forcing them into certain clusters during the computation. In this study, instead of using an auxiliary matrix, we propose using multiple fuzzification coefficients to implement the semi-supervision component. After deriving the proposed semi-supervised fuzzy C-means clustering algorithm with multiple fuzzification coefficients (sSMC-FCM), we demonstrate the convergence of the algorithm and validate the efficiency of the method through a numerical example.


Introduction
Data clustering is a method that divides the elements of a data set into clusters such that data elements in the same cluster have similar properties, and data elements in different clusters have different properties [1,2]. Data clustering is an important pre-processing step that produces initial knowledge, supporting decision-making in subsequent processing steps. Among current clustering methods, the fuzzy C-means clustering (FCM) algorithm gives relatively good results, taking advantage of the flexible nature of fuzzy logic [3]. In FCM, a data element can flexibly belong to a cluster, characterized by the membership grade of the element in the cluster [4]. Specifically, the membership grade U_ik of element i belonging to cluster k has a value in the range from 0 to 1, and the larger the U_ik value, the more likely element i belongs to cluster k [5,6].
Clustering belongs to the class of unsupervised learning methods, in which there is no given information about the labels of the data elements, unlike classification methods. However, there are also cases in which knowledge about certain data points is known in advance; clustering then becomes semi-supervised fuzzy clustering. A semi-supervised clustering problem often has two goals: (i) clustering and labeling the data, and (ii) improving the quality of clustering based on the existing knowledge about the data [7-11]. The first goal is clustering itself, as assigning cluster labels to the data remains the primary objective. With semi-supervision, the structure of the clusters and the cluster centers must still be clearly distinguished. The second goal is to improve the clustering quality based on the existing knowledge; the clustering quality can be evaluated from the clustering labels. Of the two objectives, if more attention goes toward label assignment, then more supervisory knowledge needs to be introduced, and the clustering results are expected to be better. The disadvantage of this approach is that certain measures of "similarity" may not improve, but this is acceptable when there are data points that are "shaped" by the knowledge; the ability to update the cluster centers is thereby also improved [12]. Some knowledge about the data can be considered relatively accurate: it may be known in advance, to a degree, whether a data element should or should not belong to a certain cluster. Such factors depend on the related expertise of the person conducting the clustering [13].
Yasunori et al. [14] proposed improving the FCM algorithm by adding an auxiliary supervisory matrix, representing the supervised membership grades, into the objective function. Although the authors did not outline a specific method to determine the values of the supervisory matrix, the study made it clear that semi-supervised fuzzy clustering is completely feasible. The authors called this algorithm semi-supervised standard fuzzy C-means clustering (sSFCM). The algorithm is mainly based on the approach that a supervised data point must belong to a certain cluster; its membership grade in that cluster must then be larger than its grades in the other, unsupervised clusters.
The FCM method uses an exponential parameter m, also known as the fuzzification coefficient, in the objective function, which adjusts the membership grade U_ik of element i belonging to cluster k. In the standard FCM algorithm, the value of parameter m is selected from the beginning, for example, m = 2. In recent years, studies have extended the selection of parameter m to an interval [m_1, m_2], or to a fuzzy value that, after type reduction, essentially amounts to selecting different values m ∈ [m_1, m_2] in each iteration [15]. The extension to multiple fuzzification coefficients m, instead of only one value, in FCM was presented in [16]. When each data point is assigned an appropriate fuzzification coefficient m based on its density relative to the surrounding points, the clustering quality improves.
This study builds on the concept of utilizing multiple fuzzification coefficients in semi-supervised fuzzy clustering. When placing a supervised element i into cluster k, an appropriate fuzzification coefficient m_ik can be selected that differs from the other fuzzification coefficients. This affects the calculation of U_ik, which increases; at the same time, the change in U_ik^{m_ik} also affects the determination of the center of cluster k. Similarly, when preventing a supervised element i from being in cluster k, its parameter m_ik will differ from the other m_ij values. Adjusting the fuzzification coefficients plays a role similar to that of the auxiliary matrix in other semi-supervised clustering methods, and it must also be done so as to ensure the convergence of the clustering algorithm. The contribution of this study is the proposal of a novel semi-supervised FCM algorithm using multiple fuzzification coefficients (sSMC-FCM), as well as a method to determine the appropriate fuzzification coefficient values for semi-supervision.
Section 2 of the paper outlines the FCM algorithm and the semi-supervised fuzzy clustering algorithm using auxiliary matrices. Section 3 presents the novel sSMC-FCM algorithm. Section 4 shows a numerical example using the proposed method and the subsequent results, while Section 5 presents concluding remarks.

Preliminaries

2.1. Standard Fuzzy C-Means Clustering (FCM) Algorithm
The standard FCM algorithm attempts to divide a finite number of N data elements X = {X_1, X_2, . . ., X_N} into C clusters based on some given criteria. Each element X_i ∈ X, i = 1, 2, . . ., N, is a vector with D dimensions. The elements in X are divided into C clusters with cluster centers V_1, V_2, . . ., V_C in the centroid set V.
In the FCM algorithm, U is a matrix that represents the membership of each element in each cluster. Matrix U has the following characteristics:

• U_ik is the membership grade of an element X_i in cluster k with center V_k, where 0 ≤ U_ik ≤ 1 and ∑_{k=1}^{C} U_ik = 1 for each i. The larger U_ik is, the more element X_i belongs to cluster k.
An objective function is defined such that the clustering algorithm must minimize the objective function (1):

J(U, V) = ∑_{i=1}^{N} ∑_{k=1}^{C} U_ik^m · D_ik^2,   (1)

where D_ik^2 = ||X_i − V_k||^2 is the squared distance between the two vectors X_i and V_k, and m is the fuzzification coefficient of the algorithm.
Summary of steps for the standard FCM algorithm:
Input: the dataset X = {X_1, X_2, . . ., X_N}.
Output: the partition of X into C clusters.

• Step 1: Initialize the cluster centers V_k, k = 1, . . ., C (for example, randomly), and set l = 0.
• Step 2: At the l-th loop, update U according to the formula:

U_ik = 1 / ∑_{j=1}^{C} (D_ik / D_ij)^{2/(m − 1)}.   (2)

• Step 3: Update the cluster centers according to:

V_k = ∑_{i=1}^{N} U_ik^m X_i / ∑_{i=1}^{N} U_ik^m.

• Step 4: If the change in U between two consecutive loops is smaller than a given threshold ε, stop; otherwise, set l = l + 1 and return to Step 2.
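The steps above can be sketched in a short, self-contained Python implementation. This is a minimal illustration, not the authors' code; the deterministic farthest-point initialization in Step 1 is our own choice:

```python
import math

def fcm(X, C, m=2.0, eps=1e-6, max_iter=300):
    """Standard FCM sketch: partitions the points in X into C fuzzy clusters."""
    N, D = len(X), len(X[0])
    # Step 1: deterministic, spread-out initialization of the cluster centers
    V = [list(X[0])]
    while len(V) < C:
        far = max(range(N), key=lambda i: min(math.dist(X[i], v) for v in V))
        V.append(list(X[far]))
    U = [[0.0] * C for _ in range(N)]
    for _ in range(max_iter):
        # Step 2: U_ik = 1 / sum_j (D_ik / D_ij)^(2/(m-1))   (Equation (2))
        for i in range(N):
            d = [math.dist(X[i], V[k]) for k in range(C)]
            for k in range(C):
                if d[k] == 0.0:          # X_i coincides with a center
                    for j in range(C):
                        U[i][j] = 1.0 if j == k else 0.0
                    break
            else:
                for k in range(C):
                    U[i][k] = 1.0 / sum((d[k] / d[j]) ** (2.0 / (m - 1.0))
                                        for j in range(C))
        # Step 3: V_k = sum_i U_ik^m X_i / sum_i U_ik^m
        newV = []
        for k in range(C):
            w = [U[i][k] ** m for i in range(N)]
            s = sum(w)
            newV.append([sum(w[i] * X[i][t] for i in range(N)) / s
                         for t in range(D)])
        # Step 4: stop when the centers no longer move
        shift = max(math.dist(V[k], newV[k]) for k in range(C))
        V = newV
        if shift < eps:
            break
    return U, V
```

On two well-separated groups of points, the returned memberships concentrate each group into its own cluster.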

2.2. Semi-Supervised Standard Fuzzy C-Means Clustering (sSFCM) Algorithm
The semi-supervised fuzzy clustering method in [14] added a supervisory matrix Ū that represents the supervision of an element forced to belong, or not to belong, to a cluster: Ū_ik > 0 when the supervised element i is placed into cluster k, and Ū_ik = 0 when there is no supervision. This additional supervised membership grade Ū_ik was added to the objective function to be minimized, shown as follows:

J(U, V) = ∑_{i=1}^{N} ∑_{k=1}^{C} |U_ik − Ū_ik|^m · D_ik^2,

where 0 ≤ Ū_ik ≤ 1 and ∑_{k=1}^{C} Ū_ik ≤ 1 for each i. The sSFCM algorithm, shown below, works to minimize J(U, V) through many iterations.
Summary of steps for the sSFCM algorithm:
Input: the dataset X = {X_1, X_2, . . ., X_N}, and the supervised membership grades Ū.
Output: the partition of X into C clusters.

• Step 1: Initialize the cluster centers V_k, k = 1, . . ., C, and set l = 0.
• Step 2: At the l-th loop, update U according to the formula:

U_ik = Ū_ik + (1 − ∑_{j=1}^{C} Ū_ij) · [1 / ∑_{j=1}^{C} (D_ik / D_ij)^{2/(m − 1)}].   (5)

The remaining steps, updating V and checking convergence, are the same as in the standard FCM algorithm.
A numerical example of the sSFCM algorithm [14]: a data set with 20 elements X = {X_1, X_2, . . ., X_20}, X_i ∈ R^2, as seen in Table 1, is to be divided into two clusters. The sSFCM algorithm was implemented with m = 2 for three cases:
Case 1: unsupervised, with no supervision applied;
Case 2: semi-supervised, attempting to place data points 9 and 10 into cluster 1, with Ū_9,1 = Ū_10,1 = 0.3;
Case 3: semi-supervised, attempting to place data points 9 and 10 into cluster 1, with larger supervised membership grades.
After clustering, we obtain the matrices U for all three cases, as shown in Table 2. The clustering results indicate that sample points 9 and 10 belong to cluster 2 in Cases 1 and 2, and to cluster 1 in Case 3, showing the impact of the added semi-supervised component in the sSFCM algorithm through the supervised membership grades Ū. Although in Case 2 the attempt to place points 9 and 10 into cluster 1 was not successful, the points moved much closer to cluster 1 compared to Case 1.
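For the |U_ik − Ū_ik|^m objective, the membership update has a well-known closed form in the semi-supervised FCM literature: the supervised grade plus the remaining membership mass distributed by the standard FCM rule. Below is a minimal sketch of one such update step, assuming this standard form (the exact update, Equation (5), is the one given in [14]); the name `Ubar` holds the supervised grades, with an all-zero row meaning no supervision:

```python
import math

def ssfcm_update_U(X, V, Ubar, m=2.0):
    """One sSFCM membership update:
    U_ik = Ubar_ik + (1 - sum_j Ubar_ij) * (standard FCM term)."""
    N, C = len(X), len(V)
    U = [[0.0] * C for _ in range(N)]
    for i in range(N):
        # clamp distances away from zero to avoid division by zero
        d = [max(math.dist(X[i], V[k]), 1e-12) for k in range(C)]
        free = 1.0 - sum(Ubar[i])   # membership mass left after supervision
        for k in range(C):
            fcm_term = 1.0 / sum((d[k] / d[j]) ** (2.0 / (m - 1.0))
                                 for j in range(C))
            U[i][k] = Ubar[i][k] + free * fcm_term
    return U
```

Each updated row still sums to 1, and a supervised grade Ū_ik acts as a floor on U_ik, which is exactly the "forcing" effect observed for points 9 and 10.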
In this study, instead of using the matrix Ū, we propose the use of multiple fuzzification coefficients as the means to turn the originally unsupervised FCM algorithm into a semi-supervised one.

Derivation of the Proposed sSMC-FCM Algorithm
In this sub-section, we show the derivation of the proposed sSMC-FCM algorithm. Each membership grade U_ik can receive a different fuzzification coefficient m_ik. The supervision of element i to belong to cluster k is performed by adjusting m_ik. The coefficients m_ik are assigned as follows, with exponential parameters M and M′, where M′ > M > 1:

• m_ij = M for all 1 ≤ j ≤ C, for unsupervised elements i;
• m_ik = M′, and m_ij = M for all j ≠ k, for elements i supervised to belong to cluster k.

The approach to determine the exponential parameters M and M′ is discussed in Section 3.2.
The objective function to be minimized in the sSMC-FCM algorithm is formulated as follows, where X = {X_1, X_2, . . ., X_N} is the dataset, N is the number of elements, C is the number of clusters, U_ik is the membership grade of element X_i in cluster k with center V_k, and Y ⊂ {1, 2, . . ., N} × {1, 2, . . ., C} is the set of supervised (element, cluster) pairs that determines the coefficients m_ik from M and M′:

J(U, V) = ∑_{i=1}^{N} ∑_{k=1}^{C} U_ik^{m_ik} · D_ik^2 → min, subject to ∑_{k=1}^{C} U_ik = 1 for all i.   (7)

To solve the optimization problem shown in Equation (7), we utilize the Lagrange multiplier method. Let

L(U, V, λ) = ∑_{i=1}^{N} ∑_{k=1}^{C} U_ik^{m_ik} D_ik^2 + ∑_{i=1}^{N} λ_i (1 − ∑_{k=1}^{C} U_ik).

Setting the derivative of L with respect to V_k to zero yields the cluster center update:

V_k = ∑_{i=1}^{N} U_ik^{m_ik} X_i / ∑_{i=1}^{N} U_ik^{m_ik}.

Next, we calculate U_ik using ∂L/∂U_ik = 0, which implies the following:

m_ik U_ik^{m_ik − 1} D_ik^2 = λ_i, and hence U_ik = (λ_i / (m_ik D_ik^2))^{1/(m_ik − 1)}.

Combining with ∑_{j=1}^{C} U_ij = 1, we consider the two following cases.

Case 1: For unsupervised elements i, m_ij = M for all 1 ≤ j ≤ C. Eliminating λ_i through the normalization constraint, we obtain the standard FCM update:

U_ik = 1 / ∑_{j=1}^{C} (D_ik / D_ij)^{2/(M − 1)}.   (15)

Case 2: For elements i supervised to belong to cluster k, m_ik = M′ and m_ij = M for j ≠ k. Combining the stationarity conditions with ∑_{j=1}^{C} U_ij = 1, to calculate U_ij we need to solve the following system:

U_ij = (λ_i / (m_ij D_ij^2))^{1/(m_ij − 1)}, j = 1, . . ., C, with ∑_{j=1}^{C} U_ij = 1.   (16)

The steps to solve Equation (16) are shown through Equations (17)-(20) below. To make the presentation of the derivation seamless, we first give the calculation formulas; the proof that this solution solves Equation (16) will be presented in Section 3.2 with Proposition 1.
Specifically, we first calculate d_min = min_{j=1,...,C} D_ij; from d_min and the distances D_ij, the auxiliary quantities µ_ij of Equations (17) and (18) are calculated; the variable µ_ik is then obtained by solving Equation (19); and finally the membership grades U_ij are obtained from the µ_ij values through Equation (20). From there, we have the sSMC-FCM algorithm as follows.
Summary of steps for the sSMC-FCM algorithm:
Input: the dataset X = {X_1, X_2, . . ., X_N}, the set of supervised pairs Y, and the parameters M and M′.
Output: the partition of X into C clusters.

• Step 1: Initialize the cluster centers V_k, k = 1, . . ., C, and set l = 0.
• Step 2: At the l-th loop, update U according to Equation (15) for unsupervised elements, or according to Equations (17)-(20) for supervised elements. The remaining steps, updating V and checking convergence, are the same as in the standard FCM algorithm.
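Equations (17)-(20) give the closed-form procedure for a supervised row of U. As an illustration, the same row can be obtained directly from the stationarity condition m_ij · U_ij^{m_ij − 1} · D_ij^2 = λ_i together with ∑_j U_ij = 1, by bisection on the multiplier λ_i, since each U_ij is increasing in λ_i. This is a numerical sketch under our own naming, not the paper's Equations (17)-(20), and it assumes all distances D_ij are positive:

```python
def supervised_row(d, k, M=2.0, Mprime=8.0, tol=1e-10):
    """Membership grades U_ij for an element supervised into cluster k.

    d[j] are the (positive) distances D_ij; m_ik = Mprime, m_ij = M otherwise.
    Solves m_ij * U_ij**(m_ij - 1) * d[j]**2 = lam with sum_j U_ij = 1 by
    bisection on the Lagrange multiplier lam (the row sum is increasing in lam).
    """
    C = len(d)
    m = [Mprime if j == k else M for j in range(C)]

    def row(lam):
        return [(lam / (m[j] * d[j] ** 2)) ** (1.0 / (m[j] - 1.0))
                for j in range(C)]

    lo, hi = 0.0, 1.0
    while sum(row(hi)) < 1.0:   # grow the bracket until the row over-fills
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sum(row(mid)) < 1.0:
            lo = mid
        else:
            hi = mid
    return row(0.5 * (lo + hi))
```

With Mprime equal to M this reduces to the standard FCM grades, and raising Mprime raises the supervised grade, which is the behavior established in Section 3.2.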

3.2. Determination of the Fuzzification Coefficients for Supervised Elements
In the proposed sSMC-FCM algorithm, when an element i is to be supervised, U_ij is calculated according to Equations (17)-(20) in Step 2 of the algorithm. In this sub-section, we discuss these formulas in more detail, as well as how to determine the exponential parameter M′ for the supervised elements.

Proposition 1. U_ij calculated from Equations (17)-(20) satisfies Equation (16).
Proof of Proposition 1. From Equation (20), taking the sum of the U_ij, we can see that they satisfy the condition ∑_{j=1}^{C} U_ij = 1 in Equation (16). □

We can hence apply Equations (17)-(20) in Step 2 of the proposed algorithm.
Next, we investigate how to determine the exponential parameter M′. We will utilize Proposition 2 after proving it.

Proposition 2. The function f(x) = (1/(ax))^{1/(x − 1)}, with a ≥ 1, is an increasing function when x > 1.
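Reading the formula in Proposition 2 as f(x) = (1/(ax))^{1/(x − 1)}, the monotonicity claim can be checked numerically; the grid and the values of a below are our own choices for illustration:

```python
def f(x, a):
    # f(x) = (1 / (a * x)) ** (1 / (x - 1)), defined for x > 1, a >= 1
    return (1.0 / (a * x)) ** (1.0 / (x - 1.0))

# sample f on a grid for several a >= 1 and confirm it never decreases
for a in (1.0, 2.0, 10.0):
    xs = [1.1 + 0.1 * i for i in range(200)]   # x in (1, 21]
    ys = [f(x, a) for x in xs]
    assert all(y2 >= y1 for y1, y2 in zip(ys, ys[1:]))
```

The check passes for every sampled a, consistent with the proposition.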
Applying Proposition 2, with x = M′, to the formula used to calculate the membership grade U_ik of the supervised element, we see that this grade is an increasing function of M′. Therefore, as M′ increases, the membership grade U_ik increases. Setting the supervision for an element corresponds to selecting a larger fuzzification coefficient M′ for that element compared to the other fuzzification coefficients.
Then, Proposition 3 is utilized to guide the determination of parameter M′.
Proposition 3. Let U′_ik be the membership grade of element X_i in its semi-supervised placement into cluster k according to Equations (17)-(20), with a given parameter M and a to-be-determined parameter M′. Let U_ik be the membership grade of element X_i in its unsupervised placement into cluster k according to Equation (15). With α ∈ (0, 1), we have U′_ik ≥ α if the condition in Equation (23) is satisfied.

Proof of Proposition 3. From Equation (15), we have the expression for the unsupervised membership grade U_ik. Substituting it into Equation (23), we obtain the equivalent inequality (24). Next, from Equation (20), we have the expression for the supervised membership grade U′_ik. Since M′ > M, combining this with the above, inequality (24) reduces to the condition (25). Since µ_ik, the solution of Equation (19), satisfies µ_ik > 0, the condition can be simplified accordingly. Therefore, if Equation (23) is satisfied, then Equation (25) is also satisfied, and hence U′_ik ≥ α. □
From Proposition 3, Equation (23) provides an approach to determine parameter M′. The right-hand side of (23) is a value that can be calculated knowing M, α, and the distances, while the left-hand side of (23) is a decreasing function as M′ increases. Therefore, we can start at M′ = M, then increase the value of M′ and check at each step, until Equation (23) is satisfied, to solve for M′.
Example: Given the data set from Table 1, element X_9 has U_9,1 = 0.189 with M = 2. If we want to apply supervision such that U_9,1 ≥ 0.5 to put X_9 into cluster 1, then from Equation (20), we need to determine M′ such that the condition 0.5^{M′−1} · M′ ≤ 0.233 is satisfied. For M′ = 2, 0.5^{M′−1} · M′ = 1, which does not satisfy the condition. We therefore increase the value of M′ to decrease the value of 0.5^{M′−1} · M′. Through several iterations, we obtain M′ = 5.582, for which 0.5^{M′−1} · M′ = 0.233; this satisfies the condition and effectively places X_9 into cluster 1.
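The search just described can be sketched as follows, reproducing the worked example under the assumption that the condition takes the form 0.5^{M′−1} · M′ ≤ 0.233 as stated above; the function `find_Mprime` and its bisection refinement are our own illustration:

```python
def lhs(Mp, alpha=0.5):
    # left-hand side of the condition in the example: alpha**(M'-1) * M'
    return alpha ** (Mp - 1.0) * Mp

def find_Mprime(target, M=2.0, alpha=0.5, tol=1e-4):
    """Grow M' from M until alpha**(M'-1) * M' <= target, then refine.

    For alpha = 0.5 the left-hand side is decreasing in M' on M' >= 2,
    so doubling brackets the threshold and bisection pins it down.
    """
    lo, hi = M, 2.0 * M
    while lhs(hi, alpha) > target:
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if lhs(mid, alpha) > target:
            lo = mid
        else:
            hi = mid
    return hi
```

Running `find_Mprime(0.233)` recovers a value close to the M′ = 5.582 reported in the example.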
A point that can be further discussed is that, in Step 2 of the sSMC-FCM algorithm, it is necessary to solve Equation (19) to obtain µ_ik. In Equation (19), the left-hand side is an increasing function of µ_ik, while the right-hand side has a value in the range [0, 1]. Therefore, we can use an approximation method to obtain µ_ik, starting from µ_ik = 0 and then increasing it gradually to determine the solution.
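This approximation can be sketched as a bracket-and-bisect search, using the form x/(x + A)^b of the left-hand side of Equation (19) noted in the text, and assuming A > 0 and 0 < b < 1; the names here are ours:

```python
def solve_mu(A, b, t, tol=1e-10):
    """Solve f(x) = x / (x + A)**b = t for x >= 0, with A > 0 and 0 < b < 1.

    f(0) = 0 and f is increasing (cf. the derivative of x / (x + A)**b
    given in the text), and f grows without bound since 1 - b > 0, so a
    doubling bracket followed by bisection finds the unique solution.
    """
    f = lambda x: x / (x + A) ** b
    hi = 1.0
    while f(hi) < t:
        hi *= 2.0
    lo = 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < t:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For b = 1/2 the equation x/√(x + A) = t is quadratic, which gives a convenient hand-check of the solver.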

Numerical Examples
To evaluate the proposed algorithm, we use the same numerical example shown in Section 2.2: a data set with 20 elements X = {X_1, X_2, . . ., X_20}, X_i ∈ R^2, as seen in Table 1, is to be divided into two clusters. The sSMC-FCM algorithm was implemented with M = 2 for the following three cases:
Case 1: unsupervised, with no supervision applied;
Case 2: semi-supervised, attempting to place data points 9 and 10 into cluster 1, with M′ = 4;
Case 3: semi-supervised, attempting to place data points 9 and 10 into cluster 1, with M′ = 8.
The Euclidean distance was used in the calculations and results in this work. Table 3 shows the membership matrices U for each case as the results of clustering. From the results in Table 3, we have the following observations:

• As M′ increases, the membership grades of the supervised elements increase. Initially, with no supervision, data points 9 and 10 were placed into cluster 2. With supervision and M′ = 4, their membership grades increased, but not enough to move them into cluster 1, while with M′ = 8, these two data points were successfully placed into cluster 1;

• Compared to the sSFCM algorithm, the matrices U in the case of M′ = 4, M = 2 using the sSMC-FCM algorithm, shown in Table 3, are similar to the corresponding matrices U in the case of Ū_9,1 = Ū_10,1 = 0.3 using the sSFCM algorithm, shown in Table 2. Both cases increased the membership grades of the points toward cluster 1 but were not successful in moving the points into cluster 1;

• In future works, it is possible to expand to other representations of the fuzzification coefficients, combining them with hedge algebra [17-19] in new representations and calculations.

Conclusions
In this study, we developed a novel clustering algorithm, called sSMC-FCM, based on the standard FCM algorithm, adding semi-supervision through the use of multiple fuzzification coefficients, also known as exponential parameters. In the sSMC-FCM algorithm, we allow the data elements to have different fuzzification coefficients, instead of a single one as in the standard FCM method. The expansion from hard clustering to fuzzy clustering involved the addition of fuzzification coefficients; hence, the determination of the fuzzification coefficient values is an interesting topic for further research. In this study, we derived the novel sSMC-FCM algorithm, proved three propositions to show the convergence of the algorithm and to explain how to determine the fuzzification coefficients, and demonstrated the efficiency of the algorithm using a numerical example. The proposed algorithm adds supervision to the normally unsupervised FCM clustering algorithm. This method can be applied to practical problems such as remote sensing image segmentation: when an image region is known with certainty to be of a certain type, such as a lake, image noise and the unsupervised approach may cause its attribute features to be perceived as another type, such as clouds, leading to undesirable results. With the proposed semi-supervised method, the image region can be placed into the correct, known type as desired.
A note on Equation (19): its right-hand side has a value in the range [0, 1]. Set A = ∑_{j=1, j≠k}^{C} µ_ij and b = (M′ − M)/(M′ − 1), with b < 1; the left-hand side of the equation then has the form f(x) = x/(x + A)^b, which has the derivative f′(x) = [(x + A)^b − x b (x + A)^{b−1}]/(x + A)^{2b} = (x + A)^{b−1} [(x + A) − b x]/(x + A)^{2b} > 0 for x > 0, since b < 1. Hence, f is an increasing function, as used in Section 3.2.

Table 2. The resulting membership matrices U for the 3 cases using the sSFCM algorithm.

Table 3. The resulting membership matrices U for the 3 cases using the sSMC-FCM algorithm.