Kernel Clustering with a Differential Harmony Search Algorithm for Scheme Classification

This paper presents a kernel fuzzy clustering method with a novel differential harmony search algorithm for diversion scheduling scheme classification. First, we employ a self-adaptive solution generation strategy and a differential evolution-based population update strategy to improve the classical harmony search. Second, we apply the differential harmony search algorithm to kernel fuzzy clustering to help the clustering method obtain better solutions. Finally, the combination of kernel fuzzy clustering and differential harmony search is applied to water diversion scheduling in East Lake. A comparison of the proposed method with other methods has been carried out. The results show that kernel clustering with the differential harmony search algorithm performs well on water diversion scheduling problems.


Introduction
Metaheuristics are designed to find, generate, or select a heuristic that provides a good solution to an optimization problem [1]. By searching over a large set of feasible solutions, metaheuristics can often find good solutions with less computational effort [2]. Metaheuristics have been used widely in artificial intelligence applications, e.g., transit network design problems [3], sewer pipe networks [4], water distribution systems [5], sizing optimization of truss structures [6], ordinary differential equations [7], and so forth. With metaheuristic algorithms, even complex problems become tractable [7], and metaheuristics are good at finding near-optimal solutions to numerical real-valued problems [8].
Harmony search (HS) is a phenomenon-mimicking algorithm proposed by Geem in 2001 [9]. It is a relatively recent metaheuristic inspired by musical performance. In the HS algorithm, a new solution is generated by pitch adjustment and random selection. HS can deal with discontinuous as well as continuous optimization problems. Compared with other artificial intelligence algorithms such as the genetic algorithm (GA) and its variants, the HS algorithm requires fewer parameters, and these parameters are easy to set. Moreover, HS can overcome the drawback of GA's building-block theory. These advantages have attracted much interest in recent years, and HS has been widely applied in many fields, such as email classification [10], single-machine scheduling problems [11], and so on.
Although HS is good at identifying the high-performance regions of the solution space in a reasonable time, it struggles to perform local searches for numerical applications [12]. To improve its optimization ability, different variants of HS have been developed. Mahdavi et al. presented an improved harmony search algorithm (IHS) that changes the parameters dynamically with the generation number [12]. The IHS algorithm presents a parameter-adjustment strategy which improves the performance of the HS algorithm. However, the user needs to specify the new parameters, which are not easy to set, and IHS still performs poorly in local search for some numerical applications. Other modifications, such as the global-best harmony search algorithm (GHS) [13] and the chaotic harmony search algorithm (CHS) [14], have shown better performance than classical HS, but they still have their own shortcomings. The GHS algorithm generates new harmonies using variables from the best harmony. However, the algorithm cannot be adopted when the variables have different value ranges, which limits its scope of application. The CHS generates new solutions following a chaotic map, but simulations show that CHS still suffers from local optima when dealing with some numerical applications.
Fuzzy clustering is a class of algorithms for cluster analysis which determine the affinities of samples using mathematical methods. It divides samples into classes or clusters so that items in the same class are as similar as possible, while items in different classes are as dissimilar as possible. It is a useful technique for analyzing statistical data in multi-dimensional space.
Fuzzy clustering has gained significant interest for its ability to classify samples that are multi-attributed and difficult to analyze. In recent years, there have been a large number of studies on fuzzy clustering. Fuzzy c-means clustering (FCM) [14] and fuzzy k-means clustering (FKM) [15] are widely used to group similar data into clusters. The kernel fuzzy clustering algorithm (KFC) [16] and the weighted fuzzy kernel-clustering algorithm (WFKCA) [17] are also effective approaches to cluster analysis. Research has shown that WFKCA has good convergence properties and that the prototypes obtained can be well represented in the original space. However, these clustering methods share the same problem: the iterative solution is not the optimal solution. To overcome this drawback, in this paper we combine KFC with the HS algorithm to help KFC perform better.
Although the modifications of HS perform better than classical HS, their performance still needs improvement. In this paper, we propose a new differential harmony search algorithm (DHS) and apply it to kernel fuzzy clustering. A comparison of the proposed method with other methods has been carried out. Finally, the proposed method is applied to water diversion scheduling assessment in East Lake, a new study in the East Lake Ecological Water Network Project. The water diversion scheduling diverts water from the Yangtze River to the East Lake Network, aiming to improve water quality in the sub-lakes. Using a hydrodynamic simulation model and a water quality model, the quality of the lakes at the end of the water diversion can be simulated. To obtain a better improvement in water quality and reduce the economic cost, the water diversion scheme must be carefully designed. The diversion scheduling in the East Lake Network is a multi-objective problem; however, multi-objective evolutionary algorithms cannot be adopted because the water quality simulation is time-consuming. Thus, we designed some typical schemes in the feasible range and obtained the scheme results with the water quality model. The purpose of the kernel clustering with differential harmony search method is to classify the results in order to identify the schemes which perform better than others.
This paper is organized as follows: Section 2 presents an overview of the harmony search algorithm and kernel fuzzy clustering; Section 3 describes the modification and the combination of kernel fuzzy clustering and the differential harmony search algorithm; Section 4 discusses the computational results; and Section 5 provides the summary of this paper.

Harmony Search Algorithm
Harmony search (HS), proposed by Geem in 2001 [9], is a phenomenon-mimicking algorithm inspired by the improvisation process of musicians. The method is a population-based evolutionary algorithm, and it is a simple and effective way to find a result which optimizes an objective function. Parameters of the harmony search algorithm usually include the harmony memory size (hms), harmony memory considering rate (hmcr), pitch adjustment rate (par), and fret width (fw).
The main steps of classical harmony search include memory consideration, pitch adjustment, and randomization. Details of HS are explained as follows:

Step 1 Initialize algorithm parameters. This step specifies the HS algorithm parameters, including hms, hmcr, par, and fw.
Step 2 Initialize the harmony memory. In HS, each solution is called a harmony. The harmony memory (HM) is the equivalent of the population in other population-based algorithms, and it stores all solution vectors. In this step, hms random vectors are generated following Equation (1):

x_i^j = LB_i + U(0, 1) × (UB_i − LB_i), i = 1, 2, . . ., D; j = 1, 2, . . ., hms, (1)

where LB_i and UB_i are the lower and upper bounds of the ith variable, D is the dimension of the problem, and U(0, 1) is a uniformly-distributed random number between 0 and 1.
Step 3 Improvise a new harmony. A new harmony x' = (x'_1, x'_2, . . ., x'_D) is generated based on three rules: memory consideration, pitch adjustment, and random selection. The procedure is shown in Algorithm 1.

Algorithm 1 Improvisation of a New Harmony
For i = 1 to D do
    If U(0, 1) ≤ hmcr then
        x'_i = x_i^j, where j is a random integer from (1, 2, . . ., hms)  (memory consideration)
        If U(0, 1) ≤ par then
            x'_i = x'_i + U(−1, 1) × fw  (pitch adjustment)
        End if
    Else
        x'_i = LB_i + U(0, 1) × (UB_i − LB_i)  (random selection)
    End if
End for

Step 4 Update harmony memory. If the new harmony generated in Step 3 is better than the worst harmony stored in HM, replace the worst harmony with the new harmony.
Step 5 Check the termination criteria. If the maximum number of improvisations (NI) is reached, stop the algorithm. Otherwise, continue improvising by repeating Steps 3 and 4.
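For illustration, Steps 1 to 5 can be sketched in Python as follows. This is a minimal sketch for box-constrained minimization, not the authors' implementation; the function and parameter names are chosen here for readability.

```python
import random

def harmony_search(f, lb, ub, hms=30, hmcr=0.9, par=0.3, fw=0.01, ni=10000):
    """Minimize f over the box [lb, ub] with the classical HS scheme."""
    d = len(lb)
    # Step 2: fill the harmony memory with random vectors (Equation (1))
    hm = [[lb[i] + random.random() * (ub[i] - lb[i]) for i in range(d)]
          for _ in range(hms)]
    fit = [f(x) for x in hm]
    for _ in range(ni):
        # Step 3: improvise a new harmony (Algorithm 1)
        new = []
        for i in range(d):
            if random.random() <= hmcr:
                xi = hm[random.randrange(hms)][i]          # memory consideration
                if random.random() <= par:
                    xi += fw * (2 * random.random() - 1)   # pitch adjustment
                    xi = min(max(xi, lb[i]), ub[i])
            else:
                xi = lb[i] + random.random() * (ub[i] - lb[i])  # random selection
            new.append(xi)
        # Step 4: replace the worst harmony if the new one is better
        worst = max(range(hms), key=lambda j: fit[j])
        fn = f(new)
        if fn < fit[worst]:
            hm[worst], fit[worst] = new, fn
    best = min(range(hms), key=lambda j: fit[j])
    return hm[best], fit[best]
```

For example, minimizing the 2-dimensional Sphere function with this sketch drives the objective value close to zero within a few thousand improvisations.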

Kernel Fuzzy Clustering
In the past decade, several studies on the kernel fuzzy clustering method (KFC) have been conducted [18,19]. A kernel method uses kernel functions to map the original sample data to a higher-dimensional space without ever computing the coordinates of the data in that space. The kernel method takes advantage of the fact that dot products in the kernel space can be expressed by a Mercer kernel K, given by K(x, y) ≡ Φ(x)^T Φ(y), where x, y ∈ R^d [20].
Suppose the sample dataset is given as X = {X_1, X_2, . . ., X_n}, and each sample has K attributes, X_i = {X_i1, X_i2, . . ., X_iK}. First, the original sample data are normalized following Equation (2). The purpose of kernel fuzzy clustering is to minimize the objective function in Equation (3), subject to the constraints in Equation (4), where C is the number of clusters, N is the number of samples, K is the number of features, T is the feature-weight vector, and m is the fuzzification coefficient, m > 1.
According to the relevant literature [17,20,21], Equation (3) can be simplified, where K(x, x') is a Mercer kernel function. In this paper, the Gaussian radial basis function (RBF) kernel is adopted: K(x, y) = exp(−‖x − y‖² / σ²). The cluster centers and the membership matrix are then obtained from the resulting update equations.
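As a concrete sketch, the RBF kernel and a membership computation can be written as follows. This assumes the standard kernel fuzzy c-means form, in which the kernel-induced squared distance is 2(1 − K(x, v)) for the RBF kernel; the feature weights of the weighted variant are omitted, so this is an illustration rather than a verbatim transcription of Equation (10).

```python
import numpy as np

def rbf_kernel(x, y, sigma=1.0):
    """Gaussian RBF kernel K(x, y) = exp(-||x - y||^2 / sigma^2)."""
    return np.exp(-np.sum((x - y) ** 2) / sigma ** 2)

def memberships(samples, centers, m=2.0, sigma=1.0, eps=1e-12):
    """Membership of each sample in each cluster.

    For the RBF kernel, the squared distance in the induced feature
    space is 2 * (1 - K(x, v)); memberships follow the usual fuzzy
    c-means form with fuzzifier m > 1."""
    n, c = len(samples), len(centers)
    u = np.zeros((c, n))
    for j in range(n):
        d = np.array([2.0 * (1.0 - rbf_kernel(samples[j], centers[i], sigma))
                      for i in range(c)]) + eps   # eps avoids division by zero
        w = d ** (-1.0 / (m - 1.0))
        u[:, j] = w / w.sum()                     # memberships sum to 1
    return u
```

With two well-separated clusters, a sample lying near one center receives a membership close to 1 for that cluster.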

Differential Harmony Search Algorithm
The harmony search (HS) algorithm replaces the worst harmony in the harmony memory (HM) if the newly generated harmony is better. However, some locally optimal harmonies in the HM may remain unchanged for a long time. In addition, guidance from the best harmony can improve the local search ability of the algorithm. In this section, a novel modification of the harmony search algorithm, named differential harmony search (DHS), is proposed to help the algorithm converge faster.

In the proposed method, harmonies in the HM which meet specific conditions are changed following a differential evolution strategy. Moreover, newly generated harmonies take the best vector into consideration.

A Self-Adaptive Solution Generation Strategy
The purpose of this strategy is to improve the local search ability of the algorithm. In the pitch adjustment step, classical HS changes a random harmony selected from the HM by small perturbations. Unlike classical HS, the pitch adjustment in DHS considers the best vector, controlled by a crossover probability cr, which generally varies from 0 to 1. Through a large number of numerical simulations, we conclude that the suitable range of cr is [0.4, 0.9]. In this paper, the parameter cr is dynamically adjusted following Equation (13). When cr is smaller, the new harmony has a higher probability of using values from the best vector, which makes convergence faster; when cr is larger, the harmonies retain their own diversity. Overall, with this strategy, the algorithm has better local search ability in the early stage.

where CI is the current iteration number and NI is the maximum number of iterations.
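Since Equation (13) is not reproduced above, the schedule below is an assumed linear ramp of cr over the stated range [0.4, 0.9] that matches the described behavior: a smaller cr early (stronger pull toward the best harmony, faster convergence) and a larger cr late (more diversity).

```python
def crossover_rate(ci, ni, cr_min=0.4, cr_max=0.9):
    """Assumed linear schedule for cr, growing from cr_min at the first
    iteration (ci = 0) to cr_max at the last iteration (ci = ni)."""
    return cr_min + (cr_max - cr_min) * ci / ni
```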

A Differential Evolution-Based Population Update Strategy
The purpose of this strategy is to help the algorithm avoid premature convergence. When updating the HM with a new harmony, some locally optimal harmonies in the HM may remain unchanged for a long time. In DHS, if a harmony in the HM remains unchanged for several iterations (sc = 20), it is replaced with a new harmony generated by the differential evolution strategy shown in Equation (14), where x_{r1} and x_{r2} are harmonies randomly selected from the HM. With the dynamic parameter cr, the harmonies have a higher probability of variation in the later stage.
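The refresh of stale harmonies can be sketched as follows. Since Equation (14) is not reproduced above, the DE/best/1-style mutation, the scale factor, and the names used here are assumptions intended to illustrate the idea rather than reproduce the authors' exact update.

```python
import random

def de_replace(stale, best, hm, cr, f_scale=0.5):
    """Sketch of the differential-evolution-style refresh: a stale harmony
    is recombined, dimension by dimension (with probability cr), with a
    mutant built from the best harmony plus a scaled difference of two
    randomly chosen harmonies from the memory."""
    d = len(stale)
    r1, r2 = random.sample(range(len(hm)), 2)   # two distinct random members
    new = list(stale)
    for i in range(d):
        if random.random() <= cr:
            new[i] = best[i] + f_scale * (hm[r1][i] - hm[r2][i])
    return new
```

With cr = 1 every dimension is mutated, and with cr = 0 the stale harmony is returned unchanged, which mirrors the diversity-versus-convergence role that cr plays in the text.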

Implementation of DHS
The DHS algorithm consists of five steps, as follows:

Step 1 Initialize the algorithm parameters. This step specifies the parameters, including the harmony memory size (hms), harmony memory considering rate (hmcr), pitch adjustment rate (par), fret width (fw), and memory keep iteration (sc).
Step 2 Initialize the harmony memory. This step is consistent with the basic harmony search algorithm. hms random vectors (x^1, . . ., x^hms) are generated following Equation (15). Each vector is then evaluated by the objective function and stored in the HM.

Step 3 Improvising a new harmony
In this step, a new random vector x = (x_1, x_2, . . ., x_D) is generated using the self-adaptive solution generation strategy. The procedure is shown in Algorithm 2.

Algorithm 2 Improvisation of a New Harmony of DHS
Step 4 Update the harmony memory. If the vector generated in Step 3 is better than the worst vector in HM, replace the worst vector with the generated vector.

If a vector in the HM remains unchanged for several iterations (sc = 20), replace it with a new vector generated following Equation (14), provided the new vector is better.

The procedure of Step 4 is shown in Algorithm 3.

Algorithm 3 Update the Harmony Memory
Step 5 Check the stop criterion. Repeat Steps 3 and 4 until the termination criterion is satisfied. In this paper, the termination criterion is the maximum number of evaluations.

DHS-KFC
In this paper, we use the proposed DHS to search for the optimal result of kernel fuzzy clustering (KFC). When DHS is applied to KFC, the optimization variables must be defined. In this paper, the cluster centers v = {v_1, . . ., v_n} are chosen as the optimization variables. The weight matrix w = [w_1, . . ., w_K]^T is given by experts using a method which determines the relative importance of attributes, such as the Fuzzy Analytic Hierarchy Process (FAHP) [22]. The membership matrix u = {u_1, . . ., u_n} can be obtained from Equation (10). The objective function of KFC is shown in Equation (16). The DHS algorithm finds the result which minimizes this objective function. The procedure is described as follows (Figure 1):

Step 1 Initialize the parameters of the harmony search algorithm and initialize the harmony memory.
Step 2 Initialize the parameters of the KFC, the maximum generation N, and the weight matrix w; set the initial cluster center matrix v_0 to a randomly-generated matrix. The membership matrix u_0 can then be obtained from Equation (10).
Step 3 Generate a new solution vector based on the harmony search algorithm.
Step 4 Obtain the cluster center matrix v_n from the solution vector, calculate the membership matrix u_n based on Equation (10), and then calculate J_n based on Equation (16).
Step 5 Compare J_n with J_{n−1}. If J_n has remained unchanged for 10 iterations, go to Step 7.
Step 6 Set the current iteration n = n + 1. If n > N, go to Step 7; otherwise, go to Step 3.
Step 7 Classify the samples based on their memberships.
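The evaluation performed in Step 4 can be sketched as follows: a candidate DHS solution vector (the flattened cluster centers) is reshaped, memberships follow from the kernel-induced distances, and the objective sums the weighted distances. This sketch assumes the RBF kernel of Section 2 and uniform feature weights; the paper's Equation (16) additionally includes the expert weight vector w.

```python
import numpy as np

def kfc_objective(v_flat, samples, c, m=1.7, sigma=1.0, eps=1e-12):
    """Score a flattened cluster-center vector against the KFC objective.

    A sketch: kernel-induced squared distance 2 * (1 - K) for the RBF
    kernel, standard fuzzy memberships, feature weights taken as uniform."""
    n, k = samples.shape
    centers = v_flat.reshape(c, k)
    # d[i, j]: kernel-induced squared distance of sample j to center i
    d = np.array([[2.0 * (1.0 - np.exp(-np.sum((x - vi) ** 2) / sigma ** 2))
                   for x in samples] for vi in centers]) + eps
    u = d ** (-1.0 / (m - 1.0))
    u /= u.sum(axis=0)                  # memberships sum to 1 per sample
    return float(np.sum(u ** m * d))    # weighted-distance objective
```

As a sanity check, centers placed on the actual data clusters score much lower than centers placed far away from all samples, which is exactly the signal DHS minimizes.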


The benchmark functions, with their formulas, search domains, and optima, are listed in Table 1. Parameters of the algorithms are shown in Table 2. All experiments were performed on a computer with a 3.80 GHz AMD Athlon X4 760K and 8 GB RAM. The source code was compiled with Java SE 8. We evaluated the optimization algorithms on 10-dimensional versions of the benchmark functions. The maximum evaluation count (NI) was set to 100,000. Each function was tested 100 times for each algorithm.
Table 3 shows the maximum, minimum, mean, and standard deviation of the errors of the algorithms on each benchmark function. Table 4 shows the distribution of the results produced by the algorithms. As demonstrated in Tables 3 and 4, DHS outperforms the other variants of HS on all functions except the Rosenbrock function. In these cases, the minimum, maximum, mean, and standard deviation obtained by DHS are smaller than the results obtained by HS and its variants.
For the Rosenbrock function, the global minimum lies inside a long, narrow, parabolic-shaped flat valley, which makes convergence to the global minimum difficult. Although the minimum obtained by GHS is smaller than that of DHS, the maximum, mean, and standard deviation show that DHS is more stable than GHS. We chose one typical unimodal function and one multimodal function to test the convergence speed of the algorithms.

Sensitivity Analysis of Parameters
In this section, the effect of each parameter in the search process of the DHS algorithm will be discussed.
Similar to the basic HS, the DHS algorithm has the parameters hms, hmcr, par, and fw. The default parameters of DHS are hms = 50, hmcr = 0.9, par = 0.3, fw = 0.005, and sc = 20. The maximum number of evaluations is set to 100,000. We then change each parameter in turn and test on the benchmark functions. Each scenario is run 50 times. Tables 5-9 show the results of the optimization of the benchmark functions. The results in Table 5 show that the performance of the DHS algorithm improves when sc is smaller. The value of sc determines the replacement frequency of the memories in the HM. The results suggest sc values between 10 and 40.
Tables 6 and 7 show that, although the values of hms and hmcr affect the optimization results, there is no obvious rule for their selection. hms values between 30 and 70 are applicable for most cases, while hmcr values between 0.8 and 0.99 are suggested.
Table 8 shows that DHS performs better when par is less than 0.3. Table 9 shows that the algorithm is not sensitive to the parameter fw; a value of 0.005 is applicable for most cases.

Numerical Experiments of DHS-KFC
To test the effectiveness of the DHS-KFC method, two University of California Irvine (UCI) machine learning repository datasets, the wine dataset and the iris dataset, are used here. The iris dataset consists of 50 samples from each of three species of iris. The wine dataset contains the results of a chemical analysis of wines grown in the same region of Italy but derived from three different cultivars. The method is compared with the k-means and fuzzy clustering methods. Classification results are shown in Table 10. The classification rates on both the wine and iris datasets were higher for DHS-KFC than for k-means and fuzzy clustering.
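The classification rate reported in Table 10 can be computed by mapping cluster indices to class labels. Since cluster numbering is arbitrary, a simple approach (workable for three clusters, as in the iris and wine datasets) scores the best one-to-one mapping; the function below is an illustrative helper, not part of the paper.

```python
from itertools import permutations

def classification_rate(true_labels, cluster_labels, n_clusters=3):
    """Fraction of samples correctly classified under the best one-to-one
    assignment of cluster indices to class labels (brute force over all
    permutations, fine for a small number of clusters)."""
    best = 0
    for perm in permutations(range(n_clusters)):
        hits = sum(1 for t, c in zip(true_labels, cluster_labels)
                   if t == perm[c])
        best = max(best, hits)
    return best / len(true_labels)
```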

Case Study
East Lake, located on the south bank of the Yangtze River, is the largest scenic tourist attraction in Wuhan, China. The East Lake Network covers an area of 436 km², consisting of East Lake, Sha Lake, Yangchun Lake, Yanxi Lake, Yandong Lake, and Bei Lake. A map of the East Lake Network is shown in Figure 3. In recent years, climate change and human activities have affected the lakes significantly. Increasing sewage has led to serious eutrophication, and most of the sub-lakes in the East Lake Network are highly polluted. The chemical oxygen demand (COD), total nitrogen (TN), and total phosphorus (TP) of the lakes are shown in Table 11.

In recent years, the government has invested 15.8 billion yuan (RMB) to build the Ecological Water Network Project. The Water Network Connection Project is one of its subprojects; it transfers water from the Yangtze River through diversion channels to improve the water quality in the lakes. In this project, water diversion scheduling (WDS) is a new scheduling method combined with hydrodynamics. Previous work, including a hydrodynamic simulation model and a water quality model, has already been completed. Using these models, the water quality of the lakes at the terminal time can be simulated. The purpose of WDS is to find a suitable scheme with a better water quality result and lower economic cost. This is a multi-objective problem but, unfortunately, multi-objective evolutionary algorithms (MOEA) cannot be adopted because the water quality simulation is time-consuming. After a variety of simulations, several feasible schemes and their simulation results were obtained. However, the differences among the results are small. Thus, we use cluster analysis to summarize these schemes.
The water diversion scheduling in the East Lake Network is a multi-objective problem.To reduce the construction cost, the existing diversion channels include Zengjiaxiang, Qingshangang, Jiufengqu, Luojiagang, and Beihu pumping stations are considered.Water is brought from the Yangtze River to the East Lake Network through the Zengjiaxiang and Qingshangang channels, while water in the lakes is drained out through Luojiagang and Beihu pumping stations (Figure 4).The objective functions are shown as: , ,

I
q q q q q q q q q (17) where I is the water quality index vector including TP, TN, and COD information, which is obtained by the water quality model.C is the total amount of water, Q is the economic cost, qz is the inflow vector of the Zengjiaxiang channel, qq is the inflow vector of the Qinshanggang channel, and ql is the outflow vector of the Luojiagang channel.Then the outflow of Jiufengqu and Beihu pumping stations follows: The objective functions are shown as: q q , q l C = f 2 q z , q q , q l Q = f 3 q z , q q , q l (17 where I is the water quality index vector including TP, TN, and COD information, which is obtained by the water quality model.C is the total amount of water, Q is the economic cost, q z is the inflow vector of the Zengjiaxiang channel, q q is the inflow vector of the Qinshanggang channel, and q l is the outflow vector of the Luojiagang channel.Then the outflow of Jiufengqu and Beihu pumping stations follows: The initial water qualities of the six lakes are shown in Table 11.The maximum inflow of the Qinshangang channel is 30 m 3 /s, the maximum inflow of the Zengjiaxiang channel is 10 m 3 /s, the total diversion time is 30 days.The water quality of Yandong Lake has already reached the standard, so it is not considered in this paper.The problem of water diversion scheduling is to design the inflow of Zengjiaxiang, Qinshangang, Luojiaxiang every day, which improves the water quality as much as possible with minimum cost. 
Since the water quality simulation is time-consuming and MOEA cannot be adapted to this problem, after some pretreatment we identified a number of feasible schemes, shown in Table 12. We then applied these schemes to the water quality model; the simulation results are shown in Table 13. Since the water quality model may have a certain amount of error, we decided to determine a set of good schemes instead of a single one. In this paper, the goal is to use the DHS-KFC method to determine the good schemes. According to the requirements, the schemes are divided into five categories: excellent, good, average, fair, and poor. Using the kernel clustering method explained in Section 3, the scheme results shown in Table 13 are input as the samples for clustering, including three water quality parameters (COD, TP, TN) for five lakes, the water diversion quantity, and the economic cost. The weight vector of water quality, water diversion quantity, and economic cost is defined in Equation (23) according to advice given by experts.
In this case study, the parameter m = 1.7, and the minimum objective function value is 0.225269. The cluster results are shown in Table 14. Using the Technique for Order of Preference by Similarity to Ideal Solution (TOPSIS) model [22,23], the similarity values of the cluster centers are:

CC = (0.950, 0.592, 0.13, 0.052, 0.816), (22)

The schemes in cluster I are considered better than the others.
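Closeness coefficients such as those in Equation (22) can be reproduced with a standard TOPSIS computation. The sketch below assumes vector normalization and benefit-type criteria (cost criteria would need to be inverted or negated beforehand); the paper's exact criterion directions and weights are not reproduced here.

```python
import numpy as np

def topsis_closeness(matrix, weights):
    """TOPSIS closeness coefficients.

    matrix: rows = alternatives (here, cluster centers), columns = benefit
    criteria; weights: one weight per criterion. Returns a value in [0, 1]
    per alternative, larger meaning closer to the ideal solution."""
    m = np.asarray(matrix, dtype=float)
    # vector-normalize each criterion, then apply the weights
    v = m / np.linalg.norm(m, axis=0) * np.asarray(weights, dtype=float)
    ideal, anti = v.max(axis=0), v.min(axis=0)
    d_pos = np.linalg.norm(v - ideal, axis=1)   # distance to ideal solution
    d_neg = np.linalg.norm(v - anti, axis=1)    # distance to anti-ideal
    return d_neg / (d_pos + d_neg)
```

An alternative dominating on every criterion scores 1, a dominated one scores 0, and one exactly midway scores 0.5, which matches the ranking interpretation used for the clusters above.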

Figure 1. Framework of the proposed method.


Figure 2 shows the convergence of the DHS algorithm compared with the other variants of HS. The results clearly show that the DHS algorithm converges faster than the other variants.

Figure 2. Convergence characteristics: (a) Ackley function; (b) Sphere function.


Figure 3. Map of the East Lake Network.

Figure 4. Diversion channels of the East Lake Network.


Table 1. Benchmark functions tested by simulating the HS, IHS, GHS, CHS, and the proposed DHS algorithms. Of these functions, the Ackley, Griewank, Rastrigin, and Rosenbrock functions are multimodal; the Sphere and Schwefel 2.22 functions are unimodal.

Table 2. Parameters of the algorithms.

Table 3. Errors of the test results of the algorithms.

Table 4. Distribution of the errors of the algorithms.

Table 7. The effect of hmcr on the means and standard deviations.

Table 9. The effect of fw on the means and standard deviations.

Table 10. Results of the clustering methods.

Table 11. Initial water quality of the six lakes.

Table 13. Results of the schemes shown in Table 12.

Table 14. Results of the proposed method.