Improved Learning-Automata-Based Clustering Method for Controlled Placement Problem in SDN

Abstract: Clustering, an unsupervised machine learning technique, plays a crucial role in partitioning unlabeled data into meaningful groups. K-means, known for its simplicity, has gained popularity as a clustering method. However, both K-means and the LAC algorithm, which utilizes learning automata, are sensitive to the selection of initial points. To overcome this limitation, we propose an enhanced LAC algorithm based on the K-Harmonic means approach. We evaluate its performance on seven datasets and demonstrate its superiority over other representative algorithms. Moreover, we tailor this algorithm to address the controller placement problem in software-defined networks, a critical field in this context. To optimize relevant parameters such as switch–controller delay, intercontroller delay, and load balancing, we leverage learning automata. In our comparative analysis conducted in Python, we benchmark our algorithm against the spectral, K-means, and LAC algorithms on four different network topologies. The results show that our proposed algorithm outperforms the others, achieving an improvement ranging from 3 to 11 percent. This research contributes to the advancement of clustering techniques and their practical application in software-defined networks.


Introduction
Clustering is a technique used for analyzing statistical data [1]. The process involves partitioning the data into clusters so that data within the same cluster have the highest similarity, while data in different clusters have less similarity. Various clustering approaches, such as hierarchical, distance-based, density-based, and graph-based, have been proposed so far. Figure 1 depicts a diagram illustrating different clustering approaches.

The clustering technique has various applications in medical science [2], architecture [3], drug discovery [4], image processing [5], computer networks [6], and communication [7], among others. It is also an important tool for solving controller placement problems (CPPs) [8] in software-defined networks (SDNs) [9]. In CPPs, the goal is to determine the optimal number and location of controllers in a network, with the key objectives being the delay between controllers and switches, the delay between controllers, and load balancing [10].
A learning automaton (LA) is a type of machine learning algorithm with a finite number of actions and different probabilities assigned to each action [11]. In each round, the LA takes an action and evaluates the reinforcement signal from its surrounding environment. Then, the LA updates its action probability vector according to the reinforcement signal until a satisfactory solution is achieved. The LA has been used in various computer network applications to enhance existing solutions. The LA has also been applied to improve the k-means clustering method in [12]. In the proposed clustering algorithm, LAC, each LA is assigned to a data point, and the LAs determine the clusters to which their associated data points belong. LAC has been shown to outperform other clustering algorithms such as k-clusters, k-means, k-medians, k-means++, and k-medoids on UCI datasets. However, LAC, like k-means, is sensitive to the selection of initial points.
To address this issue, we propose an optimization of LAC using k-Harmonic means (KHM) in this paper. Our approach, LAC-KHM, yields better results than LAC on various datasets. In addition to considering the distance between data points and their corresponding cluster centers, we also take into account the distance between individual cluster centers and load balancing for solving CPPs.
The rest of this paper is organized as follows: In Section 2, we present related works. Section 3 provides a thorough explanation of the preliminaries. In Sections 4 and 5, we present the proposed algorithms, LAC-KHM and CLAC-KHM, respectively. Section 6 evaluates the performance of the algorithms, starting with LAC-KHM on seven datasets from the University of California, Irvine (UCI) clustering benchmarks and then discussing CLAC-KHM performance on four topologies. Finally, Section 7 concludes the paper.

Related Works
In this section, we will first discuss different types of clustering methods and then describe their application in solving CPPs.

Types of Clustering Methods
Data clustering is an important issue in data mining [13] that can be divided into two groups: soft and hard [14]. Soft clustering means overlapping clustering, such as fuzzy clustering, where any data point may belong to more than one cluster with different membership grades. The other kind, hard clustering, is exclusive clustering, in which each data point exists in just one cluster [15].
From a different point of view, hierarchical and partitioning methods are the two main categories of clustering methods that were introduced in [16]. A hierarchical clustering method works by grouping data into a tree of clusters. In hierarchical clustering, the aim is to produce a hierarchical series of nested clusters. A diagram represents this hierarchy as an inverted tree that describes the order in which factors are merged (bottom-up view) or clusters are broken up (top-down view). On the other hand, the partitioning method was described in [13], which starts from an initial clustering and then iteratively moves data points from one cluster to another with a relocation method. There are two types of algorithms in this approach: error minimization algorithms and graph-theoretic clustering. The basic idea in error minimization algorithms is to determine clusters by minimizing a specified error criterion. The most famous algorithm in this area is k-means, which employs the sum of squared error (SSE) as the error criterion. The SSE measures the total squared Euclidean distance of instances to their representative values. However, k-means has some problems, such as being limited to numeric attributes or being sensitive to the initial centers of the clusters. Therefore, some algorithms based on k-means, such as k-prototype [17] or KHM [18], were proposed to overcome these problems. In addition, the k-medoids method was proposed, which is more robust than the k-means algorithm [19]. Graph-theoretic methods produce clusters via graphs. The most famous algorithm in this category is based on the minimal spanning tree (MST) [20]. Another algorithm was proposed based on limited neighborhood sets [21].
From another perspective, clustering methods are divided into three main categories [13]: grid-based methods, density-based methods, and model-based clustering. Grid-based methods use a multiresolution grid data structure to quantize object areas into a finite number of cells that form a grid structure, on which all clustering operations are implemented; dense grid cells form the clusters [22]. In density-based clustering, the data points in the region separating two clusters of low point density are considered noise. DBSCAN is one of the most famous methods in this approach. The environment within a given radius of an object is known as the neighborhood of the object; if the neighborhood of the object includes at least a minimum number of objects, MinPts, the object is called a core object [23]. Model-based clustering (or distribution models) assumes a data model and applies an EM algorithm to find the most likely model components and the number of clusters [24].

Clustering Applications in CPPs
Among the available clustering methods, k-means, spectral, and DBSCAN have been investigated more than others for solving CPPs. Therefore, we consider the proposed algorithms based on these methods.
Several solutions have been proposed based on the DBSCAN clustering method to appropriately locate the SDN controllers, as described in [25][26][27][28][29]. Since the method suggests the proper number of clusters, all of the proposed algorithms recommend a number of controllers; for example, in [25], the appropriate number of controllers is determined using silhouette analysis and gap statistics. However, one of the main disadvantages of the algorithm is that the value of the minimum number of objects must be defined in advance, which limits its ability to cluster datasets with high density differences. Additionally, the neighborhood radius may not be suitable for all clusters, and this algorithm suffers from high computational overhead.
In [26], the density-based controller placement (DBCP) method partitioned the given network using a density-based clustering method. This research considered CPPs both with and without controller capacity constraints.
Clustering by fast search and find of density peaks (CFSFDP), proposed in [30], has been used in [27,28] to deal with the CPP. CFSFDP is a density-based clustering algorithm that requires fewer initial parameters and has a higher execution speed. However, it suffers from the need to experimentally preset the number of clusters.
The authors in [27] used a comprehensive consideration value, γ, according to [30], calculated as the product of local density (ρ) and minimum distance (δ). However, two problems are identified with this definition. Firstly, the values of ρ and δ may have different orders of magnitude, so they were normalized to ensure equal treatment. Secondly, nodes at different densities may have the same δ value, making it challenging to select the initial clustering center. To address this, the weight of δ was increased in low-density areas, and the modified calculation for γ was presented as γ_i = ρ_i · δ_i^(3/4). A higher γ value indicates a higher likelihood of being a clustering center. The authors proposed a method to automatically determine the inflection point of γ, which corresponds to the transition from nonclustering centers to clustering centers. By analyzing the curve of γ values and finding the vertex of a folded line, the inflection point was identified. The optimal number of controllers was determined by counting the points with values greater than the γ value of the inflection point, and the locations of these points represented the deployment of the clustering centers. The algorithm provided a step-by-step process for the selection of clustering centers.
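A minimal sketch of this selection criterion, assuming min–max normalization of ρ and δ (the exact normalization used in [27] is not reproduced in this excerpt); the function name `gamma` and the small epsilon guard are our own additions:

```python
import numpy as np

def gamma(rho, delta):
    """Comprehensive consideration value: normalize rho and delta to a common
    scale, then dampen delta with the 3/4 exponent so that low-density areas
    are weighted relatively more, as described in the text."""
    rho_n = (rho - rho.min()) / (rho.max() - rho.min() + 1e-12)
    delta_n = (delta - delta.min()) / (delta.max() - delta.min() + 1e-12)
    return rho_n * delta_n ** (3 / 4)
```

Nodes with the largest γ values would then be taken as candidate clustering centers, up to the inflection point of the sorted γ curve.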
The researchers in [28] utilized the idea of information entropy and a firefly algorithm to determine the local density and considered a point with a high density of switches as the cluster center. In the algorithm, the number and location of the controllers are discovered based on the jumping point in the decision graph.
The authors in [29] presented an improved density-based controller placement algorithm (DCPA) that enhances the efficiency of controller placement in a network. This algorithm achieves the required number of controllers by exploring candidate values of the radius and dividing the entire network into multiple subnetworks. Within each subnetwork, the controllers are deployed with the dual objective of minimizing the average propagation latency and the worst-case propagation latency between controllers and switches.
The spectral clustering method was used in [31][32][33][34][35] for CPP clustering. It is worth mentioning that the load-balancing parameter was considered in [31,32,35], and only [32,35] introduced algorithms for estimating the proper number of controllers. The authors in [32] utilized the structure of the eigenvectors for this objective. In [33], the controllers were first mapped into row-vector classifications using spectral clustering, and then the vectors were classified using a K-medoids algorithm based on simulated annealing to achieve a flexible distribution of the controllers. The results in [34] demonstrated that the spectral algorithm outperforms K-median and K-center in terms of intercontroller latency. In [35], the problem was formulated to minimize the controller cost, main costs, switch–controller and intercontroller delays, and main power cost.
K-means was applied in [36][37][38] for solving a CPP in an SDN, paying attention to the delay between switches and controllers and to load balancing. Only [38] among them considered intercontroller delay. It should be noted that the researchers in [37] also employed hierarchical clustering.
The authors in [39] mathematically formulated the placement of controllers as an optimization problem. The objectives were to minimize the controller response time, which refers to the delay between the SDN controller and its assigned switches, as well as the control load (CL), the intracluster delay (ICD), and the intracluster throughput (ICT). To address this, they introduced a computationally efficient heuristic called deep-Q-network-based dynamic clustering and placement (DDCP), which utilized reinforcement and deep learning techniques to solve the optimization problem.

Basic Concepts
In this section, the basic concepts that will be used in this research are further explained.

K-Means Clustering
K-means clustering is a fast and commonly used technique due to its low iteration count and ease of implementation. The k-means algorithm attempts to find cluster centers (C1, C2, ..., CK) that minimize the total squared sum of distances between each data point Xi and its closest cluster center Cj. However, the performance of k-means heavily depends on the initialization of the centers, which is a key issue with this algorithm. The algorithm establishes strong connections between data points and their closest cluster centers, so the cluster centers do not leave the local range of the data density. As the LAC algorithm is implemented via k-means, it still suffers from the random initialization of the centers. Thus, in this research, we optimized the algorithm by using KHM.
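To illustrate the initialization sensitivity discussed above, the following sketch (our own illustration, not the paper's code) runs plain k-means from two different random seeds; each run may converge to a different local optimum and hence a different SSE:

```python
import numpy as np

def kmeans(X, k, seed, iters=100):
    """Plain k-means: the final SSE depends on the randomly chosen initial centers."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assignment step: each point joins its closest center
        labels = np.argmin(((X[:, None] - centers) ** 2).sum(axis=2), axis=1)
        # update step: each center moves to the mean of its assigned points
        for j in range(k):
            if (labels == j).any():
                centers[j] = X[labels == j].mean(axis=0)
    return ((X - centers[labels]) ** 2).sum()  # sum of squared error (SSE)

# three well-separated 2D blobs; different seeds may yield different SSE values
X = np.vstack([np.random.default_rng(0).normal(m, 0.3, (50, 2)) for m in (0, 3, 6)])
```

Comparing `kmeans(X, 3, seed=1)` with `kmeans(X, 3, seed=2)` shows why initialization-robust variants such as KHM are attractive.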

K-Harmonic Means
K-Harmonic means (KHM) was proposed by the authors in [18] as a new clustering method based on k-means. In this algorithm, the harmonic mean is used instead of the Euclidean distance to solve the initialization problem of the k-means algorithm.
The objective function of the KHM algorithm is named KHM, and it calculates the harmonic mean of the distances from each point to all centers. KHM uses two functions, soft membership and weight. The weight function assigns a higher weight to data points that are far away from every center, defining the impact of each data point on computing the new components of the cluster centers. The parameter ρ is a user-defined input parameter in KHM, typically equal to or greater than 2; it influences the fuzziness of the cluster assignments. A higher value of ρ results in smoother membership distributions and allows data points to have more evenly distributed memberships across multiple clusters. Conversely, a lower value of ρ leads to sharper cluster boundaries and more distinct memberships.
The algorithm continues until a predefined number of iterations is reached, or until the output of the KHM objective function does not change significantly.
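The objective described above can be sketched as follows, assuming the common KHM form KHM(X, C) = Σ_i k / Σ_j ||x_i − c_j||^(−p), where the exponent `p` plays the role of the paper's parameter ρ; the function name and the epsilon guard are our own additions:

```python
import numpy as np

def khm_objective(X, centers, p=2):
    """Harmonic-mean objective: for each point, the harmonic mean of its
    distances to all k centers (raised to power p), summed over all points."""
    k = len(centers)
    # pairwise Euclidean distances, with an epsilon to avoid division by zero
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
    return float(np.sum(k / np.sum(d ** -p, axis=1)))
```

Because every center contributes to every point's harmonic mean, the objective degrades smoothly with poor initial centers instead of trapping them, which is the intuition behind KHM's robustness to initialization.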

Learning Automata
Learning automata [11] are probabilistic decision-making tools that iteratively adapt to the environment and learn the optimal action. A widespread type is the variable-structure learning automaton, defined by a quadruple [α, β, P, T], where α is the set of actions, β is the set of reinforcement signals, P is the action probability vector, and T is the learning algorithm. In the nth step of a linear learning algorithm, if the ith selected action αi(n) receives the reward reinforcement signal β(n) = 0, the action probability vector p(n + 1) is updated using Equation (1); if it receives the penalty reinforcement signal β(n) = 1, p(n + 1) is updated using Equation (2) [40]:

pj(n + 1) = pj(n) + a[1 − pj(n)] if j = i;  pj(n + 1) = (1 − a)·pj(n) if j ≠ i    (1)

pj(n + 1) = (1 − b)·pj(n) if j = i;  pj(n + 1) = b/(r − 1) + (1 − b)·pj(n) if j ≠ i    (2)

In Equations (1) and (2), a and b are the learning parameters (reward and penalty parameters) and r is the number of actions. Different values for a and b create different learning algorithms:
• If a = b, the learning algorithm is of the linear reward penalty (LRP) type.
• If b = 0, the learning algorithm is of the linear reward inaction (LRI) type.
• If a >> b (a is much larger than b), the learning algorithm is of the linear reward epsilon penalty (LREP) type.
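The update rules above can be sketched as a small helper (a hypothetical function of ours, not the paper's code); choosing a = b, b = 0, or a >> b yields the LRP, LRI, and LREP variants, respectively:

```python
import numpy as np

def update_probs(p, i, beta, a=0.1, b=0.1):
    """Linear update of an LA's action probability vector.
    i: index of the selected action; beta: 0 = reward, 1 = penalty."""
    p = p.copy()
    r = len(p)  # number of actions
    if beta == 0:
        # reward: shrink all probabilities by (1 - a), then add a to action i,
        # which equals p_i + a*(1 - p_i) for the selected action
        p = (1 - a) * p
        p[i] += a
    else:
        # penalty: redistribute mass b/(r-1) to the other actions,
        # leaving (1 - b)*p_i on the selected action
        p = (1 - b) * p + b / (r - 1)
        p[i] -= b / (r - 1)
    return p
```

Both branches preserve the total probability mass of 1, so the vector remains a valid distribution after every step.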

Learning-Automata-Based Clustering
In this form of clustering, each data point is equipped with the third type of learning automaton (LREP), and the learning automaton (LA) then defines the membership status of the data point. The membership status of a data point with respect to a cluster is determined via k-means; therefore, this learning approach is based on the Euclidean distance between the data point and the other data points within the cluster. The number of selectable actions for each learning automaton is equal to the number of clusters. After each selection, if the data point is assigned to the correct cluster, the selection is rewarded; if not, it is penalized.
There are two reasons why the learning process of each data point was conducted using learning automata in this work:
1. The input size: the input size is a function of the number of members and their attributes. It is necessary to mention that in the learning process of the LA, the entirety of the given data is considered; this depends on the dataset.
2. The cluster count: each LA assigns its datum to a cluster during the learning process. The actions of the automaton represent the selection of a cluster for that particular member.

Software-Defined Networks and CPPs
Software-defined networks (SDNs) are a new network technology that is controlled centrally and intelligently. In this type of network, the management and tracing of packets are entirely the responsibility of the control plane. The main issue with the network's control plane occurs when the network is large, since in this case a single controller cannot cover the entire network. Therefore, the main challenge in this scenario is finding the right number and locations of the controllers. Clustering is one of the most reasonable approaches that can succeed in the complicated management of large networks and can guarantee load balancing. Using the concept of clustering, a large network is divided into multiple subnetworks such that there is a controller for each subnetwork. Matters such as the delay between controllers, load balancing, and reliability are also important considerations, in addition to decreasing the delay between the switch and the controller.

LAC-KHM: The Proposed Algorithm
LAC-KHM is based on the LAC algorithm, in which each data point is equipped with a learning automaton (LA). The number of actions of each LA equals the number of clusters, and during the learning process, the LA specifies to which cluster its associated data point belongs. The action of each LA is chosen based on the action probability vector (APV). The LA's decision is compared with the output of the k-means algorithm as a reinforcement signal. However, like k-means, LAC is sensitive to the proper selection of the initial points. Therefore, in this research, LAC is improved through KHM, and the distance between data points and cluster centers is calculated with regard to the harmonic distance. The proposed algorithm, LAC-KHM, is formed from three functions: select cluster, update probability, and calculate accuracy. The LAC-KHM algorithm is shown in Algorithm 1. It is noted that the cluster centers themselves are also updated based on the harmonic distance.

Algorithm 1: LAC-KHM algorithm
This algorithm has three functions, each with input and output parameters and the operations that follow:
Select Cluster: during the rounds, this function specifies to which cluster each data point belongs. It is noted that initially all actions in the APV have the same probability.
Input parameters: membership_cluster, probability, num_cluster, num_data
Output parameter: membership_cluster
membership_cluster = Select_cluster(membership_cluster, probability, num_cluster, num_data)
Update Probability: this function updates the APV using the LRP learning rule based on the received reinforcement signal. In fact, with each execution, the probability of the selected action is either decreased or increased based on the harmonic distance between the data points and the cluster centers.
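A simplified sketch of the Select Cluster step, assuming each data point's APV is stored as one row of a probability matrix; the actual function in Algorithm 1 also carries membership_cluster and num_cluster, which are omitted here for brevity:

```python
import numpy as np

def select_cluster(probability, rng=None):
    """Each data point's LA samples one action (a cluster index) from its
    action probability vector (one row of `probability`)."""
    rng = rng or np.random.default_rng()
    return np.array([rng.choice(len(p), p=p) for p in probability])
```

With a uniform initial APV (e.g., `np.full((num_data, num_cluster), 1 / num_cluster)`), every cluster is equally likely at first, matching the note above that all actions start with the same probability.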

CLAC-KHM: Customized LAC-KHM for CPP
Clustering algorithms have been applied in various scenarios, including solving the CPP. To solve the CPP, switches are mapped to data points and controllers to cluster centers. In most studies, the similarity criterion in clustering is the delay between the controller and the switch, which is essentially the distance between data points and the centers of each cluster. It is also worth mentioning that the most practical algorithm is the one that considers other parameters, such as load balancing and intercontroller delay. Therefore, to solve the CPP, the LAC-KHM algorithm is customized to also consider load balancing and intercontroller distance, in addition to the distance between data points and centers. This means that the LA residing on each data point is rewarded not only when it reduces the distance between the data point and the cluster center, but also when it improves the intercluster-center distance and load balancing. As mentioned before, switches and controllers are placed at data points and cluster centers, respectively, and the network is partitioned on this basis.

Problem Formulation
An SDN is a typical network comprising controllers, switches, and links, which can be modeled as an undirected graph G = (V, E), where V represents the set of switches and E represents the set of links between the switches. In addition, n denotes the number of nodes in a given graph of switches, k denotes the number of controllers in the SDN, and C = {c1, c2, ..., ck} is the set of controllers. The algorithm divides the network into several subnetworks, with each cluster having a controller. Clustering the network is defined using SDNi(Vi, Ei) as follows: SDNi is a connected region for all i ∈ {1, ..., k}. Formula (3) implies that the network is covered by all formed subnetworks. Formula (4) demonstrates that there is no overlap between subnetworks, and as mentioned, k denotes the number of controllers in the SDN. Formula (5) indicates that the highest similarity is found between the members of the same cluster; therefore, all of those switches can be assigned to one controller, and the lowest similarity is between the members of two different and separate clusters, which is demonstrated in Formula (6). Formula (7) indicates that all members of a subnetwork are connected with links. In this research, the similarities are the intercontroller delay, the controller–switch delay, and load balancing.
In this research, the objective function (OF) is defined as follows, where l_b denotes the load-balancing parameter. To calculate this parameter, we need to compute N_C, the ideal range for the number of members in each cluster, defined in terms of n and k, the total number of switches and clusters, respectively. Consequently, if the number of members in a cluster is in the range N_C, the cluster is marked as True; if not, it is marked as False. Finally, l_b is the number of clusters with the "True" tag divided by k. The average delay between each switch and its related controller [41] and its normalized formula are demonstrated in (10) and (11).
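The l_b computation described above can be sketched as follows. Since the exact bounds of the ideal range N_C are not reproduced in this excerpt, the symmetric tolerance `tol` around n/k is a hypothetical stand-in of ours:

```python
import numpy as np

def load_balance(labels, k, tol=0.2):
    """Fraction of clusters whose size lies in an ideal range around n/k.
    NOTE: the +/- tol band is an assumed placeholder for the paper's N_C range."""
    n = len(labels)
    ideal = n / k
    sizes = np.bincount(labels, minlength=k)  # members per cluster
    # a cluster is "True" when its size falls inside the ideal range
    in_range = np.sum((sizes >= (1 - tol) * ideal) & (sizes <= (1 + tol) * ideal))
    return float(in_range) / k
```

A perfectly balanced partition yields l_b = 1, while a partition in which every cluster size falls outside the range yields 0, matching the True/False counting described in the text.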
Similarly, the intercontroller delay, which is calculated using Dijkstra's algorithm, and its normalized format are illustrated in (12) and (13), respectively. The coefficients α, β, and γ are assumed to be in the range [0, 1]. The values of l_b, dS−C, and dC−C depend on the specific context and units of measurement. These coefficients can be adjusted at any given time. The CLAC-KHM algorithm is demonstrated in Algorithm 2.

Experimental Results
In this section, we first evaluate LAC-KHM with seven datasets; then, CLAC-KHM is compared with the three aforementioned algorithms on four topologies.

Performance Evaluation of LAC-KHM
In what follows, LAC-KHM's efficiency is evaluated via comparisons with k-medians, k-medoids, k-means++, k-means, and standard LAC on seven different datasets, the features of which are displayed in Table 1. In addition, the input parameters of the functions in the LAC-KHM algorithm are also explained; details of the dataset features and their initial values can be seen in this table as well. Additionally, the parameters of the LA are set according to those of [12]. Similarly, the results of checking the accuracy of the given algorithm compared to the rest are shown in Table 2. As can be seen in this table, LAC-KHM achieves the best results.

Performance Evaluation of CLAC-KHM
In this section, CLAC-KHM is first compared to k-means, spectral, and LAC, and afterwards, they are all analyzed under the same conditions on four real Internet Topology Zoo topologies of different scales. The initial values of the objective function in this study are shown in Equation (16). Because KHM is based on reducing the distance between the centers and the points, and load balancing is another factor that relatively guarantees fault tolerance, α is set to 2. Table 3 shows the details of the topologies used in this research. In [42], it is noted that some nodes have incomplete information about latitudes and longitudes; hence, we ignore these nodes throughout our simulations. Table 3 demonstrates the symbols of the paper.
In this research, we calculated the harmonic distance by replacing the Euclidean distance with the haversine distance, which is more appropriate for real-life topologies. The haversine distance is the distance between two data points on a sphere, computed from their latitudes and longitudes. Therefore, we consider the minimum-delay path between all data points using the haversine distance [43]. The haversine formula uses the central angle θ between every two data points on a sphere, as shown in Equation (17), where d and r are the distance and the sphere radius, respectively.

The haversine distance is calculated via the haversine function, hav(θ) = sin²(θ/2), which gives a direct calculation from the longitudes and latitudes of the two points:

hav(θ) = hav(φ2 − φ1) + cos(φ1)·cos(φ2)·hav(λ2 − λ1)

To solve for the distance d, we apply the archaversine (inverse haversine) to h = hav(θ), or use the arcsine (inverse sine) function:

d = 2r·arcsin(√h) = 2r·arcsin(√(hav(φ2 − φ1) + cos(φ1)·cos(φ2)·hav(λ2 − λ1)))

Here, φ1 and φ2 are the latitudes and λ1 and λ2 are the longitudes of data_point1 and data_point2, respectively.

Since LAC is based on k-means, the CLAC-KHM algorithm is evaluated against standard k-means in addition to LAC. Spectral clustering is a method with roots in graph theory, whereas k-means, k-medoids, and k-median are methods based on the distance between the centers and the data points; therefore, they differ from spectral clustering. For this reason, we also compared CLAC-KHM with the spectral algorithm on four topologies (Dialtecom, Intellifiber, Iris, and Aarnet) with four different sizes. Table 4 shows the details of the topologies studied throughout our evaluations [42].
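The haversine computation described above can be sketched as follows (a standard implementation; the Earth radius default of 6371 km is our assumption, since the paper does not state the value of r used):

```python
from math import radians, sin, cos, asin, sqrt

def haversine_distance(lat1, lon1, lat2, lon2, r=6371.0):
    """Great-circle distance between two points given in degrees, using
    d = 2r * arcsin(sqrt(hav(phi2 - phi1) + cos(phi1)*cos(phi2)*hav(lam2 - lam1)))
    with hav(x) = sin(x/2)**2."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)   # latitude difference
    dlam = radians(lon2 - lon1)   # longitude difference
    h = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * r * asin(sqrt(h))
```

For topology nodes annotated with latitude and longitude, this distance would replace the Euclidean distance inside the harmonic-mean computation, as the text describes.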
Table 5 presents the results. As mentioned previously, the Aarnet topology comprises 19 nodes, and for this topology, the number of controllers was examined within the range of 3 to 10. Consequently, in the table, a shaded area represents controller counts exceeding 10 for this particular topology.
The given values demonstrate that the delay between switches and controllers is lower for k-means than for spectral clustering. However, the OF values of k-means are not higher than those of spectral clustering in all topologies for every number of controllers. These results show that CLAC-KHM achieves better results than the other algorithms as well. For a clearer comparison, the OF values of all algorithms on the four topologies are depicted in Figures 2-4.
In this study, Dial_Telecom was the largest topology, and the spectral algorithm achieved better results on it than k-means. However, as shown in Figure 2, LAC performed as well as the spectral algorithm, and, notably, CLAC-KHM achieved the best performance on this topology. Figure 3 shows that CLAC-KHM is more efficient than the other representative algorithms on Intellifiber. It also shows that with a small number of controllers, spectral clustering outperforms k-means in terms of OF, whereas with increasing controller numbers k-means yields better results; neither exhibits predictable behavior on Intellifiber. As Figure 4A,B show, the behavior of CLAC-KHM is similar to that of LAC and k-means while achieving better performance on the Iris and Aarnet topologies. In contrast, spectral clustering does not behave as regularly. From the attained results, we conclude that CLAC-KHM achieves better efficiency thanks to utilizing learning automata and KHM.


Conclusions and Future Works
In summary, this paper proposed two algorithms, LAC-KHM and CLAC-KHM, for solving the CPP in an SDN. LAC-KHM improved the LAC algorithm by incorporating the KHM method to avoid sensitivity to the initialization of the algorithm. CLAC-KHM customized LAC-KHM for solving the CPP by considering three main metrics: the distance between the controllers, the distance between controllers and switches, and load balancing. The proposed algorithms were evaluated on four different topologies, and the results showed that they outperformed the k-means, spectral, and LAC algorithms. However, the proposed algorithms suffered from high computational complexity. This limitation arose because the CLAC algorithm relies on the KHM algorithm, which entails calculating the harmonic mean of the distances between each data point and all centers for the membership and weight functions. Consequently, this approach incurs a substantial computational burden. In contrast, the LAC algorithm leverages the k-means algorithm, which assigns each data point to a cluster based solely on its proximity to the nearest center, significantly reducing the computational complexity by avoiding calculations involving all clusters; this trade-off needs to be investigated in future work.
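To illustrate the complexity difference discussed above: the standard KHM membership of a point involves all k centers, whereas k-means only needs the nearest one. A minimal sketch (names, structure, and the exponent p are our assumptions, not the paper's exact code):

```python
import math

def khm_membership(point, centers, p=3.5):
    """KHM soft membership of one point in ALL k clusters, alongside the
    single nearest-center index that k-means would use instead. Sketch
    of the standard KHM formulation, not the paper's exact code."""
    dists = [max(math.dist(point, c), 1e-12) for c in centers]  # guard zero
    m = [d ** (-p - 2) for d in dists]       # KHM touches every center
    total = sum(m)
    membership = [v / total for v in m]      # normalized soft membership
    nearest = dists.index(min(dists))        # k-means: nearest center only
    return membership, nearest
```

The soft membership requires O(k) work per point for the membership and weight functions, while the hard assignment only needs the minimum distance, which is the source of the complexity gap noted above.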

Figure 2. Objective functions for all algorithms on Dial_Telecom.


Figure 3. Objective functions for all algorithms on Intellifiber.


Figure 4. Objective functions for all algorithms on Iris and Aarnet topologies.

α = {α1, α2, ..., αr} is the set of actions, where r is the number of actions.

Function Update_probability (membership_cluster, probability, num_cluster, position_of_data, signal, alpha, beta)
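The body of Update_probability is not listed here. As an illustration, a standard linear reward-penalty (L_RP) scheme — a common learning-automata update, which we are assuming rather than quoting from the paper — can be sketched as follows:

```python
def update_probability(chosen_action, probability, signal, alpha, beta):
    """Linear reward-penalty (L_RP) update for a learning automaton.
    signal == 0 is a favorable environment response (reward), signal == 1
    an unfavorable one (penalty); alpha and beta are the reward and
    penalty step sizes. Sketch only, not the paper's exact code."""
    r = len(probability)  # number of actions
    updated = []
    for j, p in enumerate(probability):
        if signal == 0:
            # reward: shift probability mass toward the chosen action
            updated.append(p + alpha * (1 - p) if j == chosen_action
                           else (1 - alpha) * p)
        else:
            # penalty: spread probability mass over the other actions
            updated.append((1 - beta) * p if j == chosen_action
                           else beta / (r - 1) + (1 - beta) * p)
    return updated
```

Both branches keep the action probabilities summing to one, which is the key invariant of the update.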
Calculate accuracy: This function evaluates the accuracy of the algorithm, demonstrating how closely its output matches the expected results. Input parameters: obtained_result, expected_result, num_cluster. Output parameter: Accuracy.

Accuracy = Function Calculate_accuracy (obtained_result, expected_result, num_cluster)
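Since cluster indices are arbitrary, an accuracy function must first align obtained clusters with the expected labels. A minimal majority-vote sketch (this mapping strategy is our assumption, not the paper's exact code):

```python
from collections import Counter

def calculate_accuracy(obtained_result, expected_result, num_cluster):
    """Clustering accuracy: map each obtained cluster to the majority
    expected label among its members, then score the fraction of points
    covered by those majorities. Sketch, not the paper's exact code."""
    correct = 0
    for c in range(num_cluster):
        # expected labels of all points placed in cluster c
        labels_in_c = [e for o, e in zip(obtained_result, expected_result)
                       if o == c]
        if labels_in_c:
            correct += Counter(labels_in_c).most_common(1)[0][1]
    return correct / len(expected_result)
```

For instance, a clustering that merely relabels the ground-truth groups scores 1.0, while each point stranded in the "wrong" cluster lowers the score.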
After initializing the parameters, the algorithm keeps running until the output of the KHM objective function no longer changes significantly.
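The KHM objective monitored by this stopping criterion can be sketched as follows (the exponent p and all names are our assumptions; the standard KHM objective sums, over all points, k divided by the sum of inverse p-th powers of the distances to the k centers):

```python
import math

def khm_objective(points, centers, p=3.5):
    """K-Harmonic-Means objective: for each point, k over the sum of
    inverse p-th powers of its distances to all k centers (a harmonic
    mean up to the factor k), summed over the dataset. Sketch only."""
    k = len(centers)
    total = 0.0
    for x in points:
        inv = 0.0
        for c in centers:
            d = max(math.dist(x, c), 1e-12)  # guard against zero distance
            inv += 1.0 / d ** p
        total += k / inv
    return total
```

The stopping rule described above then amounts to iterating the center updates while `abs(previous_objective - current_objective)` exceeds a small tolerance.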

Table 1. The details of the datasets used in this research.

Table 2. Accuracies of the algorithms.

Table 3. Symbols of the paper.

Table 4. Features of the topologies.