Directionally-Enhanced Binary Multi-Objective Particle Swarm Optimisation for Load Balancing in Software Defined Networks

Various aspects of task execution load balancing in Internet of Things (IoT) networks can be optimised using intelligent algorithms provided by software-defined networking (SDN). These load balancing aspects include makespan, energy consumption, and execution cost. While past studies have evaluated load balancing from one or two aspects, none has explored the possibility of simultaneously optimising all aspects, namely, reliability, energy, cost, and execution time. For the purposes of load balancing, implementing multi-objective optimisation (MOO) based on meta-heuristic searching algorithms requires assurances that the solution space will be thoroughly explored. Optimising load balancing provides decision makers not only with optimised solutions but also with a rich set of candidate solutions to choose from. Therefore, the purposes of this study were (1) to propose a joint mathematical formulation to solve load balancing challenges in cloud computing and (2) to propose two multi-objective particle swarm optimisation (MP) models: distance angle multi-objective particle swarm optimisation (DAMP) and angle multi-objective particle swarm optimisation (AMP). Unlike existing models that use only crowding distance as a criterion for solution selection, our MP models probabilistically combine both crowding distance and crowding angle. More specifically, we only selected solutions that had more than a 0.5 probability of higher crowding distance and higher angular distribution. In addition, binary variants of the approaches were generated based on a transfer function, denoted binary DAMP (BDAMP) and binary AMP (BAMP). After using MOO mathematical functions to compare our models, BDAMP and BAMP, with the state-of-the-art standard models, BMP, BDMP, and BPSO, they were tested using the proposed load balancing model.
Both tests showed that our DAMP and AMP models were far superior to the state-of-the-art standard models, MP, crowding distance multi-objective particle swarm optimisation (DMP), and PSO. Therefore, this study enables the incorporation of meta-heuristics into the management layer of cloud networks.


Introduction
Recent years have seen a rapid boom in the development of many new technologies such as the Internet of Things (IoT) and cloud systems. The emergence of cloud computing and data storage centres has led researchers to focus on optimising functionality and service. The implementation of these optimisations became easier and more flexible with the development of software-defined networking (SDN). However, the centralised control architecture of SDN generates concerns about reliability, scalability, fault tolerance, and interoperability [1]. Hence, assuring efficient management of SDN-based computation is essential for the success of the system. Computing, a common type of service, can be defined as coordinating the execution of big processes or tasks on a network to meet a set of objectives or goals such as maximum reliability and minimum execution time. Binary variants of the approaches were generated based on a transfer function and are denoted by B-(method name), e.g., BAMP for AMP.
The literature review is presented in Section 2, the methodology is detailed in Section 3, and the evaluation and results are explained in Section 4. The conclusion and future considerations are given in Section 5.

Literature Review
The problem of load balancing in SDN networks has become an active research topic in recent years. In the work of [14], the architecture of the SDN layers considers the load balancer as one block of the SDN application tier. In the survey of [15], the authors provided a taxonomy of load balancing in SDN while discussing various objectives such as response time, resource optimisation, throughput, and bottlenecks. Their taxonomy classified load balancing into the control plane and the data plane. The former was divided into hierarchical and virtualisation controllers, while the latter was divided into server and link.
There are many research articles exploring the use of meta-heuristic searching in SDN-based network applications, for instance, modified genetic searching for SDN placement in networks [16] and chaotic salp swarm algorithm (CSSA) optimisation to obtain the optimal number of SDNs in a network [17], while [18] proposed a resource selection MOO genetic algorithm using an SDN network. There are two categories in the context of load balancing using SDN technology: computational load balancing and traffic load balancing. In the work of [19], an approach for enabling a real-time traffic matrix for the traffic measurement system in SDN was proposed. Their design includes fixed and elastic schemas in order to achieve overhead reduction without compromising accuracy. Hence, it falls under the category of multi-objective SDN network traffic measurement. In the work of [20], flow-aware elephant flow detection applied to SDN was proposed in order to enable sharing the elephant flow classification tasks between the controller and the switches, which is a type of traffic load balancing.
Many studies have used meta-heuristics for traffic load balancing, such as a study by [21], where a genetic algorithm was integrated with ant colony optimisation for traffic load balancing, and a study by [22] that proposed genetic optimisation for traffic loading using SDN. In that study, the researchers used a load balancing algorithm to identify the shortest path, requiring the least number of operations, by looking for the lowest capacity among the switches. This was determined by the load balancer using information provided by the controller. Three pieces of information were used for this objective: the path cost, the switch capacity, and the operational cost. The algorithm was also designed to have a mutation operation that used a path and a link to create a new path [22]. While that study used only single-objective optimisation, other studies have adopted MOO algorithms for similar purposes [16]. Computational load balancing aims to optimise task execution from various perspectives such as execution time, reliability, and cost. In a study by [23], a framework for load balancing using two meta-heuristic optimisation methods focused on makespan and cost metrics. However, their framework failed to evaluate exploratory searches. Another work [24] used MOO for SDN-based load balancing where quality of service (QoS) was considered a constraint, while energy saving and load balancing were optimised using MOPSO. This method also failed to evaluate the exploration or the diversity of the solutions. Load balancing has also been applied in advanced simulations such as molecular dynamics in a study by [25], where the optimisation targeted heterogeneous supercomputers, which made the problem a non-deterministic polynomial time (NP-hard) one; that study combined genetic optimisation with PSO.
Numerous recent studies explore developing effective MOPSO variants with a focus on how best to guide the search via a combination of methods to obtain more optimal solutions. In a study by [26], four strategies (multi-population, dynamic clustering, solution life, and probability lottery) were used in conjunction with their MOPSO variant model. The study concluded that their MOPSO variant was superior since it included more than one strategy while searching. However, it failed to carefully evaluate measures such as set coverage, hyper-volume, and delta measure, leaving the assessment of the algorithm incomplete. Another aspect that is currently under consideration is the computational cost of an algorithm when multiple populations are added.
In another study [27] using MOPSO, the moving strategy of particles toward the local and global best was embedded in the cross-over operation, which cannot be considered a real improvement in the search itself. The MOPSO variant models developed in the studies mentioned in this literature review used various concepts to ensure the discovery of an adequate set of non-dominated solutions. In a study by [28], MOPSO was used with crowding distance to perform clustering. Considering that multi-objective optimisation is evaluated using a wide set of indicators [29], various approaches were developed based on the concept of using one of the diversity indicators to select solutions from one iteration to another. In another study, the R2 indicator contribution was used to select particles instead of the crowding distance value, which is capable of achieving higher diversity in the search [30]. The R2 indicator contribution value was also used by another MOO-based PSO study in order to scalarise the solutions in the archive of non-dominated solutions [31]. Another study used the unary epsilon indicator and Pareto dominance [32] in addition to direction-based reference points, similar to a study by [33].
A study by [34] used angle-based searching in MOO-based PSO, selecting solutions from low-density angle regions and deleting extra particles from high-density angle regions. While the method was based on adaptive angle division, distance-based crowding was not included in the search, which affected one aspect of the diversity of the discovered solutions. Other researchers have developed MOPSO variants that incorporated crowding distance [35]. That study modified the velocity formula by including a sharing-learning factor. The sharing factor was added as a third term in the velocity equation to move the particle not only toward the personal and global best particles but also toward the average of all other particles in the swarm, which added more diversity to the swarm. The approach also added Gaussian mutation to the particles to achieve higher exploration. Furthermore, the study proposed updating the global best using a greedy strategy with respect to each particle's changing position. This method, however, ignored the mobility direction of the particles, which is regarded as another factor in the diversity of exploration. A study by [36] evaluated an aspect that is usually ignored: leader evaluation in traditional PSO. They proposed a new concept in which a good leader takes feedback from followers and modifies decisions accordingly. The study went on to identify various cases of follower-based improvements in the swarm according to the change of the fitness values. Based on that, the leader's velocity, which expressed the changes in movement speed and direction toward the leaders, was adjusted. However, this method was only applied in single-objective optimisation. Another MOPSO variant, which incorporated a new concept to achieve diversity and exploration, was proposed by [37].
The researchers divided the space into sets of hyper-boxes and tracked the number of solutions in each hyper-box, finally considering only the solutions at the boundary of each hyper-box. Some methods have developed new criteria for selecting leaders from the repository of non-dominated solutions. For example, a study by [38] developed an improved MOO variant model and provided an algorithm for selecting leaders from sets of non-dominated solutions using a geometrical approach. The approach selected points that had the least distance from a straight line fitted to the set of non-dominated solutions. However, the problem with this approach was its assumption that a straight-line approximation of the set of non-dominated solutions holds in most cases, which is often invalid.
Other studies have incorporated clustering in solution selection from one iteration to another, where the set of non-dominated solutions was decomposed into clusters and solutions belonging to different clusters were selected to achieve diversity in the solutions. For instance, [39] used Euclidean distance for clustering. This can be criticised for making an implicit spherical assumption about the geometry of the Pareto front, which is not valid for many types of optimisation surfaces. Many other methods have opted to convert MOO into mono-objective optimisation via decomposition and use it along with dominance to select solutions. This was done in a study by [40], where penalty-based boundary intersection (PBI) was used with dominance to create a hybrid strategy. In our opinion, researchers can fall into the non-convexity trap by applying mono-objective mapping. In the same vein, a study by [41] estimated solution domination using cosine transformation and reference vector association and also used a simplified leader-oriented mobility equation to counter slow convergence and simplify the calculation. The approach presented elite velocity-based selections and a twofold leader definition. However, the cosine distance and the reference vector association lacked an accurate estimation of solution diversity in the search space. Studies that have developed MOO-based PSO by exploiting existing information and communication theory concepts, namely entropy and its usefulness in probing the convergence of algorithms based on entropy behaviour, also warrant mention. This approach was first proposed by [42], where a simulation was used to demonstrate the association between the change of entropy in the particles of the Pareto front and the convergence of the algorithm.
This method was further developed by [43] where, using particle entropy, the particles were mapped onto a parallel-cell coordinate system with a feedback information system, and the difference in entropy was used to change the parameters of the algorithm. Although entropy can be a useful indicator of the diversity of the solutions, it still lacks an actual geometric and directional description of the particles in the space.
While numerous studies have evaluated load balancing using meta-heuristics, none has evaluated it from the perspective of non-dominated solutions. Furthermore, the exploration of the solution space has not received adequate attention even though it is critical in providing the decision maker with sets of choices. Although MOPSO was among the evaluated methods, the improvements made to it ignored the searching direction, which plays an important role in the diversity of the offered solutions. A summary of the models and the objectives discussed in this literature review is presented in Table 1. None of these studies explored incorporating all five objectives in one model. Therefore, we present the findings of our MP-based load balancing model in the subsequent sections.

Methodology
The symbols used throughout this study are explained in Table 2. The methodology starts by presenting the network model in Section 3.1. The task model is provided in Section 3.2. Afterwards, the energy model, the time execution metric, and the renting cost metric are provided in Sections 3.3-3.5, respectively. The optimisation objective functions are given in Section 3.6, and the transfer function model for dealing with the binary space is presented in Section 3.7. The crowding distance is presented in Section 3.8, and the developed DAMP algorithm is provided in Section 3.9. The big O notation analysis is given in Section 3.10. Lastly, a separate section is dedicated to the evaluation analysis.

Network Model
The network was represented by an undirected graph G(N, R), where the set of nodes was N = {N_j : j = 1, 2, ..., n} and the set of edges was R = {e_ij : i, j = 1, 2, ..., r; i ≠ j}. The edge between two nodes (i, j) had a weight that represents the distance between the two nodes (d_ij). When node i was not connected to node j, the distance between the two was infinity; for example, nodes N_5 and N_6 were unconnected.
Each node was described by variables determining its computational and power specifications. In order to describe node n_j, we used the tuple (v_j, e_j, e_init), where v_j denoted the computation power of the node, measured in instructions per second (IPS); e_j denoted the average energy consumption, measured in Joules per second (J/s); and e_init denoted the initial energy. Each node was also described by two constants, P_0 and L_0. P_0 represented the maximum computational load, while L_0 represented the maximum communication load.
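As an illustrative sketch (not taken from the paper), the network model above can be represented with a symmetric distance matrix in which infinity marks unconnected node pairs, and each node carries the tuple (v_j, e_j, e_init); all numeric values below are hypothetical.

```python
import math

INF = math.inf

# Hypothetical 4-node network: d[i][j] is the distance d_ij between nodes,
# with math.inf marking unconnected pairs, as in the network model.
d = [
    [0.0, 10.0, INF, 25.0],
    [10.0, 0.0, 15.0, INF],
    [INF, 15.0, 0.0, 30.0],
    [25.0, INF, 30.0, 0.0],
]

# Each node j described by (v_j, e_j, e_init):
# computation power (IPS), average energy draw (J/s), initial energy (J).
nodes = [(2e6, 0.5, 100.0), (1e6, 0.3, 80.0), (3e6, 0.7, 120.0), (1.5e6, 0.4, 90.0)]

def connected(i, j):
    """Two distinct nodes are connected iff their distance is finite."""
    return i != j and math.isfinite(d[i][j])
```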

Task Model
The task model G(M, E_M) was given by a directed acyclic graph (DAG), where we assumed that we had m tasks M = {M_i : i = 1, 2, ..., m}. Each task was described by its computational load P = {P_i : i = 1, 2, ..., m}, which indicated the number of instructions (NI), and its communication load L = {L_i : i = 1, 2, ..., m}, measured in bytes. Since the nodes that executed the tasks had the limits P_0 and L_0, Equation (1) was used to determine the number of nodes required to execute any task i as K.
where K 0 denotes the minimum number of nodes required to execute one task.

Energy Model
The energy consumption model was a combination of two parts: the first part, E_comp, denoted the energy consumption due to executing the instructions of task P_i and was determined using Equation (2).
The second part, E comm , was a combination of two other parts and was expressed in Equation (3).
It was assumed that the radio dissipated E_elec [nJ/bit] to power the transmitter or the receiver circuit; the transmission energy was expressed in Equation (4), and the reception energy was expressed in Equation (5), where: E_Tx(k, d) was the energy consumption when transmitting k bits over a distance d; E_Rx(k) was the energy consumption when receiving k bits; k was the number of transmitted bits, derived from L in the task model; d was the distance between the two nodes, derived from the network model; E_elec = 50 nJ/bit was the constant required to power the transmitter or the receiver circuit; and amp was the coefficient of the transmitter amplifier, equal to 100 pJ/bit/m². Using L_i, the communication load of task i, and substituting Equations (2), (4), and (5) into Equation (6) yielded Equation (7).
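The bodies of Equations (4) and (5) are not reproduced here, but the stated constants (E_elec = 50 nJ/bit, amp = 100 pJ/bit/m²) match the standard first-order radio model, which the following sketch assumes: a circuit term E_elec·k for both directions plus an amplifier term amp·k·d² for transmission.

```python
E_ELEC = 50e-9   # J/bit: transmitter/receiver circuit constant (50 nJ/bit)
AMP = 100e-12    # J/bit/m^2: transmitter amplifier coefficient (100 pJ/bit/m^2)

def e_tx(k, dist):
    """Energy (J) to transmit k bits over dist metres:
    circuit term plus amplifier term (assumed first-order radio model)."""
    return E_ELEC * k + AMP * k * dist ** 2

def e_rx(k):
    """Energy (J) to receive k bits: circuit term only."""
    return E_ELEC * k
```

At zero distance the transmit and receive energies coincide, and the amplifier term grows quadratically with distance, which is why distant node pairs are costly in the communication part of the energy objective.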

Time Execution Metric
Equation (8) expresses the time each node n_i, with computation speed v_i, spent executing task j, in terms of both its computation and communication times. Equation (9) was used to determine the makespan.

Renting Cost Metric
By assuming that each node in the network had a renting rate of r_i, and that node n_i needed to operate for t_ij in order to execute task j, the total renting cost of the node was given by Equation (10): RC_ij = r_i · t_ij. To minimise the renting cost of all nodes executing all tasks, we used Equation (11).
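Equation (10), RC_ij = r_i · t_ij, and the makespan of Equation (9) can be illustrated together. The computation-time formula t_ij = P_j / v_i used below is an assumption for the sketch: Equation (8) also includes a communication term, which is omitted here, and all input values are hypothetical.

```python
def exec_time(num_instructions, v_i):
    # Assumed computation-time component: instructions divided by node speed
    # (IPS). The paper's Equation (8) adds a communication term, omitted here.
    return num_instructions / v_i

def total_renting_cost(assignments, rates, speeds, loads):
    """Sum of RC_ij = r_i * t_ij over all (node i, task j) assignments."""
    return sum(rates[i] * exec_time(loads[j], speeds[i]) for i, j in assignments)

def makespan(assignments, speeds, loads):
    """Makespan: the latest node finish time, assuming each node runs its
    assigned tasks sequentially."""
    finish = {}
    for i, j in assignments:
        finish[i] = finish.get(i, 0.0) + exec_time(loads[j], speeds[i])
    return max(finish.values())
```

For example, with two nodes of speeds 2 and 4 IPS, two tasks of 8 instructions each, and rates 1 and 2, assigning one task per node gives execution times 4 and 2, a makespan of 4, and a total renting cost of 1·4 + 2·2 = 8.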

Optimisation Objective Functions
The solution was optimised by assigning specific tasks to specific nodes as defined by the matrix X = [x_ij] ∈ {0, 1}^(n×m). Therefore, the problem was a binary optimisation problem. The objective function was described by the five Equations (12)-(16) together with the connectivity constraint of the dependent tasks.

Transfer Function Model
A transfer function was used to convert particle swarm searching to binary form to solve the described problem. Assuming that particle i had a velocity v(t, i, d) for dimension d at iteration t, the corresponding particle bit changed its value with a probability TF(v(t, i, d)). We generated a random number r ∈ [0, 1] and compared it with TF(v(t, i, d)); if r was lower than TF(v(t, i, d)), the bit changed its value, otherwise it did not. After converting the approaches to binary space, we added the letter B to the method name to indicate the binary space.
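The bit-flip rule above can be sketched as follows. The paper does not reproduce its exact transfer function here, so the sigmoid, a common S-shaped choice for binary PSO, is an assumption of this sketch.

```python
import math
import random

def tf(v):
    # Assumed S-shaped transfer function (sigmoid); the paper's exact
    # transfer function may differ.
    return 1.0 / (1.0 + math.exp(-v))

def update_bit(bit, v, rng=random):
    """Apply the described rule: draw r in [0, 1] and flip the particle's
    bit iff r < TF(v); otherwise keep the bit unchanged."""
    if rng.random() < tf(v):
        return 1 - bit
    return bit
```

A large positive velocity makes TF(v) approach 1, so the bit almost surely flips; a large negative velocity makes a flip very unlikely.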

Crowding Distance
The concept of crowding distance was first proposed by [12] in NSGA-II. It measured the density of the solutions in the space with respect to the objectives. The purpose of using this concept was to add a selection criterion for solutions when they were non-dominated. Basically, solutions located in less crowded areas, i.e., with bigger crowding distances, were favoured. Incorporating this criterion in PSO provided higher exploration of the solution space. The pseudo-code for determining crowding distance is provided in Algorithm 1.
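A minimal sketch of crowding distance, assuming the standard NSGA-II definition from [12]: per objective, sort the front, give boundary solutions infinite distance, and accumulate the normalised gap between each solution's neighbours.

```python
def crowding_distance(front):
    """Crowding distance for a list of objective vectors (one per solution).
    Boundary solutions get infinite distance; interior solutions accumulate
    the normalised neighbour gap per objective (NSGA-II definition)."""
    n = len(front)
    if n == 0:
        return []
    m = len(front[0])
    dist = [0.0] * n
    for k in range(m):
        order = sorted(range(n), key=lambda i: front[i][k])
        dist[order[0]] = dist[order[-1]] = float("inf")
        span = front[order[-1]][k] - front[order[0]][k]
        if span == 0:
            continue  # all solutions equal in this objective
        for pos in range(1, n - 1):
            i = order[pos]
            dist[i] += (front[order[pos + 1]][k] - front[order[pos - 1]][k]) / span
    return dist
```

Solutions with larger crowding distance sit in sparser regions and are the ones favoured by the selection criterion described above.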

Developed DAMP Model
The distance angle multi-objective particle swarm optimisation (DAMP) model developed in this study is explained in this section. A directionally aware MOO differs from traditional MOO in its exploration power, which aims at probabilistically spreading the solutions in the space according to their crowding distances and directions. Contrary to the study by [13], which combined crowding distance with direction in the search but allocated more priority to direction, DAMP performs the search probabilistically while allocating equal weight, or selection probability, to both direction and distance. The pseudo-code for DAMP is provided in Algorithm 2. To ensure the progress of the search, the algorithm selected the best solutions out of a combined pool of solutions from both the original swarm and the swarm after mobility and mutation. The selection was performed via non-dominated sorting using the non-domination criterion ≮. The solutions were then ranked accordingly, where, if x and y belonged to the same rank, then x ≮ y and y ≮ x.
Assuming that the solutions were sorted within k ranks, as shown in Figure 1, solutions ranked in R_1 were the most optimal solutions and dominated the subsequent solutions. Solutions in rank R_2 were the second most optimal and, while they dominated the other ranks, they were dominated by solutions in rank R_1. Additionally, if ∀ x, y ∈ R_i, then x ≮ y and y ≮ x. The objective was to select N solutions from the sorted solutions. As shown in the pseudo-code in Algorithm 3, the selection algorithm began by selecting solutions from the most optimal ranks R_1, R_2, ... R_i. The remaining N − (N_1 + N_2 + ... + N_{i−1}) solutions were then selected from the N_k solutions of the last admitted rank in a way that was consistent with exploration. This method considered two criteria: the angle distribution and the crowding distance distribution. It started by sorting the solutions from the highest to the lowest crowding distance and from the highest to the lowest angle range. The angle range rank was defined as the number of selected solutions within an angular sector of the solution space. The first and the second sets of sorted solutions were assigned to sorted-solutions-distance and sorted-solutions-angle, respectively. The algorithm then iterated from 1 to N − (N_1 + N_2 + ... + N_{i−1}) and generated a random number r in each iteration. If the generated number was between 0 and 0.5, a solution was selected from sorted-solutions-distance; otherwise, it was selected from sorted-solutions-angle. This ensured a balance between angle and crowding distance exploration. The pseudo-code is provided in Algorithm 4. For DAMP, the solution selection probability combined two criteria: crowding distance and angle range rank.
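The probabilistic selection step just described can be sketched as follows. The rank dictionaries and the duplicate-skipping details are assumptions of this sketch, not reproduced from Algorithm 4.

```python
import random

def damp_select(candidates, dist_rank, angle_rank, n_select, rng=random):
    """Fill n_select slots by drawing, with equal probability 0.5, the next
    unchosen solution from either the distance-sorted or the angle-sorted
    list (a sketch of the described selection rule)."""
    by_dist = sorted(candidates, key=lambda s: dist_rank[s], reverse=True)
    by_angle = sorted(candidates, key=lambda s: angle_rank[s], reverse=True)
    chosen, di, ai = [], 0, 0
    while len(chosen) < n_select:
        if rng.random() < 0.5:          # r in [0, 0.5): take from distance list
            while by_dist[di] in chosen:
                di += 1
            chosen.append(by_dist[di])
        else:                           # r in [0.5, 1]: take from angle list
            while by_angle[ai] in chosen:
                ai += 1
            chosen.append(by_angle[ai])
    return chosen
```

Because each draw favours neither list, solutions that rank highly on either crowding distance or angular spread both have a path into the next swarm, which is the equal-weight behaviour that distinguishes DAMP.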
To demonstrate the concept, we assumed that we had six non-dominated particles, as shown in Figure 2a, with Figure 2b,c showing the corresponding angular range ranks and crowding distance arrays, respectively. Using the probabilistic calculation, a probability of 0.5/2 was assigned to solutions three and four and likewise to solutions one and two, giving each of the four solutions an overall selection probability of 1/4.

Big O Notation
We present the complexity analysis of the four variants of MP. Assuming that our meta-heuristic search had N particles, M iterations, particle length d, and m objectives, the complexity of the search was dominated by the sorting step. We observed that the only difference between DAMP, DMP, AMP, and MP was the sorting part, where the algorithm had to sort the solutions based on their angle as well as their distance. The sorting part was the same for all four algorithms, O(mN²), because the big O notation of a sum of two functions is the maximum of their big O notations.

Evaluation
The MOO performance measures used to evaluate our proposed method and a comparison between our method and the benchmarking MOO mathematical functions are provided in this sub-section.
C-Metric Measure
The C-metric, or set coverage, compared the Pareto fronts of two approaches in terms of domination. Given approach A and approach B, the Pareto front generated from approach A was labelled P_A, and the Pareto front generated from approach B was labelled P_B. The C-metric C(A, B) = C_B indicated the number of solutions from B that were dominated by solutions in A. The lower the C_B value, the better the performance. Therefore, the objective was to develop an approach with the lowest C_B value. The formula for this measure is expressed in Equation (20):
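A sketch of set coverage, assuming minimisation objectives and the commonly used normalised form of the metric (fraction of B dominated by A):

```python
def dominates(a, b):
    """a Pareto-dominates b under minimisation: no worse in every objective
    and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def c_metric(pa, pb):
    """C(A, B): fraction of solutions in front B that are dominated by at
    least one solution in front A."""
    return sum(any(dominates(a, b) for a in pa) for b in pb) / len(pb)
```

Note that C(A, B) and C(B, A) are not complementary in general, which is why the analysis below reports both directions.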

Hyper-Volume Measure
This measure was a simultaneous indicator of diversity and domination. It was defined using the hypercubes whose diagonals connect the solutions in the Pareto front to the worst set in terms of domination. Therefore, the higher the hyper-volume value, the better the quality of the solutions. The formula for this measure is expressed as follows:
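For the two-objective case, the hyper-volume can be computed by summing the non-overlapping rectangles between the front and a reference (worst) point. This sketch assumes minimisation and a mutually non-dominated front dominated by the reference point.

```python
def hypervolume_2d(front, ref):
    """Hyper-volume of a 2-objective front (minimisation) with respect to a
    reference point worse than every front member. Sorting ascending by the
    first objective makes the second objective descend, so each point adds
    one rectangle of dominated area not covered by earlier points."""
    pts = sorted(front)           # ascending in f1, hence descending in f2
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        hv += (ref[0] - f1) * (prev_f2 - f2)
        prev_f2 = f2
    return hv
```

For instance, the front {(1, 4), (2, 2), (4, 1)} with reference point (5, 5) covers an area of 11.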

Delta Measure
This measure was a simultaneous indicator of the uniformity of the Pareto front's distribution and spread. Therefore, it was a measure of diversity. It was denoted by ∆ and needed to be minimal.
The pseudocode for calculating this measure is provided in Algorithm 5 [12].

Generational Distance (GD)
This measure was an indicator of the optimality of the solutions in terms of their closeness to the true Pareto front. It measured the average distance between the Pareto front solutions and the true Pareto front solutions. Therefore, the lower the GD value, the more optimal the solutions [44].
where |P_S| was the number of solutions in the Pareto set, P_T was the true Pareto front, and d_i was the Euclidean distance between a solution in P_S and the nearest solution in P_T.
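A sketch of GD as described above: the mean of nearest-neighbour Euclidean distances from the discovered set to the true front (some formulations normalise differently, e.g. using the root of the summed squared distances).

```python
import math

def generational_distance(ps, pt):
    """GD: average Euclidean distance from each solution in the Pareto set
    P_S to its nearest member of the true front P_T (lower is better)."""
    return sum(min(math.dist(s, t) for t in pt) for s in ps) / len(ps)
```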

Number of Non-Dominated Solutions
This measure was an indicator of MOO algorithm performance in terms of the number of non-dominated solutions (NDS) in the Pareto front. Therefore, the higher the NDS value, the higher the performance [45].
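NDS is simply the size of the non-dominated subset of the final population; a sketch assuming minimisation:

```python
def dominates(a, b):
    """a Pareto-dominates b under minimisation."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated(solutions):
    """Return the non-dominated subset (Pareto front) of a solution set;
    NDS is the length of the returned list."""
    return [s for s in solutions
            if not any(dominates(o, s) for o in solutions if o is not s)]
```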

Evaluation and Results
DAMP, DMP, and AMP were evaluated using MATLAB 2019b against other benchmark MOPSOs [9], as shown in Table 4. We used the same common parameters for the proposed methods and the benchmarks, and we used the same objectives for comparison. Furthermore, each experiment was repeated for 10 runs, changing the seed of the random number generator. The selected parameter values were based on a tuning process for c1 and c2, which represent the coefficients of the effect of the personal best and the global best, respectively; they were set to 1/3 and 2/3, respectively. We also set the parameters w, V_max, and V_min to 0.5, 0.1, and 0.001, respectively. These parameters relate to the original particle mobility equations of PSO, presented in Equations (25)-(28).
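The canonical PSO velocity update with the stated parameter values can be sketched as follows. Treating V_max as a symmetric magnitude clamp is an assumption of this sketch, and V_min is omitted because its exact role in Equations (25)-(28) is not reproduced here.

```python
import random

# Parameter values stated above: c1 = 1/3, c2 = 2/3, w = 0.5, V_max = 0.1.
C1, C2, W, V_MAX = 1 / 3, 2 / 3, 0.5, 0.1

def update_velocity(v, x, pbest, gbest, rng=random):
    """Canonical PSO velocity update for one dimension: inertia plus
    personal-best and global-best attraction, clamped to [-V_MAX, V_MAX]."""
    new_v = (W * v
             + C1 * rng.random() * (pbest - x)
             + C2 * rng.random() * (gbest - x))
    return max(-V_MAX, min(V_MAX, new_v))
```

With c2 twice c1, the swarm is pulled toward the global best twice as strongly, on average, as toward each particle's personal best.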

Evaluation By Mathematical Functions
This section evaluates the MOO models using the MOO mathematical functions mentioned in the sub-section above. The results are presented in the following sub-sections.

Set Coverage Analysis
The role of set coverage was to judge the superiority of the developed DAMP model over the industry benchmarks, such as MP and DMP, and over the intermediate variant model AMP. Therefore, we compared C(DAMP, x) against C(x, DAMP), where x could represent any one of the three compared models.
Figure 3 shows the clear superiority of the DAMP model in terms of C-metric values across all the mathematical functions. However, differences in domination performance existed between the models. For instance, the DAMP model dominated the MP model the most compared to the other two models. We also observed similar domination performance between the AMP and DMP models. We further noticed that DAMP had higher C(DAMP, x) values when evaluated using ZDT1, ZDT2, ZDT3, ZDT4, ZDT6, and FON. On the other hand, we found that the C(DAMP, x) values were nearly identical when evaluated using POL, and that the C(x, DAMP) values were slightly lower when evaluated using KUR for x = AMP, DMP. Therefore, the overall performance of the DAMP model was clearly superior to all other models in terms of domination.

Hyper-Volume
This measure judged the diversity of the developed solutions. While it was regarded as a secondary measure after set coverage, since domination was more important than diversity, judging diversity helped identify the overall performance of the algorithm. As shown in Figure 4, the MP model provided higher hyper-volume values than the other models when evaluated using ZDT1, ZDT3, ZDT4, and KUR. Furthermore, Table 5 shows the statistical significance of the hyper-volume values for ZDT1, ZDT4, ZDT6, SCH, and FON based on t-test values. Cross-referencing these visual results with the statistical results in Table 5 shows the superiority of the DAMP model in terms of hyper-volume for FON, SCH, and ZDT6. Conversely, the hyper-volume of the MP model was superior to that of DAMP when evaluated using ZDT1 and ZDT4. Nevertheless, for all these functions, the MP model had relatively low domination, as seen in the set coverage analysis sub-section. We also found that the DAMP model had higher hyper-volume values than the MP model when evaluated using SCH. The hyper-volume performance of DAMP, when evaluated using KUR, was similar to that of the MP model and superior to the AMP and DMP models. These findings support the diversity quality of the DAMP model solutions in addition to the domination observed in the previous sub-section.

Number of Non-Dominated Solutions (NDS)
This measure indicated the number of non-dominated solutions, which reflects the choices provided to the decision maker after optimisation. As seen in Figure 5, the NDS of the DAMP model was comparable with those of the AMP and DMP models when evaluated using SCH, ZDT1, ZDT3, ZDT4, KUR, and POL. Moreover, the NDS of all the other models was superior to that of the MP model, with statistical significance, when evaluated using ZDT1, ZDT2, ZDT3, ZDT4, ZDT6, KUR, and POL, as shown in Table 5.

Delta Measure
This measure was regarded as a measure of diversity. An observation of the values revealed that the DAMP model had average diversity compared to the AMP and DMP models when evaluated using almost all of the functions, whereas the MP model had higher diversity when evaluated using KUR and lower diversity when evaluated using FON. This measure was, again, secondary to the domination of solutions. Cross-referencing the average delta values shown in Figure 6 with the statistical findings in Table 5 confirmed that the DAMP model was statistically superior to the MP model when evaluated using FON, whereas the MP model was superior, with statistical significance, only for KUR.

Generational Distance
This measure was an indicator of the closeness between the discovered solutions and the true Pareto solutions. It did not provide as accurate a description of domination as set coverage. However, it did indicate the distance between the discovered solutions and the true Pareto front. In Figure 7, we can see that, for all mathematical functions, the DAMP, AMP, and DMP models had similar performance values and were better than the MP model in terms of distance to the true Pareto front. Cross-referencing these visual results with the statistical test values in Table 5 shows that the DAMP model was statistically significantly better than the MP model when evaluated using ZDT3, ZDT4, ZDT6, FON, and KUR. Bear in mind that the results of this measure do not reflect domination performance and should, therefore, be read in conjunction with the set coverage results.

Statistical Evaluation
For evaluation purposes, each measure was generated via 10 experiments using random seeds for our DAMP model, the AMP model, and the two industry-standard DMP and MP models. As seen in Table 5, our DAMP model dominated almost all MOO functions in comparison to the DMP and MP models. Furthermore, statistical significance also proved the superiority of our DAMP model over the AMP model for some measures, namely, NDS and GD, when evaluated using the ZDT functions.

Evaluation Based on Load Balancing Model
This section provides the evaluation of load balancing in terms of the MOO evaluation metrics. The evaluation was based on six tasks and a network of 30 nodes. The network is depicted in Figure 8 with the tasks assigned to each of the nodes. It shows that each algorithm assigned different tasks to different nodes, yielding different solutions with different performance metrics for the same problem. We also observed that the network links were established to produce connected graphs, which enabled data exchange while executing the tasks. The parameters and settings of the evaluation are presented in Table 6. We added a comparison with two additional algorithms, namely, the binary non-dominated sorting genetic algorithm (BN2) and binary particle swarm optimisation (BPSO), using the same number of solutions and iterations, 50 and 100, respectively. For objective comparison with the benchmarks, we show the evaluation results in the following Section 4. For objective evaluation, each method was executed 10 times, and its set coverage results were compared with those of the other methods. The results are provided as boxplots in Figures 9-11. As observed in Figure 9, BAMP achieved higher set coverage against most approaches, which demonstrated the superiority of using the angle as a search criterion. Another observation was that using the angle produced a wider range of outcomes across seeds than using neither the angle nor the distance, as shown by C(BAMP, BMP) when compared with C(BMP, BAMP), where the latter was narrower than the former. In addition, we observed that BAMP dominated BN2 with a higher percentage than BN2 dominated BAMP.
Contrary to the angle, the use of distance as a criterion for exploration had less influence on domination, as shown in Figure 10, where the majority of approaches accomplished more dominance than the standalone use of distance represented by BDMP. Another observation was that using both the angle and the distance was better than using the distance alone; this is observed in Figure 11, where the value of C(BDAMP, BDMP) was higher than the value of C(BDMP, BDAMP). However, BPSO generated a non-dominated solution compared with the other benchmarks. This solution dominated in a single objective, namely, energy distribution, while it was dominated with respect to the other objectives.
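The set coverage metric C(A, B) used in these comparisons is the fraction of solutions in B that are dominated by at least one solution in A; note that C(A, B) and C(B, A) are not complementary and must both be reported. A minimal sketch under minimisation, with hypothetical fronts:

```python
import numpy as np

def dominates(a, b):
    """a dominates b under minimisation of all objectives."""
    return bool(np.all(a <= b) and np.any(a < b))

def set_coverage(A, B):
    """C(A, B): fraction of solutions in B dominated by at least one solution in A."""
    return float(np.mean([any(dominates(a, b) for a in A) for b in B]))

A = np.array([[1.0, 1.0], [2.0, 0.5]])
B = np.array([[2.0, 2.0], [0.5, 3.0]])
print(set_coverage(A, B))  # 0.5  -> (2,2) is dominated by (1,1); (0.5,3) is not
```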

Hyper-Volume
In order to evaluate exploration of the objective space, we computed the hyper-volume. The results are given as boxplots in Figure 12. Observing Figure 12, we noticed that each experiment produced different hyper-volume (HV) values. In some experiments, BAMP achieved a higher hyper-volume, while in others, BDMP was superior. This revealed that exploration performance was highly sensitive to the initial seed. However, BN2 outperformed the other approaches of the swarm family in hyper-volume.
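Hyper-volume measures the portion of objective space dominated by a front relative to a reference point; larger is better. The sketch below is an exact sweep for the two-objective case only (our load balancing model has four objectives, for which Monte Carlo estimation or dedicated algorithms are typically used), and the front and reference point are hypothetical:

```python
import numpy as np

def hypervolume_2d(front, ref):
    """Exact 2-D hyper-volume (minimisation) of a non-dominated front w.r.t. ref."""
    pts = front[np.argsort(front[:, 0])]        # sort by first objective
    hv, prev_f2 = 0.0, ref[1]
    for f1, f2 in pts:
        hv += (ref[0] - f1) * (prev_f2 - f2)    # slab between successive f2 levels
        prev_f2 = f2
    return float(hv)

front = np.array([[1.0, 3.0], [2.0, 2.0], [3.0, 1.0]])
print(hypervolume_2d(front, ref=(4.0, 4.0)))  # 6.0
```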

Number of Non-Dominated Solutions
The other metric used to evaluate the multi-objective optimization approaches from the perspective of load balancing was the number of non-dominated solutions, given in Figure 13. We observed that the approaches had almost the same NDS, except for BMP, owing to the high degree of non-domination reached by the search within each algorithm.

Relative Generational Distance
The last metric was the relative generational distance (RGD), which should be minimised. Figure 14 shows that BAMP and BPSO were the best at minimising this metric compared with the other benchmarks. However, it is important to distinguish between lower Euclidean distance and greater domination, as the two are not always associated.

Statistical Evaluation
For thorough evaluation, we conducted a t-test to verify superiority from a statistical perspective. The t-test was conducted on three metrics, namely, RGD, HV, and NDS, and was based on a series of 10 runs with changing random seeds. The evaluation used a significance level of 0.05 for rejecting the null hypothesis and accepting the statistical significance of the difference between the approaches. As observed in Figures 15 and 16, respectively, RGD and HV were the two metrics showing the most statistical differences between the approaches, while for NDS, a statistical difference was only observed in the comparison with BPSO, because BPSO is a single-objective optimisation and is therefore weak at providing non-dominated solutions, as shown in Figure 17.
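The per-metric comparison over seeded runs can be sketched as a pooled two-sample t statistic. The samples below are synthetic placeholders (not our measured RGD values), and 2.101 is the standard two-tailed critical value for 18 degrees of freedom at the 0.05 level:

```python
import math, random

def t_statistic(x, y):
    """Pooled two-sample t statistic (equal-variance form of the standard t-test)."""
    nx, ny = len(x), len(y)
    mx, my = sum(x) / nx, sum(y) / ny
    vx = sum((v - mx) ** 2 for v in x) / (nx - 1)
    vy = sum((v - my) ** 2 for v in y) / (ny - 1)
    sp = math.sqrt(((nx - 1) * vx + (ny - 1) * vy) / (nx + ny - 2))
    return (mx - my) / (sp * math.sqrt(1 / nx + 1 / ny))

random.seed(0)
rgd_a = [random.gauss(0.10, 0.01) for _ in range(10)]  # synthetic: 10 seeded runs
rgd_b = [random.gauss(0.20, 0.01) for _ in range(10)]  # synthetic: 10 seeded runs
t = t_statistic(rgd_a, rgd_b)
# Two-tailed critical value for df = 18 at alpha = 0.05 is about 2.101.
print(abs(t) > 2.101)  # True: the difference is statistically significant
```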

Conclusions and Future Considerations
The study successfully optimised load balancing in software defined networking (SDN) using multi-objective optimisation (MOO) based on particle swarm optimisation (PSO). The industry-benchmark MP model was expanded with two additional search criteria, crowding angle and crowding distance: the angle provided the AMP model, the distance provided the DMP model, and joining them provided DAMP. A sigmoid transfer function was then incorporated to convert them to binary form, which provided BMP, BAMP, BDMP, and BDAMP. The evaluation was decomposed into two phases: the first was conducted on benchmarking mathematical functions, while the second was conducted on a developed load balancing SDN model with four objectives: energy (E), energy distribution (D), makespan (T), and renting cost (R). The evaluation found that both AMP and DAMP were superior to DMP and MP in optimising the benchmarking mathematical functions. Both BAMP and BDAMP were likewise superior to BMP and BDMP in terms of the load balancing metrics. Hence, the hypothesis of the superiority of directionality, or angle, in the optimisation was confirmed. Furthermore, it was concluded that the conversion to binary space did not degrade the performance of the optimisation.
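The sigmoid-based conversion to binary space can be sketched as follows: each continuous velocity component is mapped through the sigmoid to a probability of setting the corresponding bit, as in standard binary PSO (the variable names are illustrative, not taken from our implementation):

```python
import math, random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def binarise(velocity, rng):
    """Map a continuous PSO velocity vector to a binary position: bit i is set
    with probability sigmoid(velocity[i]), as in standard binary PSO."""
    return [1 if rng.random() < sigmoid(v) else 0 for v in velocity]

# Large negative velocity -> bit almost surely 0; large positive -> almost surely 1.
rng = random.Random(1)
print(binarise([-6.0, 0.0, 6.0], rng))  # [0, 0, 1]
```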
As potential applications of our method, we highlight edge networks where users have applications whose tasks need to be executed in real time or with low latency, which requires renting local nodes for this purpose instead of sending the tasks to the cloud. The developed multi-objective optimisation can also be applied to various combinatorial problems, such as surgery planning in hospitals [46] and job-shop planning [47].
Several limitations of the approach can be addressed. Firstly, it uses a fixed angle resolution for dividing the solution space, which might lead to unstable performance depending on the value given to the angle; one line of future work is to enable adaptive angle decomposition of the solution space. Future studies should also explore extending load balancing with other objectives, such as node reliability, and incorporating more search criteria in the optimisation algorithm. Secondly, it selects non-dominated solutions probabilistically based on angle or distance with equal weight; another line of future work is to make this selection based on an adaptive probability. Thirdly, it uses global learning based on moving each particle toward its global best following the conventional PSO mobility equation; global learning might lead to premature convergence, and a better approach is to use comprehensive learning [48]. Fourthly, we will extend the model to handle dynamic aspects such as running tasks twice on the same node and the cache effect.