Social Network Optimization for WSN Routing: Analysis on Problem Codification Techniques

Abstract: The correct design of a Wireless Sensor Network (WSN) is a very important task because it can strongly influence its installation and operational costs. An important aspect that should be addressed in WSN design is the routing definition in multi-hop networks. This problem is faced with different methods in the literature; here it is managed with a recently developed swarm intelligence algorithm called Social Network Optimization (SNO). In this paper, the routing definition in WSNs is approached with two different problem codifications and solved with SNO and Particle Swarm Optimization. The first codification allows the optimization algorithm more degrees of freedom, resulting in a slower and in many cases sub-optimal search. The second codification reduces the degrees of freedom, speeding up the optimization process significantly but blocking, in some cases, the convergence toward the real best network configuration.


Introduction
The Internet of Things paradigm is increasing the importance of Wireless Sensor Networks (WSN) in which a set of small simple sensors are interconnected for creating a very complex structure. The information sensed by all the nodes of the network should be sent to the cluster head that processes them and exploits the information [1].
The design of a WSN raises some issues in terms of deployment and operational costs. In this framework, Evolutionary Optimization Algorithms (EAs) can be very important tools for the system design because of their flexibility and ease of use.
The research on EAs has two main preferential directions: on the one hand, more complex operators are introduced and analyzed in the algorithms for improving their performance [2]; on the other hand, their field of applicability is analyzed and enlarged [3].
The most used EAs are Genetic Algorithms (GA) and Particle Swarm Optimization (PSO) [4]. PSO and Ant Colony Optimization (ACO) [5] are the most common algorithms that belong to Swarm Intelligence.
Among EAs, the Differential Evolution (DE) algorithm has been widely applied and studied [6,7]. This algorithm is not biologically inspired; while it was originally designed for real-valued problems [8], it has also been implemented for discrete problems [9]. It has been applied to a wide range of multimodal problems, such as neural network training [10]. Biogeography-Based Optimization (BBO) is a more recently developed EA based on the survival mechanism of species in different environments [11].

Swarm Intelligence
Swarm Intelligence Algorithms are an important class of Evolutionary Optimization Algorithms. In this paper, Particle Swarm Optimization and Social Network Optimization are used. In the following, a brief description of both these algorithms is presented.

Particle Swarm Optimization
Particle Swarm Optimization (PSO) is a milestone in Swarm Intelligence algorithms [25]. Its operators are derived from the concept of collective intelligence, which can be summarized in the following three sentences [26]:
• Inertia: 'I continue on my way'.
• Influence of the past: 'I've always done it this way'.
• Influence of society: 'If it worked for them, it would also work for me'.
Each individual of the population represents a particle, which is characterized by a position and velocity.
At each iteration of the algorithm, the personal best position found by each particle (PB) and the global best (GB) found by the entire swarm are computed. These points are the attractors of each individual. In particular, the velocity is updated in the following way:

v_i(t + 1) = ω · v_i(t) + c_1 · r_1 · (PB_i − p_i(t)) + c_2 · r_2 · (GB − p_i(t)) (1)

where v_i is the velocity of the ith particle, p_i its position in the search space, PB_i the personal best, GB the global best, and r_1, r_2 random numbers uniformly drawn in [0, 1]. Finally, ω, c_1, and c_2 are three user-defined parameters. At each iteration, the position is updated in the following way:

p_i(t + 1) = p_i(t) + v_i(t + 1) (2)

PSO performance is greatly influenced by the selection of the algorithm working parameters. The implemented PSO version is characterized by the following parameters: the inertia weight (ω), the personal learning coefficient (c_1), the global learning coefficient (c_2), and the velocity clamping (V_rsm) [27,28]. Moreover, the algorithm performance depends on the population size (n_pop), which is related to the number of iterations (n_iter) and the number of objective function calls (n_call) by the following equation:

n_pop · n_iter = n_call (3)
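The update rules above can be sketched in code for a single particle. This is a minimal illustration, not the paper's implementation; the parameter values are placeholders, and the clamping corresponds to the V_rsm limit mentioned in the text.

```python
import random

def pso_step(p, v, pb, gb, omega=0.7, c1=1.5, c2=1.5, v_max=0.5):
    """One PSO velocity/position update for a single particle.
    p, v, pb, gb are lists of floats (position, velocity, personal best,
    global best); parameter values are illustrative only."""
    new_p, new_v = [], []
    for pi, vi, pbi, gbi in zip(p, v, pb, gb):
        vel = (omega * vi
               + c1 * random.random() * (pbi - pi)
               + c2 * random.random() * (gbi - pi))
        # velocity clamping (the V_rsm parameter in the text)
        vel = max(-v_max, min(v_max, vel))
        new_v.append(vel)
        new_p.append(pi + vel)
    return new_p, new_v
```

When the particle already sits on both attractors, only the inertia term acts, which makes the clamping behavior easy to check in isolation.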

Social Network Optimization
Social Network Optimization is a recently developed population-based, Swarm Intelligence algorithm that takes its inspiration from the information sharing process in Online Social Network [29].
The population of the algorithm is composed of users of the social network who interact online by publishing posts. At each iteration, the users express their opinions by means of the status contained in the post. In addition to this information, the post contains the name of the user who posted it, the time at which it was posted, and a visibility value. The process of passing from opinions to a post status is called linguistic transposition.
The status corresponds to the candidate solution of the optimization problem, while the visibility value is created by means of a proper mapping of the cost function associated to the specific candidate solution, as shown in Figure 1. The visibility values of the entire population are used to update the reputations of all the users of the social network. This operation transfers global information among the entire population. The reputations are used for creating one of the two interaction structures of SNO: the trust network.
The second interaction network is the friend network. These two networks are very different from each other: the friend network leads to more consistent interactions, its variability is lower, and its modifications are related to elements outside the social network. The trust network creates weaker connections, varies quickly, and is modified according to the visibility values, as shown in Figure 2. From the interaction networks, each user extracts some ideas that compose the attraction point used for creating the new opinions. The implemented operator emulates the assumption of a complex contagion, which guarantees a better trade-off between exploration of the domain and exploitation of the acquired knowledge. All these operators make SNO a quite complex algorithm, but they give it the ability to work very well on different kinds of problems and reduce the risk of stagnation in local minima, which is a well-known issue of PSO.
The most important algorithm parameters are α and β because they tune the effective behavior of the population. Two different analyses have already been performed for properly selecting these parameters: The first one is an analytical analysis of a simplified version of the complex contagion. The second one is a numerical parametric analysis performed on different benchmarks. More details on these analyses can be found in [30].
The parametric analysis on the parameters α and β has been performed on two standard benchmarks for EAs: the Ackley and Schwefel functions. The termination criterion was set to 5000 objective function calls and 50 independent trials were performed on the 20D functions. Figure 3 shows the results of the parametric analysis: the color is proportional to the cost value, where red is high cost and blue low cost. The results show that the high-quality solution areas differ between the two functions, but it is possible to find a good set of parameters for both. In particular, the selected set of parameters is α = 0.1 and β = 0.45. In [31], it is possible to find a comparison between SNO and other EAs, in which the performance of SNO is assessed on standard benchmarks.

Wireless Sensor Network
A Wireless Sensor Network is composed of a set of sensors, deployed in a field, that can communicate by means of a multi-hop protocol to a central node, called the cluster head.
In the following, the adopted network model is described and then the two proposed problem codification techniques are described. Finally, the optimization environment for this problem is analyzed.

Network Energy Model
The analyzed WSN is composed of a set of identical sensor nodes deployed randomly in space. Each of these nodes has a maximum communication distance that is imposed by the acceptable signal-to-noise ratio.
The communication between the nodes and the central node, the cluster head, happens by means of a multi-hop protocol: in this way, even nodes quite far from the cluster head can send to it the sensed information.
Each sensor except the cluster head is fed by batteries with a total capacity of 9 kJ; thus, the network lifetime is limited by the energy consumption of the most stressed sensor.
The energy consumption of each node can be computed as a function of the transmitted and received bits: in fact, it is assumed that the sensing energy is negligible with respect to the communication energy [20]. Thus, the total energy consumption is composed of two terms:

E_i = E_trx,i + E_amp,i (4)

where E_trx,i is the total energy required for transmission and reception of information:

E_trx,i = e_trx · b_trx,i (5)

where e_trx is the energy required to keep on the communication equipment and b_trx,i is the total number of bits transmitted and received.
The second term of the total energy consumption is the amplification energy required for the transmission of the information. It depends on the number of transmitted bits (b_ij) and on the transmission distance (d_ij) between the two nodes i and j:

E_amp,i = ∑_j e_amp · b_ij · d_ij^2 (6)

where e_amp is the amplification energy per bit and per squared meter. Both the sensing energy and the energy consumed in idle time are considered negligible with respect to the other terms.
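The two energy terms can be combined in a short sketch. This assumes the first-order radio model outlined above; the coefficient values (`e_trx`, `e_amp`) are illustrative placeholders, not the values of the paper's Table 1.

```python
def node_energy(b_trx, tx_links, e_trx=50e-9, e_amp=100e-12):
    """Total energy of one node: electronics term for all transmitted and
    received bits (b_trx) plus a distance-dependent amplification term.
    tx_links: list of (bits, distance) pairs, one per outgoing link.
    Coefficients are illustrative, not the paper's parameters."""
    e_comm = e_trx * b_trx                                # E_trx,i
    e_ampl = sum(e_amp * b * d ** 2 for b, d in tx_links)  # E_amp,i
    return e_comm + e_ampl
```

For example, a node handling 1000 bits overall and transmitting 500 bits over a 10 m link would consume 5e-5 + 5e-6 J under these placeholder coefficients.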
The transmitted and received bits are functions of the network topology, i.e., of the interconnections between sensors and can be easily calculated.
The network is composed by a set of randomly deployed sensors: as introduced above, the feasible connections between sensors is determined by the maximum communication distance between nodes.
The feasible connections are bidirectional if they involve only sensor nodes, while they are considered monodirectional if one of the two nodes is the cluster head, as shown in Figure 4a, where the directed connections are indicated with arrows.
There are more feasible connections than required ones; thus, it is possible to select the effective final network topology by identifying the active connections among the feasible ones, as shown in Figure 4b. The selection of the active connections also requires selecting the direction of the communication. This is an important aspect for the problem codification, as analyzed below.
Thus, when the active connections are selected, the network can be represented by a directed graph; by means of a graph search starting from the nodes with the highest depth, it is possible to determine the flow of information in the network, and thus the number of received and transmitted bits.
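The bit-counting step described above can be sketched as follows. This is a simplified illustration, not the paper's Matlab routine: the routing is represented as a parent map, each sensor is assumed to originate one packet per working cycle, and the map is assumed loop-free (the guard raises otherwise).

```python
def packet_counts(next_hop):
    """next_hop maps each sensor to its chosen parent toward the cluster
    head (the cluster head itself has no entry). Each sensor originates one
    packet per cycle; intermediate nodes forward everything they receive.
    Returns per-node (transmitted, received) packet counts."""
    tx = {n: 0 for n in next_hop}
    rx = {n: 0 for n in next_hop}
    for node in next_hop:
        cur, hops = node, 0
        while cur in next_hop:        # walk the route up to the cluster head
            tx[cur] += 1
            cur = next_hop[cur]
            if cur in next_hop:       # the cluster head does not retransmit
                rx[cur] += 1
            hops += 1
            if hops > len(next_hop):
                raise ValueError("routing loop detected")
    return tx, rx
```

On a simple chain 3 → 2 → 1 → cluster head, node 1 forwards the packets of both descendants plus its own.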
This network model was implemented in Matlab, adopting the parameters shown in Table 1. The objective of this optimization problem is the maximization of the network lifetime, i.e., the number of working cycles in which all the sensors have a residual stored energy.
The design variables of this problem are the active connections in the WSN; in fact, by changing the active connections, the number of bits received and transmitted by the sensors is modified and, thus, the sensor lifetime is changed. It is possible to codify these design variables in different ways in the optimization framework, changing the search space size and the number of unfeasible solutions. A deeper analysis of the possible codification methods is provided in the following.
To the best of the authors' knowledge, several papers have proven that the problem of lifetime maximization by properly selecting the routing paths in a WSN is NP-hard [32,33].

Problem Codification
The presented WSN model can be codified in the optimization environment in different ways that affect the number of optimization variables, their type, the possibility of finding unfeasible solutions, and the search space size.
The most basic codification of this problem can be performed using binary design variables, each one indicating whether a feasible connection is also active. This formulation is very simple and generic, but it leads to a high number of non-connected networks, i.e., networks in which at least one node cannot communicate with the cluster head. Moreover, many solutions are surely suboptimal, such as the ones with loops or with a node that transmits its information to more than one sensor. This possibility, which can be important for reliability analysis of the communication, is not optimal in this context because the transmission and receiving energy of some nodes is doubled.
In this strategy, the number of design variables is equal to the number of feasible connections in the network, and thus generally grows more than linearly with the number of nodes.
It is possible to design more complex problem codifications by analyzing some features of optimal solutions. First, each sensor should be able to transmit its information to another one.
In this framework, it is possible to associate to each node its adjacency list (as shown in Figure 5) and select one target node from this list. With this codification, the problem is characterized by N − 1 integer design variables, where N is the number of nodes. The drawback of this codification (called the adjacency-based codification in the following) is the possibility of creating non-connected networks or looped graphs during the optimization process.
To avoid these drawbacks, it is possible to consider the node depth, i.e., the number of hops needed to communicate from the analyzed sensor to the cluster head. Figure 6a shows the same graph seen before, in which the node position is related to the node depth: each circle corresponds to a different depth. For the second codification method, depth-based adjacency lists have been created: each list includes only the nodes with smaller node depth. Figure 6b shows the same network of Figure 5 with the depth-based lists. In this case as well, the design variables are integer numbers.
As can be clearly seen, the number of degrees of freedom given to the algorithm is significantly reduced; this completely eliminates the possibility of having non-connected nodes in the network, but it also eliminates some paths that can be optimal.
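The construction of the depth-based lists can be sketched with a breadth-first search from the cluster head: BFS assigns each node its hop-count depth, and each list then keeps only the strictly closer neighbours. This is an illustrative reconstruction of the procedure, not the paper's code.

```python
from collections import deque

def depth_based_lists(adj, head):
    """adj: dict node -> set of feasible neighbours (undirected links);
    head: the cluster head. Returns (depth, lists), where lists[i] contains
    only the neighbours of i that are strictly closer to the head."""
    depth = {head: 0}
    q = deque([head])
    while q:                          # BFS assigns the hop-count depth
        u = q.popleft()
        for v in adj[u]:
            if v not in depth:
                depth[v] = depth[u] + 1
                q.append(v)
    lists = {n: sorted(v for v in adj[n] if depth[v] < depth[n])
             for n in adj if n != head}
    return depth, lists
```

Nodes unreachable from the head would be missing from `depth`, which directly flags a non-connected feasible graph.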
Both codification techniques make this a combinatorial problem. However, in most real cases, it is impossible to test all the configurations due to the search space size.

Performance Calculation
In this section, an example of how the system works is presented: it starts from the optimization variables and shows all the steps required for the calculation of the cost value.
The optimizer works with real variables that are translated into integer values by means of a set of thresholds [23]. Each variable corresponds to a node, and the number obtained from the translation is the index in the adjacency list for the first codification technique or in the depth-based list for the second. In this example, the adjacency-based codification of the network presented in Figure 5 is used, but the procedure can easily be extended to the other method. Figure 7 shows the decodification of two different sets of optimization variables: in the first one (Figure 7a), the resulting network is connected, while, in the second one (Figure 7b), it is not, and thus the solution is unfeasible.
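The threshold-based translation can be sketched as follows. This assumes, for illustration, real variables normalized in [0, 1) and uniform thresholds over each candidate list; the actual thresholds of [23] may differ.

```python
def decode(x, lists):
    """Map each real variable in [0, 1) to an index in the corresponding
    node's candidate list via uniform thresholds, yielding one target node
    per sensor. lists: dict node -> ordered list of candidate targets."""
    routing = {}
    for node, xi in zip(sorted(lists), x):
        options = lists[node]
        idx = min(int(xi * len(options)), len(options) - 1)  # uniform bins
        routing[node] = options[idx]
    return routing
```

The resulting `routing` map is exactly the parent map used for the packet-count and feasibility checks: a node whose chain of parents never reaches the cluster head marks the solution as unfeasible.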
If the network is feasible, the calculation of the cost is run. The first activity performed is the estimation of the information packages flowing through each node. This is computationally performed on the graph represented by the active connections by means of a customized search that starts from the deeper nodes and then goes upward. Each node creates one information package and forwards the received ones. Figure 8 shows the transmitted packages for each edge. Once the number of packages transmitted and received is computed, the energy model shown in Section 3.1 can be easily applied.

Optimization Environment
After having defined the WSN model and the codification technique, it is possible to design the optimization environment, i.e., all the interactions between the different elements involved in the solution of this problem.
The optimization environment designed for the routing problem is the one shown in Figure 9. Two evolutionary algorithms are used, PSO and SNO. The optimization variables used in these algorithms are decodified to a specific network configuration with the techniques shown in the previous section. While a candidate solution of the algorithm is translated in a network configuration, two values are computed: the number of disconnected sensors, which is used as a constraint on the optimization problem, and the energy required by each sensor, which is the major objective of this problem.
The cost function is constructed for imposing the constraint and for helping the convergence of the algorithm.
In particular, it is defined as follows:

cost = 70 + 20 · n_disc                          if n_disc > 0
cost = 10^5 · (max_i E_i + (λ/N) · ∑_i E_i)      if n_disc = 0

The first condition is related to compliance with the constraints. The offset value (70) is used to avoid feasible solutions having higher costs than unfeasible ones: this creates a step in the convergence curves that helps identify the moment at which the algorithm reaches the feasible region of the search space.
The second cost term is related to feasible solutions. The scale factor 10^5 does not affect the optimization process; the term max_i E_i is the maximum energy consumption among the sensors and is the real objective of the optimization problem, while the term ∑_i E_i is the total energy consumed, which is added, properly weighted by λ and the number of sensors N, to improve the convergence.
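The two-branch cost described above can be sketched in a few lines. The combination of the terms follows the description in the text; the value of `lam` (λ) is an illustrative placeholder, since the paper does not state its tuned value here.

```python
def cost(n_disc, energies, lam=0.1):
    """Piecewise cost function: a penalty driven by the number of
    disconnected sensors, otherwise the scaled maximum node energy plus a
    weighted total-energy term. lam is an illustrative weight only."""
    if n_disc > 0:
        # unfeasible branch: offset of 70 keeps it above any feasible cost
        return 70 + 20 * n_disc
    n = len(energies)
    return 1e5 * (max(energies) + lam / n * sum(energies))
```

The offset makes every unfeasible solution cost at least 90, which produces the step visible in the convergence curves when the swarm first enters the feasible region.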

Results and Discussion
The optimization environment defined in the previous section has been applied for the routing optimization.
In this section, the results obtained are shown and discussed: in particular, the selected test cases are firstly analyzed; then, the optimization results of PSO and SNO are provided; and, finally, a peculiar case is analyzed in depth.

Test Cases
Since the performance of the algorithms and codification methods can be highly case-dependent, several test cases were created for the performance analysis.
Each network is characterized by a different number of sensors and a different deployment of them in the space, as shown in Figure 10, where a network with 50 nodes and one with 85 are compared.
The two networks are very different: in particular, the network with 85 nodes has a higher maximum node depth due to the upper area in which several nodes are not connected with the cluster head. For each tested network, it is possible to calculate the maximum node depth, the total number of possible configurations with the adjacency-based codification, and the number of configurations with the depth-based codification. Table 2 shows the parameters of the tested networks. The characteristics of each network do not depend only on the number of nodes: thus, the selected networks explore different combinations of node depth, number of connections, and number of nodes.
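The number of configurations reported in Table 2 follows directly from the codification: since each sensor independently picks one target from its candidate list, the search space size is the product of the list lengths. A minimal sketch:

```python
from math import prod

def search_space_size(lists):
    """Total number of network configurations for a list-based codification:
    the product of the per-node candidate-list lengths."""
    return prod(len(options) for options in lists.values())
```

On a toy network, the adjacency-based lists may give 3 · 3 · 2 = 18 configurations, while the depth-based lists for the same network collapse to 1 · 1 · 2 = 2, illustrating the drastic reduction noted below.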
Analyzing the table, it is clear that the depth-based codification reduces drastically the size of the problem.

Results
To compare the results of the two different codification techniques and the two algorithms, the number of objective function calls was selected as the termination criterion.
In fact, this parameter mainly drives the computational time of the optimization, since the self-time of the optimization algorithm can usually be neglected. For example, in the optimization of an 85-node network with 14,000 objective function calls, the self-time of SNO is 1.3 s out of a total optimization time of 26 s, i.e., 5%. PSO is slightly faster and its self-time in the same conditions is 0.9 s, i.e., 3.5%.
Since the problem complexity grows with the number of sensors, the allowed number of objective function calls was also increased according to an empirical rule depending on the number of sensors N in the network. The population size for both algorithms was selected according to a preliminary parametric analysis on some standard mathematical benchmarks. In particular, for PSO the optimal population size is 50 individuals, while for SNO it is 100.
Due to the intrinsic stochastic nature of both algorithms, 50 independent trials were performed for each configuration. Figure 11 shows the results of the optimization. In particular, the average cost value obtained in the 50 independent trials is reported for each test case with the two codification techniques. Analyzing these results, some considerations can be made. The problem codification technique affects the performance of the two algorithms in different ways: for PSO, the depth-based codification is always much better than the adjacency-based one, while, for SNO, the superiority of the depth-based codification is less pronounced. This is due to the fact that SNO usually performs better than PSO in handling high-dimensional problems with many local minima.
To numerically estimate the difference between the two codification techniques, the gap value was calculated for each case:

GAP = (cost_adjacency − cost_depth) / cost_adjacency

This value represents the improvement of the depth-based codification expressed as a fraction of the adjacency-based value.
The average GAP value on the mean cost of SNO over all the test cases is 0.58, meaning that the depth-based mean value is, on average, about half of the adjacency-based one. For PSO, the average GAP value on the mean cost is 0.98. Figure 12 shows the best results obtained by both SNO and PSO. Analyzing this figure, two additional considerations can be made. First, the minimum values obtained by PSO are much lower than its average ones, indicating a difficulty of this algorithm in reliably achieving the global minimum of the function. Second, the difference between the two codifications is reduced, in particular for the results of SNO: for this algorithm, in eight cases the adjacency-based codification is better, and in only eight other cases the depth-based one is drastically better. This can also be seen from the average GAP calculated on the minimum value, which is 0.18 for SNO. This means that the optimal solution requires some hops that violate the depth rule used to create the second codification. To compare the results of the two algorithms, in particular the best value obtained in the 50 independent trials, the results are plotted in Figure 13. Figure 13a shows that the performance of SNO with the adjacency-based codification is drastically better than that of PSO, especially in nine cases in which the gap between the two algorithms exceeds 1; in only one case is PSO better than SNO (gap −0.05). The average gap between the two algorithms is 2.95, highly biased by three cases in which it is over 10.
On the other hand, the results of Figure 13b show that, with the depth-based codification, the performance of the two algorithms becomes more similar. In fact, the average gap is 0.04 and, in 18 cases, the solutions obtained by the two algorithms have the same cost value.
To highlight some differences between the optimizers in this case, in which the results seem the same, the termination criterion has been modified in the simulations.
In single-objective optimization, the choice of the termination criterion is not as critical as in multi-objective optimization [34], however different possibilities are analyzed in the literature.
The most common criterion is the number of objective function calls: in this case, the computational time required by the optimization is limited a priori. A second possibility is related to the population diversity: when it drops, the algorithm has reached a minimum. The effectiveness of this criterion depends strongly on the mutation operators of the algorithm, and it is hardly applicable to SNO. Finally, another common possibility is to fix a maximum number of iterations in which the best solution is not improved [35]. In this last analysis, the non-improving iterations criterion was adopted. Figure 14 shows the results when the termination criterion was set to 10 non-improving iterations. In this case as well, 50 independent trials were performed for both algorithms.
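The non-improving iterations criterion can be sketched generically, independently of the optimizer. Here `step` stands for one iteration of any algorithm returning its current best cost; the function name and tolerance are illustrative.

```python
def run_until_stalled(step, max_stall=10):
    """Call step() (one optimizer iteration returning the current best cost)
    until the best value has not improved for max_stall consecutive
    iterations. Returns (best cost, iterations performed)."""
    best, stall, iters = float("inf"), 0, 0
    while stall < max_stall:
        cur = step()
        iters += 1
        if cur < best - 1e-12:   # strict improvement resets the counter
            best, stall = cur, 0
        else:
            stall += 1
    return best, iters
```

Because the budget is no longer fixed a priori, the number of objective function calls actually spent becomes a measured outcome, which is exactly what Figure 14a reports.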
The number of objective function calls required before the termination criterion was triggered is shown in Figure 14a. It clearly shows that the effective number of objective function calls is much smaller than the budget allowed in the standard optimization. It is interesting to see that PSO seems slightly faster than SNO.
The time performance should also be combined with the final cost values, as shown in Figure 14b, in which the final cost and the delta with respect to the previously found optimum are shown. These data show that the final cost of PSO is almost always greater than that of SNO: this means that SNO performs more exploration than PSO. Finally, from the delta values, it is possible to notice that this second termination criterion often does not give the algorithm enough time to find the optimum of the function. To analyze the effect of the termination criterion, the number of non-improving iterations was increased to 50. The optimization time (Figure 15a) is more than doubled for both algorithms. In addition, in these tests, SNO requires more iterations than PSO in many cases. Figure 15b makes clear that, in almost all cases, SNO is able to achieve the optimal solutions, while the errors of PSO are still high. In the following, a peculiar network is analyzed in depth: its peculiarity is that the adjacency-based codification gives better results.

Analysis of a Peculiar Case
The network with 50 nodes is analyzed here as a peculiar case, representative of all the networks in which the adjacency-based codification gives better results than the depth-based one.
Here, the results of the four optimizations on this network (two codifications and two algorithms) are analyzed. All these results were obtained performing 50 independent trials with a termination criterion of 7000 objective function calls. Figure 16a shows the convergence curves obtained with SNO. Each grey line is a single trial, while the blue line is the average convergence. The figure also provides a zoom of the last 6000 calls. The convergence curves show some jumps in the first iterations (before 1500 objective function calls), corresponding to the achievement of feasible solutions. The performance of the best network achieved with this optimization is shown in Figure 16b, where the evolution of the sensors' battery capacity is represented: dark red means full charge and blue an empty battery. In this figure, it is possible to notice that the network lifetime is around 119 KCC and that six sensors are more critical than the others.
The active paths of the optimal solution are shown in Figure 17 with the green lines. In this figure, two areas have been highlighted. The red circles underline some sub-optimal configurations of the network: for example, in the left one, node 41 hops to node 36, which hops to node 33, which sends all this information to sensor 35. This is sub-optimal because all these nodes could be connected directly to node 35.
Figure 17. Routing in the optimal network. The green lines are the selected paths and the red circles underline some sub-optimal areas of the network.
This sub-optimality appears because the involved nodes are not critical in the definition of the lifetime: in fact, as shown in Figure 16b, all of them retain a minimum energy greater than 50% of the initial capacity.
The convergence curves of PSO, represented in Figure 18a, show that this algorithm could not find feasible solutions in all the independent trials. The stagnation in high-cost local minima makes the final optimum not competitive with the one found by SNO: in fact, as shown in Figure 18b, the network lifetime is limited to 61 KCC. The reason for this very limited lifetime can be understood by analyzing the active paths of the optimal solution found by PSO, shown in Figure 19. From the energy evolution, it is possible to notice that the most loaded nodes are 28 and 8. In fact, they aggregate the information of large clusters of nodes: all the nodes highlighted in red transmit their information through node 8, while all the orange nodes hop through node 28. The convergence curves of the 50 independent trials of SNO with the depth-based codification are shown in Figure 20a: the optimization process is very effective because all the trials converge to the same solution within 500 objective function calls out of the 7000 allowed. The fast convergence of all the trials to the same solution suggests that it is the global minimum for this function. This assumption is confirmed by the convergence curves of the 50 trials of PSO (Figure 21a): in fact, most of the solutions converge toward the same minimum value. From these curves, it is possible to highlight that, even if the best solution of PSO is equal to that of SNO, the former algorithm is not able to converge in all the trials. From the energy evolution of SNO (Figure 20b) and PSO (Figure 21b), it is possible to see that the network lifetime (114 KCC) is lower than the best solution achieved by SNO with the adjacency-based codification (119 KCC). In fact, in these solutions, the transmission load is concentrated mostly in just three nodes: nodes 8, 42, and 49.
The overload of these nodes can be seen in the optimal solution obtained with the depth-based codification, shown in Figure 22. The colored circles underline the clusters of nodes that transmit through the highly loaded nodes. In particular, the red circles belong to the cluster of node 8, the yellow ones to sensor 42, and the orange ones to node 49.
This solution is the global minimum with the depth-based codification: since the search space has been drastically reduced, the algorithm has no degrees of freedom left to reduce the load of the critical nodes.

Conclusions
In this paper, the maximization of the network lifetime of a Wireless Sensor Network is achieved by properly selecting the routing paths. This problem is faced with two different evolutionary optimization algorithms: the traditional PSO and the more recent SNO.
The objective of this paper is to analyze the difference between two different problem codification methods: with the first one (the adjacency-based codification), the algorithm can choose an "exit way" among all the adjacent nodes, while, in the second one (the depth-based codification), the selection is restricted to sensors with lower depth in the network.
Two important findings were obtained from the test campaign conducted over more than 25 test cases. First, the reduction of the search space size due to the depth-based codification drastically improves the optimization convergence, in terms of both the quality of the final solution and the optimization time. This improvement is more evident for the PSO algorithm: with the adjacency-based codification, the performance of this algorithm is very poor, while, with the depth-based codification, it is comparable with that of SNO.
The drawback of the depth-based codification is due to the lower degrees of freedom given to the optimizer: while this reduces the problem complexity, it is likely to eliminate some good network configurations from the search space. This phenomenon was inspected in depth and was confirmed by the results of the optimization of the 50-node network.

Funding: This research received no external funding.

Conflicts of Interest:
The authors declare no conflict of interest.