A PSO-Based Uneven Dynamic Clustering Multi-Hop Routing Protocol for Wireless Sensor Networks

Since wireless sensor networks (WSNs) are powered by energy-constrained batteries, many energy-efficient routing protocols have been proposed to extend the network lifetime. However, most of these protocols do not balance the energy consumption of the WSN well. The hotspot problem caused by unbalanced energy consumption reduces the network lifetime. To solve this problem, this paper proposes a PSO (Particle Swarm Optimization)-based uneven dynamic clustering multi-hop routing protocol (PUDCRP). In PUDCRP, the distribution of the clusters changes dynamically when some nodes fail. The PSO algorithm is used to determine the areas where the candidate CH (cluster head) nodes are located. The adaptive clustering method based on node distribution makes the cluster distribution more reasonable, which balances the energy consumption of the network more effectively. To improve the energy efficiency of multi-hop transmission between the BS (Base Station) and CH nodes, we also propose a connecting line aided route construction method to determine the most appropriate next hop. Compared with UCCGRA, multi-hop EEBCDA, EEMRP, CAMP, PSO-ECHS and PSO-SD, PUDCRP prolongs the network lifetime by between 7.36% and 74.21%. The protocol significantly improves the balance of energy consumption in the network and scales better across network sizes.


Introduction
Wireless sensor networks (WSNs) have attracted widespread attention in recent years. Due to the low cost, small size and self-organization of sensors [1], WSNs have been adopted in diverse application fields, such as military, crime prevention, environmental monitoring, health care services, and vehicular movements [2][3][4]. As sensor nodes are supplied by non-rechargeable batteries [5], designing an energy-efficient routing protocol to prolong the network lifetime is a vital issue in WSNs.
A number of routing protocols have been proposed to reduce the energy consumption of WSNs. Among them, the clustering scheme has better flexibility and scalability, and is considered to be one of the most effective solutions in this regard. Therefore, most current research on routing protocols is based on the clustering scheme, such as LEACH (Low-Energy Adaptive Clustering Hierarchy) [6], PEGASIS (Power-Efficient GAthering in Sensor Information Systems) [7], HEED (Hybrid Energy-Efficient Distributed clustering) [8], and EEMC (Energy-Efficient Multi-level Clustering) [9]. Clustering can improve energy efficiency, but it can also cause hotspot problems [10,11].
Many energy-efficient clustering routing algorithms have been proposed to solve the hotspot problem. Reference [12] proposed a clustering and routing algorithm based on the gravitational search algorithm (GSA). GSA uses formulae to elect CH nodes and assign nodes to address the hotspot problem. The UCCGRA (Unequal Clustering and Connected Graph Routing Algorithm) algorithm [13] considers a clustered network with clusters of unequal size based on the sensor energy used for transmission. The work highlights the concept of balancing the node energy for inter- and intra-cluster communication. In UCCGRA, the vote-based selection of CH nodes creates more control message overhead during re-clustering, which causes unnecessary energy depletion. Grid-based clustering protocols, such as [14][15][16], form clusters by dividing the network area into grids. The grids far from the BS are larger than the grids near the BS, which can alleviate the hotspot problem to some extent. However, for a network environment with uneven node distribution, grid-based clustering is unreasonable, because grids of the same size may contain very different numbers of nodes.
The aforementioned protocols improve the energy efficiency of WSNs. However, not all of them carefully consider the distribution of nodes. Moreover, most of these algorithms focus only on the selection of CH nodes and do not carefully consider the distribution and scale of clusters. As a result, they suffer from an energy imbalance across the network: the hotspot problem still exists to a certain extent, and some protocols cannot be applied to large-area network environments. This paper proposes a PSO-based uneven dynamic clustering multi-hop routing protocol (PUDCRP), which alleviates the hotspot problem and achieves better energy balance. We use an improved PSO algorithm to determine the circular areas where candidate CH nodes are located. We introduce a multi-objective fitness function to select CH nodes. We also propose a connecting line aided route construction method to achieve energy-efficient routing. Simulation results show that PUDCRP performs better in terms of network lifetime and energy consumption of the network. The major contributions of this paper can be summarized as follows.

• We propose a PSO-based uneven dynamic clustering method which divides the network area into circles of unequal sizes based on the distribution of nodes. The circles far away from the BS are larger than the ones near the BS, which can alleviate the hotspot problem.
• We introduce a multi-objective function to select CH nodes from the circles. The function considers the nodes' residual energy, the number of neighbor nodes, and the distances from the nodes to the BS. Therefore, the intra-cluster energy consumption is minimized.
• Two fitness functions are proposed to determine the optimal positions of the circular areas in our PSO method. The Gbest fitness function aims to achieve the maximum coverage across the network. The fitness function used to determine Pbest_i is the absolute value of the difference between the actual number of nodes covered by the circular area and the ideal number of covered nodes, which makes each circular area contain a number of nodes that matches its size.
• A connecting line aided route construction method is proposed to determine a multi-hop route. The method considers the distance from the candidate CH node to the connecting line between the source CH node and the BS, the transmission distance from the source CH node to the candidate CH node, and the residual energy of the candidate CH node. Hence, the energy consumption of the transmission is reduced.

Swarm Intelligence Based Routing Protocols
Swarm intelligence builds on the ideas of self-association and self-organization, which can offer better solutions for optimizing the routing protocols of WSNs. Ari et al. [23] developed a cluster-based power-efficient routing protocol named ABC-SD. This protocol uses the search features of the artificial bee colony (ABC) algorithm to design low-power clusters. However, the algorithm uses a distributed method for cluster head election, and the threshold energy used to elect CHs is fixed. If the energy of every node in a cluster is below the threshold, no CH may be elected. Yalçın et al. [24] proposed two algorithms: a CH selection algorithm based on bacterial interaction, and a cognitive routing algorithm for energy and transmission boundary. From the simulation results obtained with the Matlab 2016b software (MathWorks, Natick, MA, USA), it can be said that the study is more adaptive and applicable in real WSN scenarios. Karaboga et al. [25] proposed a clustering routing protocol based on an artificial bee colony algorithm in order to increase network lifetime. The algorithm employs a QoS (Quality of Service) mechanism to minimize the delays between signals received from the clusters. However, this protocol does not consider the coverage of the CHs, which leads to unbalanced energy consumption.
Rao et al. [26] proposed a Particle Swarm Optimization based Energy efficient Cluster Head Selection algorithm (PSO-ECHS), which considers parameters such as intra-cluster distance, sink distance and residual energy of sensor nodes to select CH nodes. Because several factors are weighed at the same time, a node may select a CH that has more residual energy but is far away, so some nodes far from their CH die prematurely. Kuila et al. [27] used a PSO-based clustering algorithm to enhance the lifetime of WSNs. In this method, clustering is based on the average cluster distance and the lifetime of the gateway. The fitness value of each particle in the swarm is computed with a fitness function and is used to judge the quality of the network: a particle with a better fitness value gives a better network structure. Xiang et al. [28] proposed a PSO-based energy-efficient routing algorithm. The fitness function of the algorithm considers the residual energy of nodes and the transmission distance to balance the network energy consumption to some extent. Wang et al. [29] proposed a clustering method called Energy Centers searching using Particle Swarm Optimization (EC-PSO) for heterogeneous WSNs. It uses EC-PSO to elect nodes close to the energy center as CHs. However, EC-PSO is only applicable to network environments with evenly distributed nodes, since it uses a geometric method to obtain evenly distributed CHs during the first period.
Kaswan et al. [30] proposed a multi-objective, PSO-based energy-efficient path design for a mobile sink in wireless sensor networks. The algorithm comes with an efficient particle encoding scheme and a multi-objective fitness function. Its objectives include minimizing the longest path length and minimizing the number of hops. However, because energy is not considered in the objectives, some nodes may die prematurely. Latiff et al. [31] proposed an energy-aware clustering method for wireless sensor networks using particle swarm optimization, which defines a new cost function to minimize the intra-cluster distance and optimize the energy consumption of the network. However, CHs transmit data directly to the BS during routing, which may cause unbalanced energy consumption. Singh et al. [32] proposed a particle swarm optimization (PSO) approach for generating energy-aware clusters by optimal selection of cluster heads. The PSO reduces the cost of locating the optimal position for the head nodes in a cluster. The PSO-based approach is implemented within clusters rather than at the BS, which makes it a Particle Swarm Optimization Semi-Distributed method (PSO-SD). However, the distance between nodes and the BS is not taken into account in the protocol, which may result in excessive energy consumption when the CHs transmit data to the BS.

Network and Energy Model
The aim of the proposed protocol is to provide a more appropriate distribution of cluster heads and better scalability across network sizes, and to effectively improve the energy efficiency of WSNs. We mainly consider the clustering method, the algorithm for selecting CH nodes, and the routing algorithm to balance the energy consumption and prolong the network lifetime.

Network Model
This paper considers n sensor nodes randomly deployed in a square area of size M × M. The square area is denoted by A. The assumptions about the network environment are as follows:

• All sensor nodes are homogeneous and have the same initial energy.
• Each sensor node is aware of its own location, by using GPS (Global Positioning System) or some other localization mechanism.
• After all the sensor nodes are deployed, they are fixed.
• All the sensor nodes are aware of their residual energy and have the same transmission range.
• Sensor nodes can adjust their own energy consumption based on the distance to the receiver.
• Each node has a unique ID.
• The BS is static and positioned at the boundary of the square area.

Energy Model
The energy model of this paper is the same as in [33] and [34]. The energy consumption of nodes mainly occurs at the transmitter, the power amplifier, and the receiver to run the radio electronics. The model adopts the free-space and the multi-path fading channel models, depending on the distance between the transmitter and receiver. The energy consumption of a node is proportional to $d^2$ when the propagation distance d is less than the threshold distance $d_0$; otherwise it is proportional to $d^4$. The total energy expended to deliver an l-bit packet from the transmitter to its receiver over a link of distance d is calculated by Equation (1):

$$E_{TX}(l, d) = \begin{cases} l E_{elec} + l \varepsilon_{fs} d^{2}, & d < d_0 \\ l E_{elec} + l \varepsilon_{mp} d^{4}, & d \ge d_0 \end{cases} \tag{1}$$

where $E_{elec}$ is the energy consumed by a sensor node to transmit or receive 1 bit of data, and $\varepsilon_{fs}$ and $\varepsilon_{mp}$ are the amplifier coefficients of the free-space model and the multi-path fading model, respectively. The threshold distance $d_0$ is calculated by Equation (2):

$$d_0 = \sqrt{\frac{\varepsilon_{fs}}{\varepsilon_{mp}}} \tag{2}$$

When $d < d_0$, the energy consumption of the sensor nodes follows the free-space model and the amplifier parameter is $\varepsilon_{fs}$. When $d \ge d_0$, the energy consumption follows the multi-path fading model and the amplifier parameter is $\varepsilon_{mp}$. In this paper, the maximum transmission distance of a node is controlled not to exceed $d_0$, which ensures that nodes in the same cluster are within transmission range under the proposed PSO-based uneven dynamic clustering method.
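As an illustration, the radio model can be written as a few small functions. This is a minimal sketch, assuming the usual first-order form in which transmission costs an electronics term plus a distance-dependent amplifier term; the numeric parameter values and the function names are illustrative assumptions, not values taken from this section.

```python
import math

# Illustrative radio parameters (assumed values, not given in this section).
E_ELEC = 50e-9        # J/bit, electronics energy per transmitted or received bit
EPS_FS = 10e-12       # J/bit/m^2, free-space amplifier coefficient
EPS_MP = 0.0013e-12   # J/bit/m^4, multi-path amplifier coefficient


def threshold_distance(eps_fs: float = EPS_FS, eps_mp: float = EPS_MP) -> float:
    """Distance d0 at which the free-space and multi-path amplifier terms are equal."""
    return math.sqrt(eps_fs / eps_mp)


def tx_energy(l_bits: int, d: float) -> float:
    """Energy to transmit l_bits over distance d (d^2 law below d0, d^4 law otherwise)."""
    if d < threshold_distance():
        return l_bits * E_ELEC + l_bits * EPS_FS * d ** 2
    return l_bits * E_ELEC + l_bits * EPS_MP * d ** 4


def rx_energy(l_bits: int) -> float:
    """Energy to receive l_bits (electronics term only)."""
    return l_bits * E_ELEC
```

With these assumed coefficients the threshold distance is a little under 88 m, so keeping every link shorter than $d_0$ means only the $d^2$ branch is ever exercised.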

Overview of PSO
Before presenting the proposed algorithm, we give an outline of the particle swarm optimization (PSO) algorithm (Kennedy et al. 1995) [35]. PSO is based on a swarm of a predefined number of particles (say $N_p$). Each particle $P_i$ ($1 \le i \le N_p$) provides a complete solution to a multidimensional optimization problem. The dimension D is the same for all particles. Particle $P_i$ has position $X_{i,d}$ ($1 \le d \le D$) and velocity $V_{i,d}$ in the dth dimension of the multidimensional space. The ith particle $P_i$ of the population is represented by Equation (3):

$$P_i = [X_{i,1}, X_{i,2}, \ldots, X_{i,D}] \tag{3}$$
A fitness function is used to evaluate each particle and judge the quality of its solution to the problem. The personal best, called $Pbest_i$, is the best position found by particle $P_i$. The global best, called Gbest, is the best position found by all particles. In order to reach the global best position, each particle $P_i$ follows its own best $Pbest_i$ and Gbest to update its velocity and position. In each iteration, its velocity $V_{i,d}$ and position $X_{i,d}$ in the dth dimension are updated by Equations (4) and (5), respectively:

$$V_{i,d}(t+1) = w V_{i,d}(t) + c_1 r_1 \left(Pbest_{i,d} - X_{i,d}(t)\right) + c_2 r_2 \left(Gbest_d - X_{i,d}(t)\right) \tag{4}$$

$$X_{i,d}(t+1) = X_{i,d}(t) + V_{i,d}(t+1) \tag{5}$$
where $T_R$ is the maximum number of iterations, w ($w_{max} = 0.9$, $w_{min} = 0.4$) is the self-adapting inertia weight, $c_1$ and $c_2$ ($0 \le c_1, c_2 \le 2$) are the acceleration coefficients, and $r_1$ and $r_2$ ($0 < r_1, r_2 < 1$) are randomly generated values. The update process is repeated until an acceptable value of Gbest is obtained or a fixed number of iterations ($t_{max}$) is reached. After reaching a new position, the particle evaluates the fitness function and updates $Pbest_i$ as well as Gbest for the minimization problem as follows.
Sensors 2019, 19, 1835

$$Pbest_i = \begin{cases} P_i, & \text{if } Fitness(P_i) < Fitness(Pbest_i) \\ Pbest_i, & \text{otherwise} \end{cases} \tag{7}$$

$$Gbest = \begin{cases} P_i, & \text{if } Fitness(P_i) < Fitness(Gbest) \\ Gbest, & \text{otherwise} \end{cases} \tag{8}$$

Figure 1 shows how a particle explores the multi-dimensional search space to reach a global best solution. A particle $P_i$ occupies position $X_{i,d}(t)$ with velocity $V_{i,d}(t)$ at a point in time and is moving in a certain direction. Later the particle changes direction and moves to another position using its memory. It then changes direction again under the influence of the swarm and occupies a new position $X_{i,d}(t+1)$.
After a number of iterations, the particles will find the optimal solution in the search space. The workflow of PSO is illustrated in Figure 2.
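The update loop outlined in this overview can be sketched in a few lines of Python. This is a generic minimizing PSO under stated assumptions, not the improved PSO used by PUDCRP: the linearly decreasing inertia schedule, the coefficient values `C1 = C2 = 1.5`, the clamping of positions to the search bounds, and the names `inertia` and `pso_minimize` are all illustrative choices.

```python
import random

W_MAX, W_MIN = 0.9, 0.4   # inertia-weight bounds quoted in the text
C1, C2 = 1.5, 1.5         # acceleration coefficients (assumed; the text only bounds them by 2)


def inertia(t: int, t_max: int) -> float:
    """Inertia weight decreasing linearly from W_MAX to W_MIN (an assumed schedule)."""
    return W_MAX - (W_MAX - W_MIN) * t / t_max


def pso_minimize(fitness, dim, bounds, n_particles=20, t_max=100, seed=0):
    """Minimize `fitness` over a box; returns (best position, best fitness)."""
    rng = random.Random(seed)
    lo, hi = bounds
    xs = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    pbest = [x[:] for x in xs]
    pbest_fit = [fitness(x) for x in xs]
    g = min(range(n_particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for t in range(t_max):
        w = inertia(t, t_max)
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull towards Pbest_i + pull towards Gbest.
                vs[i][d] = (w * vs[i][d]
                            + C1 * r1 * (pbest[i][d] - xs[i][d])
                            + C2 * r2 * (gbest[d] - xs[i][d]))
                # Position update, clamped to the search bounds (illustrative choice).
                xs[i][d] = min(hi, max(lo, xs[i][d] + vs[i][d]))
            f = fitness(xs[i])
            if f < pbest_fit[i]:          # personal-best update (minimization)
                pbest[i], pbest_fit[i] = xs[i][:], f
            if f < gbest_fit:             # global-best update (minimization)
                gbest, gbest_fit = xs[i][:], f
    return gbest, gbest_fit
```

For example, `pso_minimize(lambda x: sum(v * v for v in x), dim=2, bounds=(-10.0, 10.0))` searches for the minimum of a simple quadratic bowl and returns a point close to the origin.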

Proposed Algorithm
Similar to existing hierarchical routing protocols, the operation of PUDCRP is broken up into rounds. Each round is divided into a set-up phase and a steady-state phase. In the set-up phase, an improved PSO is used to determine the circular areas where the candidate CH nodes are located. The PSO-based method operates at the BS. CH nodes are selected by a multi-objective fitness function. Non-CH nodes join the cluster where the nearest CH node is located. In the steady-state phase, a connecting line aided route construction method is used to achieve energy-efficient routing.
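The joining rule for non-CH nodes in the set-up phase can be sketched as follows. This is a minimal illustration assuming Euclidean distance between known node coordinates; the function name `assign_to_nearest_ch` and the data layout (a list of coordinates plus a list of CH indices) are hypothetical.

```python
import math


def assign_to_nearest_ch(nodes, cluster_heads):
    """Map each CH index to the list of non-CH node indices that join its cluster.

    nodes: list of (x, y) coordinates; cluster_heads: indices into `nodes`.
    """
    clusters = {ch: [] for ch in cluster_heads}
    ch_set = set(cluster_heads)
    for idx, pos in enumerate(nodes):
        if idx in ch_set:
            continue  # CH nodes do not join another cluster
        nearest = min(cluster_heads, key=lambda ch: math.dist(pos, nodes[ch]))
        clusters[nearest].append(idx)
    return clusters
```

With four nodes at (0, 0), (1, 0), (10, 0) and (9, 1) and CHs at indices 0 and 2, node 1 joins CH 0 and node 3 joins CH 2.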

Terminologies
For ease of understanding of the proposed algorithm, we first define some terminologies as follows.

2. A: The area of the network.
4. $E_0$: The initial energy of nodes.
5. $E_i$: The residual energy of sensor node $s_i$, $1 \le i \le n$.
D: The dimension of particle characteristics.

Particle Representation and Initialization
In the proposed PSO-based clustering, the particle swarm represents a complete solution. For the clustering process, each particle represents the position of the center of one circular area where candidate CH nodes are located. Each particle $P_i(t) = (X_{i,1}(t), X_{i,2}(t)) = (x_i(t), y_i(t))$ denotes the coordinates of the center of a circular area.

Determination Radius of Circular Area
In multi-hop routing protocols, the CH nodes closer to the BS undertake more data forwarding tasks, which causes nodes near the BS to die prematurely and generates undetectable hotspots. This is the so-called hotspot problem. An effective solution is to make the clusters closer to the BS smaller and the clusters farther away from the BS larger. Small clusters near the BS contain fewer nodes and have a short transmission distance to the BS. Therefore, this approach can compensate for the energy that nodes near the BS spend on forwarding data from the other CH nodes. Since the distribution of the circular areas determines the location of CH nodes in the proposed algorithm, the circular areas near the BS should be smaller than the ones farther away from the BS, which helps to achieve the cluster distribution described above to a certain extent. The radius of a circular area is calculated by Equation (9).
where R i is the radius of the ith circular area, dis i is the distance between the center of the ith circular area and the BS, dis max is the maximum distance between circular area centers and the BS, and R max is the maximum radius of the circular areas. The maximum radius R max is d 0 /2 in this paper. d 1 is the minimum radius of circular areas, which is calculated by Equation (10).
The value of d 1 is the average radius of the area covered by a node in the network, which can ensure that the node closest to the BS can form a cluster.

Determination Optimal Number of Circular Areas
A proper number of clusters is essential for clustering effectiveness; otherwise the network cannot benefit from the advantages of clustering. The optimal number of clusters is defined as C. If the value of C is too large, many circular areas will overlap. If the value of C is too small, the circular areas cannot cover enough of the network environment. In this paper, in order to determine the value of C, it is assumed that the clusters are distributed evenly over the entire network environment, layer by layer, as shown in Figure 3. The circular areas in the figure partially overlap, because the network area cannot be filled exactly with circles and the partial overlap offsets the unfilled area. r is the sum of $2R_i$, which is calculated by Equation (11). The radius of the circles of the kth layer is $R_k$ (assuming that the network area is divided into K layers). The number of circles of the kth layer is $n_k$. According to the above conditions, Equation (12) can be obtained.
where A is the area of the whole network environment. Thus, the optimal circular area number C can be obtained by cyclic calculation. The calculation process is illustrated in Algorithm 1.
Algorithm 1: The calculation process of optimal circular area number C
Input:
The radius of cluster: R = d1 /* R is initialized to d1 */
The area of the network: A
The maximum distance between circular area centers and the BS: dis_max
The farthest distance from the base station to the boundary of the circle where R has been calculated: r = 0 /* r is initialized to 0. r is the sum of 2R_i which are calculated */
Output: the optimal number of circular areas: C
1. while A > 0 do
2.

Derivation of Fitness Function
The proposed PSO-based clustering algorithm is clustered according to node distribution. The fitness function is used to determine the distribution of the circular areas where the candidate CH nodes are located. The derivation of the fitness function depends on the two parameters: coverage rate and Intersection-over-Universal.

Coverage Rate
Coverage rate is the ratio of the number of nodes in a circular area where candidate CH nodes are located to the total number of nodes. Obviously, the more nodes that are covered, the more candidate CH nodes can be obtained, so that local optimization can be avoided when selecting CH nodes. Moreover, it can avoid the situation where CH nodes are gathered in a certain corner. Therefore, we need to maximize the coverage rate.
$$r_{Cov} = \frac{n_{Cov}}{n} \tag{13}$$

where $n_{Cov}$ is the number of nodes covered by the circular areas and n is the number of nodes in the network.
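The coverage rate can be computed directly from node coordinates and circle parameters. The sketch below assumes that a node counts as covered when it lies inside or on the boundary of at least one circle; the function name `coverage_rate` and the `(cx, cy, radius)` tuple layout are illustrative.

```python
import math


def coverage_rate(nodes, circles):
    """Fraction of nodes that lie inside at least one circle.

    nodes: list of (x, y); circles: list of (cx, cy, radius).
    """
    covered = sum(
        1 for p in nodes
        if any(math.dist(p, (cx, cy)) <= r for cx, cy, r in circles)
    )
    return covered / len(nodes)
```

A node that falls inside several circles is still counted once, so the value always stays between 0 and 1.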

Intersection-Over-Universal
Based on the concept of Intersection-over-Union in target detection [36], we propose an Intersection-over-Universal (IoU) measure. IoU is the ratio of the number of candidate CH nodes in the portion of a circular area that overlaps another circular area to the total number of candidate CH nodes in the network. Figure 5 shows the intersection set I and the universal set U. The more nodes lie in the overlapping portions of the circular areas where candidate CH nodes are located, the more likely it is that only one shared CH node is selected among these circular areas, which causes that CH node to bear a heavy forwarding load during the transmission phase. Excessive overlap of the circular areas also reduces the coverage of the candidate CH nodes. Therefore, IoU should be minimized; that is, we need to maximize its reciprocal.

$$IoU = \frac{n_{Inter}}{n} \tag{14}$$

where $n_{Inter}$ is the number of nodes in the overlapping portion of the circular areas.

After iterations, if a circular area's IoU is greater than a certain threshold T (0 < T < 1, T = 0.7 in this paper), the center of this circular area is deleted from the global best positions, and the circular area itself is deleted as well. Hence, the number of CH nodes is less than or equal to C. Figure 6 is a schematic diagram of the distribution of the circular areas. The circular areas cannot cover all the nodes in the network. We select CH nodes based on these circular areas. Since the circular areas are formed according to the number of nodes, the area of the whole network and the distribution of nodes, we can achieve a reasonable distribution of CH nodes. This reasonable distribution can efficiently balance the energy consumption of the whole network.

Figure 7 shows the circular areas that need to be deleted because their IoU is greater than 0.7. The solid circles are the circular areas obtained after iterations. The number 0.86 at the top of the figure is the final node coverage after iterations. Figure 8 shows the circular areas with IoU less than 0.7 in the network after iterations. As can be seen from Figures 7 and 8, the circular areas cover as many nodes as possible. The distribution of the circular areas is based on the distribution of nodes. The selection of CH nodes is based on the distribution of the circular areas, the residual energy of the nodes, the distance from the nodes to the BS, and the number of neighbor nodes covered in the communication range. In contrast, the distribution of CH nodes selected directly by multi-objective functions [37] may be too dense or too sparse. In addition, the circular areas close to the BS are smaller than the ones farther away from the BS, which helps to balance the energy consumption of the network.
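The network-wide IoU of Equation (14) can be sketched as below. The sketch assumes that a node belongs to the overlapping portion when it lies inside two or more circles, and it only computes the network-wide ratio; the per-area value compared against the threshold T is not spelled out here, so it is left out. The function name and the `(cx, cy, radius)` layout are illustrative.

```python
import math


def intersection_over_universal(nodes, circles):
    """Share of all nodes that lie inside two or more circles.

    nodes: list of (x, y); circles: list of (cx, cy, radius).
    """
    n_inter = 0
    for p in nodes:
        hits = sum(1 for cx, cy, r in circles if math.dist(p, (cx, cy)) <= r)
        if hits >= 2:  # assumption: "overlapping portion" means covered by >= 2 circles
            n_inter += 1
    return n_inter / len(nodes)
```

For two circles of radius 2.5 centered at (0, 0) and (4, 0), only a node near (2, 0) falls in the overlap, so four nodes at (0, 0), (2, 0), (4, 0) and (10, 10) give an IoU of 0.25.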

Proposed Fitness Function
In order to determine the optimal positions of the circular areas in our PSO method, it is better to maximize a linear combination of the above two parameters than to maximize them individually, since the two parameters do not conflict with each other. Therefore, we use the fitness function in Equation (15) to determine Gbest.
where α is 0.9 in this paper, a value determined by experiments. Because the algorithm removes the circular areas with high IoU, the fitness function mainly considers the coverage of the circular areas. For a single circular area, there is no concept of IoU. Therefore, in our PSO approach, the fitness function used to determine $Pbest_i$ is the absolute value of the difference between the actual number of nodes covered by the circular area and the ideal number of covered nodes.

Proposed Fitness Function
In order to determine the optimal positions of the circular areas in our PSO method, it is best to maximize the linear combination of the above two parameters instead of maximizing them individually, since the two parameters do not conflict with each other. Therefore, we use the following fitness function Equation (15) to determine Gbest.
where, α is 0.9 in this paper, which is determined by experiments. Because the algorithm removes the circular areas with high IoU. Hence, the coverage of the circular areas is mainly considered in the fitness function. For a single circular area, there is no concept of IoU. Therefore, in our PSO approach, the fitness function to determine Pbesti is the absolute value of the difference between the actual number of nodes covered by the circular area and the number of ideal coverage nodes.

Proposed Fitness Function
In order to determine the optimal positions of the circular areas in our PSO method, it is best to maximize the linear combination of the above two parameters instead of maximizing them individually, since the two parameters do not conflict with each other. Therefore, we use the fitness function in Equation (15) to determine $G_{best}$. In Equation (15), α is 0.9 in this paper, which is determined by experiments. Because the algorithm removes the circular areas with high IoU, the coverage of the circular areas is the main consideration in the fitness function.
For a single circular area, there is no concept of IoU. Therefore, in our PSO approach, the fitness function used to determine $P_{best_i}$ is the absolute value of the difference between the actual number of nodes covered by the circular area and the ideal number of covered nodes:
$$Fitness(P_{best_i}) = \left| n_{Cov_i} - n_{ideal_i} \right|, \quad i = 1, 2, \ldots, m \tag{16}$$
where $n_{Cov_i}$ is the number of nodes covered in the ith circular area, $n_{ideal_i}$ is the ideal number of nodes covered in that circular area, and m is the number of circular areas. $n_{ideal_i}$ depends on the density of nodes in the network and the size of the circular area, and is calculated by Equation (17):
$$n_{ideal_i} = n \cdot \frac{A_i}{A} \tag{17}$$
where n is the number of nodes in the network, $A_i$ is the area of the ith circular area and A is the area of the whole network:
$$A_i = \pi R_i^2 \tag{18}$$
where $R_i$ is the radius of the ith circular area.
In each iteration, the velocity and the positions of particles are updated using Equations (4) and (5).
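The $P_{best_i}$ fitness described in this subsection can be sketched in a few lines of Python. This is an illustration, not the authors' code. It assumes that the ideal node count is the network-wide node density multiplied by the area of the circle, and that a smaller value indicates a better position. The function names are hypothetical.

```python
import math

def ideal_node_count(n_total, radius, network_area):
    """Ideal coverage: node density (n / A) times the circle area (pi * R^2)."""
    return n_total * math.pi * radius ** 2 / network_area

def pbest_fitness(nodes, center, radius, network_area):
    """Absolute difference between the covered and the ideal node counts.

    ASSUMPTION: a smaller value is treated as a better particle position.
    """
    cx, cy = center
    n_cov = sum(1 for x, y in nodes
                if math.hypot(x - cx, y - cy) <= radius)
    n_ideal = ideal_node_count(len(nodes), radius, network_area)
    return abs(n_cov - n_ideal)
```

For example, in a network of area 100 with 4 nodes, a circle of area 25 would ideally cover one node. If it actually covers two, the fitness is 1.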

Set-Up Phase
After determining the circular areas where the candidate CH nodes are located, the CH nodes are selected by a multi-objective function. The CH node in each circular area is determined according to the residual energy of the nodes, the distance from the nodes to the BS, and the number of neighbor nodes covered in the communication range. The multi-objective function is as follows.
$$Weight_{ij} = w_1 \frac{E_{ij}}{E_0} + w_2 \frac{n_{nb_i}}{n} + w_3 \frac{d_{min_j}}{d_{s_i}} \tag{19}$$
where $w_1$, $w_2$ and $w_3$ are the coefficients assigned to the factors, weighing the importance of each factor to CH election, with $w_1 + w_2 + w_3 = 1$ and $0 \le w_1, w_2, w_3 \le 1$. $E_{ij}$ is the residual energy of a candidate CH node $s_i$ in the jth circular area, $E_0$ is the initial energy of the nodes, $n_{nb_i}$ is the number of neighbor nodes in the communication range of node $s_i$, $d_{min_j}$ is the minimum distance between the BS and the candidate CH nodes in the jth circular area, and $d_{s_i}$ is the distance between the candidate CH node $s_i$ and the BS. The ratios $E_{ij}/E_0$, $n_{nb_i}/n$ and $d_{min_j}/d_{s_i}$ normalize the three factors to the range [0, 1]. The purpose of normalization is to bring values measured on different scales onto a common scale so that each objective has a comparable impact when the objectives are superposed. The node with the largest $Weight_{ij}$ in the candidate circular area is selected as the CH node.
The experimental results in a 200 m × 200 m network with 100 nodes are shown in Figures 9-11, where FND indicates the round in which the first node dies, HND indicates the round in which half of the nodes have died, and LND indicates the round in which 80% of the nodes have died. According to these experimental results, $w_1$, $w_2$ and $w_3$ are set to 0.8, 0.05 and 0.15.
After the CH nodes are determined, each sensor node decides which cluster to join by choosing the CH that requires the minimum communication energy. Once all the nodes are organized into clusters, each CH creates a schedule for the nodes in its cluster. This allows the radio components of each non-CH node to be turned off at all times except during its own transmission slot, thereby minimizing the energy consumed by a single sensor. Once the CH node has all the data from the nodes in its cluster, it aggregates the data and then transmits the compressed data to the BS.
Because the CH nodes near the BS are closer together than those farther away from the BS, the clusters near the BS can be smaller than those farther away, which alleviates the hotspot problem caused by data forwarding and balances the energy consumption of the network.
The pseudocode of the PSO-based CH selection algorithm is given in Algorithm 2.
for t = 0 to $T_R$ do /* $T_R$ = max. number of iterations */
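The CH election and cluster-joining steps of the set-up phase can be illustrated with the following Python sketch. It is a hedged illustration rather than the authors' Algorithm 2. It assumes that $w_1$, $w_2$ and $w_3$ multiply the energy, neighbor-count and distance ratios in that order, that no candidate sits exactly at the BS position, and that distance to the CH is an acceptable proxy for communication energy. All function names and dictionary keys are hypothetical.

```python
import math

def ch_weight(energy, e0, n_neighbors, n_total, d_min, d_to_bs,
              w=(0.8, 0.05, 0.15)):
    """Weighted sum of the three normalized factors for one candidate.

    ASSUMPTION: w[0] pairs with residual energy, w[1] with the neighbor
    count and w[2] with the distance ratio. d_to_bs must be non-zero.
    """
    w1, w2, w3 = w
    return w1 * energy / e0 + w2 * n_neighbors / n_total + w3 * d_min / d_to_bs

def select_cluster_head(candidates, bs, e0, n_total, w=(0.8, 0.05, 0.15)):
    """Return the index of the candidate with the largest weight.

    candidates: list of dicts with keys 'pos', 'energy' and 'neighbors'.
    """
    dists = [math.dist(c["pos"], bs) for c in candidates]
    d_min = min(dists)
    weights = [ch_weight(c["energy"], e0, c["neighbors"], n_total, d_min, d, w)
               for c, d in zip(candidates, dists)]
    return max(range(len(candidates)), key=weights.__getitem__)

def join_cluster(node_pos, ch_positions):
    """Pick the CH a node joins.

    ASSUMPTION: the nearest CH is used as a stand-in for the CH that
    requires the minimum communication energy.
    """
    return min(range(len(ch_positions)),
               key=lambda k: math.dist(node_pos, ch_positions[k]))
```

With the weights 0.8, 0.05 and 0.15 reported in the text, residual energy dominates the election, so a nearly depleted node is unlikely to become CH even if it is the closest candidate to the BS.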

Steady-State Phase
Before sending data to the BS, energy-efficient transmission routes from the sensor nodes to the BS must first be established. This paper proposes a connecting line aided route construction method to address this issue. Intra-cluster communication is based on single-hop transmission. If the distance between a non-CH node and the BS is less than $d_0$, the non-CH node sends data directly to the BS via single-hop transmission. All other cluster members transmit data directly to their respective CH nodes.
Inter-cluster communication uses single-hop or multi-hop transmission, depending on the distance from the CH nodes to the BS. If the distance between a CH node and the BS is less than $d_0$, the CH node sends data directly to the BS via single-hop transmission. Otherwise, the CH node transmits data to the BS via a multi-hop route established by the connecting line aided route construction method, which reduces the energy consumption of multi-hop transmission. When a CH node selects the next hop of a multi-hop route, it jointly considers the distance from the next hop to itself, the distance from the next hop to the line connecting the CH node and the BS, and the residual energy of the next hop. Figure 12 shows how a specific transmission route from CH node i to the BS is established. CH node i selects CH node j as the next hop. CH node j is selected according to Equation (20).
where $W_j$ is the weight used to determine the next hop; the candidate node with the smallest weight is selected as the next hop. m is the number of CH nodes, the coordinates of CH node i are ($CH_{xi}$, $CH_{yi}$), the coordinates of the candidate next hop node j are ($NH_{xj}$, $NH_{yj}$), $d_j$ is the distance between $CH_i$ and the candidate node j, and $d_v$ is the vertical distance from the candidate node j to the connecting line, which is calculated by Equation (21). $u_1$, $u_2$ and $u_3$ are the three corresponding coefficients.
After a large number of experiments, the sums of normalized FND and normalized LND under different $u_1$, $u_2$ and $u_3$ were obtained, and cubic fitting in MATLAB was used to produce Figure 13. According to the position of the highest contour line in Figure 13, FND and LND are better when $u_1$ = 0.1 to 0.25 and $u_2$ = 0.75 to 0.8, or $u_1$ = 0.2 to 0.25 and $u_2$ = 0.45 to 0.6, or $u_1$ = 0.25 to 0.35 and $u_2$ = 0.3 to 0.35, or $u_1$ = 0.35 to 0.45 and $u_2$ = 0.45 to 0.55. From Figure 13, it can be seen that the distance to the next hop has a greater impact on route selection.
Figure 14 shows the routing process in the simulation environment. It can be seen that the transmission path of each CH node stays as close as possible to the straight line to the BS, and the CH nodes close to the BS mainly act as relays. The routing method minimizes the energy consumption of the routes and balances the energy consumption among the CH nodes. The connecting line aided route construction method ensures that the transmission distance between CH nodes is as short as possible and that the total transmission distance from a CH node to the BS is as close as possible to the straight-line distance between them. The residual energy of the next hop is considered to prevent nodes from dying prematurely. Hence, the energy consumed by data transmission is significantly reduced and balanced.
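The geometric idea behind the connecting line aided method can be sketched in Python as follows. The perpendicular distance is the standard point-to-line formula. The weight itself is only a hypothetical stand-in for Equation (20), which is not reproduced here: it assumes that $u_1$, $u_2$ and $u_3$ multiply the hop distance, the distance to the connecting line and the energy deficit in that order, and that distances are normalized by their maximum over the candidates. The coefficients `u` must be supplied by the caller, and the function names are illustrative.

```python
import math

def perpendicular_distance(point, line_start, line_end):
    """Distance from a point to the straight line through two points."""
    (px, py), (x1, y1), (x2, y2) = point, line_start, line_end
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        return math.hypot(px - x1, py - y1)  # degenerate line
    return abs((x2 - x1) * (y1 - py) - (x1 - px) * (y2 - y1)) / length

def select_next_hop(ch_i, bs, candidates, u, e0):
    """Return the index of the candidate with the smallest weight.

    candidates: list of (position, residual_energy) pairs.
    ASSUMPTION: hypothetical weight
        u1 * d_j / max(d) + u2 * d_v / max(d_v) + u3 * (1 - E_j / E_0),
    used only to illustrate the trade-off, not the paper's Equation (20).
    """
    u1, u2, u3 = u
    d = [math.dist(ch_i, pos) for pos, _ in candidates]
    dv = [perpendicular_distance(pos, ch_i, bs) for pos, _ in candidates]
    d_max = max(d) or 1.0    # avoid division by zero
    dv_max = max(dv) or 1.0
    weights = [u1 * dj / d_max + u2 * dvj / dv_max + u3 * (1 - e / e0)
               for dj, dvj, (_, e) in zip(d, dv, candidates)]
    return min(range(len(candidates)), key=weights.__getitem__)
```

A candidate that lies on the line between the CH node and the BS has a zero line-distance term, so under this sketch it is preferred over an equally distant candidate that would pull the route sideways.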

Figure 14. Routing diagram in simulation environment. Legend: BS, sensor node, CH node, inter-cluster routing, intra-cluster routing.

Simulation and Results
To evaluate the proposed protocol, simulations are performed in MATLAB. To simplify the simulation process, it is assumed that the network has an ideal MAC (Medium Access Control) layer, that data link communication is reliable, and that the energy of the BS is not restricted. In the network control process, no energy cost is counted for sending control messages and receiving data. Only the energy consumption of the sensor nodes is considered during the experiments. The parameters of the network areas are pre-set.
The simulation parameters of the network area are shown in Table 1. Simulation experiments were carried out on UCCGRA, multi-hop EEBCDA, EEMRP, CAMP, PSO-ECHS, PSO-SD and PUDCRP under the corresponding network conditions. Because the proposed algorithm is based on region partitioning, this paper mainly compares it with algorithms based on grid region partitioning and with PSO-based clustering algorithms.
The node death states of the protocols are shown in Table 2, where FND indicates the round in which the first node dies, HND indicates the round in which half of the nodes have died, and LND indicates the round in which 80% of the nodes have died. Table 2 shows that the PUDCRP protocol runs for more rounds than the other six protocols under the same network conditions. In the 400 m × 400 m network, compared with UCCGRA, multi-hop EEBCDA, EEMRP, CAMP, PSO-ECHS and PSO-SD, the FND of PUDCRP is delayed by 18.00%, 508.06%, 216.81%, 33.69%, 528.33% and 62.85%, respectively. The HND of PUDCRP is increased by 31.95%, 61.89%, 12.67%, 61.50%, 109.42% and 52.81%, respectively. The number of running rounds of PUDCRP is increased by 48.75%, 63.21%, 68.89%, 7.36%, 74.21% and 69.81%, respectively. PUDCRP balances the energy consumption of the network more effectively and prolongs the network lifetime more than the other six protocols.
Table 3 shows the average and standard deviation of the residual energy of nodes in the 500th round in the 400 m × 400 m network with 200 nodes. AVE indicates the average residual energy of nodes, and STD indicates the standard deviation of the residual energy of nodes. The higher the average residual energy, the more energy-efficient the algorithm; the lower the standard deviation, the more balanced the energy consumption. The average residual energy of nodes in PUDCRP is 22.16%, 121.30%, 8.92%, 33.73%, 197.28% and 21.05% higher than that of the other six algorithms, respectively.
The standard deviation of the residual energy of nodes in PUDCRP is 18.82%, 41.37%, 10.99%, 5.14%, 14.22% and 12.57% lower than that of the other six algorithms, respectively. Panels (a)-(g) of Figure 15 show the residual energy of nodes in the 500th round in UCCGRA, multi-hop EEBCDA, EEMRP, CAMP, PSO-ECHS, PSO-SD and PUDCRP, respectively. Figure 15 shows that the minimum residual energy of nodes in PUDCRP is more than 0.1 J, whereas the minimum residual energy of nodes in each of the other six algorithms is less than 0.05 J. This indicates that the energy consumption per node in PUDCRP is lower than in the other algorithms. Figure 15 also shows that the energy histogram of the nodes in PUDCRP is denser than those of the other six algorithms, which reflects the balanced energy consumption of PUDCRP. The results from Table 3 and Figure 15 show that PUDCRP is much more energy-efficient and achieves more balanced energy consumption across the entire network.
The simulation experiments also compare the number of surviving nodes and the energy consumption of the seven protocols in each round. The results are shown in Figures 16 and 17.
Figure 16 shows how the number of surviving nodes of the seven routing protocols varies with the number of operation rounds in 400 m × 400 m networks. The number of surviving nodes of PUDCRP begins to decrease later than that of the other six protocols, and the round in which the first node dies is significantly delayed. The number of surviving nodes in PUDCRP also decreases more slowly than in the other six protocols. These results mean that PUDCRP balances the energy consumption of the sensor nodes more effectively than the other protocols.
Figure 17 shows how the energy consumption of the seven routing protocols varies in each round. The PUDCRP protocol consumes less energy per round than the other protocols. It can be concluded that, compared with the other six protocols, PUDCRP significantly reduces the energy consumption of nodes.
Figure 18 shows the total number of packets received by the BS for the seven routing protocols. As the simulation rounds increase, the number of packets received by the BS differs among these protocols.
In the PUDCRP algorithm, the BS receives far more data packets than in EEMRP and the other algorithms over the same number of rounds. In the 400 m × 400 m network, the packets sent to the BS in PUDCRP saturate at the 1200th round, by which point the BS has received 88,820 packets. EEMRP also saturates at the 1200th round, but the BS has received only 53,960 packets. The experimental results show that, owing to the longer network lifetime and the more balanced energy consumption, the number of packets received by the BS in PUDCRP is much higher than that of the other six protocols. The number of packets received per round by UCCGRA and PUDCRP during the network operation period is similar, but since the network lifetime of PUDCRP is longer than that of UCCGRA, PUDCRP receives more packets than UCCGRA. Balanced energy consumption delays the death of the nodes, which ensures that the number of packets received by the BS remains high for a long time. Therefore, our algorithm significantly improves data transmission performance and interactive capabilities. It also shows that, under the same experimental conditions, PUDCRP can collect more data and achieve higher network energy efficiency.
We also compared the scalability of the seven protocols with respect to the number of nodes and the network area. The LNDs of UCCGRA, multi-hop EEBCDA, EEMRP, CAMP, PSO-ECHS, PSO-SD and PUDCRP were tested in the 200 m × 200 m and 400 m × 400 m network environments with different numbers of nodes. Figure 19 shows the LNDs of the seven protocols in 400 m × 400 m networks with different numbers of nodes, and Table 4 gives the corresponding measurement data. Figure 20 shows the LNDs of the seven protocols in 200 m × 200 m networks with different numbers of nodes, and Table 5 gives the corresponding measurement data. As can be seen from Figures 19 and 20, in network environments with different numbers of nodes, the LNDs of PUDCRP occurred significantly later than those of the other six protocols. The results show that PUDCRP has better scalability across network environments with different numbers of nodes and different sizes. This is due to the high energy efficiency of nodes and the balanced energy consumption of the whole network achieved by the PSO-based uneven dynamic clustering method.
Compared to multi-hop EEBCDA and EEMRP, PUDCRP performs well in network environments of any shape (not just rectangular) and size. Multi-hop EEBCDA, EEMRP and other rectangular meshing clustering algorithms are more suitable for rectangular network environments. Compared with CAMP, in PUDCRP, the farther the nodes are from the BS, the larger the clusters, which better alleviates the hotspot problem. Compared with PSO-ECHS and PSO-SD, the distribution of CH nodes in PUDCRP is more reasonable because it is determined by the distribution of nodes. Among these protocols, only PUDCRP considers the distribution of nodes in the clustering process.
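The lifetime metrics used throughout this section (FND, HND and LND, with LND defined at 80% of the nodes) and the percentage improvements quoted from the tables can be computed as in the following Python sketch. It is an illustration under stated assumptions, not the authors' MATLAB code: the input is assumed to be the death round of every node, and the function names `lifetime_metrics` and `percent_gain` are hypothetical. The numbers in the usage note are toy values, not results from the paper.

```python
def lifetime_metrics(death_rounds, lnd_percent=80):
    """Return (FND, HND, LND) from the per-node death rounds.

    ASSUMPTION: death_rounds holds one entry per node; nodes that never
    die can be given float('inf'). HND is the round of the death that
    brings the dead fraction to at least one half, and LND the round
    that brings it to at least lnd_percent percent.
    """
    ordered = sorted(death_rounds)
    n = len(ordered)
    k_half = (n + 1) // 2                   # ceil(n / 2) in integers
    k_last = (n * lnd_percent + 99) // 100  # ceil(n * percent / 100)
    return ordered[0], ordered[k_half - 1], ordered[k_last - 1]

def percent_gain(ours, baseline):
    """Relative improvement of one protocol over a baseline, in percent."""
    return (ours - baseline) * 100.0 / baseline
```

For ten nodes dying at rounds 100, 200, ..., 1000, this gives FND = 100, HND = 500 and LND = 800. A protocol whose LND is 1200 rounds improves on a baseline LND of 1000 rounds by 20%.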

Conclusions
In this paper, we proposed PUDCRP, an energy-efficient multi-hop routing protocol that uses the particle swarm optimization (PSO) algorithm to form clusters adaptively. The PSO-based uneven dynamic clustering method divides the network area into circles of unequal sizes according to the number and distribution of nodes. The radius of each circle is determined by the distance between its center and the BS. The farther a cluster is from the BS, the larger the cluster, which effectively alleviates the hotspot problem of WSNs. The proposed protocol improves the way clusters are created in a wireless sensor network. The key idea is to divide the network area into multiple clusters adaptively, based on the distribution of nodes, to achieve more balanced energy consumption. Compared with the rectangular grid clustering method and the formula selection CH clustering method, the PSO-based uneven dynamic clustering method significantly reduces the energy consumption of nodes. We further proposed a connecting line aided route construction method to improve the energy efficiency of data transmission between the BS and the CH nodes. Simulation experiments showed that, compared with UCCGRA, multi-hop EEBCDA, EEMRP, CAMP, PSO-ECHS and PSO-SD, PUDCRP achieves more balanced energy consumption, significantly prolongs the network lifetime, and scales better across different numbers of nodes and different network sizes.