A K-Means Clustered Routing Algorithm with Location and Energy Awareness for Underwater Wireless Sensor Networks

Abstract: Data delivery in harsh underwater channels consumes higher transmission power than in terrestrial networks. However, due to the complexity of the underwater environment, the energy supply of the nodes in underwater wireless sensor networks is usually limited because battery replacement is laborious. Thus, energy consumption is considered one of the key issues in underwater wireless optical communication. To minimize this consumption for underwater transmission nodes, K-means technology has attracted much research interest in the design of routing algorithms. However, these algorithms have not considered the location and the remaining energy of the underwater nodes simultaneously, which might affect their efficiency. In this paper, we propose a clustered routing algorithm, namely the location and energy-aware k-means clustered routing (LE-KCR) algorithm, which applies K-means technology with regard to both the location and the remaining energy of each node. In the proposed LE-KCR algorithm, both the location and the remaining energy of a candidate cluster-head, as well as the distance between it and its sink node, are considered in cluster-head selection. In addition, because the limited transmission range within a cluster leaves some nodes unable to reach the rest of the underwater sensor network, the dual-hop routing technique is adopted for the edge nodes. The simulation results indicate that the proposed LE-KCR algorithm remarkably reduces the energy consumption and the number of dead nodes compared with the traditional low-energy adaptive clustering hierarchy (LEACH) protocol and the optimized LEACH protocol based on K-means clustering technology.


Introduction
Oceans occupy nearly 71% of Earth's surface, and many nations have proposed their own marine development strategies, which require intensive underwater exploration. Unlike other typical wired and wireless networks [1][2][3][4][5][6], underwater wireless sensor networks (UWSNs), which adopt diverse underwater wireless communication technologies (e.g., underwater wireless acoustic communication), are considered essential for underwater monitoring and exploration tasks such as auxiliary navigation, ecological observation, resource development, and ship detection [7,8]. However, because underwater acoustic systems are significantly limited in bandwidth and data rate, optical waves [9] were introduced into underwater wireless communication, known as underwater wireless optical communication (UWOC) [10][11][12][13][14], to provide low-delay and high-speed underwater data delivery. In addition, UWOC offers high confidentiality and strong anti-interference ability, although the characteristics of optical waves limit its transmission range. By nodes. However, the proposed routing algorithm based on Q-Learning was not designed for a clustering-based network topology. In [29], a new UWSN routing protocol was proposed to deal with malicious attacks and improve network reliability. This protocol adopted a multi-receiver network architecture and clustering technology. In addition, the gateway identified and verified the cluster-head to ensure that all nodes in the cluster were valid. Such an approach can provide a high data transmission rate and reduce energy consumption. In [30], a new hybrid clustering approach based on the fuzzy c-means technique was proposed, which adopted clustering to enhance power utilization and optimize the network life cycle.
In [21], another cluster-based energy-efficient routing (CBE2R) approach was proposed. In the CBE2R approach, the messenger node was regarded as the cluster-head and collected data from the anchor node through the relay node. The mobility of the messenger node was manipulated to optimize the network lifetime as well as the consumed energy. In [31], the authors introduced a new protocol for cluster-head selection, named the distance and energy constrained K-means clustering scheme (DEKCS). The potential cluster-head was selected according to its position in the cluster. The algorithm dynamically updated the remaining energy threshold set for potential cluster-heads to ensure that the network completely ran out of energy before disconnecting. However, the authors did not consider areas of sparse sensor deployment, and the remaining energy had little influence on the selection of the cluster-head. In DEKCS, the AUV was deployed on the water surface to receive and forward data. However, for underwater sensor-clustered networks, AUVs can be used to collect data from the network to reduce the transmission energy required to reach the remote base station [32,33]. In this way, data collection via AUVs can significantly improve the lifetime of UWSNs [34]. This technique has also been used in the offshore energy industry [35].
In this article, we propose a clustered routing algorithm, namely the location and energy-aware k-means clustered routing (LE-KCR) algorithm, which applies K-means technology in light of both the location and the remaining energy of each node. The proposed LE-KCR algorithm adopts the gap statistic method to determine the optimal number of clusters, and clusters the underwater nodes via the K-means algorithm before selecting a head for each cluster according to its location, remaining energy, and distance to the sink node. The K-means algorithm is a typical distance-based clustering algorithm that takes distance as the similarity measure. Such an algorithm usually seeks compact and mutually independent clusters as its final solution. In previous algorithms, cluster-heads were mostly selected according to the position of the cluster centroid. In this paper, however, more attention is given to the relative distance between the cluster-head and the other nodes. In addition, because the limited transmission range within a cluster leaves some nodes unable to reach the rest of the underwater sensor network, the dual-hop routing technique is adopted for the edge nodes.
The rest of this article is organized as follows. In Section 2, the employed system model is introduced. In Section 3, the details of the proposed location and energy-aware k-means clustered routing (LE-KCR) algorithm are presented. In Section 4, the performance of the proposed LE-KCR algorithm is investigated, and the article is summarized in Section 5.

System Model
In this article, a static underwater optical wireless sensor network is adopted. Figure 1 illustrates the network model employed for the proposed LE-KCR algorithm, in which the sensors are randomly located underwater following a Poisson distribution. To alleviate the effect of the directional transmission property of UWOC, each node is presumed to be equipped with a prismatic array of three high-power blue LED modules [11], each with a field of view of 120 degrees, to realize omnidirectional transmission. We assume that an appropriate number of autonomous underwater vehicles (AUVs) [36] with cruise function are deployed as sink nodes to collect data. Possible methods of deploying such networks were proposed in [37]. The sensors are divided into multiple clusters for data reporting through the cluster-heads selected from them. After a cluster-head collects data, it transmits the data to the base station on land through the communication link established by the nearest AUV and the sea buoys. Notably, AUVs can also assist the sea buoys in regularly updating the positions of nodes in the network to reduce node position errors [17]. In the following subsections, both the underwater channel and the consumed energy are modeled.
Photonics 2022, 9, 282

Underwater Channel Model
Aquatic media contain dozens of different elements suspended or dissolved in water at different concentrations, as well as marine animals and plants [38]. Therefore, the water medium exhibits two basic physical effects, namely absorption and scattering. According to Beer's law, the absorption and scattering effects of the water medium can be combined into the seawater extinction coefficient $c(\lambda)$ via the following formula:

$$c(\lambda) = a(\lambda) + b(\lambda)$$

where $\lambda$, $a(\lambda)$, and $b(\lambda)$ represent the wavelength, the absorption coefficient, and the scattering coefficient, respectively. The key absorbing constituents include pure seawater, chlorophyll, and colored dissolved organic matter. Seawater contains pure water and dissolved salt. Chlorophyll, which plays a significant role in photosynthesis, absorbs light energy over a wide wavelength range. The organic substances, mainly composed of humic and fulvic acids, also absorb light in seawater. The scattering effect mainly comes from diverse kinds of particles, whose refractive indices range from 1.03 to 1.15 [39]. As in [40], the underwater light absorption model can be established as follows:

$$a(\lambda) = a_w(\lambda) + a_{cl}(\lambda)\left(\frac{C_c}{C_c^0}\right)^{0.602} + a_f^0 C_f e^{-k_f \lambda} + a_h^0 C_h e^{-k_h \lambda}$$

where $a_w(\lambda)$ denotes the absorption coefficient of pure water, $a_{cl}(\lambda)$ denotes the chlorophyll absorption coefficient [38,39], $a_h^0$ and $a_f^0$ denote the absorption coefficients of humic and fulvic acid, respectively, and $C_h$ and $C_f$ denote the concentrations of humic and fulvic acid, respectively, which can be determined by the chlorophyll concentration $C_c$ ($C_c^0 = 1$ mg/m³, $0 < C_c < 12$ mg/m³). $k_f$ and $k_h$ are model constants.
The scattering model can be calculated as follows [38]:

$$b(\lambda) = b_w(\lambda) + b_s^0(\lambda) C_s + b_l^0(\lambda) C_l$$

where $b_w(\lambda)$, $b_s^0(\lambda)$, and $b_l^0(\lambda)$ denote the scattering coefficients of pure water, small particles, and large particles, respectively, and $C_s$ and $C_l$ denote the concentrations of small and large particles in seawater, respectively.
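As a rough illustration of how the channel model is used, the sketch below combines absorption and scattering into an extinction coefficient and applies Beer's law over a few link distances. The function names are hypothetical, the additive form of the scattering term is assumed from the symbol definitions above, and the numeric coefficients are illustrative clear-ocean magnitudes rather than values taken from this paper.

```python
import math

def scattering_coefficient(b_w, b_s0, c_s, b_l0, c_l):
    """Pure-water scattering plus small- and large-particle terms (assumed additive)."""
    return b_w + b_s0 * c_s + b_l0 * c_l

def extinction_coefficient(a, b):
    """Extinction coefficient c = a + b, all in 1/m."""
    return a + b

def transmittance(c, distance_m):
    """Beer's law: fraction of optical power remaining after distance_m metres."""
    return math.exp(-c * distance_m)

# Illustrative values only (assumption, not from the paper).
a, b = 0.114, 0.037
c = extinction_coefficient(a, b)
for d in (10, 30, 50):
    print(f"d = {d:2d} m -> transmittance = {transmittance(c, d):.4f}")
```

Because the loss is exponential in distance, doubling the link length squares the transmittance, which is the behaviour the later sections refer to as the "exponential" characteristic of energy consumption.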

Energy Model
Underwater sensor nodes consume energy for sensing, packet transmission, packet reception, data processing, network maintenance, staying awake, etc. Among these, data transmission accounts for most of the energy consumption, which depends on the packet size and the transmission distance [41]. It includes the power consumed by the wireless optical transmitting equipment and the power consumed by the transmitter power amplifier. For the receiver, only the power consumed in the device is considered, and it depends on the received packet. When the packet size is fixed, this power is constant.
Given line-of-sight (LOS) transmission, the received optical power is estimated by the empirical path loss model [40], which gives an effective model as the product of transmitter power, telescope gain, and losses:

$$P_r = P_t \eta_t \eta_r \exp\left(-\frac{c(\lambda) R}{\cos\theta}\right) \frac{A_r \cos\theta}{2\pi R^2 (1 - \cos\theta_0)}$$

where $R$ represents the vertical distance between the transmitting plane and the receiving plane (approximately equal to the transmission range), $\eta_t$ and $\eta_r$ represent the optical efficiencies of the transmitter and receiver, respectively, and $A_r$, $\theta_0$, and $\theta$ are the receiver aperture area, the divergence angle of the transmitter beam, and the angle between the perpendicular to the receiver plane and the transmitter-receiver trajectory, respectively. Without considering the beam divergence, only the effective area of the receiving antenna and the optical effect of the optical system are taken into account, and the transmission power $P_t$ is expressed as in [40]. The transmission energy consumed by the transmitter is then calculated in terms of $\delta$, $t$, and $K(\lambda)$, which denote the effective area of the receiving antenna, the transmission time, and the attenuation coefficient of seawater, respectively. The energy consumed by node #i in delivering 1 bit of data to node #j is derived as in [40] in terms of $E_{elec}$, the energy consumed in processing 1 bit of information, $d_{ij}$, the distance from node #i to node #j, and $R_b$, the data rate in UWOC. The energy consumed by a node to receive and process 1 bit of information is likewise derived as in [40].
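To make the link budget concrete, the following sketch evaluates a commonly used LOS expression built from the symbols defined above, $P_r = P_t \eta_t \eta_r \exp(-c R / \cos\theta) \cdot A_r \cos\theta / (2\pi R^2 (1 - \cos\theta_0))$, and inverts it for power control. This form is assumed here as a stand-in for the model in [40]; the function names and all link parameters are hypothetical, and only the 0.0025 mW receive floor and the 50 m range come from the simulation settings of this paper.

```python
import math

def los_gain(c, R, theta, theta0, eta_t, eta_r, A_r):
    """Ratio P_r / P_t for a line-of-sight optical link (assumed model)."""
    path_loss = math.exp(-c * R / math.cos(theta))
    geometry = A_r * math.cos(theta) / (2 * math.pi * R**2 * (1 - math.cos(theta0)))
    return eta_t * eta_r * path_loss * geometry

def received_power(p_t, **link):
    """Received optical power in watts for transmit power p_t."""
    return p_t * los_gain(**link)

def required_tx_power(p_r_min, **link):
    """Power control: smallest P_t that still delivers p_r_min at the receiver."""
    return p_r_min / los_gain(**link)

# Hypothetical link parameters; theta0 = 60 degrees is an illustrative choice.
link = dict(c=0.151, R=50.0, theta=0.0, theta0=math.radians(60),
            eta_t=0.9, eta_r=0.9, A_r=0.01)
p_r_min = 2.5e-6  # 0.0025 mW expressed in watts
print(f"required P_t at 50 m: {required_tx_power(p_r_min, **link):.3e} W")
```

Splitting the gain from the power lets the same function serve both the forward calculation and the power-control step.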

K-Means Clustering
The K-means algorithm, regarded as one typical and widely used clustering algorithm, can sort elements with common characteristics (e.g., Euclidean distance). It is employed to partition the network into multiple non-overlapping clusters, with the minimized intra-cluster distance (i.e., the distance between nodes and the cluster centroid) and the maximized inter-cluster distance (i.e., the distance between different cluster centroids) [42].
The clustering steps are as follows:

1. Generate a sensor network with n nodes.
2. Select K nodes at random as the initial cluster centers.
3. For each sample, compute the distance between it and each cluster center, and assign the sample to the cluster with the nearest center.
4. Recalculate the cluster center of each cluster (i.e., the centroid of the cluster):

$$a_j = \frac{1}{|c_j|} \sum_{x \in c_j} x$$

where $x$, $a_j$, and $c_j$ denote each sample, the cluster centroid, and the set of intra-cluster samples, respectively. Note that the cluster centroid in subsequent iterations is a geometric centroid rather than a specific node.
5. Repeat steps 3 and 4 until the membership of each cluster remains essentially unchanged, and then terminate the iteration (the complexity of the algorithm can be controlled by changing the number of iterations and the minimum error).
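The clustering steps above can be sketched in a few lines of plain Python. This is a minimal illustration rather than the authors' MATLAB code: the function name `kmeans`, the random initialisation from existing nodes, and the stopping tolerance are assumptions.

```python
import math
import random

def kmeans(points, k, max_iter=100, tol=1e-6, seed=0):
    """Plain K-means on 2-D points; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centers: K nodes drawn at random (assumed)
    labels = [0] * len(points)
    for _ in range(max_iter):
        # Assign every node to its nearest center.
        labels = [min(range(k), key=lambda j: math.dist(p, centroids[j]))
                  for p in points]
        # Move each center to the geometric centroid of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                new_centroids.append(centroids[j])  # keep the center of an empty cluster
        # Stop once the centers barely move.
        shift = max(math.dist(c, n) for c, n in zip(centroids, new_centroids))
        centroids = new_centroids
        if shift < tol:
            break
    return centroids, labels

rng = random.Random(1)
nodes = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(100)]
centroids, labels = kmeans(nodes, k=5)
print(len(centroids), "centroids,", len(set(labels)), "non-empty clusters")
```

Note that after the first update the centroids are no longer node positions, which is why cluster-head selection is handled as a separate step.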
To determine the value of K, this paper employs the gap statistic method [43]. Although the elbow method [19] is usually adopted to determine K, it requires the inflection point to be found manually and is therefore not sufficiently automatic. The gap statistic method is adopted in this paper as follows:

$$D_k = \sum_{r=1}^{k} \frac{1}{2 n_r} \sum_{i,j \in c_r} d_{ij} \qquad (14)$$

$$\mathrm{Gap}(k) = E(\log D_k) - \log D_k \qquad (15)$$

In Equation (14), $D_k$ is the loss function, $d_{ij}$ is the distance between any two nodes in the same cluster $c_r$, and $n_r$ is the number of nodes in that cluster. In Equation (15), $E(\log D_k)$ is the expectation of $\log D_k$, which is generally obtained by Monte Carlo simulation. We generate as many random samples as there are actual samples, uniformly distributed over the area where the sensor network is located, and cluster the random samples with K-means to obtain $D_k$. Repeating this multiple times (e.g., 20 times) yields 20 values of $\log D_k$, which are averaged to approximate $E(\log D_k)$. The number of clusters is then varied, and the k at which Gap(k) reaches its maximum is taken as the best K, as shown in Figure 2.
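The Monte Carlo procedure described above can be sketched as follows. This is a hedged illustration: `kmeans_labels`, `dispersion`, `gap`, and `best_k` are hypothetical names, the compact K-means helper uses an assumed random initialisation, and the dispersion follows the usual gap statistic definition (sum of intra-cluster pairwise distances over ordered pairs, divided by twice the cluster size).

```python
import math
import random

def kmeans_labels(points, k, rng, iters=50):
    """Compact K-means helper; returns one cluster label per point."""
    centers = rng.sample(points, k)
    labels = None
    for _ in range(iters):
        new_labels = [min(range(k), key=lambda j: math.dist(p, centers[j]))
                      for p in points]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels

def dispersion(points, labels):
    """D_k: per cluster, sum of pairwise distances divided by 2 * n_r."""
    total = 0.0
    for r in set(labels):
        members = [p for p, lab in zip(points, labels) if lab == r]
        pair_sum = sum(math.dist(p, q) for p in members for q in members)
        total += pair_sum / (2 * len(members))
    return total

def gap(points, k, side, n_ref=20, seed=0):
    """Gap(k): mean log D_k over uniform reference sets minus log D_k of the data."""
    rng = random.Random(seed)
    log_dk = math.log(dispersion(points, kmeans_labels(points, k, rng)))
    ref_logs = []
    for _ in range(n_ref):
        ref = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in points]
        ref_logs.append(math.log(dispersion(ref, kmeans_labels(ref, k, rng))))
    return sum(ref_logs) / n_ref - log_dk

def best_k(points, side, k_values):
    """Pick the k with the largest gap, as described in the text."""
    return max(k_values, key=lambda k: gap(points, k, side))

rng = random.Random(42)
nodes = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(100)]
print("selected K:", best_k(nodes, side=100.0, k_values=range(2, 7)))
```

The sketch assumes each cluster holds at least two distinct points so that the logarithm is defined; a production version would guard against a zero dispersion.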

Cluster-Head Selection
Before selecting cluster-heads, the network is clustered to reduce the energy consumed in the process of cluster formation. In this paper, the cluster-head selection policy satisfies the following conditions:

• In-cluster position: The nodes are unevenly distributed, so we prefer to place the cluster-head in the area with higher node density. According to the proximity rule [31], the proposed algorithm calculates the sum of the Euclidean distances between each candidate node and the other nodes in the cluster, and selects the node with the smallest sum as the cluster-head, which minimizes the energy consumed by the sensors in transmitting to the cluster-head. However, the selected cluster-head will likely be far from regions with sparse sensor deployment.
• Distance to anchor node: Some nodes close to AUVs communicate directly with the AUVs without clustering. According to the energy consumption model of UWOC transmission in Figure 3, the energy consumption increases with the transmission distance, especially over long distances. When the transmission distance is short, the transmission energy consumption of a node is extremely low. Therefore, a small number of nodes close to the AUV can communicate directly with it to reduce the workload of the cluster-head. This helps reduce the cluster-head energy consumption, improve transmission efficiency, and avoid energy waste. Moreover, during cluster-head selection, nodes closer to an AUV are more likely to be selected as cluster-heads.
• Remaining energy level: Cluster-heads consume far more energy than other nodes. If a node is repeatedly selected as a cluster-head, it may die prematurely due to excessive energy consumption, resulting in a "void area". Therefore, the proposed algorithm takes the remaining energy level of each node into account, and a node with more remaining energy is more likely to be selected as a cluster-head. To reduce the void area, an energy threshold is set for each node. When the remaining energy of a node is less than a quarter of its initial energy, it is very unlikely to be selected as the cluster-head. In this way, the node can still work as an ordinary node when its remaining energy is low, and the "void area" phenomenon can be avoided.
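One way to combine the three criteria above is a normalised weighted cost, sketched below. The paper does not give a closed-form score, so the weighted sum, the equal default weights, the hard cut-off at one quarter of the initial energy, and the name `select_cluster_head` are all assumptions made for illustration.

```python
import math

def select_cluster_head(cluster, auv, initial_energy=5.0, weights=(1.0, 1.0, 1.0)):
    """cluster: list of dicts with 'pos' (x, y) and 'energy' in joules.

    Returns the index of the chosen cluster-head, or None if no node is eligible.
    """
    w_pos, w_auv, w_energy = weights
    threshold = initial_energy / 4  # simplification: low-energy nodes are skipped outright
    eligible = [i for i, n in enumerate(cluster) if n["energy"] >= threshold]
    if not eligible:
        return None
    # In-cluster position: sum of distances to every node in the cluster.
    sum_dist = {i: sum(math.dist(cluster[i]["pos"], m["pos"]) for m in cluster)
                for i in eligible}
    # Distance to the anchor (AUV) node.
    auv_dist = {i: math.dist(cluster[i]["pos"], auv) for i in eligible}
    max_sum = max(sum_dist.values()) or 1.0
    max_auv = max(auv_dist.values()) or 1.0

    def cost(i):
        # Lower is better: central, close to the AUV, and rich in energy.
        return (w_pos * sum_dist[i] / max_sum
                + w_auv * auv_dist[i] / max_auv
                - w_energy * cluster[i]["energy"] / initial_energy)

    return min(eligible, key=cost)

cluster = [
    {"pos": (0.0, 0.0), "energy": 5.0},
    {"pos": (10.0, 0.0), "energy": 5.0},
    {"pos": (20.0, 0.0), "energy": 5.0},
]
print("cluster-head index:", select_cluster_head(cluster, auv=(10.0, 10.0)))
```

Normalising each term keeps distances in metres and energies in joules on a comparable scale, so the weights express relative priority rather than unit conversions.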

Edge Node Access Network
In the above cluster-head selection strategy, the nodes located in the dense region are more likely to be selected as the cluster-head, which may leave the cluster-head far away from the area with sparse sensors. Due to the "exponential" characteristics of energy consumption, a node in this area may consume significant energy to access the network, as in Figure 3. Therefore, the proposed algorithm connects these nodes to the cluster-head through dual-hop transmission. A node in the sparse region takes the nodes in the union region of its own transmission range and the cluster-head transmission range as candidate nodes. We choose the node with the minimum dual-hop distance in the region as the relay node, as in Figure 4. The specific steps of dual-hop transmission are as follows: after determining the cluster-head (CH) position, the source node turns on the optical transmitter in the corresponding direction and limits the power so that it can just transmit reliably within 50 m. As shown in Figure 4a, the node closest to the intersection of the sector edge and the line connecting the source node to the cluster-head is selected as the relay node. In this way, the most energy-saving relay node in the dual-hop candidate area can be selected. Figure 4b shows the model of the dual-hop strategy.
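The relay choice described above can be sketched as a simple filter followed by a minimisation. The name `select_relay` is hypothetical, and the sketch assumes that a usable relay must lie within the 50 m range of both the source node and the cluster-head; the sector geometry of Figure 4a is not modelled.

```python
import math

def select_relay(source, head, candidates, tx_range=50.0):
    """Return the candidate position minimising the dual-hop distance, or None."""
    reachable = [c for c in candidates
                 if math.dist(source, c) <= tx_range   # first hop within range (assumed)
                 and math.dist(c, head) <= tx_range]   # second hop within range (assumed)
    if not reachable:
        return None
    return min(reachable, key=lambda c: math.dist(source, c) + math.dist(c, head))

source, head = (0.0, 0.0), (80.0, 0.0)
candidates = [(40.0, 5.0), (40.0, 30.0), (10.0, 0.0), (45.0, 0.0)]
print("relay:", select_relay(source, head, candidates))
```

In this example the direct link would span 80 m, beyond the 50 m limit, while the chosen relay splits it into two hops of 45 m and 35 m.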

AUV-Based Data Collection
Due to the short transmission distance of UWOC, AUVs with cruise function can be adopted to assist data transmission [35]. These AUVs are equipped with acoustic-optical hybrid systems, which can switch between optical and acoustic communication to adapt to different communication ranges [44]. They establish communication links directly with sea buoys. The buoys obtain their own position information through GPS positioning and calculate the positions of the AUVs according to the time difference of arrival of the AUVs' data [45]. When an AUV approaches the network, it can quickly collect the data transmitted by the nodes and forward it to the corresponding sea buoy. AUVs can also be used as anchor nodes for underwater node positioning [17]. Because of seawater turbulence and other factors, the actual position of a node may deviate from its initial position, so AUVs can also periodically update the actual location information of the nodes.

Performance Investigation
We investigate the performance of the proposed LE-KCR algorithm in an underwater wireless sensor network employing optical waves. The network parameters are selected according to the channel and energy models described in Section 2. Each sensor node has an initial energy of 5 J and is equipped with components such as an optical transmitter, a receiver, and a signal generator. The maximum transmission range of the signal is set to 50 m, and the minimum receiving power is set to 0.0025 mW. Power control is used to assign just enough transmitting power to guarantee this value at the receiver. The network comprises 100 sensor nodes, which are deployed in an area of 100 × 100 m² following a Poisson process. Considering the characteristics of the underwater channel and the resource constraints of the nodes, most of the simulation parameters are set based on [46], while the maximum transmission range is set by considering the "exponential" characteristics of energy consumption in Figure 3. We employ the K-means technique to cluster the network and determine the number of clusters by the gap statistic method. Owing to the small number of nodes in the model, the clustering algorithm converges in less than one second. It selects the best number of clusters and identifies the centroid of each cluster. We set the energy threshold to 2 J. When the remaining energy of a node falls below this threshold, the node will no longer be selected as a cluster-head and is regarded as a "dead node". However, it is not truly dead, because it can still serve as an ordinary node that collects information for the network. The simulations are performed in multiple rounds. In each round, a non-cluster-head node transmits 1 bit of information to the corresponding cluster-head. The cluster-head processes the collected information before forwarding it to the nearest anchor node. Then, the information can be further transmitted to the receiver on the sea surface. The simulations are carried out in MATLAB.
The parameters set in the simulations can be found in Table 1.  Figure 5 shows the simulation results on dead nodes with different rounds, when the basic LEACH, the LEACH k-means, and the proposed LE-KCR algorithms are adopted. As shown in Figure 5, the LEACH algorithm dies more than half of the nodes in less than 30 rounds. The Leach-k algorithm dies nearly 40% of nodes after running 100 rounds and, in order to ensure the consistency of the comparison, we set the same conditions for the proposed LE-KCR algorithm. Before reaching the energy threshold condition, only 4% of the nodes die by running 100 rounds. In the proposed LE-KCR algorithm, the selected cluster-heads are close to most nodes, so less energy is required for each round. This shows that the proposed algorithm is extraordinarily energy-saving.
Photonics 2022, 9, x FOR PEER REVIEW 10 of 14 Figure 5 shows the simulation results on dead nodes with different rounds, when the basic LEACH, the LEACH k-means, and the proposed LE-KCR algorithms are adopted. As shown in Figure 5, the LEACH algorithm dies more than half of the nodes in less than 30 rounds. The Leach-k algorithm dies nearly 40% of nodes after running 100 rounds and, in order to ensure the consistency of the comparison, we set the same conditions for the proposed LE-KCR algorithm. Before reaching the energy threshold condition, only 4% of the nodes die by running 100 rounds. In the proposed LE-KCR algorithm, the selected cluster-heads are close to most nodes, so less energy is required for each round. This shows that the proposed algorithm is extraordinarily energy-saving.  Figure 6 depicts the variety of nodes' total remaining energy with a diverse number of transmission rounds when the basic LEACH, the LEACK K-Means, and the proposed LE-KCR algorithms are adopted. Remaining energy is another important criterion to estimate the energy efficiency of wireless sensor networks [47]. It can evaluate the network lifetime based on different algorithms. As shown in Figure 6, we consider the area under the curve under the same remaining energy threshold. The proposed LE-KCR algorithm has a better network lifetime than both the LEACH and the LEACK K-Means algorithms with different numbers of transmissions. Compared with LEACH, the performance of the proposed LE-KCR algorithm improves by more than 90%. It also has 40% higher performance than the optimized k-means algorithm based on LEACH. The reason for the performance difference is that the LEACH algorithm randomly chooses cluster-heads without regarding the located site and remaining energy of cluster-heads in the network. The proposed LEACH-K algorithm designates the node nearest to the cluster centroid as the cluster-head. 
However, in most implementations, the remaining energy of the node and its distance to its sink node are not considered. It can be concluded that the proposed LE-KCR algorithm is more durable than the other two algorithms before the network is disconnected.  Figure 6 depicts the variety of nodes' total remaining energy with a diverse number of transmission rounds when the basic LEACH, the LEACK K-Means, and the proposed LE-KCR algorithms are adopted. Remaining energy is another important criterion to estimate the energy efficiency of wireless sensor networks [47]. It can evaluate the network lifetime based on different algorithms. As shown in Figure 6, we consider the area under the curve under the same remaining energy threshold. The proposed LE-KCR algorithm has a better network lifetime than both the LEACH and the LEACK K-Means algorithms with different numbers of transmissions. Compared with LEACH, the performance of the proposed LE-KCR algorithm improves by more than 90%. It also has 40% higher performance than the optimized k-means algorithm based on LEACH. The reason for the performance difference is that the LEACH algorithm randomly chooses cluster-heads without regarding the located site and remaining energy of cluster-heads in the network. The proposed LEACH-K algorithm designates the node nearest to the cluster centroid as the cluster-head. However, in most implementations, the remaining energy of the node and its distance to its sink node are not considered. It can be concluded that the proposed LE-KCR algorithm is more durable than the other two algorithms before the network is disconnected. Photonics 2022, 9, x FOR PEER REVIEW 11 of 14  Figure 7 presents the change in node remaining energy variance when the LEACH, the LEACK K-Means, and the proposed LE-KCR algorithms are employed. 
As depicted in Figure 7, the proposed LE-KCR algorithm balances energy better than both the LEACH and the LEACH-K algorithms across the different transmission rounds. As the number of simulation rounds increases, the variance of the remaining energy shows an upward trend. For instance, after as few as 27 transmission rounds, more than half of the nodes have died under LEACH, at which point the network is considered disconnected. The variance of LEACH before disconnection, and that of LEACH-K after 70 rounds, is about 2200. However, the variance of LE-KCR after 100 rounds is about 700, only about one third of that of LEACH-K. This indicates that the proposed LE-KCR strategy is also significantly better than the other two algorithms in energy balance.

Figure 8 presents the change in node mortality when different network sizes are employed. As revealed in Figure 8, the network size has little impact on node mortality. We vary the number of nodes and calculate the node mortality after 300, 500, and 800 rounds of simulation. For a network with a fixed number of nodes, the node mortality increases significantly as the number of simulation rounds increases. As the network scale increases, however, the node mortality after 300, 500, and 800 rounds fluctuates around 0.11, 0.25, and 0.4, respectively. Because the node locations are random, the network differs from one simulation to the next. Therefore, the measured mortality fluctuates to a certain extent, but within a reasonable range. This indicates that the performance of the proposed LE-KCR algorithm is not greatly affected by changes in network scale.
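The quantities compared in Figures 5 to 8 (number of dead nodes, node mortality, total remaining energy, and variance of remaining energy) can all be derived from the per-node remaining energy at the end of a round. The following is a minimal sketch under stated assumptions, not the simulation code used in this paper: the function names, the `dead_threshold` parameter, the use of the population variance over all nodes, and the "more than half of the nodes are dead" disconnection test are illustrative choices.

```python
from statistics import pvariance


def network_metrics(residual_energy, dead_threshold=0.0):
    """Summarise one round from a list of per-node remaining energies.

    Assumption: a node counts as dead once its energy is at or below
    `dead_threshold` (hypothetical parameter, not taken from the paper).
    """
    n = len(residual_energy)
    dead = sum(1 for e in residual_energy if e <= dead_threshold)
    return {
        "dead_nodes": dead,                    # quantity plotted in Figure 5
        "mortality": dead / n,                 # ratio discussed for Figure 8
        "total_energy": sum(residual_energy),  # quantity plotted in Figure 6
        # Assumption: population variance over all nodes, dead ones included.
        "energy_variance": pvariance(residual_energy),  # Figure 7
    }


def is_disconnected(residual_energy, dead_threshold=0.0):
    """Assumed stopping rule: more than half of the nodes are dead."""
    return network_metrics(residual_energy, dead_threshold)["mortality"] > 0.5
```

Calling `network_metrics` once per round and storing the results gives the curves against the number of rounds; averaging `mortality` over several random deployments would smooth the fluctuation noted for Figure 8.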

Noticeably, this paper assumes that the positions of the sensor nodes do not change. However, in actual deployments, water flow causes the nodes to sway and their positions to change. Therefore, in practice, it is necessary to incorporate an underwater node positioning algorithm to keep the node coordinates up to date. Meanwhile, the water flow also introduces coordinate errors into the position estimation of the anchor nodes. We plan to address these issues in future work.

Summary
This paper presents a clustered routing algorithm, the LE-KCR algorithm, based on the K-means technique for UWSNs. The proposed algorithm selects cluster-heads by considering the K-means clustering result, the relative position of the nodes, their distance to the sink, and their remaining energy. The gap statistic is introduced to select the number of clusters automatically. Through the dual-hop strategy, edge nodes far away from the cluster-head can also access the network without consuming excessive energy. In addition, an appropriate number of anchor nodes is set, which can not only transmit underwater information to the sea but also locate other nodes. This makes the proposed algorithm implementable when node mobility is taken into account. We investigate the performance of the proposed LE-KCR algorithm. The simulation results show that it reduces both the energy consumption and the number of dead nodes compared with the LEACH and LEACH-K algorithms.
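As an illustration of the cluster-head selection summarized above, the following minimal sketch scores every member of one cluster by its remaining energy, its distance to the cluster centroid, and its distance to the sink, and picks the highest-scoring node. It is a sketch under stated assumptions rather than the selection rule of LE-KCR itself: the linear weighted score, the normalization, the weights `w_energy`, `w_centroid`, and `w_sink`, and the function name are all hypothetical.

```python
import math


def select_cluster_head(nodes, sink, w_energy=1.0, w_centroid=1.0, w_sink=1.0):
    """Return the index of the selected cluster-head within one cluster.

    nodes: list of (position, remaining_energy), position being a coordinate
           tuple with the same dimension as `sink`.
    The weighted score below is an illustrative assumption, not the
    criterion defined in the paper.
    """
    positions = [p for p, _ in nodes]
    energy = [e for _, e in nodes]
    dim = len(sink)
    # Centroid of the cluster members (the K-means cluster centre).
    centroid = tuple(sum(p[i] for p in positions) / len(positions)
                     for i in range(dim))
    d_centroid = [math.dist(p, centroid) for p in positions]
    d_sink = [math.dist(p, sink) for p in positions]
    # Normalise each term to [0, 1]; `or 1.0` avoids division by zero.
    max_e = max(energy) or 1.0
    max_c = max(d_centroid) or 1.0
    max_s = max(d_sink) or 1.0

    def score(i):
        # Reward remaining energy, penalise both distances.
        return (w_energy * energy[i] / max_e
                - w_centroid * d_centroid[i] / max_c
                - w_sink * d_sink[i] / max_s)

    return max(range(len(nodes)), key=score)
```

With equal energies the node nearest to the centroid tends to win, which matches the LEACH-K behaviour described earlier; raising `w_energy` shifts the choice toward nodes with more remaining energy.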
