An Efficient Topology Discovery Protocol with Node ID Assignment Based on Layered Model for Underwater Acoustic Networks †

Underwater acoustic networks are widely used in survey missions and environmental monitoring. When an underwater acoustic network (UAN) is deployed in a marine region or two UANs merge, each node hardly knows the entire network and may not have a unique node ID. Therefore, a network topology discovery protocol that can complete node discovery, link discovery, and node ID assignment are necessary and important. Considering the limited node energy and long propagation delay in UANs, it is challenging to obtain the network topology with reduced overheads and a short delay in this initial network state. In this paper, an efficient topology discovery protocol (ETDP) is proposed to achieve adaptive node ID assignment and topology discovery simultaneously. To avoiding packet collision in this initial network state, ETDP controls the transmission of topology discovery (TD) packets, based on a local timer, and divides the network into different layers to make nodes transmit TD packets orderly. Exploiting the received TD packets, each node could obtain the network topology and assign its node ID independently. Simulation results show that ETDP completes network topology discovery for all nodes in the network with significantly reduced energy consumption and short delay; meanwhile, it assigns the shortest unique IDs to all nodes with reduced overheads.


Introduction
Underwater acoustic networks (UANs) have become increasingly important, with the advancements in underwater communication technologies and their applications in environmental monitoring, exploration of the oceans, and military missions [1][2][3]. UANs, in general, consist of various sensors and vehicles deployed underwater that are connected via acoustic links to perform collaborative tasks [4]. network merging for UANs with different topologies and sizes. Fourth, the OPNET model for underwater acoustic network topology discovery is constructed, and the effectiveness of the proposed protocol is verified through simulations.
The rest of this paper is organized as follows. Section 2 explains relevant research results on node discovery protocols for UANs. Section 3 presents the system model of the network discovery. Section 4 introduces the proposed ETDP protocol and illustrates it through an example. Section 5 theoretically analyzes and calculates the delay and energy consumption of the proposed protocol. A distributed ID assignment and topology discovery protocol for UANs (DIVE) is considered as a benchmark method, and its performance is analyzed. Section 6 outlines the results of the simulation experiments, and Section 7 concludes the paper.

Related Works
The problem of topology discovery for wireless sensor networks (WSNs) has been extensively studied [25][26][27][28]. However, the topology discovery for UANs is more complicated due to the harsh underwater acoustic channel. This section discusses existing topology discovery protocols in UANs. A taxonomy of existing topology discovery protocols in UANs is presented in Figure 1. According to the network model, node discovery protocol can divided into centralized discovery and distributed discovery.
In the centralized network architecture, there are central nodes that start and control the operation of the protocol. In [29], a primary seed node (node with known co-ordinates) is used to determine secondary and tertiary seed nodes. Co-ordinates of the first three seed nodes are used to build up the relative co-ordinate system through the triangulation method. In [30], the discovery process (Disc) is centrally controlled by a node designated as master node. Each ordinary node that discovers its neighbors sends this information back to the master node. Thus, the master node acquires the newest topology quickly. However, this also results in increased energy consumption. Furthermore, in [31], leader nodes are proposed to control the discovery process (N-Disc). Considering the limited energy of UANs, the transmit power is classified in several levels according to the propagation distance. Although the protocol can reduce the energy consumption after the network topology is obtained, the energy consumption and discovery delay of the protocol are not satisfying. The reason is that nodes need to send packets from low to high power levels several times to determine the minimum transmitting power level for each link. The topology-efficient discovery algorithm (TED) in [32] allows nodes to share time slots while controlling the number of possible collisions such that the delay of the topology discovery process can be reduced. In [33], a collision-free topology discovery protocol is proposed, where each node transmits its discovery packets at a unique time. This approach can also reduce the energy consumption of the network topology discovery.
Opposite to the centralized network architecture, there are no central nodes that start and control the operation of the protocol in a distributed configuration. A design of an automatic node ID assignment and resolution protocol for UANs (AAR) is proposed in [34]. In [35], a distributed ID assignment and topology discovery protocol for UANs is proposed. While assigning the node IDs, additional information is shared to discover the other nodes in the network. The protocol is composed of two main procedures that run simultaneously. The first one is to share the information required to assign the node IDs and to discover the network topology. The second procedure is required to verify that node IDs are unique in the network. Due to the distributed discovery approach, the nodes start working at the same time, resulting in a smaller discovery delay for the distributed protocols than for the centralized protocols. In [36], a neighbor discovery mechanism is presented based on directional transmission and reception. Table 1 provides a comparative summary of the previously discussed protocols. TED and CFVE protocols require time synchronization because they employ time division multiple address (TDMA).
Most of the discovery protocols require knowledge of the number of network nodes. The criteria to measure the protocol performance includes link connections, energy consumption, convergence time or delay, the number of discovered nodes, and transmitted packets. The CFVE and DIVE protocols can adapt to node mobile scenario, while others require settled nodes. Besides, the existing discovery protocols require knowledge of the node ID in advance except for the DIVE protocol. Considering these aspects, our protocol will be compared with the DIVE protocol. Figure 1. Taxonomy of the existing topology discovery protocols in underwater acoustic networks. Note that Disc [30] is a discovery process for initializing underwater acoustic network (UAN); N-Disc [31] is a node discovery protocol for ad hoc UAN; automatic node ID assignment and resolution protocol for UANs (AAR) [34] is an address assignment and resolution protocol for UANs; topology-efficient discovery algorithm (TED) [32] is topology-efficient discovery algorithm for UANs; CFVE [33] is a collision-free topology discovery protocol; and DIVE [35] is a distributed ID assignment and topology discovery protocol for UANs.

System Model
A UAN can be abstracted as a graph G(V, E), where V is the set of all nodes in the network and E consists of all links of the network. Some definitions are as follows: 1. ∀i, j ∈ V is the Euclidean distance between node i and node j. 2. P t and P r are the transmit and receive power of all nodes in the network, respectively. All nodes have the same transmit power P t . 3. d(P t ) represents the transmission range of each node with transmit power P t . 4. Neighbor(i) consists of node i's neighbors that could receive TD packets directly from node i.
The UAN spans multiple hops, which can be presented with a tree-based topology structure. Node T is the root node, neighbors of the root node belong to layer l, layer 1 nodes' children belong to layer 2, and so on. Thus, the entire network is layered into the root node, layer 1, layer 2, . . . , layer L i , . . . , and layer L. L i is the layer of node i.
The following definitions are additionally used: 1. Root node: The root node triggers the beginning of the ETDP. 2. Father node and child node: When node i firstly receives a TD packet from node j that is in layer L i − 1, node i sets node j as its father node. In the same way, when node j receives a TD packet from a layer L i node, whose father node is node j, node j sets it as one of its child nodes. In this network model, each node has one father node; however, it may have several child nodes. Besides, each node can be a father node and a child node at the same time, except for the root node and leaf nodes. 3. Leaf node: Nodes that do not have any child nodes are called leaf nodes. 4. Descendant node: Node i's descendant nodes include all its child nodes and their offspring nodes, which generally have a larger layer number than node i.
To efficiently complete topological discovery and node ID allocation based on a tree structure, all nodes have a unique parent node except for the root node. L i is the layer of node i. D i is the number of descendant nodes of node i, C i is the number of child nodes of node i.
For example, as shown in Figure 2, T is the root node and its layer is 0. Nodes within its transmission range (e.g., nodes a, b and c) are its neighbor nodes, and their layer is 1. As nodes a, b and c have received the TD packet from the root node for the first time, the root node is the father node of a, b and c. In the same way, as the root node can receive the packets from a, b and c, these are neighbors and child nodes of the root node. Accordingly, the ordered pair of nodes (root node| a, b, c) shows that a, b and c are child nodes of the root node and the root node is the father node of nodes a, b and c. Nodes x and y are leaf nodes because they do not have child nodes. Based on the above definition for the descendant node, all nodes except for the root node are descendant nodes of node T.

Proposed ETDP Protocol
Our goal is to achieve topology discovery and adaptive node ID assignment simultaneously with reduced energy consumption and delay.
The following assumptions are made: 1. The number of nodes in the network and the states of connectivity are unknown. 2. The node IDs have not been assigned. 3. One node is set as the root node, which triggers the topology discovery procedure. 4. Nodes communicate in half-duplex mode.
In the initial stage of the network, nodes are unstructured and do not know the information about the surrounding nodes and links. To reduce the energy consumption and delay caused by the unknown network topology, the ETDP protocol establishes the network layered structure through the transmission of TD packets based on a local timer, where nodes are classified into different layers and defined as the root node, father nodes, child nodes, leaf nodes, and descendant nodes, as mentioned in Section 3. Hence, the dynamic topology discovery protocol for UANs could be established based on a layered network model, which controls the transmission of TD packets through a random timer design, and thus, significantly suppresses collisions in the initial network state. Exploiting the received TD packets from neighbors, each node obtains the network topology and assigns its node ID independently.
Two timers are defined for node i: Tx_Timer i and Wait_Timer i . The former is employed to reduce the packet collision probability and the latter is used to collect the descendant topology of the node. The two timers are defined as where K i is a random value generated by node i, T c is the link propagation delay of the TD packet, T p is the transmission delay for the HELLO packet, and Tx_Timer max is the maximum Tx_Timer.
The TD packet includes three types of packets, i.e., HELLO, Disc (discovery), and IDA (ID assignment) packets, which are shown in Figure 3. Here, Type is the TD packet type, F i is the K i value of the node i's father node, ID s is the ID of the sending node, and LIST i shows the partial topology that includes all descendant nodes and neighbor nodes of node i. The detailed algorithm is shown in Algorithm 1. Accordingly, the ETDP protocol consists of three stages: HELLO packet transmission, disc packet transmission, and IDA packet transmission. Firstly, the network is layered and neighbor discovery is achieved through the HELLO packet transmission stage. Then, the entire network topology discovery could be completed at the end of the disc packet transmission stage. Finally, each node computes its unique ID based on the topology information and sends the IDA packet in the last stage.

HELLO Packet Transmission Stage
The root node T initiates the topology discovery process by sending a HELLO packet according to the network TD packet formats shown in Figure 3, with K T = 0 and L T = 0. Then, it starts the Wait_Timer T . Each node that has received a HELLO packet for the first time from the root node would set the root node as its father and set its layer as 1. Then, these nodes generate a random value K i as well as their HELLO packet and start the Tx_Timer i based on Equation (1). After the timeout of Tx_Timer i , they send their HELLO packets to their neighbors and start the Wait_Timer i . Similar operations are performed at the following layers. As such, each node sends its HELLO packet and starts the Wait_Timer i until reaching the nodes at layer L. Therefore, the HELLO packet transmission stage performs the above operations, starting from the root node up to the leaf node of the network. After all network nodes complete the HELLO packets transmission stage, the network is layered into layer 0, layer 1, . . . , layer L i , . . . , layer L and all leaf nodes are confirmed. Meanwhile, each node has its child node and father node, except for the leaf nodes and the root node, respectively.
At this stage, the Tx_Timer i is associated with a random value to enable nodes at the same layer to send packets at different times, which reduces the possibility of packet collisions.

Disc Packet Transmission Stage
ETDP defines the node that has not received any HELLO packets after the Wait_Timer i timeout as a leaf node. When the Wait_Timer i expires, the leaf nodes generate a disc packet and send it to their father node. The node that receives its child node disc packet stores information and updates its disc packet. The father node collects all disc packets from its children during the Wait_Timer i , and updates and sends a disc packet to the upper-layer node when the Wait_Timer i expires. After all nodes have sent disc packets, the network topology information is gathered at the root node.
At this stage, the Wait_Timer i ensures that each node receives packets from all its neighbors. After this stage is completed, each node in the network has more information of the network topology than at the previous stage. Each node i knows all its descendant topology and the number of its descendant nodes D i .

IDA Packet Transmission Stage
The root node sets 0 as its node ID. Then, it transmits the obtained network topology as a single aggregated packet IDA to layer 1 nodes. After receiving an IDA packet from the father node, node i starts the Tx_Timer i based on Equation (1) and checks whether the K i value of node i is included in this packet. If so, it calculates its node ID according to Equations (3)-(5). Then, it generates its IDA packet and sends it after the Tx_Timer i expires. Otherwise, node i does nothing.
Each node i assign its ID i according to where R is the order of nodes' K i values in the same layer L i , ID f denotes the father ID of node i , and D R Li is the number of descendants of node i whose K i order is R at layer L i . We define C i as the number of node i's children. The node obtains the unknown parameters in the Equations (3)-(5) during the shared TD packets and calculates its ID.

Illustration of EDTP
We now illustrate the steps of the ETDP procedure with a 12 nodes network. Figure 4a shows a random network that has no structure. In this network, node 0 is the root node and node IDs are unknown. After all nodes are randomly deployed, the overall structure and connectivity of the network are unknown. Compared with the node deployment of the actual network, Figure 4b shows the topology acquired and ID assignment after the ETDP procedure is completed.   Table 2 shows the detail of ETDP process. In this example, in the HELLO packet transmission stage, node 0, which sets its layer to 0, sends HELLO and starts the Wait_Timer 0 . After broadcasting the HELLO packet, node 0 neighbors (nodes 1 and 2) receive node 0's HELLO. They set their layer to 1 and set node 0 as their father node. After nodes 1 and 2 generate K 1 = 20 and K 2 = 30, they start Tx_Timer 1 = 0.2 s and Tx_Timer 2 = 0.30 s, respectively. When the timers expire, they transmit their HELLO packet and start Wait_Timer 1 and Wait_Timer 2 , separately. Then, node 0 receives the HELLO packets from nodes 1 and 2 and stores its two children nodes. Similarly, when nodes 3 and 4 receive a HELLO packet from node 1, they set their layer to 2 and store node 1 as their father. In this example, nodes 3, 7 and 14 can receive the HELLO packet from node 2. Because node 3 already has a father node, node 3 identifies node 2 as its neighbor. Only nodes 7 and 14 set node 2 as their father. After nodes 3 and 4 generate K 3 = 78 and K 4 = 96 at random, they start the Tx_Timer i . When this expires, they transmit the HELLO packet and start the Wait_Timer i . As time goes on, every node sends its HELLO packet. Table 2 provides a detailed summary of the system states. ETDP uses T_ID i < K i , F i , L i > as the temporary ID of different nodes (e.g., T_ID 1 < 20, 0, 1 > represents node 1). Therefore, node 0 child nodes are nodes T_ID 1 and T_ID 2 . Nodes 14, 7, 13, 11, 9, 10, 12 are leaf nodes, as the number of their children is zero. So, node 12 sets its descendant to 0. When its Wait_Timer 12 expires, node 12 sends the disc packet to the father node 8. After node 8 receives the disc packet from its child node 12, it calculates its D 8 = 1. When its Wait_Timer 8 expires, node 8 sends the disc packet to its father node 7. After node 7 receives the disc packets from its child node 8, it computes it. Then, when its Wait_Timer 7 expires, node 7 sends the disc to its father node 2. Node 2 collects the disc packets from its children nodes 7 and 14 and calculates its Similarly, when its Wait_Timer 2 expires, node 2 sends the disc packets to its father node 0. Finally, when Wait_Timer 0 expires, node 0 collects the disc packets of its children nodes 1 and 2. Then, it ends the disc transmission process and the entire network information is gathered at node 0. After Wait_Timer 0 expires, node 0 sends the IDA packet for ID assignment and topology notification. The ID of node 0 is set to 0. In our example, node 0 has two child nodes 1 and 2 in Figure 4a, which compute their IDs according to Equations (3)-(5) when they received IDA packets. Therefore, Analogously, node ID 2 = 11 has two child nodes 7 and 14 in Figure 4a. After they received the IDA packet from node 2, they computed their IDs as ID 7 = 13 and ID 14 = 12, respectively. After the TD packets transmission and ID assignment, all nodes in the network are assigned a unique address and obtain network topology information.

Theoretical Analysis
The performance of the proposed ETDP protocol is evaluated through analysis, when considering single-hop and multi-hop UANs. Two metrics are investigated to assess the effectiveness and overhead of the ETDP protocol, as follows: • Topology discovery communication traffic: The total communication traffic consumed by network nodes to complete the network topology discovery.

•
Topology discovery duration: The interval from the beginning of the network topology discovery until all nodes in the network obtain the entire topology information.
Meanwhile, the compared protocol DIVE is analyzed, including its communication traffic and discovery duration. Then, it is theoretically proven that the ETDP algorithm discovery communication traffic and discovery duration are better than for the existing DIVE protocol.

Topology Discovery Communication Traffic
For the network topology discovery phase, overhead is an indicator of the protocol performance, which has a linear relationship with the energy expenditure. Let us denote by N the total number of nodes and by N i the number of neighbors of node i in the network. In the HELLO packet transmission stage, the communication traffic is while in the disc packet transmission stage, this is and finally, in the IDA packet transmission stage, it is The total communication traffic required to complete the node ID assignment and topology discovery process is ∑ 3 i=1 bit i . The DIVE protocol is composed of two procedures of the network topology discovery and node ID assignment. In its first procedure, HELLO packets are shared to discover information and assign the node IDs. Table 3 illustrates the HELLO packet length of the DIVE protocol. The discovery HELLO packet format of the benchmark protocol DIVE is where K x is the key value generated by node x; R x is an additional random integer value. isMobile x is a flag set to 1 if node x is mobile, and to 0 otherwise; C x is a counter to track how many HELLO packets have been transmitted by x so far; LIST x is the list of contacts stored by node x. Therefore, the communication traffic required by the DIVE protocol to complete the network topology discovery process is where N p is the total number of sent packets in the network. The paper calculates the communication traffic of the two protocols during topology discovery process. As such, the theoretical energy consumption is derived in Section 6.

Topology Discovery Delay
The goal of topology discovery is that each node of the network obtains the entire network topology and a unique shortest ID.
Topology discovery duration comes from three stages: HELLO packet transmission, disc packet transmission, and IDA packet transmission.
In the HELLO packet transmission stage, the HELLO packet is sent hop-by-hop from the root node until the leaf node. In the disc packet transmission stage, nodes send disc packets hop-by-hop from the leaf node to the root node. In the IDA packet transmission stage, the IDA packet is sent hop-by-hop from the root node until the leaf node. The time required by these three processes is where Tx_Timer max Li is the maximum Tx_Timer Li at layer of L i . Regarding the discovery duration of the DIVE protocol, it is composed of two main procedures that run simultaneously. The first one is to share the information required to assign the node IDs and to discover the network topology. The second procedure is required to verify that node IDs are globally unique in the network.
where C t is the size of the HELLO packet, R is a nominal transmission rate, L is propagation distance, c is the speed of sound, and τ G is the guard interval. DIVE is a distributed protocol in which each node starts discovery at same time. Thus, node i requires T i to complete the discovery procedure shown as Equation (13), where N i max is the maximum N i , µ i is a coefficient, and N c is the number of collisions at node i. For the DIVE protocol, the discovery duration depends on the maximum T i between the network nodes.

Simulation Evaluation
The performance of the proposed ETDP protocol is evaluated via OPNET simulation where the pipe stage of underwater acoustic channel is adopted. In our simulation evaluation, nodes are randomly distributed in an area of 3000 m × 3000 m, where the node communication distance is 700 m and data rate is 7500 bps. The size of network is constant with the number of nodes increases. The number of nodes is varying between 4 and 200. When the maximum number of nodes is considered, the maximum number of layers is 5. The transmission, reception, and idle powers are set to 8 W, 1.3 W, and 0.285 W, respectively, as shown in Table 4. To investigate the influence of the harsh underwater acoustic channel, packet error ratio (PER) of acoustic links is considered in the simulations. We consider different node deployments to analyze the performance of ETDP. The network topology obtained through the proposed ETDP protocol is displayed in OPNET ODB (OPNET Simulation Debugger) window. Figure 5 shows how the connections between the nodes change according to the different number of network layers. Initially, we deploy a network with the maximum layer 1, then the nodes organize a star-topological network by EDTP successfully (Figure 5a). Secondly, we keep the number of nodes unchanged and increase the maximum hop count to 2 (Figure 5b For the convenience of understanding the discovery process, we did not draw the neighbor information of each node; only the nodes with the parent-child relationship were drawn in the Figure 5.
The results show that ETDP can obtain entire network topology with different network size. However, it only obtains neighbor information, number of hops and total number of nodes in DIVE. The discovered topology assumes a star state when the network layer size is small. As the maximum number of layers of the network increases, the discovered network topology appears as a tree. At the same time, for different network topologies, all nodes in the network share TD packets and generate their random values K i , i = 1, . . . , N. Finally, they complete topology obtaining and the random value K i helps network nodes to successfully assign the unique and shortest ID.
We consider the performance of ETDP in terms of energy consumption and convergence delay, and compare it with that of the DIVE algorithm under the same simulation environment and parameters. Figure 6a,b show the total energy consumption and convergence delay versus the number of nodes; these increase as the number of nodes in the network increases. As shown in the Figure 6a, the energy consumption of the proposed ETDP protocol is significantly lower than that of DIVE as the number of nodes increases. ETDP effectively decreases the number of transmission TD packets by using the relationship between the parent and child nodes. After the HELLO packet transmission stage, a ( f ather node|child nodes) relationship is established between nodes, which can partly reduce the unordered collision of TD packets in the initial stage of the network. On the other hand, the protocol has two timers to ensure that the node receives enough TD packets before replying, which reduces the packet transmissions in the network. Thus, the overall optimization of the discovery strategy effectively reduces the energy consumption. As the value of K i , L i and F i in ETDP are randomly generated values, we directly computed the mean for these values in the theoretical analysis. When the number of nodes in the network increases gradually, the mean value is closer to the actual value in ETDP. Whereas, DIVE generates more random packet collisions as the number of nodes increases. The packet retransmission caused by the collisions is not considered in the theoretical analysis in DIVE protocol, resulting in a deviation between the theoretical value and the actual value in DIVE.
After the theoretical analysis of duration, the topology discovery duration of the proposed ETDP protocol is related to the number of layers in the network. The increase of network node number may cause the increased network layer, which extends the convergence delay of ETDP, as shown in Figure 6b. Besides, ETDP uses a relatively less random value and smaller packet size, which reduced the delay to some extent compared with DIVE. In summary, the optimized discovery cycle reduces the number of transmitted packets. Thus, compared to the benchmark method, the proposed protocol obtains the network topology while greatly reducing the time and energy overhead.
(a) (b) Figure 6. The performance comparison between the proposed ETDP and DIVE protocols: when PER = 0, (a) Energy consumption for network topology discovery; (b) Network topology discovery convergence duration.
The number of transmitted TD packets is shown in Figure 7a versus the number of nodes. Results show that when the number of network nodes is small, the number of packets in the ETDP three-stage discovery strategy is slightly larger than for DIVE, for which the number of packet collisions and packet exchanges is lower. On the other hand, when the number of nodes increases, the proposed ETDP protocol outperforms DIVE. Due to the layered mechanism, an increase in the number of nodes does not cause a significant increase in the K value of ETDP. However, the packets sending strategy of DIVE causes notable enlargement of packet collision probability as the number of nodes increases.
To investigate the influence of the harsh underwater acoustic channel, the energy consumption of the two protocols is compared under different packet error ratio (PER), as shown in Figure 7b. With the increasing node number and PER, the energy consumption of two protocols increases. It can be seen that the energy consumption of DIVE is higher than that of ETDP at the same network size and PER, as ETDP has a lower number of packages sent and reduced package sizes. Moreover, results show that the increase in the number of nodes has a relatively larger effect than PER on the energy consumption of the two protocols. Furthermore, the energy consumption of DIVE increases faster than that of ETDP, as the mechanism of distributed flooding is more sensitive to the increase in the number of nodes in DIVE.

Conclusions
In this paper, the problem of obtaining topology information in UANs has been studied, which is required to determine destination nodes, perform routing, and schedule transmissions efficiently. The ETDP protocol has been proposed as a solution to the network topology discovery and node ID assignment. Through TD packet transmission, a layered structure has been built to avoid packets collisions. Moreover, two timers have been proposed to collect packets. Besides, a node IDs assignment has been introduced, based on the layered structure and collected topology information. It has been revealed that ETDP obtains accurate topology information while greatly reducing the overhead and energy consumption. Simulation results have shown that the proposed ETDP protocol completes the ID assignment for all nodes in the network using the collected topology information. Considering the energy constraint of UAN nodes, the proposed ETDP protocol is a suitable network topology discovery mechanism for UANs. Our future work will concentrate on merging the processes for new nodes joining the network and integrating topology discovery with node localizations with reduced overheads.