Data Collection Based on Opportunistic Node Connections in Wireless Sensor Networks

The working–sleeping cycle strategy used for sensor nodes with limited power supply in wireless sensor networks can effectively save their energy, but also causes opportunistic node connections due to the intermittent communication mode, which can affect the reliability of data transmission. To address this problem, a data collection scheme based on opportunistic node connections is proposed to achieve efficient data collection in a network with a mobile sink. In this scheme, the mobile sink first broadcasts a tag message to start a data collection period, and all nodes that receive this message will use the probe message to forward their own source information to the mobile sink. On receiving these probe messages, the mobile sink then constructs an opportunistic connection random graph by analyzing the source information included in them, and calculates the optimal path from itself to each node in this random graph, therefore a spanning tree could be generated with the mobile sink play as the root node, finally, it broadcasts this spanning tree so that each node could obtain an optimal path from itself to the mobile sink to forward the sensing data. In addition, a routing protocol that adapts to different nodes operating statuses is proposed to improve the reliability of data transmission. Simulation results show that the proposed scheme works better concerning the packet delivery rate, energy consumption and network lifetime.


Introduction
As a popular way of information access, Wireless Sensor Networks (WSNs) have been widely used in smart industrial fields [1] to undertake tasks such as factory automation, fault diagnosis, fuel consumption monitoring, surveillance, and industrial control, etc. Their application usually involves a large number of static sensor nodes (e.g., photoelectric sensors, ultrasonic sensors, gas sensors, video sensors, etc.) deployed over vast areas to perform data sensing, gathering and communication. However, the sensor nodes in WSNs have limited power supply that is not easy to replenish, which restricts their long-term and sustained work.
The use of mobile sinks in WSNs to replace the traditional static base stations for data collection is believed to be able to improve the network lifetime, and the feasibility of this scheme is demonstrated in details in Reference [2]. Usually, the mobile sink has stronger computing power than common sensor nodes and its energy can be replenished in a timely way. Depending on the particular application needs, the mobile sink can be either part of the external environment or part of the network, and its movement trajectory can also be either controlled or randomly planned [3]. By properly designing properly designing the movement trajectory for the mobile sink, the hot-spot problem [4] in traditional networks can be avoided and the energy consumption of sensor nodes can be more evenly balanced, thus prolonging the network lifetime. To fully explore the role of mobile nodes, in Reference [5], the authors proposed a joint energy replenishment and data collection algorithm, in which a mobile charger is used to replenish the energy for nodes and it can function as a mobile sink when equipped with wireless transceivers. In Reference [6], the authors proposed a Wi-Fi-based passive cyber-physical social sensing approach for tracking people's behaviors by sensing Wi-Fi messages from mobile devices people carry with them. However, these studies did not further consider changing the node operating status to achieve a better network lifetime performance.
To cope with this challenge, researchers have proposed the working-sleeping cycle strategy for sensor nodes, in which nodes can go to sleep to save energy while they are free of work. These studies usually focus on designing node-scheduling schemes to accomplish data transmission, in which the working-sleeping cycle strategy can be categorized into synchronous and asynchronous [7]. Therefore, the changeable node operating status in this strategy is an advantage of these studies, which can prolong the network lifetime. However, the working-sleeping cycle strategy for nodes may cause opportunistic node connections, which results in link instability and affects the data collection efficiency.
In fact, the reasons why opportunistic node connections exist in WSNs can be summarized as follows: (1) WSNs are often deployed in harsh environments where wireless signals are susceptible to interference, which may cause link instability [8], further leading to opportunistic node connections; (2) the movement of nodes usually leads to intermittent connected links and creates some form of opportunistic node connections [9]; (3) due to the limited energy of the nodes, in some scenarios the sensor nodes adopt a working-sleeping cycle strategy to save energy, and accordingly, the adjacent nodes may not be able to communicate with each other continuously [10], thus bringing about opportunistic node connections.
This work considers that all sensor nodes adopt an asynchronous working-sleeping cycle strategy to save energy and a mobile sink is used to collect data in the network, as shown in Figure  1, where the network works based on opportunistic node connections since they may not be able to communicate with each other continuously. In this case, how to model this kind of WSNs to design a scheme, and based on which to ensure the reliability of data transmission in the network becomes a meaningful challenge.  To deal with this problem, the random graph theory originated by the pioneering work of Erdős and Rényi [11] was adopted to model WSNs, in which all sensor nodes are represented by graph vertices while all wireless links are represented by edges. All edges have a probability to reflect the quality of wireless links. The random graph theory that is used in modeling WSNs can be called a random geometric graph, in which an edge can only exist between two adjacent nodes [12]. To make the model more appropriate to the real situation of WSNs, Ren et al. [13] took the reliability of links into consideration to present a novel random geometric graph coverage model. Moreover, Dong et al. [14] described a range-dependent model for WSNs by using random graph theory, of which the probability of an edge existing was expressed as a function of the distance between its two end vertices in the fading environment.
In this work, by taking the randomness of opportunistic node connections into account, we propose a novel random graph model named Opportunistic Connection Random Graph (OCRG) for WSNs. Inspired by the work of Fu et al. [15], who first proposed a scheme to address the connectivity between a designated source and destination in random graphs, in which they assumed that at each step one edge could be tested to see if it existed or not and an optimal policy that minimized the total expected number of steps determined while establishing whether the designated source and destination were connected. As a matter of fact, this work mainly focuses on the connectivity between any source node and the mobile sink in the OCRG, and based on the opportunistic node connections, a data collection scheme is proposed, in this scheme, all nodes can use the calculated optimal path to forward their sensing data to the mobile sink.
In the scheme, first, the mobile sink can launch a data collection task anywhere in network at any time by broadcasting a tag message that declares its data collection period, and each node that receives this message will forward its source information (including its whole working time in this data collection period, its status transition frequency in this period and all its neighbors' IDs) included in a probe message to the mobile sink. Here the energy efficient probe message forwarding mechanism is designed to minimize the overall energy consumption. On receiving these probe messages, the mobile sink first constructs an opportunistic connection random graph by analyzing the source information included in them, then calculates the optimal path from itself to each node in this random graph, and hence a spanning tree could be generated where the mobile sink play as the root node, and finally it broadcasts this spanning tree so that each node can obtain the optimal path from itself to the mobile sink, and according to which to forward the sensing data. In addition, a routing protocol which adapts to different nodes operating statuses is proposed to improve the reliability of data transmission when nodes forwarding their sensing data along the calculated optimal path. The experimental results show that this scheme can effectively improve the reliability of the data transmission while prolonging the network lifetime. The main contributions of this paper are summarized as follows: 1.
An opportunistic connection random graph is constructed to reflect the opportunistic node connections caused by an asynchronous working-sleeping cycle strategy; 2.
For each node, an energy efficient probe message forwarding mechanism is used to minimize the overall energy consumption in network when forwarding the source information to the mobile sink; 3.
A routing protocol which adapts to different nodes operating statuses is proposed to improve the reliability of data transmission when nodes are forwarding their sensing data along the calculated optimal path.
The rest of the paper is organized as follows: Section 2 describes the related work. Section 3 gives the system model. Section 4 presents the proposed data collection scheme in details. Simulation results and performance evaluations are shown in Section 5. Finally, Section 7 summarizes this work and gives some future research directions.

Related Work
In traditional WSNs, all sensor nodes including the sink nodes are static, sensor nodes near the sink have more traffic loads for they have to participate in forwarding sensing data from all other nodes to the sink, therefore they easily exhaust their limited energy, and which will cause network partition. This problem is called the hot-spot problem [16] and it has an adverse effect on the network lifetime.
To address this problem, in recent years researchers have introduced the mobile sink concept to networks. By fully exploiting the mobility of the mobile sink, the nodes around the sink could change continually, therefore the energy consumption in routing is distributed more evenly in the network, and thereby it prolongs the network lifetime. According to the work of the previous researchers, the path design for mobile sinks can be classified into predefined paths [17], controlled paths [18], or uncontrolled (random) paths [19]. By employing these paths, the mobile sink can play an important role in scenarios where nodes suffer opportunistic connections. For example, in Reference [20], the authors considered that the WSNs deployed within hostile environments might suffer from large-scale damage where many sensors failed simultaneously and this would cause the WSNs to split into disconnected segments, and they proposed a two-step heuristic for connecting isolated segments through intermittent links, in which a set of mobile nodes served as mobile data carriers. In Reference [21], the authors considered that in mobile sensor networks, the connectivity between nodes was intermittent due to the blockage of radio signals and nodal mobility, and they proposed a reliable data delivery scheme for the networks with an enhanced delay technique, in which nodes estimated the connectivity and expected inter-encounter time with sink nodes.
All these approaches can equalize the energy consumption of nodes, but they are not flexible enough to prolong the network lifetime because they have not considered designing different node operating statuses. Therefore, the working-sleeping cycle strategy was introduced to WSNs, in which researchers considered that the proper scheduling of sensor nodes to sleep or work could effectively reduce their unnecessary energy consumption, thus prolonging the network lifetime.
According to the node operating status, the researches can easily be divided into two main categories. In the first category, the studies aimed at synchronous working-sleeping cycle strategy, for example, in Reference [22], the authors proposed an event-triggered sleeping energy mode for the synchronous working-sleeping cycle strategy of sensor nodes, in which nodes with no data packets immediately entered sleeping status after the synchronization phase and did not participate in media access contention. However, the synchronization process remains a significant contributor to the energy consumption. To solve the problem, Ng et al. [23] proposed an energy-efficient synchronization algorithm that could reduce energy consumption by adaptively regulating the synchronization traffic and synchronization wakeup period based on the changing network neighborhood conditions through counter-based and exponential-smoothing algorithms. In the second category, the studies focus on asynchronous working-sleeping cycle strategy, in which each node could have an independent working-sleeping scheduling, by scheduling a certain number of nodes to work; they can achieve the coverage for a specific area and ensure the network connectivity in it. In Reference [24], the authors presented a novel algorithm that relied on learning automata to implement sleeping scheduling approaches, and it aimed to activate a minimum number of sensor nodes to achieve coverage of a certain area and guarantee connectivity between these nodes. Mukherjee et al. also focused on network coverage and connectivity issues, in Reference [25], they proposed a sleep-scheduling scheme that ensured a coverage degree requirement based on the dangerous levels of toxic gas leakage area, while maintaining global network connectivity with minimal awake nodes.
All these schemes can reduce the nodes' energy consumption, but the working-sleeping cycle strategy for sensor nodes will cause opportunistic node connections, which can further result in the link instability. Therefore, modelling this kind of WSNs is a challenging task, and thus random graph theory was introduced in WSNs.
In Reference [26], the authors indicated that WSNs could be represented as a graph with a set of vertices consisting of the sensor nodes of the network and a set of edges consisting of the wireless links between the nodes, and they showed that it was appropriate to use a random geometric graph for modeling WSNs. The random geometric graph evolved from the random graph where edges only exist between adjacent nodes. In Reference [27], the authors further considered heterogeneous sensor nodes that may have different transmission ranges and designed a new routing metric, and then they proposed a novel random geometric graph model based on this metric for heterogeneous WSNs. In Reference [28], the authors focused on the weaknesses of modeling WSNs topology by the means of random graph theory, proposed a new weighted topology model for WSNs based on random geometric theory, in which the strength of the transmitted signal would decrease rapidly along the increasing of distance between different nodes was taken into consideration. In Reference [29], the authors proposed a model of WSNs on a two-dimensional plane using site percolation model, a kind of random graph in which edges are formed only between neighboring nodes, and based on which, they investigated WSNs connectivity and energy consumption at percolation threshold when a so-called phase transition phenomena happens. However, in some scenarios, it is not necessary to study the connectivity of the whole network. For example, in Reference [30], authors studied the problem of optimally determining source-destination connectivity of random networks, and by viewing the network as a random graph, they determined an optimal policy for establishing whether the designated source and destination were connected.
In this work, for any node who needs to send its source information that included in the probe message to the mobile sink, an energy efficient probe message forwarding mechanism is proposed. On receiving these probe messages and by analyzing the included source information, the mobile sink can determine the neighbor relationships of these nodes and calculate the link connectivity between any two adjacent nodes, and according to which, the mobile sink can construct an OCRG.

System Model
This section gives an analysis of the network model and the energy model used in this work.

Network Model
The network model used in this paper assumes that all sensor nodes are distributed in region Ω randomly and independently, and each node has a unique ID with radio range r. In addition, all nodes adopt asynchronous working-sleeping cycle strategy for sensing and communication, and the working time α and the sleeping time γ for each node can be anisochronous. Therefore, sensor nodes can communicate with each other only when they are in their working statuses. Moreover, this network supports multiple mobile sinks to collect data and each mobile sink has a unique Sink ID (SID) to distinguish from each other, however, we only use single mobile sink to describe the design principle in this work for simplicity. The mobile sink can move randomly and collect data actively anywhere in network at any time, it will always be in the working status for having sufficient energy and has two adjustable radio range R l and R s , of which R l is the long radio range that is fixed in each data collection period but adjustable in different periods used to broadcast the tag message to launch a data collection task, and R s is the short radio range that is always equal to r used to transfer data. The symbols and parameters used in network are summarized in Table 1.  Figure 2 depicts the asynchronous working-sleeping cycle strategy for sensor nodes. Three given nodes v 1 , v 2 , v 3 are within the radio range of each other, but their operating statuses are asynchronous and their working time and sleeping time are anisochronous. Therefore, they can communicate with each other only when they are all in their working status, such as [t 1 ,t 2 ] or [t 3 ,t 4 ].

Energy Model
In this work, all sensor nodes adopt a simplified energy consumption model [31]. In the process of data collection, the energy consumption mainly consists of the received power and the transmission power. Here, let εamp be the energy consumed in the transmission amplifier, Eelec be the basic energy consumed during transmitting and receiving per bit respectively, and d(vi,vj) be the distance between node νi and νj.
The received power by node νj that receives B-bit data is described as follows: The transmission power during node νi transmits B-bit data to node νj is described as follows:

Data Collection Scheme Based on Opportunistic Node Connections
In the scheme, the mobile sink remains stationary and can determine the collection scope during the data collection. Once this data collection is completed, it can move in the network randomly and launch another data collection task anywhere in the network at any time. The proposed scheme can be presented in detail in the following four phases:

Initialization
To start a data collection period, the mobile sink can launch a data collection task anywhere in network at any time by broadcasting a tag message (including the duration for this data collection and the SID of the sink) by a long radio range Rl, and then it should keep stationary until accomplish this data collection task. Therefore, all nodes under this radio range Rl could receive this tag message.
On receiving a tag message, by obtaining the duration for this data collection included in this message, a node can calculate its whole working time (according to its own working-sleeping scheduling) and its status transition (from the working status to the sleeping status, or vice versa) frequency during this data collection period, which will be used to construct the OCRG in further.
For a node in current data collection scope, the method of Received Signal Strength Indication (RSSI) is used to estimate its distance to the mobile sink, the value of which is used to calculate its Expected Optimal Hops (EOH) (Definition 1) according to Lemma 1. The EOH of a node will be

Energy Model
In this work, all sensor nodes adopt a simplified energy consumption model [31]. In the process of data collection, the energy consumption mainly consists of the received power and the transmission power. Here, let ε amp be the energy consumed in the transmission amplifier, E elec be the basic energy consumed during transmitting and receiving per bit respectively, and d (vi,vj) be the distance between node v i and v j .
The received power by node v j that receives B-bit data is described as follows: The transmission power during node v i transmits B-bit data to node v j is described as follows:

Data Collection Scheme Based on Opportunistic Node Connections
In the scheme, the mobile sink remains stationary and can determine the collection scope during the data collection. Once this data collection is completed, it can move in the network randomly and launch another data collection task anywhere in the network at any time. The proposed scheme can be presented in detail in the following four phases:

Initialization
To start a data collection period, the mobile sink can launch a data collection task anywhere in network at any time by broadcasting a tag message (including the duration for this data collection and the SID of the sink) by a long radio range R l , and then it should keep stationary until accomplish this data collection task. Therefore, all nodes under this radio range R l could receive this tag message.
On receiving a tag message, by obtaining the duration for this data collection included in this message, a node can calculate its whole working time (according to its own working-sleeping scheduling) and its status transition (from the working status to the sleeping status, or vice versa) frequency during this data collection period, which will be used to construct the OCRG in further.
For a node in current data collection scope, the method of Received Signal Strength Indication (RSSI) is used to estimate its distance to the mobile sink, the value of which is used to calculate its Expected Optimal Hops (EOH) (Definition 1) according to Lemma 1. The EOH of a node will be employed to select proper forwarders to minimize the overall energy consumption when forwarding the probe message to the mobile sink.
In order to facilitate the sleeping node to participate in data collection when it wakes up, the mobile sink can choose to re-broadcast the tag message randomly, and in this case, the nodes that have received the same tag message will not deal with it again. Definition 1. Expected Optimal Hops (EOH). The Expected Optimal Hops denote the number of hops it takes to forward a probe message from any node to the mobile sink with the minimized energy consumption.
It should be noted that EOH is different from the shortest path, according to the energy model, the energy consumption for a node to transfer data to the mobile sink is not linearly related to the length of the data transmission path.

Lemma 1.
For any node v i in a data collection scope, its EOH satisfies: where d (vi,vsink) is the distance between node v i and the mobile sink.
Proof of Lemma 1. Assuming that there is B-bit data from node v i should be forwarded to the mobile sink v sink along the path (v i , v sink , k), where k is the hops on the path and satisfies 1 ≤ k ≤ N − 1.
To unify the path representation, v sink is represented by v i+k in the next proof process, from which it means that it needs k hops to forward a probe message from v i to v sink . Then, the energy consumption function with respect to k for this forwarding can be calculated as follows: To get the minimum value of C vi (k), we can use the average value inequality to derive inequality of C vi (k) which can be described as follows: Since the sum of the distances for any two adjacent nodes on the path (v i , v i+k , k) satisfies the following inequality: Therefore, the minimum energy consumption function C min v i (k) with respect to k satisfies: To get the value of k that can minimize C min v i (k), we can take the first derivative of C min v i (k) with respect to k as follows: Let ∂C min v i /∂k equal to 0, we can get a value of k. By using k expected to denote this value, we have: Next, taking the second derivative of C min v i (k) with respect to k and comparing the value of it at k = k expected with 0 as follows: Therefore, the C min v i (k) has a minimum value at k = k expected , and since v i+k represents v sink , so we here define the EOH for node v i is: The proof is completed.

Energy Efficient Probe Message Forwarding Mechanism
For each node in the data collection scope, it needs to send its source information (including its whole working time, its status transition frequency and all its neighbors' IDs) included in the probe messages to the mobile sink. The format of the probe message is given in Table 2. Table 2. The format of the probe message.

Message Source Source Information Sink ID Forwarders ID EOH
The "Message Source" field is used to indicate the ID of the node sending this probe message. The "Source Information" field is used to store the whole working time, the status transition frequency, and the neighbors' IDs of the message source. The "Sink ID" field indicates the SID of the mobile sink launching this data collection task, and which is also the forwarded destination for the probe messages. The "Forwarders ID" field is used to store the IDs of nodes that have forwarded this probe message in sequence. In the process of forwarding a probe message, each time the message is relayed by a node (forwarder), it should add its ID in the "Forwarders ID" field in the probe message. The "EOH" field is used to store the value of EOH for the message source.
Since a probe message can be generated from any node that received a tag message and the opportunistic node connections in the network decrease the efficiency of probe message forwarding, it is necessary to design an energy efficient probe message forwarding mechanism to minimize the overall energy consumption when forwarding them to the mobile sink.
In the mechanism, only the nodes that have received this tag message can participate in the probe message forwarding. When an intermediate node v m receives a probe message, it will check the IDs stored in "Forwarders ID" field in this probe message to learn whether it has forwarded it or not.
Here if v m has not forwarded this message, as shown in Figure 3a, it first updates the "Forwarders ID" field by adding its ID in it and calculates the Number of Forwarders (NF) based on it, by obtaining the value of EOH in this message it then can further calculate (EOH-NF), to find a neighbor whose EOH is closest to it as an optimal forwarder. However, if v m has forwarded this message, as shown in Figure 3b, for sake of exposition, suppose that the IDs in the "Forwarders ID" field are denoted as {v s , . . . , v p , v m , v n }, it first explores the neighbors that have not forwarded this probe message by checking the IDs in this field, and then updates this field by deleting the IDs of v m , v n from it and adds its ID in it. In the next, by calculating NF based on this field and obtaining the value of EOH in this message, v m can further calculate (EOH-NF) to find a neighbor whose EOH is closest to it from the explored neighbors as an optimal forwarder. Considering the asynchronous working-sleeping cycle strategy for nodes, when νm receives this probe message from νp, if only the neighbor νp is in working status, as shown in Figure 3c, it forwards the message to νp. However, if νp is also in sleeping status, as shown in Figure 3d, νm stops forwarding it. Algorithm 1 gives the detailed energy efficient probe message forwarding mechanism. Algorithm 1. Energy efficient probe message forwarding mechanism 1. When an intermediate node νm receives a probe message: 2. if (νm has not forwarded the probe message) 3. Add its ID in the "Forwarders ID" field included in the probe message; 4. Calculate NF based on the "Forwarders ID" field; 5. Calculate (EOH-NF); // the EOH is obtained from the "EOH" field 6. if (νm has neighbors) //νp is not included 7. Select the neighbor whose EOH is closest to (EOH-NF) as an optimal forwarder; 8. Forward the probe message to it; 9. end if 10. else 11. //suppose that νm has forwarded the probe message to νn, but then receives this message again from it 12. Explore the neighbors that have not forwarded this probe message; 13. Delete the IDs of νm, νn from the "Forwarders ID" field; 14. Node νm adds its ID in the "Forwarders ID" field; 15. Calculate NF based on the "Forwarders ID" field; 16. Calculate (EOH-NF); 17. if (the explored neighbors exist) 18. Select the neighbor whose EOH is closest to (EOH-NF) as an optimal forwarder; 19.

Opportunistic Connection Random Graph Construction
After receiving these probe messages, the mobile sink constructs an OCRG (Definition 2) by analyzing the included source information to reflect the opportunistic node connections in this data collection scope.  Considering the asynchronous working-sleeping cycle strategy for nodes, when v m receives this probe message from v p , if only the neighbor v p is in working status, as shown in Figure 3c, it forwards the message to v p . However, if v p is also in sleeping status, as shown in Figure 3d, v m stops forwarding it. Algorithm 1 gives the detailed energy efficient probe message forwarding mechanism. Algorithm 1. Energy efficient probe message forwarding mechanism

1.
When an intermediate node v m receives a probe message:

2.
if (v m has not forwarded the probe message)

3.
Add its ID in the "Forwarders ID" field included in the probe message;

4.
Calculate NF based on the "Forwarders ID" field;

6.
if (v m has neighbors) //v p is not included 7.
Select the neighbor whose EOH is closest to (EOH-NF) as an optimal forwarder;

8.
Forward the probe message to it;

9.
end if 10. else 11. //suppose that v m has forwarded the probe message to v n , but then receives this message again from it 12. Explore the neighbors that have not forwarded this probe message; 13. Delete the IDs of v m , v n from the "Forwarders ID" field; 14. Node v m adds its ID in the "Forwarders ID" field; 15. Calculate NF based on the "Forwarders ID" field; 16. Calculate (EOH-NF); 17. if (the explored neighbors exist) 18. Select the neighbor whose EOH is closest to (EOH-NF) as an optimal forwarder; 19.

Opportunistic Connection Random Graph Construction
After receiving these probe messages, the mobile sink constructs an OCRG (Definition 2) by analyzing the included source information to reflect the opportunistic node connections in this data collection scope.

Definition 2.
Opportunistic Connection Random Graph (OCRG). The OCRG is constructed based on the intermittent communication caused by asynchronous working-sleeping cycle strategy of nodes, which can be analyzed from all the source information, to reflect the opportunistic node connections in the data collection scope, denoted as G(V, E, P e ).
In an OCRG, V denotes a set of all message sources, E denotes a set of all opportunistic connections between any two adjacent nodes in set V, and P e denotes the link connectivity between any two adjacent nodes in set V.
For sake of exposition, the link connectivity P e between any two adjacent nodes v i and v j is denoted as P v i v j . There are two factors affect the value of P v i v j , of which the first one is the whole working time of nodes v i and v j , the longer the nodes' whole working time, the more likely they can communicate with each other. In addition, the second one is the status transition frequencies of nodes v i and v j , which can be explained by that, under the same duration for a data collection, sensor nodes with different status transition frequencies may produce the same whole working time, but nodes with higher status transition frequencies contribute more to improve the link connectivity. This phenomenon can be illustrated in Figure 4, in which the duration for a data collection for the mobile sink is 12 min, and the working-sleeping cycle for node v 1 and v 2 is 3 min and 8 min, respectively, so both of them work 8 min during this process, but the status transition frequency for node v 1 is about 2.3 times that of node v 2 . analyzed from all the source information, to reflect the opportunistic node connections in the data collection scope, denoted as G(V, E, Pe).
In an OCRG, V denotes a set of all message sources, E denotes a set of all opportunistic connections between any two adjacent nodes in set V, and Pe denotes the link connectivity between any two adjacent nodes in set V.
For sake of exposition, the link connectivity Pe between any two adjacent nodes νi and νj is denoted as . There are two factors affect the value of , of which the first one is the whole working time of nodes νi and νj, the longer the nodes' whole working time, the more likely they can communicate with each other. In addition, the second one is the status transition frequencies of nodes νi and νj, which can be explained by that, under the same duration for a data collection, sensor nodes with different status transition frequencies may produce the same whole working time, but nodes with higher status transition frequencies contribute more to improve the link connectivity. This phenomenon can be illustrated in Figure 4, in which the duration for a data collection for the mobile sink is 12 min, and the working-sleeping cycle for node 1 Based on the analysis above, the link connectivity ( i j v v P ) between any two adjacent nodes νi and νj can be calculated as follows: Noted that the mobile sink is always in working status, any of its adjacent node (under its short radio range s R ) that in working status can communicate with it at any time. Hence, for the mobile sink and any of its adjacent node νi, their link connectivity (  Based on the analysis above, the link connectivity (P v i v j ) between any two adjacent nodes v i and v j can be calculated as follows: where T v i and T v j is the whole working time of node v i and v j in a data collection period T p , respectively, while f v i and f v j is the status transition frequencies of node v i and v j in this period, respectively. Here f max is the max value of the status transition frequency obtained from all the probe messages, and it is used to achieve normalization of the node's status transition frequency. Noted that the mobile sink is always in working status, any of its adjacent node (under its short radio range R s ) that in working status can communicate with it at any time. Hence, for the mobile sink and any of its adjacent node v i , their link connectivity (P v i v sin k ) can be affected by the whole working time and the status transition frequency of node v i , and accordingly, P v i v sin k can be calculated as follows: By combing Equations (12) and (13), P e can be described as follows: Figure 5 depicts the link connectivity between four node pairs varies by the max status transition frequency of nodes. The whole working time of nodes in this data collection period is set to a fixed value to explore the impact of different status transition frequency of nodes on the link connectivity. In the figure it can be seen that, with a fixed whole working time of the node, the link connectivity between nodes decreases as the max status transition frequency of nodes increases. For two adjacent nodes with higher status transition frequency, the link connectivity between them is higher. The main reason is that the higher status transition frequency of a node increases the communication opportunities with its neighbors, and further improves the link connectivity between them.  Figure 5 depicts the link connectivity between four node pairs varies by the max status transition frequency of nodes. The whole working time of nodes in this data collection period is set to a fixed value to explore the impact of different status transition frequency of nodes on the link connectivity. In the figure it can be seen that, with a fixed whole working time of the node, the link connectivity between nodes decreases as the max status transition frequency of nodes increases. For two adjacent nodes with higher status transition frequency, the link connectivity between them is higher. The main reason is that the higher status transition frequency of a node increases the communication opportunities with its neighbors, and further improves the link connectivity between them.  Based on the analysis above, the mobile sink can first obtain the message source set V by gathering their IDs stored in the "Message Source" field included in the probe messages, and then obtain the edge set E by analyzing the neighbors' IDs of each message source that stored in the "Source Information" field, by gathering all message sources' whole working time and status transition frequencies stored in the "Source Information" field it can calculate the link connectivity between any two adjacent nodes in set V according to Equation (14), thus constructing an OCRG.
By constructing OCRG, the mobile sink models the opportunistic node connections in the data collection scope, based on which, it could calculate the optimal data forwarding path in advance for each node to send their sensing data to improve the reliability of data transmission. Furthermore, the source node could obtain its data forwarding path in advance therefore reduce the delay in each hop, and due to the intermittent communication mode of the nodes, it could also improve the success rate of the data transmission. Figure 6 depicts an OCRG with three nodes νi, νj and νk that adopt the asynchronous workingsleeping cycle strategy to save energy. Nodes νi and νj are adjacent neighbors while νj and νk are adjacent neighbors, and the link connectivity between them are Based on the analysis above, the mobile sink can first obtain the message source set V by gathering their IDs stored in the "Message Source" field included in the probe messages, and then obtain the edge set E by analyzing the neighbors' IDs of each message source that stored in the "Source Information" field, by gathering all message sources' whole working time and status transition frequencies stored in the "Source Information" field it can calculate the link connectivity between any two adjacent nodes in set V according to Equation (14), thus constructing an OCRG.
By constructing OCRG, the mobile sink models the opportunistic node connections in the data collection scope, based on which, it could calculate the optimal data forwarding path in advance for each node to send their sensing data to improve the reliability of data transmission. Furthermore, the source node could obtain its data forwarding path in advance therefore reduce the delay in each hop, and due to the intermittent communication mode of the nodes, it could also improve the success rate of the data transmission. Figure 6 depicts an OCRG with three nodes v i , v j and v k that adopt the asynchronous working-sleeping cycle strategy to save energy. Nodes v i and v j are adjacent neighbors while v j and v k are adjacent neighbors, and the link connectivity between them are P v i v j and P v j v k , respectively, hence there exists an opportunistic connection. Since node v i is not adjacent to v j , their link connectivity P v i v k is 0.

The Optimal Path Calculation Based on the Opportunistic Connection Random Graph
Once the OCRG is constructed, it needs to find the optimal path from any node to the mobile sink to realize the reliable sensing data forwarding. In this work, a spanning tree algorithm (Algorithm 2) is designed to calculate the optimal path with the max value of Path Connectivity (PC) (Definition 3) from the mobile sink to each node based on this OCRG. By employing this algorithm, the mobile sink can produce a spanning tree, based on which, each node can get its optimal path to the sink with the corresponding PC of it. Therefore, the mobile sink will broadcast this spanning tree to nodes under the long radio range l R .

Definition 3. Path Connectivity (PC).
In the OCRG, the PC of a path is related to the link connectivity between any two adjacent nodes on this path, which is used to reflect the reliability of this path when delivering data. As for a path( i v , i k v + , k), where node i v is the source node, node i k v + represents the node k hops away from i v on this path, k is the hops on the path and satisfies 1 ≤ k ≤ N − 1, the PC of this path is defined as: During constructing this spanning tree, all nodes in the OCRG are subdivided into two sets: (i) set K: including the nodes whose optimal paths from the mobile sink to themselves are known, and (ii) set U: including the remaining nodes. In the beginning, set K includes only the mobile sink while set U includes the remaining nodes in the OCRG, and a node in set U will be transferred to set K if the mobile sink finds the optimal path to it.
The algorithm includes two phases, of which the first phase is initialization. In this phase, the mobile sink calculates the PC of the paths from itself to all nodes in set U as follows: And the nodes on each path can be denoted as follows: In the second phase, the mobile sink selects a node i v with a max value of PC from set U and transfers it to set K, and then updates the PC of the path from sin k v to each node j v in set U but directly connected to i v as follows: For each path, once the PC of it is updated, the nodes on it should be updated as follows:

The Optimal Path Calculation Based on the Opportunistic Connection Random Graph
Once the OCRG is constructed, it needs to find the optimal path from any node to the mobile sink to realize the reliable sensing data forwarding. In this work, a spanning tree algorithm (Algorithm 2) is designed to calculate the optimal path with the max value of Path Connectivity (PC) (Definition 3) from the mobile sink to each node based on this OCRG. By employing this algorithm, the mobile sink can produce a spanning tree, based on which, each node can get its optimal path to the sink with the corresponding PC of it. Therefore, the mobile sink will broadcast this spanning tree to nodes under the long radio range R l . Definition 3. Path Connectivity (PC). In the OCRG, the PC of a path is related to the link connectivity between any two adjacent nodes on this path, which is used to reflect the reliability of this path when delivering data. As for a path (v i , v i+k , k), where node v i is the source node, node v i+k represents the node k hops away from v i on this path, k is the hops on the path and satisfies 1 ≤ k ≤ N − 1, the PC of this path is defined as: During constructing this spanning tree, all nodes in the OCRG are subdivided into two sets: (i) set K: including the nodes whose optimal paths from the mobile sink to themselves are known, and (ii) set U: including the remaining nodes. In the beginning, set K includes only the mobile sink while set U includes the remaining nodes in the OCRG, and a node in set U will be transferred to set K if the mobile sink finds the optimal path to it.
The algorithm includes two phases, of which the first phase is initialization. In this phase, the mobile sink calculates the PC of the paths from itself to all nodes in set U as follows: And the nodes on each path can be denoted as follows: In the second phase, the mobile sink selects a node v i with a max value of PC from set U and transfers it to set K, and then updates the PC of the path from v sin k to each node v j in set U but directly connected to v i as follows: For each path, once the PC of it is updated, the nodes on it should be updated as follows: Repeating the work in second phase until set U is empty, a spanning tree with the mobile sink as the root node can be generated.
The algorithm improves the efficiency of finding the optimal path. If the mobile sink calculates and selects the optimal path of each node directly, for each node, it needs to explore all the possible paths between them to determine which one has the maximum PC, which is inefficient and brings high delay. Furthermore, when the network density increases or the data collection scope expands, the number of the paths from a node to the mobile sink increases greatly, which could further decrease its efficiency. Algorithm 2. The spanning tree algorithm
Output: a spanning tree with the mobile sink as the root node;

3.
Add v sin k into set K;

4.
Add the remaining nodes in the OCRG into set U;

5.
for each node v i in set U do // initialization  Figure 7 depicts the detailed process of constructing the spanning tree step by step in an OCRG with seven nodes. From Figure 7a we can see, sets K and U can be denoted as {v sin k } and {v 1 ,v 2 ,v 3 ,v 4 ,v 5 ,v 6 }, respectively. In the first phase of the algorithm,v sin k calculates the PC of the paths from itself to all nodes in set U and finds the nodes on each path, the results of which are shown in Table 3 and Figure 7b, respectively. The PC[v i ] (i = 1, 2, . . . , 6) used in this table indicates the PC of the path from the mobile sink to node v i , and the value with bold format is the max value of PC in set U. The second phase of this algorithm includes 6 steps, which can be seen from Figure 7c-h. In step 1, node v 1 is transferred to set K due to it has the max value of PC among that of all nodes in set U, which can be seen in the table. By checking the OCRG, v sin k learns that only v 6 in set U is directly connected to v 1 , so it updates the PC of the path from itself to v 6 and the nodes on this path. In the next 5 steps, by repeating the work in step 1, a spanning tree is generated which can be seen in Figure 7i. The node in set K The node is transferred to set K  Step When the spanning tree is generated, the mobile sink should broadcast it to the nodes under its long radio range l R so that they can forward the sensing data to it according to this spanning tree.
On receiving the spanning tree, here it is noticing that, only a node that is on it could obtain the optimal path from itself to the mobile sink, and store the path and the PC of the path locally. However, if a node does not find itself on the spanning tree, it will re-send its probe message to the mobile sink according to the energy efficient probe message forwarding mechanism. Moreover, recalling that the  When the spanning tree is generated, the mobile sink should broadcast it to the nodes under its long radio range R l so that they can forward the sensing data to it according to this spanning tree. On receiving the spanning tree, here it is noticing that, only a node that is on it could obtain the optimal path from itself to the mobile sink, and store the path and the PC of the path locally. However, if a node does not find itself on the spanning tree, it will re-send its probe message to the mobile sink according to the energy efficient probe message forwarding mechanism. Moreover, recalling that the mobile sink can choose to re-broadcast the tag message randomly during the current data collection period, the nodes that have not received this tag message will also send their probe messages to the mobile sink.
On receiving these probe messages, the mobile sink could re-construct an OCRG, then re-calculate the optimal path from itself to each node in this OCRG and update the spanning tree accordingly.
In fact, for a node, its optimal path on the spanning tree is the most reliable path for it to forward its sensing data to the mobile sink. However, due to using the asynchronous working-sleeping cycle strategy for nodes, so during data transmission process along this path, any forwarder on the path may be in sleeping status which could lead to the failed link connection, it further causes the failure of data transmission. To solve this problem, a routing protocol that adapts to different nodes operating statuses is proposed to improve the reliability of data transmission and Algorithm 3 gives the detailed routing protocol. In this routing protocol, assuming that the sensing data from node v s is being forwarded along its optimal path on the spanning tree to the mobile sink. When an intermediate node v m receives the data, it should forward it to its father node v n only if v n is in working status, otherwise, v m should select other forwarder from its neighbors to continue this data forwarding. This kind of forwarder selection method is described as follows.
For each neighbor v i of v m that is on the spanning tree, v m calculates the link connectivity P v m v i between them according to Equation (14), by obtaining the value of PC (PC path(v i ,v sin k , * ) ) stored in v i locally it then can calculate (P v m v i ×PC path(v i ,v sin k , * ) ) to get the PC of the path v m -v i -v sin k , accordingly, v m selects the neighbor on path v m -v i -v sin k with a max value of PC as the forwarder.

1.
Assuming that the sensing data from node v s is forwarded along its optimal path on the spanning tree to the mobile sink;

2.
When an intermediate node v m receives the sensing data:

3.
if (its father node v n is in working status)

4.
Forward the sensing data to v n ;

5.
v n employs this routing protocol to forward the data;

6.
else //v n is in sleeping status

7.
if (v m has neighbors on the spanning tree) 8.
for each neighbor v i do 9.
Calculate P v m v i according to Equation (14) Figure 8 depicts the detailed process of a source node v s forwards its sensing data to v sin k . Node v s forwards the data to its father node v 1 , and then v 1 forwards it to its father node v 2 . Since the father node v 3 of v 2 is in sleeping status, v 2 needs to find other forwarder. For neighbors v 4 and v 6 of v 2 that are on the spanning tree, v 2 calculates link connectivity P v 2 v 4 and P v 2 v 6 , respectively, by obtaining the value of PC stored in v 4 , v 6 locally it then can calculate (P v 2 v 4 ×PC path(v 4 ,v sin k , * ) ) and (P v 2 v 6 ×PC path(v 4 ,v sin k , * ) ), respectively, to get the PC of the path v 2 -v 4 -v 5 -v sin k and path v 2 -v 6 -v 7 -v 8 -v sin k . Since the PC of the path v 2 -v 4 -v 5 -v sin k is greater than that of the path v 2 -v 6 -v 7 -v 8 -v sin k , node v 4 is selected as the forwarder, then the sensing data is forwarded to v sin k along the path v 4 -v 5 -v sin k .
v is selected as the forwarder, then the sensing data is forwarded to sin k v  Figure 8. The process of a source node s v forwards its sensing data to the mobile sink along the optimal path on the spanning tree.

Simulation Environment
Our simulations are performed under Matlab R2014b simulator [32] and the parameters used in this simulation are listed in Table 4. In this simulation, sensor nodes are deployed in a 400 × 400 m 2 region randomly and independently. The random mobility mode has two stages: mobile and stationary, and the mobile sink can independently determine the direction and speed when moving, and it can decide the dwell time when keeping stationary. The duration for a data collection period varies and the mobile sink can change it in different collection tasks, but in our experiments, we fix this value to compare the impact of the other parameters on the performance of the algorithm. For classical data transmission schemes, in ExOR [33], by taking advantages of the broadcast nature of the wireless medium and allowing multiple neighbors that overhear the transmission to participate in forwarding data, it can improve the reliability of data forwarding. In Probabilistic and Opportunistic Flooding Algorithm (POFA) [34], by adopting the controlled transmissions, it can achieve a target reliability on each hop of data forwarding. In Reference [35], a data transmission schemes named Reliable Proliferation Routing with low Duty Cycle (RPRDC) is proposed, by integrates three core concepts namely reliable Figure 8. The process of a source node v s forwards its sensing data to the mobile sink along the optimal path on the spanning tree.

Simulation Environment
Our simulations are performed under Matlab R2014b simulator [32] and the parameters used in this simulation are listed in Table 4. In this simulation, sensor nodes are deployed in a 400 × 400 m 2 region randomly and independently. The random mobility mode has two stages: mobile and stationary, and the mobile sink can independently determine the direction and speed when moving, and it can decide the dwell time when keeping stationary. The duration for a data collection period varies and the mobile sink can change it in different collection tasks, but in our experiments, we fix this value to compare the impact of the other parameters on the performance of the algorithm. For classical data transmission schemes, in ExOR [33], by taking advantages of the broadcast nature of the wireless medium and allowing multiple neighbors that overhear the transmission to participate in forwarding data, it can improve the reliability of data forwarding. In Probabilistic and Opportunistic Flooding Algorithm (POFA) [34], by adopting the controlled transmissions, it can achieve a target reliability on each hop of data forwarding. In Reference [35], a data transmission schemes named Reliable Proliferation Routing with low Duty Cycle (RPRDC) is proposed, by integrates three core concepts namely reliable path finder, a randomized dispersity, and forwarding, it improves the success rate of packet forwarding. In addition, both of them consider unreliable wireless links when making the routing decision. In this work, the performance of the proposed scheme is compared with that of ExOR, POFA and RPRDC in terms of the determined performance metrics, (i) Packet Delivery Ratio (PDR), (ii) Energy Consumption (EC), and (iii) Network Lifetime (NL).

Packet Delivery Ratio
In this work, the PDR is defined as the ratio of packets received by the mobile sink to packets sent by the sensor nodes, and three scenarios are designed to conduct simulations to evaluate the PDR based on the long radio range R l of the mobile sink and the network density. In these experiments, we consider two cases that a node will drop the data packet, first, a node cannot find a forwarder due to the asynchronous working-sleeping cycle strategy for sensor nodes. Second, when data packets are forwarded between two adjacent nodes, a node enters sleeping status and causes transmission failure. As shown in Figure 9, the PDR in POFA increases as the long radio range R l of the mobile sink increases while it decreases in the proposed scheme, ExOR and RPRDC. The reason is that when R l increases, more nodes under the data collection scope will join the data collection process, and due to the network density keeps unchanged, the number of hops these newly joined nodes takes to forward their sensing data to the mobile sink increases correspondingly, which decreases the success rate of delivering the sensing data in the proposed scheme, ExOR and RPRDC. However, in POFA, the sensing data of a node will be forwarded along multiple paths until one of them connected to the mobile sink, as the number of nodes in this scope increase, the number of the possible paths increases, which improves the success rate of data delivering.

Packet Delivery Ratio
In this work, the PDR is defined as the ratio of packets received by the mobile sink to packets sent by the sensor nodes, and three scenarios are designed to conduct simulations to evaluate the PDR based on the long radio range l R of the mobile sink and the network density. In these experiments, we consider two cases that a node will drop the data packet, first, a node cannot find a forwarder due to the asynchronous working-sleeping cycle strategy for sensor nodes. Second, when data packets are forwarded between two adjacent nodes, a node enters sleeping status and causes transmission failure. As shown in Figure 9, the PDR in POFA increases as the long radio range l R of the mobile sink increases while it decreases in the proposed scheme, ExOR and RPRDC. The reason is that when l R increases, more nodes under the data collection scope will join the data collection process, and due to the network density keeps unchanged, the number of hops these newly joined nodes takes to forward their sensing data to the mobile sink increases correspondingly, which decreases the success rate of delivering the sensing data in the proposed scheme, ExOR and RPRDC. However, in POFA, the sensing data of a node will be forwarded along multiple paths until one of them connected to the mobile sink, as the number of nodes in this scope increase, the number of the possible paths increases, which improves the success rate of data delivering.
From the figure it can be seen that, the proposed scheme performs better PDR (almost one time and 72%) than that in ExOR and RPRDC as l R expands, the reason is that, on one hand, a node in the proposed scheme always forwards its sensing data along its optimal path that keeps the maximum PC. On the other hand, the routing protocol in the scheme can adjust the optimal path when it is disconnected due to node sleeping, which can further improve its PDR. As shown in Figure 10, the PDR increases as the network density increases, the proposed scheme achieves almost 130% and 102% improvement in PDR over ExOR and RPRDC, respectively. Since the number of working nodes in network increases with the increased network density, it can improve the network connectivity clearly and therefore increase the overall PDR.
The proposed scheme outperforms POFA when the number of nodes in network is less than 900, but as the network density increases continuously, POFA performs better. This can be explained by From the figure it can be seen that, the proposed scheme performs better PDR (almost one time and 72%) than that in ExOR and RPRDC as R l expands, the reason is that, on one hand, a node in the proposed scheme always forwards its sensing data along its optimal path that keeps the maximum PC. On the other hand, the routing protocol in the scheme can adjust the optimal path when it is disconnected due to node sleeping, which can further improve its PDR.
As shown in Figure 10, the PDR increases as the network density increases, the proposed scheme achieves almost 130% and 102% improvement in PDR over ExOR and RPRDC, respectively. Since the number of working nodes in network increases with the increased network density, it can improve the network connectivity clearly and therefore increase the overall PDR.
The proposed scheme outperforms POFA when the number of nodes in network is less than 900, but as the network density increases continuously, POFA performs better. This can be explained by that in POFA, a node needs to select multiple forwarders to achieve its target reliability, and the improvement of network connectivity increases the possibility of achieving this target reliability rapidly, which further improves its PDR. The main reason that the proposed scheme is always performs better than ExOR and RPRDC can be explained by the different routing strategies they employed.
In RPRDC, the routing is divided into random dispersion and reliability path exploration, of which the packet reception rate, residual energy, and link quality of the node are considered, but it lacks the global information between the node and the sink. Furthermore, ExOR adopts the metric ETX associated with the link connectivity in network, to reflect the expected transmission times required to forward a data from a node to the mobile sink, and it selects a node with a minimum value of ETX as the forwarder during the data forwarding. However, in the proposed scheme, the metric PC reflects the connectivity of a path directly and the routing protocol can guarantee the data always be forwarded along a path with a max value of PC, so it achieves a higher success rate of data delivering than that in ExOR and RPRDC. that in POFA, a node needs to select multiple forwarders to achieve its target reliability, and the improvement of network connectivity increases the possibility of achieving this target reliability rapidly, which further improves its PDR. The main reason that the proposed scheme is always performs better than ExOR and RPRDC can be explained by the different routing strategies they employed. In RPRDC, the routing is divided into random dispersion and reliability path exploration, of which the packet reception rate, residual energy, and link quality of the node are considered, but it lacks the global information between the node and the sink. Furthermore, ExOR adopts the metric ETX associated with the link connectivity in network, to reflect the expected transmission times required to forward a data from a node to the mobile sink, and it selects a node with a minimum value of ETX as the forwarder during the data forwarding. However, in the proposed scheme, the metric PC reflects the connectivity of a path directly and the routing protocol can guarantee the data always be forwarded along a path with a max value of PC, so it achieves a higher success rate of data delivering than that in ExOR and RPRDC. In the next scenario, we conduct a simulation to evaluate the performance of these four algorithms in PDR with a large enough data collection scope, in which the mobile sink keeps stationary in the center of network and the long radio range l R of the mobile sink is set to 283 m so that it can cover the whole network. Figure 11 shows that as the network density increases, the PDR in the proposed scheme first decreases and then increases. The main reason is that increasing the number of nodes not only improves the network connectivity, but also increases the number of packets sent by source nodes, when the number of nodes in network increases from 600 to 1100, the increased packets sent by the source nodes decreases its PDR, however, as the increased number of nodes is greater than 1100, the improved network connectivity increases its PDR.
The PDR in POFA is always higher than that in the proposed scheme, ExOR and RPRDC, which can be explained by that, in POFA, this large data collection scope increases the number of possible paths for a source node to send its sensing data to the mobile sink, which improves the success rate of data delivering greatly. However, in the proposed scheme, ExOR and RPRDC, this large scope increases the number of hops the nodes that near the edge of the data collection scope takes to forward their sensing data, which decreases the success rate of data delivering greatly. In addition, the proposed scheme achieves a better performance in PDR than that in ExOR and RPRDC. Since the node adopts the intermittent communication mode, the delay of data delivering in each hop will affect the success rate of this forwarding, especially when the node is far away from the mobile sink. In ExOR, a node needs to run a protocol to explore the neighbors that received the data and from which to select forwarders, therefore it delays the data forwarding and then decreases the success rate of data delivering. During the reliable path exploration phase of RPRDC, a node needs to calculate the In the next scenario, we conduct a simulation to evaluate the performance of these four algorithms in PDR with a large enough data collection scope, in which the mobile sink keeps stationary in the center of network and the long radio range R l of the mobile sink is set to 283 m so that it can cover the whole network. Figure 11 shows that as the network density increases, the PDR in the proposed scheme first decreases and then increases. The main reason is that increasing the number of nodes not only improves the network connectivity, but also increases the number of packets sent by source nodes, when the number of nodes in network increases from 600 to 1100, the increased packets sent by the source nodes decreases its PDR, however, as the increased number of nodes is greater than 1100, the improved network connectivity increases its PDR.
The PDR in POFA is always higher than that in the proposed scheme, ExOR and RPRDC, which can be explained by that, in POFA, this large data collection scope increases the number of possible paths for a source node to send its sensing data to the mobile sink, which improves the success rate of data delivering greatly. However, in the proposed scheme, ExOR and RPRDC, this large scope increases the number of hops the nodes that near the edge of the data collection scope takes to forward their sensing data, which decreases the success rate of data delivering greatly. In addition, the proposed scheme achieves a better performance in PDR than that in ExOR and RPRDC. Since the node adopts the intermittent communication mode, the delay of data delivering in each hop will affect the success rate of this forwarding, especially when the node is far away from the mobile sink. In ExOR, a node needs to run a protocol to explore the neighbors that received the data and from which to select forwarders, therefore it delays the data forwarding and then decreases the success rate of data delivering. During the reliable path exploration phase of RPRDC, a node needs to calculate the reliability of its neighbors to select the optimal forwarder, thus causing additional delay. Compared with ExOR and RPRDC, in the proposed scheme, the path for a source node to forward its sensing data is calculated in advance, so it has a shorter delay in each hop of data forwarding, which is the main reason that it can achieve almost five times and four times improvement in PDR over ExOR and RPRDC, respectively. In addition, our proposed routing protocol can change the current path adaptively when it is failed, thus blocks the decline of the PDR. reliability of its neighbors to select the optimal forwarder, thus causing additional delay. Compared with ExOR and RPRDC, in the proposed scheme, the path for a source node to forward its sensing data is calculated in advance, so it has a shorter delay in each hop of data forwarding, which is the main reason that it can achieve almost five times and four times improvement in PDR over ExOR and RPRDC, respectively. In addition, our proposed routing protocol can change the current path adaptively when it is failed, thus blocks the decline of the PDR.

Energy Consumption
In this section, two scenarios are designed to evaluate the performance of EC in network. In the first scenario, we evaluate the EC for a source node to forward its sensing data to the mobile sink successfully. In the second scenario, we evaluate the total EC in network varies by time. Figure 12 shows the EC for a source node to forward its sensing data to the mobile sink varies by the network density. In this work, we select the source node whose distance to the mobile sink is half of l R to conduct this experiment and the EC in this work only reflects the energy consumed by the forwarding process when the source node sends its sensing data to the sink successfully.
Lots of tests show that the EC increases as the network density expands and it increases more obviously in POFA than that in the proposed scheme, ExOR and RPRDC. The main reason is that the increased number of nodes increases the hops a source node takes in date delivering, as shown in Figure 13, which further increases the EC.

Energy Consumption
In this section, two scenarios are designed to evaluate the performance of EC in network. In the first scenario, we evaluate the EC for a source node to forward its sensing data to the mobile sink successfully. In the second scenario, we evaluate the total EC in network varies by time. Figure 12 shows the EC for a source node to forward its sensing data to the mobile sink varies by the network density. In this work, we select the source node whose distance to the mobile sink is half of R l to conduct this experiment and the EC in this work only reflects the energy consumed by the forwarding process when the source node sends its sensing data to the sink successfully.
with ExOR and RPRDC, in the proposed scheme, the path for a source node to forward its sensing data is calculated in advance, so it has a shorter delay in each hop of data forwarding, which is the main reason that it can achieve almost five times and four times improvement in PDR over ExOR and RPRDC, respectively. In addition, our proposed routing protocol can change the current path adaptively when it is failed, thus blocks the decline of the PDR.

Energy Consumption
In this section, two scenarios are designed to evaluate the performance of EC in network. In the first scenario, we evaluate the EC for a source node to forward its sensing data to the mobile sink successfully. In the second scenario, we evaluate the total EC in network varies by time. Figure 12 shows the EC for a source node to forward its sensing data to the mobile sink varies by the network density. In this work, we select the source node whose distance to the mobile sink is half of l R to conduct this experiment and the EC in this work only reflects the energy consumed by the forwarding process when the source node sends its sensing data to the sink successfully.
Lots of tests show that the EC increases as the network density expands and it increases more obviously in POFA than that in the proposed scheme, ExOR and RPRDC. The main reason is that the increased number of nodes increases the hops a source node takes in date delivering, as shown in Figure 13, which further increases the EC. Lots of tests show that the EC increases as the network density expands and it increases more obviously in POFA than that in the proposed scheme, ExOR and RPRDC. The main reason is that the increased number of nodes increases the hops a source node takes in date delivering, as shown in Figure 13, which further increases the EC.  The EC in the proposed scheme is slightly higher than that in ExOR, and they are much lower than that in POFA. The reason is that, in POFA, its data forwarding strategy increases its PDR by selecting multiple forwarders but at the expense of energy, as network density expands, the improved network connectivity increases the success rate of each hop of data forwarding, which further increases hops and the times of reliability calculation it takes, so that the EC in it is always high. In the proposed scheme, the neighbor of a node increases as the network density increases, when a node on the optimal path adjusts the path due to the next hop node is in sleeping status, it needs to select an optimal neighbor that can maximized the PC of the path from it to the sink, this process increases the times of calculation by using the PCs of all neighbors, which further increases the EC. Figure 14 shows the EC for a source node to forward its sensing data to the mobile sink grows with the long radio range l R of the mobile sink increases. The node whose distance to the mobile sink is half of l R is selected as the source node. With the increasing of l R , the distance from the source node to the mobile sink increases, so the hops it takes in data forwarding also increases, as shown in Figure 15, which further increases the EC. The EC in the proposed scheme is slightly higher than that in ExOR, and they are much lower than that in POFA. The reason is that, in POFA, its data forwarding strategy increases its PDR by selecting multiple forwarders but at the expense of energy, as network density expands, the improved network connectivity increases the success rate of each hop of data forwarding, which further increases hops and the times of reliability calculation it takes, so that the EC in it is always high. In the proposed scheme, the neighbor of a node increases as the network density increases, when a node on the optimal path adjusts the path due to the next hop node is in sleeping status, it needs to select an optimal neighbor that can maximized the PC of the path from it to the sink, this process increases the times of calculation by using the PCs of all neighbors, which further increases the EC. Figure 14 shows the EC for a source node to forward its sensing data to the mobile sink grows with the long radio range R l of the mobile sink increases. The node whose distance to the mobile sink is half of R l is selected as the source node. With the increasing of R l , the distance from the source node to the mobile sink increases, so the hops it takes in data forwarding also increases, as shown in Figure 15, which further increases the EC.  The EC in the proposed scheme is slightly higher than that in ExOR, and they are much lower than that in POFA. The reason is that, in POFA, its data forwarding strategy increases its PDR by selecting multiple forwarders but at the expense of energy, as network density expands, the improved network connectivity increases the success rate of each hop of data forwarding, which further increases hops and the times of reliability calculation it takes, so that the EC in it is always high. In the proposed scheme, the neighbor of a node increases as the network density increases, when a node on the optimal path adjusts the path due to the next hop node is in sleeping status, it needs to select an optimal neighbor that can maximized the PC of the path from it to the sink, this process increases the times of calculation by using the PCs of all neighbors, which further increases the EC. Figure 14 shows the EC for a source node to forward its sensing data to the mobile sink grows with the long radio range l R of the mobile sink increases. The node whose distance to the mobile sink is half of l R is selected as the source node. With the increasing of l R , the distance from the source node to the mobile sink increases, so the hops it takes in data forwarding also increases, as shown in Figure 15, which further increases the EC.  The EC in the proposed scheme is slightly higher than that in ExOR, but they are much lower than that in POFA. In POFA, each node selects multiple forwarders to achieve the target reliability until one of them connected to the mobile sink, the number of forwarders increases greatly as l R expands, so the increased data delivering and reliability calculation increases its EC. In the proposed scheme, with the expansion of l R , the network density keeps unchanged but the distance from the source node to the mobile sink increases, the success rate of data delivering decreases, which further increases the times of calculating the PC to adjust the data forwarding path. This path adjustment scheme can improve its PDR but increase its EC. Figure 16 shows that the total EC increases by time, in which the mobile sink launches multiple data collection tasks anywhere in network randomly, the proposed scheme consumes 55%, 80% and 61% less energy than that in ExOR, POFA and RPRDC. Here, the energy consumption of nodes is calculated regardless of whether source nodes send their sensing data to the sink successfully or not.
In the proposed scheme, the total EC is mainly consist of the energy consumption caused by the probe message forwarding and the sensing data delivering. By employing the EOH metric, the proposed energy efficient probe message forwarding mechanism can decrease the energy consumption for forwarding the probe message. By analyzing these messages, the mobile sink could construct the spanning tree and broadcast it, so the source node could obtain its optimal path with the max PC to the sink before sending its sensing data. Therefore, during data delivering, a node can save the energy consumed by selecting the forwarders. The EC in the proposed scheme is slightly higher than that in ExOR, but they are much lower than that in POFA. In POFA, each node selects multiple forwarders to achieve the target reliability until one of them connected to the mobile sink, the number of forwarders increases greatly as R l expands, so the increased data delivering and reliability calculation increases its EC. In the proposed scheme, with the expansion of R l , the network density keeps unchanged but the distance from the source node to the mobile sink increases, the success rate of data delivering decreases, which further increases the times of calculating the PC to adjust the data forwarding path. This path adjustment scheme can improve its PDR but increase its EC. Figure 16 shows that the total EC increases by time, in which the mobile sink launches multiple data collection tasks anywhere in network randomly, the proposed scheme consumes 55%, 80% and 61% less energy than that in ExOR, POFA and RPRDC. Here, the energy consumption of nodes is calculated regardless of whether source nodes send their sensing data to the sink successfully or not. is 1100.

Network Lifetime
In this section, the mobile sink can launch multiple data collection tasks anywhere in network randomly, and we conduct simulations to evaluate the performance of NL according to the number of death nodes, of which the NL is defined as the lifetime for 25 percent nodes that exhaust their energy in network.
In  In the proposed scheme, the total EC is mainly consist of the energy consumption caused by the probe message forwarding and the sensing data delivering. By employing the EOH metric, the proposed energy efficient probe message forwarding mechanism can decrease the energy consumption for forwarding the probe message. By analyzing these messages, the mobile sink could construct the spanning tree and broadcast it, so the source node could obtain its optimal path with the max PC to the sink before sending its sensing data. Therefore, during data delivering, a node can save the energy consumed by selecting the forwarders.

Network Lifetime
In this section, the mobile sink can launch multiple data collection tasks anywhere in network randomly, and we conduct simulations to evaluate the performance of NL according to the number of death nodes, of which the NL is defined as the lifetime for 25 percent nodes that exhaust their energy in network.
In Reference [36], the authors indicated that the death of 25 percent nodes in network could block the WSNs operation greatly, in terms of both the imminent topological disruptions and the decrease of the sensor field's coverage, which could further hinder the data delivering heavily. Figure 17 shows the number of death nodes increases by time. In this work, the mobile sink moves randomly in network and launches a data collection task with a period set to 600 s. When the task is completed, the sink could move to other locations in network randomly to continue another data collection.

Discussion
The proposed data collection scheme is also applicable to the scenario where multiple mobile sinks employed in network simultaneously, by considering the distance between the mobile sinks, a feasible scheme is proposed. To show how it works in detail, two mobile sinks are taken as an example, as shown in Figure 18, in which . The mobile sinks move in network randomly, and according to the distance between them, there will be two data collection cases, as shown in Figure 18a In POFA, the node dies faster than that in the proposed scheme, ExOR and RPRDC. The main reason is that it always seeks to improve the reliability of data transmission in each hop, but at the expense of energy consumption, which speeds up the death of nodes. In addition, the time of the death of 200 nodes (25 percent nodes in network) in the proposed scheme is longer than that in ExOR, POFA and RPRDC. The reason is that, despite the nodes on the optimal paths may take more data forwarding tasks, which can reduce their lifetime and furthermore decreases the network connectivity. However, by calculating the optimal path for nodes in advance to forward the data, and employing the flexible routing protocol that could change the data forwarding path adaptively in the low connectivity network, the proposed scheme consumes less energy of nodes, which is also the reason that as the number of the data collection tasks launched by the mobile sink increases, the time for the death of the same number of nodes in the proposed scheme is getting longer than that in ExOR, POFA and RPRDC.

Discussion
The proposed data collection scheme is also applicable to the scenario where multiple mobile sinks employed in network simultaneously, by considering the distance between the mobile sinks, a feasible scheme is proposed. To show how it works in detail, two mobile sinks are taken as an example, as shown in Figure 18, in which R l _1, R l _2, R l _3 and R l _4 are the long radio ranges of v sin k1 , v sin k2 , v sin k3 and v sin k4 . The mobile sinks move in network randomly, and according to the distance between them, there will be two data collection cases, as shown in Figure 18a,b, respectively.

Discussion
The proposed data collection scheme is also applicable to the scenario where multiple mobile sinks employed in network simultaneously, by considering the distance between the mobile sinks, a feasible scheme is proposed. To show how it works in detail, two mobile sinks are taken as an example, as shown in Figure 18, in which . The mobile sinks move in network randomly, and according to the distance between them, there will be two data collection cases, as shown in Figure 18a,b, respectively. In Figure 18a, the distance between the two mobile sinks satisfies: In this scenario, the mobile sinks could use the proposed data collection scheme to perform their own data collection tasks directly.
In Figure 18b, the distance between the two mobile sinks satisfies: In Figure 18a, the distance between the two mobile sinks satisfies: In this scenario, the mobile sinks could use the proposed data collection scheme to perform their own data collection tasks directly.
In Figure 18b, the distance between the two mobile sinks satisfies: In this case, the data collection scope of the two mobile sinks are overlapped. To clarify the data delivering destination for nodes within the overlapping area, when a node receives a tag message for the first time, it takes this SID of a mobile sink included in the message as its data forwarding destination until the data collection task is completed, and during this process, it neglects other mobile sinks' data collection task even if it receives the tag messages they broadcast.
This scheme increases the flexibility of data collection, and by using multiple mobile sinks, it can collect data from multiple network regions simultaneously, which is applicable in prospect applications.

Conclusions
In this paper, we propose a novel data collection scheme based on opportunistic node connections in network with a mobile sink, which can be used in smart industrial fields. In the scheme, the mobile sink can launch a data collection task anywhere in network at any time by broadcasting a tag message, a node that receives this message can forward its probe message that includes its source information to the mobile sink by using the energy efficient probe message forwarding mechanism to minimize the overall energy consumption. On receiving these messages, the mobile sink first adopts the random graph theory to constructs an opportunistic connection random graph, then calculates the optimal path from itself to each node in this graph, hence a spanning tree is generated where the mobile sink is the root node, and it finally broadcasts this spanning tree to all nodes under its long radio range so that they can obtain the optimal paths from themselves to the mobile sink. Accordingly, a routing protocol which adapts to different nodes operating status is proposed to improve the reliability of data transmission. The simulation results show that the proposed scheme works better concerning the packet delivery ratio, energy consumption and network lifetime.
The possible future work are, first, studying the synergy between mobile sinks, and exploring the impact of the comprehensive factors such as the number of mobile sinks and the their long radio ranges on data collection efficiency. Second, studying the energy efficient cross-layer protocol design [37], so that it could be used more widely. Third, inspired by work in Reference [38], of which the authors proposed a fog structure composed of multiple mobile sinks that act as fog nodes to bridge the gap between WSNs and the Cloud, it can consider using mobile sinks to upload data to the cloud and use their powerful computing and storage capability to improve the efficiency of constructing the OCRG and calculate the optimal paths.