A Novel Theoretical Probabilistic Model for Opportunistic Routing with Applications in Energy Consumption for WSNs

This paper proposes a new theoretical stochastic model based on an abstraction of the opportunistic model for opportunistic networks. The model is capable of systematically computing the network parameters, such as the number of possible routes, the probability of successful transmission, the expected number of broadcast transmissions, and the expected number of receptions. The theoretical stochastic models usually explored in the literature are based on Markov chains; the main novelty of this paper is the use of a percolation stochastic model, whose main benefit is that the network parameters are obtained directly. Additionally, the proposed approach is capable of dealing with probability values specified by bounded intervals or by a density function. The model is validated via Monte Carlo simulations, and a computational toolbox (R package) is provided to make the results presented in the paper easier to reproduce. The technique is illustrated through a numerical example where the proposed model is applied to compute the energy consumption when transmitting a packet via an opportunistic network.


Introduction
In wireless ad hoc networks, the traditional cellular architecture for multi-hop routing is characterized by the use of communication schemes where the single objective is the transmission between nodes, e.g., hop-by-hop and end-to-end [1,2]. However, such schemes do not exploit an important property inherent to broadcast wireless channels: the simultaneous reception of a single message at multiple nodes. In a broadcast channel, a transmitted packet can reach more than one unit in a neighborhood (multi-cast coverage) around some destination (sink). A communication scheme where the transmitted messages can be received by several units corresponds to an Opportunistic Routing (OR) network (ExOR wireless network protocol), formulated in [3-5]. This particular structure aims to obtain a higher transmission success rate, lower average delay and even smaller energy consumption when compared with traditional transport strategies [6-11]. Those advantages can be achieved because, when the opportunistic routing scheme is implemented, each hop moves the packet farther (on average) than the hops of the best possible predetermined route, implying that the packets may reach the sink using a lower number of hops or jumps [3,12].
Issues such as reduction of the number of transmissions, congestion, and coordination of duplicated messages in ad hoc networks employing OR have been extensively investigated in the literature [10,12-15]. Many research efforts in this area have been performed by means of network simulators [16,17], e.g., using software like Unicast and Multicast Network Simulator, version 2 or 3 (ns-2 and ns-3). On the other hand, in [3,4,8,10,13], one can find some mathematical formulas for the computation of the success probability, the number of transmissions and the cost functions for selecting the best jump. However, to the best of the authors' knowledge, there are no theoretical models (i.e., models that do not rely on approximations via simulation) to compute energy consumption in Wireless Sensor Networks (WSN).
Digital network modeling is certainly a challenging task because phenomena such as channel errors (physical layer) and congestion, collisions, and errors in data packetization (data link and transport layers) can be complex to model in terms of analytical equations, so supporting algorithms are commonly used alongside the equations [18-22]. However, as previously mentioned, the simulation and emulation of networks in the laboratory can be an extensive and complex task. In this sense, a mathematical abstraction that models the flow of packets through closed-form expressions can be considered an idealized model of the network. The use of theoretical models in the context of transportation or allocation of elements from one point to another in a network corresponds to supply-demand networks (SDN) [23], where, as in WSN, it is of interest to study the probability of a successful delivery, or the computation and probability distribution of the possible routes. In this context, the topology and success probabilities of the routes in the network depend on inherent features of the system, such as distances, intermediate hubs, etc. As noted, problems in SDN are similar to the ones in WSN, where the simulation, even for small systems, may take some time and be costly. For this reason, probabilistic models offer a fast and inexpensive solution to these problems, possibly leading to exact results in a split second. The network parameters can be used as reference values (initial or control values) for future real-life experiments or simulations. Furthermore, these models can be extended to more realistic environments by using uncertain or random coefficients. This paper proposes a probabilistic model for OR networks capable of computing the law of route formation, the probability of successful transmission per route and the probability of successful network transmission (source to sink).
To the best of the authors' knowledge, the proposed technique is the first one capable of obtaining these network parameters analytically. Unlike the models available in the literature that use matrix forms based on Markov chains [24-27], the proposed approach is based on a discrete percolation model, which uses directed graph theory, favoring the construction of an iterative model. Closed-form expressions are provided to compute the expected value of the number of broadcasts, packet transmissions and packet receptions. The latter is a useful parameter in network design because it is directly associated with energy consumption cost. The analytical computation of energy consumption (without using a network simulation) minimizes the complexity of the model and provides quick results, making the energy planning of the units easier. Another novelty of the proposed technique is the robustness of the results with respect to the values of the probabilities of transition among nodes. To obtain more accurate results, the probabilities can be considered uncertain and structured in two different ways: (i) the probabilities belong to intervals whose upper and lower bounds are known; (ii) the probabilities are random parameters whose probability density function, which depends on the distance between the nodes, is known. Both cases are related to communication channel noise problems. The investigated network presents a linear topology [26] [Chapter 3.4.1], [27,28]. The proposal is validated via Monte Carlo (MC) simulation and a practical example related to energy consumption in WSN is also provided.
All methods involved are available for interested readers in the free software R (The R Project for Statistical Computing [29]) through the package 'Opportunistic' [30], which provides the full routing scheme, the probability of successful transmission, and the expected number of transmissions, receptions and broadcasts, as well as their respective estimates via MC simulations for a given model.

Network Definition and Preliminary Discussions
In general, a digital network can be viewed as a collection of nodes located in some space, such that each node can be a transmitter or a receiver. The terminology and the probabilistic model of the end-to-end and hop-by-hop transport schemes for a Bernoulli loss process in the channel used in this paper are explained in detail in [31,32]. The network consists of $N + 1$ nodes, labeled $0, 1, \ldots, N$, where $N$ is a finite natural number. At a given instant of time, several nodes transmit simultaneously, each one towards its own receiver. Each transmitter-receiver pair requires its own link. The simplest model of an $N$-hop opportunistic network is analyzed, such that the signal power decays with the distance in a probabilistic way. To this end, each transmitter node $i$ ($0 \leq i \leq N - 1$) is assigned a vector of probabilities (not necessarily summing to one), $p^{(i)} = [p_1, \ldots, p_{N-i}]$, $i = 0, \ldots, N - 1$, such that the $j$th element of $p^{(i)}$ is the probability of a signal transmitted by node $i$ being received by node $i + j$. Furthermore, a linear topology network is considered [26-28], where nodes (units) are linearly spaced between the source and the sink. It is also important to recall that the model does not evolve in time, i.e., all transmission attempts (successful or not) occur simultaneously. Figure 1 shows a simple opportunistic network sample.
One of the issues that must be addressed by the network model is the successful transmission, i.e., when a signal initially transmitted by the source (node 0) reaches the sink (node N) at least once (regardless of the traveled route). The OR network model is related to some dependent discrete percolation model [33]. The diagram depicted in Figure 1 can be seen as a directed multigraph (DMG), named directed because each edge has a direction (from source to sink) and multigraph because multiple edges can have the same end nodes.
Formally, the DMG is a pair $\vec{G} = (V, \vec{E})$, where $V$ is a set of nodes and $\vec{E}$ is a multiset of ordered pairs of nodes, called edges. Each directed edge $ij \in \vec{E}$, $i < j$, $\forall i, j \in V$, is declared open with probability $p_{j-i} > 0$ and closed otherwise. As noted, the probabilities $p_{j-i}$ depend on the distance $j - i$ between the transmitter node $i$ and the receiver node $j$. As a consequence, the possible set of probabilities is $[p_1, p_2, \ldots, p_N]$. Many applications of DMG are usually studied in basic probability courses, such as water supply networks and electrical circuits, among others. For instance, suppose that the objective is to supply water from node 0 to node $N$. Consider as open the edges that allow the passage of water. Then, one is naturally interested in knowing whether the water supplied from node 0, which flows along the open edges only, can reach node $N$. An affirmative answer can be viewed, in fact, as a successful transmission for an opportunistic model.

Possible Routes
First, let $K_N$ denote the set of the different possible routes in an $N$-hop opportunistic network. This is given by
$$K_N = \left\{ \kappa = (k_1, \ldots, k_N) \in \mathbb{N}_0^N : \sum_{i=1}^{N} i\, k_i = N \right\}.$$
There exist $\#(K_N)$ different possible routes in an $N$-hop opportunistic network (in terms of the probability), where $\#(\cdot)$ represents the cardinality of a given set. For instance, for the 3-hop system shown in Figure 2, $K_N$ contains three solution vectors, $\kappa_1 = (3, 0, 0)$, $\kappa_2 = (1, 1, 0)$ and $\kappa_3 = (0, 0, 1)$. Each possible route is characterized by a vector $\kappa$, whose elements $k_i$ represent the number of times that an $i$-hop jump occurs, e.g., $\kappa_1 = (3, 0, 0)$ represents the route (from 0 to 3) as three single jumps, while $\kappa_3 = (0, 0, 1)$ indicates the route as one triple jump. Besides, let $p^{\kappa}$ denote $p_1^{k_1} p_2^{k_2} \cdots p_N^{k_N}$. Thus, one can easily compute the probabilities for each possible route as
$$p^{\kappa_1} = p_1^3, \qquad p^{\kappa_2} = p_1 p_2, \qquad p^{\kappa_3} = p_3.$$
Note that the route $\kappa_2$ may occur as $p_1 p_2$ or $p_2 p_1$, so one must consider all possible permutations. For this, one can use the multinomial coefficient
$$C(\kappa) = \frac{(k_1 + k_2 + \cdots + k_N)!}{k_1!\, k_2! \cdots k_N!}.$$
Then, the three different routes above occur with frequencies $C(\kappa_1) = 1$, $C(\kappa_2) = 2$ and $C(\kappa_3) = 1$, respectively. Hence, the total number of possible routes for an $N$-hop opportunistic model is given by $N_R(N) = \sum_{\kappa \in K_N} C(\kappa)$. For the previous example, one obtains $N_R(3) = 1 + 2 + 1 = 4$, that is, the four routes that can be easily seen from Figure 3. Furthermore, it is possible to find an analytical expression for $N_R(N)$ by using a recursive approach as presented in what follows.
From Equation (1), one obtains $N_R(N + 1) = 2 N_R(N)$ for $N \geq 1$. Finally, since $N_R(1) = 1$, i.e., there exists only one route in a single-hop opportunistic model, one concludes that the total number of routes in an $N$-hop opportunistic model is given by $N_R(N) = 2^{N-1}$. The different routes with their frequencies and probabilities are provided by the routes() function available in the proposed R package. The results provided by this function in a particular example can be found in the Appendix A section, which corroborates the findings in this section.
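The route bookkeeping described above can be checked by brute force. The following is a minimal Python sketch, written independently of the R package (it does not reproduce the routes() function or its output format); the function names are hypothetical. It assumes that a route vector is any non-negative integer vector with $\sum_i i\,k_i = N$, as suggested by the 3-hop example.

```python
from itertools import product
from math import factorial, prod


def route_vectors(n_hops):
    """Yield every kappa = (k_1, ..., k_N) with sum(i * k_i) == N."""
    # k_i can be at most N // i, because i * k_i may not exceed N.
    ranges = [range(n_hops // i + 1) for i in range(1, n_hops + 1)]
    for kappa in product(*ranges):
        if sum(i * k for i, k in enumerate(kappa, start=1)) == n_hops:
            yield kappa


def frequency(kappa):
    """Multinomial coefficient: number of distinct orderings of the jumps."""
    return factorial(sum(kappa)) // prod(factorial(k) for k in kappa)


def route_probability(kappa, p):
    """p^kappa = p_1^k_1 * ... * p_N^k_N for link probabilities p."""
    return prod(p_i ** k for p_i, k in zip(p, kappa))


def number_of_routes(n_hops):
    """Total number of routes, counting every ordering of the jumps."""
    return sum(frequency(kappa) for kappa in route_vectors(n_hops))
```

For small $N$ the enumeration can be compared against the closed-form count $2^{N-1}$; the exhaustive search is only practical for a few hops, which is precisely why a closed form is convenient.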

Full Stochastic Opportunist Model
This section presents the probabilistic model that allows the calculation of the probability of successful transmission and of the expected number of transmissions (assuming independent routes), receptions, and broadcast transmissions. Broadcasts are associated with energy costs and are represented in Figure 2b by the squares. Each broadcast generates at least one transmission, such that the expected number of transmissions allows computing the expected number of receptions of broadcast messages.

Probability of Successful Transmission
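Let $P_S^{(N)}$ denote the probability of a successful transmission for an $N$-hop opportunistic model. As an opportunistic network is, in fact, a parallel network, one has
$$P_S^{(N)} = P\left(\bigcup_{i} O_i\right) = 1 - P\left(\bigcap_{i} O_i^{c}\right),$$
where $P(O_i)$ represents the probability that the $i$-th route is open and $O_i^{c}$ denotes the complement of $O_i$. In light of the simplified network scheme presented in Figure 2b, since the branches of the scheme are treated as independent, $P_S^{(N)}$ can be iteratively computed by
$$P_S^{(N)} = 1 - \prod_{i=1}^{N} \left(1 - p_i P_S^{(N-i)}\right),$$
with the initial condition $P_S^{(0)} = 1$. For instance, considering the model presented in Figure 2b, the computation of $P_S^{(3)}$ requires $P_S^{(1)}$ and $P_S^{(2)}$, which can be calculated as follows:
$$P_S^{(1)} = p_1, \qquad P_S^{(2)} = 1 - \left(1 - p_1 P_S^{(1)}\right)(1 - p_2), \qquad P_S^{(3)} = 1 - \left(1 - p_1 P_S^{(2)}\right)\left(1 - p_2 P_S^{(1)}\right)(1 - p_3).$$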
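As an illustration of the iterative computation, the sketch below assumes the recursion $P_S^{(N)} = 1 - \prod_{i=1}^{N} (1 - p_i P_S^{(N-i)})$ with $P_S^{(0)} = 1$, which is the form one obtains when the $N$ branches leaving a node in the simplified scheme are treated as independent parallel paths. Both this form and the function name are assumptions of the sketch, written in Python rather than with the R package.

```python
def success_probability(p):
    """Return [P_S^(0), ..., P_S^(N)] for link probabilities p = [p_1, ..., p_N].

    Assumes independent branches: a branch of length i succeeds when the
    i-hop jump is received (p_i) and the remaining (N - i)-hop sub-model
    delivers the packet (P_S^(N - i)).
    """
    ps = [1.0]  # P_S^(0) = 1: the packet is already at the sink
    for n in range(1, len(p) + 1):
        all_branches_fail = 1.0
        for i in range(1, n + 1):
            all_branches_fail *= 1.0 - p[i - 1] * ps[n - i]
        ps.append(1.0 - all_branches_fail)
    return ps
```

Storing every intermediate $P_S^{(n)}$ avoids recomputing the nested sub-models, so the cost grows quadratically in $N$ instead of with the number of routes.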

Expected Number of Transmissions
Let $T_N$ be the number of transmissions in an $N$-hop opportunistic model. For $i = 1, \ldots, N$, let $A_i$ denote the event in which the $i$-th node receives the signal from the source. Additionally, let $\mathbb{1}_{A_i}$ denote the indicator function of the event $A_i$, that is, it assumes 1 if the event $A_i$ occurs and 0 otherwise. Thus,
$$T_N = N + \sum_{i=1}^{N-1} \mathbb{1}_{A_i} T_{N-i}.$$
Taking the expectation on both sides yields
$$E[T_N] = N + \sum_{i=1}^{N-1} E\left[\mathbb{1}_{A_i} T_{N-i}\right].$$
Besides, for each $i = 1, \ldots, N - 1$, in light of the law of total probability, it follows that
$$E\left[\mathbb{1}_{A_i} T_{N-i}\right] = E\left[\mathbb{1}_{A_i} T_{N-i} \mid A_i\right] P(A_i) + E\left[\mathbb{1}_{A_i} T_{N-i} \mid A_i^{c}\right] P(A_i^{c}) = p_i E[T_{N-i}],$$
where the second term is canceled since the number of transmissions sent by the $i$-th node, given that it did not receive the packet, is zero, that is, $E[\mathbb{1}_{A_i} T_{N-i} \mid A_i^{c}] = 0$. Therefore, the expectation of the number of transmissions in an $N$-hop opportunistic model is given by
$$E[T_N] = N + \sum_{i=1}^{N-1} p_i E[T_{N-i}]. \tag{5}$$
It is straightforward to note that an $N$-hop opportunistic model contains $N - 1$ nested opportunistic models with $h = 1, 2, \ldots, N - 1$ hops, respectively. For instance, one has from Equation (5) that
$$E[T_2] = 2 + p_1 E[T_1], \qquad E[T_3] = 3 + p_1 E[T_2] + p_2 E[T_1],$$
and so on, where $E[T_1] = 1$ since $P(T_1 = 1) = 1$. Equation (5) provides a simple and clear manner to compute $E[T_N]$ instead of considering all $2^{N-1}$ possible routes, significantly reducing the computational effort.
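The nested structure described above lends itself to a bottom-up computation. The following Python sketch assumes the recursion $E[T_n] = n + \sum_{i=1}^{n-1} p_i E[T_{n-i}]$ with $E[T_1] = 1$; the function name is hypothetical and the code is not taken from the R package.

```python
def expected_transmissions(p):
    """Return [E[T_1], ..., E[T_N]] for link probabilities p = [p_1, ..., p_N].

    The source of an n-hop model contributes n transmissions; node i joins
    with probability p_i and then behaves as the source of an (n - i)-hop
    sub-model.
    """
    e_t = [0.0, 1.0]  # e_t[n] = E[T_n]; index 0 is an unused placeholder
    for n in range(2, len(p) + 1):
        e_t.append(n + sum(p[i - 1] * e_t[n - i] for i in range(1, n)))
    return e_t[1:len(p) + 1]
```

For three hops the result can be compared by hand with $3 + p_1 (2 + p_1) + p_2$, obtained by unrolling the recursion.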

Expected Number of Receptions
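Similar to what was done in the previous section for the expected number of transmissions, one can summarize the total number of receptions in an $N$-hop opportunistic model, denoted by $R_N$, by considering two parts: the number of receptions received directly from the source and the number of receptions received from the routers. Thus, $R_N$ can be written as
$$R_N = \sum_{i=1}^{N} \mathbb{1}_{A_i} + \sum_{i=1}^{N-1} \mathbb{1}_{A_i} R_{N-i},$$
so that
$$E[R_N] = \sum_{i=1}^{N} P(A_i) + \sum_{i=1}^{N-1} E\left[\mathbb{1}_{A_i} R_{N-i}\right], \tag{7}$$
$$E[R_N] = \sum_{i=1}^{N} p_i + \sum_{i=1}^{N-1} p_i E[R_{N-i}]. \tag{8}$$
From (8), one can see that
$$E[R_2] = p_1 + p_2 + p_1 E[R_1], \qquad E[R_3] = p_1 + p_2 + p_3 + p_1 E[R_2] + p_2 E[R_1],$$
and so on, where $E[R_1] = p_1$ as boundary condition. From Equation (7), one can say that the first term represents the expected number of receptions of the packet sent by the initial node and the second term computes the expected number of receptions within the nested sub-models. Equation (8) is expressed in a computationally efficient and friendly manner to obtain $E[R_N]$.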
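The two-part decomposition of the receptions can be sketched in the same bottom-up style. The code below assumes the recursion $E[R_n] = \sum_{i=1}^{n} p_i + \sum_{i=1}^{n-1} p_i E[R_{n-i}]$ with $E[R_1] = p_1$; setting $E[R_0] = 0$ lets both sums be merged into one. The function name is hypothetical.

```python
def expected_receptions(p):
    """Return [E[R_1], ..., E[R_N]] for link probabilities p = [p_1, ..., p_N].

    Each node at distance i receives the source broadcast with probability
    p_i (one reception) and, if it did, adds the receptions of its own
    (n - i)-hop sub-model.
    """
    e_r = [0.0]  # E[R_0] = 0: a node sitting at the sink forwards nothing
    for n in range(1, len(p) + 1):
        e_r.append(sum(p[i - 1] * (1.0 + e_r[n - i]) for i in range(1, n + 1)))
    return e_r[1:]
```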

Expected Number of Broadcast Transmissions
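Let $T^B_N$ be the number of broadcast transmissions in an $N$-hop opportunistic model. It is possible to derive a scheme analogous to Equation (6) to compute $E[T^B_N]$ as follows:
$$E[T^B_2] = 1 + p_1 E[T^B_1], \qquad E[T^B_3] = 1 + p_1 E[T^B_2] + p_2 E[T^B_1],$$
and so on, where $E[T^B_1] = 1$. Finally,
$$E[T^B_N] = 1 + \sum_{i=1}^{N-1} p_i E[T^B_{N-i}]. \tag{10}$$
Note the similarity between (5) and (10): the two recursions differ only in their first term, since a single initial broadcast transmission for an $N$-hop opportunistic model involves $N$ transmissions.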
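The similarity between the broadcast and transmission recursions can be made explicit with a single helper that only varies the contribution of the source. This is a minimal Python sketch under the assumption that both quantities satisfy $E[X_n] = s(n) + \sum_{i=1}^{n-1} p_i E[X_{n-i}]$, with $s(n) = 1$ for broadcasts and $s(n) = n$ for transmissions; all names are hypothetical.

```python
def nested_expectation(p, source_term):
    """Solve E[X_n] = source_term(n) + sum_{i=1}^{n-1} p_i * E[X_{n-i}], n = 1..N."""
    values = [0.0]  # placeholder for n = 0, never read by the sum below
    for n in range(1, len(p) + 1):
        nested = sum(p[i - 1] * values[n - i] for i in range(1, n))
        values.append(source_term(n) + nested)
    return values[1:]


def expected_broadcasts(p):
    """One broadcast per node that holds the packet."""
    return nested_expectation(p, lambda n: 1.0)


def expected_transmissions(p):
    """A broadcast in an n-hop model counts as n transmissions."""
    return nested_expectation(p, lambda n: float(n))
```

Factoring out the source term keeps the two measures from drifting apart if the nested part of the model is later modified.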

Opportunistic Model Considering Random Probabilities
Consider an opportunistic network problem with random probabilities, i.e., each link probability $p_i$ follows a probability density function (pdf) with support on $(0, 1)$. Hereinafter, those pdfs are referred to as prior distributions since they can be chosen in such a way that some prior information about the probabilities can be introduced in the model. Let $p = [p_1, \ldots, p_N]$ be a vector with random entries following a distribution $f_p(\theta)$, namely $p \sim f_p(\theta)$, where "$\sim$" denotes "is distributed as", $\theta$ represents a suitable vector of parameters, and $f_p(\theta)$ is the prior distribution. On the other hand, let $X$ be a random variable of interest that depends on the vector of probabilities $p$. As a consequence, the pdf of $X$ also depends on $p$, and one can say that $X$ follows a distribution $g(\cdot)$ conditioned on the probabilities $p$, represented by $X \sim g(x \mid p)$. For instance, one can consider $X$ as $T_N$, $R_N$ or $T^B_N$. By the law of total expectation [34], one has that the expectation of $X$ can be computed as
$$E[X] = E_p\left[E[X \mid p]\right], \tag{11}$$
where $E[X]$ is computed using two embedded expectations (a double integral), with $E[X \mid p]$ referred to as the inner expectation and $E_p[\cdot]$ as the outer one. Note that the subscript in the expectation represents the variable of integration, which is omitted whenever it can be easily inferred from the context.

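The inner expectation $E[X \mid p]$ is computed with respect to the random variable $X$, given a fixed probability vector $p$. This yields a function depending on $p$, which finally becomes a number ($E[X]$) when one takes the outer expectation over $p$. Even though (11) may appear complex, it has a simple explanation. Considering $X$ as the number of transmissions $T_N$, by Equation (11), Expression (5) can be rewritten as
$$E[T_N] = N + \sum_{i=1}^{N-1} E[p_i]\, E[T_{N-i}]. \tag{12}$$
As can be seen, the only difference between Expressions (5) and (12) is that each probability $p_i$ is replaced by its expectation $E[p_i]$. Similarly, the probability of successful transmission and Expressions (7) and (10) can be written as
$$P_S^{(N)} = 1 - \prod_{i=1}^{N} \left(1 - E[p_i]\, P_S^{(N-i)}\right), \tag{13}$$
$$E[R_N] = \sum_{i=1}^{N} E[p_i] + \sum_{i=1}^{N-1} E[p_i]\, E[R_{N-i}], \tag{14}$$
$$E[T^B_N] = 1 + \sum_{i=1}^{N-1} E[p_i]\, E[T^B_{N-i}]. \tag{15}$$
For the probability of successful transmission $P_S^{(N)}$, it has been assumed that the probabilities $p_i$ are independent, i.e., $f_p(\theta) = \prod_{i=1}^{N} f_{p_i}(\theta_i)$, where each probability $p_i$ follows a pdf $f_{p_i}(\theta_i)$, that is, $p_i \sim f_{p_i}(\theta_i)$, and $\theta$ is a vector of parameters $\theta = [\theta_1, \ldots, \theta_N]$. The proof of (13) is beyond the scope of the present paper and hence is omitted. Expressions (14) and (15) can be easily found following the same steps in (12).
As noted, the probability of successful transmission $P_S^{(N)}$ and the expectations $E[T_N]$, $E[R_N]$ and $E[T^B_N]$ depend on the random probabilities only through their expectations, regardless of the pdf of $p$. Consequently, it is possible to conclude that, for the measures of interest, considering an opportunistic model with random probabilities $p^*$ is equivalent to considering a model with fixed probabilities set as their expectations, that is, $p = E[p^*]$.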
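The two embedded expectations can also be evaluated numerically: the inner expectation through a recursion for a fixed $p$, and the outer one by averaging over draws of $p$. The Python sketch below is only an illustration under stated assumptions: it uses the recursion $E[T_n] = n + \sum_{i=1}^{n-1} p_i E[T_{n-i}]$ for the inner part and, as a hypothetical prior, independent uniform distributions on known intervals; the function names are not part of the R package.

```python
import random


def expected_transmissions(p):
    """Inner expectation E[T_N | p] for a fixed probability vector p."""
    e_t = [0.0, 1.0]  # e_t[n] = E[T_n]; index 0 is an unused placeholder
    for n in range(2, len(p) + 1):
        e_t.append(n + sum(p[i - 1] * e_t[n - i] for i in range(1, n)))
    return e_t[len(p)]


def outer_expectation(bounds, inner, n_draws=20000, seed=1):
    """Monte Carlo estimate of E_p[inner(p)] for p_i uniform on bounds[i].

    The uniform prior is an assumption of this sketch; any sampler for p
    could be substituted.
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        p = [rng.uniform(low, high) for low, high in bounds]
        total += inner(p)
    return total / n_draws
```

Because the inner expectation is non-decreasing in every $p_i$, the estimate necessarily lies between the values obtained at the lower and upper interval bounds, which gives a quick sanity check for the interval-uncertainty case.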

Comparison with Existing Methods in the Literature
There are some interesting works in the literature regarding OR models. For instance, in [25], the effect of the number of re-transmissions in each node is investigated for the case where the transmitted packet does not reach any of the candidate nodes. To this end, the authors established an analogy between a discrete-time Markov chain and OR. The key feature is that the technique can be applied to any kind of network topology, but only one possible candidate forwards the packet at each transmission, i.e., perfect coordination among the candidates is assumed. Moreover, the destination may not be reachable from the source in just one step. In [35], a general framework for OR is proposed in terms of a probabilistic graph where each link is marked with a number between 0 and 1, representing the delivery ratio. Similarly to [25], it can be applied to any kind of network topology.
Besides, a priority function is considered, establishing the priority (order relationship) among the nodes, which is needed to decide which node (among all that received the packet) will broadcast next. The main restriction is that a packet cannot be forwarded by a node with a certain priority toward nodes exhibiting lower priority.
The model proposed in this paper differs from the ones given in [25,35], essentially because all the nodes (candidates) are allowed to forward the packet toward the destination once they receive it, leading to a higher delivery probability. Using a recursive approach, simple closed-form expressions are proposed, circumventing cumbersome mathematical expressions like the ones found in other works. In addition, to compute the full routing probability distribution, no restriction on the probabilities is imposed. Moreover, unlike the other two models, the model proposed here contemplates uncertainty by considering random probabilities following any unknown distribution function, for instance, uniformly distributed on an interval or following a complex bimodal distribution.
As discussed at the end of Section 4, when one considers random probabilities in the proposed model, it is sufficient to know the expected value of each probability p i and not its probability distribution. This feature is very advantageous since the expectations can be easily estimated, if necessary. To the best of the authors' knowledge, this is the only work in the OR literature that can be easily replicated by the interested reader because the entire formulation has been implemented and made available through functions in a free library (see Appendix A section).

Validation of the Model
To validate the proposed model, two different 5-hop opportunistic models are investigated. The first one is a traditional OR model, considering precisely known probabilities, while the second one introduces uncertainties by considering random probabilities on a bounded interval. Simulations using $10^6$ Monte Carlo (MC) realizations were performed to compare the measures of interest with the ones obtained using the proposed expressions.
All methods proposed in this paper, as well as the numerical routines, can be easily reproduced using the R software library "Opportunistic". This library is freely available to practitioners, and it also offers a user-friendly manual (available at https://CRAN.R-project.org/package=Opportunistic, accessed on 30 October 2021).
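To give an idea of what such a comparison involves, the following Python sketch simulates the simplified scheme directly and compares the sample means with recursive expressions. It is not the simulation routine of the R library: it assumes that every node receiving the packet rebroadcasts it as the source of an independent sub-model, that reception at distance $i$ occurs with probability $p_i$, and that the recursions take the forms stated in the docstrings. All function names are hypothetical, and far fewer realizations are used than in the paper.

```python
import random


def simulate_once(p, n_hops, rng):
    """One realization: return (broadcasts, receptions, success)."""
    broadcasts, receptions, success = 1, 0, False
    for i in range(1, n_hops + 1):
        if rng.random() < p[i - 1]:          # node at distance i receives
            receptions += 1
            if i == n_hops:
                success = True               # the sink itself was reached
            else:
                b, r, s = simulate_once(p, n_hops - i, rng)
                broadcasts += b
                receptions += r
                success = success or s
    return broadcasts, receptions, success


def monte_carlo(p, n_runs=20000, seed=1):
    """Sample means of broadcasts, receptions and success over n_runs."""
    rng = random.Random(seed)
    tot_b = tot_r = tot_s = 0
    for _ in range(n_runs):
        b, r, s = simulate_once(p, len(p), rng)
        tot_b, tot_r, tot_s = tot_b + b, tot_r + r, tot_s + s
    return tot_b / n_runs, tot_r / n_runs, tot_s / n_runs


def recursions(p):
    """Return (E[T^B_N], E[R_N], P_S^(N)) from the assumed recursions:
    E[T^B_n] = 1 + sum_{i<n} p_i E[T^B_{n-i}],
    E[R_n]   = sum_{i<=n} p_i (1 + E[R_{n-i}]) with E[R_0] = 0,
    P_S^(n)  = 1 - prod_{i<=n} (1 - p_i P_S^(n-i)) with P_S^(0) = 1.
    """
    e_b, e_r, p_s = [0.0], [0.0], [1.0]
    for n in range(1, len(p) + 1):
        e_b.append(1.0 + sum(p[i - 1] * e_b[n - i] for i in range(1, n)))
        e_r.append(sum(p[i - 1] * (1.0 + e_r[n - i]) for i in range(1, n + 1)))
        fail = 1.0
        for i in range(1, n + 1):
            fail *= 1.0 - p[i - 1] * p_s[n - i]
        p_s.append(1.0 - fail)
    return e_b[-1], e_r[-1], p_s[-1]
```

Calling `monte_carlo(p)` and `recursions(p)` with the same probability vector gives two triples that should agree up to sampling error under the assumptions above; a fixed seed keeps the comparison reproducible.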

Random Probabilities on an OR Model
Consider a 5-hop opportunistic model where each node forwards a packet according to a set of random probabilities $p^*$. Each probability has a pdf as illustrated in Figure 4. In particular, each $p^*_i$ is set to follow a doubly-truncated Beta distribution $\mathrm{Beta}(\alpha, \beta, a, b)$ with parameters $\alpha = \beta = 2$ and truncated support on $(a, b) = (1 - 0.2i, 1.2 - 0.2i)$ for $i = 1, \ldots, 5$. This is a convenient distribution since its support lies within the interval $[0, 1]$ and it can be limited to an arbitrary interval $(a, b) \subseteq [0, 1]$. For instance, $p^*_2 \sim \mathrm{Beta}(2, 2, 0.6, 0.8)$. For each MC realization, the probabilities are generated according to their distribution functions, and the OR process is simulated. Finally, the arithmetic means of the measures of interest are stored as the MC estimates.
For this particular example, the results are summarized in Table 2 and Figure 5. By observing Table 2, one can confirm the claim that considering the expected values as fixed probabilities leads to the same results as the estimates obtained via MC simulation, validating Expressions (12)-(15). Figure 5 shows the estimated density (via smoothing kernel) for the probability of successful transmission P
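The random link probabilities of this example can be generated with a few lines of code. The sketch below is an illustration in Python, not the generator used by the R library, and it assumes that "doubly-truncated" means the standard Beta density restricted to $(a, b)$ and renormalized. Under that assumption, the mean for $\alpha = \beta = 2$ has a simple closed form because the density is proportional to $x(1 - x)$. Function names are hypothetical.

```python
import random


def truncated_beta22_mean(a, b):
    """Mean of the Beta(2, 2) density, x(1 - x), restricted to (a, b)."""
    num = (b**3 / 3 - b**4 / 4) - (a**3 / 3 - a**4 / 4)  # integral of x * x(1 - x)
    den = (b**2 / 2 - b**3 / 3) - (a**2 / 2 - a**3 / 3)  # integral of x(1 - x)
    return num / den


def sample_truncated_beta(alpha, beta, a, b, rng):
    """Rejection sampling: draw Beta(alpha, beta) until it falls in (a, b)."""
    while True:
        x = rng.betavariate(alpha, beta)
        if a < x < b:
            return x


def link_supports(n_hops=5):
    """Supports (1 - 0.2 i, 1.2 - 0.2 i) for i = 1..n_hops, as in the example."""
    return [(1 - 0.2 * i, 1.2 - 0.2 * i) for i in range(1, n_hops + 1)]
```

Rejection sampling is adequate here because each support keeps a non-negligible share of the Beta(2, 2) mass; for much narrower intervals an inverse-CDF sampler would be a more efficient choice.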

Energy Consumption for WSN Opportunist Network
This section presents a practical example where the proposed OR network model can be employed, together with the fundamental concepts for modeling the energy consumption of an OR wireless sensor network. In wireless networks, each network node usually has two operational modes: transmission and reception. Each mode, and the switching between them, has a specific associated energy cost, as can be seen in [36] [Figure 2]. Using the notation from [37], the costs are defined as: transmission cost ($E_{TX}$); reception cost ($E_{RX}$); and switching cost ($E_{SW}$).
The total energy consumption for a generic wireless network is denoted by $E_T$, and it can be obtained as a function of $E_{TX}$, $E_{RX}$ and $E_{SW}$, assuming that the nominal state of a node is reception. Whenever a node is ready to transmit a packet, it must first switch from reception mode to transmission mode, then transmit the packet, and afterwards switch back to reception mode. In any opportunistic network with linear topology, the cost in Joules (J) for the transmission of a single message can be computed by
$$E_T = T^B_N \left(E_{TX} + 2 E_{SW}\right) + N_X E_{RX}, \tag{16}$$
where $T^B_N$ and $N_X$ correspond to the expected number of broadcast transmissions and the expected number of receptions, respectively. It is possible to generalize Equation (16) to take into account, for instance, the packet length or variable power transmission, by including the information about those parameters in the model, as presented in [37].
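The energy bookkeeping described above can be sketched as follows. The Python code assumes that every broadcast costs one transmission plus two mode switches (reception to transmission and back) and that every reception costs $E_{RX}$; this per-broadcast accounting is an assumption derived from the description of the node operation. The expected counts are obtained from recursions of the form $E[T^B_n] = 1 + \sum_{i<n} p_i E[T^B_{n-i}]$ and $E[R_n] = \sum_{i \le n} p_i (1 + E[R_{n-i}])$. The function names are hypothetical, and any cost values passed in are placeholders, not the measurements reported in [36].

```python
def expected_counts(p):
    """Return (expected broadcasts, expected receptions) for an N-hop model."""
    e_b, e_r = [0.0], [0.0]  # index 0: sub-model of length 0 (already at sink)
    for n in range(1, len(p) + 1):
        e_b.append(1.0 + sum(p[i - 1] * e_b[n - i] for i in range(1, n)))
        e_r.append(sum(p[i - 1] * (1.0 + e_r[n - i]) for i in range(1, n + 1)))
    return e_b[-1], e_r[-1]


def energy_per_packet(n_broadcasts, n_receptions, e_tx, e_rx, e_sw):
    """Expected energy for one packet, in the units of the supplied costs.

    Assumption: each broadcast = switch to TX + transmit + switch back to RX.
    """
    return n_broadcasts * (e_tx + 2.0 * e_sw) + n_receptions * e_rx
```

A typical use would be `energy_per_packet(*expected_counts(p), e_tx, e_rx, e_sw)` with the measured costs of the radio module of interest.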

Numerical Example
The energy costs ($E_{TX}$, $E_{RX}$ and $E_{SW}$) used in this example were obtained in [36] via experiments using IEEE 802.15.4 XBee Pro modules. In [36], the cost consumption is The network used in this simulation is the same one employed in Section 6 to transmit a single packet from the source to the sink. The expected energy consumption for the case where the probabilities are precisely known is equal to 1.0503 [µJ], and for the case where the probabilities are uncertain (Figure 4) it is equal to 0.3569 [µJ]. To obtain specific energy consumptions, such as for a cluster or for particular nodes of the OR network, one can follow the same steps from [37] [Appendix A].

Conclusions
This paper proposed a general theoretical stochastic model for opportunistic routing in wireless networks. The results can be obtained for both precisely known and random probabilities. The random probabilities provide versatility, allowing link or transport layer errors to be modeled, which is suitable for representing more realistic environments. Note that the proposed model corresponds to an idealized development of a network, similar to the content of a supply-demand network. The results can be easily obtained through a freely available R package, which lets practitioners either circumvent the use of simulation studies or obtain theoretical reference values for future comparison.
Differently from the models available in the literature, which are based on Markov chains, the proposed technique is capable of obtaining the network parameters (total number of routes, expected number of broadcasts, probability of successful packet transmission, and expected number of receptions) recursively using a percolation stochastic model.
The major advantages of the proposed approach, when compared with other OR models from the literature, include: nodes (candidates) are allowed to forward the packet toward the destination once they receive it, leading to a higher delivery probability; no constraint is imposed on the probabilities; uncertain and random probabilities following any unknown distribution function (for instance, uniformly distributed over an interval or following a complex bimodal distribution) are contemplated in the proposed model; and, regardless of the knowledge of the probabilities, the method provides values similar to the ones obtained via MC simulation. Finally, it is important to mention that the stochastic model is an advantageous tool to compute the energy consumption of a wireless network beforehand, without requiring simulation or practical implementation. This task can be easily accomplished using the R package, which minimizes the time and effort spent in engineering network design.
As future work, the adaptation of non-linear restrictions for the OR model to determine the network parameters is suggested. One particular type of restriction is the ability to delete specific routes, a useful feature to eliminate redundant transmissions, duplicated messages and/or congestion.

Appendix A
and (10). The second one provides its MC estimates for the last node only. The latter also provides a progress bar so the user can estimate the processing time. By default, delta is considered to be zero (no uncertainty) and M = $10^5$ when the number of Monte Carlo realizations is not declared.
The results for the three main functions are shown for the same 5-hop opportunistic model with decreasing probabilities p = [0.85, 0.72, 0.50, 0.28, 0.05] proposed in Section 6.1. The number of possible routes is obtained by