Capacity-Delay Trade-Off in Collaborative Hybrid Ad-Hoc Networks with Coverage Sensing

The integration of ad hoc device-to-device (D2D) communications and open-access small cells can result in a networking paradigm called the hybrid ad hoc network, which is particularly promising in delivering delay-tolerant data. The capacity-delay performance of hybrid ad hoc networks has been studied extensively under a popular framework called scaling law analysis. These studies, however, do not take into account aspects of interference accumulation and queueing delay and, therefore, may lead to over-optimistic results. Moreover, focusing on average measures, existing works fail to give finer-grained insights into the distribution of delays. This paper proposes an alternative analytical framework based on queueing theoretic models and physical interference models. We apply this framework to study the capacity-delay performance of a collaborative cellular D2D network with coverage sensing and two-hop relay. The new framework allows us to fully characterize the delay distribution in the transform domain and pinpoint the impacts of coverage sensing, user and base station densities, transmit power, user mobility and packet size on the capacity-delay trade-off. We show that under the condition of queueing equilibrium, the maximum throughput capacity per device saturates to an upper bound of 0.7239 λb/λu bits/s/Hz, where λb and λu are the densities of base stations and mobile users, respectively.


Introduction
Exploitation of the spatial domain is a primary way to address the challenge of exponential capacity demand in cellular communication networks [1]. Small cells [2] and device-to-device (D2D) communications [3,4] are both effective solutions to enhance cellular network capacity by increasing the spatial reuse factor of the limited spectrum. An alternative approach to address the exploding traffic challenge is to exploit the traffic delay domain. This is motivated by the fact that a large portion of mobile data traffic is consumed by content delivery, which is non-real-time in nature. Unlike real-time services that have strict delay constraints, content delivery services have greater flexibility to be manipulated in the delay domain (e.g., by proactive content pushing) [5,6]. It has been shown that relaxed delay constraints can be traded for capacity. This drives an emerging research field of content-centric mobile communications, which aims to find capacity-efficient solutions for massive content delivery [7][8][9].
The integration of ad hoc D2D communications and open-access small cells can result in a fundamental networking paradigm called the hybrid ad hoc network, which is a promising paradigm for future mobile communication networks. The objective of this paper is to investigate the fundamental trade-off between capacity and delay in such hybrid ad hoc networks. The capacity study of cellular D2D networks can draw on the extensive literature on the capacity of wireless ad hoc networks. Most existing works in this field have adopted a popular information-theoretic framework called scaling law analysis. Gupta and Kumar first proposed this framework and showed that the per-node transport capacity of arbitrary static ad hoc networks scales as 1/√n [10], where n is the number of nodes in the network. This result suggests that the capacity of each node diminishes as n grows large. Subsequent works on static ad hoc networks, such as [11][12][13][14], all lead to similarly pessimistic results.
Based on an important insight that mobility can be exploited to enhance capacity at the expense of increased delay, Grossglauser and Tse [15] showed that in mobile ad hoc networks, a constant per-node throughput can be achieved with a two-hop relaying scheme. Several subsequent works have studied the delay required to achieve a given level of capacity under various mobility models, such as i.i.d. mobility [16], random walk [17][18][19], Brownian motion [20] and Lévy walk [21,22]. The delay required for constant per-node throughput has been shown to scale as fast as the network size.
Apart from mobility, it has been shown that adding infrastructure (e.g., base stations (BSs)) to pure ad hoc networks, resulting in the so-called hybrid wireless networks, can bring significant benefits in terms of capacity and delay. The capacity of hybrid networks with static nodes has been studied in [23][24][25][26][27][28]. It was shown that capacity increases linearly with the number of BSs, given that the number of BSs grows faster than √ n [28]. In [29], it is shown that a constant delay can be achieved. The capacity scaling law of hybrid networks with mobile nodes is studied in [30], where some mobility-dependent extra gains on the capacity are shown. The study of capacity-delay trade-off using the "scaling law" analysis has attracted much research attention in recent years. Research has been extended to address various aspects, such as motion-cast [31,32], multi-cast [33][34][35][36], converge-cast [37], group and correlated mobility [38][39][40][41][42][43], cognitive radio [44,45], etc.
Despite the enormous success and popularity of the "scaling law" framework, this framework also has some limitations. First, for a tractable analysis, the "protocol model" [10] is usually assumed to describe the communication and interfering range of a transmitter. This model, however, does not take into account accumulated interference, which can become significant in dense networks. Second, the delays incurred by buffering and queueing are commonly neglected for simplicity, resulting in potentially underestimated delays. For example, consider a mobile node with a large amount of buffered data and a short time opportunity to access a BS. It is likely that some buffered data cannot be delivered in the first access opportunity and must wait for the next opportunity. As a result, queueing delays are coupled with mobility-related delays, which can potentially lead to a significant increase in the overall delay. It should be noted that the delay we consider in this paper is the fundamental delay caused by ideal (i.e., infinite-buffer) queueing at the physical layer. This delay is different from other studies that considered specific medium access control (MAC) layer functions, such as retransmission schemes [46,47]. Third, previous studies have mostly focused on average measures, e.g., the mean delays. Such an average measure can be misleading in the case of long-tail distributions, in which the mean is biased by infrequent incidents of very large values. Because long-tail delay distributions are common in communication networks, it is very desirable to gain finer-grained insights into the exact distribution of delays.
To address the above limitations, this paper proposes an alternative analytical framework based on queueing theoretic models and physical interference models. Although both models have been used extensively for the performance study of wireless networks, efforts to unify both models in a coherent framework are still rare. Our previous conference paper [48] was an early attempt to propose a unified framework for the performance analysis of hybrid ad hoc networks. The basic idea is to capture the stochastic phenomena of user mobility and coverage outage using queueing dynamics. However, that work was incomplete and did not consider the issue of multi-user access. This paper further extends and refines the unified framework and provides a comprehensive analysis. Specifically, new issues, including multi-user access, the capacity limit and power and rate optimization, are addressed in this paper. The new framework allows us to fully characterize the delay distribution in the transform domain and pinpoint the impacts of user and BS densities, transmit power, user mobility and packet size on the uplink capacity-delay trade-off. We reach the conclusion that the maximum throughput capacity per user is bounded by 0.7239 λb/λu bits/s/Hz, where λb and λu are the densities of base stations and mobile users, respectively.
The remainder of this paper is organized as follows. The system model is described in Section 2. Section 3 introduces the new analytical framework combining queueing models and physical interference models. A detailed characterization of the delay distributions and the fundamental limit of the per-node throughput capacity are presented in Section 4. In Section 5, we discuss rate and power optimization to achieve the minimum average delay. Finally, numerical results are presented in Section 6, and conclusions are drawn in Section 7. For the convenience of the reader, the major parameters defined in this paper are summarized in Table 1.

System Model
We consider a hybrid ad hoc network with small cells and mobile users. We are interested in the uplink scenario, where mobile users transmit messages to the small cell BSs. The small cell BSs are randomly deployed following a two-dimensional homogeneous Poisson point process (PPP) with density λ_b. We assume that a single dedicated frequency band is used by all small cells to provide best-effort coverage in the presence of self-interference. Similarly, the mobile users are assumed to be randomly deployed following a PPP with density λ_u. Each user has a homogeneous throughput capacity demand of C bits/s. More specifically, we assume that each user has an incoming traffic stream with fixed packet size L. It is assumed that all users have identical and random mobility patterns, so that they randomly move in and out of the small cell coverage areas from time to time. The average speed of a user is denoted by v. The duration when a user is not in coverage is called a coverage outage.

User Collaboration Protocol
As illustrated in Figure 1, we consider a user collaboration scheme with two-hop decode-and-forward relay. This simple scheme has been frequently assumed in the literature and has been shown to achieve the optimal scaling in mobile ad hoc networks [15]. Our study focuses on the uplink access scenario, which includes two phases: the broadcast phase and the deliver phase.

• In the broadcast phase, original packets on a device are broadcast in the D2D band with a constant rate R_I and constant power P_I. Nearby users who can successfully decode a packet will store it. Each packet is broadcast only once from its original user.

• In the deliver phase, the original traffic and the traffic received from other users during the broadcast phase are buffered in a queue and wait to be transmitted to a BS. A transmission to the BS starts only when a packet carrier falls within the coverage of a small cell. The packets are transmitted following a first-in-first-out (FIFO) policy until the buffer empties or a coverage outage occurs. The transmit power and rate used to communicate with the BSs are denoted as P_II and R_II, respectively. Once the transmission of the first copy of a packet starts, signaling is performed so that all other copies of the same packet are dropped [15]. In case a packet's transmission is interrupted by a coverage outage, the transmission resumes with the rest of the packet once the user moves into coverage again. In other words, we assume a preemptive-resume queueing policy, noting that our results can be easily extended to a similar preemptive-repeat policy.
The frequency bands used for D2D communications and small cell access are different to avoid interference. Without loss of generality, we assume that both frequency bands have the same normalized bandwidth of one.

Interference Model
Whether a user is within the coverage of a small cell transmitter is determined by its received signal-to-interference-plus-noise ratio (SINR). Unlike "protocol models" [10] that use two idealistic circles to represent the transmit range and interfering range of a transmitter, in this paper, we consider the physical interference model, which accounts for the accumulation of interference from multiple transmitters. Consider a random field of non-collaborative transmitters distributed as a two-dimensional PPP and transmitting with identical power P; the receive SINR at a typical (randomly chosen) location is given by:

γ = Ph / (1 + PI), (1)

where P is the transmit power normalized to the Gaussian noise power, I is the accumulated interference normalized to P, h is the channel gain given by h = gd^{−η}, d is a random variable (RV) denoting the distance between the active user and the inactive user, η is the path loss exponent and the RV g ∼ exp(1) follows an exponential distribution with unit mean to represent the power gain of Rayleigh fading channels. The accumulated interference I is given by:

I = ∑_i g_i d_i^{−η}, (2)

where i is the index of interfering active users, d_i is the distance from the inactive user to the i-th interferer and g_i ∼ exp(1) are RVs to account for Rayleigh fading in the interference channels.
According to the spatial PPP model, the PDF of d (the distance to the nearest transmitter) is given by [49]:

f_d(t) = 2πλt e^{−πλt²}, t ≥ 0, (3)

where λ is the spatial density of transmitters. In the context of wireless networks, the above PDF could result in an unrealistic calculation of the path loss when the common path loss model is applied. When d ∈ (0, 1), we have d^{−η} > 1, implying that the receive power becomes greater than the total transmit power, which is unrealistic. A practical approach to reduce this inaccuracy is to limit the range of d as d ∈ [1, ∞). This leads to a slightly modified PDF given by:

f_d(t) = 2πλt e^{−πλ(t² − 1)}, t ≥ 1. (4)

Numerical results show that the difference between (3) and (4) becomes significant when the transmitter density becomes higher than 0.1 users/m². Consider a typical receiver on the plane; the received SINR is an RV. Following a procedure similar to [50], but applying the modified PDF of d given by (4), the complementary CDF (CCDF) of the SINR for path loss exponent η = 4 can be derived in the closed form of Equations (6) and (7), where b = x/P and Q(·) denotes the Q-function. Given that λ < 0.1 users/m², which suits most practical scenarios, Equation (6) can be well approximated by Equation (8). In the case of P → ∞, Equation (8) can be further simplified to [50]:

F̄_γ(x) = [1 + √x (π/2 − arctan(1/√x))]^{−1}. (9)
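The interference-limited coverage expression can be sanity-checked by direct simulation of the PPP field described above. The sketch below is a minimal Monte Carlo check, assuming the typical receiver is served by its nearest transmitter with η = 4 and Rayleigh fading, and taking the P → ∞ CCDF to be the closed form from [50]; the d ≥ 1 truncation is ignored here, so this is an approximation rather than the exact Equation (6).

```python
import numpy as np

def sinr_ccdf_limit(x):
    """Assumed interference-limited (P -> infinity) SINR CCDF for eta = 4."""
    s = np.sqrt(x)
    return 1.0 / (1.0 + s * (np.pi / 2 - np.arctan(1.0 / s)))

def simulate_ccdf(x, lam=0.01, radius=300.0, trials=2000, seed=0):
    """Monte Carlo: typical receiver at the origin, served by the nearest
    transmitter of a PPP with density lam; all other points interfere."""
    rng = np.random.default_rng(seed)
    area = np.pi * radius**2
    hits = 0
    for _ in range(trials):
        n = rng.poisson(lam * area)
        if n < 2:
            continue                            # need a server and interferers
        r = radius * np.sqrt(rng.random(n))     # uniform in disc == PPP given n
        p = rng.exponential(size=n) * r**-4.0   # Rayleigh fading, eta = 4
        i0 = np.argmin(r)                       # nearest transmitter serves
        sinr = p[i0] / (p.sum() - p[i0])        # no noise: P -> infinity
        hits += sinr > x
    return hits / trials

cf = sinr_ccdf_limit(1.0)     # closed form at threshold x = 1
mc = simulate_ccdf(1.0)       # Monte Carlo estimate
assert abs(mc - cf) < 0.05
```

At x = 1 the closed form evaluates to 1/(1 + π/4) ≈ 0.56, and the simulated CCDF agrees within Monte Carlo error.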

Remarks on System Parameters
Summarizing the above system description, two types of parameters can be distinguished. The first type is the system parameters, including the user packet arrival rate χ, packet size L, user density λ_u, base station density λ_b and user speed v. These are given parameters that cannot be optimized by protocol design. We note that the capacity per user is given by C = χL. The second type is the protocol parameters, including the power parameters P_I and P_II and the rate parameters R_I and R_II. These parameters can be optimized by protocol design.
Based on the above system description, our research objective is to gain theoretical insights into the following questions: (1) How is the distribution of packet delay related to the system and protocol parameters? (2) Given the system parameters, how can the protocol parameters be optimized for delay performance? (3) Is there a fundamental limit on the per-node throughput capacity C? (4) Given optimized protocol parameters, what is the trade-off between capacity and delay, and how does this trade-off change with different system parameters? Before addressing these questions, a new analytical framework is introduced to transform the above system model into a mathematically-tractable queueing model.
The following notation regarding an RV will be applied throughout the text. Given an RV denoted as α, we will use ᾱ to denote its mean, α̂ to denote its second moment, f_α(t) to denote its probability density function (PDF), F_α(t) to denote its cumulative distribution function (CDF), F̄_α(t) to denote its complementary CDF and L_α(s) to denote its Laplace transform. The Laplace transform of an RV is given by:

L_α(s) = E(e^{−sα}),

where E(·) denotes expectation.
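As a quick illustration of this transform-domain notation, the empirical Laplace transform of an exponential RV can be checked against the standard closed form L_α(s) = 1/(1 + s ᾱ) (a textbook identity, not specific to this paper):

```python
import numpy as np

rng = np.random.default_rng(1)
mean = 2.0                                  # alpha_bar
samples = rng.exponential(mean, size=200_000)

def empirical_laplace(samples, s):
    """L_alpha(s) = E(exp(-s * alpha)), estimated from samples."""
    return np.exp(-s * samples).mean()

# For alpha ~ exp(1/mean), the transform is 1 / (1 + s * mean)
for s in (0.5, 1.0, 2.0):
    assert abs(empirical_laplace(samples, s) - 1.0 / (1.0 + s * mean)) < 5e-3
```

The same empirical-transform check applies to any of the RVs (α_d, β_o, ...) introduced later.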

A Queueing Model-Based Analytical Framework
Our analytical framework is based on a queueing model that characterizes the behavior of data buffering, collaborative packet delivery and random processes of coverage outage. This section will explain how parameters of the queueing model can be derived from the various system parameters and protocol parameters introduced in the previous section.

A Queueing Model
Consider the packet transmission process in a typical device; the delays incurred in the different phases can be described by the queueing model illustrated in Figure 2.

Queueing in the Broadcast Phase
In the broadcast phase, original traffic is buffered in a device before it can be broadcast. The queue is characterized by two RVs, α_d and β_d, which represent the random packet arrival interval and the transmission time of a packet, respectively. Under the assumption of a fixed packet size and a constant broadcast rate, β_d becomes a constant given by β_d = L/R_I. Define the load parameter ε_d = β̄_d/ᾱ_d; this parameter represents the fraction of time during which a device is active in broadcasting.
The delays incurred in this queueing process include the waiting time w_I and the completion time z_I. The former is defined as the duration from the arrival of a packet until the start of its transmission. The latter is defined as the duration from the start of a packet's transmission to the end of the transmission. Define the sojourn time as s_I = w_I + z_I; this is the total time a packet spends in the queue.
The number of users that can successfully receive the packet from a broadcasting user is a discrete RV denoted by N. The probability mass function (PMF) of N is denoted by f_N(n). After the broadcast, we call a packet type-n traffic if there are n copies of the packet in the system, i.e., the packet has been successfully broadcast to n − 1 other users.

Effective Traffic
Packets from the original traffic and packets received from other users via broadcast are buffered in a queue before they can be delivered to the BS. These packets, however, may be dropped if one of their copies gets transmitted first from another packet carrier. A rigorous representation of the actual queueing process would require a complicated queueing network model, which is analytically intractable.
To simplify the analysis, we define the "effective traffic" of a device as the packets that eventually get transmitted from the device. Because users are homogeneous, the effective traffic load of a user should be the same as the original traffic load. After the broadcast phase, the average type-n traffic received by a user is given by n f_N(n)/ᾱ_d. Because the n copies undergo independent queueing processes on different users, the probability that a type-n packet gets transmitted from a particular user is 1/n. Therefore, the effective type-n traffic delivered from a user becomes f_N(n)/ᾱ_d. Summing over all traffic types for n ranging from one to ∞, it can be easily shown that the overall effective traffic load of a device equals 1/ᾱ_d, i.e., ∑_{n=1}^{∞} f_N(n)/ᾱ_d = 1/ᾱ_d. Because non-effective traffic packets are dropped before transmission as if they had never arrived at the device, only the effective traffic contributes to the actual queueing delays.
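The thinning argument above can be checked numerically: whatever the copy-count PMF, electing one carrier per packet uniformly at random leaves the per-user effective load equal to the original load. A small sketch follows, where the Poisson copy count is a hypothetical stand-in for f_N:

```python
import numpy as np

rng = np.random.default_rng(2)
U, packets_per_user = 100, 200    # users and original packets per user

transmit_count = np.zeros(U, dtype=int)
for src in range(U):
    for _ in range(packets_per_user):
        n = 1 + rng.poisson(3)                          # hypothetical copy count N >= 1
        carriers = rng.choice(U, size=min(n, U), replace=False)
        winner = rng.choice(carriers)                   # copy delivered first; rest dropped
        transmit_count[winner] += 1

# Exactly one transmission per original packet: effective load == original load
assert transmit_count.sum() == U * packets_per_user
assert abs(transmit_count.mean() - packets_per_user) < 1e-9
```

The carrier sets here are drawn uniformly for simplicity; in the model they would be the broadcast neighbors, but the load-conservation conclusion is the same.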

Queueing in the Deliver Phase
A preemptive-resume priority queueing model is used to describe the queueing behavior in the deliver phase. This model assumes two classes of independent traffic. The first class represents the coverage outage process, while the second class represents the effective traffic. The first class has absolute priority over the second class. This means that once a coverage outage occurs, the current transmission is stopped and must wait until the next coverage opportunity.
The effective traffic is characterized by two RVs, α_e and β_e. The former characterizes the arrival intervals of effective traffic packets, while the latter characterizes the uninterrupted transmission time of a packet. The outage process is characterized by α_o and β_o. The former represents the random duration between the arrivals of two outages, while the latter describes the random duration of an outage. Define the load parameters ε_e = β̄_e/ᾱ_e and ε_o = β̄_o/ᾱ_o. The combined load of the two classes of traffic is ε_II = ε_e + ε_o, and a stable queue requires ε_II < 1.
Delays incurred in this phase include the waiting time w_II and the completion time z_II. We note that the completion time z_II is not the same as the transmission time β_e: the former takes into account cases in which the transmission of a packet is interrupted by a coverage outage, so that the time taken to complete the transmission is prolonged by random coverage outages. The sojourn time in Phase II is s_II = w_II + z_II.

Analysis of Queueing Parameters
So far, we have introduced the seven RVs that characterize our queueing model: α_d, β_d, α_e, β_e, α_o, β_o and N. We will subsequently show how these RVs are related to the various system and protocol parameters introduced in Section 2. A summary of the relationships among the various parameters is illustrated in Figure 3.

Assumptions
To facilitate a tractable analysis, we assume that α_e and α_o follow exponential distributions. In other words, Poisson arrivals are assumed for the effective traffic and the coverage outage processes. The Poisson assumption on α_e is a common practice in traffic engineering. The Poisson assumption on α_o is natural with PPP-distributed BSs, as will be explained later in Section 3.2.2. We note that no particular distribution is assumed for α_d to justify the Poisson assumption on α_e. Because our system model assumes a fixed packet size and a constant broadcast rate, we have a deterministic β_d ≡ L/R_I. Our framework makes no particular assumptions on β_e and β_o, i.e., both can follow general distributions. This gives our model the flexibility to represent and differentiate a wide range of practical systems. By varying the distribution of β_e, we can account for different policies and behaviors of open-access small cells. Similarly, by varying the distribution of β_o, we can account for different user mobility patterns.
It is easy to see that the mean inter-arrival times of the original and the effective traffic are both given by:

ᾱ_d = ᾱ_e = L/C.

Moreover, the mean transmission times of the original and the effective traffic are given by:

β̄_d = L/R_I

and:

β̄_e = L/R_II,

respectively.

The Coverage Outage Process
The coverage outage process is fully characterized by the RVs α_o and β_o. Here, we will show how their mean values ᾱ_o and β̄_o are inherently related to the system parameters.
Let us first consider ᾱ_o. As shown in Figure 4, we assume that each user has a coverage sensing area represented by a circle, the diameter of which is given by Ω. When a user moves with speed v for a short period of time t, the movement trace can be regarded as a straight line. The sensing area swept by the mobile user during t is vtΩ, and new BSs may appear within this area. We assume that the user will attempt to hand over to a newly appearing BS in the coverage sensing area, and an outage event occurs during a handover. Therefore, the rate of outage arrivals is the same as the rate of BS arrivals in the coverage sensing area. Because the BSs follow a PPP distribution on the plane, it follows that:

ᾱ_o = 1/(λ_b vΩ). (14)

Now, we consider β̄_o. As mentioned previously, we have ε_o = β̄_o/ᾱ_o; this parameter can be understood as the fraction of time that a user falls out of coverage. The parameter ε_o depends on both the spatial coverage of the uplink and the multi-user competition for access. We can write ε_o = 1 − p_c p_a, where p_c is the probability that a user falls within coverage, and p_a is the probability that the user is granted access among multiple users within the same cell. Coverage areas are defined as areas in which a receiver can receive a data rate higher than R_II in the presence of inter-cell interference. Because only one user is active in transmission in a cell, based on the interference model described in Section 2.2, we have p_c = F̄_γ(2^{R_II} − 1; λ_b, P_II), where the function F̄_γ(x) is the SINR CCDF defined in Equation (6). Moreover, because all users have equal access to the BS, the multi-user access results in p_a = λ_b/λ_u in an average sense (we assume that λ_u > λ_b always holds). It follows that:

ε_o = 1 − F̄_γ(2^{R_II} − 1; λ_b, P_II) λ_b/λ_u, and β̄_o = ε_o ᾱ_o. (15)
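The relations above translate directly into code. The sketch below computes ᾱ_o, ε_o and β̄_o, substituting the P → ∞ CCDF approximation for the full F̄_γ (an assumption made here for self-containment); the parameter values are purely illustrative:

```python
import numpy as np

def outage_parameters(lam_b, lam_u, v, omega, r2):
    """Mean outage inter-arrival, outage load, and mean outage duration.

    Uses the P -> infinity CCDF approximation for p_c (an assumption);
    omega is the coverage-sensing diameter, v the user speed, r2 = R_II."""
    x = 2.0**r2 - 1.0                       # SINR threshold for rate R_II
    s = np.sqrt(x)
    p_c = 1.0 / (1.0 + s * (np.pi / 2 - np.arctan(1.0 / s)))
    p_a = lam_b / lam_u                     # average multi-user access share
    alpha_o = 1.0 / (lam_b * v * omega)     # mean time between outages
    eps_o = 1.0 - p_c * p_a                 # fraction of time out of coverage
    beta_o = eps_o * alpha_o                # mean outage duration
    return alpha_o, eps_o, beta_o

a, e, b = outage_parameters(lam_b=1e-4, lam_u=1e-3, v=1.0, omega=50.0, r2=1.0)
assert 0.0 < e < 1.0 and abs(b - e * a) < 1e-9
```

Note how a denser BS deployment (larger λ_b) both shortens ᾱ_o (more frequent handovers) and reduces ε_o (more access opportunities).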

Number of Packet Copies N
All original traffic is broadcast in Phase I from its user with identical power P_I and broadcast rate R_I. The broadcast is slotted with slot length L/R_I, where L is the fixed packet length. In each slot, the broadcasting users are called active users, while the rest are called inactive users. The fraction of time that a user is active in broadcasting equals ε_d = β̄_d/ᾱ_d. The density of active users is therefore given by:

λ_a = ε_d λ_u,

and the density of inactive users is λ_w = λ_u − λ_a. We assume that each inactive user is associated with the nearest active user and listens to its broadcast signal. Let M denote the number of associated inactive users per active user; the PMF of M is given in [51] in terms of the Gamma function Γ(·) and the factorial (·)!. For each active user, the number of inactive users that can successfully receive its broadcast in each time slot is an RV denoted by N′. The number of copies of a packet after a broadcast is denoted by N, and we have N = N′ + 1. An inactive user can successfully receive a packet only if it receives the Phase I broadcast with an SINR higher than 2^{R_I} − 1. The probability of successful packet reception can thus be calculated as q = F̄_γ(2^{R_I} − 1; λ_a, P_I). Because the transmitters follow an ergodic PPP process, the SINR can be treated as spatially ergodic. It follows that N′ ∼ B(M, q), i.e., N′ follows a binomial distribution with parameters M and q. Since M is an RV, the PMF of N′ can be obtained by taking the expectation over M, i.e.,

f_{N′}(n) = ∑_{m≥n} f_M(m) C_m^n q^n (1 − q)^{m−n},

where C_m^n = m!/(n!(m − n)!) denotes the binomial coefficient. Finally, the PMF of the random number of copies of a packet in the system is given by:

f_N(n) = f_{N′}(n − 1), n ≥ 1. (19)
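The binomial-thinning construction of N can be sketched numerically. Since the exact f_M from [51] is not reproduced here, the Poisson-distributed M below is a hypothetical stand-in; the check relies on the thinning identity E[N] = q·E[M] + 1, which holds for any distribution of M:

```python
import numpy as np

rng = np.random.default_rng(3)

def copies_pmf(m_samples, q, n_max=15):
    """Empirical PMF of N = N' + 1 with N' ~ Binomial(M, q), M sampled."""
    n_prime = rng.binomial(m_samples, q)
    pmf = np.bincount(n_prime + 1, minlength=n_max + 1)[: n_max + 1]
    return pmf / len(m_samples)

# Hypothetical stand-in for f_M (the paper's exact PMF comes from the
# Voronoi cell statistics in [51]): Poisson with mean lam_w / lam_a = 5
m = rng.poisson(5.0, size=200_000)
q = 0.4                               # success probability F_gamma-bar(2^{R_I}-1; ...)
pmf = copies_pmf(m, q)

# Binomial thinning: E[N] = q * E[M] + 1, regardless of the law of M
mean_n = (np.arange(len(pmf)) * pmf).sum()
assert abs(mean_n - (5.0 * q + 1.0)) < 0.1
```

The same routine, fed with samples from the true f_M, would yield the PMF used in Equation (19).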

Capacity Limits
Consider the priority queue in the deliver phase; the combined load of the two classes of traffic is given by:

ε_II = ε_e + ε_o,

where ε_e = C/R_II and ε_o is defined in Equation (15). A stable queue requires ε_II < 1; it follows that:

C < R_II (1 − ε_o) = R_II F̄_γ(2^{R_II} − 1; λ_b, P_II) λ_b/λ_u.

Given the BS density λ_b and power input P_II, the capacity C can be optimized over R_II, i.e.,

C*(λ_b, P_II) = max_{R_II > 0} R_II F̄_γ(2^{R_II} − 1; λ_b, P_II) λ_b/λ_u. (22)

Numerical results show that C is unimodal in R_II under various parameter settings. Therefore, C*(λ_b, P_II) can be calculated by efficient numerical methods. Furthermore, it is easy to see that C* is a monotonically-increasing function of P_II. From a theoretical point of view, we are interested in the fundamental capacity limit C_lim defined as:

C_lim = lim_{P_II → ∞} C*(λ_b, P_II). (23)

Substituting Equations (9) and (22) into Equation (23), we get:

C_lim = (λ_b/λ_u) max_{x>0} x [1 + √(2^x − 1) (π/2 − arctan(1/√(2^x − 1)))]^{−1}.

The maximand is unimodal in x, and numerical evaluation gives C_lim = 0.7239 λ_b/λ_u bits/s/Hz, which shows a constant scaling with λ_b/λ_u. We note that our conclusion conforms with the conclusions obtained via scaling law analysis [28], which predicts that the capacity can grow linearly with λ_b/λ_u. Our model refines this result by obtaining the exact scaling constant of 0.7239. In Figure 5, the optimal capacity C* is illustrated as a function of P_II with varying λ_b, based on Equation (22). It is observed that C* increases initially with increasing P_II or λ_b, but eventually reaches the upper bound C_lim.
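The scaling constant can be reproduced in a few lines. The sketch below assumes the P_II → ∞ CCDF takes the interference-limited η = 4 form from [50] and performs a grid search over the maximand:

```python
import numpy as np

def ccdf_limit(x):
    """Assumed P -> infinity SINR CCDF for eta = 4 (cf. Eq. (9))."""
    s = np.sqrt(x)
    return 1.0 / (1.0 + s * (np.pi / 2 - np.arctan(1.0 / s)))

# Maximize R * ccdf(2^R - 1): the scaling constant in C_lim = const * lam_b/lam_u
r = np.linspace(0.01, 10.0, 100_000)
vals = r * ccdf_limit(2.0**r - 1.0)
r_star, c_const = r[np.argmax(vals)], vals.max()

assert abs(c_const - 0.7239) < 1e-3   # matches the paper's constant
assert 2.0 < r_star < 3.0             # optimal uplink rate near 2.5 bits/s/Hz
```

The single interior maximum (around R_II ≈ 2.5 bits/s/Hz) reflects the tension between a higher uplink rate and a shrinking coverage probability.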

Delay Distributions
This subsection aims to obtain the exact distributions of the four delay components w_I, z_I, w_II and z_II, from which the total delay can be obtained as:

D = w_I + z_I + w_II + z_II.

The PDF of D can be numerically calculated as the convolution of the PDFs of the individual components.

Phase I Delays w I and z I
Because we have assumed a fixed packet size and a fixed broadcast rate R_I, it is obvious that:

z_I = β_d = L/R_I.

The queueing process in Phase I forms a G/D/1 queue, and the exact distribution of w_I is generally unavailable. In the special case that α_d follows an exponential distribution, the queueing process in Phase I becomes an M/D/1 queue, and we have [52]:

F_{w_I}(t) = (1 − ε_d) ∑_{k=0}^{K} [ε_d (k − t/β_d)]^k / k! · e^{−ε_d (k − t/β_d)},

where K = ⌊t/β_d⌋ is the largest integer not exceeding t/β_d. The average waiting time is given by [52]:

w̄_I = ε_d β_d / (2(1 − ε_d)).

These results for an M/D/1 queue can serve as a reasonable estimate of the actual delay of the G/D/1 queue under practical settings. We note that under practical settings, the Phase II delays are much greater than the Phase I delays, i.e., w_II + z_II ≫ w_I + z_I. Therefore, our subsequent focus is on obtaining the exact distributions of w_II and z_II.

Phase II Completion Time z II
In Phase II, we have an M/G/1 priority queue with two classes of traffic. The first class of traffic is the coverage outage, while the second class is the effective traffic. We are interested in the completion time of the second class. The Laplace transform of z_II is given by Equation (29) in [52], expressed in terms of L_{β_e}(·), the Laplace transform of β_e, and an auxiliary function K(s) defined in Equation (30). Here, G(s) is the solution with the smallest absolute value that satisfies the functional Equation (31). Combining Equations (29)-(31), the Laplace transform L_{z_II}(s) can be obtained. The exact PDF of z_II can then be numerically calculated using standard methods of Laplace inversion. Finally, the first moment z̄_II and the second moment ẑ_II can be evaluated analytically [52].

Discussions on β o
We have so far assumed a general distribution for the outage duration β o . This distribution affects the solution of Equation (31). We will subsequently discuss two special distributions for β o .
The first distribution to consider is the exponential distribution. This memoryless distribution is a natural choice for β_o when the small cell BSs are randomly located as a PPP and users have coverage-independent mobility patterns. Given β_o ∼ exp(1/β̄_o), its Laplace transform can be evaluated as:

L_{β_o}(s) = 1/(1 + β̄_o s).

It follows that Equation (31) reduces to a quadratic equation in G(s), which can be solved explicitly. Another useful distribution we consider is the Gamma distribution, which provides more flexibility when characterizing β_o for a variety of practical scenarios. Given β_o ∼ Γ(k, θ), the PDF of β_o is given by:

f_{β_o}(t) = t^{k−1} e^{−t/θ} / (Γ(k) θ^k),

where k and θ are the shape and scale parameters, respectively. Under the Gamma distribution, the Laplace transform of β_o is given by:

L_{β_o}(s) = (1 + θs)^{−k}.

It follows that when k is an integer or a rational fraction, Equation (31) takes a polynomial form. Therefore, the function G(s) can be easily solved using existing root-finding algorithms for polynomials.
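For a general β_o, G(s) can also be obtained by fixed-point iteration. The sketch below assumes the standard M/G/1 busy-period form of the functional equation, G(s) = L_{β_o}(s + (1 − G(s))/ᾱ_o), which is our reading of Equation (31); iterating from G = 0 converges to the smallest root whenever ε_o < 1:

```python
def outage_busy_period(s, alpha_o, laplace_beta_o, iters=200):
    """Smallest root of the assumed busy-period functional equation
    G(s) = L_{beta_o}(s + (1 - G(s)) / alpha_o), by fixed-point iteration."""
    g = 0.0
    for _ in range(iters):
        g = laplace_beta_o(s + (1.0 - g) / alpha_o)
    return g

alpha_o, beta_o = 10.0, 2.0       # mean outage inter-arrival and duration (eps_o = 0.2)
L_exp = lambda s: 1.0 / (1.0 + beta_o * s)             # exponential beta_o
L_erl = lambda s: 1.0 / (1.0 + (beta_o / 2) * s)**2    # Erlang (k = 2), same mean

for L in (L_exp, L_erl):
    g = outage_busy_period(0.1, alpha_o, L)
    assert 0.0 < g < 1.0          # a valid transform value

# As s -> 0 the busy-period transform approaches 1 (outage periods end a.s.)
assert abs(outage_busy_period(1e-9, alpha_o, L_exp) - 1.0) < 1e-3
```

The same routine accepts any Laplace transform of β_o, which is exactly the flexibility argued for above.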

Phase II Waiting Time w II
The waiting time w_II of a packet depends on its traffic type, i.e., the number of copies of the packet in the system. We denote the waiting time of type-n traffic as w_II^n. Let us first consider w_II^1, whose Laplace transform is given in [52] in terms of the function K(s) defined in Equation (30). It is possible that a packet arrives to find an empty buffer; therefore, the CDF has a non-zero value at 0+, which is given by [52]:

F_{w_II^1}(0+) = 1 − ε_II.

Clearly, the CDF of the virtual waiting time depends on the characteristics of both the effective traffic and the coverage outage process.
In Figure 6, the CDF of w_II^1 is illustrated for varying values of ε_o, the fraction of time without coverage. Similar to the well-known "outage capacity" in fading channels, we can define the "outage delay" as the delay that guarantees a certain outage probability. For example, the 10% outage delay is the delay t_10 that satisfies F_{w_II}(t_10) = 0.9. From Figure 6, a nonlinear relationship is observed between ε_o and the outage delays. Taking the 10% outage delay as an example, when ε_o takes values of 0.1, 0.2, 0.3 and 0.4, the corresponding 10% outage delay is roughly 1 s, 7 s, 17 s and 43 s, respectively. Therefore, the delay performance degrades quickly with increasing coverage outage.
Another aspect we investigate in Figure 6 is how the CDF of w_II^1 is influenced by different distributions of β_o. Two types of distributions are compared: one is the exponential distribution; the other is the Gamma distribution with k = 2, which is also an Erlang distribution. The former corresponds to a purely random network, while the latter can represent networks that are planned with certain regularities. For a fair comparison, the two distributions are set to have the same mean β̄_o. It is observed that the Erlang distribution gives slightly better performance than the exponential distribution. From this, we postulate that the delay performance will improve if the small cell network is not entirely random, but exhibits certain regularities.
Now, consider the waiting time of a type-n traffic packet. Because there are n copies undergoing independent queueing processes, the actual waiting time w_II^n is the minimum over the n queues. The delay CDF of a type-n traffic packet can then be evaluated as:

F_{w_II^n}(t) = 1 − [1 − F_{w_II^1}(t)]^n. (39)

Further considering n as an RV denoted by N and applying the law of total probability, the CDF of the waiting time of an arbitrary packet is given by:

F_{w_II^N}(t) = ∑_{n=1}^{∞} f_N(n) {1 − [1 − F_{w_II^1}(t)]^n}, (40)

where f_N(n) is the PMF of N given by Equation (19).
In Figure 7, the CDF of w_N^II is illustrated for varying values of N according to Equation (40). User collaboration is shown to be effective in reducing delays. Comparing Figure 7 with Figure 6, we observe that the performance given by N = 5 and ε_o = 0.6 is comparable to that given by N = 1 and ε_o = 0.2. In other words, if a packet is successfully broadcast to four other users, the coverage requirement can be relaxed by about a factor of two in this case ((1 − 0.2) ÷ (1 − 0.6) = 2). On the other hand, Figure 7 also shows that the benefit of increasing N gradually diminishes as N grows large.
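The min-of-n construction and the mixture over N described above can be sketched numerically. In the snippet below, the single-queue CDF F_1 and the distribution of N are illustrative placeholders (an exponential CDF and a small hand-picked PMF), not the paper's transform-domain results.

```python
import math

def single_queue_cdf(t, mean_wait=10.0):
    # Hypothetical stand-in for the type-1 waiting-time CDF F_{w_1^II};
    # the paper derives the true CDF in the transform domain.
    return 1.0 - math.exp(-t / mean_wait)

def type_n_cdf(t, n, mean_wait=10.0):
    # The waiting time of a type-n packet is the minimum over n
    # independent queues: F_n(t) = 1 - (1 - F_1(t))^n.
    return 1.0 - (1.0 - single_queue_cdf(t, mean_wait)) ** n

def arbitrary_packet_cdf(t, pmf, mean_wait=10.0):
    # Law of total probability over the random copy count N:
    # F(t) = sum_n P(N = n) * F_n(t).
    return sum(p * type_n_cdf(t, n, mean_wait) for n, p in pmf.items())

pmf = {1: 0.5, 2: 0.3, 5: 0.2}  # illustrative distribution of N
print(type_n_cdf(10.0, 1), type_n_cdf(10.0, 5), arbitrary_packet_cdf(10.0, pmf))
```

As expected, more copies shift the CDF upward (shorter waits), and the mixed CDF lies between the single-copy and five-copy curves.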

Rate and Power Optimization
In the previous section, we established the delay distribution subject to the protocol and system parameters. From the practical perspective of system design and optimization, it is desirable to understand how the protocol parameters (R_I, R_II, P_I and P_II) can be chosen to give an optimized capacity-delay performance. Without loss of generality, our subsequent analysis is restricted to the case where both β_e and β_o follow exponential distributions.

Heuristic Optimization of R I
Under natural conditions, the waiting time w^II dominates the total delay, so the primary target of delay minimization is to minimize w^II. According to Figure 7, increasing the number of packet copies is very effective in reducing delays. A simple heuristic for optimizing R_I is therefore to maximize the mean number of packet copies N̄. A simple closed-form estimate exists: N̄ = J̄q, where J̄ = (λ_u − λ_a)/λ_a = R_I/C − 1 (42) denotes the ratio of inactive to active users in Phase I, λ_a = λ_u C/R_I, and q denotes the probability that an inactive user successfully receives a packet. Because increasing R_I increases J̄ but reduces q, this tension calls for an optimization over R_I: given C, λ_u and P_I, choose R_I^* = argmax_{R_I} N̄. (44) Figure 8 shows N̄ as a function of R_I. The objective has a simple single-peak structure, so the optimum can be obtained easily via numerical methods.
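A minimal sketch of this single-peak search is given below: it maximizes N̄ = (R_I/C − 1)·q(R_I) by grid search. The reception probability q(R_I) here is a hypothetical SINR-threshold-style placeholder, not the paper's expression based on F_γ, and the constant C is illustrative.

```python
import math

C = 0.1  # per-user capacity demand, illustrative units

def q_success(R_I, a=0.05):
    # Placeholder reception probability, decreasing in R_I through the
    # SINR threshold chi = 2^R_I - 1 (the paper's q comes from F_gamma).
    return math.exp(-a * (2.0 ** R_I - 1.0))

def mean_copies(R_I):
    # N_bar = J_bar * q with J_bar = R_I / C - 1 (Equation (42)).
    return (R_I / C - 1.0) * q_success(R_I)

# Single-peak objective: locate the maximizing R_I by grid search.
grid = [C + k * (20.0 - C) / 9999 for k in range(10000)]
R_I_opt = max(grid, key=mean_copies)
print(R_I_opt, mean_copies(R_I_opt))
```

At R_I = C the copy count is zero (no inactive users), and for large R_I the reception probability vanishes, so the maximum lies at an interior point, matching the single-peak structure seen in Figure 8.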

Heuristic Optimization of R II
The total delay is dominated by the waiting time w^II, which in turn depends largely on the waiting time w_1^II of Type-1 traffic. A simple heuristic for optimizing R_II is therefore to minimize the mean w̄_1^II, which admits a closed form when β_e and β_o both follow exponential distributions. Increasing R_II reduces the Phase II transmission time (once in coverage), but at the cost of a reduced probability of falling within coverage. This tension leads to the following optimization problem: given C, P_II, L, v and λ_b, choose R_II^* = argmin_{R_II} w̄_1^II. (46) In Figure 9, the mean waiting time w̄_1^II is shown as a function of the average delivery rate R_II with varying transmit power P_II and capacity demand C. The objective function appears to be convex, and the optimal value can be obtained easily via numerical methods. Figure 9. The mean waiting time w̄_1^II as a function of the average delivery rate R_II with varying transmit power P_II and capacity demand C (λ_b = 10^−6, L = 1, v = 1).
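The trade-off behind this minimization can be sketched with a toy model: delivery is faster once in coverage (a term like L/R_II), but a higher rate shrinks the usable coverage area and lengthens the wait to enter it. Everything below is assumed for illustration (path-loss exponent 4, a renewal-style wait term, and illustrative parameter values); it is not the paper's closed-form w̄_1^II.

```python
import math

L, v, lam_b = 1.0, 1.0, 0.01  # packet size, speed, BS density (illustrative units)

def coverage_prob(R_II, P_II=1.0):
    # Placeholder coverage probability: a higher rate needs a larger SINR
    # threshold, shrinking the coverage radius (path-loss exponent 4).
    radius = (P_II / (2.0 ** R_II - 1.0)) ** 0.25
    return 1.0 - math.exp(-lam_b * math.pi * radius ** 2)

def mean_wait(R_II, P_II=1.0):
    # Toy trade-off: transmission time once in coverage (L / R_II) plus a
    # renewal-style wait that grows as the covered fraction shrinks.
    p = coverage_prob(R_II, P_II)
    return L / R_II + (1.0 - p) / (p * v)

grid = [0.05 + k * (5.0 - 0.05) / 4999 for k in range(5000)]
R_II_opt = min(grid, key=mean_wait)
best_wait = mean_wait(R_II_opt)
print(R_II_opt, best_wait)
```

With these toy parameters the objective has an interior minimum, mirroring the convex shape reported for Figure 9.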

Heuristic Optimization of Power P I
Unlike the optimizations over R_I and R_II, which balance conflicting effects, increasing P_I is always beneficial, but with diminishing returns in terms of capacity and delay. Our heuristic approach to the optimization of P_I is based on the following idea: once P_I reaches a threshold, further increases are not helpful because Phase I broadcasting becomes interference-limited. We therefore seek the minimum P_I that achieves a fraction φ of the best performance obtained as P_I → ∞. As mentioned previously, the probability for an inactive user to successfully receive a packet is F_γ(χ), which serves as a convenient indicator of the broadcasting performance.
The optimization of P_I can now be formulated as: P_I^* = min{ P_I : F_γ^*(χ) ≥ φ · F_γ^lim(χ) }, (47) where χ = 2^{R_I} − 1, λ_a = λ_u ε_d, and the functions F_γ^*(·) and F_γ^lim(·) are defined in Equations (8) and (9), respectively.
At relatively high values of P_I, the Q-function appearing in Equation (8) is well approximated by a lower bound (Equation (48)). Substituting Equations (8), (9) and (48) into Equation (47) yields Equation (49), a closed-form formula for computing P_I^* directly from the system parameters.
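The paper's specific bound in Equation (48) is not reproduced here; one classical candidate, tight at high arguments, is Q(x) ≥ x/(1 + x²) · e^{−x²/2}/√(2π). A quick numerical check of its accuracy:

```python
import math

def Q(x):
    # Gaussian tail probability via the complementary error function.
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def Q_lower(x):
    # Classical lower bound, tight for large x; shown as one candidate for
    # the high-P_I approximation, not necessarily the paper's Equation (48).
    return (x / (1.0 + x * x)) * math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

for x in (1.0, 2.0, 4.0):
    print(x, Q(x), Q_lower(x), Q_lower(x) / Q(x))
```

The ratio Q_lower/Q approaches 1 as x grows, which is exactly the regime (high P_I) where the approximation is applied.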

Heuristic Optimization of Power P II
The idea for the optimization of P_II is similar to that for P_I, except that the objective is now the capacity limit C_lim. As shown in Figure 5, increasing P_II is always beneficial to the capacity until the capacity approaches a constant limit. The optimization problem can be formulated as: P_II^* = min{ P_II : C^*(λ_b, P_II) ≥ φ · C_lim }, (50) where C^*(λ_b, P) and C_lim are defined in Equations (22) and (24), respectively. Given λ_b, P_II^* can be obtained from Figure 5 by drawing a horizontal line at C = φ · C_lim = 0.7239φ and finding its intersections with the various curves.
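The horizontal-line construction on Figure 5 amounts to inverting a monotone, saturating capacity curve. Below is a sketch with a hypothetical stand-in for C^*(λ_b, P); only the monotone-saturating shape matters for the bisection to be valid.

```python
C_LIM = 0.7239  # capacity limit (bits/s/Hz, scaled by lambda_b/lambda_u)

def capacity(P):
    # Hypothetical saturating curve standing in for C*(lambda_b, P) of
    # Equation (22): increasing in P and approaching C_LIM.
    return C_LIM * P / (P + 1.0)

def min_power(phi, lo=0.0, hi=1e9, tol=1e-9):
    # Smallest P with capacity(P) >= phi * C_LIM, found by bisection
    # (valid because the curve is monotone increasing).
    target = phi * C_LIM
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if capacity(mid) >= target:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

print(min_power(0.9))  # for this toy curve, P/(P+1) = 0.9 gives P = 9
```

A stricter target fraction φ requires more power, consistent with the diminishing-returns behavior near the capacity limit.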

Numerical Results and Discussions
In this section, numerical results are presented to illustrate the capacity-delay trade-off under varying system parameters. We aim to shed light on the following questions: What is the trade-off between capacity and delay? How does this trade-off change with the system parameters? Two different metrics for the delay performance can be considered: the mean delay and the outage delay. Our discussion focuses primarily on the mean delay, with the outage delay examined at the end of the section.
The procedure of our numerical evaluation is as follows: (1) given the system parameters (C, L, λ_u, λ_b and v) and power parameters (P_I and P_II), calculate the optimal rate parameters R_I and R_II according to Equations (44) and (46), respectively; (2) given all of the above parameters, calculate the PDFs of w^II and z^II based on Section 4; (3) calculate the PDF of the accumulated delay D and evaluate its mean value D̄. Without loss of generality, we set λ_b = 10^−5 and Ω = 100 in all cases.

Figure 10 shows the impact of the user density λ_u and power P_I on the capacity-delay trade-off. The trade-off is shown to be insensitive to the user density, because the capacity limit scales with 0.7239 λ_b/λ_u. When the capacity approaches this limit, the delay grows exponentially toward infinity. The value of P_I is also shown to have a significant impact on the delay performance. The case P_I = 200 dB represents the extreme of infinite power; the capacity-delay trade-off at P_I = 200 dB therefore indicates the performance upper bound obtainable from user collaboration.

Figure 11 shows the impact of the user speed v on the capacity-delay trade-off, for cases with and without user collaboration. We set P_I and P_II to very large values to shed light on the fundamental performance limits. The trade-off is shown to be sensitive to the speed: for a ten-fold increase in speed, the delay reduces by about 90%. In other words, an inversely proportional relationship is observed between speed and mean delay. The benefit of user collaboration (i.e., relaying) is significant, especially when the movement speed is low. This suggests that, in practice, allowing D2D communications between low-speed and high-speed users will effectively reduce the delays of low-speed users.

Figure 12 shows the impact of the packet size L on the capacity-delay trade-off. In practice, it is desirable to have a larger packet size to reduce overhead.
However, it is observed that increasing L leads to slightly increased delays, which suggests that the packet size should also be chosen carefully in practice. Interestingly, the delay becomes larger as the value of C approaches zero. This is because the heuristic algorithms for optimizing the protocol parameters are sub-optimal for very small values of C, revealing a limitation of the heuristic algorithms in Section 5.

While all of the above numerical results are based on the mean delay, it is also important to investigate the trade-off in terms of the outage delay. In practice, a small fraction of packets with large delays may be dropped by the queue; the outage delay is therefore particularly useful when the delay has a long-tailed distribution. Given a random delay D with CDF F_D(x), the outage delay D_o(φ) is defined as the delay value that satisfies F_D(D_o) = 1 − φ, where φ is the outage threshold. In Figure 13, we show the capacity-delay trade-off based on the outage delay. As expected, the outage delay increases exponentially as the capacity per user approaches the limit. Moreover, the delay reduces with increasing outage probability φ. Finally, we note that the ability to pinpoint the delay distribution and study the outage delay is a key merit of the analytical framework proposed in this paper. The framework can potentially be extended beyond cellular communications to other networking paradigms, such as multi-hop sensor networks [53][54][55].
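The definition F_D(D_o) = 1 − φ can be evaluated numerically by inverting the CDF. In the sketch below, the exponential F_D is an illustrative stand-in for the delay distribution the paper obtains in the transform domain.

```python
import math

def delay_cdf(t, mean=10.0):
    # Illustrative stand-in delay CDF; the paper obtains F_D from the
    # queueing analysis in the transform domain.
    return 1.0 - math.exp(-t / mean)

def outage_delay(phi, cdf=delay_cdf, hi=1e6, tol=1e-9):
    # Smallest t with F_D(t) >= 1 - phi, found by bisection on the
    # monotone non-decreasing CDF.
    lo_t, hi_t, target = 0.0, hi, 1.0 - phi
    while hi_t - lo_t > tol:
        mid = 0.5 * (lo_t + hi_t)
        if cdf(mid) >= target:
            hi_t = mid
        else:
            lo_t = mid
    return 0.5 * (lo_t + hi_t)

# Larger outage thresholds phi admit smaller outage delays.
print(outage_delay(0.10), outage_delay(0.01))
```

For the exponential stand-in with mean 10, the 10% outage delay is 10·ln(10) ≈ 23.0, and tightening the threshold to 1% doubles it, illustrating why long-tailed delay distributions make the outage metric informative.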

Conclusions
This paper has studied the uplink capacity-delay trade-off of large-scale hybrid wireless networks with a two-hop broadcast-and-forward relaying scheme. A queueing-theoretic framework has been established to evaluate the exact distribution of the delays. The impacts of transmission rates, transmit power, user density, BS density and packet size on the capacity-delay trade-off have been thoroughly investigated, and heuristic power and rate control algorithms have been proposed for performance optimization. Using a different and independent model, we reach the same conclusion as the existing literature that the per-user capacity scales with the BS-user density ratio; however, our model gives an exact scaling coefficient of 0.7239 in the interference-limited scenario. Numerical results suggest that mobility and user collaboration are effective means of reducing both the mean and the outage packet delay.