Optimizing the Number of Fog Nodes for Finite Fog Radio Access Networks under Multi-Slope Path Loss Model

Fog Radio Access Network (F-RAN) is a promising technology to address bandwidth bottlenecks and network latency problems by providing cloud-like services to the end nodes (ENs) at the edge of the network. The network latency can be further decreased by minimizing the transmission delay, which can be achieved by optimizing the number of Fog Nodes (FNs). In this context, we propose a stochastic geometry model to optimize the number of FNs in a finite F-RAN by exploiting the multi-slope path loss model (MS-PLM), which can more precisely characterize the path loss dependency on the propagation environment. The proposed approach shows that the optimum probability of being a FN is determined by the real root of a polynomial equation whose degree is determined by the far-field path loss exponent (PLE) of the MS-PLM. We analyze the impact of the path loss parameters and the number of deployed nodes on the optimum number of FNs. The results show that the optimum number of FNs is less than 7% of the total number of deployed nodes for all the considered scenarios. They also show that optimizing the number of FNs achieves a significant reduction in the average transmission delay over the unoptimized scenarios.


Introduction
Fog computing is considered as an enabler of the Internet of Things (IoT) and a key technology for fifth-generation (5G) and beyond networks. Fog computing is an extension of the cloud computing paradigm, wherein the distributed computing and storage resources across the network are exploited to support the functionalities of a centralized cloud data center [1][2][3][4]. The key benefits that the fog computing paradigm provides in bringing the cloud computing functionalities closer to the end-nodes (ENs) at the edge of the network are lower network latency and the elimination of the possible bandwidth bottlenecks [5][6][7].
Fog radio access networks (F-RANs) enable some of the application data (i.e., the latency-aware data), which the ENs transmit to the fog nodes (FNs) via wireless links, to be processed at the FNs, while the non-latency-aware data are forwarded to the cloud data center for processing [8].
Recent survey studies pointed out that the network latency of F-RAN is a crucial issue and its

• Considering a finite BPP F-RAN, we formulate an objective function to minimize the transmission delay of the network by maximizing the average network data rate. A closed-form expression of the objective function using SS-PLM has been derived.
• We prove that there is an optimum value of the probability of being a FN, and thus of the number of FNs, that minimizes the derived objective function that utilizes SS-PLM. We derive closed-form expressions of the optimum probability and the optimum number of FNs for the special cases in which the PLE of the links between the ENs and the FNs is equal to 2 and 4.
• We derive a closed-form expression of the objective function to minimize the transmission delay by utilizing the dual-slope path loss model (DS-PLM), which is appropriate for any values of the PLEs for the uplinks from the ENs to the FNs and the links formed by the FNs and the cloud center. We prove that there is a unique global value of the probability of being a FN that minimizes it. Closed-form expressions of the optimum probability and the optimum number of FNs have been derived for the special cases in which the far-field PLE of the links between the ENs and the FNs is equal to 2 and 4.
• A closed-form expression of the objective function to minimize the transmission delay under the N-slope path loss model (NS-PLM) has been derived.
• We analyze the impacts of the DS-PLM parameters and the number of deployed nodes on the optimum number of FNs.
The rest of this paper is organized as follows. In Section 2, we present the system model, including the network topology, the key assumptions, and the considered path loss models. The problem formulation is given in Section 3. Section 4 presents the mathematical framework for optimizing the number of FNs. The numerical results are discussed in Section 5. The limitations and future work directions are presented in Section 6. Finally, the concluding remarks are provided in Section 7.

System Model
In this section, we first present the network topology and the key assumptions, followed by the considered path loss models in this paper.

Network Topology and the Key Assumptions
We consider the F-RAN system model illustrated in Figure 1, wherein a fixed number of nodes $n$ are independently and identically distributed according to a binomial point process (BPP) in $b(o, R)$, where $b(o, R)$ represents a 2-dimensional ball of radius $R$ centered at the origin $o$. We assume that the cloud resides at the origin, and the ENs can forward the data to the cloud through the FNs via wireless links. Moreover, we assume that the link $x$ from the EN to the FN and the link $y$ between the FN and the cloud experience different PLEs, because the PLE depends on the propagation environment, antenna heights, and operating frequency [35]. We also assume that every node is inherently capable of being a FN and can be activated as a FN with probability $p$, or deactivated and downgraded to an EN with probability $1 - p$. Consequently, the number of FNs is $n_1 = np$ and the number of ENs is $n_0 = n(1 - p)$. Since the number of FNs is determined by $p$, optimizing the number of FNs is equivalent to finding the value of $p$ that minimizes the transmission delay.
Note that optimizing the transmission delay w.r.t. the distance automatically optimizes the propagation delay; thus, the propagation delay is omitted. Moreover, as the uplink IoT data are generally transmitted in short data packets, the processing delay of the packet overhead is very small compared with the transmission delay. Also, the FNs can be provided with extra processing capabilities to handle the data traffic at the peak hour; thus, the processing delay is assumed to be negligible.
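The deployment described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' simulation code: the function names `sample_bpp_disk` and `assign_roles` and the use of NumPy are assumptions, and $n_1 = np$ is read here as the expected number of FNs when each node is activated independently with probability $p$.

```python
import numpy as np

def sample_bpp_disk(n, radius, rng):
    """Draw n i.i.d. points uniformly in a disk of the given radius centered at the origin."""
    # Taking the square root of a uniform variate makes the points uniform in area.
    r = radius * np.sqrt(rng.random(n))
    theta = 2.0 * np.pi * rng.random(n)
    return np.column_stack((r * np.cos(theta), r * np.sin(theta)))

def assign_roles(n, p, rng):
    """Return a boolean mask: True marks a node activated as a FN, False an EN."""
    return rng.random(n) < p

# Illustrative values: n and p are arbitrary; R = 5 km matches the numerical section.
rng = np.random.default_rng(0)
points = sample_bpp_disk(500, 5000.0, rng)
is_fn = assign_roles(500, 0.05, rng)
fn_points, en_points = points[is_fn], points[~is_fn]
```

In any single realization the number of FNs is random with mean $np$, which is why an analysis in terms of $p$ and an analysis in terms of the number of FNs are interchangeable on average.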

Propagation Model
This paper considers both small-scale and large-scale fading. For small-scale fading, a Rayleigh fading channel is assumed, i.e., the small-scale channel gain $h$ follows an exponential distribution with unit mean, $h \sim \exp(1)$. The large-scale fading is assumed to be characterized by inverse power-law path loss models, in which the impact of the environment (outdoor, indoor, rural, urban, suburban, etc.) on the path loss is reflected by the value of the PLE [36][37][38][39]. The path loss models of interest are defined as follows.
Definition 1 (SS-PLM). The standard SS-PLM is given by
$$\ell_1(x) = x^{-\alpha_x}, \tag{1}$$
where $x > 0$ denotes the length of the wireless link in meters, and $\alpha_x$ stands for the PLE of the link $x$, which is commonly approximated by a constant in the range of 2 to 5, depending on the propagation environment and the carrier frequency [31].
The limitations of the SS-PLM lead to the consideration of the DS-PLM because it better reflects the PLE dependence on the physical environment in clustered networks and millimetre wave networks.
Definition 2 (DS-PLM). The DS-PLM is defined as [31,34]
$$\ell_2(x) = \begin{cases} x^{-\alpha_{x(0)}}, & x \le r, \\ \eta\, x^{-\alpha_{x(1)}}, & x > r, \end{cases} \tag{2}$$
where $r > 0$ stands for the critical distance, also known as the break-point distance because the PLE (i.e., the slope) changes at it, $\alpha_{x(0)}$ and $\alpha_{x(1)}$, with $0 \le \alpha_{x(0)} \le \alpha_{x(1)}$, are the PLEs of the near- and far-fields, respectively, and $\eta = r^{(\alpha_{x(1)} - \alpha_{x(0)})}$ is a factor that maintains the continuity of the path loss.
It should be noted that the critical distance depends on the antenna height and the environment: it increases with the antenna height and decreases in high-blockage environments. Generally, it is approximated as the average line-of-sight (LoS) distance of the communication link, whereas the near-field PLE is used to approximate the LoS link regime, and the non-line-of-sight (NLoS) link regime beyond the critical distance is approximated by the far-field PLE [32]. The DS-PLM can be extended to NS-PLM as follows.
In the rest of the paper, the notations $r_c$ and $r_f$ are used to denote the critical distances of the cloud and the FN, respectively.
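As a small illustration of the two path loss models, the following Python sketch evaluates them for a single link. It assumes the standard inverse power-law forms (a path gain of $x^{-\alpha}$ for SS-PLM, and a two-branch version for DS-PLM joined at the critical distance by the continuity factor $\eta$ defined above); the function names are hypothetical.

```python
import math

def ss_plm(x, alpha):
    """Single-slope path loss (as a gain factor) at distance x > 0 with PLE alpha."""
    return x ** (-alpha)

def ds_plm(x, r, alpha_near, alpha_far):
    """Dual-slope path loss at distance x > 0 with critical distance r.

    The factor eta = r ** (alpha_far - alpha_near) makes the two branches meet at x = r.
    """
    if x <= r:
        return x ** (-alpha_near)
    eta = r ** (alpha_far - alpha_near)
    return eta * x ** (-alpha_far)

# Continuity check at the critical distance (illustrative values).
r = 100.0
just_inside = ds_plm(r, r, 2.0, 4.0)
just_outside = ds_plm(r * (1.0 + 1e-12), r, 2.0, 4.0)
```

Beyond the critical distance the dual-slope value decays with the far-field exponent, so a link of length 200 m with $r = 100$ m, near-field PLE 2, and far-field PLE 4 sees a lower gain than the single-slope model with PLE 2 would predict.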

Problem Formulation
In the considered system model depicted in Figure 1, we assume that the ENs' transmitted data in the uplink phase are partially processed at the FN and that a portion of them (i.e., the non-latency-sensitive part) is relayed to the cloud data center. In other words, if a packet of $S$ bits is delivered via the wireless link to the FN, the FN will process $D$ bits of it, whereas the other $S - D$ bits are forwarded via a wireless link to the cloud data center to perform the necessary processing and computation there. The transmission delay $\tau$ of the considered system model can be calculated as
$$\tau = \frac{S}{R_{fog}} + \frac{S - D}{R_{cloud}}, \tag{4}$$
where the data rate at the FN $R_{fog}$ is given by
$$R_{fog} = W \log_2\left(1 + \gamma_{fog}\right), \tag{5}$$
and the data rate at the cloud $R_{cloud}$ is expressed as
$$R_{cloud} = W \log_2\left(1 + \gamma_{cloud}\right), \tag{6}$$
where $W$ denotes the link bandwidth. Note that we assume all the links to have the same bandwidth, and $\gamma_{fog}$ and $\gamma_{cloud}$ denote the SINRs of the uplinks at the FN and the cloud, respectively. The SINR of the uplink connecting the $i$-th EN (i.e., $i = 1, 2, \cdots, n_0$) and its associated FN (i.e., the $j$-th FN; $j = 1, 2, \cdots, n_1$) is calculated as
$$\gamma_{fog} = \frac{P_i h_i \ell(x_i)}{\sigma^2 + I_{fog}}, \tag{7}$$
where $P_i$ stands for the transmit power of the $i$-th EN, $h_i$ is the channel gain of the link between the $i$-th EN and the $j$-th FN, $x_i$ is the separation distance between the $i$-th EN and the $j$-th FN, $\ell(x_i)$ represents the path loss at a separation distance of $x_i$, $\sigma^2$ is the noise power, and $I_{fog}$ is the aggregated interference at the FN, which originates from the simultaneous transmissions of the other ENs. The SINR of the uplink from the $j$-th FN to the cloud data center is given by
$$\gamma_{cloud} = \frac{P_j h_j \ell(y_j)}{\sigma^2 + I_{cloud}}, \tag{8}$$
where $P_j$ denotes the transmit power of the $j$-th FN, $h_j$ is the channel gain of the link between the $j$-th FN and the cloud, $y_j$ is the separation distance between the $j$-th FN and the cloud, $\ell(y_j)$ is the path loss at a separation distance of $y_j$, and $I_{cloud}$ denotes the aggregated interference at the cloud due to the simultaneous transmissions of the FNs.
For the sake of simplicity, we assume that the system is noise limited (i.e., $I_{fog} = I_{cloud} = 0$), either because the interference can be perfectly mitigated or because of the pseudo-wired abstraction that applies when the wireless links are millimeter-wave links [40,41].
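To make the delay model concrete, the sketch below computes the transmission delay of one packet in the noise-limited case. It rests on two assumptions that are consistent with the description above but are not quoted from the source: each data rate follows the Shannon formula $W \log_2(1 + \text{SNR})$, and the total delay is the time to send $S$ bits to the FN plus the time to forward $S - D$ bits to the cloud. All function names are hypothetical.

```python
import math

def snr(tx_power_w, channel_gain, path_loss, noise_power_w):
    """Noise-limited SNR: received power over noise power (interference taken as zero)."""
    return tx_power_w * channel_gain * path_loss / noise_power_w

def shannon_rate(bandwidth_hz, snr_value):
    """Achievable rate in bit/s for a given bandwidth and linear SNR."""
    return bandwidth_hz * math.log2(1.0 + snr_value)

def transmission_delay(s_bits, d_bits, bandwidth_hz, snr_fog, snr_cloud):
    """Delay of sending s_bits to the FN and forwarding (s_bits - d_bits) to the cloud."""
    t_fog = s_bits / shannon_rate(bandwidth_hz, snr_fog)
    t_cloud = (s_bits - d_bits) / shannon_rate(bandwidth_hz, snr_cloud)
    return t_fog + t_cloud

# Illustrative call: 1 kbit packet, half of it forwarded, 1 GHz bandwidth, SNR = 1 on both hops.
delay_s = transmission_delay(1000, 500, 1e9, 1.0, 1.0)
```

With both SNRs equal to 1, each hop carries $W$ bit/s, so the delay in this example is $(1000 + 500)/10^9$ seconds. Because the bandwidth and packet sizes only scale the two terms, the position-dependent part of the delay enters through the SNRs alone.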
As stated earlier, our goal is to minimize the transmission delay $\tau$ of the system; hence, the objective function for a single EN (i.e., the $i$-th EN) transmitting packets through the $j$-th FN to the cloud can be formulated as Since $W$, $S$, and $S - D$ have constant values, they can be omitted as they do not affect the optimization. Consequently, the objective function can be written as In view of the fact that the logarithm is a monotonic function, the objective function can be rewritten as Taking the expectation, (11) becomes Here $P_i$ and $P_j$ do not affect the optimization, provided that the receiver has prior knowledge about their values. Also, $\sigma^2$ does not influence the optimization process since it can be estimated by the receiver. Moreover, since the channel gains represent the small-scale fading, which is independent of the path loss, the terms containing them can be rewritten as and are equal to a constant value. Thus, they have no impact on the optimization. Therefore, (12) reduces to In light of the fact that there are $n_0$ ENs and $n_1$ FNs, generalizing (13) for $n_0$ ENs and $n_1$ FNs leads to Equation (14) shows that the objective function of the system depends on the reciprocal of the expected value of the path loss for the individual links, which are influenced by the PLEs and the spatial distribution of the nodes. In the following section, we present the mathematical approach of optimizing the number of FNs for the path loss models considered in Section 2.2.

The Framework for Optimizing the Number of FNs
In this section, we use stochastic geometry to evaluate the optimum number of FNs in a finite F-RAN. Considering that the number of FNs is determined by $p$, the problem is transformed into optimizing $p$. Then, (14) can be rewritten as Since there are $n_1$ FNs, where $\tilde{n}_0$ denotes the average number of ENs that are associated with a single FN, and it is given by Accordingly, we assume that the FNs are located at the centers of identical 2-dimensional balls $\chi_j$ (i.e., $b(\chi_j, R_f)$; $j = 1, 2, \cdots, n_1$) that are scattered to cover the entire deployment area, such that each ball contains on average $\tilde{n}_0$ ENs. Thus, the radius of the area controlled by a single FN, $R_f$, can be obtained by where $\lambda_{n_0} = n_0/(\pi R^2)$ is the EN density, and In the following subsections, we derive the objective function regarding optimizing the number of FNs for SS-PLM, DS-PLM, and NS-PLM.
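The two per-FN quantities used in this section can be worked out directly from the definitions already given. The sketch below assumes that the average number of ENs per FN is $n_0/n_1 = (1 - p)/p$ and that requiring each ball to hold that many ENs at density $n_0/(\pi R^2)$ gives $R_f = R/\sqrt{np}$; the latter agrees with the condition $R_f > r_f \Leftrightarrow p < R^2/(n r_f^2)$ stated in Lemma 2. The function names are hypothetical.

```python
import math

def ens_per_fn(p):
    """Average number of ENs served by one FN when each node is a FN with probability p."""
    return (1.0 - p) / p

def fn_cell_radius(n, p, deployment_radius):
    """Radius of the disk served by one FN, assuming n*p equal-area cells tile the deployment disk."""
    return deployment_radius / math.sqrt(n * p)

# Illustrative values: the probability is one reported in the numerical section; n is arbitrary.
load = ens_per_fn(0.2169)
radius_m = fn_cell_radius(400, 0.25, 5000.0)
```

For $p = 0.2169$ the first function returns roughly 3.61 ENs per FN, which matches the figure quoted in the numerical results.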

Single-Slope
The objective function utilizing SS-PLM can be written as According to [18], the $\alpha_x$-th moment of the distance between the center of the 2-dimensional ball $b(\chi_j, R_f)$ that encloses $\tilde{n}_0$ nodes and the $i$-th nearest BPP node is given by where $\zeta^{[\kappa]} = \Gamma(\zeta + \kappa)/\Gamma(\zeta)$ is the Pochhammer notation (sometimes called the rising factorial), and $\Gamma(\cdot)$ denotes the gamma function, with $\Gamma(\zeta) = (\zeta - 1)!$ for a positive integer $\zeta$. Generalizing (21) for $\tilde{n}_0$ ENs yields Using mathematical induction, we have Substituting (23) into (22), the following closed form is obtained Analogously, since the FNs are scattered according to BPP in $b(o, R)$, we have Therefore, the objective function utilizing SS-PLM can be expressed as

Lemma 1. The objective function utilizing SS-PLM in (26) is strictly convex, and hence there is a unique global value of $p$ that minimizes it.
Proof. The convexity of the objective function can be proven by the second derivative test. The second derivative of $J_1$ w.r.t. $p$ is; one can observe that the second derivative is positive for $0 < p < 1$. Therefore, $J_1$ is strictly convex and hence there is a unique optimum value of $p$ that minimizes it.
The global value of $p$ that minimizes $J_1$ can be obtained by solving for the real root in the range $0 < p < 1$ of which can be rewritten after performing some operations as the following polynomial equation of the degree where and The optimum values of $p$ and $n_1$ for the special cases of $\alpha_x = 2$ and $\alpha_x = 4$ are presented in Corollaries 1 and 2, respectively. and respectively.

Proof.
When $\alpha_x = 2$, Equation (29) becomes a quadratic equation with $c_1 = 0$. Hence, the optimum value of $p$ is computed as the square root of the constant $-c_0$.

Corollary 2.
When $\alpha_x = 4$, the closed-form expressions of the optimum values of $p$ and $n_1$ utilizing SS-PLM can be calculated by and and
Proof. Equation (29) is a cubic equation when $\alpha_x = 4$. Therefore, the optimum value of $p$ is the cubic equation's root given in (34).
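The single-slope optimization can be checked numerically with a short Python sketch. It relies on an assumed form of the objective, $J_1(p) = \frac{2n(1-p)}{\alpha_x + 2}\left(\frac{R}{\sqrt{np}}\right)^{\alpha_x} + \frac{2np}{\alpha_y + 2}R^{\alpha_y}$, obtained by taking the expected inverse path loss of each link as the $\alpha$-th moment of the distance of a uniformly placed node in a disk. This form is a reconstruction for illustration and the source's own expression (26) may differ from it. Under this assumption the case $\alpha_x = 2$ reduces to a quadratic with no linear term, in line with the proof of Corollary 1. The function names and the normalized radius $R = 1$ are also assumptions.

```python
import math

def j1(p, n, R, alpha_x, alpha_y):
    """Assumed SS-PLM objective: EN-to-FN term plus FN-to-cloud term (see the lead-in)."""
    r_f = R / math.sqrt(n * p)  # assumed radius of the area served by one FN
    en_term = 2.0 * n * (1.0 - p) * r_f ** alpha_x / (alpha_x + 2.0)
    fn_term = 2.0 * n * p * R ** alpha_y / (alpha_y + 2.0)
    return en_term + fn_term

def argmin_convex(f, lo, hi, iters=200):
    """Ternary search for the minimizer of a strictly convex function on (lo, hi)."""
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) < f(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def p_opt_alpha_x_2(n, R, alpha_y):
    """Closed-form minimizer of the assumed objective when alpha_x = 2."""
    return math.sqrt((alpha_y + 2.0) * R ** (2.0 - alpha_y) / (4.0 * n))

# Illustrative values: n = 500 nodes, normalized radius R = 1, alpha_y = 4.
p_numeric = argmin_convex(lambda p: j1(p, 500, 1.0, 2.0, 4.0), 1e-6, 1.0 - 1e-6)
p_closed = p_opt_alpha_x_2(500, 1.0, 4.0)
```

The numerical minimizer and the closed-form square root agree to within the search tolerance, which is the behavior a strictly convex objective with a single stationary point in $0 < p < 1$ should show.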

Dual-Slope
When DS-PLM is utilized, the objective function can be expressed as Bearing in mind that DS-PLM has different PLEs for distances shorter and longer than the critical radius of the FN $r_f$. Therefore, where $\tilde{n}_{0(0)}$ is the expected number of ENs that reside inside $r_f$.
where $A_{r_f} = \pi r_f^2$ is the area enclosed by $b(\chi_j, r_f)$. Since $\tilde{n}_{0(0)}$ ENs are scattered according to BPP within $b(\chi_j, r_f)$, identity (24) can be used to obtain the following closed-form expression whereas the $\alpha_{x(1)}$-th moment of the distance from the center of the ball $b(\chi_j, R_f)$ to the $i$-th node that is located beyond $r_f$ can be obtained by where the distance distribution function is given by [18] where ${}_2F_1(a, b; c; z)$ stands for the Gauss hypergeometric function. Generalizing (45) for $(\tilde{n}_0 - \tilde{n}_{0(0)})$ ENs, then using the series representation of ${}_2F_1$ yields: Using identity (23), we have It can be verified that, Then, we obtain Since there is a single cloud center, following the same steps of deriving (50), the closed-form expression of $\sum_{j=1}^{n_1} \mathbb{E}\left[1/\ell_2(y_j)\right]$ can be obtained as where $\alpha_{y(0)}$ and $\alpha_{y(1)}$ are the PLEs before and beyond $r_c$, respectively, and $n_{1(0)}$ is the expected number of FNs that lie inside $b(o, r_c)$, which can be calculated as where $\lambda_{n_1} = n_1/(\pi R^2)$ denotes the FN density, and $A_{r_c} = \pi r_c^2$ stands for the area of $b(o, r_c)$.
Then, the objective function that utilizes DS-PLM can be expressed as Substituting the values of $n_1$, $n_{1(0)}$, $\tilde{n}_0$, $\tilde{n}_{0(0)}$, and $R_f$ into (53), we obtain where $Q = R^2/(n r_f^2)$ and $W = R^2/r_c^2$.

Lemma 2.
There is a unique global optimum value of $p$ in the range $0 < p < 1$ such that $p < R^2/(n r_f^2)$ (i.e., $R_f > r_f$) that minimizes (54).
Proof. To prove that (54) is strictly convex in the range of $p$ as specified in Lemma 2, the second derivative test is used. The second derivative of (54) w.r.t. $p$ is After performing some mathematical operations on (55), we have; It can be observed that the inequality in (56) is satisfied since the first term of the left-hand side is always greater than one for any value of $p$ in its specified range, while the second term is less than one. Therefore, (54) is strictly convex in the specified range, and hence there is a unique global optimum value of $p$, which minimizes (54).
The global optimum value of $p$ is the real root in the specified range in Lemma 2 of: which can be rewritten after some simple algebraic operations as: where and Note that (58) is a polynomial equation of the degree $(2 + \alpha_{x(1)}/2)$, which can be solved numerically, by factorization, using algebraic geometry, or by any other possible method. For example, the well-known Graeffe's method of solving polynomial equations can be used to compute all the roots of (29) and (58), where the computational complexity estimation indicates that all the roots can be computed using $O(\theta^2 \log\theta\,(\theta \log\theta + \log(1/\epsilon)))$ arithmetic operations, where $\theta$ is the degree of the polynomial and $\epsilon > 0$ is the relative output error bound [42]. In the following corollaries, we provide closed-form expressions of the optimum values of $p$ and $n_1$ for the special cases in which the far-field PLEs of the links between the ENs and the FN are $\alpha_{x(1)} = 2$ and $\alpha_{x(1)} = 4$.
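In practice, picking the admissible root can be done with any general-purpose polynomial solver. The Python sketch below uses `numpy.roots`, which computes the eigenvalues of the companion matrix, in place of Graeffe's method, and then keeps only the real roots inside the admissible interval. The coefficient vector in the example is hypothetical and chosen so that the answer is known; it does not correspond to the coefficients of (29) or (58). When the degree is not an integer (for instance, an odd far-field PLE), a change of variable would be needed before this approach applies.

```python
import numpy as np

def real_roots_in_range(coeffs, lo, hi, imag_tol=1e-9):
    """Return the sorted real roots of a polynomial that lie strictly inside (lo, hi).

    coeffs lists the coefficients from the highest degree down, as numpy.roots expects.
    """
    roots = np.roots(coeffs)
    real = roots[np.abs(roots.imag) < imag_tol].real
    return sorted(float(r) for r in real if lo < r < hi)

# Hypothetical cubic (p - 0.2)(p^2 + 1) = p^3 - 0.2 p^2 + p - 0.2, with one real root at 0.2.
candidates = real_roots_in_range([1.0, -0.2, 1.0, -0.2], 0.0, 1.0)
```

If the interval is narrowed to the range of Lemma 2, i.e., the upper limit is set to $\min(1, R^2/(n r_f^2))$, the same helper returns the root that the lemma guarantees to be unique.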

Corollary 3.
When $\alpha_{x(1)} = 2$, the closed-form expressions of the optimum values of $p$ and $n_1$ can be obtained by as a consequence, the optimum number of FNs is given as
Proof. Substituting $\alpha_{x(1)} = 2$ into (58), it reduces to a cubic equation. Hence, the optimum value of $p$ is obtained as in (62) by solving for the real root in the range of $p$ specified in Lemma 2.

Corollary 4.
When $\alpha_{x(1)} = 4$, the optimum value of $p$ can be expressed as accordingly, the optimum number of FNs can be computed by:
Proof. Equation (58) becomes a quartic equation when $\alpha_{x(1)} = 4$. Therefore, the optimum value of $p$ given by (67) is the real root of the quartic equation in the range stated in Lemma 2.

N-Slope
The objective function utilizing DS-PLM can be extended to NS-PLM as in Lemma 3. The optimum number of FNs for NS-PLM can be calculated by solving for the real root of the first derivative of the objective function, or by using any numerical method when a closed form of the root cannot be obtained due to the complexity.

Numerical Results
Here, we present the numerical results of optimizing the number of FNs in a finite BPP F-RAN. We studied the impacts of the DS-PLM parameters on the optimum number of FNs, including the PLEs of near-and far-fields, and the critical distance. Also, the impact of the number of nodes scattered in the deployment area is analysed.
We consider a disk-shaped deployment area of radius $R = 5$ km, in which the cloud data center is located at the center and the nodes are uniformly scattered according to BPP. In order to analyse the impacts of the network parameters on the optimum number of FNs, we plot the objective function in (54) w.r.t. $p$ for various values of the considered parameter. Moreover, MATLAB simulations were performed, in which a noise spectral density of −174 dBm/Hz and a bandwidth of $W = 1$ GHz are assumed, to study the average transmission delay of an EN that transmits a packet of size $S = 1$ kbit to the FN, which in turn forwards 50% of it to the cloud. The average delay for a single EN is obtained by averaging over 1 million iterations for both the optimized and the unoptimized number of FNs. In the optimized case, the number of FNs is fixed to the optimum number in all the iterations, whereas in the unoptimized case, a random number of FNs (such that $0 < n_1 < n$) is generated for each iteration. Note that the y axis in all figures in this section is in log scale.
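The noise power implied by these simulation settings follows from a short calculation, sketched here in Python. The only inputs are the stated noise spectral density and bandwidth; the function names are hypothetical, and no receiver noise figure is included because the text does not mention one.

```python
import math

def noise_power_dbm(psd_dbm_per_hz, bandwidth_hz):
    """Total noise power in dBm: spectral density plus 10*log10 of the bandwidth in Hz."""
    return psd_dbm_per_hz + 10.0 * math.log10(bandwidth_hz)

def dbm_to_watts(power_dbm):
    """Convert a power level from dBm to watts."""
    return 10.0 ** ((power_dbm - 30.0) / 10.0)

# Settings stated in the text: -174 dBm/Hz over a 1 GHz bandwidth.
sigma2_dbm = noise_power_dbm(-174.0, 1e9)
sigma2_w = dbm_to_watts(sigma2_dbm)
```

A bandwidth of 1 GHz adds 90 dB to the spectral density, so the noise floor is −84 dBm, or about 4 pW.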
In Figure 2, we illustrate the objective function that utilizes SS-PLM for $\alpha_x = \alpha_y = 4$. The figure reveals that our optimization approach, which assumes that the FNs are scattered to cover the deployment area, achieves lower optimum probabilities of being a FN, and hence fewer FNs are required to optimize the performance compared with the approach presented in [24], which assumes that the FNs constitute a circular mesh around the cloud data center.
Figure 2. Objective function ($J_1$); curves: proposed, n = 100; proposed, n = 1000; [24], n = 100; [24], n = 1000.
The relationship between the PLEs of the uplink from the EN to the FN and the objective function is illustrated in Figure 3, in which we plot the objective function versus $p$ for various combinations of the near- and far-field PLEs of the wireless uplink from the EN to the FN. It can be observed that each objective function curve has a unique minimum point, which represents the optimum value of the probability of being a FN, which in turn specifies the optimum number of FNs. The values of the optimum $p$, $n_1$, and the average number of ENs per FN that correspond to Figure 3 are given in Table 1. The results show that when the links from the ENs to the FNs experience low PLEs (i.e., $\alpha_{x(0)} = 2$ and $\alpha_{x(1)} = 3$), a small percentage of the nodes, 3.39% (i.e., about 17 nodes), needs to be upgraded to FNs to optimize the transmission delay in the considered F-RAN. In this case, it is worth highlighting that increasing the number of FNs will not improve the performance, because more FNs will be located at the edge of the network, and thus the direct links between them and the cloud will experience higher path losses due to the larger distance to the cloud compared with the distance to the closest FN, which in turn degrades the performance.
In the second case, when the far-field PLE of the links from the ENs to the FNs increases (i.e., $\alpha_{x(1)} = 4$), the farthest nodes from the FNs will experience higher path losses; hence, a larger number of FNs than in the first case, about 34 FNs, is required to improve the performance. However, as the links to the FNs experience severe path losses due to the higher PLE of the near-field (i.e., $\alpha_{x(0)} = 3$), the value of $p$ that optimizes the transmission delay leaps to 0.2169 (i.e., about 108 FNs), with about 3.61 ENs on average being associated with each FN. Hence, in non-latency-sensitive IoT systems, direct communication between the ENs and the cloud might be more efficient and cost-saving, owing to the cost of the large number of FNs that need extra computational capabilities. A trade-off between the delay and the cost should therefore be made in this case. Moreover, it is worth pointing out that as the links to the FNs experience lower path losses, additional computational capabilities should be provided to the FNs, since a larger number of ENs will be associated with each FN.
Figure 4 demonstrates the simulated transmission delays of the aforementioned cases. The figure shows that the average transmission delay for a data packet increases as the links to the FNs experience higher path loss, as a result of the lower achievable data rates. Furthermore, the transmission delay decreases as the transmission power increases, owing to the higher SINRs and thus higher data rates. The figure also shows that optimizing the number of FNs reduces the transmission delay to between $1 \times 10^{-4}$ and $1 \times 10^{-6}$ of its value in the unoptimized cases.
The impact of the PLEs of the link from the FN to the cloud on the objective function is shown in Figure 5. The figure shows that the optimum value of $p$ decreases as the links to the cloud experience higher PLEs, which is the opposite of the behavior observed in Figure 3 when the PLEs of the links between the ENs and the FNs increase. This can be explained by the fact that higher PLEs result in a higher path loss of the links between the FNs and the cloud. Thus, the path loss of the direct links between some of the FNs and the cloud will be higher than the path loss if those nodes utilize other FNs to communicate with the cloud. Therefore, downgrading those FNs to ENs results in a lower average transmission delay. Table 2 provides the values of the optimum $p$, $n_1$, and $n_0/n_1$ for the curves in Figure 5. The table indicates that when the links between the FNs and the cloud are subjected to higher path loss, a higher number of ENs are associated with each FN, which requires more computational capabilities and a larger bandwidth to be allocated to the FNs.
Figure 6 investigates the impact of the critical radius of the FN on the optimum number of FNs. The figure shows that when the critical radius of the FN increases, the optimum value of $p$ decreases; hence, fewer FNs are required to optimize the average transmission delay. This is owing to the larger number of ENs that reside within the critical radius, where their links to the FN are subjected to the low near-field PLE, which results in a lower average path loss that requires fewer FNs to optimize the performance.
The impact of the critical radius of the cloud on the optimum number of FNs is illustrated in Figure 7. The figure shows that the optimum value of $p$, and as a consequence the optimum number of FNs, decreases with an increase in the critical radius of the cloud. The observed behavior is owing to the fact that as the critical radius of the cloud increases, more FNs reside within it with a lower path loss due to the low PLE of the near-field, which results in a lower transmission delay through those nodes. Thus, some of the FNs beyond $r_c$ are downgraded to ENs because their transmission delays through other FNs in the extended radius are less than the transmission delays of the direct links between them and the cloud.
Table 3 shows the impact of the number of deployed nodes on the optimum number of FNs. We observe that increasing the number of deployed nodes $n$ results in a decrease in the optimum value of $p$, which is due to the higher probability that there is a FN in the vicinity of the EN because of the shorter probable separation distances between the nodes, and thus the lower probability that the node is selected as a FN. Given that a lower probability of being a FN implies a higher number of ENs controlled by each FN, more computation and bandwidth resources need to be assigned to the FNs.
Table 3. The optimum value of $p$, $n_1$, and $n_0/n_1$ at $\alpha_{x(0)} = 2$, $\alpha_{x(1)} = 4$, $\alpha_{y(0)} = 2$, $\alpha_{y(1)} = 3$, $r_f = 100$ m, and $r_c = 500$ m.

Limitations and Future Work
The main objective of this paper is to optimize the number of FNs that minimizes the transmission delay for an uplink finite F-RAN. However, due to the small values of the propagation and processing delays compared to the transmission delay, both of them are omitted. Moreover, the impact of the interference on the delay is not investigated for the sake of analytical tractability. In future studies, the impacts of interference, other sources of delay, and the mobility of the nodes on the optimum number of FNs for F-RAN will be investigated. However, such systems are very complex, and thus the convexity of the optimization problem cannot be assured, nor can a closed-form expression of the optimum solution be obtained. Therefore, finding the optimum solution using heuristic optimization algorithms, such as Red Fox and Slime Mould, machine learning, or deep learning is highlighted as an open research direction.

Concluding Remarks
In this paper, we proposed a framework that uses stochastic geometry and exploits MS-PLM to minimize the transmission delay in finite F-RANs by optimizing the number of FNs. We showed that the optimum number of FNs can be obtained by solving for the real root of a polynomial equation, the degree of which is determined by the far-field PLE of the link from the ENs to the FNs. Our simulation results show a significant reduction in the transmission delay gained by optimizing the number of FNs. Also, the impacts of the path loss parameters on the number of FNs have been analyzed. The results show that a small percentage of the deployed nodes are required to be selected as FNs to optimize the delay when the links to the FN experience a low path loss. Thus, additional bandwidth and computational capabilities are required at the FNs. However, this percentage increases as the PLEs of the links become higher. The results demonstrate that a larger number of FNs is needed to optimize the performance when the path loss of the links to the FNs is higher than the path loss of the links to the cloud. Therefore, in these circumstances, centralized cloud networks can achieve better performance. The impact of the critical distance was also studied, which shows that the optimum number of FNs decreases when the critical distance increases. We also observe that networks with densely deployed nodes require a smaller percentage of them to be upgraded to FNs in order to minimize the transmission delay.
In general, the proposed approach can be applied to optimize the number of FNs in any IoT F-RAN, including NB-IoT and CAT-M1 networks, if the nodes selected as FNs are provided with higher computational and bandwidth resources. Finally, due to the accuracy attained by utilizing MS-PLM, our results provide a better insight into the design of F-RANs for more efficient utilization of FNs and virtualization of the cloud.