Federated Learning in Small-Cell Networks: Stochastic Geometry-Based Analysis on the Required Base Station Density

Nguyen, Khoa Anh; Nguyen, Quan Anh; Hong, Jun-Pyo

doi:10.3390/s23167184


by Khoa Anh Nguyen ¹, Quan Anh Nguyen ² and Jun-Pyo Hong ¹,*

¹ Department of Information and Communications Engineering, Pukyong National University, Busan 48513, Republic of Korea

² Hello Health Group, Singapore 079333, Singapore

* Author to whom correspondence should be addressed.

Sensors 2023, 23(16), 7184; https://doi.org/10.3390/s23167184

Submission received: 10 July 2023 / Revised: 8 August 2023 / Accepted: 14 August 2023 / Published: 15 August 2023

(This article belongs to the Special Issue AI-Empowered Wireless Communications)


Abstract

Recently, federated learning (FL) has been receiving great attention as an effective machine learning method to avoid the security issue in raw data collection, as well as to distribute the computing load to edge devices. However, even though wireless communication is an essential component for implementing FL in edge networks, there have been few works that analyze the effect of wireless networks on FL. In this paper, we investigate FL in small-cell networks where multiple base stations (BSs) and users are located according to a homogeneous Poisson point process (PPP) with different densities. We comprehensively analyze the effects of geographic node deployment on the model aggregation in FL on the basis of stochastic geometry-based analysis. We derive the closed-form expressions of coverage probability with tractable approximations and discuss the minimum required BS density for achieving a target model aggregation rate in small-cell networks. Our analysis and simulation results provide insightful information for understanding the behaviors of FL in small-cell networks; these can be exploited as a guideline for designing the network facilitating wireless FL.

Keywords:

federated learning; small-cell networks; stochastic geometry; base station density; Poisson point process

1. Introduction

Recent advances in the sensing and computation capabilities of mobile devices make it possible for end-users to generate various types of data and exploit these data for autonomous and intelligent services [1]. Machine learning (ML), which builds a mathematical model based on training data to make predictions or decisions without human intervention [2], can be considered an essential building block for embedding artificial intelligence (AI) within services.

Traditional ML technologies depend on the prerequisite that the private data on edge devices should be collected at a central parameter server for training the model. However, this centralized approach has to bear the risk of personal data exposure and can violate privacy regulations, which are becoming progressively more stringent over time [3]. This also leads to immense communication overheads for data transmission, causing intolerable latency and communication resource inefficiency [4].

To deal with the limitations of traditional centralized ML, federated learning (FL) has emerged. FL adopts a distributed training approach, where each edge device trains a common model with its own local data samples and forwards the locally trained model to the parameter server for subsequent operations [5]. Accordingly, by decoupling model training from the need to collect private data, FL enables users to exploit ML models trained on enormous amounts of data without severely compromising user security and privacy or incurring heavy communication costs, which makes it relevant for numerous wireless applications [6].

To fulfill stringent performance requirements, wireless communication networks have evolved in a complicated manner. In particular, the emergence of small-cell networks has made it harder to optimize network performance in the presence of inter-cell interference [4]. Thus, FL in small-cell networks is not well understood, which creates challenges for FL implementation in cellular settings [6]. Many previous studies propose methods to address the challenges of FL performance in wireless communications.

Amongst the primary challenges, communication efficiency is a critical training bottleneck due to updates’ high dimensionality, massive quantities of devices, or unreliability of devices’ network conditions. To alleviate this issue, importance-based updating schemes have been studied in [7,8]. The edge stochastic gradient descent (eSGD) algorithm in [7] assigns only a fraction of the gradients for the model update based on the loss values at two successive training iterations, which helps to save a significant fraction of communication resources. In [8], a communication-mitigated federated learning (CMFL) algorithm has been proposed to compare a participant’s local update with the global value to evaluate the update’s relevance, and the irrelevant updates are eliminated to improve accuracy. To reduce the reporting data size, local update compression has been considered in [9,10]. With parameter pruning, trained quantization, and Huffman coding, it has been shown that the size of a well-known model can be significantly reduced without loss of accuracy [9]. In addition to reducing model size with compression, federated dropout, which allows users to train only a subset of the model, has been proposed to not only improve the communication efficiency but also reduce the local computation.

Notably, over-the-air computation (AirComp) has received attention as an alternative approach for communication-efficient FL by leveraging the superposition property of a multiple-access channel. Specifically, if multiple devices transmit their analog-modulated local update signals over the same communication resource, the fusion center (FC) can obtain the averaged local updates from its received signal without additional computation at FC. When the number of participating devices is large, AirComp-based FL has been shown to outperform traditional digital communication-based FL in terms of the number of channel users necessary for model convergence [11,12,13].

Although there have been several works that address the challenges of incorporating federated learning into wireless networks, FL over a cellular network with multiple base stations (BSs) and devices has not yet been investigated while taking geographic deployment into account. The effect of geographic deployment of BSs and users can be well described on the basis of stochastic geometry-based analysis with the assumption of a Poisson point process (PPP) [14,15]. Although conventional work on stochastic geometry-based cellular network analysis has provided meaningful information for understanding cellular networks by providing closed-form solutions, most analysis results cannot be directly applied to FL scenarios because of assumptions made for general cellular communication. For example, some have assumed that each BS serves only one user with a given resource block at a time and that there is at least one user within a Voronoi cell. Several papers [16,17,18] have explored the topic of FL with multi-user association. They have focused on proposing new communication architectures to reduce the communication overhead for FL, called hierarchical FL, in small-scale cellular networks. Even though they have proposed new communication architectures for communication-efficient FL, it is hard to directly apply them to existing cellular networks with distance-based user association and inter-cell interference. Furthermore, their performance analysis results do not easily offer insights into large-scale cellular networks, especially when considering the geographic deployment of nodes.

Motivated by these, we investigated multi-user association-based model aggregation for FL in large-scale cellular networks. Our work focused on analyzing the model aggregation performance and optimizing certain system parameters for communication-efficient FL within existing cellular networks, rather than proposing a new communication strategy. Based on stochastic geometry-based analysis, we derived the closed-form expressions of coverage probability with tractable approximations. With the closed-form expressions, we derived the minimum required BS density for achieving a target model aggregation rate by using a two-step iterative method. The proposed algorithm is helpful to reduce the time and spectrum resources consumed for FL in large-scale cellular networks. Simulation results validated that the coverage probability expressions obtained with stochastic geometry-based analysis describe the actual coverage probability well. Furthermore, simulation results showed the effects of system parameters on the optimized transmission rate and BS density for achieving the target model aggregation performance. Our analysis and simulation results with discussions on the model aggregation rate are valuable for providing insightful information for understanding the behaviors of FL in large-scale cellular networks. They can also be exploited as guidelines for designing cellular networks that facilitate wireless FL.

Considering the aforementioned motivations, we summarize the contributions of this paper below:

To the best of our knowledge, this is the first work that takes the geometry of wireless communication network into account in analyzing FL performance, that analyzes the model aggregation performance, and that optimizes certain system parameters for communication-efficient FL within existing cellular networks;
Based on the stochastic geometry framework, we derive the closed-form expressions of the approximated coverage probability for some special cases in small-cell networks where each base station is capable of receiving updates from multiple associated devices with orthogonal spectrum allocation;
With the closed-form expressions, we propose an iterative algorithm to optimize communication parameters for achieving a target model aggregation rate which can be helpful to reduce the time and spectrum resources consumed for FL in large-scale cellular networks. The purpose is to minimize the communication latency in model aggregation of FL under the existing cellular network with distance-based user association;
Analysis and simulation results provide insightful information for understanding the behaviors of FL over large-scale small-cell networks and provide a guideline for designing cellular networks which facilitate wireless FL.

The remaining sections in this paper are organized as follows. Section 2 describes the system model, discusses the key network parameters of small-cell network, and formulates optimization problems for two different FL scenarios. Section 3 derives the closed-form expressions of coverage probability in small-cell networks. Based on the coverage expressions, we derive solutions to the optimization problems with the iterative method in Section 4. Section 5 validates analysis results through extensive simulations. Finally, Section 6 concludes the paper.

2. System Model

We consider FL in small-cell networks, where the FC updates the global model by combining the locally updated models delivered from users through BSs to the FC. The links between BSs and FC are assumed to have infinite capacity within their wired backhaul connections. BSs and users are assumed to be located in the Euclidean plane, according to homogeneous PPPs $Φ_{BS}$ with density $λ_{BS}$ and $Φ_{UE}$ with density $λ_{UE}$. This assumption makes the network performance analysis significantly more tractable than the traditional grid-based analysis, without causing significant error in the network performance [15]. Each user is assumed to be associated with its nearest BS, so that each BS associates with all users located in its Voronoi cell. An example of BS and user deployment is illustrated in Figure 1.
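The deployment model above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' simulator: the function names, the finite square window, and the densities are assumptions chosen for the example, and a finite window only approximates a PPP on the whole plane (cells near the border are distorted).

```python
import random


def sample_poisson(mean, rng):
    """Draw a Poisson variate by counting unit-rate exponential arrivals in [0, mean]."""
    count, elapsed = 0, rng.expovariate(1.0)
    while elapsed <= mean:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def sample_ppp(density, side, rng):
    """Homogeneous PPP on a side x side square: Poisson count, then uniform points."""
    n = sample_poisson(density * side * side, rng)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]


def associate(users, stations):
    """For each BS, count the users whose nearest BS it is (the load of its Voronoi cell)."""
    load = [0] * len(stations)
    for ux, uy in users:
        nearest = min(range(len(stations)),
                      key=lambda i: (stations[i][0] - ux) ** 2 + (stations[i][1] - uy) ** 2)
        load[nearest] += 1
    return load


rng = random.Random(0)
stations = sample_ppp(density=2.0, side=5.0, rng=rng)   # lambda_BS = 2 per unit area (illustrative)
users = sample_ppp(density=10.0, side=5.0, rng=rng)     # lambda_UE = 10 per unit area (illustrative)
load = associate(users, stations)
empty_cells = sum(1 for k in load if k == 0)
print(len(stations), len(users), empty_cells)
```

The list `load` holds one realization of the per-cell user count $K$, and `empty_cells` counts the BSs that would be inactive in that realization.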

The procedures of FL in small-cell networks can be summarized as follows:

1. At the beginning of update round t, FC broadcasts the parameters of the global model, $w^{(t)} \in \mathbb{R}^{D}$, to the distributed users via BSs. Since BSs deliver the same signal $w^{(t)}$, all users are assumed to successfully receive $w^{(t)}$ without interference;
2. Each user updates the model parameters on the basis of its local dataset and transmits its local update model to the associated BS. If there are $K_{i}$ users associated with BS $i \in Φ_{BS}$, each associated user transmits its local update model with the bandwidth $\frac{B}{K_{i}}$, where B denotes the total bandwidth;
3. Each BS averages the local updates received from its associated users and forwards the result to FC;
4. FC updates the global model by combining the parameters aggregated from BSs. Then, the method proceeds to update round $t + 1$ by starting from step 1 if the convergence condition is not satisfied.
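One update round of this procedure can be sketched as follows. This is a toy illustration under stated assumptions: local training is replaced by a single gradient step on a quadratic loss, every transmission is assumed to succeed, and the FC weights each BS average by its number of users (the text only says that the FC "combines" the BS aggregates, so this weighting is an assumption). All function names are hypothetical.

```python
def local_update(w, data, lr=0.1):
    """One gradient step on half the squared distance to the local sample mean (toy training)."""
    grad = [wi - sum(x[d] for x in data) / len(data) for d, wi in enumerate(w)]
    return [wi - lr * gi for wi, gi in zip(w, grad)]


def average(models, weights):
    """Weighted average of equally sized parameter vectors."""
    total = float(sum(weights))
    return [sum(wt * m[d] for m, wt in zip(models, weights)) / total
            for d in range(len(models[0]))]


def fl_round(w_global, cells):
    """cells: one entry per BS, each a list of user datasets; BSs without users are skipped."""
    bs_models, bs_sizes = [], []
    for users in cells:
        if not users:
            continue                                                  # inactive BS
        local = [local_update(w_global, data) for data in users]     # user-side update
        bs_models.append(average(local, [1] * len(local)))           # BS-side averaging
        bs_sizes.append(len(local))
    return average(bs_models, bs_sizes)                               # FC combining (assumed weighting)


# Three BSs: two users, no users, one user. Each user holds one 2-D sample.
cells = [[[(1.0, 2.0)], [(3.0, 4.0)]], [], [[(5.0, 6.0)]]]
w_next = fl_round([0.0, 0.0], cells)
print(w_next)
```

With user-count weighting, the two-stage average equals the plain average over all participating users, which is why this choice was made for the sketch.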

Since the local update report from a user to its associated BS suffers from inter-cell interference and limited communication resources, step 2 can easily become a bottleneck for FL in small-cell networks. For this reason, we focus on step 2 to facilitate FL in small-cell networks.

Following Slivnyak’s theorem [19], all analysis is conducted for a typical BS located at the origin $o \in Φ_{BS}$. The received signal of the typical BS can be represented as

$$
y_{o} = h_{j} r_{j}^{-α/2} x_{j} + \sum_{j' \in Φ_{o,\mathrm{inf}}} h_{j'} r_{j'}^{-α/2} x_{j'} + w, \tag{1}
$$

where the index j represents one of the users associated with the typical BS, $h_{i} \sim \mathcal{CN}(0, 1)$ denotes the Rayleigh block fading channel gain of user i, $r_{i}$ denotes the link distance between user i and the typical BS, $α$ denotes the path-loss exponent, $x_{i}$ denotes the transmit signal of user i, and $w \sim \mathcal{CN}(0, σ^{2})$ denotes additive noise. All users are assumed to transmit their update with a constant power P. The set $Φ_{o,\mathrm{inf}}$ consists of the other-cell users interfering with the signal reception of the BS. The channel state information (CSI) $h_{j}$ is assumed to be available only at the BS.

By treating interference as additive noise, the received signal-to-interference-plus-noise ratio (SINR) of a user can be represented as

$$
\begin{aligned}
γ &= \frac{|h_{j}|^{2} r_{j}^{-α} P}{\sum_{j' \in Φ_{o,\mathrm{inf}}} |h_{j'}|^{2} r_{j'}^{-α} P + σ^{2}} \\
&= \frac{g_{j} r_{j}^{-α}}{\sum_{j' \in Φ_{o,\mathrm{inf}}} g_{j'} r_{j'}^{-α} + σ^{2}},
\end{aligned} \tag{2}
$$

where $g_{i} = |h_{i}|^{2} P$ follows an exponential distribution with mean P. For the transmission rate u, the conditional coverage probability given $k > 0$ users within the Voronoi cell is defined as

$$
\begin{aligned}
p_{c}(k) &= \Pr\left[\frac{B}{K} \log_{2}(1 + γ) \geq u \,\middle|\, K = k\right] \\
&= \Pr\left[γ \geq 2^{\frac{u K}{B}} - 1 \,\middle|\, K = k\right] \\
&= \Pr\left[γ \geq T(K) \,\middle|\, K = k\right],
\end{aligned} \tag{3}
$$

where $T(K) = 2^{\frac{u K}{B}} - 1$ denotes the SINR threshold for successful signal reception. Accordingly, for a unit area of model aggregation, the expected number of aggregated bits of locally trained models in a single transmission interval can be represented as

$$
\begin{aligned}
Q &= u λ_{UE} \sum_{k = 1}^{\infty} \Pr[K = k] \, p_{c}(k) \\
&= u λ_{UE} P_{c},
\end{aligned} \tag{4}
$$

where $P_{c} ≜ \sum_{k = 1}^{\infty} \Pr[K = k] \, p_{c}(k)$ denotes the coverage probability. As a result, Q is directly related to the communication latency in the model aggregation phase of FL. The time and spectrum resources consumed for successful model aggregation are inversely proportional to Q. Since the model aggregation phase is considered a bottleneck in FL over a wireless network, we believe that improving the model aggregation rate is important to expedite communication-efficient FL. Based on this understanding, we considered the following optimization problems in two different scenarios.
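To make the role of the equal-bandwidth split concrete, the threshold $T(k)$ and the rate $Q$ can be evaluated directly. The sketch below is illustrative only: the function names and the numbers (a 10 MHz band and a 1 Mbit/s target rate) are assumptions, not values taken from the paper.

```python
def sinr_threshold(u, k, bandwidth):
    """T(k) = 2^(u k / B) - 1: SINR needed to carry rate u in a B/k share of the band."""
    return 2.0 ** (u * k / bandwidth) - 1.0


def aggregation_rate(u, lambda_ue, coverage_prob):
    """Q = u * lambda_UE * P_c, as in Eq. (4)."""
    return u * lambda_ue * coverage_prob


# Illustrative numbers: B = 10 MHz, u = 1 Mbit/s, and k users sharing the band.
thresholds = {k: sinr_threshold(1e6, k, 10e6) for k in (1, 5, 20)}
for k, t in thresholds.items():
    print(k, t)
```

The threshold grows exponentially with the number of users sharing the band, which is why a crowded cell is much harder to cover than a lightly loaded one at the same rate u.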

In the first scenario, for given node densities $λ_{BS}$ and $λ_{UE}$, we optimized the transmission rate u to maximize the model aggregation rate Q. Then, the problem was formulated as

$$
\underset{u}{\text{maximize}} \quad Q. \tag{5}
$$

In the second scenario, we optimized $λ_{BS}$ as well as u to find the minimum required BS deployment for achieving a target aggregation rate $Q_{target}$. Then, the problem could be formulated as

$$
\begin{aligned}
\underset{λ_{BS},\, u}{\text{minimize}} \quad & λ_{BS} \\
\text{subject to} \quad & Q \geq Q_{target}.
\end{aligned} \tag{6}
$$

3. Performance Analysis and Proposed Algorithm

In a PPP, the number of users in a typical Voronoi cell, K, depends on the Voronoi cell size. For a given cell of size $A = a$, the number of users K follows a Poisson distribution with mean $a λ_{UE}$. Accordingly, the probability mass function (PMF) of K can be expanded as

$$
\begin{aligned}
\Pr[K = k] &= \int_{0}^{\infty} \Pr[K = k \mid A = a] \cdot f_{A}(a) \, da \\
&= \int_{0}^{\infty} \frac{(λ_{UE} a)^{k}}{k!} e^{-λ_{UE} a} \cdot \frac{c^{c}}{Γ(c)} λ_{BS}^{c} a^{c - 1} e^{-c λ_{BS} a} \, da,
\end{aligned} \tag{7}
$$

where $Γ(\cdot)$ denotes the Gamma function, $f_{A}(a) = \frac{c^{c}}{Γ(c)} λ_{BS}^{c} a^{c - 1} e^{-c λ_{BS} a}$ denotes the probability density function (PDF) of the typical Voronoi cell size [20,21], and $c = 3.5$ is a constant. Thus,

$$
\begin{aligned}
\Pr[K = k] &= \frac{c^{c}}{Γ(c)} \frac{λ_{UE}^{k} λ_{BS}^{c}}{k!} \int_{0}^{\infty} a^{k + c - 1} e^{-(λ_{UE} + c λ_{BS}) a} \, da \\
&= \frac{c^{c}}{Γ(c)} \frac{λ_{UE}^{k} λ_{BS}^{c}}{k!} \frac{Γ(k + c)}{(λ_{UE} + c λ_{BS})^{k + c}} \\
&= \frac{c^{c}}{Γ(c)} \frac{(λ_{BS} / λ_{UE})^{c}}{k!} \frac{Γ(k + c)}{(1 + c λ_{BS} / λ_{UE})^{k + c}} \\
&= \frac{c^{c}}{Γ(c)} \frac{r_{λ}^{c}}{k!} \frac{Γ(k + c)}{(1 + c r_{λ})^{k + c}} \\
&= \frac{\left(\frac{c r_{λ}}{1 + c r_{λ}}\right)^{c}}{Γ(c)} \frac{Γ(k + c)}{k! \, (1 + c r_{λ})^{k}},
\end{aligned} \tag{8}
$$

where $r_{λ} = \frac{λ_{BS}}{λ_{UE}}$. The number of users in the typical Voronoi cell therefore depends on the BS and user densities only through their ratio.
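The PMF in (8) is easy to evaluate numerically. The following sketch (function name and the value $r_{λ} = 0.1$ are illustrative assumptions) works in the log domain so that $Γ(k + c)$ does not overflow for large k, and checks two sanity properties: the probabilities sum to one and the mean equals $1 / r_{λ}$.

```python
import math

C_SHAPE = 3.5  # the constant c of the cell-size PDF in the text


def pmf_users(k, r_lambda, c=C_SHAPE):
    """Pr[K = k] from Eq. (8), evaluated in the log domain for numerical stability."""
    p = c * r_lambda / (1.0 + c * r_lambda)          # c r / (1 + c r)
    log_pmf = (c * math.log(p)
               + math.lgamma(k + c) - math.lgamma(c) - math.lgamma(k + 1)
               - k * math.log(1.0 + c * r_lambda))
    return math.exp(log_pmf)


r_lambda = 0.1                                        # lambda_BS / lambda_UE (illustrative)
probs = [pmf_users(k, r_lambda) for k in range(400)]  # truncation at 400 is ample for this ratio
total = sum(probs)
mean = sum(k * p for k, p in enumerate(probs))
print(total, mean, probs[0])
```

Written this way, (8) is a negative binomial distribution with shape c and success probability $c r_{λ} / (1 + c r_{λ})$, so `probs[0]` is the probability that a cell is empty.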

3.1. Coverage Probability

In our uplink system model, the conditional coverage probability can be obtained by modifying the coverage probability expression of the downlink system considered in [15]. The downlink system in [15] assumed there was at least one user in every Voronoi cell, and, therefore, all BSs were active: $λ_{BS} = λ_{act}$. However, such an assumption is valid only when the ratio $r_{λ}$ is very low. To derive a general expression applicable to various BS/user deployments, we relaxed the low-$r_{λ}$ assumption and introduced the active BS density into the coverage probability expression. Furthermore, since our system model assumed that each BS served all its associated users with equal-bandwidth allocation (contrary to the downlink system in [15]), the SINR threshold became a function of the random variable K in the coverage probability expression. By additionally taking into account the active BS density and the random SINR threshold, the conditional coverage probability (3) could be represented as the following lemma.

Lemma 1 (Conditional coverage probability). For a given transmission rate u and $K = k$, the conditional coverage probability (3) can be computed as

$$
p_{c}(k) = π λ_{BS} \int_{0}^{\infty} e^{-π (λ_{BS} + λ_{act} ρ(k)) v - \frac{1}{P} T(k) σ^{2} v^{α/2}} \, dv, \tag{9}
$$

where

$$
ρ(k) = T(k)^{2/α} \int_{T(k)^{-2/α}}^{\infty} \frac{1}{1 + t^{α/2}} \, dt, \tag{10}
$$

and $λ_{act}$ denotes the density of active BSs that contain at least one user in their Voronoi cell.

Based on the independence of BSs from user deployments, the deployment of active BSs follows a PPP with density

$$
\begin{aligned}
λ_{act} &= λ_{BS} \Pr[K \neq 0] \\
&= λ_{BS} \left(1 - \left(\frac{c r_{λ}}{1 + c r_{λ}}\right)^{c}\right).
\end{aligned} \tag{11}
$$

Then, based on (8) and (9), the coverage probability can be represented as

$$
\begin{aligned}
P_{c} &= \sum_{k = 1}^{\infty} p_{c}(k) \Pr[K = k] \\
&= \frac{π (λ_{BS} - λ_{act})}{Γ(c)} \sum_{k = 1}^{\infty} \frac{Γ(k + c)}{k! \, (1 + c r_{λ})^{k}} J(k),
\end{aligned} \tag{12}
$$

where $J(k) = \int_{0}^{\infty} e^{-π (λ_{BS} + λ_{act} ρ(k)) v - \frac{1}{P} T(k) σ^{2} v^{α/2}} \, dv$.

3.2. Coverage Probability in High SNR Regime

In a high SNR regime with a large P, the effect of additive noise becomes negligible compared to the inter-cell interference. Accordingly, the conditional coverage probability (9) simplifies to

$$
\begin{aligned}
p_{c}(k) &= π λ_{BS} \int_{0}^{\infty} e^{-π (λ_{BS} + λ_{act} ρ(k)) v} \, dv \\
&= π λ_{BS} \frac{1}{π (λ_{BS} + λ_{act} ρ(k))} \\
&= \frac{1}{1 + \Pr[K \neq 0] \, ρ(k)}.
\end{aligned} \tag{13}
$$

Then, the coverage probability (12) can be reduced to

$$
P_{c} = \frac{\Pr[K = 0]}{Γ(c)} \sum_{k = 1}^{\infty} \frac{Γ(k + c)}{k! \, (1 + c r_{λ})^{k}} \frac{1}{1 + \Pr[K \neq 0] \, ρ(k)}. \tag{14}
$$

Remark 1. In a high SNR regime, the following emerge:

The coverage probability (14) is dependent on the densities of BSs and users, since the probabilities $\Pr[K = 0]$ and $\Pr[K \neq 0]$ are functions of $r_{λ}$. This observation is different from the coverage probability expression presented in [15];
The conditional coverage probability is a decreasing function of the user density $λ_{UE}$, given $K = k$ and $λ_{BS}$, and it is bounded below by

$$
\begin{aligned}
p_{c}(k) &\geq \lim_{r_{λ} \to 0} p_{c}(k) \\
&= \frac{1}{1 + ρ(k)}.
\end{aligned} \tag{15}
$$

Special Case: $α = 4$

For a path-loss exponent $α = 4$, Equation (10) simplifies to

$$
\begin{aligned}
ρ(k) &= \sqrt{T(k)} \int_{1 / \sqrt{T(k)}}^{\infty} \frac{1}{1 + t^{2}} \, dt \\
&= \sqrt{T(k)} \arctan\left(\sqrt{T(k)}\right).
\end{aligned} \tag{16}
$$
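The last step in (16) uses a standard arctangent identity. Writing $s = \sqrt{T(k)} > 0$ (a shorthand introduced here only for this derivation), the integral evaluates as

$$
\int_{1/s}^{\infty} \frac{1}{1 + t^{2}} \, dt = \Big[\arctan t\Big]_{1/s}^{\infty} = \frac{π}{2} - \arctan\left(\frac{1}{s}\right) = \arctan(s),
$$

where the final equality holds because $\arctan(s) + \arctan(1/s) = π/2$ for every $s > 0$.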

Then, the coverage probability (14) reduces to

$$
P_{c} = \frac{\Pr[K = 0]}{Γ(c)} \sum_{k = 1}^{\infty} \frac{Γ(k + c)}{k! \, (1 + c r_{λ})^{k}} \times \frac{1}{1 + \Pr[K \neq 0] \sqrt{T(k)} \arctan\left(\sqrt{T(k)}\right)}. \tag{17}
$$

Even though the coverage probability is simplified with the assumption $α = 4$, its expression (17) is still complicated to handle. Hence, instead of computing the exact coverage probability $P_{c}$ by taking the expectation of $p_{c}(k)$ over the random variable K, we proposed to use its approximation $p_{c}(E[K])$. This approximation is validated later by the simulation results in Section 5.

The expected number of users in a Voronoi cell can be obtained by

$$
\begin{aligned}
E[K] &= λ_{UE} S(1, 1) E[A] \\
&= λ_{UE} \int_{0}^{\infty} a f_{A}(a) \, da \\
&= \frac{λ_{UE}}{λ_{BS}} = \frac{1}{r_{λ}},
\end{aligned} \tag{18}
$$

where $S(n, k)$ denotes the Stirling number of the second kind. Eventually, for high SNR and $α = 4$, the approximated coverage probability is represented by

$$
\begin{aligned}
P_{c} &\approx p_{c}(E[K]) \\
&= \frac{1}{1 + \left(1 - \left(\frac{c r_{λ}}{1 + c r_{λ}}\right)^{c}\right) \sqrt{2^{\frac{u}{B r_{λ}}} - 1} \, \arctan\left(\sqrt{2^{\frac{u}{B r_{λ}}} - 1}\right)}.
\end{aligned} \tag{19}
$$
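The series (17) and the approximation (19) can be compared numerically. The sketch below is a minimal illustration for the high-SNR, $α = 4$ case: the function names, the truncation of the infinite sum at `k_max`, and the parameter values (u in Mbit/s, B in MHz) are assumptions made for the example.

```python
import math

C_SHAPE = 3.5  # the constant c of the cell-size PDF


def pmf_users(k, r_lambda, c=C_SHAPE):
    """Pr[K = k] from Eq. (8), evaluated in the log domain."""
    p = c * r_lambda / (1.0 + c * r_lambda)
    return math.exp(c * math.log(p) + math.lgamma(k + c) - math.lgamma(c)
                    - math.lgamma(k + 1) - k * math.log(1.0 + c * r_lambda))


def cond_coverage(k, u, bandwidth, r_lambda, c=C_SHAPE):
    """p_c(k) for high SNR and alpha = 4, from Eqs. (13) and (16); k may be non-integer."""
    root_t = math.sqrt(2.0 ** (u * k / bandwidth) - 1.0)
    p_active = 1.0 - (c * r_lambda / (1.0 + c * r_lambda)) ** c   # Pr[K != 0]
    return 1.0 / (1.0 + p_active * root_t * math.atan(root_t))


def coverage_sum(u, bandwidth, r_lambda, k_max=300):
    """Eq. (17): sum over k >= 1 of Pr[K = k] p_c(k), truncated at k_max."""
    return sum(pmf_users(k, r_lambda) * cond_coverage(k, u, bandwidth, r_lambda)
               for k in range(1, k_max + 1))


def coverage_approx(u, bandwidth, r_lambda):
    """Eq. (19): p_c evaluated at E[K] = 1 / r_lambda."""
    return cond_coverage(1.0 / r_lambda, u, bandwidth, r_lambda)


exact = coverage_sum(u=1.0, bandwidth=10.0, r_lambda=0.1)
approx = coverage_approx(u=1.0, bandwidth=10.0, r_lambda=0.1)
print(exact, approx)
```

Sweeping `r_lambda` in such a script is one way to see for which density ratios the single-point approximation stays close to the full series.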

4. Transmission Rate and BS Density Optimization

In the first scenario, the optimization problem (5) is equivalent to maximizing the throughput $u P_{c}$ with respect to the transmission rate u. Accordingly, in the high SNR regime with $α = 4$, based on (19), the optimization problem can be rewritten as

$$
\underset{u}{\text{maximize}} \quad f(u, λ_{BS}), \tag{20}
$$

where

$$
f(u, λ_{BS}) ≜ \frac{u}{1 + \left(1 - \left(\frac{c λ_{BS}}{λ_{UE} + c λ_{BS}}\right)^{c}\right) \sqrt{2^{\frac{u λ_{UE}}{B λ_{BS}}} - 1} \, \arctan\left(\sqrt{2^{\frac{u λ_{UE}}{B λ_{BS}}} - 1}\right)}. \tag{21}
$$

Experimentally, the objective $f(u, λ_{BS}) = Q / λ_{UE}$ is a continuous bell-shaped function with respect to the transmission rate u. Based on this observation, the solution $u^{†}$ of problem (20) satisfies the following stationarity condition:

$$
\begin{aligned}
\frac{\partial}{\partial u} f(u^{†}, λ_{BS}) &= \frac{1}{u^{†}} f(u^{†}, λ_{BS}) \Bigg(1 - f(u^{†}, λ_{BS}) \\
&\quad \times \Bigg(\frac{C D \ln 2 \cdot 2^{D u^{†} - 1} \arctan\left(\sqrt{2^{D u^{†}} - 1}\right)}{\sqrt{2^{D u^{†}} - 1}} + \frac{1}{2} C D \ln 2\Bigg)\Bigg) \\
&= 0,
\end{aligned} \tag{22}
$$

where $C ≜ 1 - \left(\frac{c λ_{BS}}{λ_{UE} + c λ_{BS}}\right)^{c}$ and $D ≜ \frac{λ_{UE}}{B λ_{BS}}$, so that (21) reads $f = u / \left(1 + C \sqrt{2^{D u} - 1} \arctan\left(\sqrt{2^{D u} - 1}\right)\right)$. Then, to find the solution $u^{†}$ that satisfies condition (22), we can apply the bisection method. Based on Algorithm 1, the solution to problem (20) is obtained as

$$
u^{†} = \text{BISECTION}(f_{1}(u), u_{\min}, u_{\max}, 0, ϵ_{1}), \tag{23}
$$

where $f_{1}(u) = \frac{\partial}{\partial u} f(u, λ_{BS})$, $ϵ_{1} > 0$ denotes an arbitrarily small constant, and $u_{\min}$ and $u_{\max}$ denote the minimum and maximum values of the transmission rate, respectively.

In the second scenario, we adopted a two-step iterative method to solve the joint optimization problem (6). According to (21), for a given transmission rate u, the model aggregation rate Q is a monotonically increasing function of the BS density $λ_{BS}$. Hence, the minimum required BS density $λ_{BS}^{†}$ for a given u satisfies the constraint with equality:

$$
λ_{UE} f(u, λ_{BS}^{†}) = Q_{target}. \tag{24}
$$

Based on (24), the density $λ_{BS}^{†}$ can be obtained by

$$
λ_{BS}^{†} = \text{BISECTION}(f_{2}(λ_{BS}), λ_{BS,\min}, λ_{BS,\max}, Q_{target} / λ_{UE}, ϵ_{2}), \tag{25}
$$

where $f_{2}(λ_{BS})$ is equivalent to the function $f(u, λ_{BS})$ with a fixed value of u, $ϵ_{2} > 0$ denotes an arbitrarily small constant, and $λ_{BS,\min}$ and $λ_{BS,\max}$ denote the minimum and maximum values of the BS density, respectively. Eventually, the solution to problem (6) can be obtained by alternately solving Equations (22) and (24) with Algorithm 1. The process of solving problem (6) is summarized by Algorithm 2.

Algorithm 1 Bisection Method.

function Bisection($f, x_{low}, x_{high}, s, ϵ$)
  while $x_{high} - x_{low} > ϵ$ do
    $x_{temp} \leftarrow (x_{low} + x_{high}) / 2$
    if $(f(x_{low}) - s)(f(x_{temp}) - s) < 0$ then
      $x_{high} \leftarrow x_{temp}$
    else
      $x_{low} \leftarrow x_{temp}$
    end if
  end while
  return $x_{temp}$
end function
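A runnable Python counterpart of Algorithm 1 might look as follows. It keeps the same update rule, with two small deviations that are choices of this sketch rather than part of the algorithm as stated: $f(x_{low}) - s$ is cached instead of recomputed, and the function returns early if a midpoint hits the target exactly, which avoids the lower bound drifting past the solution in that corner case.

```python
def bisection(f, x_low, x_high, s, eps):
    """Find x in [x_low, x_high] with f(x) = s.

    Assumes f - s changes sign exactly once strictly inside the interval.
    """
    g_low = f(x_low) - s
    x_mid = 0.5 * (x_low + x_high)
    while x_high - x_low > eps:
        x_mid = 0.5 * (x_low + x_high)
        g_mid = f(x_mid) - s
        if g_mid == 0.0:
            return x_mid                      # exact hit (early exit added in this sketch)
        if g_low * g_mid < 0.0:
            x_high = x_mid                    # crossing lies in the lower half
        else:
            x_low, g_low = x_mid, g_mid       # crossing lies in the upper half
    return x_mid


# Usage: solve x^2 = 2 on [0, 4].
root = bisection(lambda x: x * x, 0.0, 4.0, 2.0, 1e-9)
print(root)
```

The same routine serves both uses in the text: with `s = 0` it finds a zero of the derivative $f_{1}$, and with `s = Q_target / lambda_UE` it solves the equality constraint (24).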

Algorithm 2 Two-Step Iterative Method.

Initialize $u_{\min}, u_{\max}, λ_{BS,\min}, λ_{BS,\max}$
Initialize $ϵ_{1}, ϵ_{2}$
Initialize $u_{new} \leftarrow u_{\max}$
Initialize $λ_{BS,new} \leftarrow λ_{BS,\max}$
repeat
  $u_{temp} \leftarrow u_{new}$
  $λ_{BS,temp} \leftarrow λ_{BS,new}$
  Set $f_{1}(u) = \frac{\partial}{\partial u} f(u, λ_{BS,temp})$
  $u_{new} \leftarrow \text{BISECTION}(f_{1}(u), u_{\min}, u_{\max}, 0, ϵ_{1})$
  Set $f_{2}(λ_{BS}) = f(u_{new}, λ_{BS})$
  $λ_{BS,new} \leftarrow \text{BISECTION}(f_{2}(λ_{BS}), λ_{BS,\min}, λ_{BS,\max}, \frac{Q_{target}}{λ_{UE}}, ϵ_{2})$
until $|u_{new} - u_{temp}| < ϵ_{1}$ and $|λ_{BS,new} - λ_{BS,temp}| < ϵ_{2}$
return $λ_{BS,new}, u_{new}$
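The two-step method can be prototyped in Python as below. This is a sketch under stated assumptions, not the authors' implementation: the slope of f is obtained by differentiating (21) directly, the search intervals are assumed to bracket a sign change, an iteration cap `max_iter` is added as a safeguard, and all numerical values (rates in Mbit/s, bandwidth in MHz, densities per unit area) are illustrative.

```python
import math

C_SHAPE = 3.5  # the constant c of the cell-size PDF


def bisection(f, x_low, x_high, s, eps):
    """Bisection for f(x) = s, assuming one sign change of f - s inside the interval."""
    g_low = f(x_low) - s
    x_mid = 0.5 * (x_low + x_high)
    while x_high - x_low > eps:
        x_mid = 0.5 * (x_low + x_high)
        g_mid = f(x_mid) - s
        if g_mid == 0.0:
            return x_mid
        if g_low * g_mid < 0.0:
            x_high = x_mid
        else:
            x_low, g_low = x_mid, g_mid
    return x_mid


def terms(u, lam_bs, lam_ue, bandwidth, c=C_SHAPE):
    """Return C, D, 2^(D u) and sqrt(2^(D u) - 1) as used in Eqs. (21) and (22)."""
    cap_c = 1.0 - (c * lam_bs / (lam_ue + c * lam_bs)) ** c
    d = lam_ue / (bandwidth * lam_bs)
    pow2 = 2.0 ** (d * u)
    return cap_c, d, pow2, math.sqrt(pow2 - 1.0)


def rate_objective(u, lam_bs, lam_ue, bandwidth):
    """f(u, lambda_BS) of Eq. (21), i.e. Q / lambda_UE for high SNR and alpha = 4."""
    cap_c, _, _, root_t = terms(u, lam_bs, lam_ue, bandwidth)
    return u / (1.0 + cap_c * root_t * math.atan(root_t))


def rate_slope(u, lam_bs, lam_ue, bandwidth):
    """df/du, obtained by differentiating Eq. (21) with respect to u (requires u > 0)."""
    cap_c, d, pow2, root_t = terms(u, lam_bs, lam_ue, bandwidth)
    f = u / (1.0 + cap_c * root_t * math.atan(root_t))
    dg = d * math.log(2.0) * (0.5 * pow2 * math.atan(root_t) / root_t + 0.5)
    return (f / u) * (1.0 - f * cap_c * dg)


def two_step(q_target, lam_ue, bandwidth, u_range, lam_range,
             eps_u=1e-6, eps_lam=1e-6, max_iter=100):
    """Alternate the rate step and the density step until both stop changing."""
    u_new, lam_new = u_range[1], lam_range[1]
    for _ in range(max_iter):                  # cap added as a safeguard in this sketch
        u_old, lam_old = u_new, lam_new
        u_new = bisection(lambda u: rate_slope(u, lam_old, lam_ue, bandwidth),
                          u_range[0], u_range[1], 0.0, eps_u)
        lam_new = bisection(lambda lam: rate_objective(u_new, lam, lam_ue, bandwidth),
                            lam_range[0], lam_range[1], q_target / lam_ue, eps_lam)
        if abs(u_new - u_old) < eps_u and abs(lam_new - lam_old) < eps_lam:
            break
    return lam_new, u_new


lam_opt, u_opt = two_step(q_target=500.0, lam_ue=100.0, bandwidth=10.0,
                          u_range=(0.01, 100.0), lam_range=(5.0, 100.0))
print(lam_opt, u_opt)
```

The ranges must be chosen so that the slope changes sign on `u_range` and the target is attainable on `lam_range`; otherwise the bisection silently returns an interval endpoint.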

5. Simulation Results

In this section, we validate the analysis results for the coverage probability and show the behavior and performance of the proposed algorithm through numerical simulations. All simulation results were obtained by computing the empirical coverage probability using a Monte Carlo method in a network area measuring 20 km × 20 km. In every single deployment of BSs and UEs, we sampled 10,000 independent channel realizations and computed the coverage probability conditioned on that deployment as the number of realizations in which the received SINR exceeded the threshold $T(K)$, divided by 10,000. Eventually, the marginal coverage probability $P_{c}$ was computed by taking the average of the conditional coverage probabilities of 2000 independent deployments. Unless otherwise stated, the simulation environment followed the simulation parameters in Table 1. Those simulation parameters were chosen based on a combination of factors, including the system model, the performance analysis, and relevant work [15,20,22].

Figure 2 shows the coverage probabilities obtained from Monte Carlo simulation and our performance analysis. It is shown that the expressions for the coverage probability ((17) and (19)) characterized the actual coverage probability well. In particular, the approximation on the number of in-cell users (19) did not introduce significant error in the coverage probability, and this analysis error becomes negligibly small when $λ_{BS} ≪ λ_{UE}$. Furthermore, it is confirmed that the coverage probability is a monotonically increasing function of the BS density $λ_{BS}$ and a monotonically decreasing function of the transmission rate u. This is because the SINR threshold for successful signal reception $T(K) = 2^{\frac{u K}{B}} - 1$ increases with the transmission rate u.

Figure 3 shows the optimized transmission rates obtained from Monte Carlo simulation and the proposed method (23). It is shown that the optimized transmission rate was a monotonically increasing function of the BS density. Even though there was some error in the analysis result, it is shown to characterize the effect of BS density on the optimized transmission rate well. Furthermore, similar to Figure 2, the analysis error is shown to be negligible when the BS density was low.

Based on the validations of our analysis, Figure 4 shows the results of Algorithm 2 for the joint optimization problem (6) in the second scenario. It is shown that both transmission rate and BS density increased as the performance requirement $Q_{target}$ grew. In addition, if the inter-cell interference became severe with the growth in user density, the transmission rate and BS density were shown to change so as to increase the coverage probability. It is interesting to note that the optimized BS density was nearly saturated at a high user density, even though the transmission rate monotonically decreased with the user density. From this observation, we can see that the increase in users was mainly handled by the transmission rate control.

6. Conclusions

In this paper, we have investigated the wireless model aggregation for FL in small-cell networks, where BSs cooperatively aggregate the locally trained models of edge users. Based on stochastic geometry, we have analyzed the effects of geographic node deployment on the coverage probability and the model aggregation rate. With the approximation on the number of in-cell users in a typical Voronoi cell, we have derived a tractable closed form of the coverage probability in the interference-limited environment. Based on the derived expression, we have proposed two algorithms for maximizing the model aggregation rate and finding the minimum required BS density for achieving the target aggregation rate. The simulation results have confirmed that our analysis results accurately characterize the actual performance obtained using a Monte Carlo method and that the analysis error becomes negligible when the density ratio $r_{λ}$ is low. Furthermore, our discussions on the minimum required BS density provide insightful information for understanding the model aggregation in small-cell networks, which can be exploited as a guideline for designing networks that facilitate wireless FL.

Author Contributions

Conceptualization, K.A.N. and J.-P.H.; methodology, K.A.N.; software, K.A.N. and Q.A.N.; formal analysis, K.A.N.; data curation, K.A.N. and Q.A.N.; writing—original draft, K.A.N.; writing—review and editing, J.-P.H.; visualization, K.A.N.; supervision, J.-P.H.; project administration, J.-P.H. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by a Research Grant from Pukyong National University (2021).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

All data derived from this study are presented in the article.

Conflicts of Interest

The authors declare no conflict of interest.


Figure 1. Deployments of BSs and users in an area measuring 20 km × 20 km.

Figure 2. Coverage probability with respect to $λ_{BS}$.

Figure 3. Optimized transmission rate in problem (5) for various BS densities.

Figure 4. Solution to the joint optimization problem (6) for various user densities: (a) Optimized transmission rate. (b) Optimized BS density.

Table 1. Simulation parameters.

Symbol	Description	Value [Unit]
$α$	Path-loss exponent	4
$B$	Bandwidth	20 [MHz]
$λ_{UE}$	User density	50 [users/km$^{2}$]
$u$	Transmission rate	10 [Mbps]
$P$	Transmit power	20 [dBm]
$σ^{2}$	Additive noise power	$-104$ [dBm]
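As a worked example of how the values in Table 1 translate into linear quantities, the sketch below converts the dBm figures to watts and computes the SINR needed to support the transmission rate $u$ over bandwidth $B$. The helper names are hypothetical, and the threshold uses the Shannon capacity formula with the bandwidth shared equally among `n_users` in-cell users; both are assumptions made for illustration and may differ from the exact system model of the paper.

```python
import math


def dbm_to_watt(p_dbm):
    """Convert a power level from dBm to watts."""
    return 10 ** ((p_dbm - 30) / 10)


def sinr_threshold(rate_bps, bandwidth_hz, n_users=1):
    """Smallest SINR at which a Shannon-rate link reaches rate_bps.

    Assumption: each of the n_users in a cell gets an equal share of the band.
    """
    share_hz = bandwidth_hz / n_users
    return 2 ** (rate_bps / share_hz) - 1


# Values taken from Table 1.
tx_power_w = dbm_to_watt(20)      # transmit power P = 20 dBm
noise_power_w = dbm_to_watt(-104)  # noise power sigma^2 = -104 dBm
bandwidth_hz = 20e6                # B = 20 MHz
rate_bps = 10e6                    # u = 10 Mbps

for n in (1, 2, 4):
    theta = sinr_threshold(rate_bps, bandwidth_hz, n_users=n)
    print(f"n_users={n}: SINR threshold = {theta:.3f} ({10 * math.log10(theta):.2f} dB)")
print(f"P = {tx_power_w:.3e} W, sigma^2 = {noise_power_w:.3e} W")
```

Under these assumptions the required SINR grows exponentially with the number of users sharing a cell, which gives an intuition for why a lower transmission rate or a higher BS density helps when the user density increases.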

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
