On the Downlink Capacity of Cell-Free Massive MIMO with Constrained Fronthaul Capacity

Zhang, Peng; Willems, Frans M. J.

doi:10.3390/e22040418

by Peng Zhang ¹,* and Frans M. J. Willems ²

¹ IMEC the Netherlands, 5656AE Eindhoven, The Netherlands

² Department of Electrical Engineering, Technical University of Eindhoven, 5612AZ Eindhoven, The Netherlands

* Author to whom correspondence should be addressed.

Entropy 2020, 22(4), 418; https://doi.org/10.3390/e22040418

Submission received: 14 February 2020 / Revised: 3 April 2020 / Accepted: 4 April 2020 / Published: 7 April 2020

(This article belongs to the Special Issue Wireless Networks: Information Theoretic Perspectives)

Abstract

We investigate the downlink of a cell-free massive multiple-input multiple-output system in which all access points (APs) are connected in a linear-topology fronthaul with constrained capacity and send a common message to a single receiver. By modeling the system as an extension of the multiple-access channel with partially cooperating encoders, we derive the channel capacity of the two-AP setting and then extend the results to arbitrary N-AP scenarios. By developing a cooperating mode concept, we investigate the optimal cooperation among the encoders (APs) when both the total fronthaul capacity and the total transmit power are constrained. It is demonstrated that achieving capacity requires a water-pouring distribution of the total available fronthaul capacity over the fronthaul links. Our study reveals that a linear growth of total fronthaul capacity results in a logarithmic growth of the beamforming capacity. Moreover, even if the number of APs were unlimited, only a finite number of them would need to be activated. We find an expression for this number.

Keywords:

channel capacity; distributed beamforming; cell-free MIMO; constrained fronthaul

1. Introduction

Recently, cell-free massive multiple-input multiple-output (mMIMO) has been considered as a key technology for beyond-5G networks. In such user-centric transmission systems, a large number of distributed access points (APs) are connected to one central processing unit (CPU) via fronthaul links and phase coherently cooperate to cover a wide area for a small number of users in the same time-frequency resource using time-division operation. Compared to cell-based collocated mMIMO solutions, such technology improves energy-spectral efficiency and enhances immunity to shadow fading without extra signal processing burdens. We refer to [1,2,3] and the references therein for a general overview of current developments of cell-free mMIMO.

Effectively utilizing fronthaul resources is of critical importance for deploying a scalable cell-free mMIMO system. Considering the downlink for instance, simple distributed conjugate beamforming is optimal, as shown in [1]. However, it can already be seen that a large amount of information exchange over fronthaul links is required, since all the APs need to know the message that is to be transmitted. A star-topology fronthaul, where each AP is individually connected to a CPU, was originally modeled and has been widely studied, see, e.g., [4,5,6] and the references therein. Currently, a serial fronthaul connecting APs in a linear topology is considered for achieving a cost-efficient architecture, both in deployment and maintenance [3]. A novel and promising technique relying on a linear topology is the radio stripe system, where multiple APs are embedded in a cable/stripe, see [3,7] for details. Such radio stripes can be easily and invisibly deployed indoors or outdoors in existing constructions to enable numerous new applications [8].

Prior work on cell-free mMIMO has focused on developing wireless signaling techniques. In this paper, we study from an information-theoretic perspective the downlink of the cell-free mMIMO system shown in Figure 1, where single-antenna APs are connected in a linear topology with constrained fronthaul capacities to communicate with one single-antenna terminal receiver (Rx).

The considered multiple-input single-output (MISO) setup forms a distributed massive beamforming system and can be formulated as a multiple-access channel (MAC) with limited fronthaul capacity, where the fronthaul capacity is defined as the maximal amount of information that can be reliably sent per MAC channel use [9]. By investigating the channel capacity of such a MAC, we reveal essential relations between the three most fundamental resources of the system, i.e., the total available number of APs (N), the total transmit power (P), and the total available fronthaul capacity ($C_{B}$). Specifically, in the current cell-free mMIMO literature, the only configuration of APs that is considered is the one where full cooperation (full beamforming) is realized and the same information is shared by all involved APs. Therefore, for a real-valued Gaussian MISO channel with N APs and unity channel gains, the maximum downlink rate is given by the channel capacity

C_{full} := \frac{1}{2} \log_{2} (1 + N \cdot \mathrm{SNR}) \text{ bits / channel use}

(1)

where SNR is the received signal-to-noise ratio if only one AP is active with all available transmit power assigned to it. It requires $C_{B}^{full} := (N - 1) C_{full}$ fronthaul capacity among the N APs.
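As a quick numerical illustration of (1) and of the fronthaul requirement $(N - 1) C_{full}$, the following minimal sketch evaluates both quantities. The function names are illustrative and not taken from the paper; unit channel gains and a real-valued Gaussian channel are assumed, as in the text.

```python
import math


def full_cooperation_capacity(num_aps, snr):
    """Equation (1): C_full = 0.5 * log2(1 + N * SNR), in bits per channel use."""
    return 0.5 * math.log2(1.0 + num_aps * snr)


def full_cooperation_fronthaul(num_aps, snr):
    """Total fronthaul needed for full beamforming: (N - 1) * C_full."""
    return (num_aps - 1) * full_cooperation_capacity(num_aps, snr)


if __name__ == "__main__":
    # Hypothetical operating points, chosen only for illustration.
    for n in (2, 4, 8, 16):
        c_full = full_cooperation_capacity(n, snr=1.0)
        c_b_full = full_cooperation_fronthaul(n, snr=1.0)
        print(f"N={n:2d}  C_full={c_full:.3f}  C_B_full={c_b_full:.3f}")
```

The rate grows only logarithmically in N while the fronthaul requirement grows faster than linearly, which is the tension that motivates the partial beamforming setting.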

In this work, we focus on the case where the available fronthaul capacity is not large enough to support full cooperation of the APs. We investigate the achievable downlink rates when the fronthaul resources for communication between the APs are constrained. We call this setting partial beamforming, since $C_{B} < C_{B}^{full}$. We derive the channel capacity and the optimal cooperation strategies among the APs for given total available P and N.

1.1. Related Work

We can model the studied system as a special extension of the multiple-access channel (MAC) with partially cooperating encoders studied by Willems [9]. In particular, we can generalize the system setup in [9] to a network of encoders by considering only one source but employing an arbitrary number of encoders, namely APs, via unidirectional conferences. Since fronthaul links can be treated as separate channels that are orthogonal to the beamforming MAC, our setup might also be viewed as an extension of a special case of the orthogonal-component relay channel due to El Gamal and Zahedi [10], which is generalized to relay networks by Ghabeli and Aref in [11]. In addition, if only two APs are considered, our study is also strongly related to the multiple access diamond channel as studied in [12,13]. Moreover, the two-AP setup closely resembles the semi-deterministic relay channel [14]. Furthermore, it is worth noting that in our system all APs cooperatively send one message to a receiver at the same time. In this sense, our channel setting is “noncausal”, which relates it to the relay-with-delay channel studied in [15].

1.2. Contributions and Organization

By investigating the MAC with limited fronthaul capacity in the discrete channel case and in the Gaussian channel case, we obtain the following main findings:

The channel capacity is found for an arbitrary number of APs for both the discrete channel and the Gaussian channel with constrained transmit power, where the total fronthaul capacity and the total number of APs are limited.
When numerous APs are engaged, a linear growth of total fronthaul capacity results in a logarithmic growth of the channel (beamforming) capacity.
A concept of cooperating modes is developed to demonstrate the optimal cooperation among APs to achieve capacity based on superposition coding.
When the channel capacity is only limited by the fronthaul capacity, the number of required APs is quasi-linear in the available fronthaul capacity even if the number of APs were unlimited.
A new and sharp lower bound of the Lambert-W function is derived for computing the number of required APs given the total fronthaul constraint.

In the rest of this paper, the system model is first presented in Section 2. In Section 3, we start by investigating a two-AP setting consisting of one fronthaul link. This setting serves as a baseline system where the cooperating mode concept is developed. In Section 4, the study is extended to the case where an arbitrary number of APs is engaged, and the behavior and exact solution of the channel capacity are derived. In Section 5, the number of APs required to leverage limited fronthaul resources is derived for the case where the number of available APs is unlimited. Finally, the conclusion and final remarks can be found in Section 6. Detailed proofs and derivations of the presented results are collected in Appendix A. Part of the material in this paper was presented in [16].

2. Problem Setup

2.1. Notation

Throughout the paper, capital letters, e.g., X, denote random variables, and their realizations are denoted by small letters, e.g., x. The probability mass or density function of X is denoted by $p_{X} (x)$ or simply $p (x)$. The expectation of X is denoted by $E [X]$. The entropy of X is denoted by $H (X)$ and the differential entropy is denoted by $h (X)$. The mutual information between X and Y is denoted by $I (X; Y)$. The consecutive integer range from i to j with $i \leq j$ is denoted by $[i : j]$. In addition, a set of elements $x_{m}$ with index m ranging from i to j is denoted by $\{x_{m}\}_{m = i}^{j}$.

2.2. System Model

The investigated system is modeled as in Figure 2, where the CPU is denoted as the source, the APs as encoders, and the receiver at the destination as the decoder. As plotted, one-directional fronthaul links connect N adjacent encoders that simultaneously send a uniformly distributed message $W \in [1 : M]$ to a decoder (receiver). We focus on the study of the fronthaul resource usage among all encoders. The discrete memoryless MAC denoted by $(X_{1} \times X_{2} \times \dots \times X_{N}, p (y | x_{1}, x_{2}, \dots, x_{N}), Y, \{C_{m, m + 1}\}_{m = 1}^{N - 1})$ consists of input alphabets $\{X_{m}\}_{m = 1}^{N}$, output alphabet $Y$, a transition probability distribution $p (y | x_{1}, x_{2}, \dots, x_{N})$, and a set of fronthaul capacity constraints $\{C_{m, m + 1}\}_{m = 1}^{N - 1}$ between the N encoders.

Before each block of n channel uses, (partial) information about the generated message W is first shared among the N encoders. Let $W_{m, m + 1} \in [1 : M_{m, m + 1}]$ for $m \in [1 : N - 1]$ be the message sent over the fronthaul link from encoder m to encoder $m + 1$. Then, the encoders map the messages W and $\{W_{m, m + 1}\}_{m = 1}^{N - 1}$ into codewords $\{x_{m}^{n}\}_{m = 1}^{N}$ as follows

\begin{matrix} e_{1} (W) & \to & (X_{1}^{n}, W_{12}), \\ e_{m} (W_{m - 1, m}) & \to & (X_{m}^{n}, W_{m, m + 1}), \\ e_{N} (W_{N - 1, N}) & \to & X_{N}^{n}, \end{matrix}

where $\{e_{m} (\cdot)\}_{m = 1}^{N}$ are the corresponding encoding functions. Meanwhile, the generated fronthaul messages should satisfy

\frac{1}{n} \log_{2} M_{m, m + 1} \leq C_{m, m + 1} .

(2)

As presented, the corresponding fronthaul link capacity $C_{m, m + 1} \geq 0$ is defined as the maximal amount of information that can be reliably sent, per channel use of the MAC, over the link from encoder m to encoder $m + 1$.

At the decoder, a deterministic decoding function $d : Y^{n} \to [1 : M]$ is applied to obtain the message estimate $\hat{W}$ based on the channel output $y^{n}$. We define the average probability of error at the decoder as

P_{e}^{(n)} \overset{Δ}{=} \Pr (\hat{W} \neq W) .

(3)

Now we say that a rate R is achievable with given fronthaul capacities $\{C_{m, m + 1}\}_{m = 1}^{N - 1}$ if there exist N encoders and a corresponding decoder such that

\begin{matrix} \log_{2} M & \geq & n (R - δ), \\ \log_{2} M_{m, m + 1} & \leq & n C_{m, m + 1}, \\ P_{e}^{(n)} & \leq & δ, \end{matrix}

(4)

for all $δ > 0$ and large enough n. The channel capacity C (of the MAC) as a function of the fronthaul capacities is defined as the supremum of all rates that are achievable under the given fronthaul constraints. Eventually, we will be interested only in a constraint on the sum of the fronthaul capacities $C_{B}$, which is defined as

C_{B} \overset{Δ}{=} \sum_{m = 1}^{N - 1} C_{m, m + 1} .

(5)

To interpret the capacity results for partial beamforming, we focus on MACs with additive white Gaussian noise. At the output of the Gaussian MAC, the decoder receives

Y_{i} = \sum_{m = 1}^{N} X_{m i} + Z_{i},

(6)

at time i, where $X_{m i}$ is the symbol transmitted by encoder m and $Z_{i}$ is modeled as independent and identically distributed (i.i.d.) Gaussian noise at the decoder for all $i \in [1 : n]$. For an individual encoder m, $m \in [1 : N]$, the transmit power constraint is

\frac{1}{n} \sum_{i = 1}^{n} E [X_{m i}^{2}] \leq P_{m}

(7)

for $P_{m} \geq 0$. Then, the total transmit power is limited as

\sum_{m = 1}^{N} P_{m} \leq P .

(8)

Without loss of generality, we assume that $Z_{i} \sim \mathcal{N} (0, 1)$. Therefore, the transmit SNR can be directly represented by the total constrained transmit power P.

3. Two-Encoder Result

We first investigate the simplest system setting where only two encoders are involved. The MAC is now denoted by $(X_{1} \times X_{2}, p (y | x_{1}, x_{2}), Y, C_{12})$. The fronthaul message $W_{12} \in [1 : M_{12}]$ must satisfy the constraint

\frac{1}{n} \log_{2} M_{12} \leq C_{12},

(9)

where $C_{12}$ equals the total fronthaul capacity $C_{B}$ in this case. The underlying Gaussian MAC is given by

Y_{i} = X_{1 i} + X_{2 i} + Z_{i} .

(10)

Although this two-encoder setting can be considered as a special case of related work, see the discussion later, we provide here the capacity proofs for both discrete and Gaussian MACs. The applied approach carries over to the N-encoder setting that is investigated in Section 4.

In the following, the channel capacity as a function of the fronthaul capacity is first obtained for the discrete memoryless MAC. Then, we derive capacity results for the Gaussian case with total transmit power constraint. Within this study, a so-called cooperating mode concept is developed that will be very useful to provide cooperation insights among encoders when more of them are engaged.

3.1. Discrete Channel

First, consider the discrete channel setup.

Theorem 1.

For the discrete memoryless channel $p (y | x_{1}, x_{2})$, the channel capacity C as a function of the fronthaul capacity $C_{12}$ is given by

C (C_{12}) = \max_{p (x_{1}, x_{2})} \min \{I (X_{1}, X_{2}; Y), I (X_{1}; Y | X_{2}) + C_{12}\},

(11)

where the distribution $p (x_{1}, x_{2}, y) = p (x_{1}, x_{2}) p (y | x_{1}, x_{2})$ is determined by the input distribution $p (x_{1}, x_{2})$.

The detailed proof is provided in Appendix A.1, where the converse is based on the Markov chains $W \to (X_{1}^{n}, X_{2}^{n}) \to Y^{n}$ and $(W, W_{12}) \to (X_{1}^{n}, X_{2}^{n}) \to Y^{n}$, and the achievability is based on applying superposition coding. For the achievability, the source splits the message W into two parts $(W_{1}, W_{12})$ and delivers the index of $W_{12}$ over the fronthaul link to encoder 2, which maps $W_{12}$ into the inner code, while encoder 1, at the source, encodes $W_{1}$ into an outer codeword that is superimposed on the inner codeword. Although this coding scheme is simple, the cooperating mode concept that is important for studying the multi-encoder setup will be developed based on the superposition scheme, as discussed later.

Remark 1.

By viewing the two-encoder setting as a special setup of ([9], Figure 1), where only one source and one conference link are deployed, we obtain Theorem 1 by letting the common message $U = X_{2}$ and the conference capacity $C_{21} = 0$ in ([9], Thm.). Note that the achievability in [9], which is based on binning, becomes superposition coding.

Remark 2.

By viewing the two-encoder setting as a special setup of the multiple access diamond channel, where one source connects to two encoders (relays) by using two separate noiseless links, see [12,13], Theorem 1 can also be obtained by letting $C_{1} = \infty$, the common message $V = X_{2}$, and the common message rate $R_{0} = C_{2}$ (or $C_{2} = \infty$, $V = X_{1}$, $R_{0} = C_{1}$) in ([12], Thm. 2). Note that the achievability based on superposition and Marton coding in [12,13] becomes superposition coding only.

3.2. Gaussian Channel

Now we consider the Gaussian MAC of the two-encoder channel setting given by (10) with total power constraint P, i.e., $P_{1} + P_{2} \leq P$, where

\frac{1}{n} \sum_{i = 1}^{n} E [X_{1 i}^{2}] \leq P_{1} \text{ and } \frac{1}{n} \sum_{i = 1}^{n} E [X_{2 i}^{2}] \leq P_{2} .

(12)

This first leads to the following result.

Theorem 2.

The channel capacity $C (C_{12}, P)$ of the two-encoder Gaussian MAC is

C (C_{12}, P) = \max_{0 \leq β \leq 1} \min \{\frac{1}{2} \log_{2} (1 + (1 + β) P), \frac{1}{2} \log_{2} (1 + (1 - β) P) + C_{12}\} .

(13)

The proof is an adaptation of the discrete channel version given in Appendix A.1 that takes the transmit power constraints and the Gaussian channel noise into account.

Proof.

(i) Converse. First note that without loss of generality (and without violating the power constraints) we may assume that $E [X_{1 i}] = E [X_{2 i}] = 0$ for all $i \in [1 : n]$. If we define $(X_{1}, X_{2}, Y)$ to be the random triple with density

p_{X_{1}, X_{2}, Y} (x_{1}, x_{2}, y) = \frac{1}{n} \sum_{i = 1}^{n} p_{X_{1 i}, X_{2 i}, Y_{i}} (x_{1}, x_{2}, y)

then the converse in Appendix A.1 shows that

\begin{matrix} I (X_{1}^{n}, X_{2}^{n}; Y^{n}) & \leq & n I (X_{1}, X_{2}; Y), \\ I (X_{1}^{n}; Y^{n} | X_{2}^{n}) & \leq & n I (X_{1}; Y | X_{2}), \end{matrix}

(14)

where the random variables $X_{1}$, $X_{2}$ satisfy $E [X_{1}] = E [X_{2}] = 0$ and

\begin{matrix} E [X_{1}^{2}] & \leq & \frac{1}{n} \sum_{i = 1}^{n} E [X_{1 i}^{2}] \leq P_{1}, \end{matrix}

(15)

\begin{matrix} E [X_{2}^{2}] & \leq & \frac{1}{n} \sum_{i = 1}^{n} E [X_{2 i}^{2}] \leq P_{2} . \end{matrix}

(16)

First consider the random pair $(X_{1}, X_{2})$. By applying the Cholesky factorization ([17], Thm. 4.2.7) to the covariance matrix of $[X_{2}, X_{1}]^{T}$, the assignment

\begin{cases} X_{1} = α_{21} S_{2} + α_{1} S_{1}, \\ X_{2} = α_{22} S_{2}, \end{cases}

(17)

can be obtained, where $S_{1}$ and $S_{2}$ are uncorrelated with zero means and unit variances.

Next, observe that if we take $α_{21}^{'} = α_{22}^{'} = (α_{21} + α_{22}) / 2 = α_{2}$, this choice does not affect $I (X_{2}; Y) = I (S_{2}; Y)$ and $I (X_{1}; Y | X_{2}) = I (S_{1}; Y | S_{2})$, but minimizes the total transmit power for fixed $α_{21} + α_{22}$, since

2 α_{2}^{2} = 2 {(\frac{α_{21} + α_{22}}{2})}^{2} \leq α_{21}^{2} + α_{22}^{2} .

(18)
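The inequality in (18) follows from completing the square. The short derivation below is added only as a worked check and uses no symbols beyond those of (17) and (18).

```latex
\begin{aligned}
\alpha_{21}^{2} + \alpha_{22}^{2} - 2 \left( \frac{\alpha_{21} + \alpha_{22}}{2} \right)^{2}
  &= \alpha_{21}^{2} + \alpha_{22}^{2} - \frac{(\alpha_{21} + \alpha_{22})^{2}}{2} \\
  &= \frac{(\alpha_{21} - \alpha_{22})^{2}}{2} \;\geq\; 0 .
\end{aligned}
```

Equality holds if and only if $α_{21} = α_{22}$, so splitting the common signal $S_{2}$ evenly over the two encoders is the power-minimizing choice for a fixed sum $α_{21} + α_{22}$.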

Therefore, we only need to consider the assignment

\begin{cases} X_{1} = α_{2} S_{2} + α_{1} S_{1}, \\ X_{2} = α_{2} S_{2} . \end{cases}

(19)

Now we take $α_{2}^{2} = P_{2}$, $α_{2}^{2} + α_{1}^{2} = P_{1}$, and $2 α_{2}^{2} + α_{1}^{2} = P$. By denoting

β \overset{Δ}{=} \frac{2 α_{2}^{2}}{P}

(20)

which lies in $[0, 1]$, we further have

α_{1}^{2} = (1 - β) P \text{ and } α_{2}^{2} = β P / 2 .

(21)

Taking the signal assignment (19) and the power assignment (21) gives

\begin{matrix} I (X_{1}, X_{2}; Y) & = & I (S_{1}, S_{2}; Y) = h (α_{1} S_{1} + 2 α_{2} S_{2} + Z) - h (Z) \\ & \overset{(a)}{\leq} & \frac{1}{2} \log_{2} (1 + (1 + β) P), \end{matrix}

(22)

\begin{matrix} I (X_{1}; Y | X_{2}) & = & I (S_{1}; Y | S_{2}) = h (α_{1} S_{1} + Z) - h (Z) \\ & \overset{(b)}{\leq} & \frac{1}{2} \log_{2} (1 + (1 - β) P), \end{matrix}

(23)

where (a) and (b) follow from the maximum differential entropy theorem, see ([18], Thm. 8.6.5).

(ii) Achievability. Take the assignment (19) with $S_{1} \sim \mathcal{N} (0, 1)$ and $S_{2} \sim \mathcal{N} (0, 1)$. Using the power assignment (21) directly gives

\begin{matrix} I (X_{1}, X_{2}; Y) & = & I (S_{1}, S_{2}; Y) = \frac{1}{2} \log_{2} (1 + (1 + β) P), \end{matrix}

(24)

\begin{matrix} I (X_{1}; Y | X_{2}) & = & I (S_{1}; Y | S_{2}) = \frac{1}{2} \log_{2} (1 + (1 - β) P) . \end{matrix}

(25)

The rest of the proof follows by first establishing a coding theorem for the discrete memoryless channel with input cost (power constraint). The step from discrete to Gaussian channels is justified by the relation between differential entropy and discrete entropy, see, e.g., ([18], Thm. 9.3.1). □
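The max-min in (13) can also be evaluated numerically. The following is a minimal sketch that scans β over a uniform grid, assuming unit noise variance as in the text. The function name and the grid resolution `steps` are illustrative choices and not part of the paper.

```python
import math


def two_encoder_capacity_grid(c12, power, steps=10001):
    """Approximate (13) by scanning beta over a uniform grid on [0, 1].

    Returns (best_rate, best_beta). The grid only approximates the
    maximizer; its resolution is 1 / (steps - 1).
    """
    best_rate, best_beta = -1.0, 0.0
    for k in range(steps):
        beta = k / (steps - 1)
        # First term of (13): both encoders decoded jointly.
        joint = 0.5 * math.log2(1.0 + (1.0 + beta) * power)
        # Second term of (13): private part plus what the fronthaul carries.
        private = 0.5 * math.log2(1.0 + (1.0 - beta) * power) + c12
        rate = min(joint, private)
        if rate > best_rate:
            best_rate, best_beta = rate, beta
    return best_rate, best_beta


if __name__ == "__main__":
    for c12 in (0.0, 0.25, 0.5, 1.0, 2.0):
        rate, beta = two_encoder_capacity_grid(c12, power=2.0)
        print(f"C12={c12:.2f}  C~{rate:.4f}  beta~{beta:.4f}")
```

With no fronthaul the maximizer is β = 0 (no common signal), and with ample fronthaul it is β = 1 (full beamforming).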

Now, by optimizing over β in (13), we can further express C as a function of only the total transmit power P and the total fronthaul capacity $C_{B}$, which is $C_{12}$ for this two-encoder setup.

Corollary 1.

The channel capacity $C (C_{B}, P)$ of the two-encoder Gaussian MAC with total transmit power constraint can be expressed as

C (C_{B}, P) = \begin{cases} \frac{1}{2} \log_{2} (1 + 2 P), & \text{if } C_{B} \geq \frac{1}{2} \log_{2} (1 + 2 P), \\ \frac{1}{2} \log_{2} (1 + P) + \frac{1}{2} \log_{2} (2 - \frac{2}{2^{2 C_{B}} + 1}), & \text{otherwise} . \end{cases}

(26)

Proof.

The two logarithms on the RHS of (13) are monotonically increasing and decreasing in β, respectively, and are equal to each other at $β = 0$. Hence, we can set

\frac{1}{2} \log_{2} (1 + (1 + β) P) = \frac{1}{2} \log_{2} (1 + (1 - β) P) + C_{B}

(27)

to obtain the β that maximizes C for all $C_{B} \in [0, \frac{1}{2} \log_{2} (1 + 2 P)]$ as

β^{⋆} = \frac{(1 + P) (2^{2 C_{B}} - 1)}{P (2^{2 C_{B}} + 1)} .

(28)
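For completeness, the algebra leading from (27) to (28) can be spelled out as follows. The shorthand $a = 2^{2 C_{B}}$ is introduced here only for readability and does not appear in the paper.

```latex
\begin{aligned}
1 + (1 + \beta) P &= a \left( 1 + (1 - \beta) P \right), \qquad a = 2^{2 C_{B}}, \\
(1 + P) + \beta P &= a (1 + P) - a \beta P, \\
\beta P (1 + a) &= (a - 1)(1 + P), \\
\beta^{\star} &= \frac{(1 + P)(a - 1)}{P (a + 1)} .
\end{aligned}
```

The condition $β^{⋆} \leq 1$ is equivalent to $a \leq 1 + 2 P$, i.e., to $C_{B} \leq \frac{1}{2} \log_{2} (1 + 2 P)$, which matches the range stated for (28).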

This results in the second capacity expression in (26). Then, if $C_{B} > \frac{1}{2} \log_{2} (1 + 2 P)$, the second term is always larger than the first term for any β in (13). This corresponds to the situation where $C_{B}$ is large enough and the transmission over the MAC is the bottleneck of the network. In this case, C remains at its global maximum. □

Note that, for $C_{B} < \frac{1}{2} \log_{2} (1 + 2 P)$, the first term of the capacity result (26) is the channel capacity with no beamforming, and the second term directly represents the partial beamforming gain, which is independent of the transmit power P and grows only with the fronthaul capacity. In other words, the partial beamforming gain increases at the same rate regardless of the transmit power P.
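The closed-form expressions (26) and (28) are straightforward to evaluate. The sketch below is a minimal illustration under the same unit-noise assumption; the function names are hypothetical, and clipping β at 1 above the threshold is an illustrative convention for the regime where (28) no longer applies.

```python
import math


def optimal_beta(c_b, power):
    """Equation (28), clipped to 1 once the fronthaul is no longer limiting."""
    a = 2.0 ** (2.0 * c_b)
    return min(1.0, (1.0 + power) * (a - 1.0) / (power * (a + 1.0)))


def two_encoder_capacity(c_b, power):
    """Equation (26): capacity of the two-encoder Gaussian MAC."""
    c_full = 0.5 * math.log2(1.0 + 2.0 * power)
    if c_b >= c_full:
        return c_full
    a = 2.0 ** (2.0 * c_b)
    no_beamforming = 0.5 * math.log2(1.0 + power)
    partial_gain = 0.5 * math.log2(2.0 - 2.0 / (a + 1.0))
    return no_beamforming + partial_gain


if __name__ == "__main__":
    for c_b in (0.0, 0.25, 0.5, 1.0, 2.0):
        print(c_b, optimal_beta(c_b, 2.0), two_encoder_capacity(c_b, 2.0))
```

Note that `partial_gain` depends only on `c_b`, which makes the power-independence of the partial beamforming gain explicit.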

3.3. Cooperating Modes

Based on assignment (19), which possesses a superposition structure, we can naturally define two cooperating modes as follows to describe the optimal cooperation between the encoders for achieving capacity.

mode 1: Sending a private message given by $α_{1} S_{1}$ from encoder 1;
mode 2: Coherently sending a common message given by $α_{2} S_{2}$ from encoder 2 and encoder 1.

According to (20), the parameter β represents the fraction of the total transmit power assigned to mode 2, while $1 - β$ represents the remaining fraction assigned to mode 1. Note that $β^{⋆}$ given by (28) should be taken for achieving the capacity.
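As a worked example of this power split, consider the illustrative operating point $P = 2$ and $C_{B} = 1/2$ (values chosen here only for convenience, not taken from the paper). Substituting into (28), (20), and (13) gives:

```latex
\begin{aligned}
2^{2 C_{B}} &= 2, \qquad
\beta^{\star} = \frac{(1 + 2)(2 - 1)}{2\,(2 + 1)} = \frac{1}{2}, \\
\text{mode 2 power: } \beta^{\star} P &= 1, \qquad
\text{mode 1 power: } (1 - \beta^{\star}) P = 1, \\
\alpha_{2}^{2} &= \frac{\beta^{\star} P}{2} = \frac{1}{2}, \qquad
\alpha_{1}^{2} = (1 - \beta^{\star}) P = 1, \\
P_{2} &= \alpha_{2}^{2} = \frac{1}{2}, \qquad
P_{1} = \alpha_{2}^{2} + \alpha_{1}^{2} = \frac{3}{2}, \\
C &= \tfrac{1}{2} \log_{2} \left( 1 + (1 + \beta^{\star}) P \right)
   = \tfrac{1}{2} \log_{2} 4 = 1 \text{ bit / channel use} .
\end{aligned}
```

For comparison, the same power gives $\frac{1}{2} \log_{2} 3 \approx 0.792$ bits without any fronthaul and $\frac{1}{2} \log_{2} 5 \approx 1.161$ bits with full cooperation.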

Now consider the cooperation scenarios of the two encoders based on the availability of $C_{B}$. If $C_{B} = 0$, the transmission reduces to the point-to-point communication case. This is represented by having only mode 1 active while encoder 2 is inactive. If $C_{B} \geq C_{full}$, full cooperation can be achieved by activating mode 2 only. For $C_{B} \in (0, C_{full})$, the two encoders cooperate to achieve the partial beamforming capacity by activating both cooperating modes. Figure 3 illustrates how the cooperating modes are activated and deactivated at the encoders as $C_{B}$ increases from 0 to $C_{full}$.

For the two-encoder setting, the evolution of the modes with the available amount of $C_{B}$ looks straightforward. Nevertheless, it will be shown that this cooperating modes interpretation provides clear insight into how to leverage the available encoders under given total fronthaul and transmit power constraints, where the optimal cooperation is not trivial as the number of encoders grows large.

4. N-Encoder Result

Based on the investigations of the two-encoder setting, we extend the study to the system model with an arbitrary number N of encoders, where $N \geq 2$. The parameter N can in principle be any large integer, so that distributed massive beamforming is obtained. The investigation is focused on the Gaussian MAC under the constraints of the total fronthaul capacity $C_{B}$ and the total transmit power P, which are defined by (5) and (8), respectively. Before addressing the exact capacity solution for arbitrary N encoders, we first derive capacity bounds on $C (C_{B}, P)$ to describe the general behavior of the channel capacity C as a function of the total fronthaul capacity $C_{B}$. The obtained result indicates that a linear growth of C requires an exponential growth of $C_{B}$. By using the compound mode, the exact capacity solution with the optimal cooperation among encoders is derived. The results show that the distributed beamforming system works most efficiently when it operates in its fronthaul-capacity-limited regime. As a result, we consider the case where encoders are always available to be activated as needed to leverage the entire fronthaul resource.

4.1. Discrete Channel

For simplicity, let the tuple $\underline{X}_{l}^{m} ≜ (X_{l}, X_{l + 1}, \dots, X_{m})$ be the collection of ordered transmitted random variables that are generated at encoder l to encoder m with $l \leq m$ for one channel use. In addition, let $\underline{C}_{b} ≜ \{C_{j, j + 1}\}_{j = 1}^{N - 1}$ be the collection of the corresponding fronthaul capacities.

Theorem 3.

For the discrete memoryless N-encoder setting, the channel capacity C of the channel $p (y | x_{1}, x_{2}, \dots, x_{N})$ as a function of the fronthaul capacities $\underline{C}_{b}$ is

C (\underline{C}_{b}) = \max_{p (x_{1}, x_{2}, \dots, x_{N})} \min \{I (\underline{X}_{1}^{N}; Y), \{I (\underline{X}_{1}^{m}; Y | \underline{X}_{m + 1}^{N}) + C_{m, m + 1}\}_{m = 1}^{N - 1}\},

(29)

with $N \geq 2$.

A sketch of the proof is given in Appendix A.2. As shown in the achievability, the capacity is achieved by applying an N-layer superposition coding among the encoders, which naturally agrees with the studied linear topology.

4.2. Gaussian Channel under Total Fronthaul Constraint

By considering the total power constraint and separate fronthaul constraints, we first have the following result.

Theorem 4.

For the N-encoder Gaussian setting with the total transmit power constraint P, the channel capacity C as a function of the fronthaul capacities $\underline{C}_{b}$ is

C (\underline{C}_{b}, P) = \max_{\underline{β}} \min \{\frac{1}{2} \log_{2} (1 + \sum_{l = 1}^{N} l β_{l} P), \{\frac{1}{2} \log_{2} (1 + \sum_{l = 1}^{m} l β_{l} P) + C_{m, m + 1}\}_{m = 1}^{N - 1}\},

(30)

where $\underline{β} = {(β_{1}, β_{2}, \dots, β_{N})}^{T}$ is a probability vector.

Proof.

Similar to the proof for the two-encoder setting, the generic signal assignment

X_{m} = \sum_{l = m}^{N} α_{l m} S_{l}

(31)

can be used at each encoder for $m \in [1 : N]$, where $\{S_{l}\}_{l = 1}^{N}$ are uncorrelated and have zero mean and unit variance. Again, we can further apply the special signal assignment

X_{m} = \sum_{l = m}^{N} α_{l} S_{l}

(32)

to minimize the total transmit power without affecting the dependency among the different signals $\{S_{l}\}_{l = 1}^{N}$ at the decoder, which determines the beamforming capacity. In this way, the fraction of the transmit power allocated to signal $S_{l}$ can be expressed by

β_{l} = \frac{l α_{l}^{2}}{P}

(33)

such that $\sum_{l = 1}^{N} β_{l} = 1$.

Thus, for the converse, we can use the assignment (32) to evaluate (29), and the mutual informations on the RHS are bounded as given by (30). Then, for the achievability, by letting $S_{l} \sim \mathcal{N} (0, 1)$, the result follows. □

The proof shows that all the transmitted signals at the encoders should form a Markov chain $X_{N} \to X_{N - 1} \to \dots \to X_{1}$. Again, since the signal $α_{l} S_{l}$ represents the common message used at the first l encoders, we say that cooperating mode l is active if the signal $S_{l}$ is generated and sent. There can be N cooperating modes in total for this N-encoder setting.

Now, we can solve the optimization problem

\begin{matrix} \underset{\underline{C}_{b}}{\text{maximize}} & C (\underline{C}_{b}, P) \end{matrix}

(34)

\begin{matrix} \text{subject to} & \sum_{m = 1}^{N - 1} C_{m, m + 1} = C_{B}, \end{matrix}

(35)

where $C (\underline{C}_{b}, P)$ is given by (30), to investigate the total-power-limited capacity C under the constraint of total fronthaul capacity $C_{B}$ for a given P. To do so, we first prove the following lemma. Note that the full-cooperation capacity is now $C_{full} (N) = \frac{1}{2} \log_{2} (1 + N P)$ when N encoders are used. For simplicity, we denote the mutual informations as

I_{m} ≜ \frac{1}{2} \log_{2} (1 + \sum_{l = 1}^{m} l β_{l} P)

(36)

for any $m \in [1 : N]$.
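The quantities $I_{m}$ in (36) and the inner minimum of (30) for one fixed power split can be computed directly. The following minimal sketch assumes unit noise variance as in the text; the function names and the list-based representation of $\underline{β}$ and of the link capacities are illustrative choices, not the paper's.

```python
import math


def mode_informations(beta, power):
    """I_m of (36) for m = 1..N; beta[l-1] is the power fraction of mode l."""
    infos, acc = [], 0.0
    for l, fraction in enumerate(beta, start=1):
        acc += l * fraction * power  # running sum of l * beta_l * P
        infos.append(0.5 * math.log2(1.0 + acc))
    return infos


def rate_for_allocation(beta, fronthaul, power):
    """Inner min of (30) for a fixed beta.

    fronthaul[m-1] holds C_{m,m+1} for m = 1..N-1.
    """
    infos = mode_informations(beta, power)
    terms = [infos[-1]]  # the term I_N
    terms += [infos[m] + fronthaul[m] for m in range(len(fronthaul))]
    return min(terms)


if __name__ == "__main__":
    # Hypothetical three-encoder example.
    print(rate_for_allocation([0.5, 0.3, 0.2], [0.4, 0.2], power=2.0))
```

Maximizing `rate_for_allocation` over all probability vectors `beta` would give the capacity in (30) for the chosen link capacities.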

Lemma 1.

For the N-encoder setting with any given $C_{B} \leq (N - 1) C_{full} (N)$, a power distribution $\underline{β}$ can only be optimal if equality of all the terms on the RHS of (30) is achieved.

The proof is given in Appendix A.3.

Remark 3.

Lemma 1 indicates that an asymmetric distribution of $C_{B}$ over the fronthaul links is optimal. This result will be further demonstrated after the capacity result is derived.

Based on the reduced set of $\underline{β}$ given by Lemma 1, we make the terms on the RHS of (30) equal and have

C_{m, m + 1} = I_{N} - I_{m} \text{ with } m \in [1 : N - 1] .

(37)

Thus, the channel capacity and the required total fronthaul capacity as functions of the power allocation vector $\underline{β}$ can now be represented as

C (\underline{β}, P) = I_{N} = \frac{1}{2} \log_{2} (1 + \sum_{l = 1}^{N} l β_{l} P),

(38)

and

C_{B} (\underline{β}, P) = (N - 1) I_{N} - \sum_{m = 1}^{N - 1} I_{m} = \frac{N - 1}{2} \log_{2} (1 + \sum_{l = 1}^{N} l β_{l} P) - \sum_{m = 1}^{N - 1} \frac{1}{2} \log_{2} (1 + \sum_{l = 1}^{m} l β_{l} P)

(39)

respectively, for a fixed P if $C_{B} \leq (N - 1) C_{full} (N)$. Based on (38) and (39), we obtain the following theorem.
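Equations (37) to (39) translate a power split $\underline{β}$ directly into a rate and a per-link fronthaul allocation. The sketch below illustrates this mapping under the unit-noise assumption; the function names are hypothetical.

```python
import math


def partial_informations(beta, power):
    """I_m of (36) for m = 1..N, given the mode power fractions beta."""
    infos, acc = [], 0.0
    for l, fraction in enumerate(beta, start=1):
        acc += l * fraction * power
        infos.append(0.5 * math.log2(1.0 + acc))
    return infos


def fronthaul_allocation(beta, power):
    """Return (rate, links, total) following (38), (37), and (39).

    links[m-1] is C_{m,m+1} = I_N - I_m and total is their sum C_B.
    """
    infos = partial_informations(beta, power)
    rate = infos[-1]
    links = [rate - i_m for i_m in infos[:-1]]
    return rate, links, sum(links)


if __name__ == "__main__":
    rate, links, total = fronthaul_allocation([0.5, 0.3, 0.2], power=2.0)
    print("rate:", rate)
    print("per-link fronthaul:", links)
    print("total fronthaul:", total)
```

Because $I_{m}$ is non-decreasing in m, the entries of `links` are non-increasing: links closer to the source carry at least as much as links further down the chain, which is the asymmetric distribution referred to in Remark 3.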

Theorem 5.

For the N-encoder Gaussian channel under the total power constraint P and the total fronthaul constraint $C_{B}$, the channel capacity is given by

\begin{matrix} \underset{\underline{β}}{\text{maximize}} & C (\underline{β}, P) = I_{N} \end{matrix}

\begin{matrix} \text{subject to} & (N - 1) I_{N} - \sum_{m = 1}^{N - 1} I_{m} = C_{B} \end{matrix}

(40)

\begin{matrix} \sum_{l = 1}^{N} β_{l} = 1 \text{ and } 0 \leq β_{l} \leq 1 . \end{matrix}

(41)

To evaluate the channel capacity, we introduce a Lagrange multiplier λ and maximize the function

\begin{matrix} g (\underline{β}, P, λ) & = & I_{N} - λ C_{B} \\ & = & (1 - (N - 1) λ) I_{N} + λ \sum_{m = 1}^{N - 1} I_{m} \\ & = & \frac{1 - (N - 1) λ}{2} \log_{2} (1 + \sum_{l = 1}^{N} l β_{l} P) + \frac{λ}{2} \sum_{m = 1}^{N - 1} \log_{2} (1 + \sum_{l = 1}^{m} l β_{l} P) \end{matrix}

(42)

under the constraint that $\underline{β}$ is a probability vector, which yields the solution $C (C_{B})$ for the general N-encoder case. Note that the parameter λ is the slope of $C (C_{B})$. However, before working out the exact solution of this optimization problem, we first derive general bounds on $C (C_{B}, P)$ to reveal the capacity behavior of the studied distributed beamforming.
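Before turning to the bounds, the optimization in Theorem 5 can be explored by brute force for a small number of encoders. The sketch below is not the Lagrangian method of the paper: it simply scans a grid on the probability simplex for N = 3 and keeps the best rate $I_{N}$ among the splits whose fronthaul requirement (39) fits within the budget. Relaxing the equality in (40) to an inequality is an assumption of this sketch (any unused fronthaul is simply left idle), and the grid resolution `steps` and all function names are illustrative.

```python
import math


def partial_informations(beta, power):
    """I_m of (36) for m = 1..N."""
    infos, acc = [], 0.0
    for l, fraction in enumerate(beta, start=1):
        acc += l * fraction * power
        infos.append(0.5 * math.log2(1.0 + acc))
    return infos


def required_fronthaul(beta, power):
    """Total fronthaul (39) needed by the power split beta."""
    infos = partial_informations(beta, power)
    return (len(beta) - 1) * infos[-1] - sum(infos[:-1])


def capacity_three_encoders(c_b, power, steps=50):
    """Grid-search approximation (from below) of Theorem 5 for N = 3."""
    best_rate, best_beta = 0.0, None
    for i in range(steps + 1):
        for j in range(steps + 1 - i):
            beta = (i / steps, j / steps, (steps - i - j) / steps)
            # Keep only splits whose fronthaul need fits the budget.
            if required_fronthaul(beta, power) <= c_b + 1e-12:
                rate = partial_informations(beta, power)[-1]
                if rate > best_rate:
                    best_rate, best_beta = rate, beta
    return best_rate, best_beta


if __name__ == "__main__":
    for c_b in (0.0, 0.5, 1.0, 2.0, 4.0):
        print(c_b, capacity_three_encoders(c_b, power=2.0))
```

Since the grid covers only finitely many splits, the returned rate is a lower estimate of the capacity; a finer grid or a proper solver would tighten it.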

4.3. Capacity Behavior Bounds

To obtain a simple but meaningful insight into the relation between C and the constrained $C_{B}$ and P for an arbitrary N, we propose an upper bound and a lower bound on the channel capacity, which lead to the following conclusion.

Proposition 1.

For any fixed total transmit power P and number of encoders N, a linear growth of the total fronthaul capacity $C_{B}$ results in a logarithmic growth of the channel capacity, as C can be bounded as

C \leq \frac{1}{2} \log_{2} (1 + P) + \frac{1}{2} \log_{2} (1 + 2 \ln 2 \cdot C_{B}),

(43)

and

C > \frac{1}{2} \log_{2} (1 + {(2 \ln 2 \cdot P C_{B})}^{2 / 3})

(44)

for $C_{B} \leq (N - 1) C_{full}$.

Proof.

(1) Upper bound. By considering

L \in [1 : N]

as a random variable with distribution

\underset{̲}{β}

, the capacity (38) can be expressed as

C = \frac{1}{2} {log}_{2} (1 + μ_{L} P),

(45)

where

μ_{L} ≜ E [L]

. By applying Jensen’s inequality, the corresponding fronthaul capacity

C_{B}

in (39) can be lower bounded as

\begin{matrix} C_{B} & \geq & (N - 1) C - \frac{N - 1}{2} {log}_{2} (\frac{\sum_{m = 1}^{N - 1} (1 + \sum_{l = 1}^{m} l β_{l} P)}{N - 1}) \\ = & (N - 1) C - \frac{N - 1}{2} {log}_{2} (1 + \frac{\sum_{l = 1}^{N - 1} (N - l) l β_{l} P}{N - 1}) \\ \overset{(a)}{\geq} & (N - 1) C - \frac{N - 1}{2} {log}_{2} (1 + \frac{μ_{L} (N - μ_{L}) P}{N - 1}) \\ = & \frac{N - 1}{2} {log}_{2} (\frac{1 + μ_{L} P}{1 + \frac{μ_{L} (N - μ_{L}) P}{N - 1}}) \\ \overset{(b)}{\geq} & \frac{(μ_{L} - 1) μ_{L} P}{2 ln 2 \cdot (1 + μ_{L} P)} \end{matrix}

(46)

where

(a)

follows

μ_{L}^{2} \leq E [L^{2}]

and

(b)

follows

ln x \geq 1 - \frac{1}{x}

for

x > 0

. Since

μ_{L} \geq 1

, we can have

C_{B} \geq \frac{(μ_{L} - 1) μ_{L} P}{2 ln 2 \cdot (μ_{L} + μ_{L} P)} = \frac{(μ_{L} - 1) P}{2 ln 2 \cdot (1 + P)}

that results in

μ_{L} \leq 1 + 2 ln 2 \cdot (\frac{1 + P}{P}) C_{B}

and thus (43).

(2) Lower bound. Consider time-sharing of the rates obtained by using only one cooperating mode. The channel capacity is then at least as large as an achievable rate R, i.e.,

C \geq R ≜ \frac{1}{2} {log}_{2} (1 + k P),

(47)

where k is the number of the activated encoders corresponding to the required total fronthaul capacity

C_{B} = (k - 1) R

(48)

that achieves R. By applying

ln x \leq \frac{x - 1}{\sqrt{x}}

for

x \geq 1

, see, e.g., ([19], Section 3.6.15), we have

R \leq \frac{1}{2 ln 2} \cdot \frac{k P}{\sqrt{1 + k P}}

(49)

that gives an upper bound of

C_{B}

as

\begin{matrix} C_{B} & \leq & \frac{(k - 1) k P}{2 ln 2 \sqrt{1 + k P}} \\ < & \frac{(k - 1) k P}{2 ln 2 \sqrt{k P}} \\ < & \frac{k \sqrt{k P}}{2 ln 2} . \end{matrix}

(50)

Therefore, we can have

k^{3} > {(2 ln 2 \cdot C_{B})}^{2} / P

that directly gives (44). □

The upper bound (43) and lower bound (44) thus indicate the logarithmic behavior of C in

C_{B}

. Figure 4 illustrates these two bounds for the 10-encoder Gaussian setting, where the total transmit power is set to

P = 21

.

The exact capacity solution, derived below, is plotted as well for comparison, showing that the bounds describe the capacity behavior.
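As a numerical illustration of Proposition 1, the sketch below evaluates the right-hand sides of (43) and (44) and the single-mode operating points (47) and (48) used in the lower-bound proof. The function names are hypothetical, and C_B is taken in bits per channel use.

```python
import math

def upper_bound(c_b, P):
    """Right-hand side of (43)."""
    return 0.5 * math.log2(1 + P) + 0.5 * math.log2(1 + 2 * math.log(2) * c_b)

def lower_bound(c_b, P):
    """Right-hand side of (44)."""
    return 0.5 * math.log2(1 + (2 * math.log(2) * P * c_b) ** (2.0 / 3.0))

def single_mode_point(k, P):
    """Pair (C_B, R) from (47) and (48) when only cooperating mode k is used."""
    rate = 0.5 * math.log2(1 + k * P)
    return (k - 1) * rate, rate
```

For P = 21 and k = 1, ..., 10, each achievable single-mode rate R lies between the two bounds evaluated at its own C_B = (k - 1) R, as the proof requires.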

4.4. Compound Mode and Exact Solution

In what follows, we perform evaluation of the channel capacity given in Theorem 5 by defining a compound mode

〈 j, k 〉

as a collection of all consecutive cooperating modes between and including modes

j, k \in [1 : N]

with

j \leq k

. A compound mode

〈 j, k 〉

is referred to as active if all

{β_{l}}_{l = j}^{k}

are nonzero and the other elements in

\underset{̲}{β}

are zero. Note that using a single mode is a special case of a compound mode. By denoting

b (j) ≜ \frac{1}{j (2 + (j + 1) P)},

(51)

we have the following results.

Corollary 2.

For an N-encoder Gaussian setting where

C_{B} \leq (N - 1) C_{full}

with a fixed transmit power P, if there is a compound mode

〈 j, k 〉

such that

U B \geq L B

, where

U B ≜ \{\begin{matrix} \frac{1}{2 (k - 1)} & if & j = 1 \\ min {\frac{1}{2 (k - 1)}, b (j - 1)} & if & j > 1, \end{matrix}

(52)

L B ≜ \{\begin{matrix} max {\frac{1}{2 k}, b (j)} & if & k < N \\ b (j) & if & k = N, \end{matrix}

(53)

the channel capacity corresponding to the slope

λ \in [L B, U B]

is achieved and only achieved by using that compound mode which gives

\begin{matrix} C (λ) = \frac{1}{2} {log}_{2} (\frac{k (1 - (k - 1) λ) (1 + j P)}{j (1 - (j - 1) λ)}), \end{matrix}

(54)

and

\begin{matrix} C_{B} (λ) = \frac{1}{2} {log}_{2} (\frac{{(2 λ)}^{j - k} {(1 + j P)}^{j - 1}}{\prod_{l = j}^{k - 1} \frac{l (l + 1)}{2}} \cdot \frac{{(k (1 - (k - 1) λ))}^{k - 1}}{{(j (1 - (j - 1) λ))}^{j - 1}}) . \end{matrix}

(55)

The proof is given in Appendix A.4.

Remark 4.

The proof in Appendix A.4 shows that if compound mode

〈 j, k 〉

achieves the capacity and

k \geq j + 2

, the modes in

[j + 1 : k - 1]

should be assigned the same amount of power in the optimal setting.

Remark 5.

By rewriting (55) and comparing it to (54), we can also represent

C_{B}

in terms of C as

C_{B} (λ) = (j - 1) C (λ) + \frac{1}{2} {log}_{2} (\frac{{[λ^{- 1} k (1 - (k - 1) λ)]}^{k - j}}{\prod_{l = j}^{k - 1} l (l + 1)}) .

(56)

for a certain slope

λ \in [L B, U B]

.
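The closed forms (54), (55), and (56) can be cross-checked numerically. The sketch below is a direct transcription of the three expressions, working in the log domain to avoid large products; the function names are illustrative, and the slope λ is passed in by the caller rather than derived from [LB, UB].

```python
import math

def capacity(j, k, lam, P):
    """C(lambda) from (54), in bits per channel use."""
    num = k * (1 - (k - 1) * lam) * (1 + j * P)
    den = j * (1 - (j - 1) * lam)
    return 0.5 * math.log2(num / den)

def fronthaul(j, k, lam, P):
    """C_B(lambda) from (55), accumulated in the log domain."""
    acc = (j - k) * math.log2(2 * lam)
    acc += (j - 1) * math.log2(1 + j * P)
    acc -= sum(math.log2(l * (l + 1) / 2) for l in range(j, k))
    acc += (k - 1) * math.log2(k * (1 - (k - 1) * lam))
    acc -= (j - 1) * math.log2(j * (1 - (j - 1) * lam))
    return 0.5 * acc

def fronthaul_from_capacity(j, k, lam, P):
    """C_B(lambda) rewritten in terms of C(lambda), as in (56)."""
    tail = (k - j) * math.log2(k * (1 - (k - 1) * lam) / lam)
    tail -= sum(math.log2(l * (l + 1)) for l in range(j, k))
    return (j - 1) * capacity(j, k, lam, P) + 0.5 * tail
```

For a single mode (j = k) the expressions reduce to C = (1/2) log2(1 + kP) and C_B = (k - 1) C, which agrees with (47) and (48).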

4.5. Modes Selection for Capacity Achieving

The results in Corollary 2 state how the capacity is achieved and expressed over a certain

λ

range. To specify exactly how the cooperating modes should be used from no cooperation to full cooperation, we develop a procedure that activates modes efficiently. It is based on the following result, which identifies the valid compound modes, i.e., those that achieve capacity, in terms of a power penalty.

Corollary 3.

For any fixed P, a compound mode

〈 j, k 〉

achieves the capacity if and only if

\begin{matrix} \frac{2 (k - j - 1)}{j (j + 1)} < P \leq \frac{2 (k - j + 1)}{j (j - 1) ⌈ 1 - \frac{k}{N} ⌉}, \end{matrix}

(57)

where

⌈ \cdot ⌉

is the ceiling function. If (57) is satisfied, compound modes

〈 j^{'}, k^{'} 〉

with

j^{'} > j

and

k^{'} < k

do not achieve the capacity.

The proof is given in Appendix A.5. The power condition (57) indicates that a compound mode needs certain transmit power to be supported to be optimal. On the other hand, some compound modes can never be optimal if the transmit power is too large.
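The power condition (57) can be compared with the requirement UB ≥ LB of Corollary 2. The sketch below implements both under two stated assumptions: the term 1/(2(k - 1)) is read as +∞ when k = 1, and a zero denominator in (57) (j = 1 or k = N) is read as an infinite upper limit on P. The function names are hypothetical.

```python
import math

def b(j, P):
    """b(j) from (51)."""
    return 1.0 / (j * (2.0 + (j + 1) * P))

def slope_range(j, k, N, P):
    """(LB, UB) from (52) and (53); 1/(2(k-1)) is treated as +inf when k = 1."""
    ub = math.inf if k == 1 else 1.0 / (2 * (k - 1))
    if j > 1:
        ub = min(ub, b(j - 1, P))
    lb = b(j, P)
    if k < N:
        lb = max(lb, 1.0 / (2 * k))
    return lb, ub

def power_condition(j, k, N, P):
    """Condition (57); a zero denominator means no upper limit on P."""
    lower = 2.0 * (k - j - 1) / (j * (j + 1))
    ceil_term = 1 if k < N else 0      # equals ceil(1 - k/N) for 1 <= k <= N
    denom = j * (j - 1) * ceil_term
    upper = math.inf if denom == 0 else 2.0 * (k - j + 1) / denom
    return lower < P <= upper
```

Away from the boundary case where P equals the lower limit exactly, `power_condition` and the test `lb <= ub` give the same verdict, so either can be used to decide whether a compound mode is valid.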

Now, note that C is monotonically increasing in

C_{B}

owing to nonnegative slope

λ

and monotonically decreasing in

λ

according to (54) when j and k are fixed. This shows that to achieve the capacity, compound modes should be activated in such a way that the corresponding slope range varies from large to small as

C_{B}

increases. Therefore, based on the results in Corollary 3, we obtain an algorithm for computing C and

C_{B}

over

C_{B} \in [0, (N - 1) C_{full}]

by activating valid compound modes sequentially.

Algorithm 1 represents the cooperating strategy among encoders. It reveals that cooperating modes should be activated one-by-one to form new compound modes with the increase of

C_{B}

. At a certain point in the growth of

C_{B}

, the first mode dies, i.e., it is deactivated owing to the limited P or N. With a further increase of

C_{B}

, the lower modes die one by one until full cooperation is obtained.

Figure 5 plots

C (C_{B})

, computed by applying Algorithm 1, for

P = 1

and

P = 21

over the full range of

C_{B} \in [0, (N - 1) C_{full}]

. Different numbers of available encoders are considered. In the plot, each colored segment represents the corresponding activated compound mode. In addition, the pentagram markers label the points where a lower mode has to be deactivated (dies) because of the power penalty (57) or because all N encoders are used up, namely the operations in line 13 and line 9 of the algorithm, respectively. It is shown that for low SNR, i.e.,

P = 1

, the modes die fast due to the small power. On the other hand, for large SNR, i.e.,

P = 21

, the larger the number of available encoders, the more slowly the modes die, so that a higher capacity can be achieved (compare the curves for 2, 3, 4, and 5 encoders).

Algorithm 1 Compute C and

C_{B}

from no cooperation to full cooperation

Initialize:

j \leftarrow 1

and

k \leftarrow 1

Ensure:

1 \leq j \leq k \leq N

1: while $j < N$ do
2:   if Power condition (57) is satisfied then
3:     $λ \leftarrow [L B, U B]$
4:     Compute $C (λ)$ and $C_{B} (λ)$ by (54) and (55)
5:     if $k < N$ then
6:       $k \leftarrow k + 1$
7:     else
8:       $k \leftarrow N$
9:       $j \leftarrow j + 1$
10:     end if
11:   else
12:     $k \leftarrow k - 1$
13:     $j \leftarrow j + 1$
14:   end if
15: end while
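The control flow of Algorithm 1 can be sketched in Python as follows. This is a minimal transcription that only records the sequence of valid compound modes; evaluating C(λ) and C_B(λ) on each slope range (lines 3 and 4 of the algorithm) is left out. The helper `power_condition` and its treatment of a zero denominator in (57) as an infinite upper limit are assumptions of this sketch.

```python
import math

def power_condition(j, k, N, P):
    """Condition (57); a zero denominator means no upper limit on P."""
    lower = 2.0 * (k - j - 1) / (j * (j + 1))
    denom = j * (j - 1) * (1 if k < N else 0)   # ceil(1 - k/N) is 0 only at k = N
    upper = math.inf if denom == 0 else 2.0 * (k - j + 1) / denom
    return lower < P <= upper

def mode_schedule(N, P):
    """Compound modes <j,k> visited by Algorithm 1, in order of increasing C_B."""
    j, k = 1, 1
    schedule = []
    while j < N:
        if power_condition(j, k, N, P):
            schedule.append((j, k))   # lines 3-4 would be evaluated here
            if k < N:
                k += 1                # activate one more mode
            else:
                j += 1                # line 9: lowest mode dies, encoders used up
        else:
            k -= 1
            j += 1                    # line 13: lowest mode dies, power penalty
    return schedule
```

For N = 4 and P = 0.5 the sketch yields the sequence ⟨1,1⟩, ⟨1,2⟩, ⟨2,2⟩, ⟨2,3⟩, ⟨2,4⟩, ⟨3,4⟩. As transcribed, the loop stops once j reaches N, so the final single mode ⟨N,N⟩ (full cooperation) is the end point and is not appended.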

Proposition 2.

As probably the most natural strategy, the way of applying modes in the lower bound proof of Proposition 1, i.e., time-sharing full cooperation of small number of encoders, is not optimal in general. However, it is sub-optimal when SNR is small as the compound modes that achieve capacity reduce to single modes.

To visualize each mode evolution from no beamforming to total beamforming, we can illustrate the power allocation for each cooperating mode as

C_{B}

increases. By incorporating calculations of

\underset{̲}{β}

(given in Appendix A.4) and (37) into Algorithm 1, Figure 6 and Figure 7 show the modes’ power evolution of the 10-encoder setting for

P = 5

and

P = 21

, respectively. It is shown that the first mode dies faster when P is relatively small. They also interestingly show that once

C_{B}

is large enough to approach the total beamforming, the last mode dominates as other modes all vanish.

Moreover, we can also illustrate the cooperation among encoders by showing the optimal distribution of

C_{B}

over fronthaul links. Figure 8 illustrates the distribution of the 10-encoder setting where the bolder curves are for

P = 5

while the lighter curves are for

P = 21

. In each case, the fronthaul capacity curves for

C_{m m + 1}

for

m = 1

to

m = 9

are located from left to right in the plot. This result further demonstrates the asymmetric water-pouring assignment of

C_{B}

over fronthaul links, see Remark 3.

4.6. $〈 1, k 〉$ Mode and Capacity Regimes

Consider the case where

〈 1, k 〉

mode achieves the capacity for

k \leq N

. In this case, the growth rate of

C (C_{B})

is independent of P and N, see the expression of

C_{B}

in (56) with

j = 1

. Therefore, we say that the system works in a fronthaul-capacity-limited regime when a

〈 1, k 〉

mode is used. The reason why we are interested in the fronthaul-capacity-limited regime is that

C (C_{B})

achieves the fastest growth rate regardless of P and N. With a further increase of

C_{B}

, the first mode dies due to either limited P or limited N. We then say that the system works in a power-limited regime or an encoder-limited regime, respectively. When the system is in either of these two regimes,

C (C_{B})

growth is slowed down compared to when the system works in the fronthaul-capacity-limited regime. This is due to the discontinuities of the slope

λ

, see the derived optimal upper and lower bounds of

λ

. The following result shows how to determine which regime the system works in for given

C_{B}

, P, and N.

Proposition 3.

For given P and N, if

P \leq N - 2,

(58)

the capacity growth is limited by P and the system works in a fronthaul-capacity-limited regime if

C_{B} ⪅ \frac{1}{2} {log}_{2} (\frac{{((1 + \tilde{P}) (2 + \tilde{P}))}^{\tilde{P}}}{\tilde{P}! (\tilde{P} + 1)!}),

(59)

where

\tilde{P} ≜ ⌈ P ⌉

. Otherwise it works in a power-limited regime.

On the other hand, if

P > N - 2

, the capacity growth is limited by N and the system works in a fronthaul-capacity-limited regime if

C_{B} \leq \frac{1}{2} {log}_{2} (\frac{{(N (3 + 2 P - N))}^{N - 1}}{(N - 1)! N!}) .

(60)

Otherwise it works in an encoder-number-limited regime.

Proof.

Modes dying hampers the growth of C in

C_{B}

. Consider that the first mode of the compound mode

〈 1, k 〉

dies because of the constrained P rather than N. In this case, P must satisfy (58), which follows from the lower bound of (57). Consequently, right after the first mode dies, i.e., when compound mode

〈 2, k 〉

is active, we have

j = 2

and

k \approx ⌈ P ⌉ + 1

given by taking the upper bound of (57). This

j, k

setting results in

λ = \frac{1}{2 + 2 P}

so that (59) is obtained by evaluating (55).

Similarly, if the compound mode

〈 1, N 〉

can be supported by P, the first mode dies because no new encoders can be used. At the moment the first mode dies, i.e., while compound mode

〈 1, N 〉

is still active, we thus have

j = 1

and

k = N

, which also result in

λ = \frac{1}{2 + 2 P}

. Hence, (60) follows. □
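The regime test of Proposition 3 can be written as a short function. The sketch below evaluates the right-hand sides of (59) and (60) in the log domain using the log-gamma function; the function names and the returned regime labels are illustrative. Note that (59) is stated as an approximate inequality in the proposition, so the first threshold is only approximate.

```python
import math

def log2_factorial(n):
    """log2(n!) via the log-gamma function."""
    return math.lgamma(n + 1) / math.log(2)

def fronthaul_limited_threshold(P, N):
    """Return (threshold on C_B in bits, regime entered beyond the threshold)."""
    if P <= N - 2:                               # condition (58)
        p = math.ceil(P)                         # P-tilde in (59)
        log_num = p * math.log2((1 + p) * (2 + p))
        log_den = log2_factorial(p) + log2_factorial(p + 1)
        return 0.5 * (log_num - log_den), "power-limited"
    log_num = (N - 1) * math.log2(N * (3 + 2 * P - N))
    log_den = log2_factorial(N - 1) + log2_factorial(N)
    return 0.5 * (log_num - log_den), "encoder-limited"
```

For the 10-encoder setting, P = 5 falls under (58) and (59), whereas P = 21 falls under (60).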

Figure 9 plots

C (C_{B})

of 10-encoder setting for

P = 5

and

P = 21

, respectively, where the regime separations are indicated at the points where the first mode dies for both powers. The figure illustrates that in the fronthaul-capacity-limited regime, C has the highest growth rate no matter what its initial value is (point-to-point communication). In the next section, we focus on a system working in the fronthaul-capacity-limited regime.

5. Infinitely Many Encoders

Consider designing a system in practice when

C_{B}

and P are critical resources while available encoders could be many, for instance, the radio stripe system. Based on the previous study, we should always try to let the system work in its fronthaul-capacity-limited regime where the fronthaul capacity is maximally utilized. Hence, we are motivated to determine the number of encoders that are required to be activated for a given

C_{B}

, assuming that infinitely many of them are available and that the system is purely fronthaul constrained.

Solving (55) or (56) directly for k, the highest active mode and hence the number of required encoders, is not trivial. To obtain an accurate approximation, we first need the following lemma, whose proof follows the outline in [20] and is given in Appendix A.6.

Lemma 2.

The non-principal branch of the Lambert W function

W_{- 1} (\cdot)

defined in the interval

[- e^{- 1}, 0)

, see [21], can be bounded as follows

W_{- 1} (- e^{- (x + 1)}) \geq - x - \sqrt[3]{2 x} - 1

(61)

for

x \geq 0

.

Remark 6.

Figure A3 shows that for all

x \geq 0.5

, the lower bound (61) is much tighter than

W_{- 1} (- e^{- (x + 1)}) \geq - x - \sqrt{2 x} - 1

given in [20], which is the tightest bound of

W_{- 1} (\cdot)

reported in the literature so far, to our best knowledge.
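Lemma 2 and Remark 6 can be checked numerically without any special-function library. The sketch below computes W_{-1} by bisection, using the substitution w = -t with t ≥ 1, which turns w e^w = z into t - ln t = -ln(-z). The bisection solver is an illustrative stand-in and not the method of the paper; the function names are hypothetical.

```python
import math

def lambert_w_minus1(z):
    """W_{-1}(z) for -1/e <= z < 0, computed by bisection."""
    # t - ln(t) is increasing on [1, inf), so the root t is unique.
    target = -math.log(-z)
    lo, hi = 1.0, 2.0
    while hi - math.log(hi) < target:
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mid - math.log(mid) < target:
            lo = mid
        else:
            hi = mid
    return -0.5 * (lo + hi)

def cube_root_bound(x):
    """Right-hand side of (61)."""
    return -x - (2.0 * x) ** (1.0 / 3.0) - 1.0

def square_root_bound(x):
    """Lower bound attributed to [20] in Remark 6."""
    return -x - math.sqrt(2.0 * x) - 1.0
```

On a grid of x values starting at 0.5, the computed W_{-1}(-e^{-(x+1)}) stays above `cube_root_bound(x)`, which in turn stays above `square_root_bound(x)`; the two bounds coincide at x = 0.5.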

Proposition 4.

When the system works in the fronthaul-capacity-limited regime, the number of required encoders

\tilde{k}

is quasi-linear to the available fronthaul capacity as

\tilde{k} = ⌊ C_{B} + {(2 C_{B} + 1)}^{1 / 3} + 1.5 ⌋,

(62)

where

C_{B}

is in nats per channel use and

⌊ \cdot ⌋

is the floor function.

Proof.

As the system works in the fronthaul-limited regime, compound mode

〈 1, k 〉

exists. Thus, according to (56),

C_{B} = \frac{1}{2} ln (\frac{{[λ^{- 1} k (1 - (k - 1) λ)]}^{k - 1}}{\prod_{l = 1}^{k - 1} l (l + 1)})

(63)

in nats. To upper bound k, we lower bound

C_{B}

by taking

λ = \frac{1}{2 (k - 1)}

, see the bound (52), which gives

\begin{matrix} C_{B} & \geq & \frac{1}{2} ln (\frac{k^{k - 2} {(k - 1)}^{k - 1}}{{((k - 1)!)}^{2}}) \\ \overset{(a)}{\geq} & \frac{1}{2} ln (\frac{e^{2 (k - 1)} k^{k - 2}}{e^{2} {(k - 1)}^{k}}) \\ = & k - 2 - ln k + \frac{k}{2} ln (\frac{k}{k - 1}) \\ \overset{(b)}{\geq} & k - ln k - 1.5 \end{matrix}

(64)

where

(a)

follows by applying

(k - 1)! \leq e {(k - 1)}^{k - \frac{1}{2}} e^{- k + 1}

derived based on ([22], 6.1.38), and

(b)

follows by taking the fact that

{(\frac{k}{k - 1})}^{k}

is monotonically decreasing in k and goes to e as

k \to \infty

. Therefore, we have

- exp (- (C_{B} + 1.5)) > - k exp (- k) .

(65)

Now, solving for k and applying (61) gives

\begin{matrix} k & \leq & - W_{- 1} (- exp (- (C_{B} + 1.5))) \\ \leq & C_{B} + \sqrt[3]{2 C_{B} + 1} + 1.5 . \end{matrix}

(66)

Finally, since the number of encoders is an integer, (62) follows. □

In Figure 10, the bound on k given by (66) and the actual number of encoders required to be activated are plotted as functions of

C_{B}

. The figure shows that the derived bound is accurate.
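The comparison behind Figure 10 can be reproduced approximately as follows. The sketch evaluates (62) and compares it with the number of encoders obtained from the first line of (64), under the interpretive assumption that compound mode ⟨1,k⟩ becomes active once C_B reaches the value of (63) at λ = 1/(2(k - 1)). All function names are hypothetical, and C_B is in nats per channel use.

```python
import math

def k_tilde(c_b_nats):
    """Closed-form encoder count (62)."""
    return math.floor(c_b_nats + (2.0 * c_b_nats + 1.0) ** (1.0 / 3.0) + 1.5)

def min_fronthaul_for_mode(k):
    """C_B (nats) from (63) at lambda = 1/(2(k-1)), i.e., the first line of (64)."""
    if k <= 2:
        return 0.0
    log_num = (k - 2) * math.log(k) + (k - 1) * math.log(k - 1)
    return 0.5 * (log_num - 2.0 * math.lgamma(k))   # lgamma(k) = ln((k-1)!)

def active_encoders(c_b_nats):
    """Largest k whose mode <1,k> has started at this C_B (assumption of this sketch)."""
    k = 2
    while min_fronthaul_for_mode(k + 1) <= c_b_nats:
        k += 1
    return k
```

For example, at C_B = 10 nats both functions return 14, and at C_B = 2 nats they return 4 and 5, respectively, so the closed form (62) never underestimates the encoder count in these cases.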

6. Concluding Remarks

In this paper, the downlink of a cell-free mMIMO system in which multiple APs connected in a linear fronthaul topology serve a single receiver was studied to reveal relations between the three fundamental network resources, namely the total fronthaul capacity

C_{B}

, the total transmit power P, and the number of available APs N. Specifically, we focused on partial distributed beamforming, where the total available fronthaul capacity is not enough to support full cooperation between all APs, i.e., beamforming. By formulating the problem as a MAC channel with multiple encoders linked in a feed-and-forward setting, we derived the channel capacity as a function of the total fronthaul capacity for both discrete and Gaussian channels. The derivation started with two encoders, and we then extended the analysis to multiple encoders. It was demonstrated that capacity is achieved by multi-layer superposition coding, from which the concept of cooperating mode was developed for the Gaussian channel. This cooperating mode technique leads to optimal cooperation among encoders. Bounds on the capacity for the N-encoder setting demonstrated that this channel capacity grows logarithmically in

C_{B}

for a fixed P. The exact capacity solution shows that the capacity is achieved if and only if certain compound modes are used. An algorithm was derived for computing which compound modes should be activated as a function of

C_{B}

, which grows from zero to the value obtaining full beamforming. We demonstrated that

C_{B}

should be water-poured over the fronthaul links to obtain optimality. Finally, by considering the case where infinitely many encoders are available, we showed that the number of required encoders is quasi-linear to the available total fronthaul capacity when the system is purely constrained by fronthaul resources.

Future directions include extending the results to channels with links which do not have unit gain as is the case here, and considering multiple receivers. Another interesting direction would be the equivalent uplink case.

Author Contributions

All authors conceived the problem and solution. P.Z. wrote the paper. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Proofs and Derivations

Appendix A.1. Proof of Theorem 1

(i) Converse. Consider Fano’s inequality

H (W | Y^{n}) \leq 1 + P_{e}^{(n)} {log}_{2} M = F

for

F \overset{Δ}{=} 1 + P_{e}^{(n)} {log}_{2} M

. Since

W \to (X_{1}^{n}, X_{2}^{n}) \to Y^{n}

forms a Markov chain, we have that

\begin{matrix} {log}_{2} M = H (W) & = & I (W; Y^{n}) + H (W | Y^{n}) \leq I (W, X_{1}^{n}, X_{2}^{n}; Y^{n}) + F \\ \leq & I (X_{1}^{n}, X_{2}^{n}; Y^{n}) + F \leq \sum_{i = 1}^{n} I (X_{1 i}, X_{2 i}; Y_{i}) + F \\ \leq & n I (X_{1}, X_{2}; Y | Q) + F \leq n I (X_{1}, X_{2}; Y) + F . \end{matrix}

(A1)

Moreover, from the Markovity of

(W, W_{12}) \to (X_{1}^{n}, X_{2}^{n}) \to Y^{n}

, we obtain that

\begin{matrix} {log}_{2} M = H (W) & = & I (W; Y^{n}) + H (W | Y^{n}) \leq I (W, W_{12}; Y^{n}) + F \\ = & I (W_{12}; Y^{n}) + I (W; Y^{n} | W_{12}) + F \leq {log}_{2} M_{12} + I (X_{1}^{n}; Y^{n} | X_{2}^{n}) + F \\ \leq & n C_{12} + I (X_{1}^{n}; Y^{n} | X_{2}^{n}) + F \leq n C_{12} + \sum_{i = 1}^{n} I (X_{1 i}; Y_{i} | X_{2 i}) + F \\ \leq & n C_{12} + n I (X_{1}; Y | X_{2}, Q) + F \leq n C_{12} + n I (X_{1}; Y | X_{2}) + F . \end{matrix}

(A2)

In the above derivations, the random variable Q is uniformly distributed on

[1 : n]

and

Pr {X_{1} = x_{1}, X_{2} = x_{2}} = \frac{1}{n} \sum_{i = 1}^{n} Pr {X_{1 i} = x_{1}, X_{2 i} = x_{2}}

for

x_{1} \in X_{1}, x_{2} \in X_{2}

. If now both

δ \to 0

and

n \to \infty

we obtain that

R \leq min {I (X_{1}, X_{2}; Y), I (X_{1}; Y | X_{2}) + C_{12}},

(A3)

for some distribution

p (x_{1}, x_{2}, y) = p (x_{1}, x_{2}) p (y | x_{1}, x_{2})

, for every achievable rate R. This concludes the converse for the discrete memoryless two-encoder case.

(ii) Achievability. We prove that if the message rate

(1 / n) {log}_{2} M < C_{B} (C_{12})

for a given fronthaul capacity

C_{12}

, the message error probability

P_{e}^{(n)}

approaches zero if the codeword length n increases. Our coding method is based on superposition.

Codebook Generation: First fix a joint probability distribution

{p (x_{1}, x_{2}), x_{1} \in X_{1}, x_{2} \in X_{2}}

. This distribution determines

p_{X_{2}} (x_{2}) = \sum_{x_{1} \in X_{1}} p (x_{1}, x_{2})

and

p_{X_{1} | X_{2}} (x_{1} | x_{2}) = p (x_{1}, x_{2}) / p_{X_{2}} (x_{2})

for

x_{2}

with

p_{X_{2}} (x_{2}) > 0

. Now generate at random

M_{2}

i.i.d. sequences

x_{2}^{n} \in {X_{2}}^{n}

of length n, each drawn according to

Pr {X_{2}^{n} = x_{2}^{n}} = \prod_{i = 1}^{n} p_{X_{2}} (x_{2 i})

and index these sequences as

x_{2}^{n} (w_{2})

as an inner code, where

w_{2} \in [1 : M_{2}]

. Then, for each such

x_{2}^{n} (w_{2})

, generate

M_{1}

sequences

x_{1}^{n} (w_{1}, w_{2})

drawn according to

Pr {X_{1}^{n} = x_{1}^{n} | X_{2}^{n} = x_{2}^{n} (w_{2})} = \prod_{i = 1}^{n} p_{X_{1} | X_{2}} (x_{1 i} | x_{2 i} (w_{2}))

in an i.i.d. fashion as an outer code, where

w_{1} \in [1 : M_{1}]

. The resulting codebook is revealed to both encoders and to the decoder.

Encoding: Split the message W that is uniformly distributed on

[1 : M]

into

(W_{1}, W_{2})

with

M = M_{1} \times M_{2}

, where the first part

W_{1}

, which is uniformly distributed on

[1 : M_{1}]

, is transmitted by encoder 1 and the second part

W_{2}

, which is uniformly distributed on

[1 : M_{2}]

and is conveyed to encoder 2 by

W_{12}

, is transmitted by two encoders cooperatively. Hence, when

(W_{1}, W_{2}) = (w_{1}, w_{2})

, encoder 2 sends

x_{2}^{n} (w_{2})

while encoder 1 inputs

x_{1}^{n} (w_{1}, w_{2})

into the MAC.

Decoding: Let

ϵ > 0

. Based on the observed channel output sequence

y^{n}

, the decoder finds the message pair

(w_{1}, w_{2})

such that

(x_{1}^{n} (w_{1}, w_{2}), x_{2}^{n} (w_{2}), y^{n}) \in A_{ϵ}^{(n)} (X_{1} X_{2} Y),

(A4)

where set

A_{ϵ}^{(n)} (X_{1} X_{2} Y)

is the set of jointly

ϵ

-typical sequences, see Cover and Thomas [18]. If such a pair cannot be found, or if there is more than one such pair, an error is declared.

Probability of Error: Due to symmetry, the average probability of error is equivalent to the probability of error for an arbitrary message

w \in {1, \dots, 2^{n R}}

. Hence, without loss of generality, we assume

W = w = (w_{1}, w_{2})

. Thus, we have

\begin{matrix} P_{e}^{(n)} & = & Pr {E^{c} (w_{1}, w_{2}) \cup ⋃_{({\tilde{w}}_{1}, {\tilde{w}}_{2}) \neq (w_{1}, w_{2})} E ({\tilde{w}}_{1}, {\tilde{w}}_{2})} \\ \leq & Pr {E^{c} (w_{1}, w_{2})} + \sum_{({\tilde{w}}_{1}, {\tilde{w}}_{2}) \neq (w_{1}, w_{2})} Pr {E ({\tilde{w}}_{1}, {\tilde{w}}_{2})} \\ = & Pr {E^{c} (w_{1}, w_{2})} + \sum_{{\tilde{w}}_{1} \neq w_{1}} Pr {E ({\tilde{w}}_{1}, w_{2})} + \sum_{({\tilde{w}}_{1}, {\tilde{w}}_{2}) : {\tilde{w}}_{2} \neq w_{2}} Pr {E ({\tilde{w}}_{1}, {\tilde{w}}_{2})}, \end{matrix}

(A5)

where

E (w_{1}, w_{2}) \overset{Δ}{=} {(x_{1}^{n} (w_{1}, w_{2}), x_{2}^{n} (w_{2}), Y^{n}) \in A_{ϵ}^{(n)} (X_{1} X_{2} Y)} .

(A6)

Due to the Asymptotic Equipartition Property (AEP), it can be shown that

Pr {E^{c} (w_{1}, w_{2})} \leq ϵ,

(A7)

for all n large enough. Moreover

\begin{matrix} \sum_{{\tilde{w}}_{1} \neq w_{1}} Pr {E ({\tilde{w}}_{1}, w_{2})} & \leq & (M_{1} - 1) \sum_{(x_{1}^{n}, x_{2}^{n}, y^{n}) \in A_{ϵ}^{(n)}} P (x_{1}^{n} | x_{2}^{n}) P (x_{2}^{n}) P (y^{n} | x_{2}^{n}) \\ \leq & (M_{1} - 1) 2^{- n (I (X_{1}; Y | X_{2}) - 4 ϵ)}, \end{matrix}

(A8)

and

\begin{matrix} \sum_{({\tilde{w}}_{1}, {\tilde{w}}_{2}) : {\tilde{w}}_{2} \neq w_{2}} Pr {E ({\tilde{w}}_{1}, {\tilde{w}}_{2})} & = & M_{1} (M_{2} - 1) \sum_{(x_{1}^{n}, x_{2}^{n}, y^{n}) \in A_{ϵ}^{(n)}} P (x_{1}^{n} | x_{2}^{n}) P (x_{2}^{n}) P (y^{n}) \\ \leq & M_{1} (M_{2} - 1) 2^{- n (I (X_{1}, X_{2}; Y) - 3 ϵ)} . \end{matrix}

(A9)

Now as long as

\begin{matrix} M_{1} & \leq & 2^{n (I (X_{1}; Y | X_{2}) - 5 ϵ)}, \\ M_{1} M_{2} & \leq & 2^{n (I (X_{1}, X_{2}; Y) - 4 ϵ)}, \end{matrix}

(A10)

P_{e}^{(n)} \leq 2 ϵ

for all n large enough. Therefore we take

\begin{matrix} {log}_{2} M_{2} & = & min {n (I (X_{2}; Y) + ϵ), n C_{12}}, \\ {log}_{2} M_{1} & = & n (I (X_{1}; Y | X_{2}) - 5 ϵ), \end{matrix}

(A11)

then both (9) and (A10) are satisfied. Note that this implies that

\begin{matrix} {log}_{2} M_{1} M_{2} & = & min {n (I (X_{1}; Y | X_{2}) - 5 ϵ) + n C_{12}, n (I (X_{1}, X_{2}; Y) - 4 ϵ)} . \end{matrix}

(A12)

If we now let

ϵ \to 0

, the achievability part of Theorem 1 is thus established.

Appendix A.2. Proof of Theorem 3

The proof is a generalization of the proof of the two-encoder settings. Consider a simplified block diagram of the N-encoder setting as shown in Figure A1. Now, consider a cut of the fronthaul link between

X_{m}

and

X_{m + 1}

for any given

m \in [1 : N - 1]

such that the nodes in the network are separated in two sets of

{{\underset{̲}{X}}_{1}^{m}}

and

{{\underset{̲}{X}}_{m + 1}^{N}, Y}

.

Figure A1. Simplified illustration of the MAC for the N-encoder setting.

(i) Converse. Consider the Markovity of

W \to (X_{1}^{n}, X_{2}^{n}, \dots, X_{N}^{n}) \to Y^{n}

. By applying Fano’s inequality, we first have

\begin{matrix} {log}_{2} M & \leq & I (W, W_{12}, \dots, W_{N - 1, N}; Y^{n}) + H (W | Y^{n}) \\ \leq & I (X_{1}^{n}, X_{2}^{n}, \dots, X_{N}^{n}; Y^{n}) + F \\ \leq & \sum_{i = 1}^{n} I (X_{1 i}, X_{2 i}, \dots, X_{N i}; Y_{i}) + F \\ \leq & n I ({\underset{̲}{X}}_{1}^{N}; Y) + F . \end{matrix}

(A13)

Then, considering the cut between

X_{m}

and

X_{m + 1}

, we have that

\begin{matrix} {log}_{2} M & = & I (W, W_{12}, \dots, W_{N - 1, N}; Y^{n}) + H (W | Y^{n}) \\ \leq & I (W_{m, m + 1}; Y^{n}) + I (W, W_{12}, \dots, W_{m - 1, m}; Y^{n} | W_{m, m + 1}, \dots, W_{N - 1, N}) + F \\ \leq & {log}_{2} M_{m, m + 1} + I (X_{1}^{n}, X_{2}^{n}, \dots, X_{m}^{n}; Y^{n} | X_{m + 1}^{n}, \dots, X_{N}^{n}) + F \\ \leq & n C_{m, m + 1} + \sum_{i = 1}^{n} I (X_{1 i}, X_{2 i}, \dots, X_{m i}; Y_{i} | X_{m + 1, i}, \dots, X_{N, i}) + F \\ \leq & n C_{m, m + 1} + n I ({\underset{̲}{X}}_{1}^{m}; Y | {\underset{̲}{X}}_{m + 1}^{N}) + F . \end{matrix}

(A14)

Note that the above result is valid for any m in

[1 : N - 1]

. Thus, by letting

n \to \infty

the converse follows.

(ii) Achievability. First consider the message W that can be represented by N independent messages as

W = {W_{i}}_{i = 1}^{N}

, where each

W_{i}

is uniformly distributed on

[1 : M_{i}]

with

\prod_{i = 1}^{N} M_{i} = M

. Then, given by the linear topology of encoders, we distribute

{W_{i}}_{i = 1}^{N}

into the network in the manner illustrated by Figure A2, i.e., for the link between any

X_{m}

and

X_{m + 1}

, the fronthaul message

W_{m, m + 1}

conveys corresponding messages

{W_{i}}_{i = m + 1}^{N}

. Therefore, for a fixed distribution

p (x_{1}, x_{2}, \dots, x_{N})

and corresponding marginals, we can first generate

M_{N}

i.i.d. n-sequences

x_{N}^{n} (w_{N})

with

w_{N} \in [1 : M_{N}]

according to

Pr (X_{N}^{n} = x_{N}^{n}) = \prod_{i = 1}^{n} p_{X_{N}} (x_{N i})

and then for each

x_{N}^{n} (w_{N})

generate

M_{N - 1}

i.i.d. n-sequences

x_{N - 1}^{n} (w_{N - 1}, w_{N})

with

w_{N - 1} \in [1 : M_{N - 1}]

according to

Pr (X_{N - 1}^{n} = x_{N - 1}^{n} | X_{N}^{n} = x_{N}^{n} (w_{N})) = \prod_{i = 1}^{n} p_{X_{N - 1} | X_{N}} (x_{N - 1, i} | x_{N i} (w_{N}))

and so on. In this way, an N-layer superposition codebook is generated and revealed to all encoders and the decoder.

Figure A2. N-layer superposition coding message structure.

Thus, for sending a message w, encoders transmit sequences

{x_{m}^{n} (w_{m}, w_{m + 1}, \dots, w_{N})}_{m = 1}^{N}

over the MAC channel. At the decoder, a unique message tuple

(w_{1}, w_{2}, \dots, w_{N})

is found by using simultaneous typicality decoding as performed for the two-encoder case. A similar probability-of-error analysis shows that, as long as

\begin{matrix} M_{1} & \leq & 2^{n (I (X_{1}; Y | {\underset{̲}{X}}_{2}^{N}) - 5 ϵ)}, \\ M_{1} M_{2} & \leq & 2^{n (I (X_{1}, X_{2}; Y | {\underset{̲}{X}}_{3}^{N}) - 5 ϵ)}, \\ ⋮ \\ \prod_{m = 1}^{N} M_{m} & \leq & 2^{n (I ({\underset{̲}{X}}_{1}^{N}; Y) - 4 ϵ)}, \end{matrix}

(A15)

we can have

P_{e}^{(n)} \leq 2 ϵ

for all sufficiently large n and any

ϵ > 0

. By further considering

M_{N} \leq 2^{n C_{N - 1, N}}

,

M_{N - 1} M_{N} \leq 2^{n C_{N - 2, N - 1}}

, …, and

\prod_{m = 2}^{N} M_{m} \leq 2^{n C_{12}}

, we can subsequently take

\begin{matrix} {log}_{2} M_{N} & = & min {n (I (X_{N}; Y) + ϵ), n C_{N - 1, N}} \\ {log}_{2} M_{N - 1} & = & min {n I (X_{N - 1}; Y | X_{N}), n C_{N - 2, N - 1} - {log}_{2} M_{N}} \\ ⋮ \\ {log}_{2} M_{1} & = & n (I (X_{1}; Y | {\underset{̲}{X}}_{2}^{N}) - 5 ϵ) . \end{matrix}

(A16)

Finally, observe that (again subsequently)

\begin{matrix} {log}_{2} M_{N} & = & min {n (I (X_{N}; Y) + ϵ), n C_{N - 1, N}} \\ {log}_{2} M_{N - 1} M_{N} & = & min {n (I (X_{N - 1}, X_{N}; Y) + ϵ), n I (X_{N - 1}; Y | X_{N}) + n C_{N - 1, N}, n C_{N - 2, N - 1}} \\ ⋮ \\ {log}_{2} M & = & {log}_{2} \prod_{m = 1}^{N} M_{m} \\ = & min {n (I ({\underset{̲}{X}}_{1}^{N}; Y) - 4 ϵ), {n I ({\underset{̲}{X}}_{1}^{m}; Y | {\underset{̲}{X}}_{m + 1}^{N}) + n C_{m, m + 1}}_{m = 1}^{N - 1}}, \end{matrix}

(A17)

which establishes the achievability for

n \to \infty

and

ϵ \to 0

.
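The rate assignment in (A16) and the resulting minimum in (A17) can be illustrated with per-symbol rates (all quantities divided by n, with ϵ → 0). The sketch below is only a numerical illustration: the conditional mutual informations and fronthaul capacities are supplied by the caller, the function names are hypothetical, and the fronthaul capacities are assumed large enough toward encoder 1 that no layer rate becomes negative.

```python
def layer_rates(cond_infos, fronthaul):
    """
    cond_infos[m-1] = I(X_m; Y | X_{m+1}, ..., X_N) for m = 1..N.
    fronthaul[m-1]  = C_{m,m+1} for m = 1..N-1.
    Returns [R_1, ..., R_N] following (A16) with epsilon -> 0.
    """
    n_enc = len(cond_infos)
    rates = [0.0] * n_enc
    carried = 0.0                        # rate already assigned to layers above m
    for m in range(n_enc, 1, -1):        # layers N, N-1, ..., 2
        rates[m - 1] = min(cond_infos[m - 1], fronthaul[m - 2] - carried)
        carried += rates[m - 1]
    rates[0] = cond_infos[0]             # layer 1 uses no fronthaul
    return rates

def cut_set_rate(cond_infos, fronthaul):
    """Minimum in (A17): full-cooperation term plus one term per fronthaul cut."""
    n_enc = len(cond_infos)
    terms = [sum(cond_infos)]
    for m in range(1, n_enc):
        # chain rule: I(X_1^m; Y | X_{m+1}^N) is the sum of the first m entries
        terms.append(sum(cond_infos[:m]) + fronthaul[m - 1])
    return min(terms)
```

For instance, with conditional informations [1.0, 0.5, 0.8] and fronthaul capacities [1.2, 0.4], the layer rates are [1.0, 0.5, 0.4] and their sum 1.9 equals the cut-set minimum, which is attained at the cut between encoders 2 and 3.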

Appendix A.3. Proof of Lemma 1

For any fixed

C_{B} \leq (N - 1) C_{full} (N)

, we first consider a realization of

\underset{̲}{β}

. Since

I_{m} \leq I_{m + 1}

for any

m \in [1, N - 1]

, by distributing

C_{B}

on top of

{I_{m}}_{m = 1}^{N - 1}

in a water-filling fashion, there are two possible cases if equality of all terms on the RHS of (30) cannot be achieved, i.e.,

Case (a) : L (C_{B}, \underset{̲}{β}) < I_{j},

(A18)

or Case (b) : L (C_{B}, \underset{̲}{β}) > I_{N},

(A19)

where j is the smallest index such that

C_{m, m + 1} = 0

for all

m \in [j - 1, N - 1]

, and

L (C_{B}, \underset{̲}{β}) = I_{m} + C_{m, m + 1}

for any m such that

C_{m, m + 1} \neq 0

, namely the ‘water level’. Once the water-filling is performed, we fix the corresponding distribution of

C_{B}

.

For Case (a) where

C = L

, we now can decrease

{β_{m}}_{m = j}^{N}

to increase

β_{1}

such that

L

is increased. For Case (b) where

C = I_{N}

, we can look for an

m < N

with

β_{m} > 0

such that by decreasing

β_{m}

,

β_{N}

increases and

I_{N}

increases as well. Therefore, the

\underset{̲}{β}

satisfying Case (a) and Case (b) are not optimal. The equality of the terms on the RHS of (30) is thus necessary.
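The water-filling step used in this proof can be sketched as follows: the total fronthaul C_B is poured on top of the floors I_1, ..., I_{N-1} until a common water level L is reached, and each link receives C_{m,m+1} = max(L - I_m, 0). The function name and the sorting-based search for the level are illustrative choices, not taken from the paper; the list of floors is assumed non-empty.

```python
import math

def water_fill(infos, c_b):
    """
    Distribute the total fronthaul c_b on top of infos = [I_1, ..., I_{N-1}].
    Returns (level, alloc) with alloc[m] = max(level - infos[m], 0) and sum(alloc) = c_b.
    """
    floors = sorted(infos)
    prefix = 0.0
    level = floors[0]
    for n, floor in enumerate(floors, start=1):
        prefix += floor
        nxt = floors[n] if n < len(floors) else math.inf
        if n * nxt - prefix >= c_b:      # the water stays below the next floor
            level = (c_b + prefix) / n
            break
    alloc = [max(level - i, 0.0) for i in infos]
    return level, alloc
```

With floors [1, 2, 4] and C_B = 3 the water level is 3 and the allocation is [2, 1, 0], so the link with the highest floor receives no fronthaul capacity.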

Appendix A.4. Proof of Corollary 2

Three steps are taken in the proof. In step (1) we show that an active compound mode

〈 j, k 〉

achieves the capacity if

L B \leq U B

is satisfied. In step (2) we show that using any two separated active modes (all other modes are inactive) does not achieve the capacity. In step (3) we derive the exact solutions (54) and (55).

Step (1) Note that function g given in (42) is convex-∩ when

λ \leq \frac{1}{k - 1}

if the largest activated mode is k. So, we set the partial derivatives of g with respect to

{β_{i}}_{i = 1}^{N}

according to the Kuhn–Tucker conditions, see ([23], eqn.4.4.10 and eqn.4.4.11), when the active compound mode

〈 j, k 〉

achieves the capacity. By considering (42) in nats, the partial derivative of function g with respect to

β_{i}

is

\begin{matrix} \frac{\partial g}{\partial β_{i}} & = & \frac{1 - (N - 1) λ}{2} \cdot \frac{i P}{1 + \sum_{l = 1}^{N} l β_{l} P} \\ + \frac{λ}{2} \sum_{m = i}^{N - 1} \frac{i P}{1 + \sum_{l = 1}^{m} l β_{l} P}, \end{matrix}

(A20)

where

i \in [1 : N]

.

Step (1.1) Firstly, by only considering that compound mode

〈 j, k 〉

is active, i.e., all

{β_{i}}_{i = j}^{k}

are nonzero, while the other

{β_{i}}_{i = 1}^{j - 1}

and

{β_{i}}_{i = k + 1}^{N}

are zeros, the partial derivative can be expressed as

\begin{matrix} \frac{\partial g}{\partial β_{i}} & = & \frac{1 - (k - 1) λ}{2} \cdot \frac{i P}{1 + \sum_{l = j}^{k} l β_{l} P} \\ + \frac{λ}{2} \sum_{m = i}^{k - 1} \frac{i P}{1 + \sum_{l = j}^{m} l β_{l} P} . \end{matrix}

(A21)

For simplicity, we define

D (i) ≜ 1 + \sum_{l = j}^{i} l β_{l} P .

(A22)

Now, consider that the partial derivatives corresponding to

i \in [j : k]

should be all identical to some value

μ

, i.e.,

\frac{\partial g}{\partial β_{j}} = \frac{\partial g}{\partial β_{j + 1}} = \dots = \frac{\partial g}{\partial β_{k}} : = μ,

(A23)

to find the capacity solution in terms of optimal distribution of

\underset{̲}{β}

. Note that

\frac{\partial g}{\partial β_{k}} = \frac{1 - (k - 1) λ}{2} \cdot \frac{k P}{D (k)} = μ .

(A24)

For the case of

k > j

, we can recursively evaluate the equalities in (A23) as

\partial g / \partial β_{i} = \partial g / \partial β_{i - 1}

by taking i from k to

j + 1

in a descending order with the use of (A21). In such a way, it is obtained that

\frac{λ}{2} \cdot \frac{i (i + 1) P}{D (i)} = μ for i \in [j : k - 1] .

(A25)

Now, based on (A24) and (A25), we can derive expressions of

{β_{i}}_{i = j}^{k}

by considering two scenarios.

Scenario 1: Consider

k \geq j + 2

, i.e., at least three consecutive modes are active. By taking

i = j

and

i = j + 1

, (A25) can be used twice to obtain the equality

\frac{j}{D (j)} = \frac{j + 2}{D (j + 1)}

that gives the relation

β_{j} P = \frac{(1 + j) β_{j + 1} P}{2} - \frac{1}{j} .

(A26)

For

k > j + 2

, expression (A25) allows us to further obtain

\frac{i (i + 1)}{D (i)} = \frac{(i + 1) (i + 2)}{D (i + 1)}

(A27)

by taking i in the order of

j + 1

to

k - 2

, which results in an interesting and important relation

β_{i} = β_{j + 1} for i \in [j + 2 : k - 1] .

(A28)

Therefore, by applying relation (A26), we can express

D (k)

as

D (k) = \frac{k^{2} - k}{2} β_{j + 1} P + k β_{k} P,

(A29)

which is valid for the case of

k = j + 2

as well. So, by setting (A25) equal to (A24) with

i = j

as

\frac{1 - (k - 1) λ}{2} \cdot \frac{k}{D (k)} = \frac{λ}{2} \cdot \frac{j (j + 1)}{D (j)},

(A30)

and substituting

D (k)

in (A29), we can first derive the power of modes from

j + 1

to

k - 1

as

β_{j + 1} P = \frac{2 λ (1 + j P)}{j (1 - (j - 1) λ)} .

(A31)

According to (A26), we can then obtain the power of the first mode as

β_{j} P = \frac{λ (1 + j) (1 + j P)}{j (1 - (j - 1) λ)} - \frac{1}{j} .

(A32)

Furthermore, owing to

\sum_{i = j}^{k} β_{i} = 1

, we can finally represent the power of the last mode as

β_{k} P = P + \frac{1}{j} - \frac{1}{2} (2 k - j - 1) β_{j + 1} P .

(A33)

Now, applying the total power constraint, we should have

\begin{cases} 0 < β_{k} P < P \\ 0 < β_{j + 1} P < P \\ 0 < β_{j} P < P . \end{cases}

(A34)

By substituting (A33), (A31), and (A32), the corresponding slope

λ

should simultaneously satisfy

\begin{cases} \frac{1}{2 (k - 1) + j P (2 k - j - 1)} < λ < \frac{1}{2 (k - 1)} \\ 0 < λ < \frac{j P}{2 + j P + j^{2} P} \\ \frac{1}{j (2 + (j + 1) P)} < λ < \frac{1}{2 j} . \end{cases}

(A35)

Since

k > j + 1

, it is easy to see that the lower bound of

λ

is

\frac{1}{j (2 + (j + 1) P)}

. For the upper bound, suppose that the second row of (A35) were the binding one, i.e.,

\frac{1}{2 (k - 1)} > \frac{j P}{2 + j P + j^{2} P}

, which is equivalent to

P < \frac{2}{2 j k - j^{2} - 3 j}

. Due to

j \leq k - 2

, such a P results in

j (2 + (j + 1) P) < 2 j + \frac{2 (j + 1)}{2 k - j - 3} \leq 2 (k - 1),

(A36)

so the lower bound would exceed

\frac{1}{2 (k - 1)}

and the admissible range of the slope would be empty. Therefore, for the compound mode to exist, the slope should be in the range

\frac{1}{j (2 + (j + 1) P)} < λ < \frac{1}{2 (k - 1)} .

(A37)
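The closed forms of Scenario 1 can be checked numerically. The following is a minimal sketch, assuming the illustrative values j = 2, k = 5, P = 5 and a slope of 0.08 inside the range (A37); the function names are hypothetical and not taken from the paper. It evaluates (A31)–(A33), confirms that the per-mode powers add up to P, and confirms that the partial derivatives (A21) coincide for all active modes, as required by (A23).

```python
def compound_mode_powers(j, k, P, lam):
    """Per-mode powers beta_i * P for compound mode <j, k>, assuming k >= j + 2."""
    if k < j + 2:
        raise ValueError("this sketch covers Scenario 1 only (k >= j + 2)")
    mid = 2 * lam * (1 + j * P) / (j * (1 - (j - 1) * lam))  # (A31)
    first = (1 + j) * mid / 2 - 1 / j                         # (A26), equivalently (A32)
    last = P + 1 / j - 0.5 * (2 * k - j - 1) * mid            # (A33)
    powers = {i: mid for i in range(j + 1, k)}                # (A28): equal interior powers
    powers[j] = first
    powers[k] = last
    return powers


def stationarity_values(j, k, P, lam, powers):
    """Evaluate the partial derivatives (A21) for every active mode i in [j : k]."""
    D, running = {}, 1.0
    for l in range(j, k + 1):
        running += l * powers[l]          # D(l) as defined in (A22)
        D[l] = running
    values = {}
    for i in range(j, k + 1):
        tail = sum(1.0 / D[m] for m in range(i, k))
        values[i] = (1 - (k - 1) * lam) / 2 * i * P / D[k] + lam / 2 * i * P * tail
    return values


# Illustrative values (assumptions, not taken from the paper).
j, k, P, lam = 2, 5, 5.0, 0.08
lam_low, lam_high = 1 / (j * (2 + (j + 1) * P)), 1 / (2 * (k - 1))  # range (A37)
powers = compound_mode_powers(j, k, P, lam)
derivs = stationarity_values(j, k, P, lam, powers)

print("slope inside (A37):", lam_low < lam < lam_high)
print("total power:", sum(powers.values()))
print("spread of derivatives:", max(derivs.values()) - min(derivs.values()))
```

Choosing a slope outside (A37) makes at least one of the returned powers leave the interval (0, P), which is how the range arises from (A34).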

Scenario 2: Consider

k = j + 1

, i.e., the compound mode consists of only two modes. Setting (A24) equal to (A25) with i = j gives the relation

β_{j} P = \frac{λ (1 + j) β_{j + 1} P}{1 - 2 j λ} - \frac{1}{j} .

(A38)

Using

β_{j} + β_{j + 1} = 1

, we obtain

β_{j + 1} P = \frac{(1 - 2 λ j) (1 + j P)}{j (1 - (j - 1) λ)}

(A39)

and the same expression for

β_{j} P

as in (A32). If this compound mode gives the optimal solution, the condition

0 < β_{j} P < P

must also be satisfied, which gives

λ \in (\frac{1}{j (2 + (1 + j) P)}, \frac{1}{2 j})

, consistent with the result of (A37).
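The two-mode case can be verified in exact arithmetic. The sketch below assumes the illustrative values j = 2, P = 5 and a slope of 1/10, which lies inside the interval stated above; `two_mode_powers` is a hypothetical helper name. It evaluates (A32) and (A39) and checks that the two powers sum to P and that the derivatives (A24) and (A21) with i = j agree.

```python
from fractions import Fraction as F


def two_mode_powers(j, P, lam):
    """Return (beta_j * P, beta_{j+1} * P) for the compound mode <j, j + 1>."""
    denom = j * (1 - (j - 1) * lam)
    first = lam * (1 + j) * (1 + j * P) / denom - F(1, j)    # (A32)
    second = (1 - 2 * lam * j) * (1 + j * P) / denom          # (A39)
    return first, second


# Illustrative values (assumptions, not taken from the paper).
j, P, lam = 2, F(5), F(1, 10)
k = j + 1
bj, bj1 = two_mode_powers(j, P, lam)

D_j = 1 + j * bj                  # D(j) from (A22); bj already includes the factor P
D_k = D_j + k * bj1               # D(k) with k = j + 1

d_k = (1 - (k - 1) * lam) / 2 * k * P / D_k                          # (A24)
d_j = (1 - (k - 1) * lam) / 2 * j * P / D_k + lam / 2 * j * P / D_j  # (A21) with i = j

print(bj, bj1, bj + bj1)
print(d_j, d_k)
```

Exact rationals are used so that the equality of the two derivatives is tested without floating-point tolerance.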

Step (1.2) Secondly, consider that the derivative

\partial g / \partial β_{k + i}

for

\forall i \in [1 : N - k]

with

k < N

should be less than

μ

as denoted in Step (1.1). Since

{β_{k + i}}_{i = 1}^{N - k} = 0

, we have

\frac{\partial g}{\partial β_{k + i}} = \frac{1 - (k + i - 1) λ}{2} \cdot \frac{(k + i) P}{D (k)}

(A40)

\leq \frac{\partial g}{\partial β_{k}} = \frac{1 - (k - 1) λ}{2} \cdot \frac{k P}{D (k)},

(A41)

which reduces to λ \geq \frac{1}{2 k + i - 1} for every i; the most restrictive case, i = 1, directly gives

λ \geq \frac{1}{2 k} .

(A42)
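The step from (A40) and (A41) to (A42) can be checked with a few lines of exact arithmetic. The sketch below is only an illustration: `upper_mode_gap` is a hypothetical name, N = 10 is an assumed number of encoders, and the per-i threshold 1/(2k + i - 1) is obtained here by expanding the inequality rather than quoted from the paper. The common positive factor P / (2 D(k)) is dropped from both sides.

```python
from fractions import Fraction as F


def upper_mode_gap(k, i, lam):
    """(A41) minus (A40) with the common positive factor P / (2 D(k)) removed."""
    return (1 - (k - 1) * lam) * k - (1 - (k + i - 1) * lam) * (k + i)


N = 10  # assumed number of encoders, for illustration only
checks = []
for k in range(1, N):
    lam = F(1, 2 * k)  # the bound (A42)
    for i in range(1, N - k + 1):
        # The gap vanishes exactly at the per-i threshold 1 / (2k + i - 1).
        checks.append(upper_mode_gap(k, i, F(1, 2 * k + i - 1)) == 0)
        # At the bound (A42) no inactive upper mode has a larger derivative.
        checks.append(upper_mode_gap(k, i, lam) >= 0)
    # Slightly below 1 / (2k), mode k + 1 violates the condition.
    checks.append(upper_mode_gap(k, 1, lam - F(1, 1000)) < 0)

print(all(checks))
```

Since the threshold decreases with i, the case i = 1 is the only one that needs to be enforced.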

Step (1.3) Finally, consider that the derivative

\partial g / \partial β_{j - i}

for

\forall i \in [1 : j - 1]

with

j > 1

should be less than

μ

as well. Since

{β_{j - i}}_{i = 1}^{j - 1} = 0

, we have

\frac{\partial g}{\partial β_{j - i}} = \frac{1 - (k - 1) λ}{2} \cdot \frac{(j - i) P}{D (k)} + \frac{λ}{2} i (j - i) P + \frac{λ}{2} \sum_{m = j}^{k - 1} \frac{(j - i) P}{D (m)},

(A43)

where the middle term collects the i summands of (A20) with m \in [j - i : j - 1], whose denominators equal 1 because no mode below j is active. Similarly, by upper bounding this derivative by

μ

given in (A24), which by (A23) also equals the derivative with respect to β_{j} in (A21), we obtain

\frac{1 - (k - 1) λ}{2} \cdot \frac{i P}{D (k)} + \frac{λ}{2} \sum_{m = j}^{k - 1} \frac{i P}{D (m)} \geq \frac{λ}{2} i (j - i) P,

(A44)

where the summation can be computed by using relation (A25) as

\frac{λ}{2} \sum_{m = j}^{k - 1} \frac{i P}{D (m)} = i μ (\frac{1}{j} - \frac{1}{k}) .

(A45)

By further incorporating

μ

given by (A24) and

D (k)

given by (A29) into (A44), the inequality becomes

\frac{1 - (k - 1) λ}{j} \geq (j - i) λ (\frac{k - 1}{2} β_{j + 1} P + β_{k} P) .

(A46)

By substituting (A31) and (A33) for

β_{j + 1} P

and

β_{k} P

, it can be shown that

λ \leq \frac{1}{(j - 1) + (j - i) (1 + j P)} for i \in [1 : j - 1] .

(A47)

The right-hand side of (A47) is smallest at i = 1, so the binding condition is

λ \leq \frac{1}{(j - 1) (2 + j P)} .
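The condition on the inactive lower modes can be examined numerically by evaluating the general derivative (A20) directly. The sketch below assumes the illustrative setting j = 3, k = N = 5, P = 5, so that Step (1.2) imposes no constraint; the helper names are hypothetical. It reports μ minus the derivative of each inactive mode j - i at three slopes: inside the admissible range, exactly at 1/((j - 1)(2 + jP)), and slightly above it.

```python
from fractions import Fraction as F


def scenario1_powers(j, k, P, lam):
    """beta_i * P for modes j..k from (A31)-(A33); assumes k >= j + 2."""
    mid = 2 * lam * (1 + j * P) / (j * (1 - (j - 1) * lam))
    powers = {i: mid for i in range(j + 1, k)}
    powers[j] = (1 + j) * mid / 2 - F(1, j)
    powers[k] = P + F(1, j) - F(2 * k - j - 1, 2) * mid
    return powers


def derivative(i, k, P, lam, powers):
    """Partial derivative (A20) for mode i <= k when only the modes in `powers` are active."""
    D, running = {}, F(1)
    for m in range(1, k + 1):
        running += m * powers.get(m, 0)   # inactive modes contribute nothing
        D[m] = running
    tail = sum(1 / D[m] for m in range(i, k))
    return (1 - (k - 1) * lam) / 2 * i * P / D[k] + lam / 2 * i * P * tail


def gaps(j, k, P, lam):
    """mu minus the derivative of each inactive lower mode j - i, keyed by i."""
    powers = scenario1_powers(j, k, P, lam)
    mu = derivative(k, k, P, lam, powers)                      # (A24)
    return {i: mu - derivative(j - i, k, P, lam, powers) for i in range(1, j)}


# Illustrative values (assumptions, not taken from the paper).
j, k, P = 3, 5, F(5)
bound = 1 / ((j - 1) * (2 + j * P))      # upper slope bound from Step (1.3)

inside = gaps(j, k, P, F(1, 50))         # slope within the admissible range
at_bound = gaps(j, k, P, bound)          # mode j - 1 is exactly tight
above = gaps(j, k, P, F(7, 200))         # slope slightly too large

print(inside, at_bound, above, sep="\n")
```

In this example the neighbouring mode j - 1 (i = 1) is the first one to violate the condition as the slope grows, in line with the bound being determined by i = 1.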

Note that Steps (1.2) and (1.3) and the resulting bounds on

λ

also cover the case of

j = k

, i.e., only one mode is active and achieves the capacity. Now, combining the ranges given by (A37), (A42), and (A47) yields the slope bounds in (52) and (53).

Step (2) Assume that the capacity can also be achieved by activating only two separated modes

j^{'}

and

k^{'}

, where

j^{'} \in [1 : k^{'} - 1]

and

k^{'} \in [j^{'} + 1 : N]

. Then, the Kuhn–Tucker condition requires

\partial g / \partial β_{k^{'}} = \partial g / \partial β_{j^{'}}

, which results in the relation

\frac{1 - (k^{'} - 1) λ}{2} \cdot \frac{1}{j^{'} D (k^{'})} = \frac{λ}{2} \cdot \frac{1}{1 + j^{'} β_{j^{'}} P} .

(A48)

Note that in

D (k^{'})

only

β_{k^{'}}

and

β_{j^{'}}

are nonzero. Moreover, for

\forall i \in [1 : k^{'} - j^{'} - 1]

, we have

\frac{\partial g}{\partial β_{k^{'} - i}} = \frac{1 - (k^{'} - 1) λ}{2} \cdot \frac{(k^{'} - i) P}{D (k^{'})} + \frac{λ}{2} \cdot \frac{i (k^{'} - i) P}{1 + j^{'} β_{j^{'}} P} .

(A49)

By substituting the relation (A48) into the above derivative, it can be shown that

\begin{aligned} \frac{\partial g}{\partial β_{k^{'} - i}} - \frac{\partial g}{\partial β_{k^{'}}} & = P \frac{1 - (k^{'} - 1) λ}{2 D (k^{'})} (k^{'} - i + \frac{i (k^{'} - i)}{j^{'}} - k^{'}) \\ & = P \frac{1 - (k^{'} - 1) λ}{2 D (k^{'})} i (\frac{k^{'} - i}{j^{'}} - 1) \\ & > 0, \end{aligned}

(A50)

where the last step is due to

k^{'} - i > j^{'}

. This result contradicts the Kuhn–Tucker condition, which requires that the derivative of an inactive mode does not exceed that of an active mode. Hence, only compound modes achieve the capacity.
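A small exact computation illustrates the contradiction in Step (2). The sketch below assumes the illustrative setting j' = 1, k' = 4, P = 5 with a hypothetical power split of 2/5 and 3/5 between the two modes, and chooses the slope so that the derivatives of the two active modes coincide, as in (A48). It then evaluates the derivative (A20) for the skipped modes and compares the excess over the active-mode derivative with the closed form (A50).

```python
from fractions import Fraction as F


def derivative(i, beta, P, lam, k_max):
    """Partial derivative (A20) for mode i when only the modes in `beta` are active."""
    D, running = {}, F(1)
    for m in range(1, k_max + 1):
        running += m * beta.get(m, 0) * P
        D[m] = running
    tail = sum(1 / D[m] for m in range(i, k_max))
    return (1 - (k_max - 1) * lam) / 2 * i * P / D[k_max] + lam / 2 * i * P * tail


# Illustrative values (assumptions, not taken from the paper).
j_p, k_p, P = 1, 4, F(5)
beta = {j_p: F(2, 5), k_p: F(3, 5)}          # hypothetical split, sums to 1

D_j = 1 + j_p * beta[j_p] * P
D_k = D_j + k_p * beta[k_p] * P
lam = 1 / ((k_p - 1) + j_p * D_k / D_j)      # slope for which (A48) holds

d = {i: derivative(i, beta, P, lam, k_p) for i in range(1, k_p + 1)}

# Closed form (A50) for the excess of each skipped mode k' - i.
scale = P * (1 - (k_p - 1) * lam) / (2 * D_k)
excess = {k_p - i: scale * i * (F(k_p - i, j_p) - 1) for i in range(1, k_p - j_p)}

print(lam, d, excess, sep="\n")
```

Every skipped mode has a strictly larger derivative than the two active ones, so shifting power onto it would increase g, which is the sense in which the two-mode allocation fails the Kuhn–Tucker conditions.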

Step (3) For a slope

λ

in the range (52) and (53), the obtained optimal

\underset{̲}{β}

from Step (1.1) can be used to evaluate (38) and (39) directly, which yields (54) and (55), respectively.

Appendix A.5. Proof of Corollary 3

Consider the four possible settings of

\mathrm{UB}

and

\mathrm{LB}

based on (52) and (53):

\begin{array}{llll} \text{Case 1:} & \frac{1}{2 (k - 1)} \leq b (j - 1), & \frac{1}{2 k} \leq b (j), & b (j) < \frac{1}{2 (k - 1)}, \\ \text{Case 2:} & \frac{1}{2 (k - 1)} \geq b (j - 1), & \frac{1}{2 k} \leq b (j), & b (j) < b (j - 1), \\ \text{Case 3:} & \frac{1}{2 (k - 1)} \leq b (j - 1), & \frac{1}{2 k} \geq b (j), & \frac{1}{2 k} < \frac{1}{2 (k - 1)}, \\ \text{Case 4:} & \frac{1}{2 (k - 1)} \geq b (j - 1), & \frac{1}{2 k} \geq b (j), & \frac{1}{2 k} \leq b (j - 1), \end{array}

where the inequalities involving

\frac{1}{2 k}

are only valid for

k < N

. Evaluating these inequalities yields the power condition (57). Since

\mathrm{LB}

and

\mathrm{UB}

are derived from the Kuhn–Tucker conditions, the derived power condition is both necessary and sufficient. This result directly implies that using compound mode

〈 j^{'}, k^{'} 〉

with

j^{'} > j

and

k^{'} < k

is not a capacity solution.

Appendix A.6. Proof of Lemma 2

By following the method in [20], we define

f (x) = x - ln (x + 1)

and prove

f (x) + \sqrt[3]{2 f (x)} - x \geq 0

(A51)

for

x \geq 0

. Once this is done, one can take the procedure applied in ([20], Thm. 1) and the lower bound (61) follows.

Now, we start converting (A51) to an equivalent problem. First, by substituting

f (x)

in (A51), we need to prove

\sqrt[3]{2 x - 2 ln (1 + x)} \geq ln (1 + x) .

(A52)

By denoting

t : = x + 1

with

t \geq 1

and considering

x^{3}

and

x^{\frac{1}{3}}

are monotonically increasing in x for

x \geq 0

, showing (A52) is equivalent to showing

2 t - 2 - 2 ln t - {(ln t)}^{3} \geq 0 .

(A53)

Letting

v (t) : = 2 t - 2 - 2 ln t - {(ln t)}^{3}

, we can focus on showing

v^{'} (t) = 2 - 2 t^{- 1} - 3 t^{- 1} {(ln t)}^{2} \geq 0,

(A54)

since

v (1) = 0

. Considering

t \geq 1

, we finally convert proving (A51) to demonstrating

ψ (t) : = t v^{'} (t) = 2 t - 2 - 3 {(ln t)}^{2} \geq 0 .

(A55)

To do so, we take the first derivative of

ψ (t)

and set it to zero to have

3 ln t = t .

(A56)

By solving (A56), we can evaluate the local maxima and/or local minima of

ψ (t)

. It can be seen that (A56) has only two real roots for

t \geq 1

, which are

t_{1} = - 3 W_{0} (- \frac{1}{3}) and t_{2} = - 3 W_{- 1} (- \frac{1}{3}),

(A57)

where

W_{0} (\cdot)

is the principal branch of the Lambert W function, defined over

[- e^{- 1}, \infty)

, and

W_{- 1} (\cdot)

is its lower real branch, defined over

[- e^{- 1}, 0)

. Based on the properties of the Lambert W function,

1 < t_{1} < t_{2}

. Therefore, since

ψ (1) = 0

, there are two possible scenarios:

Scenario 1: if $ψ (t_{1}) > 0$ , we must have $ψ (t_{2}) < ψ (t_{1})$ , i.e., $t_{1}$ gives a local maximum and $t_{2}$ gives a minimum;
Scenario 2: if $ψ (t_{1}) < 0$ , we must have $ψ (t_{2}) > ψ (t_{1})$ , i.e., $t_{1}$ gives a global minimum and $t_{2}$ gives a maximum.

Hence, if we can prove both

ψ (t_{1}) > 0

and

ψ (t_{2}) > 0

, we can conclude (A55) and thus (A51). We show this in what follows.

By setting

t^{⋆} = t_{1} \text{ or } t_{2}

and using

ln t^{⋆} = t^{⋆} / 3

from (A56),

ψ (t^{⋆})

can be expressed as

ψ (t^{⋆}) = ψ (t) |_{t_{1}, t_{2}} = 2 t^{⋆} - 2 - \frac{{(t^{⋆})}^{2}}{3} .

(A58)

The condition

ψ (t^{⋆}) > 0

is equivalent to

{(t^{⋆})}^{2} - 6 t^{⋆} + 6 < 0

, which means that the range

3 - \sqrt{3} < t^{⋆} < 3 + \sqrt{3}

(A59)

must be satisfied. For

t_{2}

, we apply the bounds given in ([20], Thm. 1), namely

- 1 - \sqrt{2 x} - x < W_{- 1} (- e^{- x - 1}) < - 1 - \sqrt{2 x} - \frac{2}{3} x

, with

x = ln 3 - 1

so that the argument equals - 1 / 3. It is then easily obtained that

4.52 < t_{2} < 4.63

, which is in the range (A59). So,

ψ (t_{2}) > 0

follows. For

t_{1}

, we apply an upper bound on

W_{0} (\cdot)

given in ([24], Thm. 2.3), which is

W_{0} (x) \leq ln (\frac{x + y}{1 + ln y})

(A60)

for

x > - e^{- 1}

and

y > e^{- 1}

. By taking

y = 0.5

,

t_{1}

can be bounded as

t_{1} > 1.8

. By also considering

t_{1} < t_{2}

,

t_{1}

must lie in the range (A59) as well, which completes the proof that

ψ (t_{1}) > 0

.
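The chain of reductions above can be sanity-checked numerically with the standard library only. The sketch below is an illustration, not part of the proof: it locates the two roots of (A56) by bisection instead of evaluating the Lambert W function, checks that both lie in the range (A59), and evaluates ψ(t) from (A55) and the left-hand side of (A51) on a finite grid. The grid limits and the helper names are arbitrary assumptions.

```python
import math


def psi(t):
    """psi(t) = t * v'(t) from (A55)."""
    return 2 * t - 2 - 3 * math.log(t) ** 2


def stationary(t):
    """Vanishes exactly where psi'(t) = 0, i.e. condition (A56)."""
    return 3 * math.log(t) - t


def bisect(fn, lo, hi, iters=100):
    """Plain bisection; assumes fn(lo) and fn(hi) have opposite signs."""
    lo_positive = fn(lo) > 0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if (fn(mid) > 0) == lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def lhs_a51(x):
    """Left-hand side of (A51) with f(x) = x - ln(1 + x)."""
    f = max(x - math.log1p(x), 0.0)   # guard against tiny negative rounding
    return f + (2 * f) ** (1 / 3) - x


t1 = bisect(stationary, 1.0, 3.0)     # corresponds to -3 * W_0(-1/3)
t2 = bisect(stationary, 3.0, 6.0)     # corresponds to -3 * W_{-1}(-1/3)
lo_a59, hi_a59 = 3 - math.sqrt(3), 3 + math.sqrt(3)

min_psi = min(psi(1 + n / 100) for n in range(5000))
min_lhs = min(lhs_a51(n / 100) for n in range(5000))

print(t1, t2)
print(lo_a59 < t1 < t2 < hi_a59, psi(t1) > 0, psi(t2) > 0)
print(min_psi >= 0, min_lhs >= 0)
```

A grid check of this kind cannot replace the analytical argument, but it catches sign or coefficient slips in (A53)–(A58) quickly.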

Numerical evaluation of the bound (61) verifies the proof as illustrated in Figure A3, where the bound in [20] is plotted as a reference as well.

Figure A3. The bounds on the Lambert function

W_{- 1} (- e^{- (x + 1)})

.

References

1. Ngo, H.Q.; Ashikhmin, A.; Yang, H.; Larsson, E.G.; Marzetta, T.L. Cell-free massive MIMO versus small cells. IEEE Trans. Wirel. Commun. 2017, 16, 1834–1850.
2. Zhang, J.; Chen, S.; Lin, Y.; Zheng, J.; Ai, B.; Hanzo, L. Cell-free massive MIMO: A new next-generation paradigm. IEEE Access 2019, 7, 99878–99888.
3. Interdonato, G.; Björnson, E.; Ngo, H.Q.; Frenger, P.; Larsson, E.G. Ubiquitous cell-free massive MIMO communications. EURASIP J. Wirel. Commun. Netw. 2019, 197.
4. Bashar, M.; Cumanan, K.; Burr, A.G.; Ngo, H.Q.; Debbah, M. Cell-free massive MIMO with limited fronthaul. In Proceedings of the 2018 IEEE International Conference on Communications (ICC), Kansas City, MO, USA, 20–24 May 2018; pp. 1–7.
5. Bashar, M.; Cumanan, K.; Burr, A.G.; Ngo, H.Q.; Larsson, E.G.; Xiao, P. On the energy efficiency of limited-fronthaul cell-free massive MIMO. In Proceedings of the 2019 IEEE International Conference on Communications (ICC), Shanghai, China, 20–24 May 2019; pp. 1–7.
6. Femenias, G.; Riera-Palou, F. Cell-free millimeter-wave massive MIMO systems with limited fronthaul capacity. IEEE Access 2019, 7, 44596–44612.
7. Frenger, P.; Hederen, J.; Hessler, M.; Interdonato, G. Improved Antenna Arrangement for Distributed Massive MIMO (2017). Patent Application WO2018103897. Available online: patentscope.wipo.int/search/en/WO2018103897 (accessed on 21 January 2020).
8. Radio Stripes: Re-Thinking Mobile Networks. Available online: https://www.ericsson.com/en/blog/2019/2/radio-stripes (accessed on 21 January 2020).
9. Willems, F. The discrete memoryless multiple access channel with partially cooperating encoders (corresp.). IEEE Trans. Inf. Theory 1983, 29, 441–445.
10. El Gamal, A.; Zahedi, S. Capacity of a class of relay channels with orthogonal components. IEEE Trans. Inf. Theory 2005, 51, 1815–1817.
11. Ghabeli, L.; Aref, M.R. A new achievable rate and the capacity of some classes of multilevel relay network. EURASIP J. Wirel. Commun. Netw. 2008, 2008, 135857.
12. Kang, W.; Liu, N.; Chong, W. The Gaussian multiple access diamond channel. IEEE Trans. Inf. Theory 2015, 61, 6049–6059.
13. Saeedi Bidokhti, S.; Kramer, G. Capacity bounds for diamond networks with an orthogonal broadcast channel. IEEE Trans. Inf. Theory 2016, 62, 7103–7122.
14. El Gamal, A.; Aref, M. The capacity of the semideterministic relay channel (corresp.). IEEE Trans. Inf. Theory 1982, 28, 536.
15. El Gamal, A.; Hassanpour, N.; Mammen, J. Relay networks with delays. IEEE Trans. Inf. Theory 2007, 53, 3413–3431.
16. Zhang, P.; Willems, F.; Huang, L. Capacity study of distributed beamforming in relation to constrained backbone communication. In Proceedings of the 2014 International Symposium on Information Theory and its Applications, Melbourne, VIC, Australia, 26–29 October 2014; pp. 473–477.
17. Golub, G.H.; Van Loan, C.F. Matrix Computations; The Johns Hopkins University Press: Baltimore, MD, USA, 2013.
18. Cover, T.; Thomas, J. Elements of Information Theory; Wiley: New York, NY, USA, 2006.
19. Mitrinović, D.; Vasić, P. Analytic Inequalities, ser. Grundlehren der Mathematischen Wissenschaften; Springer: Berlin, Germany, 1970.
20. Chatzigeorgiou, I. Bounds on the Lambert function and their application to the outage analysis of user cooperation. IEEE Commun. Lett. 2013, 17, 1505–1508.
21. Corless, R.M.; Gonnet, G.H.; Hare, D.; Jeffrey, D.J.; Knuth, D.E. On the Lambert W function. Adv. Comput. Math. 1996, 5, 329.
22. Abramowitz, M.; Stegun, I. Handbook of Mathematical Functions, with Formulas, Graphs, and Mathematical Tables; Dover Publications, Incorporated: Mineola, NY, USA, 1964.
23. Gallager, R.G. Information Theory and Reliable Communication; John Wiley & Sons, Inc.: New York, NY, USA, 1968.
24. Hoorfar, A.; Hassani, M. Inequalities on the Lambert W function and hyperpower function. J. Inequal. Pure Appl. Math. 2008, 9, 51.

Figure 1. Cell-free transmit beamforming where access points (APs) are connected via serial fronthaul.

Figure 2. N encoders that cooperate in sending a message W to a decoder with limited fronthaul communication.

Figure 3. Modes activation for the two-encoder Gaussian setting.

Figure 4. Upper and lower bounds on

C (C_{B})

at

P = 21

for 10-encoder Gaussian setting.

Figure 5.

C (C_{B})

at

P = 1

and

P = 21

for different N-encoder settings.

Figure 6. Optimal power allocation for modes distribution for 10-encoder Gaussian setting at

P = 5

.

Figure 7. Optimal power allocation for modes distribution for 10-encoder Gaussian setting at

P = 21

.

Figure 8. Optimal distribution of

C_{B}

at

P = 5

and

P = 21

for 10-encoder Gaussian setting.

Figure 9.

C (C_{B})

at

P = 5

and

P = 21

for 10-encoder Gaussian setting.

Figure 10. Number of required encoders in relation to

C_{B}

for fronthaul resource maximum usage.

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
