
Gaussian Multiple Access Channels with One-Bit Quantizer at the Receiver

1 School of Computer Science and Electronic Engineering, University of Essex, Colchester CO4 3SQ, UK
2 Intelligent Systems and Networks Group, Department of Electrical and Electronic Engineering, Imperial College London, London SW7 2AZ, UK
* Authors to whom correspondence should be addressed.
This paper is an extended version of our paper published in the 2017 IEEE International Symposium on Information Theory (ISIT), Aachen, Germany, 25–30 June 2017.
This work was carried out while the first author was with the Information Processing and Communications Laboratory at Imperial College London.
Entropy 2018, 20(9), 686; https://doi.org/10.3390/e20090686
Submission received: 21 May 2018 / Revised: 20 August 2018 / Accepted: 5 September 2018 / Published: 7 September 2018
(This article belongs to the Special Issue Information Theory for Data Communications and Processing)

Abstract

The capacity region of a two-transmitter Gaussian multiple access channel (MAC) under average input power constraints is studied, when the receiver employs a zero-threshold one-bit analogue-to-digital converter (ADC). It is proven that the input distributions of the two transmitters that achieve the boundary points of the capacity region are discrete. Based on the position of a boundary point, upper bounds on the number of the mass points of the corresponding distributions are derived. Furthermore, a lower bound on the sum capacity is proposed that can be achieved by time division with power control. Finally, inspired by the numerical results, the proposed lower bound is conjectured to be tight.

1. Introduction

The energy consumption of an analogue-to-digital converter (ADC), measured in Joules/sample, grows exponentially with its resolution in bits/sample [1,2]. When the available power is limited, for example, for mobile devices with limited battery capacity, or for wireless receivers that operate on limited energy harvested from ambient sources [3], the receiver circuitry may be constrained to operate with low-resolution ADCs. The presence of a low-resolution ADC, in particular a one-bit ADC, at the receiver alters the channel characteristics significantly. Such a constraint not only limits the fundamental bounds on the achievable rate, but it also changes the nature of the communication and modulation schemes approaching these bounds. For example, in a real additive white Gaussian noise (AWGN) channel under an average power constraint on the input, if the receiver is equipped with a K-bin (i.e., $\log_2 K$-bit) ADC front end, it is shown in [4] that the capacity-achieving input distribution is discrete with at most $K+1$ mass points. This is in contrast with the optimality of the Gaussian input distribution when the receiver has infinite resolution.
With the adoption of massive multiple-input multiple-output (MIMO) receivers and millimetre wave (mmWave) technology enabling communication over large bandwidths, communication systems with limited-resolution receiver front ends are becoming of practical importance. Accordingly, there has been growing research interest in understanding both the fundamental information-theoretic limits and the design of practical communication protocols for systems with finite-resolution ADC front ends. In [5], the authors showed that for a Rayleigh fading channel with a one-bit ADC and perfect channel state information at the receiver (CSIR), quadrature phase shift keying (QPSK) modulation is capacity-achieving. In the case of no CSIR, [6] showed that QPSK modulation is optimal when the signal-to-noise ratio (SNR) is above a certain threshold, which depends on the coherence time of the channel, while for SNRs below this threshold, on-off QPSK achieves the capacity. For the point-to-point MIMO channel with a one-bit ADC front end at each receive antenna and perfect CSIR, [7] showed that QPSK is optimal at very low SNRs, while with perfect channel state information at the transmitter (CSIT), upper and lower bounds on the capacity are provided in [8].
To the best of our knowledge, the existing literature on communications with low-resolution ADCs focuses exclusively on point-to-point systems. Our goal in this paper is to understand the impact of low-resolution ADCs on the capacity region of a multiple access channel (MAC). In particular, we consider a two-transmitter Gaussian MAC with a one-bit quantizer at the receiver. The inputs to the channel are subject to average power constraints. We show that any point on the boundary of the capacity region is achieved by discrete input distributions. Based on the slope of the tangent line to the capacity region at a boundary point, we propose upper bounds on the cardinality of the support of these distributions. Finally, based on a numerical analysis of the sum capacity, it is observed that we cannot obtain a sum rate higher than that achieved by time division with power control.
The paper is organized as follows. Section 2 introduces the system model. In Section 3, the capacity region of a general two-transmitter memoryless MAC under input average power constraints is investigated, and the main result of the paper is presented; a detailed proof is given in Section 4. The proof has two parts: (1) the boundedness of the support of the optimal distributions is shown by contradiction; and (2) this boundedness is then used to prove the finiteness of the optimal support via Dubins' theorem [9]. Section 5 analyses the sum capacity, and finally, Section 6 concludes the paper.
Notations: Random variables are denoted by capital letters, while their realizations are denoted by lower case letters. $F_X(x)$ denotes the cumulative distribution function (CDF) of random variable X. The conditional probability mass function (pmf) $p_{Y|X_1,X_2}(y|x_1,x_2)$ will be written as $p(y|x_1,x_2)$. For integers $m \le n$, we have $[m:n] = \{m, m+1, \ldots, n\}$. For $0 \le t \le 1$, $H_b(t) \triangleq -t\log_2 t - (1-t)\log_2(1-t)$ denotes the binary entropy function. The unit-step function is denoted by $s(\cdot)$.

2. System Model and Preliminaries

We consider a two-transmitter memoryless Gaussian MAC (as shown in Figure 1) with a one-bit quantizer $\Gamma$ at the receiver front end. Transmitter $j=1,2$ encodes its message $W_j$ into a codeword $X_j^n$ and transmits it over the shared channel. The signal received by the decoder is given by:
$$Y_i = \Gamma(X_{1,i} + X_{2,i} + Z_i), \quad i \in [1:n],$$
where $\{Z_i\}_{i=1}^{n}$ is an independent and identically distributed (i.i.d.) Gaussian noise process, also independent of the channel inputs $X_1^n$ and $X_2^n$, with $Z_i \sim \mathcal{N}(0,1)$, $i \in [1:n]$. $\Gamma$ represents the one-bit ADC operation given by:
$$\Gamma(x) = \begin{cases} 1 & x \ge 0 \\ 0 & x < 0. \end{cases}$$
This channel can be modelled by the triplet $\big(\mathcal{X}_1 \times \mathcal{X}_2, p(y|x_1,x_2), \mathcal{Y}\big)$, where $\mathcal{X}_1, \mathcal{X}_2\ (= \mathbb{R})$ and $\mathcal{Y}\ (= \{0,1\})$, respectively, are the alphabets of the inputs and the output. The conditional pmf of the channel output Y given the channel inputs $X_1$ and $X_2$, i.e., $p(y|x_1,x_2)$, is characterized by:
$$p(0|x_1,x_2) = 1 - p(1|x_1,x_2) = Q(x_1 + x_2),$$
where $Q(x) \triangleq \frac{1}{\sqrt{2\pi}} \int_x^{+\infty} e^{-\frac{t^2}{2}}\, dt$.
We consider a two-transmitter stationary and memoryless MAC model $\big(\mathcal{X}_1 \times \mathcal{X}_2, p(y|x_1,x_2), \mathcal{Y}\big)$, where $\mathcal{X}_1 = \mathcal{X}_2 = \mathbb{R}$, $\mathcal{Y} = \{0,1\}$, and $p(y|x_1,x_2)$ is given in (1).
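As a concrete illustration of this channel model (not part of the original paper; the helper names below are ours), the following Python sketch simulates the quantized channel and evaluates the transition probability in (1), using scipy's survival function for $Q(\cdot)$.

```python
import numpy as np
from scipy.stats import norm

def one_bit_adc(v):
    """Zero-threshold one-bit ADC: Gamma(v) = 1 if v >= 0, else 0."""
    return (v >= 0).astype(int)

def p_y_given_x(x1, x2):
    """Channel law (1): [p(0|x1,x2), p(1|x1,x2)] = [Q(x1+x2), 1 - Q(x1+x2)]."""
    q = norm.sf(x1 + x2)                      # Q(x) = P(Z >= x) for Z ~ N(0,1)
    return np.array([q, 1.0 - q])

# Simulate n channel uses for fixed inputs and compare with the analytic law.
rng = np.random.default_rng(0)
n, x1, x2 = 100_000, 0.7, -0.2
y = one_bit_adc(x1 + x2 + rng.standard_normal(n))
print("empirical P(Y=1):", y.mean(), "  analytic:", p_y_given_x(x1, x2)[1])
```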
A $(2^{nR_1}, 2^{nR_2}, n)$ code for this channel consists of (as in [10]):
  • two message sets $[1:2^{nR_1}]$ and $[1:2^{nR_2}]$,
  • two encoders, where encoder $j=1,2$ assigns a codeword $x_j^n(w_j)$ to each message $w_j \in [1:2^{nR_j}]$, and
  • a decoder that assigns estimates $(\hat{w}_1, \hat{w}_2) \in [1:2^{nR_1}] \times [1:2^{nR_2}]$ or an error message to each received sequence $y^n$.
The stationary property means that the channel does not change over time, while the memoryless property indicates that $p(y_i | x_1^i, x_2^i, y^{i-1}, w_1, w_2) = p(y_i | x_{1,i}, x_{2,i})$ for any message pair $(w_1, w_2)$.
We assume that the message pair $(W_1, W_2)$ is uniformly distributed over $[1:2^{nR_1}] \times [1:2^{nR_2}]$. The average probability of error is defined as:
$$P_e^{(n)} \triangleq \Pr\big\{(\hat{W}_1, \hat{W}_2) \neq (W_1, W_2)\big\}.$$
Average power constraints are imposed on the channel inputs as:
$$\frac{1}{n}\sum_{i=1}^{n} x_{j,i}^2(w_j) \le P_j, \quad \forall w_j \in [1:2^{nR_j}],\ j \in \{1,2\},$$
where x j , i ( w j ) denotes the i-th element of the codeword x j n ( w j ) .
A rate pair $(R_1, R_2)$ is said to be achievable for this channel if there exists a sequence of $(2^{nR_1}, 2^{nR_2}, n)$ codes satisfying the average power constraints (3), such that $\lim_{n\to\infty} P_e^{(n)} = 0$. The capacity region $\mathcal{C}(P_1, P_2)$ of this channel is the closure of the set of achievable rate pairs $(R_1, R_2)$.

3. Main Results

Proposition 1.
The capacity region $\mathcal{C}(P_1, P_2)$ of a two-transmitter stationary and memoryless MAC with average power constraints $P_1$ and $P_2$ is the set of non-negative rate pairs $(R_1, R_2)$ that satisfy:
$$R_1 \le I(X_1;Y|X_2,U), \quad R_2 \le I(X_2;Y|X_1,U), \quad R_1 + R_2 \le I(X_1,X_2;Y|U),$$
for some $F_U(u) F_{X_1|U}(x_1|u) F_{X_2|U}(x_2|u)$, such that $E[X_j^2] \le P_j$, $j=1,2$. Furthermore, it is sufficient to consider $|\mathcal{U}| \le 5$.
Proof of Proposition 1.
The proof is provided in Appendix A. ☐
The main result of this paper is provided in the following theorem. It bounds the cardinality of the support set of the capacity-achieving distributions.
Theorem 1.
Let J be an arbitrary point on the boundary of the capacity region $\mathcal{C}(P_1,P_2)$ of the memoryless MAC with a one-bit ADC front end (as shown in Figure 1). J is achieved by a distribution of the form $F_U^J(u) F_{X_1|U}^J(x_1|u) F_{X_2|U}^J(x_2|u)$. Furthermore, let $l_J$ be the slope of the line tangent to the capacity region at this point. For any $u \in \mathcal{U}$, the conditional input distributions $F_{X_1|U}^J(x_1|u)$ and $F_{X_2|U}^J(x_2|u)$ have at most $n_1$ and $n_2$ points of increase (a point Z is said to be a point of increase of a distribution if for any open set $\Omega$ containing Z, we have $\Pr\{\Omega\} > 0$), respectively, where:
$$(n_1, n_2) = \begin{cases} (3,5) & l_J < -1 \\ (3,3) & l_J = -1 \\ (5,3) & l_J > -1. \end{cases}$$
Furthermore, this result remains unchanged if the one-bit ADC has a non-zero threshold.
Proof of Theorem 1.
The proof is provided in Section 4. ☐
Proposition 1 and Theorem 1 establish upper bounds on the number of mass points of the distributions that achieve a boundary point. The significance of this result is that once it is known that the optimal inputs are discrete with at most a certain number of mass points, the capacity region along with the optimal distributions can be obtained via computer programs.
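To illustrate that last remark, the hedged Python sketch below (our own construction, not the authors' code) evaluates the mutual information terms of Proposition 1 for one realization of U, i.e., for a given pair of discrete input distributions; a numerical search over mass points and probabilities would call such a routine inside its objective.

```python
import numpy as np
from scipy.stats import norm

def hb(p):
    """Binary entropy in bits, safe at the endpoints."""
    p = np.clip(p, 1e-300, 1 - 1e-300)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def mac_mutual_informations(x1, p1, x2, p2):
    """Discrete inputs: mass points x1, x2 with probabilities p1, p2.
    Returns (I(X1,X2;Y), I(X1;Y|X2), I(X2;Y|X1)) for the one-bit MAC."""
    q = norm.sf(np.add.outer(x1, x2))         # P(Y=0 | x1_i, x2_j) = Q(x1_i + x2_j)
    w = np.outer(p1, p2)                      # joint input probabilities
    h_y = hb(np.sum(w * q))                   # H(Y)
    h_y_x1x2 = np.sum(w * hb(q))              # H(Y|X1,X2)
    h_y_x2 = np.sum(p2 * hb(q.T @ p1))        # H(Y|X2)
    h_y_x1 = np.sum(p1 * hb(q @ p2))          # H(Y|X1)
    return h_y - h_y_x1x2, h_y_x2 - h_y_x1x2, h_y_x1 - h_y_x1x2

# Example: antipodal inputs with unit power for each user.
pts = np.array([-1.0, 1.0]); prb = np.array([0.5, 0.5])
print(mac_mutual_informations(pts, prb, pts, prb))
```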

4. Proof of Theorem 1

In order to show that the boundary points of the capacity region are achieved, it is sufficient to show that the capacity region is a closed set, i.e., it includes all of its limit points.
Let $\mathcal{U}$ be a set with $|\mathcal{U}| \le 5$ and $\Omega$ be defined as:
$$\Omega \triangleq \Big\{ F_{U,X_1,X_2} \;\Big|\; U \in \mathcal{U},\ X_1 - U - X_2,\ E[X_j^2] \le P_j,\ j=1,2 \Big\},$$
which is the set of all CDFs on the triplet $(U, X_1, X_2)$, where U is drawn from $\mathcal{U}$, and the Markov chain $X_1 - U - X_2$ and the corresponding average power constraints hold.
In Appendix B, it is proven that $\Omega$ is a compact set. Since a continuous mapping preserves compactness, the capacity region is compact. Since the capacity region is a subset of $\mathbb{R}^2$, it is closed and bounded (note that a subset of $\mathbb{R}^k$ is compact if and only if it is closed and bounded [11]). Therefore, any point J on the boundary of the capacity region is achieved by a distribution denoted by $F_U^J(u) F_{X_1|U}^J(x_1|u) F_{X_2|U}^J(x_2|u)$.
Since the capacity region is a convex set, it can be characterized by its supporting hyperplanes. In other words, any point on the boundary of the capacity region, denoted by $(R_1^b, R_2^b)$, can be written as:
$$(R_1^b, R_2^b) = \arg\max_{(R_1,R_2)\in\mathcal{C}(P_1,P_2)} R_1 + \lambda R_2,$$
for some $\lambda \in (0, \infty)$. Here, we have excluded the cases $\lambda = 0$ and $\lambda = \infty$, where the channel is no longer a two-transmitter MAC, and boils down to a point-to-point channel, whose capacity is already known.
Any rate pair $(R_1, R_2) \in \mathcal{C}(P_1, P_2)$ must lie within a pentagon defined by (4) for some $F_U F_{X_1|U} F_{X_2|U}$ that satisfies the power constraints. Therefore, due to the structure of the pentagon, the problem of finding the boundary points is equivalent to the following maximization problem.
$$\max_{(R_1,R_2)\in\mathcal{C}(P_1,P_2)} R_1 + \lambda R_2 = \begin{cases} \max\ I(X_1;Y|X_2,U) + \lambda I(X_2;Y|U) & 0 < \lambda \le 1 \\ \max\ I(X_2;Y|X_1,U) + \lambda I(X_1;Y|U) & \lambda > 1, \end{cases}$$
where on the right-hand side (RHS) of (7), the maximizations are over all F U F X 1 | U F X 2 | U that satisfy the power constraints. It is obvious that when λ = 1 , the two lines in (7) are the same, which results in the sum capacity.
For any product of distributions F X 1 F X 2 and the channel in (1), let I λ be defined as:
$$I_\lambda(F_{X_1}F_{X_2}) \triangleq \begin{cases} I(X_1;Y|X_2) + \lambda I(X_2;Y) & 0 < \lambda \le 1 \\ I(X_2;Y|X_1) + \lambda I(X_1;Y) & \lambda > 1. \end{cases}$$
With this definition, (7) can be rewritten as:
$$\max_{(R_1,R_2)\in\mathcal{C}(P_1,P_2)} R_1 + \lambda R_2 = \max \sum_{i=1}^{5} p_U(u_i)\, I_\lambda\big(F_{X_1|U}(x_1|u_i)\, F_{X_2|U}(x_2|u_i)\big),$$
where the second maximization is over distributions of the form p U ( u ) F X 1 | U ( x 1 | u ) F X 2 | U ( x 2 | u ) , such that:
$$\sum_{i=1}^{5} p_U(u_i)\, E[X_j^2 | U=u_i] \le P_j, \quad j=1,2.$$
Proposition 2.
For a given F X 1 and any λ > 0 , I λ ( F X 1 F X 2 ) is a concave, continuous and weakly differentiable function of F X 2 . In the statement of this proposition, F X 1 and F X 2 could be interchanged.
Proof of Proposition 2.
The proof is provided in Appendix C. ☐
Proposition 3.
Let P 1 , P 2 be two arbitrary non-negative real numbers. For the following problem:
$$\max_{F_{X_1}F_{X_2}:\ E[X_j^2] \le P_j,\ j=1,2} I_\lambda(F_{X_1}F_{X_2}),$$
the optimal inputs F X 1 * and F X 2 * , which are not unique in general, have the following properties,
(i) 
The support sets of F X 1 * and F X 2 * are bounded subsets of R .
(ii) 
F X 1 * and F X 2 * are discrete distributions that have at most n 1 and n 2 points of increase, respectively, where:
$$(n_1, n_2) = \begin{cases} (5,3) & 0 < \lambda < 1 \\ (3,3) & \lambda = 1 \\ (3,5) & \lambda > 1. \end{cases}$$
Proof of Proposition 3.
We start with the proof of the first claim. Assume that $0 < \lambda \le 1$, and $F_{X_2}$ is given. Consider the following optimization problem:
$$I_{F_{X_2}}^* \triangleq \sup_{F_{X_1}:\ E[X_1^2] \le P_1} I_\lambda(F_{X_1}F_{X_2}).$$
Note that $I_{F_{X_2}}^* < +\infty$, since for any $\lambda > 0$, from (8),
$$I_\lambda \le (\lambda+1) H(Y) \le (1+\lambda) < +\infty.$$
From Proposition 2, $I_\lambda$ is a continuous, concave function of $F_{X_1}$. Furthermore, the set of all CDFs with bounded second moment (here, $P_1$) is convex and compact. The compactness follows from Appendix I in [12], where the only difference is in using Chebyshev's inequality instead of Markov's inequality. Therefore, the supremum in (10) is achieved by a distribution $F_{X_1}^*$. Since for any $F_{X_1}(x) = s(x - x_0)$ with $|x_0|^2 < P_1$, we have $E[X_1^2] < P_1$, the Lagrangian theorem and the Karush–Kuhn–Tucker conditions state that there exists a $\theta_1 \ge 0$ such that:
$$I_{F_{X_2}}^* = \sup_{F_{X_1}} \left[ I_\lambda(F_{X_1}F_{X_2}) - \theta_1 \left( \int x^2\, dF_{X_1}(x) - P_1 \right) \right].$$
Furthermore, the supremum in (11) is achieved by $F_{X_1}^*$, and:
$$\theta_1 \left( \int x^2\, dF_{X_1}^*(x) - P_1 \right) = 0.$$
Lemma 1.
The Lagrangian multiplier $\theta_1$ is non-zero. From (12), this is equivalent to having $E[X_1^2] = P_1$, i.e., the first user transmits with its maximum allowable power (note that this is for $\lambda \le 1$, as used in Appendix D).
Proof of Lemma 1.
In what follows, we prove that a zero Lagrangian multiplier is not possible. Having a zero Lagrangian multiplier means the power constraint is inactive. In other words, if $\theta_1 = 0$, (10) and (11) imply that:
$$\sup_{F_{X_1}:\ E[X_1^2] \le P_1} I_\lambda(F_{X_1}F_{X_2}) = \sup_{F_{X_1}} I_\lambda(F_{X_1}F_{X_2}).$$
We prove that (13) does not hold by showing that its left-hand side (LHS) is strictly less than one, while its RHS equals one. The details are provided in Appendix D. ☐
$I_\lambda(F_{X_1}F_{X_2})$ ($0 < \lambda \le 1$) can be written as:
$$I_\lambda(F_{X_1}F_{X_2}) = \int_{-\infty}^{+\infty}\!\!\int_{-\infty}^{+\infty} \sum_{y=0}^{1} p(y|x_1,x_2) \log \frac{p(y|x_1,x_2)}{\big[p(y;F_{X_1}F_{X_2})\big]^{\lambda}\big[p(y;F_{X_1}|x_2)\big]^{1-\lambda}}\, dF_{X_1}(x_1)\, dF_{X_2}(x_2) = \int_{-\infty}^{+\infty} \tilde{i}_\lambda(x_1; F_{X_1}|F_{X_2})\, dF_{X_1}(x_1)$$
$$= \int_{-\infty}^{+\infty} i_\lambda(x_2; F_{X_2}|F_{X_1})\, dF_{X_2}(x_2),$$
where we have defined:
$$\tilde{i}_\lambda(x_1; F_{X_1}|F_{X_2}) \triangleq \int_{-\infty}^{+\infty} \bigg[ D\big(p(y|x_1,x_2)\,\big\|\,p(y;F_{X_1}F_{X_2})\big) + (1-\lambda) \sum_{y=0}^{1} p(y|x_1,x_2) \log\frac{p(y;F_{X_1}F_{X_2})}{p(y;F_{X_1}|x_2)} \bigg]\, dF_{X_2}(x_2),$$
and:
$$i_\lambda(x_2; F_{X_2}|F_{X_1}) \triangleq \int_{-\infty}^{+\infty} D\big(p(y|x_1,x_2)\,\big\|\,p(y;F_{X_1}F_{X_2})\big)\, dF_{X_1}(x_1) - (1-\lambda)\, D\big(p(y;F_{X_1}|x_2)\,\big\|\,p(y;F_{X_1}F_{X_2})\big).$$
p ( y ; F X 1 F X 2 ) is nothing but the pmf of Y with the emphasis that it has been induced by F X 1 and F X 2 . Likewise, p ( y ; F X 1 | x 2 ) is the conditional pmf p ( y | x 2 ) when X 1 is drawn according to F X 1 . From (14), i ˜ λ ( x 1 ; F X 1 | F X 2 ) can be considered as the density of I λ over F X 1 when F X 2 is given. i λ ( x 2 ; F X 2 | F X 1 ) can be interpreted in a similar way.
Note that (11) is an unconstrained optimization problem over the set of all CDFs. Since $\int x^2\, dF_{X_1}(x)$ is linear and weakly differentiable in $F_{X_1}$, the objective function in (11) is concave and weakly differentiable. Hence, a necessary condition for the optimality of $F_{X_1}^*$ is:
$$\int \big[ \tilde{i}_\lambda(x_1; F_{X_1}^*|F_{X_2}) + \theta_1 (P_1 - x_1^2) \big]\, dF_{X_1}(x_1) \le I_{F_{X_2}}^*, \quad \forall F_{X_1}.$$
Furthermore, (18) can be verified to be equivalent to:
$$\tilde{i}_\lambda(x_1; F_{X_1}^*|F_{X_2}) + \theta_1 (P_1 - x_1^2) \le I_{F_{X_2}}^*, \quad \forall x_1 \in \mathbb{R},$$
$$\tilde{i}_\lambda(x_1; F_{X_1}^*|F_{X_2}) + \theta_1 (P_1 - x_1^2) = I_{F_{X_2}}^*, \quad \text{if } x_1 \text{ is a point of increase of } F_{X_1}^*.$$
The justifications of (18)–(20) are provided in Appendix E.
In what follows, we prove that in order to satisfy (20), $F_{X_1}^*$ must have a bounded support by showing that the LHS of (20) goes to $-\infty$ as $|x_1| \to +\infty$. The following lemma is useful in the sequel for taking the limit processes inside the integrals.
Lemma 2.
Let $X_1$ and $X_2$ be two independent random variables satisfying $E[X_1^2] \le P_1$ and $E[X_2^2] \le P_2$, respectively ($P_1, P_2 \in [0, +\infty)$). Considering the conditional pmf in (1), the following inequalities hold.
$$\Big| D\big(p(y|x_1,x_2)\,\big\|\,p(y;F_{X_1}F_{X_2})\big) \Big| \le 1 - 2\log Q\big(\sqrt{P_1}+\sqrt{P_2}\big)$$
$$p(y;F_{X_1}|x_2) \ge Q\big(\sqrt{P_1} + |x_2|\big)$$
$$\bigg| \sum_{y=0}^{1} p(y|x_1,x_2) \log\frac{p(y;F_{X_1}F_{X_2})}{p(y;F_{X_1}|x_2)} \bigg| \le -2\log Q\big(\sqrt{P_1}+\sqrt{P_2}\big) - 2\log Q\big(\sqrt{P_1}+|x_2|\big)$$
Proof of Lemma 2.
The proof is provided in Appendix F. ☐
Note that
$$\lim_{x_1\to+\infty} \int_{-\infty}^{+\infty} D\big(p(y|x_1,x_2)\,\big\|\,p(y;F_{X_1}^*F_{X_2})\big)\, dF_{X_2}(x_2) = \int_{-\infty}^{+\infty} \lim_{x_1\to+\infty} D\big(p(y|x_1,x_2)\,\big\|\,p(y;F_{X_1}^*F_{X_2})\big)\, dF_{X_2}(x_2)$$
$$= -\log p_Y(1; F_{X_1}^*F_{X_2})$$
$$\le -\log Q\big(\sqrt{P_1}+\sqrt{P_2}\big),$$
where (24) is due to the Lebesgue dominated convergence theorem [11] and (21), which permit the interchange of the limit and the integral; (25) is due to the following:
$$\lim_{x_1\to+\infty} D\big(p(y|x_1,x_2)\,\big\|\,p(y;F_{X_1}^*F_{X_2})\big) = \lim_{x_1\to+\infty} \sum_{y=0}^{1} p(y|x_1,x_2) \log\frac{p(y|x_1,x_2)}{p(y;F_{X_1}^*F_{X_2})} = -\log p_Y(1; F_{X_1}^*F_{X_2}),$$
since $p(0|x_1,x_2) = Q(x_1+x_2)$ goes to zero when $x_1 \to +\infty$ and $p_Y(y; F_{X_1}^*F_{X_2})$ ($y=0,1$) is bounded away from zero by (A34); (26) is obtained from (A34) in Appendix F. Furthermore,
$$\lim_{x_1\to+\infty} \int_{-\infty}^{+\infty} \sum_{y=0}^{1} p(y|x_1,x_2) \log\frac{p(y;F_{X_1}^*F_{X_2})}{p(y;F_{X_1}^*|x_2)}\, dF_{X_2}(x_2) = \int_{-\infty}^{+\infty} \lim_{x_1\to+\infty} \sum_{y=0}^{1} p(y|x_1,x_2) \log\frac{p(y;F_{X_1}^*F_{X_2})}{p(y;F_{X_1}^*|x_2)}\, dF_{X_2}(x_2)$$
$$= \log p_Y(1; F_{X_1}^*F_{X_2}) - \int_{-\infty}^{+\infty} \log p(1;F_{X_1}^*|x_2)\, dF_{X_2}(x_2) \le -\log Q\big(\sqrt{P_1}+\sqrt{P_2}\big),$$
where (27) is due to the Lebesgue dominated convergence theorem along with (23) and (A39) in Appendix F; (28) is from (22) and the convexity of $\log Q(\sqrt{\alpha}+\sqrt{t})$ in t when $\alpha \ge 0$ (see Appendix G).
Therefore, from (26) and (28),
$$\lim_{x_1\to+\infty} \tilde{i}_\lambda(x_1; F_{X_1}^*|F_{X_2}) \le -(2-\lambda)\, \log Q\big(\sqrt{P_1}+\sqrt{P_2}\big) < +\infty.$$
Using a similar approach, we can also obtain:
$$\lim_{x_1\to-\infty} \tilde{i}_\lambda(x_1; F_{X_1}^*|F_{X_2}) \le -(2-\lambda)\, \log Q\big(\sqrt{P_1}+\sqrt{P_2}\big) < +\infty.$$
From (29) and (30) and the fact that $\theta_1 > 0$ (see Lemma 1), the LHS of (19) goes to $-\infty$ when $|x_1| \to +\infty$. Since any point of increase of $F_{X_1}^*$ must satisfy (19) with equality and $I_{F_{X_2}}^* \ge 0$, it is proven that $F_{X_1}^*$ has a bounded support. Hence, from now on, we assume $X_1 \in [A_1, A_2]$ for some $A_1, A_2 \in \mathbb{R}$ (note that $A_1$ and $A_2$ are determined by the choice of $F_{X_2}$).
Similarly, for a given $F_{X_1}$, the optimization problem:
$$I_{F_{X_1}}^* = \sup_{F_{X_2}:\ E[X_2^2] \le P_2} I_\lambda(F_{X_1}F_{X_2}),$$
boils down to the following necessary condition:
$$i_\lambda(x_2; F_{X_2}^*|F_{X_1}) + \theta_2 (P_2 - x_2^2) \le I_{F_{X_1}}^*, \quad \forall x_2 \in \mathbb{R},$$
$$i_\lambda(x_2; F_{X_2}^*|F_{X_1}) + \theta_2 (P_2 - x_2^2) = I_{F_{X_1}}^*, \quad \text{if } x_2 \text{ is a point of increase of } F_{X_2}^*,$$
for the optimality of $F_{X_2}^*$. However, there are two main differences between (32) and (20). First is the difference between $i_\lambda$ and $\tilde{i}_\lambda$. Second is the fact that we do not claim $\theta_2$ to be nonzero, since the approach used in Lemma 1 cannot be readily applied to $\theta_2$. Nonetheless, the boundedness of the support of $F_{X_2}^*$ can be proven by inspecting the behaviour of the LHS of (32) when $|x_2| \to +\infty$.
In what follows, i.e., from (33)–(38), we prove that the support of $F_{X_2}^*$ is bounded by showing that (32) does not hold when $|x_2|$ is above a certain threshold. The first term on the LHS of (32) is $i_\lambda(x_2; F_{X_2}^*|F_{X_1})$. From (17) and (21), it can be easily verified that:
$$\lim_{x_2\to+\infty} i_\lambda(x_2; F_{X_2}^*|F_{X_1}) = -\lambda \log p_Y(1; F_{X_1}F_{X_2}^*) \le -\lambda \log Q\big(\sqrt{P_1}+\sqrt{P_2}\big), \qquad \lim_{x_2\to-\infty} i_\lambda(x_2; F_{X_2}^*|F_{X_1}) = -\lambda \log p_Y(0; F_{X_1}F_{X_2}^*) \le -\lambda \log Q\big(\sqrt{P_1}+\sqrt{P_2}\big).$$
From (33), if $\theta_2 > 0$, the LHS of (32) goes to $-\infty$ with $|x_2|$, which proves that $X_2^*$ is bounded.
For the possible case of $\theta_2 = 0$, in order to show that (32) does not hold when $|x_2|$ is above a certain threshold, we rely on the boundedness of $X_1$, i.e., $X_1 \in [A_1, A_2]$. Then, we prove that $i_\lambda$ approaches its limit in (33) from below. In other words, there is a real number K such that $i_\lambda(x_2; F_{X_2}^*|F_{X_1}) < -\lambda \log p_Y(1; F_{X_1}F_{X_2}^*)$ when $x_2 > K$, and $i_\lambda(x_2; F_{X_2}^*|F_{X_1}) < -\lambda \log p_Y(0; F_{X_1}F_{X_2}^*)$ when $x_2 < -K$. This establishes the boundedness of $X_2^*$. In what follows, we only show the former, i.e., when $x_2 \to +\infty$. The latter, i.e., $x_2 \to -\infty$, follows similarly, and it is omitted for the sake of brevity.
By rewriting i λ , we have:
$$i_\lambda(x_2; F_{X_2}^*|F_{X_1}) = -\lambda\, p(1;F_{X_1}|x_2)\, \log p_Y(1; F_{X_1}F_{X_2}^*) \;-\; \int_{A_1}^{A_2} H_b\big(Q(x_1+x_2)\big)\, dF_{X_1}(x_1) + (1-\lambda)\, \underbrace{H_b\Big(\int_{A_1}^{A_2} Q(x_1+x_2)\, dF_{X_1}(x_1)\Big)}_{H(Y|X_2=x_2)} \;-\; \lambda\, \underbrace{\int_{A_1}^{A_2} Q(x_1+x_2)\, dF_{X_1}(x_1)}_{p(0;F_{X_1}|x_2)}\, \log p_Y(0; F_{X_1}F_{X_2}^*).$$
It is obvious that the first term on the RHS of (34) approaches $-\lambda \log p_Y(1; F_{X_1}F_{X_2}^*)$ from below when $x_2 \to +\infty$, since $p(1;F_{X_1}|x_2) \le 1$. It is also obvious that the remaining terms go to zero when $x_2 \to +\infty$. Hence, it is sufficient to show that they approach zero from below, which is proven by using the following lemma.
Lemma 3.
Let $X_1$ be distributed on $[A_1, A_2]$ according to $F_{X_1}(x_1)$. We have:
$$\lim_{x_2\to+\infty} \frac{\int_{A_1}^{A_2} H_b\big(Q(x_1+x_2)\big)\, dF_{X_1}(x_1)}{H_b\Big(\int_{A_1}^{A_2} Q(x_1+x_2)\, dF_{X_1}(x_1)\Big)} = 1.$$
Proof of Lemma 3.
The proof is provided in Appendix H. ☐
From (35), we can write:
$$\int_{A_1}^{A_2} H_b\big(Q(x_1+x_2)\big)\, dF_{X_1}(x_1) = \gamma(x_2)\, H_b\Big(\int_{A_1}^{A_2} Q(x_1+x_2)\, dF_{X_1}(x_1)\Big),$$
where $\gamma(x_2) \le 1$ (due to the concavity of $H_b(\cdot)$), and $\gamma(x_2) \to 1$ when $x_2 \to +\infty$ (due to (35)). Furthermore, from the fact that $\lim_{x\to 0} \frac{H_b(x)}{cx} = +\infty$ ($c > 0$), we have:
$$H_b\Big(\int_{A_1}^{A_2} Q(x_1+x_2)\, dF_{X_1}(x_1)\Big) = \eta(x_2)\, \big(-\log p_Y(0; F_{X_1}F_{X_2}^*)\big) \int_{A_1}^{A_2} Q(x_1+x_2)\, dF_{X_1}(x_1),$$
where $\eta(x_2) > 0$ and $\eta(x_2) \to +\infty$ when $x_2 \to +\infty$. From (36)–(37), the second and the third line of (34) become:
$$\Big(1 - \gamma(x_2) + \frac{\lambda}{\eta(x_2)} - \lambda\Big)\, \eta(x_2)\, \big(-\log p_Y(0; F_{X_1}F_{X_2}^*)\big) \int_{A_1}^{A_2} Q(x_1+x_2)\, dF_{X_1}(x_1) \longrightarrow 0.$$
Since $\gamma(x_2) \to 1$ and $\eta(x_2) \to +\infty$ as $x_2 \to +\infty$, there exists a real number K such that $1 - \gamma(x_2) + \frac{\lambda}{\eta(x_2)} - \lambda < 0$ when $x_2 > K$. Therefore, the second and the third line of (34) approach zero from below, which proves that the support of $X_2^*$ is bounded away from $+\infty$. As mentioned before, a similar argument holds when $x_2 \to -\infty$. This proves that $X_2^*$ has a bounded support.
Remark 1.
We remark here that the order of showing the boundedness of the supports is important. First, for a given $F_{X_2}$ (not necessarily bounded), it is proven that $F_{X_1}^*$ is bounded. Then, for a given bounded $F_{X_1}$, it is shown that $F_{X_2}^*$ is also bounded. Hence, the boundedness of the supports of the optimal input distributions is proven by contradiction. The order is reversed when $\lambda > 1$, and it follows the same steps as in the case of $\lambda \le 1$. Therefore, it is omitted.
We next prove the second claim in Proposition 3. We assume that $0 < \lambda < 1$, and a bounded $F_{X_1}$ is given. We already know that for a given bounded $F_{X_1}$, $F_{X_2}^*$ has a bounded support denoted by $[B_1, B_2]$. Therefore,
$$I_{F_{X_1}}^* = \sup_{F_{X_2}:\ E[X_2^2] \le P_2} I_\lambda(F_{X_1}F_{X_2}) = \sup_{F_{X_2}\in\mathcal{S}_2:\ E[X_2^2] \le P_2} I_\lambda(F_{X_1}F_{X_2}),$$
where $\mathcal{S}_2$ denotes the set of all probability distributions on the Borel sets of $[B_1, B_2]$. Let $p_0^* = p_Y(0; F_{X_1}F_{X_2}^*)$ denote the probability of the event $Y=0$, induced by $F_{X_2}^*$ and the given $F_{X_1}$. Furthermore, let $P_2^*$ denote the second moment of $X_2$ under $F_{X_2}^*$. The set:
$$\mathcal{F}_2 = \bigg\{ F_{X_2} \in \mathcal{S}_2 \;\bigg|\; \int_{B_1}^{B_2} p(0|x_2)\, dF_{X_2}(x_2) = p_0^*,\ \int_{B_1}^{B_2} x_2^2\, dF_{X_2}(x_2) = P_2^* \bigg\}$$
is the intersection of S 2 with two hyperplanes (note that S 2 is convex and compact). We can write:
$$I_{F_{X_1}}^* = \sup_{F_{X_2}\in\mathcal{F}_2} I_\lambda(F_{X_1}F_{X_2}).$$
Note that having $F_{X_2} \in \mathcal{F}_2$, the objective function in (41) becomes:
$$\underbrace{\lambda H(Y)}_{\text{constant}} + \underbrace{(1-\lambda)\, H(Y|X_2) - H(Y|X_1,X_2)}_{\text{linear in } F_{X_2}}.$$
Since the linear part is continuous and $\mathcal{F}_2$ is compact (the continuity of the linear part follows similarly to the continuity arguments in Appendix C; the compactness is due to the closedness of the intersecting hyperplanes in $\mathcal{F}_2$, since a closed subset of a compact set is compact [11], and the hyperplanes are closed due to the continuity of $x_2^2$ and $p(0|x_2)$ (see (A16))), the objective function in (41) attains its maximum at an extreme point of $\mathcal{F}_2$, which, by Dubins' theorem, is a convex combination of at most three extreme points of $\mathcal{S}_2$. Since the extreme points of $\mathcal{S}_2$ are the CDFs having only one point of increase in $[B_1, B_2]$, we conclude that given any bounded $F_{X_1}$, $F_{X_2}^*$ has at most three mass points.
Now, assume that an arbitrary $F_{X_2}$ is given with at most three mass points denoted by $\{x_{2,i}\}_{i=1}^{3}$. It is already known that the support of $F_{X_1}^*$ is bounded, which is denoted by $[A_1, A_2]$. Let $\mathcal{S}_1$ denote the set of all probability distributions on the Borel sets of $[A_1, A_2]$. The set:
$$\mathcal{F}_1 = \bigg\{ F_{X_1} \in \mathcal{S}_1 \;\bigg|\; \int_{A_1}^{A_2} p(0|x_1, x_{2,j})\, dF_{X_1}(x_1) = p(0; F_{X_1}^*|x_{2,j}),\ j \in [1:3],\ \int_{A_1}^{A_2} x_1^2\, dF_{X_1}(x_1) = P_1 \bigg\},$$
is the intersection of $\mathcal{S}_1$ with four hyperplanes. Note that here, since we know $\theta_1 \neq 0$, the optimal input attains its maximum power of $P_1$. In a similar way,
$$I_{F_{X_2}}^* = \sup_{F_{X_1}\in\mathcal{F}_1} I_\lambda(F_{X_1}F_{X_2}),$$
and having $F_{X_1} \in \mathcal{F}_1$, the objective function in (44) becomes:
$$I_\lambda = \underbrace{\lambda H(Y) + (1-\lambda) \sum_{i=1}^{3} p_{X_2}(x_{2,i})\, H(Y|X_2 = x_{2,i})}_{\text{constant}} - \underbrace{H(Y|X_1,X_2)}_{\text{linear in } F_{X_1}}.$$
Therefore, given any F X 2 with at most three points of increase, F X 1 * has at most five mass points.
When $\lambda = 1$, the second term on the RHS of (45) disappears, which means that $\mathcal{F}_1$ could be replaced by:
$$\bigg\{ F_{X_1} \in \mathcal{S}_1 \;\bigg|\; \int_{A_1}^{A_2} p(0|x_1)\, dF_{X_1}(x_1) = \tilde{p}_0^*,\ \int_{A_1}^{A_2} x_1^2\, dF_{X_1}(x_1) = P_1 \bigg\},$$
where $\tilde{p}_0^* = p_Y(0; F_{X_1}^*F_{X_2})$ is the probability of the event $Y=0$, which is induced by $F_{X_1}^*$ and the given $F_{X_2}$. Since the number of intersecting hyperplanes has been reduced to two, it is concluded that $F_{X_1}^*$ has at most three points of increase. ☐
Remark 2.
Note that, the order of showing the discreteness of the support sets is also important. First, for a given bounded F X 1 (not necessarily discrete), it is proven that F X 2 * is discrete with at most three mass points. Then, for a given discrete F X 2 with at most three mass points, it is shown that F X 1 * is also discrete with at most five mass points when λ < 1 and at most three mass points when λ = 1 . When λ > 1 , the order is reversed, and it follows the same steps as in the case of λ < 1 . Therefore, it is omitted.
Remark 3.
If X 1 , X 2 are assumed finite initially, similar results can be obtained by using the iterative optimization in the previous proof and the approach in Chapter 4, Corollary 3 of [13].

5. Sum Rate Analysis

In this section, we propose a lower bound on the sum capacity of a MAC in the presence of a one-bit ADC front end at the receiver, which we conjecture to be tight. The sum capacity is given by:
$$C_{\mathrm{sum}} = \sup\, I(X_1,X_2;Y|U),$$
where the supremum is over $F_U F_{X_1|U} F_{X_2|U}$ ($|\mathcal{U}| \le 5$), such that $E[X_j^2] \le P_j$, $j=1,2$. We obtain a lower bound for the above by considering only those input distributions that are zero-mean for every realization of the auxiliary random variable U, i.e., $E[X_j | U=u] = 0$, $\forall u \in \mathcal{U}$, $j=1,2$. Let $P_1$ and $P_2$ be two arbitrary non-negative real numbers. We have:
$$\sup_{F_{X_1}F_{X_2}:\ E[X_j^2] \le P_j,\ E[X_j]=0,\ j=1,2} I(X_1,X_2;Y) \le \sup_{F_{\tilde{X}}:\ E[\tilde{X}^2] \le P_1+P_2} I(\tilde{X};Y)$$
$$= 1 - H_b\Big(Q\big(\sqrt{P_1+P_2}\big)\Big),$$
where in (47), $\tilde{X} \triangleq X_1 + X_2$ and $p_{Y|\tilde{X}}(0|\tilde{x}) = Q(\tilde{x})$; (48) follows from [4] for the point-to-point channel. Therefore, when $E[X_j|U=u] = 0$, $\forall u \in \mathcal{U}$, $j=1,2$, we can write:
$$I(X_1,X_2;Y|U) = \sum_{i=1}^{5} p_U(u_i)\, I(X_1,X_2;Y|U=u_i) \le 1 - \sum_{i=1}^{5} p_U(u_i)\, H_b\Big(Q\Big(\sqrt{E[X_1^2|U=u_i] + E[X_2^2|U=u_i]}\Big)\Big) \le 1 - H_b\Big(Q\Big(\sqrt{E[X_1^2] + E[X_2^2]}\Big)\Big)$$
$$\le 1 - H_b\Big(Q\big(\sqrt{P_1+P_2}\big)\Big),$$
where (49) is due to the fact that $H_b\big(Q(\sqrt{x+y})\big)$ is a convex function of $(x,y)$, and (50) follows from $E[X_j^2] \le P_j$, $j=1,2$.
The upper bound in (50) can be achieved by time division with power control as follows. Let $\mathcal{U} = \{0,1\}$ and $p_U(0) = 1 - p_U(1) = \frac{P_1}{P_1+P_2}$. Furthermore, let $F_{X_1|U}(x|1) = F_{X_2|U}(x|0) = s(x)$, where $s(\cdot)$ is the unit step function, and:
$$F_{X_1|U}(x|0) = F_{X_2|U}(x|1) = \frac{1}{2}\, s\big(x + \sqrt{P_1+P_2}\big) + \frac{1}{2}\, s\big(x - \sqrt{P_1+P_2}\big).$$
With this choice of $F_U F_{X_1|U} F_{X_2|U}$, the upper bound in (50) is achieved. Therefore,
$$C_{\mathrm{sum}} \ge 1 - H_b\Big(Q\big(\sqrt{P_1+P_2}\big)\Big).$$
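For concreteness, the lower bound in (51), together with the time-division realization described above, can be evaluated with the short Python sketch below (our own helper names; a hedged illustration rather than the authors' code).

```python
import numpy as np
from scipy.stats import norm

def hb(p):
    p = np.clip(p, 1e-300, 1 - 1e-300)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def sum_rate_lower_bound(P1, P2):
    """C_sum >= 1 - H_b(Q(sqrt(P1 + P2))), achieved by time division with power control."""
    return 1.0 - hb(norm.sf(np.sqrt(P1 + P2)))

# With prob. P1/(P1+P2) user 1 sends +-sqrt(P1+P2) while user 2 is silent, and
# vice versa otherwise, so the average powers are exactly P1 and P2.
for P1, P2 in [(1.0, 1.0), (1.0, 2.0), (3.0, 1.0)]:
    print(P1, P2, sum_rate_lower_bound(P1, P2))
```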
A numerical evaluation of (46) is carried out as follows (the codes that are used for the numerical simulations are available at https://www.dropbox.com/sh/ndxkjt6h5a0yktu/AAAmfHkuPxe8rMNV1KzFVRgNa?dl=0). Although $E[X_j^2]$ is upper bounded by $P_j$ ($j=1,2$), the value of $E[X_j^2|U=u]$ ($u \in \mathcal{U}$) has no upper bound and could be any non-negative real number. However, in our numerical analysis, we further restrict our attention to the case $E[X_j^2|U=u] \le 20 P_j$, $\forall u \in \mathcal{U}$, $j=1,2$. Obviously, as this upper bound tends to infinity, the approximation becomes more accurate. (This further bounding of the conditional second moments is justified by the fact that the sum capacity is not greater than one, which is due to the one-bit quantization at the receiver. As a result, $I(X_1,X_2;Y|U=u)$ increases at most sublinearly with $E[X_j^2|U=u]$, $j=1,2$, while $p_U(u)$ needs to decrease at least linearly to satisfy the average power constraints. Hence, the product $p_U(u)\, I(X_1,X_2;Y|U=u)$ decreases with $E[X_j^2|U=u]$ when $E[X_j^2|U=u]$ is above a threshold.) Each of the intervals $[0, 20P_1]$ and $[0, 20P_2]$ is divided into 201 uniformly spaced points, which results in the discrete sets $\frac{P_1}{10}[0:200]$ and $\frac{P_2}{10}[0:200]$, respectively. Afterwards, for any pair $(\alpha, \beta) \in \frac{P_1}{10}[0:200] \times \frac{P_2}{10}[0:200]$, the following is carried out for input distributions with at most three mass points.
$$\max_{F_{X_1}F_{X_2}:\ E[X_1^2] \le \alpha,\ E[X_2^2] \le \beta} I(X_1,X_2;Y)$$
The results are stored in a $201\times 201$ matrix accordingly. In the above optimization, the MATLAB function fmincon is used with three different initial values, and the maximum of these three runs is chosen. Then, the problem boils down to finding proper weights, i.e., the mass probabilities of U, that maximize $I(X_1,X_2;Y|U)$ and satisfy the average power constraints $E[X_j^2] \le P_j$. This is done via a linear program, which can be efficiently solved by the linprog function in MATLAB. Several cases were considered, such as $(P_1,P_2) = (1,1)$, $(P_1,P_2) = (1,2)$, $(P_1,P_2) = (3,1)$, etc. In all these cases, the numerical evaluation of (46) leads to the same value as the lower bound in (51). Since the problem is not convex, it is not known whether the numerical results are the global optimum solutions; hence, we leave it as a conjecture that the sum capacity can be achieved by time division with power control.
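The sketch below mimics the two-stage procedure just described in Python, with two simplifications that we flag explicitly: the inner maximization over three-mass-point inputs (solved with fmincon in the paper) is replaced here by a fixed zero-mean antipodal input at full conditional power, and the grid is coarser; the outer step is the same linear program over the time-sharing weights, solved with scipy.optimize.linprog instead of MATLAB's linprog. It is therefore only a hedged approximation of the paper's evaluation.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import linprog

def hb(p):
    p = np.clip(p, 1e-12, 1 - 1e-12)
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def i_joint_antipodal(a, b):
    """I(X1,X2;Y) when X1 = +-sqrt(a) and X2 = +-sqrt(b), each equiprobable
    (a simplification of the paper's three-mass-point inner search)."""
    x1 = np.array([-np.sqrt(a), np.sqrt(a)])
    x2 = np.array([-np.sqrt(b), np.sqrt(b)])
    q = norm.sf(np.add.outer(x1, x2))          # P(Y=0 | x1, x2)
    return hb(q.mean()) - hb(q).mean()         # H(Y) - H(Y|X1,X2), uniform input pairs

P1, P2 = 1.0, 1.0
grid1 = np.linspace(0.0, 20 * P1, 41)          # coarse analogue of the 201-point grids
grid2 = np.linspace(0.0, 20 * P2, 41)
pairs = [(a, b) for a in grid1 for b in grid2]
vals = np.array([i_joint_antipodal(a, b) for a, b in pairs])

# Outer LP over the weights p_U: maximize sum p*I subject to
# sum p*alpha <= P1, sum p*beta <= P2, sum p = 1, p >= 0.
A_ub = np.array([[a for a, _ in pairs], [b for _, b in pairs]])
res = linprog(c=-vals, A_ub=A_ub, b_ub=[P1, P2],
              A_eq=np.ones((1, len(pairs))), b_eq=[1.0])
print("approximate sum rate:", -res.fun)
print("lower bound (51):    ", 1 - hb(norm.sf(np.sqrt(P1 + P2))))
```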

6. Conclusions

We have studied the capacity region of a two-transmitter Gaussian MAC under average input power constraints and a one-bit ADC front end at the receiver. We have characterized the capacity region in terms of an auxiliary random variable, derived an upper bound on the cardinality of this auxiliary variable, and proved that the distributions that achieve the boundary points of the capacity region are discrete with a finite number of mass points. Finally, a lower bound is proposed on the sum capacity of this MAC that is achieved by time division with power control. Numerical analysis suggests that this lower bound is tight, which we leave as a conjecture.

Author Contributions

All three authors contributed to the paper. B.R. derived most of the claims in discussion with M.V. and D.G. The numerical results regarding the sum-rate analysis were obtained by M.V. Writing and editing of the paper were done jointly, and all three authors have read and approved the final manuscript.

Funding

This research was supported in part by the European Research Council (ERC) through Starting Grant BEACON (Agreement No. 677854), by the U.K. Engineering and Physical Sciences Research Council (EPSRC) through the project COPES (EP/N021738/1) and by the British Council Institutional Link Program under Grant NO. 173605884.

Acknowledgments

The authors thank Professor Barbie and Professor Shirokov for their help in showing the preservation of the Markov chain in the weak convergence of the joint distributions in Appendix B.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A

The capacity region of the discrete memoryless (DM) MAC with input cost constraints has been addressed in Exercise 4.8 of [10]. If the input alphabets are not discrete, the capacity region is still the same because: (1) the converse remains the same if the inputs are from a continuous alphabet; and (2) the region is achievable by coded time sharing and the discretization procedure (see Remark 3.8 in [10]). Therefore, it is sufficient to show the cardinality bound $|\mathcal{U}| \le 5$.
Let $\mathcal{P}$ be the set of all product distributions (i.e., of the form $F_{X_1}(x_1) F_{X_2}(x_2)$) on $\mathbb{R}^2$. Let $g : \mathcal{P} \to \mathbb{R}^5$ be a vector-valued mapping defined element-wise as:
$$\begin{aligned}
g_1\big(F_{X_1|U}(\cdot|u)\, F_{X_2|U}(\cdot|u)\big) &= I(X_1;Y|X_2, U=u), \\
g_2\big(F_{X_1|U}(\cdot|u)\, F_{X_2|U}(\cdot|u)\big) &= I(X_2;Y|X_1, U=u), \\
g_3\big(F_{X_1|U}(\cdot|u)\, F_{X_2|U}(\cdot|u)\big) &= I(X_1,X_2;Y|U=u), \\
g_4\big(F_{X_1|U}(\cdot|u)\, F_{X_2|U}(\cdot|u)\big) &= E[X_1^2|U=u], \\
g_5\big(F_{X_1|U}(\cdot|u)\, F_{X_2|U}(\cdot|u)\big) &= E[X_2^2|U=u].
\end{aligned}$$
Let $\mathcal{G} \subset \mathbb{R}^5$ be the image of $\mathcal{P}$ under the mapping g (i.e., $\mathcal{G} = g(\mathcal{P})$). Given an arbitrary $(U, X_1, X_2) \sim F_U F_{X_1|U} F_{X_2|U}$, we obtain the vector r as:
$$\begin{aligned}
r_1 &= I(X_1;Y|X_2,U) = \int_{\mathcal{U}} I(X_1;Y|X_2, U=u)\, dF_U(u), \\
r_2 &= I(X_2;Y|X_1,U) = \int_{\mathcal{U}} I(X_2;Y|X_1, U=u)\, dF_U(u), \\
r_3 &= I(X_1,X_2;Y|U) = \int_{\mathcal{U}} I(X_1,X_2;Y|U=u)\, dF_U(u), \\
r_4 &= E[X_1^2] = \int_{\mathcal{U}} E[X_1^2|U=u]\, dF_U(u), \\
r_5 &= E[X_2^2] = \int_{\mathcal{U}} E[X_2^2|U=u]\, dF_U(u).
\end{aligned}$$
Therefore, r is in the convex hull of $\mathcal{G} \subset \mathbb{R}^5$. By Carathéodory's theorem [9], r can be written as a convex combination of six (= 5+1) or fewer points in $\mathcal{G}$, which states that it is sufficient to consider $|\mathcal{U}| \le 6$. Since $\mathcal{P}$ is a connected set ($\mathcal{P}$ is the product of two connected sets and is therefore connected; each of the sets in this product is connected because it is a convex subset of a vector space) and the mapping g is continuous (this is a direct result of the continuity of the channel transition probability), $\mathcal{G}$ is a connected subset of $\mathbb{R}^5$. Therefore, the connectedness of $\mathcal{G}$ refines the cardinality of U to $|\mathcal{U}| \le 5$.
It is also important to note that for the boundary points of $\mathcal{C}(P_1,P_2)$ that are not sum-rate optimal, it is sufficient to have $|\mathcal{U}| \le 4$. The proof is as follows. Any point on the boundary of the capacity region that does not maximize $R_1+R_2$ is either of the form $\big(I(X_1;Y|X_2,U),\, I(X_2;Y|U)\big)$ or $\big(I(X_1;Y|U),\, I(X_2;Y|X_1,U)\big)$ for some $F_U F_{X_1|U} F_{X_2|U}$ that satisfies $E[X_j^2] \le P_j$, $j=1,2$. In other words, it is one of the corner points of the corresponding pentagon in (4). As in the proof of Proposition 1, define the mapping $g : \mathcal{P} \to \mathbb{R}^4$, where $g_1$ and $g_2$ are the coordinates of this boundary point conditioned on $U=u$, and $g_3, g_4$ are the same as $g_4$ and $g_5$ in (A1), respectively. The sufficiency of $|\mathcal{U}| \le 4$ in this case follows similarly.

Appendix B

Since $|\mathcal{U}| \le 5$, we assume $\mathcal{U} = \{0,1,2,3,4\}$ without loss of generality; what matters in the evaluation of the capacity region is the mass probability of the auxiliary random variable U, not its actual values.
In order to show the compactness of Ω , we adopt a general form of the approach in [12].
First, we show that $\Omega$ is tight; a set of probability distributions $\Theta$ defined on $\mathbb{R}^k$ (i.e., a set of CDFs $F_{X_1,X_2,\ldots,X_k}$) is said to be tight if for every $\epsilon > 0$, there is a compact set $K_\epsilon \subset \mathbb{R}^k$ such that [14]:
$$\Pr\big\{(X_1,X_2,\ldots,X_k) \in \mathbb{R}^k \setminus K_\epsilon\big\} < \epsilon, \quad \forall F_{X_1,X_2,\ldots,X_k} \in \Theta.$$
Choose $T_j$, $j=1,2$, such that $T_j > \sqrt{\frac{2P_j}{\epsilon}}$. Then, from Chebyshev's inequality,
$$\Pr\big\{|X_j| > T_j\big\} \le \frac{P_j}{T_j^2} < \frac{\epsilon}{2}, \quad j=1,2.$$
Let $K_\epsilon = [0,4] \times [-T_1, T_1] \times [-T_2, T_2] \subset \mathbb{R}^3$. It is obvious that $K_\epsilon$ is a closed and bounded subset of $\mathbb{R}^3$ and, therefore, compact. With this choice of $K_\epsilon$, we have:
$$\Pr\big\{(U, X_1, X_2) \in \mathbb{R}^3 \setminus K_\epsilon\big\} \le \Pr\{U \notin [0,4]\} + \Pr\{X_1 \notin [-T_1,T_1]\} + \Pr\{X_2 \notin [-T_2,T_2]\} < 0 + \frac{\epsilon}{2} + \frac{\epsilon}{2} = \epsilon,$$
where (A3) is due to (A2). Hence, Ω is tight.
From Prokhorov's theorem [14] (p. 318), a set of probability distributions is tight if and only if it is relatively sequentially compact (a subset of a topological space is relatively compact if its closure is compact). This means that for every sequence of CDFs $\{F_n\}$ in $\Omega$, there exists a subsequence $\{F_{n_k}\}$ that converges weakly to a CDF $F_0$, which is not necessarily in $\Omega$. (The weak convergence of $\{F_n\}$ to F, also written $F_n(x) \xrightarrow{\ w\ } F(x)$, is equivalent to:
$$\lim_{n\to\infty} \int_{\mathbb{R}} \psi(x)\, dF_n(x) = \int_{\mathbb{R}} \psi(x)\, dF(x),$$
for all continuous and bounded functions $\psi(\cdot)$ on $\mathbb{R}$. Note that $F_n(x) \xrightarrow{\ w\ } F(x)$ if and only if $d_L(F_n, F) \to 0$, where $d_L$ denotes the Lévy distance.) If we can show that this $F_0$ is also an element of $\Omega$, then the proof is complete, since we will have shown that $\Omega$ is sequentially compact and, therefore, compact (compactness and sequential compactness are equivalent in metric spaces; note that $\Omega$ is a metric space with the Lévy distance).
Assume a sequence of distributions $\{F_n(\cdot,\cdot,\cdot)\}$ in $\Omega$ that converges weakly to $F_0(\cdot,\cdot,\cdot)$. In order to show that this limiting distribution is also in $\Omega$, we need to show that both the average power constraints and the Markov chain ($X_1 - U - X_2$) are preserved under $F_0$. The preservation of the second moment follows similarly to the argument in (Appendix I, [12]). In other words, since $x^2$ is continuous and bounded from below, from Theorem 4.4.4 in [15]:
$$\int x_j^2\, d^3F_0(u,x_1,x_2) \le \liminf_{n\to\infty} \int x_j^2\, d^3F_n(u,x_1,x_2) \le P_j, \quad j=1,2.$$
Therefore, the second moments are preserved under the limiting distribution F 0 .
For the preservation of the Markov chain X 1 U X 2 , we need the following proposition.
Proposition A1.
Assume a sequence of distributions $\{F_n(\cdot,\cdot)\}$ over the pair of random variables $(X,Y)$ that converges weakly to $F_0(\cdot,\cdot)$. Furthermore, assume that Y has a finite support, i.e., $\mathcal{Y} = \{1,2,\ldots,|\mathcal{Y}|\}$. Then, the sequence of conditional distributions (conditioned on Y) converges weakly to the limiting conditional distribution (conditioned on Y), i.e.,
$$F_n(\cdot|y) \xrightarrow{\ w\ } F_0(\cdot|y), \quad \forall y \in \mathcal{Y} \text{ with } p_0(y) > 0.$$
Proof of Proposition A1.
The proof is by contradiction. If (A6) is not true, then there exists $y^* \in \mathcal{Y}$ such that $p_0(y^*) > 0$ and $F_n(\cdot|y^*)$ does not converge weakly to $F_0(\cdot|y^*)$. This means, from the definition of weak convergence, that there exists a bounded continuous function of x, denoted by $g_{y^*}(x)$, such that:
$$\int g_{y^*}(x)\, dF_n(x|y^*) \not\longrightarrow \int g_{y^*}(x)\, dF_0(x|y^*).$$
Let $f(x,y)$ be any bounded continuous function that satisfies:
$$f(x,y) = \begin{cases} 0 & y \in \mathcal{Y},\ y \neq y^* \\ g_{y^*}(x) & y = y^*. \end{cases}$$
With this choice of $f(x,y)$, we have:
$$\int f(x,y)\, d^2F_n(x,y) \not\longrightarrow \int f(x,y)\, d^2F_0(x,y),$$
which violates the assumption of the weak convergence of F n ( · , · ) to F 0 ( · , · ) . Therefore, (A6) holds. ☐
Since { F n ( · , · , · ) } in Ω converges weakly to F 0 ( · , · , · ) and U is finite, from Proposition A1, we have:
$$F_n(\cdot,\cdot|u) \xrightarrow{\ w\ } F_0(\cdot,\cdot|u), \quad \forall u \in \mathcal{U},$$
where it is obvious that the arguments are $x_1$ and $x_2$. Since $F_n \in \Omega$, we have $F_n(x_1,x_2|u) = F_n(x_1|u)\, F_n(x_2|u)$, $\forall u \in \mathcal{U}$. Furthermore, since the convergence of the joint distribution implies the convergence of the marginals, we have (Theorem 2.7, [16,17]),
$$F_0(x_1,x_2|u) = F_0(x_1|u)\, F_0(x_2|u), \quad \forall u \in \mathcal{U},$$
which states that under the limiting distribution $F_0$, the Markov chain $X_1 - U - X_2$ is preserved. (Alternatively, this could be proven by the lower semicontinuity of mutual information as follows:
$$I_{F_0}(X_1;X_2|U=u) \le \liminf_{n\to\infty} I_{F_n}(X_1;X_2|U=u) = 0, \quad \forall u \in \mathcal{U},$$
where $I_F$ denotes the mutual information under distribution F. The last equality is from the conditional independence of $X_1$ and $X_2$ given $U=u$ under $F_n$. Therefore, $I_{F_0}(X_1;X_2|U=u) = 0$, $\forall u \in \mathcal{U}$, which is equivalent to (A11).) This completes the proof of the compactness of $\Omega$.

Appendix C

Appendix C.1. Concavity

When $0 < \lambda \le 1$, we have:
$$I_\lambda(F_{X_1}F_{X_2}) = \lambda H(Y) + (1-\lambda)\, H(Y|X_2) - H(Y|X_1,X_2).$$
For a given $F_{X_1}$, $H(Y)$ is a concave function of $F_{X_2}$, while $H(Y|X_2)$ and $H(Y|X_1,X_2)$ are linear in $F_{X_2}$. Therefore, $I_\lambda$ is a concave function of $F_{X_2}$. For a given $F_{X_2}$, $H(Y)$ and $H(Y|X_2)$ are concave functions of $F_{X_1}$, while $H(Y|X_1,X_2)$ is linear in $F_{X_1}$. Since $(1-\lambda) \ge 0$, $I_\lambda$ is a concave function of $F_{X_1}$. The same reasoning applies to the case $\lambda > 1$.

Appendix C.2. Continuity

When λ 1 , the continuity of the three terms on the RHS of (A14) is investigated. Let { F X 2 , n } be a sequence of distributions, which is weakly convergent to F X 2 . For a given F X 1 , we have:
lim x 2 x 2 0 p ( y ; F X 1 | x 2 ) = lim x 2 x 2 0 Q ( x 1 + x 2 ) d F X 1 ( x 1 ) = lim x 2 x 2 0 Q ( x 1 + x 2 ) d F X 1 ( x 1 )
= p ( y ; F X 1 | x 2 0 ) ,
where (A15) is due to the fact that the Q function can be dominated by one, which is an absolutely integrable function over F X 1 . Therefore, p ( y ; F X 1 | x 2 ) is continuous in x 2 , and combined with the weak convergence of { F X 2 , n } , we can write:
lim n p ( y ; F X 1 F X 2 , n ) = lim n p ( y ; F X 1 | x 2 ) d F X 2 , n ( x 2 ) = p ( y ; F X 1 | x 2 ) d F X 2 ( x 2 ) = p ( y ; F X 1 F X 2 ) .
This allows us to write:
lim n y = 0 1 p ( y ; F X 1 F X 2 , n ) log p ( y ; F X 1 F X 2 , n ) = y = 0 1 p ( y ; F X 1 F X 2 ) log p ( y ; F X 1 F X 2 ) ,
which proves the continuity of H ( Y ) in F X 2 . H ( Y | X 2 = x 2 ) is a bounded ( [ 0 , 1 ] ) continuous function of x 2 , since it is a continuous function of p ( y ; F X 1 | x 2 ) , and the latter is continuous in x 2 (see (A16)). Therefore,
lim n H ( Y | X 2 = x 2 ) d F X 2 , n ( x 2 ) = H ( Y | X 2 = x 2 ) d F X 2 ( x 2 ) ,
which proves the continuity of H ( Y | X 2 ) in F X 2 . In a similar way, it can be verified that H ( Y | X 1 = x 1 , X 2 = x 2 ) d F X 1 ( x 1 ) is a bounded and continuous function of x 2 , which guarantees the continuity of H ( Y | X 1 , X 2 ) in F X 2 , since:
H ( Y | X 1 , X 2 ) = H ( Y | X 1 = x 1 , X 2 = x 2 ) d F X 1 ( x 1 ) d F X 2 ( x 2 )
Therefore, for a given $F_{X_1}$, $I_\lambda$ is a continuous function of $F_{X_2}$. The continuity in $F_{X_1}$ for a given $F_{X_2}$, as well as the case $\lambda > 1$, can be addressed similarly by exchanging the roles of $F_{X_1}$ and $F_{X_2}$, so the details are omitted for the sake of brevity.

Appendix C.3. Weak Differentiability

For a given $F_{X_1}$, the weak derivative of $I_\lambda$ at $F_{X_2}^0$ is given by:
$$I_\lambda'(F_{X_1}F_{X_2})\Big|_{F_{X_2}^0} = \lim_{\beta\to 0^+} \frac{I_\lambda\Big(F_{X_1}\big((1-\beta)F_{X_2}^0 + \beta F_{X_2}\big)\Big) - I_\lambda(F_{X_1}F_{X_2}^0)}{\beta},$$
if the limit exists. It can be verified that:
$$I_\lambda'(F_{X_1}F_{X_2})\Big|_{F_{X_2}^0} = \lim_{\beta\to 0^+} \frac{\int i_\lambda\big(x_2; (1-\beta)F_{X_2}^0 + \beta F_{X_2}\,\big|\,F_{X_1}\big)\, d\big((1-\beta)F_{X_2}^0(x_2) + \beta F_{X_2}(x_2)\big) - \int i_\lambda(x_2; F_{X_2}^0|F_{X_1})\, dF_{X_2}^0(x_2)}{\beta} = \int i_\lambda(x_2; F_{X_2}^0|F_{X_1})\, dF_{X_2}(x_2) - \int i_\lambda(x_2; F_{X_2}^0|F_{X_1})\, dF_{X_2}^0(x_2) = \int i_\lambda(x_2; F_{X_2}^0|F_{X_1})\, dF_{X_2}(x_2) - I_\lambda(F_{X_1}F_{X_2}^0),$$
where $i_\lambda$ has been defined in (17). In a similar way, for a given $F_{X_2}$, the weak derivative of $I_\lambda$ at $F_{X_1}^0$ is:
$$I_\lambda'(F_{X_1}F_{X_2})\Big|_{F_{X_1}^0} = \int \tilde{i}_\lambda(x_1; F_{X_1}^0|F_{X_2})\, dF_{X_1}(x_1) - I_\lambda(F_{X_1}^0F_{X_2}),$$
where $\tilde{i}_\lambda$ has been defined in (16). The case $\lambda > 1$ can be addressed similarly.

Appendix D

We have:
$$\sup_{F_{X_1}:\ E[X_1^2]\le P_1} I_\lambda(F_{X_1}F_{X_2}) \le \sup_{F_{X_1}F_{X_2}:\ E[X_j^2]\le P_j,\ j=1,2} I_\lambda(F_{X_1}F_{X_2}) \le \sup_{F_{X_1}F_{X_2}:\ E[X_j^2]\le P_j,\ j=1,2} I(X_1,X_2;Y)$$
$$\le \sup_{F_{X_1}F_{X_2}:\ E[X_j^2]\le P_j} H(Y) - \inf_{F_{X_1}F_{X_2}:\ E[X_j^2]\le P_j} H(Y|X_1,X_2) = 1 - \inf_{F_{X_1}F_{X_2}:\ E[X_j^2]\le P_j} \int\!\!\int H_b\big(Q(x_1+x_2)\big)\, dF_{X_1}(x_1)\, dF_{X_2}(x_2) = 1 - \inf_{F_{X_1}F_{X_2}:\ E[X_j^2]\le P_j} \int\!\!\int H_b\Big(Q\big(\sqrt{x_1^2}+\sqrt{x_2^2}\big)\Big)\, dF_{X_1}(x_1)\, dF_{X_2}(x_2)$$
$$\le 1 - \inf_{F_{X_1}F_{X_2}:\ E[X_j^2]\le P_j} \int\!\!\int Q\big(\sqrt{x_1^2}+\sqrt{x_2^2}\big)\, dF_{X_1}(x_1)\, dF_{X_2}(x_2)$$
$$\le 1 - Q\big(\sqrt{P_1}+\sqrt{P_2}\big)$$
$$< 1,$$
where (A20) is from the non-negativity of mutual information and the assumption that $0 < \lambda \le 1$; (A21) is justified since the Q function is monotonically decreasing and the sign of the inputs does not affect the average power constraints, so $X_1$ and $X_2$ can be assumed non-negative (or alternatively non-positive) without loss of optimality; in (A22), we use the fact that $Q\big(\sqrt{x_1^2}+\sqrt{x_2^2}\big) \le \frac{1}{2}$, and for $t \in [0, \frac{1}{2}]$, $H_b(t) \ge t$; (A23) is based on the convexity and monotonicity of the function $Q(\sqrt{u}+\sqrt{v})$ in $(u,v)$, which is shown in Appendix G. Therefore, the LHS of (13) is strictly less than one.
Since $X_2$ has a finite second moment ($E[X_2^2] \le P_2$), from Chebyshev's inequality, we have:
$$\Pr\big(|X_2| \ge M\big) \le \frac{P_2}{M^2}, \quad \forall M > 0.$$
Fix $M > 0$, and consider $X_1 \sim F_{X_1}(x_1) = \frac{1}{2}\big[s(x_1 + 2M) + s(x_1 - 2M)\big]$. By this choice of $F_{X_1}$, we get:
$$I_\lambda(F_{X_1}F_{X_2}) = I(X_1;Y|X_2) + \lambda I(X_2;Y) \ge I(X_1;Y|X_2) = \int_{-\infty}^{+\infty} I(X_1;Y|X_2=x_2)\, dF_{X_2}(x_2) \ge \int_{-M}^{+M} I(X_1;Y|X_2=x_2)\, dF_{X_2}(x_2) \ge \inf_{F_{X_2}} \int_{-M}^{+M} H(Y|X_2=x_2)\, dF_{X_2}(x_2) - \sup_{F_{X_2}} \int_{-M}^{+M} H(Y|X_1, X_2=x_2)\, dF_{X_2}(x_2)$$
$$\ge \Big(1 - \frac{P_2}{M^2}\Big)\, H_b\Big(\frac{1}{2} - \frac{1}{2}\big(Q(M) - Q(3M)\big)\Big) - H_b\big(Q(2M)\big),$$
where (A27) is due to (A25) and the facts that $H(Y|X_2=x_2) = H_b\big(\frac{1}{2}Q(2M+x_2) + \frac{1}{2}Q(-2M+x_2)\big)$ is minimized over $[-M, M]$ at $x_2 = M$ (or, alternatively, at $x_2 = -M$) and that $H(Y|X_1, X_2=x_2) = \frac{1}{2}H_b\big(Q(2M+x_2)\big) + \frac{1}{2}H_b\big(Q(-2M+x_2)\big)$ is maximized at $x_2 = 0$. (A27) shows that $I_\lambda$ can become arbitrarily close to one given that M is large enough. Hence, its supremum over all distributions $F_{X_1}$ is one. This means that (13) cannot hold, and $\theta_1 \neq 0$.

Appendix E. Justification of (18), (19) and (20)

Let X be a vector space and Z be a real-valued function defined on a convex domain $D \subseteq X$. Suppose that $x^*$ maximizes Z on D and that Z is Gateaux differentiable (weakly differentiable) at $x^*$. Then, from (Theorem 2, p. 178, [18]),
$$Z'(x)\big|_{x^*} \le 0,$$
where $Z'(x)\big|_{x^*}$ is the weak derivative of Z at $x^*$.
From (A19), we have the weak derivative of $I_\lambda$ at $F_{X_1}^*$ as:
$$I_\lambda'(F_{X_1}F_{X_2})\Big|_{F_{X_1}^*} = \int \tilde{i}_\lambda(x_1; F_{X_1}^*|F_{X_2})\, dF_{X_1}(x_1) - I_\lambda(F_{X_1}^*F_{X_2}).$$
Now, the derivation of (18) is immediate by inspecting that the weak derivative of the objective of (11) at $F_{X_1}^*$ is given by:
$$I_\lambda'(F_{X_1}F_{X_2})\Big|_{F_{X_1}^*} - \theta_1\left(\int x_1^2\, dF_{X_1}(x_1) - \int x_1^2\, dF_{X_1}^*(x_1)\right) = \int \tilde{i}_\lambda(x_1; F_{X_1}^*|F_{X_2})\, dF_{X_1}(x_1) - I_\lambda(F_{X_1}^*F_{X_2}) - \theta_1\left(\int x_1^2\, dF_{X_1}(x_1) - \int x_1^2\, dF_{X_1}^*(x_1)\right).$$
Letting (A30) be lower than or equal to zero (as in (A28)) results in (18).
The equivalence of (18) to (19) and (20) follows similarly to the proof of Corollary 1 in (p. 210, [19]).

Appendix F

Equation (21) is obtained as follows.
$$\Big| D\big(p(y|x_1,x_2)\,\big\|\,p(y;F_{X_1}F_{X_2})\big) \Big| = \bigg| \sum_{y=0}^{1} p(y|x_1,x_2) \log\frac{p(y|x_1,x_2)}{p(y;F_{X_1}F_{X_2})} \bigg| \le \big| H(Y|X_1=x_1, X_2=x_2) \big| + \bigg| \sum_{y=0}^{1} p(y|x_1,x_2) \log p(y;F_{X_1}F_{X_2}) \bigg| \le 1 + \sum_{y=0}^{1} \big| \log p(y;F_{X_1}F_{X_2}) \big|$$
$$= 1 - \sum_{y=0}^{1} \log p(y;F_{X_1}F_{X_2}) \le 1 - 2\min\Big\{ \log p_Y(0;F_{X_1}F_{X_2}),\ \log p_Y(1;F_{X_1}F_{X_2}) \Big\} \le 1 - 2\log Q\big(\sqrt{P_1}+\sqrt{P_2}\big) < \infty,$$
where (A31) is due to the fact that the binary entropy function is upper bounded by one. (A32) is justified as follows.
$$\min\Big\{ p_Y(0;F_{X_1}F_{X_2}),\ p_Y(1;F_{X_1}F_{X_2}) \Big\} \ge \inf_{F_{X_1}F_{X_2}:\ E[X_j^2]\le P_j} \min\Big\{ p_Y(0;F_{X_1}F_{X_2}),\ p_Y(1;F_{X_1}F_{X_2}) \Big\} = \inf_{F_{X_1}F_{X_2}:\ E[X_j^2]\le P_j} p_Y(0;F_{X_1}F_{X_2}) = \inf_{F_{X_1}F_{X_2}:\ E[X_j^2]\le P_j} \int\!\!\int Q(x_1+x_2)\, dF_{X_1}(x_1)\, dF_{X_2}(x_2) = \inf_{F_{X_1}F_{X_2}:\ E[X_j^2]\le P_j} \int\!\!\int Q\big(\sqrt{x_1^2}+\sqrt{x_2^2}\big)\, dF_{X_1}(x_1)\, dF_{X_2}(x_2)$$
$$\ge Q\big(\sqrt{P_1}+\sqrt{P_2}\big),$$
where (A34) is based on the convexity and monotonicity of the function $Q(\sqrt{u}+\sqrt{v})$, which is shown in Appendix G.
where (A34) is based on the convexity and monotonicity of the function Q ( u + v ) , which is shown in Appendix G.
Equation (22) is obtained as follows.
$$p(y;F_{X_1}|x_2) \ge \min\Big\{ p(0;F_{X_1}|x_2),\ p(1;F_{X_1}|x_2) \Big\} \ge \int Q\big(|x_1| + |x_2|\big)\, dF_{X_1}(x_1) = \int Q\big(\sqrt{x_1^2} + |x_2|\big)\, dF_{X_1}(x_1) \ge Q\big(\sqrt{P_1} + |x_2|\big),$$
where (A35) is due to the convexity of $Q\big(\alpha + \sqrt{x}\big)$ in x for $\alpha \ge 0$.
Equation (23) is obtained as follows.
$$\bigg| \sum_{y=0}^{1} p(y|x_1,x_2) \log\frac{p(y;F_{X_1}F_{X_2})}{p(y;F_{X_1}|x_2)} \bigg| \le \bigg| \sum_{y=0}^{1} p(y|x_1,x_2) \log p(y;F_{X_1}|x_2) \bigg| + \bigg| \sum_{y=0}^{1} p(y|x_1,x_2) \log p(y;F_{X_1}F_{X_2}) \bigg| \le -\sum_{y=0}^{1} \log p(y;F_{X_1}|x_2) - \sum_{y=0}^{1} \log p(y;F_{X_1}F_{X_2})$$
$$\le -2\log Q\big(\sqrt{P_1}+\sqrt{P_2}\big) - 2\log Q\big(\sqrt{P_1}+|x_2|\big),$$
where (A36) is from $p(y|x_1,x_2) \le 1$; and (A37) is from (A35) and (A34).
Note that (A37) is integrable with respect to $F_{X_2}$ due to the concavity of $-\log Q\big(\sqrt{\alpha}+\sqrt{x}\big)$ in x for $\alpha \ge 0$, as shown in Appendix G. In other words,
$$\int_{-\infty}^{+\infty} \Big[ -2\log Q\big(\sqrt{P_1}+\sqrt{P_2}\big) - 2\log Q\big(\sqrt{P_1}+|x_2|\big) \Big]\, dF_{X_2}(x_2) \le -4\log Q\big(\sqrt{P_1}+\sqrt{P_2}\big)$$
$$< +\infty.$$

Appendix G. Two Convex Functions

Let $f(x) = \log Q(a + \sqrt{x})$ for $x, a \ge 0$. We have,
$$f'(x) = -\frac{e^{-\frac{(a+\sqrt{x})^2}{2}}}{2\sqrt{2\pi x}\, Q(a+\sqrt{x})},$$
and:
$$f''(x) = \frac{e^{-\frac{(a+\sqrt{x})^2}{2}}}{4x\sqrt{2\pi}\, Q^2(a+\sqrt{x})} \left[ \Big(a + \sqrt{x} + \frac{1}{\sqrt{x}}\Big)\, Q(a+\sqrt{x}) - \phi(a+\sqrt{x}) \right],$$
where $\phi(x) = \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}}$. Note that:
$$(1 + at + t^2)\, Q(a+t) + a\,\phi(a+t) > \big(1 + (a+t)^2\big)\, Q(a+t)$$
$$> (a+t)\,\phi(a+t), \quad \forall a, t > 0,$$
where (A41) and (A42) are, respectively, due to $\phi(x) > x\, Q(x)$ and $(1+x^2)\, Q(x) > x\,\phi(x)$ ($x > 0$). Therefore, with $t = \sqrt{x}$,
$$\Big(a + \sqrt{x} + \frac{1}{\sqrt{x}}\Big)\, Q(a+\sqrt{x}) > \phi(a+\sqrt{x}),$$
which makes the second derivative in (A40) positive and proves the (strict) convexity of $f(x)$.
Let $f(u,v) = Q(\sqrt{u}+\sqrt{v})$ for $u, v \ge 0$. By simple differentiation, the Hessian matrix of f is:
$$H = \frac{e^{-\frac{(\sqrt{u}+\sqrt{v})^2}{2}}}{\sqrt{2\pi}} \begin{bmatrix} \dfrac{1}{4u\sqrt{u}} + \dfrac{\sqrt{u}+\sqrt{v}}{4u} & \dfrac{\sqrt{u}+\sqrt{v}}{4\sqrt{uv}} \\[2mm] \dfrac{\sqrt{u}+\sqrt{v}}{4\sqrt{uv}} & \dfrac{1}{4v\sqrt{v}} + \dfrac{\sqrt{u}+\sqrt{v}}{4v} \end{bmatrix}.$$
It can be verified that $\det(H) > 0$ and $\mathrm{trace}(H) > 0$. Therefore, both eigenvalues of H are positive, which makes the matrix positive definite. Hence, $Q(\sqrt{u}+\sqrt{v})$ is (strictly) convex in $(u,v)$.
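As a quick numerical sanity check of the two convexity claims in this appendix (our own script, not part of the paper), one can verify on a grid that the second derivative of $\log Q(a+\sqrt{x})$ is positive and that the finite-difference Hessian of $Q(\sqrt{u}+\sqrt{v})$ has positive trace and determinant.

```python
import numpy as np
from scipy.stats import norm

Q = norm.sf

def second_diff(f, x, h=1e-4):
    """Central finite-difference approximation of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

# f(x) = log Q(a + sqrt(x)) should be convex in x for a >= 0.
for a in (0.0, 0.5, 2.0):
    xs = np.linspace(0.05, 25.0, 200)
    d2 = second_diff(lambda x: np.log(Q(a + np.sqrt(x))), xs)
    print(f"a={a}: min f'' on grid = {d2.min():.3e}")        # expected to be > 0

# g(u,v) = Q(sqrt(u) + sqrt(v)) should be convex: positive trace and determinant.
def hessian_ok(u, v, h=1e-4):
    g = lambda s, t: Q(np.sqrt(s) + np.sqrt(t))
    guu = (g(u + h, v) - 2 * g(u, v) + g(u - h, v)) / h**2
    gvv = (g(u, v + h) - 2 * g(u, v) + g(u, v - h)) / h**2
    guv = (g(u + h, v + h) - g(u + h, v - h) - g(u - h, v + h) + g(u - h, v - h)) / (4 * h**2)
    return guu + gvv > 0 and guu * gvv - guv**2 > 0

print(all(hessian_ok(u, v) for u in np.linspace(0.1, 10, 15)
                           for v in np.linspace(0.1, 10, 15)))
```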

Appendix H

Let $A \triangleq \max\{|A_1|, |A_2|\}$.
Figure A1. The figure depicting (A46) and (A48). Note that in the statement of Lemma 3, $x_2 \to +\infty$; hence, we have assumed $x_2 > A$ in the figure.
It is obvious that:
$$Q(x_2 + A) \le \int_{-A}^{A} Q(x_1 + x_2)\, dF_{X_1}(x_1) \le Q(x_2 - A).$$
Therefore, we can write:
$$\int_{-A}^{A} Q(x_1 + x_2)\, dF_{X_1}(x_1) = \beta\, Q(x_2 + A) + (1-\beta)\, Q(x_2 - A),$$
for some $\beta \in [0,1]$. Note that $\beta$ is a function of $x_2$. Furthermore, due to the concavity of $H_b(\cdot)$, we have:
$$H_b\Big( \int_{-A}^{A} Q(x_1 + x_2)\, dF_{X_1}(x_1) \Big) \ge \int_{-A}^{A} H_b\big(Q(x_1 + x_2)\big)\, dF_{X_1}(x_1).$$
From the fact that:
$$H_b(x) \ge \frac{H_b(p) - H_b(a)}{p - a}\,(x - a) + H_b(a), \quad \forall x \in [a, p],\ \forall a, p \in [0,1]\ (a < p),$$
we can also write:
$$\int_{-A}^{A} H_b\big(Q(x_1 + x_2)\big)\, dF_{X_1}(x_1) \ge \frac{H_b\big(Q(x_2 - A)\big) - H_b\big(Q(x_2 + A)\big)}{Q(x_2 - A) - Q(x_2 + A)} \left( \int_{-A}^{A} Q(x_1 + x_2)\, dF_{X_1}(x_1) - Q(x_2 + A) \right) + H_b\big(Q(x_2 + A)\big) = \beta\, H_b\big(Q(x_2 + A)\big) + (1-\beta)\, H_b\big(Q(x_2 - A)\big),$$
where (A45) and (A47) have been used in (A48). (A46) and (A48) are depicted in Figure A1.
From (A45) and (A48), we have:
$$\frac{\beta\, H_b\big(Q(x_2+A)\big) + (1-\beta)\, H_b\big(Q(x_2-A)\big)}{H_b\big(\beta\, Q(x_2+A) + (1-\beta)\, Q(x_2-A)\big)} \le \frac{\int_{-A}^{A} H_b\big(Q(x_1+x_2)\big)\, dF_{X_1}(x_1)}{H_b\Big(\int_{-A}^{A} Q(x_1+x_2)\, dF_{X_1}(x_1)\Big)} \le 1.$$
Let:
$$\beta^* \triangleq \arg\min_{\beta} \frac{\beta\, H_b\big(Q(x_2+A)\big) + (1-\beta)\, H_b\big(Q(x_2-A)\big)}{H_b\big(\beta\, Q(x_2+A) + (1-\beta)\, Q(x_2-A)\big)}.$$
This minimizer satisfies the following equality:
$$\frac{d}{d\beta} \left[ \frac{\beta\, H_b\big(Q(x_2+A)\big) + (1-\beta)\, H_b\big(Q(x_2-A)\big)}{H_b\big(\beta\, Q(x_2+A) + (1-\beta)\, Q(x_2-A)\big)} \right] \Bigg|_{\beta=\beta^*} = 0.$$
Therefore, we can write:
$$\frac{\beta\, H_b\big(Q(x_2+A)\big) + (1-\beta)\, H_b\big(Q(x_2-A)\big)}{H_b\big(\beta\, Q(x_2+A) + (1-\beta)\, Q(x_2-A)\big)} \ge \frac{\beta^*\, H_b\big(Q(x_2+A)\big) + (1-\beta^*)\, H_b\big(Q(x_2-A)\big)}{H_b\big(\beta^*\, Q(x_2+A) + (1-\beta^*)\, Q(x_2-A)\big)}$$
$$= \frac{H_b\big(Q(x_2-A)\big) - H_b\big(Q(x_2+A)\big)}{\big[Q(x_2-A) - Q(x_2+A)\big]\, H_b'\big(\beta^*\, Q(x_2+A) + (1-\beta^*)\, Q(x_2-A)\big)}$$
$$\ge \frac{H_b\big(Q(x_2-A)\big) - H_b\big(Q(x_2+A)\big)}{\big[Q(x_2-A) - Q(x_2+A)\big]\, H_b'\big(Q(x_2+A)\big)},$$
where (A52) is from the definition in (A50); (A53) is from the expansion of (A51), with $H_b'(t) = \log_2\big(\frac{1-t}{t}\big)$ being the derivative of the binary entropy function; (A54) is due to the fact that $H_b'(t)$ is a decreasing function.
Applying L'Hôpital's rule multiple times, we obtain:
$$\begin{aligned}
\lim_{x_2\to+\infty} \frac{H_b\big(Q(x_2-A)\big) - H_b\big(Q(x_2+A)\big)}{\big[Q(x_2-A) - Q(x_2+A)\big]\, H_b'\big(Q(x_2+A)\big)}
&= \lim_{x_2\to+\infty} \frac{H_b\big(Q(x_2-A)\big)\Big[1 - \frac{H_b(Q(x_2+A))}{H_b(Q(x_2-A))}\Big]}{Q(x_2-A)\Big[1 - \frac{Q(x_2+A)}{Q(x_2-A)}\Big]\log\Big(\frac{1-Q(x_2+A)}{Q(x_2+A)}\Big)} = \lim_{x_2\to+\infty} \frac{H_b\big(Q(x_2-A)\big)}{-Q(x_2-A)\,\log\big(Q(x_2+A)\big)} \\
&= \lim_{x_2\to+\infty} \frac{-e^{-\frac{(x_2-A)^2}{2}}\log\big(Q(x_2-A)\big)}{-e^{-\frac{(x_2-A)^2}{2}}\log\big(Q(x_2+A)\big) - \frac{Q(x_2-A)}{Q(x_2+A)}\, e^{-\frac{(x_2+A)^2}{2}}} = \lim_{x_2\to+\infty} \frac{\log\big(Q(x_2-A)\big)}{\log\big(Q(x_2+A)\big) + 1} \\
&= \lim_{x_2\to+\infty} \frac{Q(x_2+A)\, e^{Ax_2}}{Q(x_2-A)\, e^{-Ax_2}} = 1.
\end{aligned}$$
From (A49), (A54) and (A55), (35) is proven. Note that the boundedness of $X_1$ is crucial in the proof. In other words, the fact that $Q(x_2 - A) \to 0$ as $x_2 \to +\infty$ is the very result of $A < +\infty$.

References

  1. Walden, R.H. Analog-to-digital converter survey and analysis. IEEE J. Sel. Areas Commun. 1999, 17, 539–550.
  2. Murmann, B. ADC Performance Survey. 2014. Available online: http://web.stanford.edu/~murmann/adcsurvey.html (accessed on 20 May 2018).
  3. Gunduz, D.; Stamatiou, K.; Michelusi, N.; Zorzi, M. Designing intelligent energy harvesting communication systems. IEEE Commun. Mag. 2014, 52, 210–216.
  4. Singh, J.; Dabeer, O.; Madhow, U. On the limits of communication with low-precision analog-to-digital conversion at the receiver. IEEE Trans. Commun. 2009, 57, 3629–3639.
  5. Krone, S.; Fettweis, G. Fading channels with 1-bit output quantization: Optimal modulation, ergodic capacity and outage probability. In Proceedings of the 2010 IEEE Information Theory Workshop, Dublin, Ireland, 30 August–3 September 2010; pp. 1–5.
  6. Mezghani, A.; Nossek, J.A. Analysis of Rayleigh-fading channels with 1-bit quantized output. In Proceedings of the 2008 IEEE International Symposium on Information Theory, Toronto, ON, Canada, 6–11 July 2008.
  7. Mezghani, A.; Nossek, J.A. On ultra-wideband MIMO systems with 1-bit quantized outputs: Performance analysis and input optimization. In Proceedings of the 2007 IEEE International Symposium on Information Theory, Nice, France, 24–29 June 2007; pp. 1286–1289.
  8. Mo, J.; Heath, R. Capacity analysis of one-bit quantized MIMO systems with transmitter channel state information. IEEE Trans. Signal Process. 2015, 63, 5498–5512.
  9. Witsenhausen, H.S. Some aspects of convexity useful in information theory. IEEE Trans. Inf. Theory 1980, 26, 265–271.
  10. El Gamal, A.; Kim, Y.H. Network Information Theory; Cambridge University Press: Cambridge, UK, 2012.
  11. Rudin, W. Principles of Mathematical Analysis; McGraw-Hill: New York, NY, USA, 1976.
  12. Abou-Faycal, I.C.; Trott, M.D.; Shamai, S. The capacity of discrete-time memoryless Rayleigh-fading channels. IEEE Trans. Inf. Theory 2001, 47, 1290–1300.
  13. Gallager, R.G. Information Theory and Reliable Communication; Wiley: New York, NY, USA, 1968.
  14. Shiryaev, A.N. Probability, 2nd ed.; Springer: Berlin, Germany, 1996.
  15. Chung, K.L. A Course in Probability Theory, 2nd ed.; Academic Press: New York, NY, USA, 1974.
  16. Billingsley, P. Convergence of Probability Measures, 2nd ed.; Wiley: New York, NY, USA, 1968.
  17. Sagitov, S. Lecture Notes: Weak Convergence of Probability Measures. Available online: http://www.math.chalmers.se/~serik/C-space.pdf (accessed on 20 May 2018).
  18. Luenberger, D. Optimization by Vector Space Methods; Wiley: New York, NY, USA, 1969.
  19. Smith, J.G. The information capacity of amplitude and variance constrained scalar Gaussian channels. Inf. Control 1971, 18, 203–219.
Figure 1. A two-transmitter Gaussian multiple access channel (MAC) with a one-bit analogue-to-digital converter (ADC) front end at the receiver.
