On the Capacity and the Optimal Sum-Rate of a Class of Dual-Band Interference Channels

We study a class of two-transmitter two-receiver dual-band Gaussian interference channels (GIC) that operate over the conventional microwave band and the unconventional millimeter-wave (mm-wave) band. This study is motivated by future 5G networks, where additional spectrum in the mm-wave band complements transmission in the incumbent microwave band. The mm-wave band has a key modeling feature: due to severe path loss and relatively small wavelength, a transmitter must employ highly directional antenna arrays to reach its desired receiver. As a result, the mm-wave channels are highly directional, and a transmitter can use them to transmit either to its designated receiver or to the other receiver. We consider two classes of such channels, where the underlying GIC in the microwave band has weak and strong interference, respectively, and obtain sufficient channel conditions under which the capacity is characterized. Moreover, we assess the impact of the additional mm-wave spectrum on performance by characterizing the transmit power allocation for the direct and cross channels that maximizes the sum-rate of this dual-band channel. The solution reveals conditions under which different power allocations, such as allocating the power budget only to the direct channels, only to the cross channels, or sharing it among them, become optimal.


Introduction
Current technology such as 4G (e.g., [1]) is rapidly becoming inadequate to support the exponential growth in wireless traffic [2]. Moreover, the potential for improving network throughput in 4G is limited by the shortage of spectrum in the incumbent microwave band (i.e., carrier frequencies below 6 GHz). Thus, methods to tackle the ever-growing amount of mobile traffic have become a key research area (see [3] and references therein). Several new technologies are being considered, among which employing additional spectrum in the 28-300 GHz frequency range, often referred to as the millimeter-wave (mm-wave) band, seems to be a promising solution to the problem of spectrum scarcity [3,4]. Specifically, integrating additional spectrum from the mm-wave band to complement transmissions in the microwave band is poised to play a central role in the functioning of 5G networks.
Transmission in the mm-wave band is distinctly different from that in the microwave band. Due to higher operating frequencies, omnidirectional transmission in the mm-wave band is subject to much higher absorption and power loss [5] than in the microwave band, and thus a transmitter needs to employ beamforming with highly directional antenna arrays to counter this loss and reach its receiver [2]. However, beamforming constrains most of the transmission energy to the line-of-sight (LOS) component, and very few, if any, significant non-LOS components exist [6,7]. Thus, transmission in this band is highly directional and point-to-point. Such mm-wave channels can support high data rates due to their vast bandwidth, but are nevertheless prone to blockage and absorption [8] due to their point-to-point nature. In contrast, microwave links are much more reliable due to rich scattering and diffraction, but cannot support rates as high as those of the mm-wave links. Thus, in a dual-band setting, conventional traffic and control information can be reliably communicated over microwave links, and high data-rate traffic can be sent through mm-wave links [2,7,9-15].
Most studies on microwave and mm-wave dual-band transmission have focused on how to improve network-layer performance metrics of cellular access or backhaul networks by using the high-bandwidth, highly directional mm-wave links [9,13]. For example, the authors in [7] posed the problem of optimal resource allocation in a cellular setting in the dual microwave and mm-wave band, and showed that certain network-level performance parameters, e.g., the number of simultaneously supported users and the link connection probability, are vastly improved with their proposed solution. Similarly, the authors in [11] study a two-tier cellular network where the 60 GHz band is used to create point-to-point directional links, and the 70 GHz band is used to establish long-range connections just as the microwave band is used in our work, and propose a hybrid scheme involving both bands that improves the network throughput. In a more closely related article [14], the authors characterize the benefits of beamforming over a point-to-point dual-band multi-antenna channel, and study the performance of a hybrid adaptive queueing scheme over both bands that maximizes the delay-constrained throughput.
Recent works on joint transmission in the microwave band and the mm-wave band [14,16-20] indicate that it is possible to isolate the transmissions in the microwave band from those in the mm-wave band, and to communicate over the two bands simultaneously. For example, the authors in [14] designed a queue-based scheme that transmits from a single transmitter to a single receiver simultaneously over the 3 GHz microwave band and the 30 GHz mm-wave band using beamforming, and conducted successful practical experiments in this dual-band setup. In addition, Intel has recently announced the production of a dual-band modem that supports both sub-6 GHz and 28 GHz bands [16]. Moreover, recent works on resource allocation in the microwave and mm-wave dual-band setting in [19,20], and their variant in [21], show that simultaneous transmission in both the microwave and mm-wave bands is indeed feasible, and is gaining acceptance as an architecture for cellular access in 5G.
We study here the performance of a two-transmitter two-receiver (2 × 2) dual-band interference channel from an information-theoretic perspective, where each transmitter communicates with its respective receiver over the microwave band and the mm-wave band simultaneously. In the microwave band, each receiver observes the superposition of signals from both transmitters, as in a conventional single-band Gaussian interference channel (GIC) [22]. However, as the mm-wave channels are considered to be highly directional, a transmitter in this band is well modeled as being able to transmit towards one intended receiver while causing negligible to no interference to the other receiver [6]. This raises the question: to which receiver should a transmitter in the mm-wave band transmit? In this 2 × 2 GIC, the transmitters in the mm-wave band can transmit from: (a) the first transmitter (Tx$_1$) to the first receiver (Rx$_1$), and the second transmitter (Tx$_2$) to the second receiver (Rx$_2$); (b) Tx$_1$ to Rx$_2$, and Tx$_2$ to Rx$_1$; (c) Tx$_1$ and Tx$_2$ to Rx$_2$; or (d) Tx$_1$ and Tx$_2$ to Rx$_1$. We focus here on the first two cases, in which the transmitters in the mm-wave band transmit either in the direct channels (case (a)) or in the cross channels (case (b)), and on the setting in which the mm-wave spectrum is shared between these two modes. We denote the resulting channels by Direct-Link IC (DLIC), Cross-Link IC (CLIC) and Direct-and-Cross-Link IC (DCLIC), respectively, and study their capacity.
The capacity of the conventional single-band GIC [22] has been characterized when it has strong [22] or very strong interference [23]. We know that, under strong interference, encoding the message at each transmitter with independent Gaussian distributed codewords, and decoding both the desired and the interfering messages at each receiver, is the optimal strategy [22]. However, if the GIC has weak interference [24], the capacity and the optimal strategy are still unknown in general.
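As a numerical illustration of these regimes, the following minimal sketch classifies a single-band two-user GIC in standard form (unit direct gains, unit noise variance) and evaluates the well-known sum-capacity under strong interference. The function names and the arguments `g1` and `g2` (squared cross gains observed at Rx 1 and Rx 2) are illustrative choices for this sketch and are not the paper's notation; the label "mixed" for channels with one strong and one weak cross gain is likewise an assumption of the sketch.

```python
import math


def C(x):
    """Gaussian capacity function C(x) = 0.5 * log2(1 + x), in bits per channel use."""
    return 0.5 * math.log2(1.0 + x)


def classify_gic(g1, g2, P1, P2):
    """Classify a standard-form two-user GIC by its interference regime.

    g1, g2: squared cross gains seen at Rx 1 and Rx 2 (illustrative names).
    P1, P2: transmit powers of Tx 1 and Tx 2.
    """
    if g1 >= 1 + P1 and g2 >= 1 + P2:
        return "very strong"   # sum-rate constraints are inactive
    if g1 >= 1 and g2 >= 1:
        return "strong"
    if g1 < 1 and g2 < 1:
        return "weak"
    return "mixed"             # label assumed for this sketch only


def strong_gic_sum_capacity(g1, g2, P1, P2):
    """Sum-capacity when both receivers can decode both messages (g1, g2 >= 1)."""
    if g1 < 1 or g2 < 1:
        raise ValueError("the formula applies only under strong interference")
    return min(C(P1) + C(P2),        # interference-free individual rates
               C(P1 + g1 * P2),      # joint decoding at Rx 1
               C(g2 * P1 + P2))      # joint decoding at Rx 2
```

Under very strong interference the first term of the minimum is the binding one, so each user attains its interference-free rate.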
The 2 × 2 parallel Gaussian IC (PGIC) consists of several orthogonal parallel channels (sub-channels) such that a 2 × 2 conventional GIC operates in each sub-channel without interfering with those in other sub-channels [25]. The optimal strategies for the PGIC are not known in general; however, its capacity was characterized in [26] when the GIC in each sub-channel has strong interference. In a related study, the capacity of the ergodic fading GIC was characterized in [27] when each fading state has strong interference. In the dual-band GIC considered here, the number of channel uses in the microwave band and the mm-wave band may differ. We model this with a bandwidth mismatch factor (BMF) in the system model. Note that the dual-band GIC considered here is a special case of the ergodic fading GIC in [27] if one identifies each fading state as a different sub-channel. In the special case that the bandwidth mismatch factor between the microwave and mm-wave bands is 1, it is also a special case of the PGIC in [26].
Moreover, the studies in [26,27] show that if every sub-channel (or fading state) has strong interference, the capacity is achieved by encoding jointly over all sub-channels and decoding messages from both transmitters. Encoding independently over each sub-channel of a PGIC is suboptimal in general [28], except for the GIC in the very weak (noisy) interference regime [29]. In fact, joint encoding over all sub-channels generally achieves better rates, as it can potentially offset the weak interference in one or more sub-channels if the other sub-channels have strong interference [28].
In the sequel, we study the capacity of the DCLIC and the CLIC. First, we present a useful result that decomposes the capacity of the DCLIC into that of the underlying CLIC and the set of direct channels. This result shows that the capacity of the DCLIC can be established if the capacity of a corresponding CLIC is known. Hence, we focus on the CLIC next. In particular, we consider two specific classes, the strong CLIC and the weak CLIC, where the underlying GIC in the microwave band has strong and weak interference, respectively, and characterize sufficient conditions on the channel parameters under which their capacity is established.
Resource allocation techniques that maximize the sum-rate or throughput of interference channels have been investigated thoroughly (e.g., see [25,30,31]), as they indicate how to allocate resources in practice. The DCLIC models a basic multiuser network over the microwave and mm-wave dual-band, whose performance is likely to be dominated by the mm-wave channels due to their large bandwidths [9]. Thus, it is useful to understand how to optimize the performance of the DCLIC over the parameters in the mm-wave band. Therefore, we study the power allocation in the direct and cross channels in the mm-wave band that maximizes the sum-rate of the DCLIC. We derive the optimal powers in closed form, and study how the channel parameters influence the optimal power allocation.
The contributions of this paper are summarized as follows:
• We show that the capacity region of the DCLIC can be decomposed into the capacity region of the underlying CLIC and two non-interfering direct links in the mm-wave band. This illustrates that the cross channels are actively involved in characterizing the capacity of the CLIC, whereas the direct channels improve the rates of individual users.
• We characterize the capacity of the strong CLIC, and observe that the strong interference condition in the microwave band is sufficient to characterize the capacity.
• For the weak CLIC, we characterize sufficient channel conditions under which its capacity is established. This shows that even if the GIC in the microwave band has weak interference, adequately strong cross channels in the mm-wave band are sufficient to characterize the capacity.
• We characterize the optimal power allocation in the direct and cross channels that maximizes the sum-rate of the DCLIC, and study channel conditions under which the optimal power allocation either assigns the entire power budget to a specific subset of channels, or shares the power budget among all channels. We establish a direct relation between the channel parameters and the optimal powers, from which we observe the following:
  - The optimal power allocation distributes the power budget among the direct and cross channels following two properties: a waterfilling-like property and a max-min property.
  - When the power budget is sufficiently small, the optimal allocation assigns power to either both direct channels, or both cross channels and at most one direct channel.
  - Due to the max-min property, the optimal allocation imposes a maximum limit on the cross-channel powers. When the power budget exceeds a certain threshold, the limit on the cross-channel powers is reached, and all additional increments to the power budget are then added only to the direct channels, which do not have such limits.
  - If the underlying GIC in the microwave band has very strong interference, the optimal power allocation assigns the power budget entirely to the direct channels.
  - If the channel parameters satisfy one of the following criteria, then transmitting only in the direct channels is approximately optimal, in the sense that the difference between the sum-rates resulting from allocating to only the direct channels and allocating optimally in all channels is negligibly small: (a) the transmit powers in the underlying GIC in the microwave band are very small; or (b) the cross-channel gains in the mm-wave band are very large.
The rest of the paper is organized as follows. We define the system model in Section 2. We present the decomposition result on the DCLIC and the capacity of the strong CLIC in Section 3. We present the capacity result on the weak CLIC in Section 4. In Section 5, we formulate the optimal sum-rate problem and discuss its solution, and we conclude in Section 6.
Notation: We denote sets in calligraphic font (e.g., $\mathcal{A}_k$), except the sets of reals, positive reals, and positive integers, which are denoted by $\mathbb{R}$, $\mathbb{R}_+$ and $\mathbb{Z}_+$, respectively. Vectors are in bold (e.g., $\mathbf{x}$), and $\mathbf{x} \succeq \mathbf{0}$ denotes that each element of $\mathbf{x}$ is in $\mathbb{R}_+$, where $\mathbf{0}$ is the zero vector. We denote by $E(Y)$ the statistical expectation of a random variable (RV) $Y$, and by $X \sim \mathcal{N}(\mu, \sigma^2)$ an RV following the Gaussian distribution with mean $\mu$ and variance $\sigma^2$. We also denote an $n$-length vector $(X_1, X_2, \ldots, X_n)$ by $X^n$, the empty set by $\emptyset$, $\log_2 x$ by $\log x$, and define $C(x) := \frac{1}{2}\log(1 + x)$.

System Model
First, we define the discrete memoryless (DM) model of the DCLIC, and from that we define the Gaussian models. In the DCLIC, a bandwidth mismatch between the first (microwave) band and the second (mm-wave) band may exist. Thus, we assume that during $n$ accesses of the first band, the second band is accessed $n_1(n)$ times as cross channels and $n_2(n)$ times as direct channels. We model this by two bandwidth mismatch factors (BMF), $\alpha_k := \lim_{n \to \infty} n_k(n)/n$, $k = 1, 2$. Thus, the total number of normalized channel uses in the second band is asymptotically $\alpha := \alpha_1 + \alpha_2$, which is shared between the cross channels ($\alpha_1$) and the direct channels ($\alpha_2$). For ease of exposition, we denote $n_k(n)$ by $n_k$, $k = 1, 2$.
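The 2 × 2 DM interference channel is defined by $((\mathcal{X}_k, \mathcal{Y}_k)_{k=1}^2, p(y_1, y_2|x_1, x_2))$, where $\mathcal{X}_k$ and $\mathcal{Y}_k$ are the discrete input and output alphabets of user $k$, $k = 1, 2$, and $p(y_1, y_2|x_1, x_2)$ is the set of channel transition probabilities [32] (Chapter 6.1). We define the 2 × 2 DM DCLIC similarly by the tuple $((\mathcal{X}_k, \hat{\mathcal{X}}_k, \bar{\mathcal{X}}_k, \mathcal{Y}_k, \hat{\mathcal{Y}}_k, \bar{\mathcal{Y}}_k)_{k=1}^2, p)$, where $\mathcal{X}_k$ and $\mathcal{Y}_k$ are the input and output alphabets of the interference channel in the first band, $\bar{\mathcal{X}}_k$ and $\bar{\mathcal{Y}}_k$ are the input and output alphabets for the Tx$_k$ to Rx$_k$ direct channels in the second band, $k = 1, 2$, and $\hat{\mathcal{X}}_1$ and $\hat{\mathcal{Y}}_2$ (respectively, $\hat{\mathcal{X}}_2$ and $\hat{\mathcal{Y}}_1$) are the input and output alphabets for the Tx$_1$ to Rx$_2$ (respectively, Tx$_2$ to Rx$_1$) cross channel in the second band. The joint probability mass function (pmf) $p$ of the DCLIC decomposes into the pmf of the interference channel in the first band and the pmfs of the individual cross and direct channels in the second band.

We define a $(2^{nR_1}, 2^{nR_2}, n, n_1, n_2)$ code for the DCLIC to consist of (a) two uniformly distributed and independent message sets, $\mathcal{M}_1 := \{1, \ldots, 2^{nR_1}\}$ and $\mathcal{M}_2 := \{1, \ldots, 2^{nR_2}\}$, for Tx$_1$ and Tx$_2$, respectively; (b) two encoding functions, $\phi_1$ and $\phi_2$, for Tx$_1$ and Tx$_2$, respectively, each of which maps a message to the codewords sent over the first band, the cross channel and the direct channel; and (c) two decoding functions, $\psi_1$ and $\psi_2$, for Rx$_1$ and Rx$_2$, respectively, each of which maps the channel outputs observed in both bands to a message estimate. We define the decoding probability of error at Rx$_k$, $P^n_{e,k}$, as the probability that the estimate produced by $\psi_k$ differs from the message sent by Tx$_k$. A rate pair $(R_1, R_2)$ is achievable for the DCLIC if there exists a sequence of $(2^{nR_1}, 2^{nR_2}, n, n_1, n_2)$ codes such that $n_k \le \alpha_k n$, and $P^n_{e,k} \to 0$ as $n, n_k \to \infty$ for $k = 1, 2$. The capacity region of the DCLIC is defined as the closure of the set of all achievable rate tuples.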
Following the above notation, the GIC in the first band of the Gaussian DCLIC is modeled as in [22]. The signals received at Rx$_1$ and Rx$_2$ are given by (1) and (2), where the direct channel gains $a_{kk}$ are normalized to 1 as in [22], and $Z_k \sim \mathcal{N}(0, 1)$ are i.i.d. noise, $k = 1, 2$. In addition, the codewords $X_k^n$ now satisfy an average power constraint. The cross links in the second band of the DCLIC are point-to-point, and are modeled as $\hat{Y}_1 = c_{21}\hat{X}_2 + \hat{Z}_1$ (3) and $\hat{Y}_2 = c_{12}\hat{X}_1 + \hat{Z}_2$ (4), where $\hat{X}_k, \hat{Y}_k \in \mathbb{R}$, $c_{km} \in \mathbb{R}$ are the coefficients of the channels from Tx$_k$ to Rx$_m$, $k \neq m \in \{1, 2\}$, and $\hat{Z}_k \sim \mathcal{N}(0, 1)$ are i.i.d. noise. The codewords $\hat{X}_k^{n_1}$ satisfy an average power constraint. The direct links in the second band are similarly modeled as $\bar{Y}_k = d_k\bar{X}_k + \bar{Z}_k$, $k = 1, 2$ (5)-(6), where $\bar{X}_k, \bar{Y}_k \in \mathbb{R}$, $d_k \in \mathbb{R}$ are the direct channel coefficients, $\bar{Z}_k \sim \mathcal{N}(0, 1)$ are i.i.d. noise, and the codewords $\bar{X}_k^{n_2}$ satisfy an average power constraint as well. We present the Gaussian DCLIC in Figure 1.

We define a $(2^{nR_1}, 2^{nR_2}, n, n_1, n_2)$ code for the Gaussian DCLIC from that of the DM DCLIC, by choosing all input and output variables to be in $\mathbb{R}$, and by imposing the average power constraints on the codewords $X_k^n$, $\hat{X}_k^{n_1}$ and $\bar{X}_k^{n_2}$, $k = 1, 2$, defined above. Next, the Gaussian CLIC is defined from the Gaussian DCLIC by imposing the restrictions $n_2 = 0$ and $\bar{X}_k = \bar{Y}_k = \emptyset$, $k = 1, 2$, such that the channel outputs in the CLIC are described by (1)-(4). A $(2^{nR_1}, 2^{nR_2}, n, n_1)$ code for the CLIC is defined from a $(2^{nR_1}, 2^{nR_2}, n, n_1, n_2)$ code of the Gaussian DCLIC with the above-mentioned restrictions.
For ease of exposition, we define two classes of CLICs: we say that a CLIC is a strong CLIC or a weak CLIC if the underlying GIC in the first band has strong interference (i.e., $a_{12}^2 \ge 1$, $a_{21}^2 \ge 1$) or weak interference (i.e., $a_{12}^2 < 1$, $a_{21}^2 < 1$), respectively. Moreover, a symmetric CLIC is a CLIC whose channel gains and power constraints are identical for both users. In addition, a symmetric DCLIC is a DCLIC with an underlying symmetric CLIC and with identical direct-channel parameters for both users. Finally, the Gaussian model for the DLIC is defined from that of the DCLIC by imposing the restrictions $n_1 = 0$ and $\hat{X}_k = \hat{Y}_k = \emptyset$, $k = 1, 2$, and thus the channel outputs are described by (1), (2), (5) and (6). In the sequel, we focus on the Gaussian model of the DCLIC and the CLIC.

Decomposition Result on the Capacity of the DCLIC
Recall that, in the DCLIC, there are $n$ channel uses in the first band, $n_1$ cross channel uses in the second band, and $n_2$ direct channel uses in the second band. We show below that the capacity of the DCLIC can be decomposed into the capacity of the underlying CLIC, complemented by the direct channels that are used to transmit individual user information to their respective receivers.
Theorem 1. The capacity region of the DCLIC with BMFs $\alpha_1$ and $\alpha_2$ is given by the set of all nonnegative rate tuples $(R_1, R_2)$ that satisfy the decomposition

where $(r_1, r_2)$ is an achievable rate tuple in the underlying CLIC with BMF $\alpha_1$.
Proof. The proof is relegated to Appendix A.
Therefore, an achievable rate pair in the DCLIC with BMFs $\alpha_1$ and $\alpha_2$ consists of a rate pair achievable in the CLIC with BMF $\alpha_1$ and the rate pair achieved in the two direct channels. The capacity of the DCLIC can thus be characterized from that of the underlying CLIC. Hence, we focus on the CLIC next. In particular, we consider two specific classes of the CLIC, the strong CLIC and the weak CLIC. First, we present the capacity of the strong CLIC.
Lemma 1. The capacity region of the strong CLIC with BMF $\alpha_1$ is given by the set of all nonnegative rate tuples $(R_1, R_2)$ that satisfy

The proof of Lemma 1 follows from that of [27] in a straightforward manner. Hence, we omit the proof here, and discuss only the key idea. In the strong CLIC, the GIC in the first band has strong interference. Additionally, in the second band, the cross-channel gains are positive and the direct-channel gains are zero, which results in a GIC with strong interference. Hence, the strong CLIC is a parallel GIC with strong interference in both bands. The capacity is thus achieved by encoding jointly over both bands and decoding messages from both transmitters, as in [27].
Moreover, the capacity of the DCLIC with strong underlying CLIC, which follows from Theorem 1 and Lemma 1, is characterized below.
Corollary 1. The capacity region of the DCLIC with BMFs $\alpha_1$ and $\alpha_2$ that has a strong underlying CLIC is given by the set of all nonnegative rate tuples $(R_1, R_2)$ that satisfy

Furthermore, if the DLIC, where the second band is used as direct channels only, has a strong underlying GIC, the capacity region follows from Corollary 1 with $\alpha_1 = 0$ and $\alpha_2 = \alpha$.

Capacity of the Weak CLIC
Now, consider the weak CLIC, where the underlying GIC has weak interference. Recall that in a conventional GIC with weak interference, decoding both messages is suboptimal in general. However, in the weak CLIC, if messages are encoded jointly over both bands and the cross channels in the second band are sufficiently strong, then decoding both messages is indeed optimal. We characterize the sufficient conditions below.

Lemma 2. Provided the channel parameters of the CLIC with BMF $\alpha_1$ satisfy $a_{12}^2 < 1$, $a_{21}^2 < 1$, as well as the conditions (16) and (17), the inequalities (18) and (19) hold for all $X_1^n$ and $X_2^n$ with product distributions $p(x_1^n)p(x_2^n)$ on $\mathbb{R}^n \times \mathbb{R}^n$, $n \in \mathbb{Z}_+$, which satisfy the average power constraints.

Proof. We prove only (18), as (19) follows similarly. The proof starts from the system model for the first band in (1) and (2) and proceeds through a chain of labeled steps, where (a) follows from unconditioning and since $h(Z_1^n) = h(Z_2^n)$; (b) follows since $1/a_{12}^2 > 1$; (g) follows by invoking the worst additive noise result of [33] to every mutual information term inside the summation of (f); and (k) follows from condition (16).
We characterize the capacity of the weak CLIC under the conditions of Lemma 2 below.
Theorem 2. The capacity region of the weak CLIC with BMF $\alpha_1$ that satisfies the conditions in (16) and (17) is given by the set of all nonnegative rate tuples $(R_1, R_2)$ that satisfy

Proof. The proof is relegated to Appendix B.
The result shows that, in the weak CLIC, where sufficient interference forwarding is not possible through the underlying weak GIC, if the cross links in the second band are sufficiently strong, it is possible to forward enough interference by encoding jointly over both bands such that decoding both messages becomes optimal.Hence, these conditions can be classified as being able to push the receivers to the "strong interference regime over both bands".
Next, we illustrate the relationship between the channel parameters in (16) and (17) with an example of a symmetric weak CLIC, where (16) and (17) imply the same condition. We denote by $c^2_{\min}$ and $\alpha_{1,\min}$, respectively, the minimum $c^2$ and the minimum $\alpha_1$ required to achieve (16), and show the interplay between $a^2$, $c^2_{\min}$ and $\alpha_{1,\min}$ in Figure 2. In Figure 2a, we plot $c^2_{\min}$ against $a^2 \in (0, 1)$ for $\alpha_1 \in \{0.5, 1, 2\}$. Note that $c^2_{\min}$ reduces monotonically as $a^2$ or $\alpha_1$ increases. This follows since, if $a^2$ increases, then the first band forwards more interference, and, if $\alpha_1$ increases, then the pre-log factor of the cross-channel capacity increases. In either case, a smaller $c^2_{\min}$ is required to achieve (16). Similarly, in Figure 2b, we plot $\alpha_{1,\min}$ against $a^2 \in (0, 1)$ for $c^2 \in \{0.5, 1, 2\}$, and note that $\alpha_{1,\min}$ reduces as $a^2$ or $c^2$ increases. Finally, in Figure 2c, we depict the set of cross-channel gains ($a^2$ and $c^2$) of a symmetric CLIC, and partition it depending on whether the capacity has been characterized in each set. Note that the capacity for the set where $a^2 < 1$ and $c^2 < c^2_{\min}$ has not been characterized yet.

Figure 2. In (a,b), we plot $c^2_{\min}$ and $\alpha_{1,\min}$, respectively. In (c), the channel gains of a symmetric CLIC are partitioned based on whether its capacity has been characterized in each set.

The Optimal Sum-Rate Problem
In this section, we study the power allocation scheme over the direct and cross channels in the DCLIC that maximizes its sum-rate. Recall from Corollary 1 that, if the underlying GIC of the DCLIC has strong interference (i.e., $a_{12}^2 \ge 1$ and $a_{21}^2 \ge 1$), then the sum-rate of the DCLIC is known for all values of the remaining channel parameters and, in particular, for all transmit powers in the second band. Therefore, we pose this problem for the class of DCLICs with a strong underlying GIC (see [A1] below). Also recall that the normalized bandwidth ($\alpha$) of the second band is shared between the direct channels ($\alpha_2$) and the cross channels ($\alpha_1$). We denote the fractions of $\alpha$ allotted to the direct and cross channels by $\beta$ and $\bar{\beta}$, respectively, where $\beta := \alpha_2/\alpha$ and $\bar{\beta} := 1 - \beta = \alpha_1/\alpha$, and $\beta, \bar{\beta} \in (0, 1)$. Thus, $\beta$ provides a trade-off between the bandwidths in the direct channels ($\beta\alpha$) and the cross channels ($\bar{\beta}\alpha$).
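The bandwidth split can be made concrete with a small sketch. The helper names `approx_bmf` and `band_split` are hypothetical, and the finite-blocklength ratio $n_k/n$ is used here only as a stand-in for the limiting BMF $\alpha_k$.

```python
def approx_bmf(n, n1, n2):
    """Finite-n proxies for the BMFs: alpha_1 ~ n1/n (cross), alpha_2 ~ n2/n (direct)."""
    if n <= 0 or n1 < 0 or n2 < 0:
        raise ValueError("n must be positive and n1, n2 nonnegative")
    return n1 / n, n2 / n


def band_split(alpha1, alpha2):
    """Return (alpha, beta, beta_bar) for cross BMF alpha1 and direct BMF alpha2.

    beta = alpha2 / alpha is the fraction of the second band used by the direct
    channels, and beta_bar = 1 - beta = alpha1 / alpha is the cross-channel fraction.
    """
    if alpha1 <= 0 or alpha2 <= 0:
        raise ValueError("beta and beta_bar must lie strictly inside (0, 1)")
    alpha = alpha1 + alpha2
    beta = alpha2 / alpha
    beta_bar = alpha1 / alpha
    return alpha, beta, beta_bar
```

For instance, 1000 uses of the first band with 1500 cross-channel uses and 500 direct-channel uses give $\alpha_1 = 1.5$, $\alpha_2 = 0.5$, $\alpha = 2$, $\beta = 0.25$ and $\bar{\beta} = 0.75$.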
We formulate the problem under the following assumptions:
[A1] the underlying GIC of the DCLIC has strong interference, but not very strong interference, i.e., $1 \le a_{12}^2 < 1 + Q_1$ and $1 \le a_{21}^2 < 1 + Q_2$;
[A2] the underlying GIC of the DCLIC receives equal power at both its receivers;
[A3] $\beta$ and $\bar{\beta}$ are fixed a priori;
[A4] the transmission powers in the direct channel ($p_k$) and the cross channel ($q_k$) from transmitter $k$ (Tx$_k$) satisfy the constraint $\beta p_k + \bar{\beta} q_k = P$, $k = 1, 2$, where $P$ is the power budget.
In [A1], we assume that the underlying GIC does not have very strong interference, since the power allocation in that case is trivial (see Section 5.4). For ease of exposition, we assume in [A2] that the underlying GIC receives equal power at both its receivers. Note that the class of GICs that satisfies [A2] also contains the symmetric GICs. Moreover, the analysis under [A2] reveals enough insight that it can be extended to the general case (see Remark 1).
In [A3], $\beta$ is assumed to be fixed a priori and known. This models practical constraints in many wireless networks where dynamically allocating the bandwidth may not be feasible or straightforward [34]. In [A4], we assume that the power budget ($P$) is the same at both transmitters. There is no loss of generality in this assumption, as the relative difference between the power budgets of the first and the second user can be absorbed into the channel gains.

Problem Formulation and Solution
For a fixed power allocation $(p_1, q_1, p_2, q_2)$ in the mm-wave channels, we denote the sum-rates achievable at Rx$_1$ and Rx$_2$ in (14) and (15) by $\Sigma_1$ and $\Sigma_2$, and the interference-free sum-rate given by the sum of the individual rates in (12) and (13) by $\Sigma$. Therefore, a necessary and sufficient condition for $R$ to be an achievable sum-rate of the DCLIC is $R \le \min\{\Sigma_1, \Sigma_2, \Sigma\}$ for some power allocation $(p_1, q_1, p_2, q_2)$. The optimization problem [P1] then maximizes $R$ over the transmit powers $(p_1, q_1, p_2, q_2)$, subject to $R \le \Sigma_1$, $R \le \Sigma_2$, $R \le \Sigma$, the power constraints in [A4], and $(p_1, q_1, p_2, q_2, R) \succeq \mathbf{0}$.
It is well known that the optimal power allocation in parallel Gaussian point-to-point channels follows the waterfilling (WF) property. Due to this, if the power budget is sufficiently small, it is allocated entirely to the "strongest" sub-channel, and, as the power budget is increased, power is allocated to the other "weaker" sub-channels in addition to the strongest one (see Chapter 3.4.3 in [32]). In the ensuing discussion, it becomes clear that the optimal power allocation in [P1] (hereafter referred to as "the optimal allocation") has two noticeable properties: a WF-like property, due to which it assigns power to the cross and direct channels following a WF-like allocation, and a max-min property, due to which it increases the minimum of the sum-rate constraints.
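For reference, the classical WF allocation over parallel point-to-point Gaussian channels can be sketched as follows. This is the textbook procedure, not the optimal allocation for [P1]; the function names, the unit-noise normalization, and the bisection search for the water level are choices made for this sketch.

```python
import math


def waterfilling(gains, total_power, iterations=200):
    """Classical waterfilling over parallel Gaussian channels with unit noise variance.

    gains: positive power gains g_i of the sub-channels.
    Returns powers p_i = max(0, mu - 1/g_i) that sum to total_power,
    where the water level mu is located by bisection.
    """
    if total_power <= 0:
        return [0.0 for _ in gains]
    floors = [1.0 / g for g in gains]    # inverse gains act as floor heights
    lo = min(floors)                     # at this level no power is used
    hi = min(floors) + total_power       # here the strongest channel alone uses the budget
    for _ in range(iterations):
        mu = 0.5 * (lo + hi)
        used = sum(max(0.0, mu - f) for f in floors)
        if used > total_power:
            hi = mu
        else:
            lo = mu
    mu = 0.5 * (lo + hi)
    return [max(0.0, mu - f) for f in floors]


def sum_rate(gains, powers):
    """Sum of 0.5 * log2(1 + g_i * p_i) over the sub-channels, in bits per channel use."""
    return sum(0.5 * math.log2(1.0 + g * p) for g, p in zip(gains, powers))
```

With gains `[4.0, 1.0]`, a budget of 0.5 is placed entirely on the stronger sub-channel, whereas a budget of 2 is split as approximately `[1.375, 0.625]`, which illustrates the small-budget and large-budget behavior described above.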
Due to its WF-like property, the optimal allocation assigns the entire power budget ($P$) to only a subset of all the direct and cross channels, depending on channel conditions that indicate whether the direct channels are stronger than the cross channels or vice versa. Moreover, if $P$ is sufficiently increased, it becomes optimal to allocate power to the remaining set of channels. In addition, since the objective of the problem is to maximize $\min\{\Sigma_1, \Sigma_2, \Sigma\}$, the optimal allocation assigns powers in such a way that minimizes or eliminates any difference between $\Sigma_1$, $\Sigma_2$ and $\Sigma$. We observe that, due to this property (the max-min property), the optimal allocation imposes a maximum limit on the cross-channel powers, which is unlike WF in [32], where no such limit exists.
We study the optimal allocation by partitioning the entire set of channel parameters and $P$ into disjoint sets ($\mathcal{S}_{(\cdot)}$), such that the optimal allocation can be classified according to the power levels in the direct and cross channels in each set. Without loss of generality, we present the optimal allocation under $c_{21}^2 > c_{12}^2$ in Table 1, and study it in detail. In this case, only four sets, $\mathcal{S}_D$, $\mathcal{S}_C$, $\mathcal{S}_{CD}$ and $\mathcal{S}_{sat}$, are sufficient. The power allocation under $c_{21}^2 < c_{12}^2$ can be readily obtained from Table 1 by swapping the indices 1 and 2, and the case with $c_{21}^2 = c_{12}^2$ is briefly discussed in Section 5.3. For notational convenience, we express the optimal powers and the conditions of the sets $\mathcal{S}_{(\cdot)}$ in Table 1 in terms of the functions $g_k(P)$, $k \in \{1, 2, 3\}$, defined in (33)-(35), and $P_4^* := \bar{\beta}(\gamma - 1)/c_{12}^2$. We relegate the details of the derivation to Appendix C. In the following, we use "the optimal allocation" and OA$_1$ interchangeably to refer to the optimal power allocation for [P1] under $c_{21}^2 > c_{12}^2$, and discuss some interesting characteristics of OA$_1$.
Table 1. Optimal power allocation and conditions of the sets in OA$_1$. Columns: Set; Optimal Powers; Condition.
First, the condition $g_1(P) < 1$ in $\mathcal{S}_D$ implies that the direct channels are "stronger" than the cross channels, in the sense that, for sufficiently small $P$, the sum-rate achieved from allocating $P$ only to both direct channels is larger than that achieved from any other subset of channels. Thus, following its WF-like property, OA$_1$ allocates $P$ entirely to the direct channels, i.e., $p_1 = p_2 = P/\beta$, and zero power to the cross channels, i.e., $q_1 = q_2 = 0$. This allocation also achieves $\Sigma_2 = \Sigma_1$, which is consistent with the max-min property of OA$_1$.
Second, the condition $g_2(P) < 1$ in $\mathcal{S}_C$ implies that the cross channels are "stronger" than the direct channels, in the sense that, for sufficiently small $P$, the sum-rate achieved from allocating $P$ to both cross channels and a direct channel is larger than that achieved from only the direct channels. Note that, unlike in $\mathcal{S}_D$, OA$_1$ needs to allocate power to a direct channel in addition to both cross channels in $\mathcal{S}_C$, since allocating $P$ entirely to the cross channels causes an imbalance between $\Sigma_1$ and $\Sigma_2$ ($\Sigma_2 < \Sigma_1$) due to $c_{21}^2 > c_{12}^2$, which violates the max-min property. Therefore, OA$_1$ shares $P$ among the cross and direct channels from Tx$_2$ to preserve $\Sigma_2 = \Sigma_1$. Moreover, the condition $P < P_4^*$ ensures that the powers in the cross channels have not yet reached their maximum limits.
Third, as $P$ is increased, OA$_1$ shares $P$ among all channels in $\mathcal{S}_{CD}$. As $P$ increases, the additional benefit from transmitting in a particular subset of channels (either the direct channels as in $\mathcal{S}_D$, or the cross and direct channels as in $\mathcal{S}_C$) begins to diminish. Thus, following its WF-like property, OA$_1$ starts sharing $P$ among all channels. In addition, OA$_1$ shares $P$ in such a way that preserves $\Sigma_1 = \Sigma_2$.
Finally, if $P$ is increased sufficiently, OA$_1$ follows the allocation in $\mathcal{S}_{sat}$, where the cross-channel powers have reached their maximum limits, $q_{k,sat} := (\gamma - 1)/c_{km}^2$, $k \neq m \in \{1, 2\}$. Therefore, as $P$ increases further, all subsequent increments of $P$ are allotted only to the direct channels. We now say that the cross channels are saturated, in the sense that allocating more power beyond these limits does not improve the sum-rate. Such limits for the cross channels in OA$_1$ are unlike the WF allocation in [32].
The cross channels become saturated due to the max-min property of $\mathrm{OA}_1$. Recall that, in $S_{CD}$, $\mathrm{OA}_1$ allocates power to all channels, and these powers increase as $P$ increases. An increase in $p_1$ and $p_2$ increases $\Sigma_1$, $\Sigma_2$, and $\Sigma$ by an equal margin. However, an increase in $q_1$ only increases $\Sigma_2$, and an increase in $q_2$ only increases $\Sigma_1$. Now, note that, due to its max-min property, $\mathrm{OA}_1$ preserves $\Sigma_1 = \Sigma_2$ in $S_D$, $S_C$, and $S_{CD}$. However, there may exist a gap between $\Sigma_1 = \Sigma_2$ and $\Sigma$. As $q_1$ and $q_2$ are increased, this gap reduces and finally becomes zero in $S_{\mathrm{sat}}$. At this point, both cross channels become saturated simultaneously. If any more power is allocated to either $q_1$ or $q_2$, a suboptimal sum-rate $R = \Sigma < \min\{\Sigma_1, \Sigma_2\}$ results. Therefore, $\mathrm{OA}_1$ maintains $q_k = q_{k,\mathrm{sat}}$ and diverts all additional power to the direct channels.
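The saturation limits $q_{k,\mathrm{sat}} = (\gamma - 1)/c_{km}^2$ can be evaluated directly, as in the minimal sketch below. The function name and the value $\gamma = 2.5$ are hypothetical and chosen only for illustration (the sketch assumes $\gamma > 1$ so that the limits are positive); the cross-channel gains are those of the [S2] example.

```python
def cross_saturation_powers(gamma, c12_sq, c21_sq):
    """Return (q1_sat, q2_sat) using q_{k,sat} = (gamma - 1) / c_{km}^2, k != m.

    gamma is taken as an input here; its closed form is not reproduced.
    """
    if gamma <= 1:
        raise ValueError("assumed gamma > 1 so that the limits are positive")
    if c12_sq <= 0 or c21_sq <= 0:
        raise ValueError("cross-channel gains must be positive")
    q1_sat = (gamma - 1.0) / c12_sq  # q_1 is paired with the gain c_12^2
    q2_sat = (gamma - 1.0) / c21_sq  # q_2 is paired with the gain c_21^2
    return q1_sat, q2_sat


# Gains from the [S2] example; gamma = 2.5 is a hypothetical value.
q1_sat, q2_sat = cross_saturation_powers(2.5, c12_sq=1.5, c21_sq=3.0)
print(q1_sat, q2_sat)
```

A direct consequence of the formula is that $c_{12}^2\, q_{1,\mathrm{sat}} = c_{21}^2\, q_{2,\mathrm{sat}} = \gamma - 1$, so the weaker cross channel receives the larger power limit.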
Note that, once in $S_{\mathrm{sat}}$, the sum-rates achieved by joint decoding ($\Sigma_1$ and $\Sigma_2$) become equal to the sum of the interference-free user rates ($\Sigma$), which is somewhat similar to the behavior of the GIC under very strong interference. At this point, all additional increments in $P$ are allocated to the direct channels to increase the individual user rates.
In addition, note that the sets $S_{(\cdot)}$ form a partition due to their construction using the optimal Lagrange multipliers (defined in Appendix C). Thus, the conditions in Table 1 are mutually exclusive.
Moreover, these conditions can be equivalently described in terms of three critical powers, $P_k^* \in \mathbb{R}_+$, defined by $g_k(P_k^*) = 1$, where $g_k(P)$ is defined in (33)-(35) for $k \in \{1, 2, 3\}$. Specifically, the conditions of $S_D$, $S_C$, and $S_{\mathrm{sat}}$ are given by $P < P_1^*$, $P < \min\{P_2^*, P_4^*\}$, and $P > \max\{P_3^*, P_4^*\}$, respectively. In addition, we note that the direct channels are "stronger" than the cross channels, in the sense specified above, if $P_1^* > 0$. Similarly, if $P_2^* > 0$, the cross channels are "stronger" than the direct channels. Furthermore, if $0 < P_3^* < P_4^* < P_2^*$, the cross channels are said to be "much stronger", in the sense that $\mathrm{OA}_1$ continues allocating power to the cross channels as in $S_C$ until they become saturated, and only then assigns power to both direct channels.
We note from the mutual exclusiveness of $S_D$ and $S_C$ that, if $P_1^* > 0$, then no $P_2^* \in \mathbb{R}_+$ exists that satisfies $g_2(P_2^*) = 1$. This shows that, if the direct channels are stronger, the allocation in $S_C$ is suboptimal for any $P > 0$. Similarly, if $P_2^* > 0$, then no $P_1^* \in \mathbb{R}_+$ exists such that $g_1(P_1^*) = 1$, and thus the allocation in $S_D$ is suboptimal for any $P > 0$.
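The threshold description above can be turned into a small lookup, sketched here under stated assumptions: the critical powers are supplied as inputs (the functions $g_k$ are not reproduced), a threshold that does not exist in $\mathbb{R}_+$ is passed as `None`, and $S_{CD}$ is taken to be the remaining case because the sets form a partition. The function name is hypothetical, and $P_4^* = 0.5$ is a hypothetical value, since only $P_2^* = 0.22$ and $P_3^* = 0.97$ are reported for the [S2] example.

```python
def active_set(P, P1, P2, P3, P4):
    """Return the label of the set that the power budget P falls into.

    P1 and P2 may be None when the corresponding threshold does not
    exist in R_+; by mutual exclusiveness they cannot both exist.
    """
    if P1 is not None and P2 is not None:
        raise ValueError("P1* and P2* cannot both exist in R_+")
    if P1 is not None and P < P1:
        return "S_D"            # condition P < P1*
    if P2 is not None and P < min(P2, P4):
        return "S_C"            # condition P < min{P2*, P4*}
    if P > max(P3, P4):
        return "S_sat"          # condition P > max{P3*, P4*}
    return "S_CD"               # remaining case of the partition


# [S2] example: P2* = 0.22 and P3* = 0.97 from the text; P4* = 0.5 is hypothetical.
for P in (0.1, 0.5, 1.2):
    print(P, active_set(P, P1=None, P2=0.22, P3=0.97, P4=0.5))
```

With these inputs, the sweep visits $S_C$, then $S_{CD}$, then $S_{\mathrm{sat}}$ as $P$ grows.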

The Waterfilling-Like Nature of the Optimal Power Allocation
Now, we characterize how $\mathrm{OA}_1$ adapts the power allocation as the power budget $P$ increases and crosses the critical thresholds $P_k^*$, $k \in \{1, 2, 3\}$. We say that $\mathrm{OA}_1$ follows the sequence $A_1 \to A_2 \to A_3$, where $A_l \in \{S_D, S_C, S_{CD}, S_{\mathrm{sat}}\}$, if $\mathrm{OA}_1$ allocates power as in $A_1$ for sufficiently small $P$, and then adapts the powers according to the allocations in $A_2$ and $A_3$ as $P$ increases. In this regard, we note that $\mathrm{OA}_1$ follows one of three sequences, as explained below and illustrated graphically in Figure 3.

[…] as in $S_{CD}$, and thus $q_1$ and $q_2$ also increase with $P$; (iii) finally, when $P > P_3^*$, $\mathrm{OA}_1$ follows $S_{\mathrm{sat}}$, where the cross channels become saturated simultaneously, and all increments of $P$ are added to $p_1$ and $p_2$.
We depict the resulting constraints $\Sigma_1$, $\Sigma_2$, and $\Sigma$ in Figure 4b. First, note that $\mathrm{OA}_1$ preserves $R = \Sigma_1 = \Sigma_2$ for all $P$. However, there exists a gap between $\Sigma_1 = \Sigma_2$ and $\Sigma$ in $S_D$ and $S_{CD}$. Specifically, in $S_D$, the gap remains constant ($A - A_1$); in $S_{CD}$, it reduces gradually as $\mathrm{OA}_1$ transmits in the cross channels; and, in $S_{\mathrm{sat}}$, it becomes zero as $\mathrm{OA}_1$ achieves $R = \Sigma_1 = \Sigma_2 = \Sigma$, as expected.

Next, we illustrate an example of [S2] with the channel gains $d_1^2 = 0.5$, $d_2^2 = 1$, $c_{12}^2 = 1.5$, $c_{21}^2 = 3$, such that the cross channels are stronger than the direct channels in the sense that $P_2^* = 0.22$ and $P_1^* = -0.21 \notin \mathbb{R}_+$. In Figure 5a, we plot the optimal powers against $P$, and observe the following: (i) when $P < P_2^* = 0.22$, $\mathrm{OA}_1$ follows the allocation in $S_C$, and thus $p_2$, $q_1$, and $q_2$ increase with $P$, whereas $p_1 = 0$; (ii) when $P_2^* \le P \le P_3^* = 0.97$, $\mathrm{OA}_1$ allocates power to all channels as in $S_{CD}$, and thus $p_1$ now increases with $P$; (iii) finally, when $P > P_3^*$, $\mathrm{OA}_1$ follows $S_{\mathrm{sat}}$, and thus the cross channels become saturated simultaneously, as expected in [S2]. In Figure 5b, we plot the sum-rate constraints, and note that $\mathrm{OA}_1$ achieves $R = \Sigma_1 = \Sigma_2$ in all the sets. In addition, the gap between $\Sigma_1 = \Sigma_2$ and $\Sigma$ is gradually offset as $\mathrm{OA}_1$ transmits in the cross channels in $S_C$ and $S_{CD}$, and it finally becomes zero in $S_{\mathrm{sat}}$, as expected. We omit an example of [S3] ($S_C \to S_{\mathrm{sat}}$), which is somewhat similar to [S2].

[…] Therefore, $\mathrm{OA}_G$ allocates $p_1 = 0$, $q_1 = P/\beta$, which increases only $\Sigma_2$, and assigns $p_2 = P/\beta$, $q_2 = 0$, which represents the allocation in $\hat{S}_1$.

$\hat{S}_2$: $p_2 = P/\beta$, $q_2 = 0$; $P > P_6^*$, $P > P_7^*$, […]; $p_2 = P/\beta$, $q_2 = 0$; $P > P_5^*$, $P \ge P_8^*$, $f_3(P) > 1$.

Furthermore, as $P$ increases and $P > \max\{P_6^*, P_7^*\}$, the additional benefit from transmitting only in the cross channel with gain $c_{12}^2$ diminishes, and thus $\mathrm{OA}_G$ starts allocating a fraction of $P$ to the direct channel with gain $d_1^2$, due to its WF-like property. Thus, $\mathrm{OA}_G$ follows the power allocation in $\hat{S}_2$, which remains optimal as long as $P < P_8^*$. The power allocation in $\hat{S}_3$ can be interpreted similarly.
The important insight is that $\mathrm{OA}_G$ can be explained by the same WF-like and max-min properties as $\mathrm{OA}_1$, and it does not reveal any new fundamental properties.

Optimum Power Allocation in the Symmetric DCLIC
We briefly discuss the optimum allocation for the symmetric DCLIC, where $a^2 := a_{12}^2 = a_{21}^2$, and the cross and direct mm-wave channel gains are likewise common to both users (denoted by $c^2$ and $d^2$ below). Due to symmetry, considering only symmetric power allocations of the form $(p, q, p, q)$ is sufficient and does not cause loss of generality. Moreover, any feasible (symmetric) power allocation achieves $\Sigma_1 = \Sigma_2$, rendering the constraint $R \le \Sigma_2$ in (28) redundant. The sum-rate optimization problem in this case can be formulated and solved as in [P1], and is omitted here for brevity. In addition, we denote the optimal power allocation for this case by $\mathrm{OA}_S$.
In this case, $\mathrm{OA}_S$ can be described by the power allocation in four disjoint sets, $\tilde{S}_l$, $l \in \{D, C, CD, \mathrm{sat}\}$, which are counterparts of the sets $S_l$ in Table 1. The conditions and optimal powers in these sets are given in Table 4. They can be derived following the same procedure as in Appendix C, and the derivation is omitted here.
where $\gamma := ((1$ […]

$\mathrm{OA}_S$ has the same WF-like and max-min properties as $\mathrm{OA}_1$. In fact, a rudimentary inspection of the conditions reveals that $\mathrm{OA}_S$ follows one of three possible sequences of sets depending on the channel gains, as before:

• If $c^2 \le 2d^2$ (direct channels are "stronger"): $\mathrm{OA}_S$ follows the sequence $\tilde{S}_D \to \tilde{S}_{CD} \to \tilde{S}_{\mathrm{sat}}$. It transmits only in the direct channels as in $\tilde{S}_D$ when $P < P_1^*$, then transmits in all channels as in $\tilde{S}_{CD}$ when $P_1^* \le P \le P_3^*$, and finally starts following the allocation in $\tilde{S}_{\mathrm{sat}}$ when $P > P_3^*$.
• If $2d^2 < c^2 < 2d^2\gamma$ (cross channels are "stronger"): $\mathrm{OA}_S$ follows the sequence $\tilde{S}_C \to \tilde{S}_{CD} \to \tilde{S}_{\mathrm{sat}}$. It transmits only in the cross channels as in $\tilde{S}_C$ when $P < P_2^*$, then transmits in all channels as in $\tilde{S}_{CD}$ when $P^*$ […]

Thus, $\mathrm{OA}_S$ allocates all of $P$ to either both direct or both cross channels if $P$ is small, and then shares $P$ among all channels as $P$ increases. In addition, when $P$ is increased beyond the saturation threshold, the cross channels become saturated. Note that, due to symmetry, $\mathrm{OA}_S$ needs to transmit only in the two cross channels in $\tilde{S}_C$ to preserve $\Sigma_1 = \Sigma_2$, unlike in $S_C$, where $\mathrm{OA}_1$ additionally transmits in a direct channel.
We note the following observations, which are revealed by symmetry. First, if $c^2 < 2d^2$, the direct channels are considered to be stronger than the cross channels. Similarly, if $c^2 > 2d^2$, the cross channels are considered to be stronger. Moreover, in the special case of $c^2 = 2d^2$, neither the direct nor the cross channels are stronger than the other, and thus allocating $P$ entirely to only the direct or only the cross channels would be suboptimal. Therefore, $\mathrm{OA}_S$ transmits in all channels as in $\tilde{S}_{CD}$ for all $P \ge 0$. Finally, the condition $c^2 \ge 2d^2\gamma$ implies that the cross channels are much stronger than the direct channels. Thus, it is optimal to transmit only in the cross channels as in $\tilde{S}_C$ until they become saturated, at which point the allocation in $\tilde{S}_{\mathrm{sat}}$ becomes optimal. Note that these conditions follow trivially from the corresponding conditions of [S1], [S2], and [S3] in Section 5.2 by applying symmetry.
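The symmetric-case conditions above reduce to a comparison of $c^2$ against $2d^2$ and $2d^2\gamma$, which the following minimal sketch encodes. The function name, the labels, and the use of a floating-point tolerance for the boundary case $c^2 = 2d^2$ are assumptions made for illustration; $\gamma > 1$ is taken as an input rather than computed.

```python
import math


def symmetric_regime(c_sq, d_sq, gamma):
    """Classify a symmetric DCLIC by which mm-wave channels are stronger.

    Returns (label, initial_set), where initial_set is the set used by
    the optimal allocation for sufficiently small P.
    """
    if gamma <= 1:
        raise ValueError("assumed gamma > 1")
    if math.isclose(c_sq, 2.0 * d_sq):
        # Special case c^2 = 2 d^2: all channels are used from P >= 0.
        return "balanced", "S_CD"
    if c_sq < 2.0 * d_sq:
        return "direct stronger", "S_D"
    if c_sq < 2.0 * d_sq * gamma:
        return "cross stronger", "S_C"
    # c^2 >= 2 d^2 gamma: cross channels are used alone until saturation.
    return "cross much stronger", "S_C"


# Hypothetical gains, chosen only to hit each regime with gamma = 2.
for c_sq in (1.0, 2.0, 3.0, 5.0):
    print(c_sq, symmetric_regime(c_sq, d_sq=1.0, gamma=2.0))
```

In the two cross-dominated regimes the initial set is the same; they differ in whether the allocation passes through the all-channel set before saturation.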
Furthermore, due to symmetry, one can now directly observe that the conditions of the sets in Table 4 are mutually exclusive. For example, consider the conditions of $\tilde{S}_D$, $c^2 < 2d^2$ and $P < P_1^*$, and note that they violate the second condition of $\tilde{S}_C$ (since $P_2^* < 0$), the first condition in $\tilde{S}_{CD}$ (since $P < P_1^*$), and the second condition in $\tilde{S}_{\mathrm{sat}}$ (which follows trivially from $\gamma > 1$). In Figure 6, we depict the set of cross and direct channel gains ($c^2$ and $d^2$) of a symmetric DCLIC with parameters $a^2 = 1.5$, $Q_1 = 5$, $\alpha = 2$, $\beta = 0.5$, and partition it depending on whether the direct or the cross channels are stronger.

[…] It also has a max-min property and, as a result, imposes a maximum limit on the cross-channel powers. This power allocation evaluates the impact of high-bandwidth point-to-point mm-wave channels on the sum-rate performance, and thus can be useful in allocating resources in practice. Potential future work includes analyzing other multi-user networks that operate with dual microwave and mm-wave bands.

Figure 1. System model of the Gaussian DCLIC, which consists of an underlying GIC in the microwave band and the set of direct channels and cross channels in the mm-wave band.

Figure 6. The set of all $c^2$ and $d^2$ is partitioned depending on whether the cross or the direct channels are "stronger".

$B_k \cap C_l$; $D_1$ Cond.; $D_2$ Cond.; $D_3$ Cond.
[…]$^{2}_{12} - 1)$, are i.i.d.; (c) follows since the first term in (b) is maximized by $X_2 \sim \mathcal{N}(0, Q_2)$, i.i.d.; (d) follows from the chain rule; (e) follows since $Z_3$ are i.i.d., and since unconditioning does not reduce entropy; (f) follows by replacing the i.i.d. RVs, $Z_3$ and $Z_1$, with […]

Table 3. Optimal power allocation and conditions of the "new" sets in $\mathrm{OA}_G$.

Table 4. Optimal power allocation and conditions of sets in $\mathrm{OA}_S$.
[…] $+1$, $k = 1, 2$. Similarly, for the two direct channels, we generate $2^{tnR_k}$ i.i.d. sequences $\bar{U}_k^t(M_k)$, distributed as $\prod_{\ell=1}^{n_2} p(\bar{x}_{k\ell})$, where $\bar{x}_{k\ell} \sim \mathcal{N}(0, \bar{P}_k)$, for $k = 1, 2$, and $\ell = 1, \ldots, n_2$. Thus, to transmit $M_k$, $k = 1, 2$, $U_1^t(M_1)$ and $U_2^t(M_2)$ are transmitted through the CLIC, and $\bar{U}_1^t(M_1)$ and $\bar{U}_2^t(M_2)$ are transmitted through the two direct channels in the second band. Upon receiving the sequences $(Y_k^{nt}, \hat{Y}_k^{n_1 t})$ and $\bar{Y}_k^{n_2 t}$, Rx $k$ employs joint typicality decoding to estimate the transmitted message as in [32] (Chapter 4.3). The probability of decoding error vanishes as $t \to \infty$ if

[…] $\bar{P}_k)$, $k = 1, 2$, (A2)

which matches the upper bound on $R_1$ and $R_2$, as $n_2 \le \alpha_2 n$ and $n \to \infty$. Finally, the capacity region of the CLIC with BMF $\alpha$ is defined as the closure of the union of all sets of achievable rate pairs $(R_1, R_2)$ that satisfy (A2) with $n_2 = 0$, where the union is taken over all product distributions, […]

Table A1. Compatibility of $B_k \cap C_l$ and $D_j$.