Block Diagonal Hybrid Precoding and Power Allocation for QoS-Aware BDMA Downlink Transmissions

Beam Division Multiple Access (BDMA) with hybrid precoding has recently been proposed for multi-user multiple-input multiple-output (MU-MIMO) systems by simultaneously transmitting multiple digitally precoded users’ data-streams via different beams. In contrast to most existing works that assume the number of radio frequency (RF) chains must be greater than or equal to that of data-streams, this work proposes a novel BDMA downlink system by first grouping transmitting data-streams before digitally precoding data group by group. To fully harvest the benefits of this new architecture, a greedy user grouping algorithm is devised to minimize the inter-group interference while two digital precoding approaches are developed to suppress the intra-group interference by maximizing the signal-to-interference-and-noise ratio (SINR) and the signal-to-leakage-and-noise ratio (SLNR), respectively. As a result, the proposed BDMA system requires less RF chains than the total number of transmit data-streams. Furthermore, we optimize the power allocation to satisfy each user’s quality of service (QoS) requirement using the D.C. (difference of convex functions) programming technique. Simulation results confirm the effectiveness of the proposed scheme.


Introduction
To meet the explosive demand for higher user data rates, it is envisioned that the next-generation cellular systems will be equipped with massive antenna arrays. Capitalizing on a large number of antennas at the base station (BS), Beam Division Multiple Access (BDMA) has recently been proposed as a promising method for 5G communications [1][2][3][4]. Different beams are allowed to transmit multiple users' data-streams from BS. In contrast to the more conventional multiple access schemes such as Code Division Multiple Access (CDMA), Time Division Multiple Access (TDMA) or Orthogonal Frequency Multiple Division Access (OFDMA) that multiplex users in code, time and frequency domains, BDMA separates users in the beam space by transmitting data to different users in orthogonal beam directions. BDMA was first proposed in [1] to decompose the multi-user multiple-input multiple-output (MU-MIMO) system into multiple single-user MIMO channels by multiplexing multiple users' data onto non-overlapping beams. Since beamforming is commonly implemented in the analog domain using low-cost phase shifters, BDMA becomes particularly attractive in practice in recent years. Moreover, joint beam selection and user scheduling were formulated under the Lyapunov-drift optimization framework before the optimal user-beam scheduling policy for BDMA was derived in a closed-form [2]. However, the assumption of non-overlapping orthogonal beams is generally difficult to be satisfied in practice. As a result, analog-only BDMA applications are heavily handicapped by the non-orthogonal inter-user interference among beams.
In the meantime, digital precoding has been widely investigated as an effective signal processing method to suppress the inter-user interference for MU-MIMO. It is well known that the classical fully digital precoding requires a dedicated radio frequency (RF) chain for each antenna. However, power consumption and the high hardware cost render the fully digital precoding impractical for massive MIMO systems [5][6][7]. To address this challenge, hybrid digital and analog beamforming has been proposed for massive MIMO transmissions by separating the precoding process into two steps, namely analog and digital precoding [8,9]. More specifically, the transmitted signals are first precoded digitally using a smaller number of RF chains followed by the analog precoder exploited by a much larger number of low-cost phase shifters [10,11]. As a result, the hybrid analog-digital precoding architecture requires significantly fewer RF chains as compared to the fully digital precoding [12]. It has been reported in the literature that the hybrid beamforming structure is capable of achieving performance compared to the fully digital beamforming scheme if the number of RF chains at each end is greater than or equal to twice the number of the data-streams [13]. Therefore, the hybrid precoded massive MU-MIMO system can benefit from the interference suppression supplied by the digital precoding while harvesting large antenna beamforming gains by implementing the massive antennas available in the systems [14]. This hybrid structure is particularly attractive for millimeter wave (mmWave) MIMO systems to support the transmissions of Gbps-order data throughput by exploiting the vast vacant spectrum available at RF of 6 GHz or above [15]. Furthermore, the notion of block diagonal (BD) precoding was first introduced to the conventional fully digital schemes to reduce the precoding complexity in [16]. By dividing the inverse of a large matrix into the inverse of multiple much smaller matrices, the BD precoding can be efficiently exploited with only marginal or no performance degradation as compared to the fully digital precoding [16]. In recent years, the BD design has been extended to the hybrid precoding for MU-MIMO [17]. However, most existing hybrid BD precoding schemes were constructed based on a crucial assumption, i.e., the number of RF chains must be no less than the total number of data-streams to be transmitted. Some pioneering works have investigated to relax this limitation by implementing the state-of-the-art fast-speed phase shifters and switches that can change their states symbol by symbol [18]. However, [19] requires users to resume their symbols via the compressive sensing technique, which makes the scheme impractical for low-complexity receivers.
Meanwhile, power allocation is also an important problem in co-channel interference management for multi-user wireless networks. In many MIMO applications, it is desirable to design a system satisfying the quality of service (QoS) constraint for each user by adjusting the power allocated to different users [20]. Since the objective function is highly non-convex, the problem is usually very difficult and complicated, especially for the coupled analog and digital precoding constraints [21]. Therefore, most existing works maximize the sum-rate capacity by implementing the water-filling algorithm without considering the QoS requirement for each user. For instance, [22] alternatively optimized the power allocation for sum-rate maximization by using the water-filling algorithm, assuming that the analog precoders are strictly orthogonal among distinct users. However, the water-filling algorithm cannot satisfy the per-user QoS constraint as users with poor channel conditions are not allocated any transmit power by the water-filling algorithm. To cope with this problem, the signal-to-interference-and-noise ratio (SINR)-balanced power allocation has been proposed to achieve identical SINR for all users [23][24][25][26]. However, the system performance using the SINR-based power allocation is limited by the user with the worst channel conditions.
In this work, we propose a downlink BDMA scheme empowered by BD digital precoding and global power allocation over multipath channels. Compared to the existing BDMA works [1,2], our proposed BDMA schemes can substantially suppress multi-user interference without requiring perfectly orthogonal beams as residual interference can be greatly removed by digital precoding. Furthermore, in sharp contrast to the conventional hybrid precoding schemes, the proposed scheme can use fewer RF chains than the number of transmit data-streams by exploiting the hybrid BD precoding architecture built upon the state-of-the-art fast-speed phase shifters [27] and switches [18]. Furthermore, an iterative algorithm for power allocation is proposed to satisfy per-user QoS requirement based on the difference of convex functions (D.C.) programming technique.
The main contributions of this paper are summarized as follows: • A block diagonal hybrid precoding scheme is proposed by exploiting the state-of-the-art fast-speed phase shifters and switches. The resulting scheme can use fewer RF chains than the number of transmit data-streams by jointly performing hybrid analog-digital precoding and user-beam grouping. • Furthermore, we develop a greedy grouping algorithm to minimize the inter-group interference while maximizing the intra-group interference. Then, the intra-group interference is eliminated by two proposed digital precoders, namely the SINR-and SLNR-based precoders. • In contrast to most works in the literature that used the single-path channel model, we analyze the sum-rate capacity using the multipath channel model. • Finally, for given analog and digital precoders, an optimized power allocation scheme is derived to satisfy per-user QoS requirement by using the D.C. programming technique.
The rest of this paper is organized as follows: In section 2, we present the block diagonal system model with reduced RF chains before formulating the optimization problem. By allocating the power uniformly to each user, the analog and digital precoders are derived in Section 3. After that, the performance of the proposed system is analyzed in Section 5. Finally, to satisfy the QoS constraint, Section 6 proposes a QoS-aware power allocation algorithm based on the D.C. programming technique followed by extensive numerical results presented in Section 7.
Notation: In this paper, we use uppercase boldface and lowercase boldface letters to denote matrices and vectors, respectively. I N denotes the identity matrix with size N × N. A T and A H denote the transpose and conjugate transpose of A, respectively. A † is the pseudo inverse of A while A stands for the L2 norm of A and |A| denotes the absolute value of A.
[a] i denotes the i-th element of a. |I| is the cardinality of the enclosed set I. X 2 (k) represents the chi-square distribution with k degrees of freedom. A, B is the inner product of A and B. X ← x stands for the addition of element x to set X while X \ x removal of element x from X . Finally, E[·] denotes the expectation of the enclosed random variable.

System Model and Problem Formulation
In this paper, we consider a MU-MIMO downlink system as shown in Figure 1 in which N U users are scheduled for service. N RF RF chains and N T antennas are equipped on BS which transmits N U data-streams to N U receivers with N R receive antennas at each time slot. In a practical MIMO system, the number of RF chains is typically much smaller than the number of antennas, i.e., N RF N T . We assume that only one data stream is designated to each scheduled receiver for transmission. Denoted by s(n) the n-th block of N U data to be transmitted, s(n) has unit power with E ss H = 1 N U I N U . In the sequel, we can concentrate on a single block and omit the temporal index n for notation simplicity.

Transmitter
In our proposed group-by-group BD digital precoding system shown in Figure 1, the N U users are first divided into K groups with the group size being M K , where 0 < M k ≤ N U for k = 1, 2, · · · , K.

It is clear that
Accordingly, the data-streams s can be rewritten in groups as: where s k ∈ C M k ×1 is the data vector transmitted to the users in the k-th group and modeled as: with s k,u being the data transmitted to the u-th user in the k-th group for u = 1, 2, · · · , M k . Next, we focus on modeling the digital precoding process. Denoted by F k of N RF × M k the digital precoder for the k-th group for k = 1, 2, · · · , K, F k can be written as: where f k,u represents the digital precoding vector for the u-th user in the k-th group. Thus, the overall digital precoding matrix can be expressed as a block diagonal matrix as follows: Clearly, inverting a BD matrix is less computationally expensive than a non-BD matrix of the same dimension. Therefore, the BD structure of F in Equation (4) can potentially lead to reduced computational complexity. Similarly, we model the corresponding analog precoder in groups as where V k of N T × N RF , the analog precoder for the k-th group for k = 1, 2, · · · , K, is given by: with v k,u being the analog beamforming vector for the u-th user in the k-th group. Finally, the resulting hybrid precoded signal x ∈ C N T ×1 is transmitted to all N U users.

Channel Models
We denote H k,u ∈ C N R ×N T the MIMO multipath channel matrix between the transmitter and the u-th receiver in the k-th group using the Saleh-Valenzuela model [8]: where L k,u is the total number of multipath components between the transmitter and the u-th user in the k-th group. Furthermore, α k,u,l , θ r k,u,l /φ r k,u,l and θ t k,u,l /φ t k,u,l are the complex path gain, angles of arrival (AoA) and azimuth/elevation angles of departure (AoD) of the l-th path of the u-th user in the k-th group, respectively. Furthermore, a(φ, θ) is the array response vector. Finally, D k,u = N T N R L k,u is a constant parameter. For a uniform planar array (UPA) of size P × Q considered in this work, the array response vector is given by [8]: where κ = 2π λ is the wavenumber and d is the distance between two adjacent antennas.

Receiver
Consequently, we formulate the receiver structure of the u-th user in the k-th group. The received signal is represented by where n k,u is the complex additive white Gaussian noise with zero mean and variance equal to σ 2 k,u . Assuming that the receivers are all low-cost terminals that perform analog beamforming only in decoding, the decoded signal denoted byŝ k,u is given by: where w k,u of length N R is the analog beamforming vector employed by the receiver with the power constraint of w k,u 2 = 1 and Please note that the first term in Equation (11) stands for the desired signal while the second term the sum of inter-and intra-group interference as well as receiver thermal noise.

Group-By-Group Hybrid Precoding
For notational simplicity, we denote by g (j)H k,u the effective analog beamforming gain vector observed by the u-th user in the k-th group from the j-th group for j, k = 1, 2, · · · , K.
Let p be the transmitted power vector, where the power allocated to the u-th user in the k-th group is denoted by p k,u with Then, the resulting channel capacity can be computed as with and σ 2 k,u is the noise power. Subsequently, the system sum-rate capacity can be computed as a function of W, V , F and p: It is worth noting that the digital beamforming vectors can be designed to eliminate user interference for the conventional hybrid beamforming with sufficient RF chains, i.e., In contrast, it can only achieve interference-free asymptotically as N T grows a very large number since the proposed BD precoding scheme requires fewer RF chains, i.e., N RF < N U . Thus, the capacity of the proposed BD precoding scheme is constrained by the residual inter-and intra-group interference in the system. Given K groups, we can derive the optimal analog and block digital precoding matrices by where k = 1, 2, · · · , K and u = 1, 2, · · · , M k in C 1 , C 2 and C 3 . In problem P 1 , C 1 and C 2 confine the analog beamforming vectors to the phase-only structure in transmitter and receiver while C 3 ensures that each precoded signal is of unit power. Furthermore, C 4 and C 5 define the analog and digital precoder, respectively. C 6 constrains the maximum number of data-streams in each group to be within the number of RF chains. Finally, C 7 defines the downlink transmitted power constraint while C 8 guarantees the minimal data rate λ k,u for each user.
The problem P 1 is challenging due to its non-convex and combinatorial nature. Thus, it is analytically intractable to derive a closed-form optimal solution. Instead, we consider a two-stage suboptimal solution: In the first stage, we focus on the analog and digital precoder design while assuming uniform power allocation; After fixing the analog and digital precoders, we derive the QoS-aware optimal power allocation in the second stage.

Proposed Block Hybrid Beamforming for RF Chains Reduction
In this section, we will first ignore the constraints C 7 and C 8 in P 1 by uniformly allocating the power to each user while assuming that the user grouping is given.

Analog Beamforming Design
We begin with the analog beamforming design for both transmitter and receiver. It is well known that distinct array response vectors are asymptotically orthogonal as the number of antennas in an antenna array goes to infinity [1], i.e., However, since the antenna number is finite in practice, the residual interference must be considered in the analog precoding design. Recalling the channel model presented in Equation (8), we can asymptotically orthogonalize the transmitted signals by optimizing the design of w k,u and v k,u : Furthermore, SINR k,u is given by with γ = 1 . The optimal analog beamforming precoder can be straightforwardly found by exhaustively searching in the feasible sets of A T k,u and A R k,u .

Digital Precoder Design
In this section, two digital precoding schemes are proposed to maximize the system sum-rate by minimizing the intra-group interference.

Block Zero-Forcing (Bzf) Scheme
In contrast to the conventional ZF hybrid beamforming scheme [28] that requires N U ≤ N RF , zero-forcing digital precoding scheme is first proposed to transmit data-streams group by group. More specifically, the digital precoder for each block is designed as the inverse of the effective channel of the block: with To satisfy the constraint C 3 in P 1 , power normalization is performed on each f k,u derived from Subsequently, this scheme is referred to as the block zero-forcing (BZF) scheme. It is worth noting that BZF degenerates to [28] if K = 1, i.e., all users are grouped into one independent group. On the other hand, BZF becomes the analog-only BDMA if K = N U , i.e., each user forms one group and only analog beamforming is performed.

Block SLNR Maximization (BSM) Scheme
Instead of the received interference elimination, we can alternatively devise the digital precoder to suppress the interference leakage by maximizing SLNR [29]. More specifically, we denote by P Desired k,u the desired signal power transmitted to the u-th user in the k-th group Furthermore, if we define leakage signal as the transmitted signal that is intended to a specific user but leaked to other users, the leakage signal power from the u-th user in the k-th group can be given as Finally, the SLNR for the u-th user in the k-th group can be formulated as Denoted by f BSM k,u the optimal digital precoder maximizing SLNR, it has been shown that f BSM k,u turns out to be the eigenvector associated with the largest eigenvalue of the following matrix [29]: where Q k,u is the leakage covariance matrix related to the u-th user in the k-th group and given as: In contrast to the conventional hybrid precoding algorithms with complexity O(N 3 U ), the proposed group precoding schemes can reduce the complexity to O(N 3 RF ). From the above derivation, it is apparent that the user grouping algorithm plays an important role on the amount of inter-group interference, and subsequently the system performance. Thus, a heuristic algorithm for user grouping is investigated in the next section.

User Grouping Algorithm
Since the intra-group interference is eliminated by the digital precoding, we will focus on using the user grouping to maximize the intra-group interference while minimizing the inter-group interference. More specifically, we propose to group N U users into K groups with minimal inter-group interference. Since the total number of possible group combinations is large, a greedy grouping algorithm is proposed in Algorithm 1.

Input:
X : the universal group and user index set; a T (φ t x , θ t x ): Array response vector of index x; I k = ∅ : the user index set for the k-th group; k = 1: group index; Initialize I 1 ← x * with x * being the user index of the largest channel gain and X \ x * ; Procedures: 1: while X is not empty do Solve the optimal analog precoder by Equation (21) 4: Let A be the analog precoders of grouped users 5: for x in X do 6: Compute end for 8: Find the user index x * with maximum S(x) 9: Update A ← x * , I k ← x * and X \ x *

11:
if |I k | = N RF then 12: Update k ← k + 1 13: for x in X do 14: Compute end for 16: Find the user index x * with minimum S(x) 17: Update A ← x * , I k ← x * and X \ x * 18:

19: end while
In this algorithm, for k = 1, we first group users who cause most interference to each other into Group 1 detailed in Stage 1. This selection is motivated by the observation that most interference can be eliminated by the digital precoder applied among each group. When the size of Group 1 reaches the number of RF chains, the user whose array response vector is most orthogonal to Group 1 is selected as the first member of Group 2 as shown in Stage 2, which is designed to minimize the inter-group interference. This process repeats until all users are assigned to different groups.
It is worth noting that the grouping problem is NP-hard. The greedy algorithm is proposed to find a suboptimal partition with complexity O(N 2 U ).

Performance Analysis
We first investigate the capacity for the conventional analog-only BDMA scheme.
where I k,u is the received interference represented as Proposition 1. If the optimal analog beamformers are designed as w * k,u = a R (φ r k,u,l * , θ r k,u,l * ) and v * H k,u = a H T (φ t k,u,l * , θ t k,u,l * ), respectively, the following approximation holds: The proof is given in Appendices A and B. From Proposition 1, Equation (31) can be rewritten as where and Capitalizing on the Extreme Value Theory [30,31], we can derive the cumulative distribution function (CDF) of Z as where C = 2N T N R /L k,u . The detailed derivation is shown in Appendix C. The residual interference of distinct beams is negligible as compared to the desired signal. Thus, Y can be upper bounded by where 0 ≤ T ≤ 1 with T being the expected residual interference power between distinct beams. In our proposed system, the beams will be selected and grouped to reduce the residual interference. Clearly, T = 0 if the number of antennas goes to infinity or the steering vectors of different users are strictly orthogonal. In contrast, T = 1 if different users have same AoDs. The value of T can be numerically derived.
Finally, the CDF of the SINR lower bound can be given by The detailed derivation can be found in Appendix D. Using the CDF above, the lower and upper bounds of the sum-rate capacity can be derived as It is analytically intractable to obtain a closed-form solution to Equation (44). We will show the numerical results in simulation section. Please note that the upper bound is achieved if the interference from other users can be eliminated. Furthermore, since the number of transmitter antennas is finite in practice, the analog beamforming vectors shown in Equations (21) and (23) inevitably incur residual inter-user interference. Therefore, digital precoders are required to further suppress the residual interference.

Proposed QoS-Aware Power Allocation Algorithm Based on D.C. Programming
For given analog and digital precoders, we investigate the QoS-aware power allocation p in P 1 by using the D.C. programming technique in this section.
We begin with reformulating P 1 as subject to C 1 : Following the procedures in [32], the problem above can be cast as a D.C. programming problem: For given analog and digital precoders, both f (p) and g(p) are concave in p, i.e., Equation (46) is a D.C. function. Starting from a feasible p (0) , the optimal p (n+1) at the n-th iteration is generated as the optimal solution of a convex problem: which can be efficiently solved by any existing convex programming software, such as CVX [33]. The computational complexity of Equation (47) is O(N 3 RF ) in each iteration [32]. As g(p n ) is concave, its gradient g(p (n) ) is also super-gradient: The proof is given in Appendix E.

Simulation Results
In this section, we will use computer simulation to evaluate the sum-rate performance of the proposed block diagonal digital precoding schemes. Unless specified otherwise, we consider a transmitter equipped with a 12 × 12 UPA (i.e., N T = 144) and N U = 16 users each equipped with an 8 × 8 UPA (i.e., N R = 64). The number of paths is set to L k,u = 4 and the additive Gaussian noise power σ 2 k,u = −30 dBm for each user. We consider the azimuth AoA/AoD's uniformly distributed over [0, 2π] while the elevation AoA/AoD's uniformly distributed over [−π/2, π/2], respectively. For each computer experiment, we compute the average over 100 realizations.
In Figure 2, we first set K = 1, i.e., no grouping. As a result, 16 RF chains are required to support 16 data-streams. As shown in Figure 2, BZF slightly outperforms BSM as it can eliminate more multi-user interference even in multipath environment. It is observed that even in the high SNR regime, BDMA suffers from inter-beam interference and has poor performance. Next, we evaluate the two proposed BD precoding schemes with K = 2 groups and 8 RF chains. The 16 users are grouped into K = 2 groups. The curves labeled as "BZF" and "BSM" stands for the proposed BD precoding schemes where only 8 RF chains are used to transmit 16 data-streams. It is observed that BZF and BSM have comparable performance. Furthermore, the curve labeled as "Conventional Hybrid BF (8 RF Chains)" is the sum-rate for the conventional hybrid beamforming system with 8 RF chains serving 8 users. Finally, BDMA is the analog-only precoding system that has the worst performance. Inspection of Figure 3 reveals that the proposed BZF and BSM have much better sum-rate performance than the conventional hybrid precoding algorithm over the SNR range [−10, 10] dB. When the SNR is larger than 12 dB, the system becomes interference-limited. Thus, the performance of BZF and BSM tends to saturate beyond this point. In Figure 4, we investigate the sum-rate capacity improvement as a function of the number of RF chains while fixing the SNR at 5 dB. The upper bound is the conventional ZF precoding system with 16 RF chains for 16 users. Interestingly, the performance improvement generated by an additional RF chain increases only marginally as the number of RF chains grows from eight to 14.  Next, we vary the number of groups while fixing the total number of users at N u = 16 and SNR = 5 dB. Figure 5 shows that BZF and BSM are lower bounded by BDMA and upper bounded by the conventional ZF system with 16 RF chains. When K = 1, the system degenerates back to the conventional ZF system with N RF = M 1 = 16. On the other hand, if K = 16, the system becomes the conventional BDMA. We then investigate the sum-rate performance as the number of transmit antennas increases. Figure 6 shows that the capacity of BZF and BSM has been significantly increased as the number of transmit antennas increases. This is because that the inter-group interference is asymptotically removed as indicated in Equation (20). Finally, we evaluate the performance of the power allocation generated with the D.C. programming technique. We assume that the minimum QoS threshold for each user is set to 3 bps/Hz. Figures 7 and 8 show the performance achieved by our proposed QoS-aware power allocation algorithm. The curve labeled as "Water-filling Power Allocation" is obtained by allocating user power via the water-filling algorithm without taking into account the QoS requirement. The curve labeled as "QoS-Aware Power Allocation" shows the performance of the proposed power allocation algorithm. Compared to the curve labeled as "Uniform Power Allocation", the proposed algorithm has demonstrated significant advantages in terms of the sum-rate capacity. Furthermore, Figure 8 depicts the CDF of the user data rate. It is evident that all users served by the QoS-aware power allocation satisfy the minimum QoS requirement (i.e., 3 bps/Hz). In contrast, the water-filling-based power algorithm suffers from an outage rate of about 20% where outage is defined as the user data rate being below the minimum required data rate.

Conclusions
In this paper, we have developed block diagonal hybrid precoding schemes with optimized power allocation for mmWave massive MIMO systems by jointly performing hybrid analog-digital precoding and user-beam grouping. The proposed system requires fewer RF chains as compared to the conventional hybrid precoding systems by digitally precoding data-streams group by group. Although the intra-group interference is eliminated by the digital precoding, a greedy grouping algorithm has been derived to minimize the inter-group interference by carefully grouping users with orthogonal beams to different groups. Furthermore, two digital precoding schemes have been proposed to suppress the intra-group interference based on SINR and SLNR, respectively. In addition, the upper and lower bounds of the system sum-rate capacity have been derived based on the multipath channel model. Finally, QoS-aware power allocation has been proposed by using the D.C. programming technique. Simulation results have demonstrated the good performance of the proposed grouped BDMA block diagonal hybrid precoding system. Since analog beamformers are distinct for different receivers, we have I 3 I 2 by Equation (20). Finally, it is straightforward to derive A numerical example is given in Appendix B.

Appendix C. The Derivation for Equation (39)
From Proposition 1, Z can be approximated as Z = max |w * H k,u H k,u v * k,u | 2 , ≈ max |D k,u α k,u,l w * H k,u a R (θ k,u,l )a H T (θ k,u,l )v * k,u | 2 , = max 0<l<L k,u |D k,u α k,u,l | 2 . (A5) Since α k,u,l has the distribution of X 2 (2), the CDF of Z can be derived in a straightforward manner [31]: (43) Recalling Equation (35), the SINR can be represented as

Appendix D. The Derivation for Equation
To derive the CDF of SINR, we must first calculate the CDF of Y. From Proposition 1, Y can be represented as As the AoDs and AoAs are i.i.d for UPA antennas, the expectation can be calculated as where B(·) is given by ∑ p=0 e jκd[p(sin φ k,u sin θ k,u −sin φ j,i sin θ j,i )+q(cos θ k,u −cos θ j,i )] , = (1 − e jκdn(sin φ k,u sin θ k,u −sin φ j,i sin θ j,i ) ) P 1 − e jκdn(sin φ k,u sin θ k,u −sin φ j,i sin θ j,i )

×
(1 − e jκdn(cos θ k,u −cos θ j,i ) ) Q 1 − e jκdn(cos θ k,u −cos θ j,i ) . (A10) As shown in Figure A3, the value of T can be numerically estimated. As the number of antennas increases, the value of T decreases gradually, which confirms Equation (20). For a given CDF of Z in Equation (39), the CDF of SINR can be computed as where we have γ > 0 in the derivation above.