Low Spatial Peak-to-Average Power Ratio Transmission for Improved Energy Efficiency in Massive MIMO Systems

A significant portion of the operating power of a base station is consumed by power amplifiers (PAs). Much of this power is dissipated in the form of heat, as the overall efficiency of currently deployed PAs is typically very low. This is because the structure of conventional precoding techniques typically results in a relatively high variation in output power at different antennas in the array, and many PAs are operated well below saturation to avoid distortion of the transmitted signals. In this work, we use a realistic model for power consumption in PAs and study the impact of power variation across antennas in the array on the energy efficiency of a massive MIMO downlink system. We introduce a family of linear precoding matrices that allow us to control the spatial peak-to-average power ratio by projecting a fraction of the transmitted power onto the null space of the channel. These precoding matrices preserve the structure of conventional precoders; e.g., they suppress multiuser interference when used together with zeroforcing precoding and bring advantages over these precoders by operating PAs in a more power-efficient region and reducing the total radiated distortion. Our numerical results show that by controlling the power variations between antennas in the array and incorporating the nonlinearity properties of PA into the precoder optimization, significant gains in energy efficiency can be achieved over conventional precoding techniques.


Introduction
Massive multiple-input-multiple-output (MIMO) has become the key technology to meet the capacity requirements of fifth-generation (5G) wireless communications and beyond [1]. The basic idea of massive MIMO is to equip the base station with many antennas and serve multiple users over the same time/frequency resources through spatial multiplexing. However, the performance benefits of massive MIMO come from using more hardware, e.g., multiple radio frequency (RF) chains per base station. Therefore, the study and optimization of the total cost and energy efficiency of massive MIMO systems has been an active research area over the last decade [2].
In this context, energy efficiency is typically defined as the number of bits that can be reliably transmitted per unit of power consumed. The authors in [3] consider a model to quantify the power consumed by digital signal processing and analog circuits in a multi-user MIMO system. This model is then used for analyses that provide insight into how to choose the number of antennas at the base station, the number of user equipment (UEs) served, and the transmit power to maximize energy efficiency. A key finding of [3] is that energy-efficient multi-user systems operate in a high signal-to-noise ratio (SNR) region where interference-suppressing processing, such as zeroforcing (ZF) precoding, is strongly preferable to interference-ignoring transmission schemes, such as maximum-ratio transmission (MRT).
A significant portion of the operating energy of a base station is consumed by the power amplifiers (PAs); see, e.g., Reference [4]. Much of this power is dissipated as heat, as the overall efficiency of currently deployed PAs is typically very low. This is because in conventional transmission schemes, many of the PAs are operated well below saturation to avoid distortion of the transmitted signals. Therefore, a reasonable approach to improving energy efficiency is to design the transmission scheme such that the PA power dissipation is minimized. Examples of such energy-efficient transmission schemes are proposed in [5][6][7]. The authors in [5] consider a single-user MIMO channel and propose a transmission power allocation strategy that provides an improved rate for a given consumed power compared to conventional MRT transmission. In [6], it is shown that transmit antenna selection can achieve near-optimal performance (in terms of ergodic capacity) over a multiple-input-single-output (MISO) channel under constraints on the per-antenna and total consumed power. More recently, the authors in [7] extended the analysis of [5,6] to point-to-point and multiuser MIMO scenarios. In the multiuser case, a non-convex optimization problem is formulated and addressed to minimize the total power consumed by the PAs subject to all UEs achieving a certain minimum signal-to-interference-plus-noise ratio (SINR).
The solutions in [5][6][7] do not take into account the effects of the distortion introduced by the nonlinear PAs on the quality of the received signal. The amount and directivity of the nonlinear distortion depend strongly on the covariance matrix of the beamformed signal [8,9]. Motivated by this fact, a precoder optimization framework was proposed in [10] that incorporates the PA nonlinearity properties in maximizing the spectral and energy efficiency.
As a continuation of the work in [10], in this paper, we study the impact of per-antenna power distribution among different antennas in the array on the energy efficiency of a massive MIMO downlink system. We show that using a precoding matrix with a lower spatial peak-to-average power ratio (SPAPR), it is possible to operate all PAs with high power efficiency. Moreover, with such a precoding matrix, the total amount of distortion emanating from nonlinear PAs can be drastically reduced by using a relatively small backoff. We introduce a family of linear precoding matrices generated by adding a term to the expression of conventional precoding matrices and exploiting the excess of spatial degrees of freedom in massive MIMO systems. This term is projected onto the null space of the channel and adjusted such that the resulting precoding matrix provides equal transmit power at different antennas. The beneficial properties of this precoding scheme are achieved at the cost of increased total transmission power and by wasting some of this power by transmitting into null space. Accordingly, transmission with zero power variation can be significantly suboptimal in different scenarios. For this reason, we introduce a framework that allows a more flexible control of SPAPR by jointly optimizing both the precoder matrix and the power variation across the antennas in the array. Finally, we perform numerical simulations to compare the energy efficiency of the proposed precoding solutions with that of conventional ZF precoding.
The remainder of this paper is organized as follows. In Section 2, we introduce the system model and give the formulations used to quantify the spectral and energy efficiency. The impact of power variations between antennas in the array on the spectral and energy efficiency is studied in Section 3. In Section 4, a simple framework for generating precoding matrices with equal antenna powers is presented. This framework is extended to a more general case with flexible control of SPAPR in Section 5. Numerical results are given in Section 6, and the paper is concluded in Section 7.
Notation: Vectors and matrices are denoted by bold lowercase and uppercase letters, respectively. The superscripts $(\cdot)^*$, $(\cdot)^T$, and $(\cdot)^H$ denote the complex conjugate, the transpose, and the Hermitian transpose, respectively. We use $\mathbb{E}[\cdot]$ to denote the expected value. Moreover, $\|\mathbf{a}\|$ is used to denote the $\ell_2$-norm of the vector $\mathbf{a}$. The $M \times M$ identity matrix is denoted by $\mathbf{I}_M$. By $\mathbf{A} \odot \mathbf{B}$, we denote the Hadamard product (entry-wise product) of two equally sized matrices $\mathbf{A}$ and $\mathbf{B}$. Furthermore, $\mathrm{diag}(\mathbf{a})$ represents a diagonal matrix containing the elements of the vector $\mathbf{a}$ on its diagonal, and $\mathrm{diag}(\mathbf{A})$ is the principal diagonal of a square matrix $\mathbf{A}$. The element-wise magnitude of a matrix $\mathbf{A}$ is represented by $|\mathbf{A}|$. The distribution of a circularly symmetric complex Gaussian random vector with covariance matrix $\mathbf{C} \in \mathbb{C}^{M \times M}$ is denoted by $\mathcal{CN}(\mathbf{0}, \mathbf{C})$.

System Model and Preliminaries
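We consider the downlink of a multiuser MIMO transmission where the base station uses a co-located array of $B$ antennas to serve $U$ single-antenna UEs, as depicted in Figure 1. The Gaussian transmitted symbols $\mathbf{s} \in \mathbb{C}^{U}$, with $\mathbb{E}[\mathbf{s}\mathbf{s}^H] = \mathbf{I}_U$, are mapped to the antenna array by the precoding matrix $\mathbf{P} \in \mathbb{C}^{B \times U}$ as
$$\mathbf{x} = \mathbf{P}\mathbf{s}. \tag{1}$$
The signal at each antenna branch $x_b$ passes through a nonlinear PA. The nonlinear characteristics of the PAs are modeled using a $(2K+1)$th-order polynomial as [11]
$$f_b(x_b) = \sum_{k=0}^{K} \beta^{(b)}_{2k+1}\, x_b |x_b|^{2k}, \tag{2}$$
where $\beta^{(b)}_{1}, \beta^{(b)}_{3}, \dots, \beta^{(b)}_{2K+1}$ are complex-valued model parameters corresponding to the $b$th PA, capturing both amplitude-to-amplitude modulation (AM/AM) and amplitude-to-phase modulation (AM/PM) distortions.
Considering the multiuser MIMO system model in Figure 1, the received signal at the $u$th user is given by
$$y_u = \mathbf{h}_u^T f(\mathbf{x}) + w_u, \tag{3}$$
where $\mathbf{h}_u \in \mathbb{C}^{B}$ is the channel vector, $\mathbf{x}$ is the precoded signal as in (1), and $f(\mathbf{x}) = [f_1(x_1), \dots, f_B(x_B)]^T$ denotes the output of the PAs as described in (2). Furthermore, $w_u \sim \mathcal{CN}(0, N_0)$ is the additive white Gaussian noise (AWGN).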

Channel Model
In this paper, we use a channel model that captures the sparse scattering properties of millimeter-wave channels in non-line-of-sight (nLoS) environments, namely when there is no dominant path. In this model, typically referred to as a geometric channel model [12], each scatterer contributes to a single path, and the channel vector $\mathbf{h}_u$ is a superposition of the path contributions $\zeta_{u,\ell}\,\mathbf{a}(\psi_{u,\ell})$, as given in (4) for $u = 1, \dots, U$, where $N_{\text{path}}$ represents the number of paths. Moreover, $\psi_{u,\ell}$ is the angle of departure (AoD) for the $\ell$th path, and $\mathbf{a}(\psi_{u,\ell})$ is the corresponding array response vector. For a uniform linear array (ULA) with half-wavelength antenna element separation, the $b$th entry of the array response vector is given by (5) for $b = 1, \dots, B$. Furthermore, $\zeta_{u,\ell} \sim \mathcal{CN}(0, \gamma_u^2)$ is the independent and identically distributed (i.i.d.) channel gain (including path loss) corresponding to the $\ell$th path. Throughout the paper, we assume that the UEs know the channel coefficients perfectly. However, on the transmitter side, we consider the availability of both perfect and imperfect channel state information (CSI).
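To make the geometric model concrete, the following is a minimal NumPy sketch of how one user's channel vector could be drawn. The phase convention of the array response, the $1/\sqrt{N_{\text{path}}}$ scaling, and the function names `ula_response` and `geometric_channel` are illustrative assumptions and are not taken from (4) and (5).

```python
import numpy as np

def ula_response(psi, num_antennas):
    """Half-wavelength ULA response for an angle of departure psi (radians).

    Assumed convention: entry b equals exp(-1j * pi * (b - 1) * cos(psi)).
    """
    b = np.arange(num_antennas)              # plays the role of b - 1, b = 1..B
    return np.exp(-1j * np.pi * b * np.cos(psi))

def geometric_channel(num_antennas, num_paths, gain_var, rng):
    """One user's channel vector as a sum of num_paths single-scatterer paths."""
    psi = rng.uniform(0.0, np.pi, size=num_paths)        # AoDs in [0, 180) degrees
    zeta = np.sqrt(gain_var / 2) * (
        rng.standard_normal(num_paths) + 1j * rng.standard_normal(num_paths)
    )                                                     # path gains ~ CN(0, gain_var)
    A = np.stack([ula_response(p, num_antennas) for p in psi], axis=1)  # B x N_path
    # Assumed 1/sqrt(N_path) scaling so that E[|h_b|^2] = gain_var.
    return A @ zeta / np.sqrt(num_paths)

rng = np.random.default_rng(0)
h = geometric_channel(num_antennas=32, num_paths=4, gain_var=1.0, rng=rng)
print(h.shape)
```

With only a few paths, the resulting vector is a sum of a few steering vectors, which is what makes the channel sparse in the angular domain.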

A Lower Bound on the Sum Rate Capacity
The nonlinear power amplifiers introduce distortion into the transmitted signal, which can significantly degrade the capacity of the multiuser MIMO system. As a first step in deriving a tractable approximation of the ergodic sum rate capacity of the channel input-output model in (3), we use Bussgang's theorem [13], which allows us to decompose the output of the nonlinear function into a scaled linear signal and an uncorrelated distortion. Since $\mathbf{x}$ follows a circularly symmetric complex Gaussian distribution (which follows from the assumption of Gaussian $\mathbf{s}$), we can decompose $f(\mathbf{x})$ into
$$f(\mathbf{x}) = \mathbf{G}\mathbf{x} + \mathbf{e}, \tag{6}$$
where $\mathbf{G} = \mathrm{diag}(g_1, g_2, \dots, g_B)$ is a diagonal matrix whose entries are the Bussgang gain values given by [14]
$$g_b = \frac{\mathbb{E}[f_b(x_b)\, x_b^*]}{\mathbb{E}[|x_b|^2]}. \tag{7}$$
The distortion term in (6), i.e., $\mathbf{e} \in \mathbb{C}^{B}$, is uncorrelated with $\mathbf{x}$, i.e., $\mathbb{E}[\mathbf{x}\mathbf{e}^H] = \mathbf{0}_{B \times B}$. For the $(2K+1)$th-order polynomial model in (2), the Bussgang gain in (7) can be derived using the moments of complex Gaussian random variables [15] as
$$g_b = \sum_{k=0}^{K} (k+1)!\, \beta^{(b)}_{2k+1}\, \sigma_{x_b}^{2k}, \tag{8}$$
where $\sigma^2_{x_b}$ is the variance of the precoded signal at the $b$th antenna, i.e., $x_b$. Using (1), the matrix $\mathbf{G}$ can be expressed as a function of the precoding matrix $\mathbf{P}$, denoted by $\mathbf{G}(\mathbf{P})$, as given in (9), where $\mathbf{C}_x = \mathbb{E}[\mathbf{x}\mathbf{x}^H] = \mathbf{P}\mathbf{P}^H$ is the input covariance matrix. Using (9) and following a similar approach as in, e.g., Equation (11) of Reference [8], the covariance matrix of the distortion $\mathbf{e}$, denoted by $\mathbf{C}_e(\mathbf{P})$, can be derived in closed form as given in (12) and (13). The linear decomposition described in (6)-(13) allows for the derivation of a lower bound on the ergodic sum rate capacity. Substituting the Bussgang-decomposed transmitted signal in (3) yields
$$y_u = \mathbf{h}_u^T \mathbf{G}\mathbf{p}_u s_u + w_{\text{eff},u}, \qquad w_{\text{eff},u} = \sum_{v \neq u} \mathbf{h}_u^T \mathbf{G}\mathbf{p}_v s_v + \mathbf{h}_u^T \mathbf{e} + w_u, \tag{14}$$
where $\mathbf{p}_u$ is the $u$th column of $\mathbf{P}$ and $w_{\text{eff},u}$ is the effective noise that consists of the inter-user interference, the received nonlinear distortion, and the AWGN terms. Since $w_{\text{eff},u}$ is not Gaussian, the exact evaluation of the capacity is not straightforward.
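Using the so-called "auxiliary channel lower bound" [16] and replacing $w_{\text{eff},u}$ by a complex Gaussian noise $\tilde{w}_{\text{eff},u}$ that has the same variance as $w_{\text{eff},u}$, the achievable sum rate $R_{\text{sum}}(\mathbf{P})$ in (15) is obtained in closed form as a function of $\text{SINDR}_u(\mathbf{P})$, which denotes the signal-to-interference-noise-and-distortion ratio (SINDR) at the $u$th user and is given as
$$\text{SINDR}_u(\mathbf{P}) = \frac{|\mathbf{h}_u^T \mathbf{G}(\mathbf{P})\mathbf{p}_u|^2}{\sum_{v \neq u} |\mathbf{h}_u^T \mathbf{G}(\mathbf{P})\mathbf{p}_v|^2 + \mathbf{h}_u^T \mathbf{C}_e(\mathbf{P})\mathbf{h}_u^* + N_0}. \tag{16}$$
It is worth noting that the lower bound in (15) and (16) corresponds to the ergodic sum rate that can be obtained using a Gaussian codebook and a mismatched nearest-neighbor decoder at the UEs under the assumption that the channel coefficients $\mathbf{h}_u$ are perfectly known to the $u$th UE [17].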
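As an illustration of the Bussgang decomposition, the sketch below checks the gain numerically for a single antenna and a third-order ($K = 1$) polynomial. For $x \sim \mathcal{CN}(0, \sigma^2)$, $\mathbb{E}[|x|^4] = 2\sigma^4$, so the gain reduces to $\beta_1 + 2\beta_3\sigma^2$. The coefficient values and function names are hypothetical and chosen only for the demonstration.

```python
import numpy as np

def pa_output(x, beta1, beta3):
    """Third-order (K = 1) memoryless polynomial PA: beta1*x + beta3*x*|x|^2."""
    return beta1 * x + beta3 * x * np.abs(x) ** 2

def bussgang_gain_closed_form(sigma2, beta1, beta3):
    """g = beta1 + 2*beta3*sigma2, using E|x|^4 = 2*sigma2^2 for x ~ CN(0, sigma2)."""
    return beta1 + 2.0 * beta3 * sigma2

rng = np.random.default_rng(1)
sigma2 = 0.5
beta1, beta3 = 1.0, -0.05 + 0.02j          # hypothetical PA coefficients
n = 200_000
x = np.sqrt(sigma2 / 2) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
y = pa_output(x, beta1, beta3)

# Monte Carlo estimate of E[f(x) x*] / E[|x|^2].
g_mc = np.mean(y * np.conj(x)) / np.mean(np.abs(x) ** 2)
g_cf = bussgang_gain_closed_form(sigma2, beta1, beta3)

# Distortion term; its sample correlation with x is zero by construction of g_mc.
e = y - g_mc * x
corr = np.mean(e * np.conj(x))
print(abs(g_mc - g_cf), abs(corr))
```

The Monte Carlo estimate and the closed form should agree up to sampling error, and the residual `e` plays the role of the distortion term.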

Power Consumption and Energy Efficiency
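Energy efficiency analysis requires careful modeling of the power consumption. In this paper, we focus on quantifying and minimizing the power consumption in the PAs. The power efficiency of the $b$th PA is defined as
$$\eta_b = \frac{\rho^{(b)}_{\text{tx}}}{\rho^{(b)}_{\text{cons}}}, \tag{17}$$
where $\rho^{(b)}_{\text{tx}}$ denotes the output power, given by
$$\rho^{(b)}_{\text{tx}} = \mathbb{E}\big[|f_b(x_b)|^2\big], \tag{18}$$
and $\rho^{(b)}_{\text{cons}}$ represents the consumed power. Similarly to [5][6][7][8], we adopt the following simple yet accurate model for the power efficiency:
$$\eta_b = \eta_{\max} \sqrt{\frac{\rho^{(b)}_{\text{tx}}}{\rho^{(b)}_{\max}}}. \tag{19}$$
Here, $\rho^{(b)}_{\max}$ is the maximum output power of the $b$th PA, and $\eta_{\max}$ is the maximum power efficiency, which is attained when $\rho^{(b)}_{\text{tx}} = \rho^{(b)}_{\max}$. Using (17) and (19), the consumed power at the $b$th PA can be expressed as
$$\rho^{(b)}_{\text{cons}} = \frac{1}{\eta_{\max}} \sqrt{\rho^{(b)}_{\text{tx}}\, \rho^{(b)}_{\max}}. \tag{20}$$
In other words, the consumed power is proportional to the square root of the output power, which was experimentally verified in various studies; see, e.g., Equation (6.93) in Reference [18].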
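The achievable sum rate in (15) and the power consumption model in (20) facilitate quantification of an energy efficiency metric, measured in bits per Joule, as
$$\eta_{\text{EE}}(\mathbf{P}) = \frac{W\, R_{\text{sum}}(\mathbf{P})}{\rho^{(\text{tot})}_{\text{cons}}(\mathbf{P})}, \tag{21}$$
where $W$ denotes the bandwidth and $\rho^{(\text{tot})}_{\text{cons}}$ is the total consumed power, which can be evaluated using (20) as
$$\rho^{(\text{tot})}_{\text{cons}} = \sum_{b=1}^{B} \rho^{(b)}_{\text{cons}}. \tag{22}$$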
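A short numerical sketch of this power model may help. It evaluates the square-root efficiency law for two per-antenna power profiles with the same peak: one with unequal powers and one with all PAs at their maximum output. The function names, the value $\eta_{\max} = 0.55$ (borrowed from the example discussed with Figure 3b), and the power profiles are assumptions made for illustration.

```python
import numpy as np

def pa_efficiency(p_tx, p_max, eta_max):
    """Efficiency model: eta = eta_max * sqrt(p_tx / p_max)."""
    return eta_max * np.sqrt(np.asarray(p_tx) / p_max)

def pa_consumed_power(p_tx, p_max, eta_max):
    """Consumed power: p_cons = sqrt(p_tx * p_max) / eta_max."""
    return np.sqrt(np.asarray(p_tx) * p_max) / eta_max

def energy_efficiency(sum_rate, bandwidth, p_tx, p_max, eta_max):
    """Bits per Joule: bandwidth [Hz] * sum rate [bit/s/Hz] / total consumed power [W]."""
    return bandwidth * sum_rate / np.sum(pa_consumed_power(p_tx, p_max, eta_max))

p_max, eta_max = 1.0, 0.55                        # illustrative values
unequal = np.array([1.0, 0.5, 0.2, 0.1]) * p_max  # strongest antenna at saturation
equal = np.full(4, p_max)                         # all antennas at saturation

def aggregate_efficiency(p_tx):
    """Total output power divided by total consumed power."""
    return np.sum(p_tx) / np.sum(pa_consumed_power(p_tx, p_max, eta_max))

print(aggregate_efficiency(unequal), aggregate_efficiency(equal))
```

Under this model, the aggregate efficiency reaches $\eta_{\max}$ only when every PA operates at its maximum output power; any antenna that is backed off lowers the aggregate value.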

The Impact of SPAPR on Spectral and Energy Efficiency
As can be seen from the metrics formulated in Sections 2.2 and 2.3, the choice of the precoder matrix $\mathbf{P}$ not only affects the spectral efficiency by controlling the useful signal power, multiuser interference, and distortion, but it also directly affects the power efficiency of the PAs and the overall energy efficiency of the system. In conventional precoding schemes, the structure of $\mathbf{P}$ is such that there is considerable variation in the average output power between antennas in the array at each coherence interval. An example of such power variation can be seen in Figure 2a, and the corresponding PA efficiencies are shown in Figure 2b. It can be seen that as a consequence of the power variation across the antennas, many of the PAs exhibit poor efficiency. Furthermore, as shown in Figure 2c, the contributions of the individual PAs to the nonlinear distortion differ widely. More specifically, it can be seen that small backoffs can considerably reduce the amount of distortion. This is because the covariance of the distortion (12) is a function of $\mathbf{C}_x \odot |\mathbf{C}_x|^{2k}$, where $\mathbf{C}_x = \mathbf{P}\mathbf{P}^H$, and thus the distortion decreases faster than linearly as the per-antenna powers are reduced.
Motivated by the observations made in Figure 2, in what follows, we seek precoding matrices that yield a favorable tradeoff between total power consumption and total radiated distortion. To this end, we propose precoding schemes with reduced power imbalance over different antennas. It is worth noting that our proposed solutions are different from precoding techniques with low peak-to-average power ratio (see, e.g., References [19,20]), which aim at reducing the power variation in the temporal dimension. They also differ from the solutions proposed in [21], where the power variation between antennas is reduced by precoding at the symbol level. Instead, we focus on block-level precoding schemes, which is the common approach in this context.
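The power imbalance discussed above can be quantified directly from a precoding matrix. The sketch below assumes that the per-antenna power is the squared norm of the corresponding row of $\mathbf{P}$ (the diagonal of $\mathbf{P}\mathbf{P}^H$) and that the SPAPR is the ratio of the largest per-antenna power to the mean; this definition, the i.i.d. Rayleigh channel, and the function names are illustrative assumptions.

```python
import numpy as np

def per_antenna_power(P):
    """Diagonal of P P^H: squared norm of each row of the precoder."""
    return np.sum(np.abs(P) ** 2, axis=1)

def spapr_db(P):
    """Assumed SPAPR definition: peak per-antenna power over the mean, in dB."""
    p = per_antenna_power(P)
    return 10.0 * np.log10(p.max() / p.mean())

rng = np.random.default_rng(2)
B, U = 32, 4
# U x B channel matrix with rows h_u^T (i.i.d. Rayleigh, for illustration only).
H = (rng.standard_normal((U, B)) + 1j * rng.standard_normal((U, B))) / np.sqrt(2)
P_zf = np.linalg.pinv(H)                    # B x U zero-forcing precoder

print(spapr_db(P_zf))
```

A precoder with equal antenna powers has an SPAPR of 0 dB under this definition, whereas a conventional ZF precoder generally has a strictly positive value.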

Precoding with Equal Antenna Powers
In this section, we present a family of linear precoding schemes with equal transmit powers from different antennas in the array. To this end, we propose exploiting the excess of spatial degrees of freedom in massive MIMO transmission to project part of the transmit power onto the null space of the channel. In particular, the precoding matrices with equal antenna powers (EAP) can be generated as
$$\mathbf{P}_{\text{EAP}} = \alpha \left( \mathbf{P}_{\text{conv}} + \mathbf{P}_{\perp}\mathbf{Q} \right), \tag{23}$$
where $\mathbf{P}_{\text{conv}}$ is the precoding matrix corresponding to a conventional precoding scheme, such as ZF or MRT, and $\mathbf{P}_{\perp}$ is the orthogonal projection onto the null space of the channel matrix $\mathbf{H} = [\mathbf{h}_1, \dots, \mathbf{h}_U]^T$, given by
$$\mathbf{P}_{\perp} = \mathbf{I}_B - \mathbf{H}^{\dagger}\mathbf{H}, \tag{24}$$
where $(\cdot)^{\dagger}$ stands for the pseudo-inverse operation, which, assuming that $\mathbf{H}$ has linearly independent rows, is defined as $\mathbf{H}^{\dagger} = \mathbf{H}^H(\mathbf{H}\mathbf{H}^H)^{-1}$. Moreover, $\mathbf{Q}$ is an arbitrary matrix, which will be used to guarantee the EAP condition, as explained below. Finally, $\alpha$ in (23) stands for the normalization factor, which is introduced such that the following two constraints are satisfied for different choices of precoding matrix:
1. The per-antenna power constraint (PAPC), given by (25) for $b = 1, \dots, B$, where $x_b$ is the precoded signal input to the $b$th PA and can be obtained by multiplying the $b$th row of the precoding matrix $\mathbf{P}$ by the transmitted symbols $\mathbf{s}$.
2. The total radiated power constraint (TRPC), given by (26).

We now introduce a simple approach for finding $\mathbf{Q}$ such that the EAP transmission is realized. In particular, we start by generating a matrix $\tilde{\mathbf{P}}$ without power variation across the antennas by normalizing the rows of $\mathbf{P}_{\text{conv}}$ such that all rows have the same norm, as in (27), for all $b = 1, \dots, B$. We then calculate the difference between $\tilde{\mathbf{P}}$ and $\mathbf{P}_{\text{conv}}$ as
$$\tilde{\mathbf{Q}} = \tilde{\mathbf{P}} - \mathbf{P}_{\text{conv}}, \tag{28}$$
and find the matrix $\mathbf{Q}$ such that $\tilde{\mathbf{Q}} = \mathbf{P}_{\perp}\mathbf{Q}$, namely, by calculating
$$\mathbf{Q} = \mathbf{P}_{\perp}^{\dagger}\tilde{\mathbf{Q}}. \tag{29}$$
With a few iterations of (28) and (29), a precoder matrix is found that yields an almost zero power variation. The reason why we need multiple iterations is the numerical imprecision in the computation of $\mathbf{P}_{\perp}\mathbf{P}_{\perp}^{\dagger}\tilde{\mathbf{Q}} \approx \tilde{\mathbf{Q}}$. In particular, the differences between $\mathbf{P}_{\perp}\mathbf{P}_{\perp}^{\dagger}\tilde{\mathbf{Q}}$ and $\tilde{\mathbf{Q}}$ in the first iteration lead to some disturbances in the per-antenna powers, which can be removed with some iterations of (28) and (29). A final normalization step to satisfy the TRPC in (26) gives the final solution for $\mathbf{P}_{\text{EAP}}$. The precoding scheme in (23) preserves the structure of the conventional precoding matrix $\mathbf{P}_{\text{conv}}$. For example, if $\mathbf{P}_{\text{conv}}$ is substituted by the ZF precoder, the resulting precoding matrix $\mathbf{P}_{\text{EAP}}$ also suppresses the multiuser interference, as the remaining part of the signal is projected onto the null space of the channel.
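The iterative procedure above can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions rather than the authors' implementation: the rows are rescaled to the mean row norm in each iteration, the TRPC is taken as a constraint on $\|\mathbf{P}\|_F^2$, the channel is i.i.d. Rayleigh, and the names `null_space_projector` and `eap_precoder` are hypothetical. Because $\mathbf{P}_{\perp}$ is an orthogonal projector, its pseudo-inverse equals itself, so $\mathbf{P}_{\perp}\mathbf{P}_{\perp}^{\dagger}\tilde{\mathbf{Q}}$ is computed as $\mathbf{P}_{\perp}\tilde{\mathbf{Q}}$.

```python
import numpy as np

def null_space_projector(H):
    """P_perp = I_B - pinv(H) @ H for a U x B channel matrix with full row rank."""
    B = H.shape[1]
    return np.eye(B) - np.linalg.pinv(H) @ H

def eap_precoder(H, P_conv, total_power=1.0, num_iter=200):
    """Iterate row normalization and null-space projection, then scale by alpha."""
    P_perp = null_space_projector(H)
    P = P_conv.copy()
    for _ in range(num_iter):
        norms = np.linalg.norm(P, axis=1, keepdims=True)
        P_tilde = P * (norms.mean() / norms)     # equal row norms (assumed target)
        Q_tilde = P_tilde - P_conv               # difference to the conventional precoder
        # For an orthogonal projector, P_perp @ pinv(P_perp) @ Q_tilde = P_perp @ Q_tilde.
        P = P_conv + P_perp @ Q_tilde
    alpha = np.sqrt(total_power) / np.linalg.norm(P)   # assumed TRPC: ||P||_F^2 = total_power
    return alpha * P, alpha

rng = np.random.default_rng(3)
B, U = 32, 4
H = (rng.standard_normal((U, B)) + 1j * rng.standard_normal((U, B))) / np.sqrt(2)
P_zf = np.linalg.pinv(H)                         # conventional ZF precoder
P_eap, alpha = eap_precoder(H, P_zf)

row_power = np.sum(np.abs(P_eap) ** 2, axis=1)
print(row_power.max() / row_power.min())         # close to 1 when the iterations converge
print(np.round(np.abs(H @ P_eap) / alpha, 3))    # identity: interference stays suppressed
```

Since only null-space components are added to the ZF precoder, the product $\mathbf{H}\mathbf{P}_{\text{EAP}}$ remains a scaled identity regardless of how many iterations are run; the iterations only affect how evenly the power is spread over the antennas.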
In Figure 3a, we plot the per-antenna power values for P EAP obtained by replacing P conv with the ZF precoding in Figure 2. The fractions of power transmitted in the direction of the UEs and the null space of the channel are shown in blue and white, respectively. The corresponding PA efficiencies are shown in Figure 3b and are compared with the PA efficiencies in conventional ZF precoding. The precoding scheme with EAP in (23) operates the PAs with their highest efficiency, namely with η b = 0.55 for all b = 1, . . . , B in this particular example. Moreover, in precoding with EAP, unlike conventional ZF, where different PAs contribute very differently to the distortion (see Figure 2c), different PAs contribute almost equal amounts of nonlinear distortion, and therefore the total amount of distortion can be drastically reduced by small backoffs. However, this is achieved at the cost of increased total radiated power and by wasting some of this power by transmitting in the null space of the channel. Nevertheless, we show in Section 6 that precoding with EAP can lead to improved energy efficiency over conventional precoding schemes in certain output power ranges.

SPAPR-Controlled Distortion-Aware Precoding
Despite its attractive properties, the solution proposed in Section 4 can be significantly suboptimal in different scenarios, since a large amount of power is wasted when transmitting along the null space of the channel. Moreover, conventional precoding schemes such as ZF do not take into account the effects of nonlinear distortion and thus may provide poor performance in distortion-limited regimes. Motivated by these observations, in this section, we present a more advanced precoding technique that provides more flexible control over the power variation across antennas and further considers the effects of nonlinear distortion. In particular, we extend the solution in (23) by considering the following structure:
$$\mathbf{P}_{\text{SPDA}} = \alpha \left( \mathbf{P}_{\text{DA}} + \kappa\, \mathbf{P}_{\perp}\mathbf{Q} \right), \tag{30}$$
where the conventional precoder $\mathbf{P}_{\text{conv}}$ in (23) is replaced with a distortion-aware precoder $\mathbf{P}_{\text{DA}}$ and the parameter $\kappa$ is introduced for a more flexible control of the power variation across the antennas. Next, we present a two-step optimization procedure for finding $\mathbf{P}_{\text{DA}}$ and $\kappa$. In the first step, similarly to Algorithm 1 in [10], we adopt a projected gradient ascent to solve the optimization problem in (31). In particular, we seek a precoding matrix that maximizes energy efficiency under the TRPC and the PAPC, where the per-antenna power is additionally bounded from below by $\rho_{\min}$. Here, $\rho_{\min}$ is introduced for two purposes. First, it prevents the algorithm from converging to a precoding matrix $\mathbf{P}_{\text{DA}}$ with very low $\rho^{(\text{tot})}_{\text{cons}}$ and $R_{\text{sum}}$ values. Second, constraining the per-antenna power from below means that the optimization solutions inherently exhibit less power variation across the antennas in the array, which can be favorable in light of the analysis in Sections 3 and 4.

Algorithm 1
Steps for computing $\mathbf{P}_{\text{DA}}$ [10].

To solve (31), similarly to Algorithm 1 in [10], we start with an initial precoding matrix $\mathbf{P}^{(0)}$ and update it by taking steps along the steepest ascent direction of the objective function $\eta_{\text{EE}}(\mathbf{P}) = R_{\text{sum}}(\mathbf{P}) / \rho^{(\text{tot})}_{\text{cons}}(\mathbf{P})$. The resulting precoding matrix is then normalized to ensure the feasibility of the solution. This procedure can be formulated as in (32) [22], where $i = 1, \dots, I$ is the iteration index, $I$ is the maximum number of iterations, $\mu^{(i)}$ is the step size of the $i$th iteration, and $[\cdot]^{+}$ denotes the normalization of the updated precoding matrix such that the two power constraints are satisfied. Once this normalization is applied to obtain the candidate $\tilde{\mathbf{P}}$, the objective function $\eta_{\text{EE}}(\tilde{\mathbf{P}})$ is calculated, and if $\eta_{\text{EE}}(\tilde{\mathbf{P}}) > \eta_{\text{EE}}(\mathbf{P}^{(i-1)})$, we update the precoding matrix to $\mathbf{P}^{(i)} = \tilde{\mathbf{P}}$ and reset the step size to $\mu^{(i)} = \mu^{(0)}$. Otherwise, we do not update the precoding matrix, i.e., $\mathbf{P}^{(i)} = \mathbf{P}^{(i-1)}$, and halve the step size, $\mu^{(i)} = \frac{1}{2}\mu^{(i-1)}$. Finally, we choose $\mathbf{P}_{\text{DA}} = \mathbf{P}^{(I)}$ as the output of the algorithm. The steps required for computing $\mathbf{P}_{\text{DA}}$ are summarized in Algorithm 1.

Now, for a fixed $\mathbf{P}_{\text{DA}}$, we obtain the matrix $\mathbf{Q}$ in (30) using a similar procedure as described in Section 4 and then find the optimal value for $\kappa$ by conducting a simple line search over the interval $[0, 1]$ with the goal of maximizing $R_{\text{sum}}(\mathbf{P}_{\text{SPDA}}) / \rho^{(\text{tot})}_{\text{cons}}(\mathbf{P}_{\text{SPDA}})$. The value $\kappa = 1$ corresponds to the special case of EAP precoding, similar to the solution in (23). The optimization over $\kappa$ provides additional control over the SPAPR value, yielding a better tradeoff between the total consumed power and the amount of radiated distortion.
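The accept, reset, and halve rule for the step size described above can be sketched generically as follows. The objective used here is a toy stand-in (a concave quadratic with a Frobenius-norm constraint), not the energy efficiency metric, and the names `ascent_with_step_reset`, `objective`, `ascent_dir`, and `normalize` are hypothetical; the sketch is only meant to show the control flow of the update.

```python
import numpy as np

def ascent_with_step_reset(objective, ascent_dir, normalize, P0, mu0=0.1, num_iter=100):
    """Projected ascent: accept improving steps and reset mu, otherwise halve mu."""
    P, mu = P0, mu0
    best = objective(P)
    for _ in range(num_iter):
        P_cand = normalize(P + mu * ascent_dir(P))   # step, then enforce the constraint
        val = objective(P_cand)
        if val > best:
            P, best, mu = P_cand, val, mu0           # accept and reset the step size
        else:
            mu = 0.5 * mu                            # reject and halve the step size
    return P, best

# Toy stand-in problem (assumption): maximize -||P - T||_F^2 subject to ||P||_F <= 1.
rng = np.random.default_rng(4)
T = rng.standard_normal((8, 2)) + 1j * rng.standard_normal((8, 2))
T *= 2.0 / np.linalg.norm(T)                         # target with Frobenius norm 2

objective = lambda P: -np.linalg.norm(P - T) ** 2
ascent_dir = lambda P: 2.0 * (T - P)                 # steepest ascent direction
normalize = lambda P: P / max(1.0, np.linalg.norm(P))

P0 = normalize(rng.standard_normal((8, 2)) + 0j)
P_opt, val = ascent_with_step_reset(objective, ascent_dir, normalize, P0)
print(val)                                           # approaches -1, attained at T / 2
```

In the actual problem, `objective` would evaluate $R_{\text{sum}}/\rho^{(\text{tot})}_{\text{cons}}$ and `normalize` would enforce both the TRPC and the PAPC. The same accept-or-reject structure also covers the line search over $\kappa$, which only requires evaluating the objective on a grid in $[0, 1]$.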

Numerical Results
In this section, we perform numerical simulations to compare the performance of the solutions presented in Sections 4 and 5 with the performance of conventional ZF precoding, assuming both perfect and imperfect CSI at the transmitter (CSIT). In our simulation setup, we assume that all base station antennas are equipped with identical PAs whose nonlinearity characteristics are modeled by (2) with K = 1, β . The maximum total radiated power is ρ tot = 45 dBm, and the variance of the AWGN is set to N 0 = −82 dBm.
The channel coefficients are generated according to (4) and (5), where we set $N_{\text{path}} = 4$ and assume that the AoD $\psi_{u,\ell}$ is uniformly distributed over the interval $[0^{\circ}, 180^{\circ})$. Moreover, we adopt the nLoS path loss model presented in [24] and, assuming that the system operates at a carrier frequency $f_c = 28$ GHz, calculate the path loss for user $u$ (at a distance of $d_u$ meters) using
$$\gamma_u^2 = -72 - 29.2 \log_{10}(d_u) \ \text{[dB]}. \tag{33}$$
We further assume that the UEs are uniformly distributed in a disk-shaped area with the base station at its center. The minimum and maximum distances from the base station are set to $d_{\min} = 5$ m and $d_{\max} = 35$ m, respectively. In this setting, the average path loss is approximately $\gamma^2_{\text{avg}} = -108.5$ dB, which corresponds to a user at a distance of 17.8 m from the base station.
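The user drop and path loss computation can be sketched with the standard library alone. The log-distance coefficients (72 dB intercept, 29.2 dB per decade) are an assumption, chosen because they reproduce the quoted value of 108.5 dB at 17.8 m; the area-uniform placement in an annulus between $d_{\min}$ and $d_{\max}$ and the function names are likewise illustrative.

```python
import math
import random

def nlos_path_loss_db(distance_m):
    """Assumed 28 GHz nLoS log-distance model: 72 + 29.2 * log10(d) in dB."""
    return 72.0 + 29.2 * math.log10(distance_m)

def channel_gain_variance(distance_m):
    """Linear-scale gain variance gamma_u^2 corresponding to the path loss."""
    return 10.0 ** (-nlos_path_loss_db(distance_m) / 10.0)

def draw_user_distance(d_min, d_max, rng):
    """Distance of a user placed uniformly (by area) between d_min and d_max."""
    u = rng.random()
    return math.sqrt(d_min ** 2 + u * (d_max ** 2 - d_min ** 2))

rng = random.Random(0)
distances = [draw_user_distance(5.0, 35.0, rng) for _ in range(4)]
gains = [channel_gain_variance(d) for d in distances]
print(round(nlos_path_loss_db(17.8), 1))
```

The square root in `draw_user_distance` accounts for the fact that the area of a ring grows linearly with its radius, so users are more likely to be far from the base station than close to it.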

Performance Analysis under Perfect CSIT Assumption
We first consider a setup with $B = 32$ antennas and $U = 4$ UEs. Assuming that perfect CSI is available at the transmitter and the UEs, we evaluate in Figure 4 the average sum rate (evaluated using (15)) versus the average total consumed power (computed using (22)) for three different precoding schemes, namely $\mathbf{P}_{\text{SPDA}}$ in (30), ZF with EAP in (23), and conventional ZF, by sweeping the total radiated power. It can be seen that ZF with EAP has worse energy efficiency than conventional ZF at the lower radiated power values. This is because in this regime, the performance degradation due to the power wasted in transmission along the null space outweighs the benefits that can be obtained by EAP transmission. However, at higher transmit powers, because the PAs operate with higher power efficiency and introduce a smaller amount of nonlinear distortion, ZF with EAP outperforms conventional ZF precoding, resulting in a higher maximum achievable sum rate (marked by circles in Figure 4). For instance, in Table 1, we compute the average received useful signal power $\mathbb{E}\big[|\mathbf{h}_u^T \mathbf{G}(\mathbf{P})\mathbf{p}_u|^2\big]$ and the average received nonlinear distortion power $\mathbb{E}\big[\mathbf{h}_u^T \mathbf{C}_e(\mathbf{P})\mathbf{h}_u^*\big]$ for a total average consumed power of 15.25 dBW (this is the point where ZF achieves its maximum achievable sum rate as depicted in Figure 4). It can be seen that ZF with EAP results in about 4.5 dB less distortion than ZF at the price of about 1 dB less average useful signal power, improving the overall signal-to-distortion ratio by about 3.5 dB. Figure 4 also shows that the SPAPR-controlled distortion-aware precoding yields superior performance compared to ZF with EAP and conventional ZF. This improved performance is due to the consideration of the impact of nonlinear distortion in the precoder optimization procedure, as well as the mechanism for more flexible control of the SPAPR via optimization of the parameter $\kappa$.
It is worth noting that the performance of the SPAPR-controlled distortion-aware precoding can be further improved by using a more sophisticated optimization to find the matrix $\mathbf{Q}$ instead of the procedure described in Section 4. However, this extension is beyond the scope of this paper and will be left to future work.

The results in Figure 4 were obtained assuming perfect CSIT, but this assumption is not generally valid in practice. To investigate the impact of the channel estimation error on the performance of the proposed transmission schemes, we model the estimated CSIT as in (34), where the channel estimation error is modeled as an additive independent random error term. In particular, in (34), $\hat{\mathbf{h}}_u$ and $\mathbf{h}_u$ denote the estimated and the actual channel for the $u$th user, respectively. Moreover, $\tau \in [0, 1]$ is a parameter reflecting the accuracy of the channel estimation, and the elements of the CSI error $\mathbf{v}$ are distributed according to $\mathcal{CN}(0, \sigma^2_{h_u})$. In the following, we consider three different scenarios with different channel estimation accuracies. More precisely, we consider $\tau = 0$, which corresponds to the assumption of perfect CSIT, $\tau^2 = 0.01$, which is an example of a case with low to moderate channel estimation errors, and $\tau^2 = 0.1$, which reflects a scenario with poor channel estimation accuracy. For these scenarios, in Figure 5, we plot the cumulative distribution function (CDF) of the maximum achievable sum rate for ZF with EAP and conventional ZF for a setup with $B = 64$ antennas and $U = 4$ UEs, over realizations of the geometric channel described in (4). As expected, both transmission schemes suffer from performance degradation in scenarios with imperfect CSIT, since precoding matrices computed using an erroneous channel lead to multiuser interference.
More importantly, it can be seen that, while P EAP can still provide a performance gain over the conventional ZF transmission in low to moderate channel estimation errors, this gain vanishes in the scenarios with large τ values (e.g., τ 2 = 0.1). This is because with imperfect CSIT, the computed P ⊥ matrices in (23) are not perfectly orthogonal to the channel, leading to additional interference in the direction of the UEs. This additional interference can therefore negate the improvements that can be obtained by transmitting with the equal antenna powers. Extending the proposed solutions to achieve additional robustness to channel estimation errors is a subject of future studies.
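The leakage mechanism described above can be illustrated numerically. The sketch below assumes the commonly used error model $\hat{\mathbf{h}}_u = \sqrt{1 - \tau^2}\,\mathbf{h}_u + \tau \mathbf{v}$, which is consistent with the roles of $\tau$ and $\mathbf{v}$ described for (34) but is not taken from it; the empirical per-user channel power is used as a stand-in for $\sigma^2_{h_u}$, the channel is i.i.d. Rayleigh, and the function names are hypothetical.

```python
import numpy as np

def estimate_channel(H, tau, rng):
    """Assumed CSI error model: H_hat = sqrt(1 - tau^2) * H + tau * V."""
    sigma2 = np.mean(np.abs(H) ** 2, axis=1, keepdims=True)   # stand-in for sigma_{h_u}^2
    V = np.sqrt(sigma2 / 2) * (
        rng.standard_normal(H.shape) + 1j * rng.standard_normal(H.shape)
    )
    return np.sqrt(1.0 - tau ** 2) * H + tau * V

def null_space_leakage(H, H_hat):
    """Frobenius norm of H @ P_perp, with P_perp computed from the estimate H_hat."""
    B = H.shape[1]
    P_perp_hat = np.eye(B) - np.linalg.pinv(H_hat) @ H_hat
    return np.linalg.norm(H @ P_perp_hat)

rng = np.random.default_rng(5)
U, B = 4, 64
H = (rng.standard_normal((U, B)) + 1j * rng.standard_normal((U, B))) / np.sqrt(2)

leak = {}
for tau2 in (0.0, 0.01, 0.1):
    # Re-seeding gives the same error realization V for every tau (fair comparison).
    H_hat = estimate_channel(H, np.sqrt(tau2), np.random.default_rng(6))
    leak[tau2] = null_space_leakage(H, H_hat)
print(leak)
```

With perfect CSIT the leakage is at the level of floating-point error, and it grows with $\tau$, which is the effect that erodes the gain of EAP precoding at large estimation errors.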

Conclusions
We have studied the impact of per-antenna power distribution among different antennas in the array on the energy efficiency of a massive MIMO downlink system. In particular, we have shown that beneficial properties in terms of power efficiency and total radiated distortion can be achieved with a precoding matrix that operates all power amplifiers under the same backoff conditions. We have introduced a family of precoding matrices that can achieve these properties at the cost of some power wastage due to transmission in channel null space. Moreover, we have proposed a more advanced method to maximize energy efficiency by jointly optimizing the precoding matrix and the power variation across the antennas in the array. The performance gains that can be achieved with these precoding schemes over conventional zeroforcing precoding have been demonstrated using numerical experiments.
Author Contributions: Conceptualization and methodology, S.R.A. and T.E.; formal analysis, simulations and original draft preparation, S.R.A. Both authors have read and agreed to the submitted version of the manuscript.
Funding: This research has been carried out in the LINEAR project, a joint research project financed by Swedish Governmental Agency of Innovation Systems VINNOVA, Ericsson, and Bluetest.

Conflicts of Interest: The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript:
AM/AM: amplitude-to-amplitude modulation
AM/PM: amplitude-to-phase modulation
AoD: angle of departure
AWGN: additive white Gaussian noise
CDF: cumulative distribution function
CSI: channel state information
CSIT: channel state information at the transmitter
EAP: equal antenna powers
MIMO: multiple-input-multiple-output
MISO: multiple-input-single-output
MRT: maximum-ratio transmission
PA: power amplifier
PAPC: per-antenna power constraint
RF: radio frequency
SINDR: signal-to-interference-noise-and-distortion ratio
SINR: signal-to-interference-plus-noise ratio
SNR: signal-to-noise ratio
SPAPR: spatial peak-to-average power ratio
TRPC: total radiated power constraint
UE: user equipment
ULA: uniform linear array
ZF: zeroforcing
i.i.d.: independent and identically distributed
nLoS: non-line-of-sight