User Clustering and Power Allocation for Energy Efficiency Maximization in Downlink Non-Orthogonal Multiple Access Systems

Abstract: Non-orthogonal multiple access (NOMA) has been considered a promising technique for fifth generation (5G) mobile communication networks because of its high spectrum efficiency. In NOMA, by using successive interference cancellation (SIC) at the receivers, multiple users with different channel gains can be multiplexed in the same subchannel and transmit concurrently in the same spectrum. This simultaneous transmission gives NOMA a high system throughput. However, it also increases energy consumption, which limits its application in many energy-constrained scenarios. As a result, improving energy efficiency becomes a critical issue in NOMA systems. This paper focuses on an efficient user clustering strategy and power allocation design for downlink NOMA systems. The energy efficiency maximization of downlink NOMA systems is formulated as an NP-hard optimization problem under constraints on maximum transmission power, minimum data transmission rate, and SIC. To obtain an approximate solution with much lower complexity, we first exploit a quick suboptimal clustering method to assign each user to a subchannel. Given the user clustering result, the optimal power allocation problem is solved in two steps. First, by employing the Lagrangian multiplier method with Karush-Kuhn-Tucker optimality conditions, the optimal power allocation is calculated for each subchannel. Then, an inter-cluster dynamic programming model is developed to achieve the overall maximum energy efficiency. Theoretical analysis and simulations show that the proposed schemes achieve a significant energy efficiency gain compared with existing methods.


Introduction
To satisfy the rapidly growing demands on system capacity and throughput, non-orthogonal multiple access (NOMA) has been widely considered a novel and promising cellular multiple access scheme for the fifth generation (5G) mobile communication systems [1][2][3][4][5][6]. Many theoretical analyses and experiments have shown that NOMA can achieve a higher sum rate than the orthogonal multiple access (OMA) adopted in the fourth generation (4G) wireless networks [7][8][9][10][11][12]. In NOMA, multiple users with different channel gains can be multiplexed in the same subchannel and decoded at the receivers through successive interference cancellation (SIC) techniques [13][14][15][16][17][18]. In this mechanism, the power domain is exploited to serve multiple users simultaneously at different power levels, whereby spectrum efficiency can be significantly improved.
There is a large body of previous work on further enhancing NOMA to achieve a significant spectrum efficiency gain. For example, by fully exploiting the [...] and power allocation under the constraints of maximum transmission power, minimum data transmission rate requirement, and SIC requirement. Since this problem is NP-hard, we first exploit a quick suboptimal clustering method to assign each user to a subchannel, which yields an approximate solution with much lower complexity. By employing the Lagrangian multiplier method with Karush-Kuhn-Tucker optimality conditions in each subchannel, system energy allocation is further modeled as a dynamic programming model to obtain an optimal solution. Because the Lagrangian multiplier method yields closed-form expressions and our dynamic programming model involves only a small number of steps, the problem can be solved efficiently in practice. We compare our solution with representative schemes in terms of energy efficiency. Experimental results show that it achieves a noticeable improvement in energy utilization.
The main contributions of this paper are summarized as follows:
• By studying the transmission mechanism of the novel NOMA concept, an energy-efficient technique is proposed to optimize the user clustering and power allocation designs.
• We analyze user clustering results and apply a low-complexity clustering strategy to group users. In our solution, the number of users assigned to the same subchannel is not limited.
• We introduce the Karush-Kuhn-Tucker optimality conditions to obtain a closed-form result of power allocation in each subchannel via the Lagrangian multiplier method.
• We further formulate the target problem as a dynamic programming model, generating an optimal overall system power allocation.
The rest of this paper is organized as follows. Section 2 formulates the energy efficiency maximization problem for downlink NOMA systems. Section 3 presents our user clustering and power allocation co-design technique in detail. Section 4 shows the experimental results with discussion. Section 5 concludes this paper.

System Model and Problem Formulation
Consider a downlink NOMA system that serves $n$ users, as shown in Figure 1. The maximum transmission power of the base station (BS) for the downlink NOMA system is denoted as $P_B$. The power allocated to the $i$-th user is denoted as $P_i$, where $i = 1, 2, \dots, n$ and

$$P_1 + P_2 + \cdots + P_n \le P_B. \tag{1}$$

The parameter $H_i$ represents the normalized channel gain between the BS and the $i$-th user, where $i = 1, 2, \dots, n$, and [...]

Using SIC in the NOMA systems, multiple users with different channel gains can be multiplexed together into clusters and transmit in the same subchannel. The basic principle of SIC is to successively subtract the interference of the user with the largest signal power. Concretely, the BS broadcasts the superposition of signals to all the users in its serving cluster via power-domain division, where users are sorted on each subchannel in descending order of power gains [47]. After receiving the transmission signal from the subchannel, SIC is used to eliminate multi-user interference. In the received signal, the SIC detector decides the data of multiple users one by one and, at each step, subtracts the multiple access interference caused by the decided user's signal. In practical applications, to ensure the overall performance of the NOMA systems, users with lower channel gain are given higher power. Therefore, the SIC detector generally operates in the order of channel gain from low to high, so the signal with the smallest channel gain is decided first. Formally, the $j$-th user first decodes the messages of the users after it, i.e., the messages of the $i$-th users where $i > j$, and removes these messages from the received superposed signal in the order $i = n, n-1, \dots, j$. Conversely, the remaining messages, i.e., the messages of the $i$-th users where $i < j$, are treated as noise [48,49]. In this way, the loop is carried out until all of the multiple access interference is eliminated.
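The decoding order described above can be illustrated with a short sketch. It only encodes the indexing rule stated in the text (user $j$ processes messages $i = n, n-1, \dots, j$ and treats messages $i < j$ as noise); the function name `sic_schedule` and the 1-based indexing are assumptions made for illustration, not part of the original paper.

```python
def sic_schedule(j: int, n: int) -> tuple[list[int], list[int]]:
    """Return (decode_order, noise) for user j in an n-user cluster.

    Hypothetical helper: indices are 1-based, as in the text.
    decode_order lists the messages user j decodes, in order; the last
    entry is user j's own message. noise lists the messages that user j
    cannot remove and therefore treats as noise.
    """
    if not 1 <= j <= n:
        raise ValueError("j must satisfy 1 <= j <= n")
    decode_order = list(range(n, j - 1, -1))  # n, n-1, ..., j
    noise = list(range(1, j))                 # 1, ..., j-1
    return decode_order, noise


if __name__ == "__main__":
    for user in range(1, 5):
        order, noise = sic_schedule(user, 4)
        print(f"user {user}: decodes {order}, treats {noise} as noise")
```

Under this rule, user 1 removes every other message before decoding its own, while user $n$ decodes its own message directly and treats all other messages as noise.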
By these steps, all users can obtain their desired signals. In such a scenario, the total system bandwidth $B$ and the maximum transmission power $P_B$ are evenly divided into $m$ frequency resource blocks, each with bandwidth $b = B/m$ and power budget $p_b = P_B/m$. Here, the number of resource blocks $m$ is a preset parameter in the NOMA systems. After clustering $n$ users into $c$ clusters, where $1 \le c \le n/2$, $\omega_j$ resource blocks are allocated to the $j$-th cluster, where $0 \le \omega_j \le m$ and $\omega_1 + \omega_2 + \cdots + \omega_c = m$. Users in the same cluster share resource blocks by non-orthogonal scheduling, while different clusters work independently.
For a general $n$-user downlink NOMA system, the necessary power constraints for efficient SIC [50] can be expressed as [...], where $i = 2, 3, \dots, n$ and $P_T$ is the minimum power difference required to decode a signal from the remaining undecoded message signals. According to the Shannon-Hartley theorem [51], the achievable throughput of the $i$-th of the users assigned to the $j$-th cluster can be expressed as [...]. Let $\delta$ be an $n \times c$ matrix whose entry $\delta_{ij}$ in the $i$-th row and the $j$-th column denotes the clustering result, i.e., $\delta_{ij} = 1$ means that the $i$-th user is grouped into the $j$-th cluster, while $\delta_{ij} = 0$ means that it is not. Let $P = \{P_1, P_2, \dots, P_n\}$ denote the power allocated to each user. Denote the minimum data transmission rate required by the quality of service (QoS) of the $i$-th user as $R_i$. Defining energy efficiency as the ratio of throughput to energy consumption, the user clustering and power allocation problem for energy efficiency maximization in the downlink NOMA system with $n$ users and $c$ clusters can be formulated as [...] under the following constraints:
• maximum transmission power constraint of the BS
• maximum transmission power constraint of each cluster
• necessary power constraint for efficient SIC of each cluster
• minimum data transmission rate requirement of each user
• user clustering and resource allocation constraints
• domain constraints: $\delta_{ij} \in \{0, 1\}$, $\forall i \in \{1, 2, \dots, n\}$, $j \in \{1, 2, \dots, c\}$, and $0 < P_i < P_B$, $\forall i \in \{1, 2, \dots, n\}$.
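To make the notion of energy efficiency concrete, the following is a minimal sketch of a per-cluster rate and energy efficiency computation. The paper's own throughput and objective expressions did not survive in this text, so the formula used here is an assumption: a standard Shannon-Hartley rate with the noise power folded into the normalized channel gain, where user $i$ sees the signals of users $k < i$ as residual interference (matching the SIC description in this section). The function names are hypothetical.

```python
import math


def user_rates(powers, gains, bandwidth_hz):
    """Shannon-Hartley rate of each user in one cluster (assumed model).

    powers[i] and gains[i] belong to user i+1 of the cluster, indexed so
    that user i treats the signals of users k < i as noise. Gains are
    normalized by the noise power, so the noise term is 1.
    """
    rates = []
    interference = 0.0  # sum of P_k over users k < i
    for p, h in zip(powers, gains):
        sinr = p * h / (interference * h + 1.0)
        rates.append(bandwidth_hz * math.log2(1.0 + sinr))
        interference += p
    return rates


def energy_efficiency(powers, gains, bandwidth_hz):
    """Ratio of total throughput to total allocated power (bit/s per watt)."""
    total_power = sum(powers)
    if total_power <= 0:
        raise ValueError("total power must be positive")
    return sum(user_rates(powers, gains, bandwidth_hz)) / total_power


if __name__ == "__main__":
    # Two users sharing a 1 Hz band: illustrative numbers only.
    print(user_rates([1.0, 2.0], [3.0, 1.0], 1.0))
    print(energy_efficiency([1.0, 2.0], [3.0, 1.0], 1.0))
```

In this toy setting the second user has the lower gain and receives more power, which is consistent with the power ordering that SIC relies on.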

User Clustering and Power Allocation Co-Design
In application, to achieve the overall energy efficiency maximization in the systems, the BS will schedule the subchannel assignment and power allocation to each user before the downlink NOMA transmission, i.e., solve the energy efficiency maximization problem formulated in (6). However, problem (6) is a mixed-integer nonlinear programming problem, which is NP-hard [43]. Though the optimal result can be found by an exhaustive search of all possible user clustering and power allocation, it is computationally complex and unacceptable in existing systems.
This intractability justifies the development of suboptimal solutions. To break the original optimization problem down into several steps, we first investigate user clustering and put forward a pairing-assignment strategy that groups users into clusters with very low computational complexity. After user clustering, the target problem is formulated as a Lagrangian multiplier based dynamic programming model to obtain the optimal energy allocation. Combined with the Lagrangian multiplier method, the dynamic programming technique generates the optimal power allocation schedule.

Pairing-Assignment Strategy for User Clustering
The idea of user clustering is consistent with the standard application requirements of NOMA systems, where simultaneous transmission cannot be applied to all users jointly because of the additional system overhead for channel feedback coordination and error propagation [52]. With the users in the cell divided into multiple clusters, NOMA techniques can be employed effectively within each cluster. Several user clustering algorithms exist, targeting different system environments and implementation complexities. For our design goals, the user clustering strategy should be compatible with the power allocation scheme so that together they attain high energy efficiency with a low computational time.
Intuitively, users with higher channel gain achieve higher energy efficiency for the same energy consumption. Moreover, users with higher channel gain can suppress more interfering signals, e.g., the user with the highest channel gain in each cluster can suppress the interfering signals from all other users, while the user with the lowest channel gain cannot suppress any interference [53]. Therefore, increasing the power allocated to the user with the highest channel gain in a cluster significantly improves throughput and thus energy efficiency. Naturally, the potential energy efficiency gain is positively correlated with the highest channel gain in the cluster. Accordingly, users with high channel gain should be distributed into different clusters.
On the other hand, to ensure the minimum data transmission rate requirements of the users with low channel gain, it is necessary to pair them with high channel gain users. Because of the minimum data transmission rate requirements in (10), if users with low channel gain are put into the same cluster, a large share of the power is required for them to communicate with the BS, and this power is used very inefficiently because it is allocated to low channel gain users. Conversely, if high channel gain users and low channel gain users are assigned to the same cluster, the high channel gain users can achieve a higher transmission rate even with little allocated power, while the rest of the power satisfies the low channel gain users' data transmission rate requirements.
Given $n$ users denoted as $U_1, U_2, \dots, U_n$ in ascending order of normalized channel gain $H_1, H_2, \dots, H_n$, and based on the above considerations, we adopt the following simple but useful pairing-assignment strategy to allocate them to $c$ clusters:

1. User pairing: Group high channel gain users and low channel gain users in pairs, i.e., $U_1$ and $U_n$ in the first group, $U_2$ and $U_{n-1}$ in the second group, $U_3$ and $U_{n-2}$ in the third group, and so on.

2. Group assignment: Assign each group to a cluster in the order of user channel gain from high to low, i.e., the first group to the first cluster, the second group to the second cluster, ..., the $c$-th group to the $c$-th cluster, the $(c+1)$-th group to the first cluster, the $(c+2)$-th group to the second cluster, and so on.
The obtained clustering result will become the input of the dynamic programming model. Since this clustering strategy can fully utilize the allocation power of each cluster and assign users based on the consideration of energy efficiency increase, it can effectively improve the energy efficiency of the dynamic programming results.
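The two steps of the pairing-assignment strategy can be sketched as follows. This is an illustrative reading of the description above, not the authors' implementation: the function name `pairing_assignment` is hypothetical, users are identified by their 0-based position in the input list, and the handling of an odd number of users (the middle user forms a group of its own) is an assumption, since the text does not cover that case.

```python
def pairing_assignment(gains, c):
    """Group users into c clusters by pairing low and high channel gains.

    gains: normalized channel gain of each user (any order).
    c: number of clusters.
    Returns a list of c lists of user indices into `gains`.
    """
    if c < 1:
        raise ValueError("c must be at least 1")
    # Sort user indices in ascending order of channel gain (U_1, ..., U_n).
    order = sorted(range(len(gains)), key=lambda i: gains[i])
    clusters = [[] for _ in range(c)]
    lo, hi = 0, len(order) - 1
    group = 0
    while lo <= hi:
        # Step 1: pair the weakest remaining user with the strongest one.
        # Assumption: with an odd user count, the middle user is left alone.
        members = [order[lo]] if lo == hi else [order[lo], order[hi]]
        # Step 2: assign groups to clusters round-robin.
        clusters[group % c].extend(members)
        lo += 1
        hi -= 1
        group += 1
    return clusters


if __name__ == "__main__":
    example_gains = [0.5, 0.1, 0.9, 0.3, 0.7, 0.2, 0.8, 0.4]
    print(pairing_assignment(example_gains, 2))
```

Sorting dominates the cost, so the sketch runs in $O(n \log n)$ time, which is in line with the low complexity claimed for the strategy.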

Lagrangian Multiplier Based Dynamic Programming Model for Power Allocation
After user clustering, the target problem is reduced to determining how much energy should be allocated to each cluster, i.e., calculating $P = \{P_1, P_2, \dots, P_n\}$. We observe that this problem has an optimal substructure property and can therefore be solved by dynamic programming. We choose dynamic programming for the following reasons. Firstly, dynamic programming solves a problem optimally by breaking it into subproblems and recursively finding the optimal solution. A dynamic programming algorithm can therefore examine each cluster's power allocation and combine the results to give the best solution. Secondly, in present cellular systems, the service association between the BS and users can remain unchanged for more than hundreds of microseconds. This interval is long enough for the BS to reschedule the power allocation, during which the user clustering result is relatively static. Therefore, the power allocation can be done offline, and the timing overhead incurred by a dynamic programming algorithm does not add latency to the communication services between the BS and users.
Let $f(j, p)$ be the maximum throughput of the $j$-th cluster with power budget $p$. Define the maximum throughput of the first $j$ clusters with energy budget $p$ as $T(j, p)$, and the corresponding optimal power allocation for the $j$-th cluster as $A(j, p)$. In our dynamic programming model, we discretize the continuous power budget with the minimum assignable unit $\xi$ and allocate an integral multiple of $\xi$ to each cluster. We can look back at the optimal way to allocate power to the first $j - 1$ clusters and determine how much energy should be allocated to the $j$-th cluster. Based on this observation, the optimal subproblem admits a recursive solution. Algorithm 1 shows the inter-cluster dynamic programming for energy efficiency maximization. The recurrence compares the cases in which a power budget of $k\xi$ is assigned to the $j$-th cluster and the remaining budget $p - k\xi$ is assigned to the first $j - 1$ clusters. The optimal power allocation result, denoted as $P_1, P_2, \dots, P_c$, can be obtained by backtracking the values of $A(j, p)$, as shown in steps 25 to 29 of Algorithm 1. Based on the above processes, the only remaining problem is to calculate the relationship between the maximum throughput and the power budget in each cluster, i.e., the function $f(j, p)$, which is an input of the inter-cluster dynamic programming algorithm.
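The inter-cluster recursion can be sketched in code. The recurrence used here, $T(j, p) = \max_{0 \le k\xi \le p} \{T(j-1, p - k\xi) + f(j, k\xi)\}$ with $T(0, p) = 0$, is a reconstruction from the verbal description above and should be read as an assumption rather than as the paper's exact formula. Budgets are expressed in integer multiples of $\xi$, and `allocate_power`, `backtrack`, and the toy throughput function `toy_f` are hypothetical names introduced for illustration; a real $f(j, p)$ would come from the per-cluster optimization.

```python
import math


def allocate_power(f, c, steps):
    """Fill the tables T and A for c clusters and budgets 0..steps (units of xi).

    f(j, k): maximum throughput of cluster j (1-based) given k units of xi,
             returning 0 when the budget cannot satisfy the cluster.
    T[j][i]: best total throughput of the first j clusters with budget i.
    A[j][i]: units given to cluster j in that best allocation.
    """
    T = [[0.0] * (steps + 1) for _ in range(c + 1)]
    A = [[0] * (steps + 1) for _ in range(c + 1)]
    for j in range(1, c + 1):
        for i in range(steps + 1):
            for k in range(i + 1):
                candidate = T[j - 1][i - k] + f(j, k)
                # Strict comparison keeps the smallest k among ties.
                if candidate > T[j][i]:
                    T[j][i] = candidate
                    A[j][i] = k
    return T, A


def backtrack(A, c, budget):
    """Recover the units of xi allocated to each cluster from table A."""
    allocation = [0] * c
    for j in range(c, 0, -1):
        k = A[j][budget]
        allocation[j - 1] = k
        budget -= k
    return allocation


def toy_f(j, k):
    """Hypothetical concave throughput curve, for demonstration only."""
    cluster_gain = [1.0, 3.0]
    return math.log2(1.0 + cluster_gain[j - 1] * k)


if __name__ == "__main__":
    T, A = allocate_power(toy_f, c=2, steps=4)
    print(T[2][4], backtrack(A, 2, 4))
```

The three nested loops make the cost grow quadratically in the number of power levels for a fixed number of clusters, which matches the complexity discussed for the inter-cluster algorithm.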
Considering the particular $j$-th cluster with $n_j$ users, let $H_i^j$ represent the normalized channel gain between the BS and the $i$-th user in this cluster, where $i = 1, 2, \dots, n_j$ and [...]. The maximum transmission power of the $j$-th cluster is represented as $P^j_{\max}$, and the power allocated to each user in this cluster is denoted as $P^j = \{P_1^j, P_2^j, \dots, P_{n_j}^j\}$. Denote the minimum data transmission rate required by the QoS of the $i$-th user in the $j$-th cluster as $R_i^j$; then the throughput maximization problem of the particular cluster can be formulated as [...] under the following constraints:
• maximum transmission power constraint of the cluster
• necessary power constraint for efficient SIC of each user
• minimum data transmission rate requirement of each user

Algorithm 1 Inter-cluster dynamic programming for energy efficiency maximization
...
2: $T(0, i\xi) \leftarrow 0$
3: end for
4: for $j = 1$ to $c$ do
5: for $i = 0$ to $P_B/\xi$ do
6: $p \leftarrow i\xi$
7: $T(j, p) \leftarrow 0$
...
$p \leftarrow p - P_j$
29: end for
This optimization problem can be solved by the Lagrangian multiplier method [54,55]. Concretely, by applying the Lagrangian multiplier method, the optimization problem can be expressed as [...], where $\lambda$, $\mu = \mu_1, \mu_2, \dots, \mu_{n_j}$ and $v = v_1, v_2, \dots, v_{n_j}$ are the Lagrangian multipliers. After the Lagrangian multiplier derivation, the Karush-Kuhn-Tucker optimality conditions [56,57] can be given as follows:
• maximum transmission power constraint of the cluster
• necessary power constraint for efficient SIC of each user
• minimum data transmission rate requirements of each user
• optimal power allocation for each user

The solution set of this Lagrangian problem can then be given as [...], where $s_1 = \lambda$, $s_2 \in \{\mu_1, v_1\}$, $s_3 \in \{\mu_2, v_2\}$, ..., $s_{n_j} \in \{\mu_{n_j}, v_{n_j}\}$. If we define two additional sets as $S_u = S - \{v_i \mid i = 2, 3, \dots, n_j\}$ and [...], then the closed-form solution of the optimal power allocation in the cluster can be written as (28). Based on this derivation, the detailed processes to calculate $f(j, p)$ are shown in Algorithm 2.

Algorithm 2
...
5: if $p \le P^j_{\max}$ then
6: for all $P^j$ calculated by (28) do
7: if $P^j$ satisfies the corresponding Karush-Kuhn-Tucker optimality conditions then
8: ...

In Algorithm 2, all $P^j$ that satisfy the corresponding Karush-Kuhn-Tucker optimality conditions are traversed, and the maximum throughput obtained is kept in $f(j, p)$. If no $P^j$ satisfies the conditions, the value of $f(j, p)$ remains 0, which means that the energy budget is not enough to meet the QoS requirements of all users in the $j$-th cluster.
Note that the number of users per cluster is typically limited to a very small value to meet the decoding needs of the SIC protocol. Therefore, checking a constant number of combinations of Karush-Kuhn-Tucker optimality conditions in step 7 of Algorithm 2 is sufficient. Consequently, the computational complexity of Algorithm 2 is $O(P_B/\xi)$. Similarly, the computational complexity of the inter-cluster dynamic programming algorithm is $O((P_B/\xi)^2)$. Therefore, the proposed solution is a polynomial-time algorithm whose running time mainly depends on the ratio of the system maximum power $P_B$ to the step size $\xi$. By selecting the parameter $\xi$ properly, the energy efficiency maximization problem can be solved with sufficient precision in a brief period. For example, by setting $\xi$ to 0.1% of $P_B$, a BS with typical computational performance can complete the power allocation within a few hundred microseconds, which is responsive enough for existing NOMA systems.
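As a rough worked example of how the step size drives the cost, the sketch below counts the inner-loop evaluations of a straightforward triple-loop implementation of the inter-cluster recursion. The counting model (one evaluation per pair of budget level and split point, per cluster) and the function name `dp_inner_iterations` are assumptions for illustration; the count says nothing about actual running time on a given BS.

```python
def dp_inner_iterations(num_levels: int, num_clusters: int) -> int:
    """Count split evaluations of a triple-loop inter-cluster DP.

    num_levels: P_B / xi, the number of power steps (e.g. 1000 when
                xi is 0.1% of P_B).
    For each cluster, budget level i (0..num_levels) tries i + 1 splits,
    giving (N + 1)(N + 2) / 2 evaluations per cluster.
    """
    if num_levels < 0 or num_clusters < 0:
        raise ValueError("arguments must be non-negative")
    per_cluster = (num_levels + 1) * (num_levels + 2) // 2
    return num_clusters * per_cluster


if __name__ == "__main__":
    # xi = 0.1% of P_B gives 1000 power steps; 6 clusters is an arbitrary example.
    print(dp_inner_iterations(1000, 6))
```

Halving $\xi$ doubles the number of levels and roughly quadruples this count, which is the quadratic dependence on $P_B/\xi$ stated above.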

Simulation Results and Discussion
This section investigates the energy efficiency performance of downlink NOMA systems with the proposed user clustering and power allocation schemes. In our simulations, we consider a scenario where the user terminals are randomly deployed near a BS. A path loss model is adopted to calculate the channel gain between the BS and users. In our simulation scenario, which uses SIC, the total system bandwidth of 20 MHz and the maximum transmission power $P_B$ of 46 dBm are evenly divided into 100 frequency resource blocks. In addition, the minimum power difference for SIC $P_T$ and the minimum assignable unit $\xi$ are set to 10 dBm and $0.0001 \times P_B$, respectively.
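For orientation, the simulation parameters above can be converted to linear units with a few lines of code. The dBm-to-watt conversion is standard; the helper names `dbm_to_watts` and `per_block_resources` are hypothetical, and the per-block split simply applies $b = B/m$ and $p_b = P_B/m$ from the system model.

```python
def dbm_to_watts(power_dbm: float) -> float:
    """Convert a power level in dBm to watts."""
    return 10 ** (power_dbm / 10) / 1000.0


def per_block_resources(bandwidth_hz: float, total_power_w: float, num_blocks: int):
    """Evenly split bandwidth and power over the resource blocks."""
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    return bandwidth_hz / num_blocks, total_power_w / num_blocks


# Values taken from the simulation setup in this section.
P_B = dbm_to_watts(46.0)                       # about 39.8 W
b, p_b = per_block_resources(20e6, P_B, 100)   # 200 kHz and about 0.4 W per block
xi = 1e-4 * P_B                                # about 4 mW per assignable unit

if __name__ == "__main__":
    print(f"P_B = {P_B:.2f} W, b = {b / 1e3:.0f} kHz, "
          f"p_b = {p_b:.3f} W, xi = {xi * 1e3:.2f} mW")
```

With $\xi = 0.0001 \times P_B$, the dynamic programming table has 10,000 power steps per cluster.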
To reduce the demodulation complexity of the SIC receiver, the average number of users in each cluster is upper bounded by 2, 3, and 4 by appropriately setting the cluster number $c$. Specifically, in 2-user NOMA systems, where the cluster number $c$ is set to $n/2$, each cluster is allocated only two users. In 3-user and 4-user NOMA systems, the cluster number $c$ is set to $n/3$ and $n/4$, respectively. Note that the constraint on cluster size is a compromise with the currently imperfect SIC implementations in practice. Because of the additional system overhead for channel feedback coordination and error propagation, simultaneous transmission cannot be applied to a large number of users [52]. However, our algorithmic solutions can be adopted to optimize the energy efficiency of NOMA systems without this assumption.
By applying our user clustering strategy and power allocation algorithms, we collect the energy efficiency performance of 2-user, 3-user, and 4-user NOMA systems and compare them with conventional orthogonal frequency division multiple access (OFDMA) based systems. Figure 3 shows the overall energy efficiency of the system versus the number of users, from 10 to 60. It can be observed from Figure 3 that the energy efficiency of our proposed solutions in the 2-user, 3-user, and 4-user systems increases as the number of users grows, and all of them are much better than the OFDMA scheme. In more detail, the average energy efficiency of the 2-user, 3-user, and 4-user NOMA systems over the different user numbers is 47%, 49%, and 51% higher than that of the OFDMA scheme, respectively. This is because the Lagrangian multiplier based dynamic programming model allocates power resources effectively, so that the systems can make full use of the limited energy to achieve better throughput.
To further compare the performance of our schemes with conventional OFDMA, simulations for two specific scenarios are conducted. In scenario 1, the number of users deployed near the BS is set to 12. Scenario 2 illustrates a more adverse situation. In this simulation scenario, there are also 12 user terminals randomly deployed around a BS. However, half of the users are deliberately placed far from the BS so that their channel gains are extremely low. To meet the QoS requirements of these users, higher transmission power is needed. Figure 4 shows the system energy efficiency versus the maximum transmission power of the BS, from 29 dBm to 46 dBm, while the other simulation parameters are the same as in Table 1. The energy efficiency is reported as 0 when not all constraints are met. As another benchmark, MaxSE denotes the energy efficiency of the optimal solution that maximizes spectrum efficiency, which has received a lot of attention in previous work. We can see from Figure 4 that NOMA systems achieve low energy efficiency at a very low system power budget, whereas the OFDMA scheme cannot work at all under these circumstances. After the power budget reaches a certain level, the NOMA systems with our solutions always outperform the OFDMA scheme significantly. Figure 4 also shows that, under a low power budget, the energy efficiency of our algorithms is the same as that of the spectrum efficiency maximization scheme, and both increase with the maximum transmission power. When the power budget becomes higher than the optimal power consumption, the energy efficiency of our algorithms stays constant while that of the spectrum efficiency maximization scheme decreases. Beyond the optimal power consumption level, our algorithms do not allocate extra energy to users, and thus the energy efficiency stays at its maximal value.
However, to achieve higher spectrum efficiency, the spectrum efficiency maximization scheme allocates as much energy to users as possible. As system throughput increases, the marginal efficiency of the allocated power declines, and more and more energy is required to improve the spectrum efficiency by the same amount. This makes the energy efficiency of the spectrum efficiency maximization scheme decrease continuously. When the power budget tends to infinity, the energy efficiency even converges to zero, resulting in tremendous energy waste. This phenomenon suggests that, under the premise of guaranteeing all users' QoS requirements, merely maximizing the spectrum efficiency causes significant energy waste. Thus, it is necessary to improve the energy efficiency of NOMA systems.

Figure 5 shows the system energy efficiency under different minimum data transmission rate requirements, ranging from 0.1 Mbps to 1 Mbps, while the other simulation parameters are the same as in Table 1. It can be seen from Figure 5 that, in order to meet the minimum data transmission rate requirements of users, the energy efficiency of all schemes decreases as the users' minimum data transmission rate increases. However, the NOMA systems equipped with the proposed user clustering and power allocation methods always achieve higher energy efficiency than the OFDMA system. When the minimum data transmission rate requirements are too high, the OFDMA scheme cannot satisfy all users' requirements, while our algorithms still work. To evaluate the proposed user clustering strategy and power allocation method in more detail, we compare the energy efficiency of our solution against two approaches: location-based user clustering and equal power allocation.
For the location-based user clustering strategy, users are manually clustered based on their geographical location so that adjacent users tend to be assigned to the same subchannel. After user clustering, our Lagrangian multiplier based dynamic programming model is used to allocate power to each user. For the equal power allocation method, the user clustering process is the same as in our pairing-assignment strategy, but equal power is assigned to each cluster, which follows the same idea as the algorithm proposed in [58]. Figures 6 and 7 illustrate the energy efficiency versus the BS's maximum transmission power and the users' minimum data transmission rate requirements, respectively, in the same scenarios as Figures 4 and 5 and using the parameters listed in Table 1. In Figures 6 and 7, the systems equipped with the proposed user clustering and power allocation methods are still called NOMA. In contrast, the location-based user clustering strategy and the equal power allocation scheme are referred to as LUC and EPA, respectively. It can be seen from Figures 6 and 7 that, regardless of the maximum transmission power and the minimum data transmission rate requirements, our algorithms always outperform both the location-based user clustering strategy and the equal power allocation scheme in energy efficiency. Under a low transmission power budget, the location-based user clustering strategy achieves a high energy efficiency similar to that of the proposed pairing-assignment strategy. However, as the transmission power budget increases, the energy efficiency of our strategy stays at a very high level while that of the location-based user clustering strategy declines. When the transmission power budget becomes higher, more power is allocated to the clusters with low channel gain users, and thus the location-based user clustering strategy achieves lower energy efficiency.
This is supported by scenario 2, where the location-based user clustering strategy cannot satisfy the energy requirements of low channel gain users under a low transmission power budget. Because all low channel gain users are paired with high channel gain users in our strategy, less energy is needed to ensure their minimum data transmission rate requirements, and more energy can be allocated to the users with high channel gain to improve the global energy efficiency.
From Figure 6b, when the transmission power budget is very low, allocating equal power to each cluster can hardly satisfy all constraints, whereas our Lagrangian multiplier based dynamic programming model can still find a feasible result.
As the transmission power budget grows, our algorithms always allocate energy appropriately. However, the equal power allocation scheme allocates more redundant energy, which leads to lower energy efficiency.
It is worth mentioning that the energy efficiency of NOMA systems increases slightly as the average cluster size grows. When more users are in the same cluster, there are more combinations for power allocation, and the algorithm can obtain better energy efficiency. However, for the location-based user clustering strategy and the equal power allocation scheme, the opposite result is observed. This is because inappropriate user clustering and power allocation cause tremendous energy waste, and their influence increases with the cluster size. Therefore, it is more difficult for them to achieve good energy efficiency when the cluster size is larger.
Finally, we compare the curves of scenario 1 and scenario 2 in Figures 4-7. We can see that the energy efficiency of scenario 2 is much lower than that of scenario 1. When the users' channel conditions worsen, more energy must be consumed to meet the same transmission requirements. On the other hand, our solution performs better than the other benchmarks in both scenarios, which indicates that it is robust to the user distribution and, to some extent, generally applicable.

Conclusions
Effective user clustering and power allocation are significant for NOMA systems to cater to green communication and enhance the overall system performance. By formulating the energy efficiency maximization of downlink NOMA systems as an optimization problem with several constraints, we apply a quick user clustering strategy based on an analysis of downlink NOMA system characteristics. Given the user clustering results, the optimal power allocation of each user in each cluster is calculated by the Lagrangian multiplier method with Karush-Kuhn-Tucker optimality conditions. An inter-cluster dynamic programming algorithm is further developed to achieve the overall energy efficiency maximization. Numerical simulation results show the energy efficiency gain of the proposed solution: with the proposed algorithms, the energy efficiency of NOMA systems is much higher than that of systems using OFDMA schemes or other user clustering strategies and power allocation schemes.