Energy-Efficient Power Allocation for Full-Duplex Device-to-Device Underlaying Cellular Networks with NOMA

Full-duplex (FD), device-to-device (D2D) and non-orthogonal multiple access (NOMA) are promising wireless communication techniques for improving the utilization of spectrum resources. However, introducing FD, D2D and NOMA together in cellular networks is very challenging because of the complex interference they create. To deal with this interference in FD D2D underlaying NOMA cellular networks, power allocation (PA) has been extensively studied as an efficient interference management technique. Most previous work on PA for energy efficiency, however, considers system frameworks that combine only some of FD, D2D and NOMA, and the constraints of the optimization problems differ widely. In this paper, in order to further improve the energy efficiency of the system, a dual-layer iteration power allocation algorithm is proposed to mitigate the complex interference. The outer-layer iteration solves the non-linear fractional objective function based on the Dinkelbach method, and the inner-layer iteration solves the non-convex optimization problem based on D.C. programming. In this way, the non-convex and non-linear fractional problem is transformed into a convex one, from which the optimal power allocation is obtained. In this approach, FD D2D users reuse the spectrum of downlink NOMA cellular users. Imperfect self-interference (SI) cancellation at the FD D2D users and successive interference cancellation (SIC) at the strong NOMA user are considered in the system framework. The optimization problem is constructed to maximize the system's energy efficiency subject to constraints on successful SIC, QoS requirements, and the maximum transmit power of the BS and the FD D2D users. Numerical results demonstrate that the proposed algorithm outperforms traditional orthogonal multiple access (OMA) in terms of energy efficiency, with a higher system sum rate.


Introduction
Recently, there has been growing interest in Device-to-Device (D2D) communications underlaying cellular networks because D2D users can reuse the spectrum of cellular users, thus improving the spectral efficiency [1][2][3][4]. Similarly, non-orthogonal multiple access (NOMA) and full-duplex (FD), as significant candidate technologies in 5G, can greatly improve spectrum utilization and achieve a higher energy efficiency for next-generation wireless communication [5][6][7]. However, combining the three techniques causes serious co-channel interference even as it improves spectral efficiency. Therefore, reasonable resource allocation becomes particularly important, since it can effectively suppress interference while preserving the system's spectral efficiency. On the other hand, in the massive machine-type communications (mMTC) scenario in 5G, users' energy is always limited [8,9]. How to improve energy efficiency while guaranteeing the quality of service (QoS) of devices is also a key problem. D2D communication allows users to transmit directly to each other in the licensed frequency band of the base station (BS), thus reducing the burden on the BS and saving its communication resources. Therefore, D2D communications can improve the overall capacity of the mobile communication system. For this reason, D2D communication is considered in cellular networks, although it causes co-channel interference when D2D users reuse the resources of the cellular network. Resource allocation for D2D communications underlaying cellular networks was studied in [10][11][12][13][14][15]. In [10], the spectral efficiency of D2D users was maximized. Four transmit power control strategies were proposed, each ensuring that the interference to the BS stays below a threshold. In [11], the system throughput was maximized while keeping the co-channel interference to a minimum. The channel allocation, mode selection and power control were also studied.
NOMA is a promising solution for reducing co-channel interference in 5G by using successive interference cancellation (SIC) at the receivers. Many studies have examined resource allocation in joint D2D and NOMA communication scenarios [16][17][18][19][20]. The maximization of the uplink energy efficiency and of the achieved rate of D2D communication based on the NOMA system was studied in [16]. The authors proposed a joint power control and sub-channel assignment algorithm to obtain the optimal energy efficiency. In [17], the authors investigated time scheduling and resource allocation algorithms for NOMA-based D2D communication that maintain energy efficiency among D2D users without affecting the cellular users' energy efficiency. In [18], the authors optimized the power allocation, the resource assignment and the SIC decoding order for the NOMA users to maximize the achieved rate of D2D communication.
FD communication allows users to transmit and receive signals on the same spectrum simultaneously; thus, with the development of self-interference (SI) cancellation technology, the capacity can theoretically be twice that of half-duplex (HD) communication. The combination of FD and D2D communication with NOMA has aroused great interest among researchers [21][22][23][24][25][26]. A NOMA vehicle-to-everything (V2X) system was considered in [21], in which the vehicles enabled the D2D transmission mode. For Roadside Unit (RSU) selection, a full-duplex communication mode was applied to further improve the performance, and the ergodic rates of two vehicles in a specific group were compared. In [22], the cell-center D2D user acted as the relay for the cell-edge D2D user, and the cell-center D2D user could operate in HD or FD mode to communicate and form a group of NOMA users with multiple relay nodes. Three schemes supporting D2D-NOMA systems were proposed, and their outage probabilities were derived. Cooperative NOMA (C-NOMA) in cellular downlink systems was discussed in [23], in which users operated in FD mode and could assist the communication from the BS to users with poor channel quality. D2D user grouping and power control were studied to improve the multiplexing gain. In addition, some researchers have focused on machine learning techniques [27][28][29][30] to deal with big data and solve non-linear problems. In [28], a nature deep Q-learning algorithm from deep reinforcement learning was proposed to solve the power allocation problem and achieve a higher data rate and energy efficiency in a MIMO-NOMA wireless network. In [29], the authors aimed to maximize the total energy efficiency of D2D multicast clusters underlaying cellular networks. Since the optimization problem is non-convex, the authors transformed it into a mixed-integer programming problem according to the Dinkelbach algorithm and employed Q-learning to solve it.
In [30], the authors design a novel artificial intelligence (AI)-based framework for maximizing the secrecy energy efficiency (SEE) in FD cooperative relay underlaying cognitive radio NOMA systems. The non-convex SEE optimization problem is solved with ensemble learning to select the optimal relay and with a quantum particle swarm optimization-based technique to optimize power allocation.
Although the above works analyze D2D, FD and NOMA in detail, there is still little work on the full combination of D2D, FD and NOMA, which is of great value for improving the system's spectral efficiency and energy efficiency. Motivated by this, we study resource allocation for FD D2D underlaying cellular networks with downlink NOMA, where FD D2D users operate on the same spectrum as downlink NOMA users. Considering the energy-limited nature of wireless terminal devices and the requirements of global green communication, we set the maximization of the system's energy efficiency as the ultimate goal. At the same time, the transmission rate is very important to users, because many services such as video streaming, online gaming and live broadcasting need relatively high data rates. Therefore, we maximize the system's energy efficiency while ensuring users' QoS requirements.
The contributions of this paper can be summarized as follows: (1) We investigate FD D2D underlaying cellular networks with downlink NOMA, where two downlink cellular users form a group of NOMA users and the FD D2D users reuse the spectrum of the NOMA users. (2) The system's energy efficiency is maximized subject to power constraints for all users, successful SIC for the NOMA users and the QoS requirements of the FD D2D users and NOMA users. (3) The optimization problem in this paper is non-convex and non-linear fractional, which makes it difficult to obtain the optimal solution directly. Thus, a dual-layer iteration algorithm is proposed to deal with the problem. The outer-layer iteration solves the non-linear fractional problem based on the Dinkelbach algorithm, and the inner-layer iteration, based on the difference of two concave (or convex) functions (D.C.) structure, deals with the non-convex programming. (4) The performance of the proposed scheme is compared with the traditional orthogonal multiple access (OMA) scheme, which verifies the superiority of the proposed scheme in terms of energy efficiency.
The rest of the paper is organized as follows. The system model and the optimization problem are introduced in Section 2. Then, in Section 3, we provide the proposed optimal solution. The simulation analysis and relevant discussion are displayed in Section 4. Lastly, a brief summary of this paper is presented in Section 5.

System Model and Problem Formulation
In this section, we present the system model first and, subsequently, discuss the construction of the optimization problem.

System Model
In this subsection, the system model of FD D2D communications underlaying cellular networks with downlink NOMA is presented, as shown in Figure 1. A downlink resource allocation scenario is considered in which two NOMA users ($C_1$ and $C_2$) share the downlink spectrum with a pair of D2D users ($D_1$ and $D_2$) in a single-cell system. $C_1$ and $C_2$ are allocated the same spectrum through the power-domain multiplexing of NOMA. The BS and the two NOMA users are each equipped with a single antenna. The D2D users operate in FD mode, and we assume that both D2D users are equipped with separate transmit and receive antennas.
We denote the channel gain from transmitter $i$ to receiver $j$ by $g_{i,j}$, with $i \in \{b, d_1, d_2\}$ and $j \in \{c_1, c_2, d_1, d_2\}$; for example, $g_{b,c_1}$ is the gain of the BS-$C_1$ link and $g_{d_2,c_2}$ is the gain of the $D_2$-$C_2$ link. In addition, we assume a channel model consisting of large-scale fading based on the distance path loss model and small-scale fading based on the Rayleigh fading model [24]. We further assume that a dedicated control channel is used to collect the channel state information (CSI) in our network, so that perfect CSI is available. The transmission powers from the BS to $C_1$ and $C_2$ are denoted by $P_{c_1}$ and $P_{c_2}$, respectively. Accordingly, the transmit powers of $D_1$ and $D_2$ are denoted by $P_{d_1}$ and $P_{d_2}$, respectively.
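To ensure that the SIC process at the NOMA downlink users is successful, the users are sorted by their channel gains such that $g_{b,c_1} > g_{b,c_2}$. According to this order, $C_1$ first carries out SIC to eliminate the interference from the signal of $C_2$ and then decodes its own signal, while $C_2$ decodes its own signal directly, treating the other signals as interference. The received signals of $C_1$ and $C_2$ are accordingly represented as
$$y_{c_1} = \sqrt{g_{b,c_1}}\left(\sqrt{P_{c_1}}\,x_{c_1} + \sqrt{P_{c_2}}\,x_{c_2}\right) + \sqrt{P_{d_1} g_{d_1,c_1}}\,x_{d_1} + \sqrt{P_{d_2} g_{d_2,c_1}}\,x_{d_2} + n_{c_1}, \tag{1}$$
$$y_{c_2} = \sqrt{g_{b,c_2}}\left(\sqrt{P_{c_1}}\,x_{c_1} + \sqrt{P_{c_2}}\,x_{c_2}\right) + \sqrt{P_{d_1} g_{d_1,c_2}}\,x_{d_1} + \sqrt{P_{d_2} g_{d_2,c_2}}\,x_{d_2} + n_{c_2}, \tag{2}$$
where $x_{c_1}$, $x_{c_2}$, $x_{d_1}$ and $x_{d_2}$ are the transmission signals (normalized to unit power) of users $C_1$, $C_2$, $D_1$ and $D_2$, respectively, and $n_{c_1}, n_{c_2} \sim \mathcal{CN}(0, \sigma^2)$ denote additive white Gaussian noise (AWGN) with zero mean and variance $\sigma^2$. User $C_1$ first applies SIC to eliminate the interference and then decodes its own signal, i.e.,
$$\hat{\gamma}_{c_2} = \frac{P_{c_2} g_{b,c_1}}{P_{c_1} g_{b,c_1} + P_{d_1} g_{d_1,c_1} + P_{d_2} g_{d_2,c_1} + \sigma^2}, \tag{3}$$
$$\gamma_{c_1} = \frac{P_{c_1} g_{b,c_1}}{P_{d_1} g_{d_1,c_1} + P_{d_2} g_{d_2,c_1} + \sigma^2}, \tag{4}$$
where $\hat{\gamma}_{c_2}$ is the signal-to-interference-plus-noise ratio (SINR) of the signal of $C_2$ decoded at $C_1$, and $\gamma_{c_1}$ is the SINR of $C_1$'s own signal. For the SIC process at $C_1$ to succeed, the transmission powers of the two NOMA users should meet the following requirement [25,26]:
$$\left(P_{c_2} - P_{c_1}\right)\frac{g_{b,c_1}}{P_{d_1} g_{d_1,c_1} + P_{d_2} g_{d_2,c_1} + \sigma^2} \geq \theta, \tag{5}$$
where $\theta$ represents the minimum power gap that ensures successful SIC between the two NOMA users.
User $C_2$ decodes its received signal $x_{c_2}$ directly while treating the other received signals as interference. Therefore, the SINR $\gamma_{c_2}$ of user $C_2$ is
$$\gamma_{c_2} = \frac{P_{c_2} g_{b,c_2}}{P_{c_1} g_{b,c_2} + P_{d_1} g_{d_1,c_2} + P_{d_2} g_{d_2,c_2} + \sigma^2}. \tag{6}$$
Similarly, the received signals of the D2D users are denoted as
$$y_{d_1} = \sqrt{P_{d_2} g_{d_2,d_1}}\,x_{d_2} + \sqrt{g_{b,d_1}}\left(\sqrt{P_{c_1}}\,x_{c_1} + \sqrt{P_{c_2}}\,x_{c_2}\right) + \sqrt{\eta P_{d_1}}\,x_{d_1} + n_{d_1}, \tag{7}$$
$$y_{d_2} = \sqrt{P_{d_1} g_{d_1,d_2}}\,x_{d_1} + \sqrt{g_{b,d_2}}\left(\sqrt{P_{c_1}}\,x_{c_1} + \sqrt{P_{c_2}}\,x_{c_2}\right) + \sqrt{\eta P_{d_2}}\,x_{d_2} + n_{d_2}, \tag{8}$$
where $\eta P_{d_1}$ and $\eta P_{d_2}$ are the residual SI powers at $D_1$ and $D_2$, $\eta$ denotes the SI cancellation capability of the FD transmitter, and $n_{d_1}$ and $n_{d_2}$ denote AWGN with zero mean and variance $\sigma^2$. Thus, the received SINRs $\gamma_{d_1}$ and $\gamma_{d_2}$ at users $D_1$ and $D_2$ are, respectively, represented as
$$\gamma_{d_1} = \frac{P_{d_2} g_{d_2,d_1}}{\left(P_{c_1} + P_{c_2}\right) g_{b,d_1} + \eta P_{d_1} + \sigma^2}, \tag{9}$$
$$\gamma_{d_2} = \frac{P_{d_1} g_{d_1,d_2}}{\left(P_{c_1} + P_{c_2}\right) g_{b,d_2} + \eta P_{d_2} + \sigma^2}. \tag{10}$$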
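As an illustration of the system model, the following minimal Python sketch evaluates the SINRs of the four users for a given power allocation. It assumes the standard power-domain NOMA form in which each SINR is the desired received power divided by interference plus noise, with residual SI modeled as $\eta$ times the user's own transmit power. The function name `compute_sinrs` and the dictionary keys are hypothetical naming choices made only for this example.

```python
def compute_sinrs(p, g, eta, noise):
    """Evaluate the SINRs of the NOMA and FD D2D users (illustrative sketch).

    p     -- transmit powers in watts, keys "c1", "c2", "d1", "d2"
    g     -- channel power gains keyed by (transmitter, receiver) tuples,
             e.g. ("b", "c1") or ("d2", "d1")  (hypothetical naming)
    eta   -- residual self-interference coefficient (linear scale)
    noise -- noise power sigma^2 in watts
    """
    # D2D interference plus noise seen by each cellular user.
    i_c1 = p["d1"] * g["d1", "c1"] + p["d2"] * g["d2", "c1"] + noise
    i_c2 = p["d1"] * g["d1", "c2"] + p["d2"] * g["d2", "c2"] + noise
    # BS interference, residual SI and noise seen by each D2D user.
    p_bs = p["c1"] + p["c2"]
    i_d1 = p_bs * g["b", "d1"] + eta * p["d1"] + noise
    i_d2 = p_bs * g["b", "d2"] + eta * p["d2"] + noise
    return {
        # SINR of C2's signal when decoded at C1 (the SIC step).
        "c2_at_c1": p["c2"] * g["b", "c1"] / (p["c1"] * g["b", "c1"] + i_c1),
        # C1 after removing C2's signal.
        "c1": p["c1"] * g["b", "c1"] / i_c1,
        # C2 treats C1's signal as interference.
        "c2": p["c2"] * g["b", "c2"] / (p["c1"] * g["b", "c2"] + i_c2),
        # Each D2D user receives from its peer.
        "d1": p["d2"] * g["d2", "d1"] / i_d1,
        "d2": p["d1"] * g["d1", "d2"] / i_d2,
    }
```

Keeping the interference-plus-noise terms as separate variables makes it easy to see which transmit powers couple into which receiver, which is the source of the non-convexity discussed later.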

Problem Formulation
The energy efficiency optimization problem is formulated mathematically in this subsection while ensuring users' QoS requirements, power constraints and the minimum gap for successful SIC.
Based on the SINRs $\gamma_{c_1}$, $\gamma_{c_2}$, $\gamma_{d_1}$ and $\gamma_{d_2}$ derived above, and considering unit bandwidth, the transmission rate of user $i$ can be expressed as
$$R_i = \log_2\left(1 + \gamma_i\right), \quad i \in U = \{c_1, c_2, d_1, d_2\}. \tag{11}$$
The energy efficiency of FD D2D communication underlaying cellular networks with downlink NOMA can be expressed as the ratio of the system sum rate to the total power consumption. The total power consumption consists of the average circuit loss power $P_0$ and the transmit powers of the BS and the D2D users. Therefore, the following linear model is applied to represent the power consumption:
$$P_{\mathrm{total}} = \sum_{i \in U} P_i + 3P_0, \tag{12}$$
where $3P_0$ is the total circuit power of the BS and users $D_1$ and $D_2$.
Our purpose is to maximize the system's energy efficiency while satisfying three kinds of constraints. Firstly, we ensure the minimum transmission rate of each user as a QoS requirement. Secondly, the transmit power of all users is constrained. In addition to these two common constraints, we also guarantee successful SIC at NOMA user $C_1$. With $P = (P_{c_1}, P_{c_2}, P_{d_1}, P_{d_2})$, the energy efficiency optimization problem can be expressed as follows:
$$\max_{P} \ \frac{\sum_{i \in U} R_i}{\sum_{i \in U} P_i + 3P_0} \tag{13a}$$
$$\text{s.t.} \quad R_{c_1} \geq R_{th1}, \quad R_{c_2} \geq R_{th2}, \tag{13b}$$
$$R_{d_1} \geq R_{thd}, \quad R_{d_2} \geq R_{thd}, \tag{13c}$$
$$P_{c_2} k - P_{c_1} k \geq \theta, \tag{13d}$$
$$P_{c_1} + P_{c_2} \leq P_{bmax}, \tag{13e}$$
$$P_{d_1} \leq P_{dmax}, \quad P_{d_2} \leq P_{dmax}, \tag{13f}$$
$$P_{c_1}, P_{c_2}, P_{d_1}, P_{d_2} \geq 0, \tag{13g}$$
where Equation (13a) is the objective function of the system's energy efficiency maximization. $R_{th1}$ and $R_{th2}$ in (13b) are the minimum transmission rates of $C_1$ and $C_2$, respectively.
Since the channel gain of $C_2$ is worse than that of $C_1$, we set $R_{th2} < R_{th1}$. $R_{thd}$ in (13c) is the QoS requirement for $D_1$ and $D_2$. Additionally, the constraint in (13d) ensures that the SIC process of $C_1$ is successful, where $k$ is the normalized coefficient, $k = \frac{g_{b,c_1}}{P_{d_1} g_{d_1,c_1} + P_{d_2} g_{d_2,c_1} + \sigma^2}$. In the constraints (13e) and (13f), $P_{bmax}$ and $P_{dmax}$ are the maximum transmission powers of the BS and the D2D users, respectively.
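To make the objective and constraints concrete, the sketch below computes the energy efficiency as the sum rate divided by the total power (transmit powers plus $3P_0$) and checks the QoS, SIC and power constraints for a candidate allocation. It is only an illustration under stated assumptions: the function names, the dictionary keys and the non-negativity check are hypothetical, and the SINRs are taken as given inputs.

```python
import math

USERS = ("c1", "c2", "d1", "d2")

def energy_efficiency(sinr, p, p0):
    """Return (energy efficiency, per-user rates) for unit bandwidth."""
    rates = {u: math.log2(1.0 + sinr[u]) for u in USERS}
    total_power = sum(p[u] for u in USERS) + 3.0 * p0
    return sum(rates.values()) / total_power, rates

def is_feasible(rates, p, k, theta, r_th1, r_th2, r_thd, p_bmax, p_dmax):
    """Check the constraints of the energy efficiency problem."""
    return (
        rates["c1"] >= r_th1 and rates["c2"] >= r_th2        # QoS of C1, C2
        and rates["d1"] >= r_thd and rates["d2"] >= r_thd    # QoS of D1, D2
        and (p["c2"] - p["c1"]) * k >= theta                 # successful SIC
        and p["c1"] + p["c2"] <= p_bmax                      # BS power budget
        and p["d1"] <= p_dmax and p["d2"] <= p_dmax          # D2D power budget
        and all(p[u] >= 0.0 for u in USERS)                  # assumed non-negativity
    )
```

A feasibility check of this kind is useful for validating the output of any solver before comparing energy efficiency values across schemes.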

Proposed Optimal Solution
In this section, the power allocation for the optimization problem (13a) in FD D2D communications underlaying cellular networks with downlink NOMA is studied. Note that the optimization problem (13a) is a non-convex and non-linear fractional problem, which makes it quite difficult to obtain the optimal solution directly. Consequently, a dual-layer iteration algorithm is proposed to deal with the problem. The outer-layer iteration solves the non-linear fractional objective function based on the Dinkelbach algorithm. After this transformation, the objective function is still non-convex because the sum-rate term has a D.C. structure. Therefore, the inner-layer iteration solves the non-convex optimization problem based on D.C. programming. In this way, the non-convex and non-linear fractional problem is transformed into a convex problem, and we can apply convex optimization to obtain the optimal power allocation. Once the iterative process has converged, the system's energy efficiency shows no further improvement.

Outer-Layer Iteration Algorithm
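To solve the optimization problem, which is a non-convex and non-linear fractional program, we focus in this subsection on transforming the form of the objective function. We assume that the optimal value of Equation (13a) is $\lambda^*$, which is given by
$$\lambda^* = \frac{\sum_{i \in U} R_i(P^*)}{\sum_{i \in U} P_i^* + 3P_0}, \tag{14}$$
where $P_i^*$ is the optimal power allocation. The following proven Theorem 1 helps us transform the objective function [31,32].
Theorem 1. Let
$$F(\lambda) = \max_{P \in D} \left\{ \sum_{i \in U} R_i(P) - \lambda\left(\sum_{i \in U} P_i + 3P_0\right) \right\}; \tag{15}$$
then, $F(\lambda)$ is a monotonically decreasing function of $\lambda$, and the maximum energy efficiency $\lambda^* = \max_{P \in D} \frac{\sum_{i \in U} R_i(P)}{\sum_{i \in U} P_i + 3P_0}$ is achieved if and only if $F(\lambda^*) = 0$.
Hence, based on Theorem 1, we can transform the non-linear fractional program of Equation (13a) into the problem of Equation (16), and solve Equation (16) to obtain the corresponding optimal power allocation:
$$\max_{P \in D} \ \sum_{i \in U} R_i(P) - \lambda\left(\sum_{i \in U} P_i + 3P_0\right), \tag{16}$$
where $D$ is the set defined by Equations (13b)-(13g). Constraints (13b)-(13d) can each be rewritten as a linear inequality in the powers, as given in Equation (17):
$$\begin{aligned}
P_{c_1} g_{b,c_1} &\geq \left(2^{R_{th1}} - 1\right)\left(P_{d_1} g_{d_1,c_1} + P_{d_2} g_{d_2,c_1} + \sigma^2\right),\\
P_{c_2} g_{b,c_2} &\geq \left(2^{R_{th2}} - 1\right)\left(P_{c_1} g_{b,c_2} + P_{d_1} g_{d_1,c_2} + P_{d_2} g_{d_2,c_2} + \sigma^2\right),\\
P_{d_2} g_{d_2,d_1} &\geq \left(2^{R_{thd}} - 1\right)\left(\left(P_{c_1} + P_{c_2}\right) g_{b,d_1} + \eta P_{d_1} + \sigma^2\right),\\
P_{d_1} g_{d_1,d_2} &\geq \left(2^{R_{thd}} - 1\right)\left(\left(P_{c_1} + P_{c_2}\right) g_{b,d_2} + \eta P_{d_2} + \sigma^2\right),\\
\left(P_{c_2} - P_{c_1}\right) g_{b,c_1} &\geq \theta\left(P_{d_1} g_{d_1,c_1} + P_{d_2} g_{d_2,c_1} + \sigma^2\right).
\end{aligned} \tag{17}$$
Therefore, the set $D$ is a convex set. Based on the above analysis, we have handled the non-linear fractional form of the objective. However, we cannot obtain the optimal solution with the outer-layer iteration alone, because the sum-rate term $\sum_{i \in U} R_i$ is still non-convex. Therefore, in the next subsection, we introduce the transformation of the non-convex structure and then present the complete iteration algorithm, including the outer-layer and inner-layer iterations.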
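The Dinkelbach update can be illustrated on a deliberately simplified problem. The sketch below is not the system of this paper: it assumes a single interference-free link with gain `g`, circuit power `p_circuit` and power limit `p_max` (all hypothetical names), so that the inner maximization of $\log_2(1 + gp) - \lambda(p + P_{circuit})$ has a closed-form solution. It only shows how $\lambda$ is updated until $F(\lambda)$ is approximately zero.

```python
import math

def dinkelbach_single_link(g, p_circuit, p_max, tol=1e-9, max_iter=100):
    """Maximize log2(1 + g*p) / (p + p_circuit) over 0 <= p <= p_max."""
    lam = 0.0   # current energy-efficiency estimate (lambda)
    p = p_max
    for _ in range(max_iter):
        # Inner step: argmax_p log2(1 + g*p) - lam*(p + p_circuit).
        # Setting the derivative to zero gives p = 1/(lam*ln 2) - 1/g,
        # which is then clipped to the feasible interval.
        if lam <= 0.0:
            p = p_max
        else:
            p = min(max(1.0 / (lam * math.log(2.0)) - 1.0 / g, 0.0), p_max)
        rate = math.log2(1.0 + g * p)
        f_lam = rate - lam * (p + p_circuit)   # F(lambda) at the maximizer
        if abs(f_lam) < tol:
            break                              # F(lambda) ~ 0: lambda is optimal
        lam = rate / (p + p_circuit)           # Dinkelbach update
    return p, lam
```

For example, `dinkelbach_single_link(g=10.0, p_circuit=1.0, p_max=10.0)` returns a power strictly below `p_max`, which reflects the usual energy-efficiency behavior: beyond a certain point, extra transmit power raises the rate more slowly than it raises the consumption.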

Inner-Layer Iteration Algorithm
Although we have transformed the non-linear fractional function in Equation (13a), the sum-rate term $\sum_{i \in U} R_i$ in problem (16) can be written as a difference of two functions and is still non-convex, so it needs to be transformed into a convex structure.
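First, we rewrite the term $\sum_{i \in U} R_i$ as
$$\sum_{i \in U} R_i = f_1(P) - f_2(P), \tag{18}$$
where
$$\begin{aligned}
f_1(P) ={}& \log_2\left(P_{c_1} g_{b,c_1} + P_{d_1} g_{d_1,c_1} + P_{d_2} g_{d_2,c_1} + \sigma^2\right)\\
&+ \log_2\left(\left(P_{c_1} + P_{c_2}\right) g_{b,c_2} + P_{d_1} g_{d_1,c_2} + P_{d_2} g_{d_2,c_2} + \sigma^2\right)\\
&+ \log_2\left(P_{d_2} g_{d_2,d_1} + \left(P_{c_1} + P_{c_2}\right) g_{b,d_1} + \eta P_{d_1} + \sigma^2\right)\\
&+ \log_2\left(P_{d_1} g_{d_1,d_2} + \left(P_{c_1} + P_{c_2}\right) g_{b,d_2} + \eta P_{d_2} + \sigma^2\right),
\end{aligned} \tag{19}$$
and
$$\begin{aligned}
f_2(P) ={}& \log_2\left(P_{d_1} g_{d_1,c_1} + P_{d_2} g_{d_2,c_1} + \sigma^2\right)\\
&+ \log_2\left(P_{c_1} g_{b,c_2} + P_{d_1} g_{d_1,c_2} + P_{d_2} g_{d_2,c_2} + \sigma^2\right)\\
&+ \log_2\left(\left(P_{c_1} + P_{c_2}\right) g_{b,d_1} + \eta P_{d_1} + \sigma^2\right)\\
&+ \log_2\left(\left(P_{c_1} + P_{c_2}\right) g_{b,d_2} + \eta P_{d_2} + \sigma^2\right).
\end{aligned} \tag{20}$$
Since each term is the logarithm of an affine function of the powers, $f_1(P)$ and $f_2(P)$ are both concave functions of $P$, and the second part $\lambda\left(\sum_{i \in U} P_i + 3P_0\right)$ in Equation (16) is a linear function. Therefore, the objective function in (16) has the D.C. structure. In addition, $D$ is a convex set. Thus, we can solve problem (16) based on D.C. programming.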
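According to [26], $f_2(P)$ can be approximated by its first-order Taylor expansion around $P^{(k)}$, the power allocation obtained at the $k$-th inner iteration:
$$f_2(P) \approx f_2\left(P^{(k)}\right) + \left\langle \nabla f_2\left(P^{(k)}\right), P - P^{(k)} \right\rangle, \tag{21}$$
where $\langle x, y \rangle = x^T y$ is the inner product of the vectors $x$ and $y$, and $\nabla f_2\left(P^{(k)}\right)$ is the gradient of $f_2(P)$ at $P^{(k)}$.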
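Therefore, the objective function in problem (16) can be transformed into a concave function, and the problem becomes
$$\max_{P \in D} \ f_1(P) - f_2\left(P^{(k)}\right) - \left\langle \nabla f_2\left(P^{(k)}\right), P - P^{(k)} \right\rangle - \lambda\left(\sum_{i \in U} P_i + 3P_0\right). \tag{22}$$
Because the objective in Equation (22) is concave and $D$ is convex, the problem can be solved by standard convex optimization techniques, such as the interior-point method [33].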
The detailed dual-layer iteration algorithm for solving the optimization problem (22) is shown in Algorithm 1. First, we set the initial conditions of the optimization problem (22). Then, the outer-layer iteration repeatedly updates $\lambda$ and the inner-layer iteration updates the power allocation $P^*$ until both iterations meet their respective tolerances. Finally, the optimal solution is obtained through Algorithm 1 after a finite number of iterations [31].
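The interplay of the two layers can be sketched on a one-dimensional toy problem. The code below is not Algorithm 1 itself; it assumes a symmetric FD pair in which both users transmit with the same power $p$, so that the sum rate $2\log_2\left(1 + \frac{gp}{\sigma^2 + \eta p}\right)$ splits as $f_1(p) - f_2(p)$ with $f_1(p) = 2\log_2(\sigma^2 + (g + \eta)p)$ and $f_2(p) = 2\log_2(\sigma^2 + \eta p)$. In this toy case the linearized inner problem has a closed-form maximizer, which here replaces the interior-point solver; all parameter names and default values are hypothetical. The toy sum rate happens to be concave already, so the example only illustrates the mechanics of the two loops.

```python
import math

LN2 = math.log(2.0)

def dual_layer_toy(g=10.0, eta=0.1, noise=1.0, p_circuit=1.0, p_max=10.0,
                   tol_outer=1e-8, tol_inner=1e-10,
                   max_outer=50, max_inner=200):
    """Dinkelbach (outer) plus D.C. linearization (inner) on a 1-D toy model."""
    def sum_rate(p):
        return 2.0 * math.log2(1.0 + g * p / (noise + eta * p))

    lam = 0.0          # outer-layer variable (energy efficiency estimate)
    p = 0.5 * p_max    # initial power
    for _ in range(max_outer):
        for _ in range(max_inner):
            # Linearize f2 at the current point: only its gradient matters.
            grad_f2 = 2.0 * eta / ((noise + eta * p) * LN2)
            slope = grad_f2 + lam
            if slope <= 0.0:
                p_new = p_max
            else:
                # Maximize f1(p) - slope*p: solve f1'(p) = slope, then clip.
                p_new = 2.0 / (slope * LN2) - noise / (g + eta)
                p_new = min(max(p_new, 0.0), p_max)
            converged = abs(p_new - p) < tol_inner
            p = p_new
            if converged:
                break
        f_lam = sum_rate(p) - lam * (p + p_circuit)
        if abs(f_lam) < tol_outer:
            break
        lam = sum_rate(p) / (p + p_circuit)    # Dinkelbach update
    return p, sum_rate(p) / (p + p_circuit)
```

Calling `dual_layer_toy()` returns the power and the corresponding energy efficiency. In the full problem, the closed-form inner step would be replaced by a convex solver applied to problem (22) over the set $D$.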

Simulation Results
We provide the simulation analysis in this section, which illustrates the superiority of the system's energy efficiency for the proposed scheme. Firstly, we analyze the convergence of the iterative algorithm in Section 3. Next, we further analyze the impact of the QoS requirement of each user, the minimum gap for successful SIC and SI cancellation coefficient on the system's energy efficiency. For comparison, we present the simulation results with traditional OMA networks where each cellular user occupies half-spectrum resources and D2D users share the total spectra at the same time. Furthermore, the achievable rate of each user is still an important metric to the system performance. Therefore, we analyze the effect of SI cancellation coefficient and the distance between D2D users on the user's achievable rate, and compare FD D2D with HD D2D to observe the improvement of the achievable sum rate.
The channel model between user $i$ and user $j$ is denoted as $g_{i,j} = d_{i,j}^{-\alpha} |h_{i,j}|^2$, where $d_{i,j}^{-\alpha}$ is the distance path loss, $d_{i,j}$ is the distance from transmitter $i$ to receiver $j$ in meters, and $\alpha$ is the path loss coefficient, which we set to $\alpha = 4$. For the small-scale fading, all users experience independent Rayleigh fading with zero mean and unit variance. All terminals are uniformly distributed in a cell with a radius of 400 m, and the D2D users are located within a short distance of 40 m of each other. Furthermore, we set $R_{th1} = 3$ bps, $R_{th2} = 1$ bps, $R_{thd} = 3$ bps, $P_{bmax} = 24$ dBm and $P_{dmax} = 21$ dBm. Unless stated otherwise, Table 1 presents the simulation parameters applied in this paper.
The convergence performance of the proposed dual-layer iteration algorithm is presented in Figure 2. It can be observed that the system's energy efficiency converges to a stable value after about four iterations, which indicates that the iterative algorithm proposed in Section 3 has low computational complexity. In addition, reducing the maximum transmission power of the BS and the D2D users reduces the system's energy efficiency. This is because a lower maximum transmission power reduces the power that can be allocated during transmission, which lowers the achievable rate of the users and thus the system's energy efficiency. We can also see that reducing the maximum transmission power of the BS has a greater influence on the energy efficiency than reducing that of the D2D users. This is because when the transmission power of the BS decreases, the BS still needs to allocate more power to maintain the minimum rate of the weak NOMA user $C_2$ while allocating less power to the strong NOMA user $C_1$, hence reducing the system's energy efficiency.
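The channel model and power limits described above can be reproduced with a few lines of Python. This is a minimal sketch under the stated model: the squared magnitude of a unit-variance Rayleigh fading coefficient is drawn as an exponential random variable with unit mean, and the helper names `dbm_to_watt` and `channel_gain` are hypothetical.

```python
import random

def dbm_to_watt(p_dbm):
    """Convert a power level from dBm to watts."""
    return 10.0 ** ((p_dbm - 30.0) / 10.0)

def channel_gain(distance_m, alpha=4.0, rng=random):
    """Draw g = d^(-alpha) * |h|^2 for one Rayleigh-faded link.

    For unit-variance Rayleigh fading, |h|^2 is exponentially
    distributed with mean 1.
    """
    fading_power = rng.expovariate(1.0)
    return distance_m ** (-alpha) * fading_power

# Power limits used in the simulations, converted to watts.
P_BMAX_W = dbm_to_watt(24.0)   # BS: 24 dBm
P_DMAX_W = dbm_to_watt(21.0)   # D2D users: 21 dBm
```

Passing a seeded generator, for example `channel_gain(40.0, rng=random.Random(1))`, makes individual channel draws reproducible when averaging results over many realizations.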
Next, the system's energy efficiency is investigated for different values of the circuit power and of the minimum gap for successful SIC in Figure 3. We observe that the system's energy efficiency decreases as the circuit power increases, and that the proposed scheme always outperforms the OMA scheme, which demonstrates the advantage of the proposed scheme and the feasibility of the solution. In addition, when $\theta$ increases, the energy efficiency of the proposed scheme shows a downward trend. This is because more power must be allocated to the user with the poor channel gain, $C_2$, and accordingly less power to the user with the strong channel gain, $C_1$, which lowers the system's energy efficiency. Moreover, because the energy efficiency of the OMA scheme is independent of the minimum gap for successful SIC, we show only one curve for the OMA scheme.
The effect of the users' QoS requirements on the system's energy efficiency is depicted in Figure 4, where $\eta = -90$ dB. The proposed scheme again outperforms the OMA scheme. In addition, when the minimum transmission rate of user $C_2$ increases, the energy efficiency decreases in all cases. This is because when the minimum transmission rate of $C_2$ increases, the BS allocates more power to $C_2$ to maintain the minimum requirement. The power allocated to $C_1$ then inevitably decreases, which reduces the system sum rate and hence the system's energy efficiency. Likewise, when the minimum transmission rate of user $C_1$ increases, the system's energy efficiency also decreases. This is because the BS must then allocate more power to user $C_1$, which leaves insufficient power for user $C_2$ to maintain normal communication, thus reducing the energy efficiency of the system.
We present the effects of different minimum transmission rates of the D2D users and of the SI cancellation coefficient on the system's energy efficiency in Figure 5. We observe that as $\eta$ decreases, that is, as the residual SI becomes weaker, the energy efficiency increases in all cases. Furthermore, as $R_{thd}$ increases, the energy efficiency of the system decreases in both the proposed scheme and the OMA scheme. This is because a higher minimum transmission rate for the D2D users requires a higher D2D transmission power. The available transmission power may then fail to meet the minimum rate requirement, resulting in a reduction in the system's energy efficiency.
Although we aim to maximize the system's energy efficiency in this paper, the spectral efficiency of the users is still a key metric that affects the communication quality between users. Figure 6 illustrates the relationship between each user's achievable rate and the SI cancellation coefficient. We can see that as $\eta$ varies, the achievable rate of each user meets its minimum transmission rate requirement. In particular, $C_2$ always stays at the minimum threshold, while the rate of $C_1$ decreases and the rate of the D2D users increases. This is because SI only exists at the FD D2D users, so when $\eta$ decreases, the achievable rate of the D2D users increases. Meanwhile, in order to optimize the overall energy efficiency, the D2D users are allocated more power, which means more co-channel interference to the cellular users. Hence, the rate of $C_1$ gradually decreases and the rate of $C_2$ stays at the minimum needed to maintain its QoS requirement.
Figure 7 shows the effect of the distance between the D2D users on the ergodic sum rate. In addition to the comparison with the OMA scheme, we also replace the FD D2D users with HD D2D users, referred to as the HD-D2D scheme, in order to examine the influence of the SI cancellation coefficient on the system sum rate. We can see that the system sum rate decreases in all cases as the D2D distance increases, because the channel condition between the D2D users becomes worse. The figure also shows that the proposed scheme achieves a higher sum rate than the other two schemes.

Conclusions
In this paper, we proposed a dual-layer iteration power allocation algorithm to maximize the energy efficiency of the system, where one FD D2D user pair uses the same spectrum resource with two cellular users using a power domain NOMA transmission. The outer-layer iteration is to solve a non-linear fractional problem for energy efficiency based on Dinkelbach and the inner-layer iteration is to solve the non-convex optimization for power allocation based on the D.C. programming. The simulation results show that the proposed algorithm can achieve better energy efficiency than the traditional OMA with the constraints of the successful SIC power level for NOMA users, the QoS requirements and maximal transmission power of D2D users and BS. Although an improved performance is achieved, there are still some challenges to be solved for more complex interference scenarios. In the near future, we will focus on extending our proposed algorithm to the scenario of multiple NOMA groups sharing a spectrum with multiple D2D user pairs by jointly considering channel assignment and power allocation to maximize the system's energy efficiency. Moreover, machine learning methods can be further introduced to solve the joint optimization problem.

Conflicts of Interest:
The authors have no relevant financial or non-financial interest to disclose.