Stackelberg Game Based Social-Aware Resource Allocation for NOMA Enhanced D2D Communications

Device-to-device (D2D) communication and non-orthogonal multiple access (NOMA) have been considered promising techniques to improve system throughput. In the NOMA-enhanced D2D scenario, a joint channel and power allocation algorithm based on the Stackelberg game is proposed in this paper. The social relationship between the cellular and D2D users is utilized to define their utility functions. In the two-stage Stackelberg game, the cellular user is the leader and the D2D group is the follower. Cellular users and D2D groups are matched via the Kuhn–Munkres (KM) algorithm to allocate channels for D2D groups in the first stage. The power allocation of D2D users is optimized through a penalty-function-based particle swarm optimization (PSO) algorithm in the second stage. The simulation results show that the proposed algorithm can effectively strengthen the cooperation between cellular and D2D users and improve their utility.


Introduction
With the rapid growth of mobile terminals and multimedia services, the demand for high-rate data transmission has increased and the traffic pressure on the core network has become extremely high. Device-to-device (D2D) communication is considered a key technology to relieve the pressure on the core network effectively [1]. It enables users to communicate with each other by reusing the resources of other users without passing through the base station, thus significantly improving spectrum utilization and system throughput. Non-orthogonal multiple access (NOMA) is also a recent research hot spot. Compared with orthogonal multiple access (OMA), it has higher spectral efficiency and can provide a faster transmission rate and a lower outage probability [2].
In wireless communication, resource allocation has attracted widespread attention as it considerably affects system performance [3,4]. Game theory is often used to solve such problems [5,6]. In D2D communication, many studies have focused on the resource allocation problem to improve the system performance. In Reference [7], a joint optimization algorithm for channel and power allocation based on the Nash bargaining game was proposed. It decomposed the optimization problem into two sub-problems, which simplified the calculation and improved the system throughput. In Reference [8], D2D power allocation was studied under cooperative and non-cooperative games, and the D2D transmit power was optimized via sub-gradient methods. In Reference [9], a D2D power auction mechanism based on a stochastic game was proposed to reduce interference and optimize power allocation. In Reference [10], a student-project matching model of cellular, D2D, and relay users, which improved the system throughput, was proposed. However, none of the above algorithms considers NOMA, so no further improvement in system throughput was achieved.
A D2D transmitter can send messages to multiple D2D receivers simultaneously through NOMA. The distribution of D2D communication resources assisted by NOMA was studied in Reference [11], and the D2D throughput was optimized while ensuring the quality of service (QoS) of cellular users. In Reference [12], NOMA-based D2D resource allocation was studied as a Nash bargaining game, and the power optimization problem was solved by using the Karush-Kuhn-Tucker (KKT) conditions. In Reference [13], a D2D-NOMA optimization algorithm combining sub-channel allocation, user matching, and power control was proposed to optimize the total transmit power by coordinating interference. However, mobile devices are carried by humans, and none of the above solutions considers the influence of social factors. In a practical environment, social relationships affect users' decision-making and can be used to strengthen the cooperation between users, thereby effectively improving system throughput.
In this study, a cellular uplink network scenario is considered. Cellular users occupy independent sub-channels, and D2D groups, each consisting of a D2D transmitter and two D2D receivers, reuse the uplink channels of the cellular users to communicate. Within each D2D group, NOMA is adopted so that each D2D receiver can correctly demodulate its own signal from the superimposed signal. The resource allocation is modeled as a two-stage Stackelberg game by defining the utility functions of cellular users and D2D groups. The main contributions of this paper can be summarized as follows:

• The social relationship between cellular and D2D users is considered. When D2D users reuse the channel resources of cellular users for communication, their social relationship affects the channel selection and transmit power of D2D users. Considering the social relationship can strengthen the cooperation between users, thus increasing the system throughput.

• In the two-stage Stackelberg game model, the cellular user is the leader. In the first stage, channels are allocated to D2D groups by finding a maximum-weight matching between cellular users and D2D groups with the Kuhn-Munkres (KM) algorithm, while ensuring the QoS of all users on each sub-channel.

• The D2D group is the follower. In the second stage, a penalty-function-based particle swarm optimization (PSO) algorithm is utilized to optimize the D2D transmit power. The final power allocation strategy is determined via the convergence of PSO.

The rest of this paper is organized as follows. Section 2 presents the system model. In Section 3, we define the utility functions of cellular users and D2D groups, establish the Stackelberg game model, and prove the convergence of the algorithm. In Section 4, the simulation results are presented and analyzed. In Section 5, we summarize the paper.

System Model
In the cellular communication system, a single-cell uplink transmission scenario is considered. As shown in Figure 1a, $M$ cellular users $\{C_1, C_2, \ldots, C_M\}$ and $N$ D2D groups $\{D_1, D_2, \ldots, D_N\}$ are randomly distributed in the cell. The base station (BS) allocates a dedicated subchannel to each cellular user, and the subchannels $\{SC_1, SC_2, \ldots, SC_M\}$ are orthogonal to each other. Without loss of generality, we assume that cellular user $C_m$ occupies subchannel $SC_m$. The cellular users communicate with the BS in the traditional cellular mode. A D2D group differs from a traditional D2D pair: it contains one D2D transmitter and several D2D receivers. We consider the NOMA transmission protocol and successive interference cancellation (SIC) within D2D groups, so that the D2D transmitter can send messages to multiple D2D receivers simultaneously and each D2D receiver can correctly demodulate the message intended for it. Because problems such as complex interference and high computational complexity arise as the number of D2D receivers increases, this paper assumes that one D2D group consists of only one D2D transmitter and two D2D receivers.
Considering that mobile devices are carried by humans, the social relationship between the cellular user and the D2D user is taken into account. The social relationship affects the channel selection and power allocation of D2D users and thereby strengthens the cooperation between D2D users and cellular users. Since the system model is closely associated with the social relationship between users, the physical domain and the social domain are analyzed in turn below.
The physical domain describes the impact of channel conditions and system interference in a practical network. In this paper, D2D groups communicate by reusing cellular users' uplink channel resources. Therefore, cellular users may cause interference to D2D receivers, and the BS will suffer interference from D2D transmitters. The physical domain can be represented as a graph $G(V_p, E_p)$, where $V_p$ denotes the devices and $E_p$ indicates the channel quality for data transmission. The physical domain shows whether the channel can meet the communication requirements of users.
The social domain describes users' social attributes, as shown in Figure 1b. Similarly, the social domain can be represented as a graph $G(V_s, E_s)$, where $V_s$ denotes the users and $E_s$ indicates the social relationship between cellular users and D2D receivers. The social relationship is denoted by $S^k_{m,n} \in [0, 1]$, $k \in \{1, 2\}$. When two users have a very close social relationship, $S^k_{m,n}$ should be close to one and they are more willing to cooperate with each other, which means the cellular user is more willing to let the D2D user who occupies its channel increase the transmit power.
This paper assumes that each cellular user occupies an independent subchannel and that each subchannel can be reused by only one D2D group. Meanwhile, each D2D group can reuse only one cellular user's channel. Therefore, the signal received at the BS on subchannel $SC_m$ is the superposition of the cellular user's signal, the signal of the D2D transmitter that reuses $SC_m$, and noise. Here, $P_c$ and $P_d$ represent the transmit power of the cellular user and the D2D transmitter, respectively. $g_{m,B}$ and $g_{n,B}$ are the channel gains between $C_m$ and the BS, and between $DT_n$ and the BS, respectively. $\eta_{m,n}$ indicates whether $D_n$ reuses $SC_m$: if $SC_m$ is reused by $D_n$, $\eta_{m,n} = 1$; otherwise $\eta_{m,n} = 0$. $x_m$ and $x_n$ are the signals sent by $C_m$ and $DT_n$, respectively. $\zeta_m$ represents the additive white Gaussian noise (AWGN) on the channel. As a consequence, the signal-to-interference-plus-noise ratio (SINR) and the transmission rate of $C_m$ at the BS follow from this signal model, with $N_0$ representing the noise power.
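As a sketch of the signal model described above, one form consistent with the defined symbols is given below. It rests on assumptions that the text does not confirm: $g$ is treated as a complex channel coefficient (so received power scales with $|g|^2$), every D2D transmitter uses the same power $P_d$, and the rate is normalized by bandwidth. The symbols $y_m$, $\gamma_m$ and $R_m$ are introduced here for illustration.

```latex
\begin{align}
y_m &= \sqrt{P_c}\, g_{m,B}\, x_m
      + \sum_{n=1}^{N} \eta_{m,n} \sqrt{P_d}\, g_{n,B}\, x_n + \zeta_m \\
\gamma_m &= \frac{P_c\, |g_{m,B}|^2}
                 {\sum_{n=1}^{N} \eta_{m,n} P_d\, |g_{n,B}|^2 + N_0} \\
R_m &= \log_2\!\left(1 + \gamma_m\right)
\end{align}
```

When no D2D group reuses $SC_m$, all $\eta_{m,n}$ are zero and $\gamma_m$ reduces to the interference-free signal-to-noise ratio $P_c |g_{m,B}|^2 / N_0$.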
Considering NOMA in D2D groups, we set the power allocation coefficients of the D2D transmitter $DT_n$ as $\alpha_n$ and $\beta_n$, with $\alpha_n + \beta_n \le 1$. For the D2D receiver $DR_n^1$ to remove $x_n^2$ from its received signal and demodulate $x_n^1$ properly through SIC, a decoding condition must be met [14] (Equation (5)), in which $g_{n,2}$ and $g_{m,n,2}$ are the channel gains between $DT_n$ and $DR_n^2$, and between $C_m$ and $DR_n^2$, respectively. Equation (5) can be simplified to Equation (6). As shown in Equation (6), the inequality is unrelated to the power allocation coefficients $\alpha_n$ and $\beta_n$ and is only related to the channel allocation $\eta_{m,n}$. As a consequence, it can be expressed as a function of $\eta$.
Thus, the SINR and transmission rate at the D2D receivers $DR_n^1$ and $DR_n^2$ can be defined under this decoding order. The above conclusions are all based on the assumption that $DR_n^1$ can remove $x_n^2$ and correctly demodulate $x_n^1$. The other case, in which $DR_n^2$ removes $x_n^1$ and correctly demodulates $x_n^2$, is similar and will not be derived again.
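The claim that the SIC condition does not depend on $\alpha_n$ and $\beta_n$ can be checked with a short derivation. The sketch below is one common way to write such a condition and is not necessarily the authors' exact Equations (5) and (6). It assumes that $g$ denotes a power gain, that $x_n^1$ carries power $\alpha_n P_d$ and $x_n^2$ carries $\beta_n P_d$, and that $g_{n,1}$ and $g_{m,n,1}$ (hypothetical symbols, by analogy with $g_{n,2}$ and $g_{m,n,2}$) are the gains toward $DR_n^1$. The condition states that $DR_n^1$ can decode $x_n^2$ at least as well as $DR_n^2$ can.

```latex
\begin{align}
\frac{\beta_n P_d\, g_{n,1}}{\alpha_n P_d\, g_{n,1} + I_1}
  &\ge \frac{\beta_n P_d\, g_{n,2}}{\alpha_n P_d\, g_{n,2} + I_2},
  \qquad I_k = \sum_{m=1}^{M} \eta_{m,n} P_c\, g_{m,n,k} + N_0 \\
g_{n,1}\left(\alpha_n P_d\, g_{n,2} + I_2\right)
  &\ge g_{n,2}\left(\alpha_n P_d\, g_{n,1} + I_1\right) \\
\frac{g_{n,1}}{I_1} &\ge \frac{g_{n,2}}{I_2}
\end{align}
```

The term $\alpha_n P_d\, g_{n,1} g_{n,2}$ appears on both sides of the second line and cancels, so the final inequality depends only on the channel gains and on $\eta_{m,n}$ through $I_k$.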

Stackelberg Game Based Resource Allocation
According to the system model, this paper mainly studies the channel and power allocation of the D2D group under NOMA. Since the channel that a D2D group reuses affects the D2D transmitter's transmit power, and the transmit power in turn depends on the channel selection, the problem fits the Stackelberg game. Therefore, we design a two-stage Stackelberg game model in which the cellular users are the leaders and the D2D groups are the followers. In the first stage, we use the KM algorithm to match the cellular users with D2D groups in order to allocate subchannels to D2D users. In the second stage, a PSO algorithm based on a penalty function is used to optimize the D2D users' transmit power.

Utility Model
The utility functions of cellular users and D2D groups are defined on the basis of their benefit and loss. The first stage mainly solves the channel allocation problem, that is, the matching problem between cellular users and D2D groups. When D2D users reuse the channel of cellular users, they cause interference to the cellular users and reduce their throughput. Therefore, when the cellular channel is reused, the D2D user needs to pay a certain price for using it.
As a consequence, for cellular users, the incentive is mainly derived from the rewards of assigning power to D2D groups based on the social relationship. Meanwhile, they also sacrifice some of their throughput. Hence, the utility function of cellular users combines the payment received from the D2D group with this throughput loss, where $S^1_{m,n}$ and $S^2_{m,n}$ are the social relationships between the cellular user $C_m$ and the D2D receivers $DR_n^1$ and $DR_n^2$, respectively. $V$ represents the price per unit power. $(1 - S^k_{m,n})V$ is the actual price per unit power, and it is related to the social relationship between the two users: the closer the social relationship is, the lower the actual price is. $R_m^0$ denotes the data rate of $C_m$ when no D2D user reuses $SC_m$.
The second stage mainly solves the power allocation for D2D users. We do not optimize the cellular users' transmit power here and set it to a fixed value. Therefore, power allocation means optimizing the D2D transmitter's transmit power when sending messages to the two D2D receivers. For D2D users, the incentive is mainly derived from the increase in data rate after reusing the cellular channels. If the data rate is not improved after reusing the cellular channel, then the utility will be less than zero, and the cellular mode will be selected for communication; if the data rate is increased, D2D users should pay for the transmit power. As a consequence, the utility functions of $DR_n^1$ and $DR_n^2$ are obtained from the rate gain minus the payment for transmit power, where $R^c_{n,1}$ and $R^c_{n,2}$ are the data rates when D2D users do not reuse the cellular channel and send messages to the BS in traditional cellular mode. They can be defined as Equation (13).
Hence, the utility function of the D2D groups is given by

Analysis of Leaders
Cellular users are the leaders in the Stackelberg game. In the first stage, we mainly solve the matching problem between cellular users and D2D groups. Based on the cellular users' utility function defined in the previous section, the channel allocation problem can be formulated as problem (16), where Equation (16a) is the objective, which maximizes the cellular users' utility through the channel allocation. Constraint (16b) limits the interference that the D2D user brings to the cellular user and ensures the QoS of the cellular user. Constraint (16c) guarantees the QoS of D2D users. Constraint (16d) represents the requirement that must be met when SIC is used. Constraint (16e) indicates that the value of $\eta_{m,n}$ should be either 1 or 0, representing reusing $SC_m$ or not. Constraint (16f) indicates that a D2D group can reuse only one cellular user's subchannel. Constraint (16g) indicates that only one D2D group can be assigned to each subchannel. The problem is non-convex because it is a 0-1 integer problem. It can be transformed into the optimal matching problem of a weighted bipartite graph. As shown in Figure 2, the cellular users and D2D groups form the two sets of vertices in the bipartite graph, and the cellular users' utility serves as the edge weight $w_{m,n}$. The principle of the matching process is that each vertex can match only one vertex from the other side, and each vertex should select the edge with the largest weight if possible. Therefore, the optimization problem can be converted to $\max \sum_{m} \sum_{n} w_{m,n}$, taken over the matched pairs $(m, n)$.
The KM algorithm can be used to solve Equation (16a) because it solves the maximum-weight matching problem under complete matching via the Hungarian method. Specifically, it transfers the edge weights to vertex labels and finds a perfect matching via the Hungarian method. During the matching process, it continuously adjusts the vertex labels to enlarge the set of feasible edges, and then uses the Hungarian method to find the final matching. However, the KM algorithm requires the two vertex sets of the bipartite graph to have the same size. We assume that the number of D2D groups is no more than the number of cellular users in this paper. In order to apply the KM algorithm in our scenario, it is necessary to add several virtual vertices to the D2D groups. In addition, in order to avoid a non-conforming match, we reset the edge weight to zero if constraints (16b)-(16d) are not met. Furthermore, the KM algorithm inherently complies with constraints (16e)-(16g). As a consequence, we can solve the channel allocation problem through the KM algorithm in the first stage.
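The first-stage matching described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: `hungarian_min` is a textbook $O(n^3)$ Hungarian routine with vertex potentials, `match_channels` is a hypothetical helper name, `utility[m][n]` stands for the edge weight $w_{m,n}$, and `feasible[m][n]` stands for the outcome of the QoS and SIC checks, whose formulas are not reproduced here.

```python
def hungarian_min(cost):
    """Minimum-cost perfect matching on a square matrix, O(n^3).

    Returns a list a where a[row] is the column assigned to that row.
    """
    n = len(cost)
    inf = float("inf")
    u = [0.0] * (n + 1)      # row potentials (vertex labels)
    v = [0.0] * (n + 1)      # column potentials
    p = [0] * (n + 1)        # p[j] = row currently matched to column j
    way = [0] * (n + 1)      # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (n + 1)
        used = [False] * (n + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, n + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Adjust the labels so that at least one new edge becomes tight.
            for j in range(n + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:
                break
        # Flip the matching along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [0] * n
    for j in range(1, n + 1):
        assignment[p[j] - 1] = j - 1
    return assignment


def match_channels(utility, feasible):
    """Match M cellular users (rows) to N <= M D2D groups (columns).

    Virtual D2D groups with zero weight pad the graph to M x M, and
    infeasible edges are reset to zero weight, as described in the text.
    Returns {cellular index m: D2D group index n} for real, feasible pairs.
    """
    m_users, n_groups = len(utility), len(utility[0])
    weights = [
        [utility[m][n] if n < n_groups and feasible[m][n] else 0.0
         for n in range(m_users)]
        for m in range(m_users)
    ]
    # Maximum-weight matching equals minimum-cost matching on negated weights.
    cost = [[-w for w in row] for row in weights]
    assignment = hungarian_min(cost)
    return {m: n for m, n in enumerate(assignment)
            if n < n_groups and feasible[m][n]}


# Three cellular users, two D2D groups (illustrative numbers only).
utility = [[5, 1], [4, 3], [2, 6]]
feasible = [[True, True], [True, True], [True, True]]
matching = match_channels(utility, feasible)
```

A cellular user that ends up paired with a virtual vertex, or with an edge that was zeroed for infeasibility, is simply left out of the returned dictionary, which corresponds to a subchannel that no D2D group reuses.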
Proposition 1. KM algorithm converges to the optimal channel allocation strategy.
Proof. The KM algorithm guarantees that, during the matching process, the total utility of all the cellular users does not decrease and at least one cellular user's utility increases whenever the match changes, which indicates that the matching is progressively improved toward the optimal perfect matching. Since the numbers of cellular users and D2D groups participating in the match are finite, the number of possible matches is also finite. As a consequence, KM is bound to converge to the optimal match after a finite number of iterations.
Proposition 2. The computational complexity of KM is $O(M^3)$.
Proof. The computational complexity of KM is related to the number of vertices. As mentioned above, the number of vertices on each side in our scenario is $M$. Hence, the computational complexity of KM is $O(M^3)$.

Analysis of Followers
D2D groups are the followers in the Stackelberg game. In the second stage, we mainly solve the power allocation for D2D users. Based on the utility function of D2D groups defined in Section 3.1, the power allocation problem can be formulated as problem (17), where Equation (17a) is the objective, which maximizes the D2D group's utility through the power allocation. Constraints (17b) and (17c) ensure the QoS of all the users on $SC_m$. Constraints (17d) and (17e) indicate that the transmit power of the D2D transmitter $DT_n$ should not exceed the power threshold and should not be less than zero when $DT_n$ sends signals to $DR_n^1$ and $DR_n^2$, respectively. Since Equation (17a) is a constrained optimization problem, we transform it into an unconstrained optimization problem by the external penalty function method. The corresponding augmented objective function is defined as Equation (18). Based on the channel allocation in the previous section, Equation (18) mainly optimizes $\alpha_n$ and $\beta_n$. This problem is non-convex, and it can be solved via PSO. PSO is a parallel algorithm. The main idea of PSO is to initialize a group of random particles within the definition domain. In each iteration, each particle adjusts its position according to the fitness determined by the objective function. Two factors affect a particle's speed and position: one is the best solution found by the particle itself, and the other is the best solution found so far by the population. Through continuous iteration, all particles approach the global optimal solution.
On the basis of the main idea of PSO, the position of a particle can be expressed as $X_{id}$, where $i$ represents the particle number and $d$ represents the dimension. In this section, Equation (18) mainly optimizes $\alpha_n$ and $\beta_n$, which means each particle represents a set of power allocation coefficients consisting of the two parameters $\alpha_n$ and $\beta_n$. Hence, it is a 2D optimization problem. $X_{id}$ can be expressed as $\{(\alpha_1, \beta_1), (\alpha_2, \beta_2), \ldots, (\alpha_{N_{pop}}, \beta_{N_{pop}})\}$, where $N_{pop}$ represents the size of the population. Each particle constantly adjusts its speed and position to approach the optimal value of Equation (18) on the joint definition domain of $\alpha_n$ and $\beta_n$.
The updated speed is defined in Equation (19), where $\omega$ represents the inertia weight, which determines the speed of finding the optimal solution and is non-negative. $C_1$ and $C_2$ are the acceleration constants used to characterize cognitive behaviour and social behaviour, respectively. $random(0, 1)$ denotes a random number in $[0, 1]$. $P_{id}$ represents the individual optimal position of particle $i$ in dimension $d$. $P_{gd}$ represents the optimal position of the population in dimension $d$.
The updated position is defined in Equation (20). The algorithm stops when the fitness change of the optimal position is less than the convergence threshold $\Delta$ or when the maximum number of iterations is reached. Through the continuous updating of the particles' speeds and positions, the optimal value of the power allocation coefficients can be obtained. The proposed power allocation algorithm is shown in Algorithm 1.

3: $N_{iter} = N_{iter} + 1$
4: For $j = 1 : N_{pop}$
5:   Calculate the fitness according to (18)
6:   Compare and update $P_{id}$ and $P_{gd}$
7:   Update particle velocity according to (19)
8:   Update particle position according to (20)
9: End for
10: If $|P_{gd}^{(i)} - P_{gd}^{(i-1)}| < \Delta$
11:   Break
12: End If
13: End for
14: Output: $(\alpha_n^*, \beta_n^*)$

Proposition 3. PSO based on penalty function converges to the optimal power allocation strategy.
Proof. Reference [15] proves the convergence of PSO. The parameters of a convergent PSO should conform to the condition derived there. In this paper, we set $\omega = 1$ and $C_1 = C_2 = 1.8$ to satisfy the convergence requirement. In addition, although the power allocation for D2D groups involves $N$ pairs $(\alpha_n, \beta_n)$, the power allocation of each D2D group is independent of the others, because a D2D group only causes interference to the corresponding cellular user on the reused channel. Therefore, the power allocation problem can be decomposed into $N$ sub-problems. Each sub-problem converges to a stable optimal solution through PSO. As a consequence, the optimization problem in the second stage converges to a stable optimal solution.
Proposition 4. The computational complexity of PSO based on penalty function is $O(N \times N_{pop} \times N_{iter})$.
Proof. The computational complexity of PSO is related to the number of particles $N_{pop}$ and the number of iterations $N_{iter}$. PSO is performed once for each D2D group whose transmit power is optimized. Therefore, the computational complexity of each execution of PSO is $O(N_{pop} \times N_{iter})$, and the total computational complexity in the second stage is $O(N \times N_{pop} \times N_{iter})$.
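The second-stage procedure (external penalty plus the PSO velocity and position updates) can be sketched as follows. This is an illustration under stated assumptions rather than the authors' code. The function name `pso_penalty`, the quadratic penalty form, the `patience` parameter and the default values ($\omega = 0.7$, $C_1 = C_2 = 1.5$) are choices made here; the paper's own setting ($\omega = 1$, $C_1 = C_2 = 1.8$) can be passed in as arguments. The toy objective is hypothetical and merely stands in for the D2D utility of Equation (18), whose exact form is not reproduced in this text.

```python
import random


def pso_penalty(objective, constraints, dim=2, n_pop=30, max_iter=200,
                omega=0.7, c1=1.5, c2=1.5, penalty=1e3, tol=1e-3,
                patience=10, seed=0):
    """Maximise objective(x) subject to g(x) <= 0 for every g in constraints.

    Constraint violations are handled by an external quadratic penalty.
    Stops when the best fitness improves by less than tol for `patience`
    consecutive iterations, or after max_iter iterations.
    """
    rng = random.Random(seed)

    def fitness(x):
        violation = sum(max(0.0, g(x)) ** 2 for g in constraints)
        return objective(x) - penalty * violation

    # Random initial positions in the unit box, zero initial velocity.
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_pop)]
    vel = [[0.0] * dim for _ in range(n_pop)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p) for p in pos]
    best = max(range(n_pop), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[best][:], pbest_fit[best]

    stall = 0
    for _ in range(max_iter):
        previous = gbest_fit
        for i in range(n_pop):
            for d in range(dim):
                # Inertia + cognitive pull (own best) + social pull (swarm best).
                vel[i][d] = (omega * vel[i][d]
                             + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                             + c2 * rng.random() * (gbest[d] - pos[i][d]))
                pos[i][d] += vel[i][d]
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        stall = stall + 1 if gbest_fit - previous < tol else 0
        if stall >= patience:
            break
    return gbest, gbest_fit


# Hypothetical stand-in for the D2D utility over (alpha, beta).
def toy_utility(x):
    return -((x[0] - 0.3) ** 2 + (x[1] - 0.9) ** 2)


toy_constraints = [
    lambda x: x[0] + x[1] - 1.0,   # alpha + beta <= 1
    lambda x: -x[0],               # alpha >= 0
    lambda x: -x[1],               # beta >= 0
]
best_x, best_f = pso_penalty(toy_utility, toy_constraints, tol=0.0)
```

One design note: with `patience=1` the stopping rule matches the literal step 10 of Algorithm 1, but a single iteration without improvement is common in PSO, so a larger patience is used here to avoid stopping early. Passing `tol=0.0` disables early stopping altogether.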

Joint Channel and Power Allocation Based on Stackelberg Game
We propose a two-stage Stackelberg game in which the cellular users are the leaders and the D2D groups are the followers. In the first stage, we find the optimal match between cellular users and D2D groups according to Section 3.2. In the second stage, we optimize the D2D transmitter's transmit power in each D2D group according to Section 3.3. The two-stage Stackelberg game finally converges to a stable solution, as argued after the algorithm. The two-stage Stackelberg game based joint channel and power allocation algorithm (S-JCPA) is shown in Algorithm 2.

Algorithm 2. Stackelberg game based joint channel and power allocation (S-JCPA)
1: Initialization: Set of cellular users $\{C_1, C_2, \ldots, C_M\}$, set of D2D groups $\{D_1, D_2, \ldots, D_N\}$, power allocation coefficients $\{\alpha_1, \alpha_2, \ldots, \alpha_N\}$ and $\{\beta_1, \beta_2, \ldots, \beta_N\}$, set of historical channel allocation $\{H_i(t)\}_{i \in N} = \emptyset$, maximum number of iterations $K$.
2: For $t = 1 : K$
3:   Allocate channels for D2D groups via KM according to (12)
4:   If channel allocation results already exist in $H_i(t)$
5:     For $i = 1 : N$
6:       Optimize transmit power for D2D users via PSO according to (18)
7:       Update $\alpha_n$ and $\beta_n$
8:     End for
9:     break
10:   Else
11:     Save the channel allocation result to $H_i(t)$
12:     For $i = 1 : N$
13:       Optimize transmit power for D2D users via PSO according to (18)
14:       Update $\alpha_n$ and $\beta_n$
15:     End for
16:   End if
17: End for
18: Output: $(\eta^*, P^*)$

According to Sections 3.2 and 3.3, both stages converge to their optimal solutions. According to the characteristics of the Stackelberg game, when the leader and the follower both have an equilibrium solution, the Stackelberg equilibrium can be achieved. From the previous analysis, the network complexity of the system follows directly. Considering the computational complexity of KM and PSO, the network complexity is $O(K \times (M^3 + N \times N_{pop} \times N_{iter}))$.
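The control flow of Algorithm 2 can be sketched as a small driver that takes the two stages as callables. The names `s_jcpa`, `allocate_channels` and `optimize_power` are hypothetical, and the two stage functions are left abstract so that any matching routine and any power optimizer can be plugged in. The stubs at the end exist only to exercise the loop and carry no meaning beyond that.

```python
def s_jcpa(allocate_channels, optimize_power, groups, max_rounds=10):
    """Two-stage loop: leader (channel matching), then follower (power).

    allocate_channels(powers) -> allocation (any object comparable with ==)
    optimize_power(n, allocation) -> (alpha_n, beta_n) for D2D group n
    The loop ends once an allocation repeats or after max_rounds rounds.
    """
    history = []                           # previously seen allocations
    powers = {n: None for n in groups}     # current (alpha_n, beta_n) per group
    allocation = None
    for _ in range(max_rounds):
        # Stage 1: the leaders choose the channel allocation.
        allocation = allocate_channels(powers)
        repeated = allocation in history
        if not repeated:
            history.append(allocation)
        # Stage 2: each D2D group responds with its power coefficients.
        for n in groups:
            powers[n] = optimize_power(n, allocation)
        if repeated:
            break
    return allocation, powers


# Stub stages, for illustration only.
stage_one_calls = []


def fixed_matching(powers):
    stage_one_calls.append(dict(powers))
    return {0: 0, 2: 1}                    # cellular index -> D2D group index


def fixed_power(n, allocation):
    return (0.3, 0.7)


final_allocation, final_powers = s_jcpa(fixed_matching, fixed_power,
                                        groups=[0, 1], max_rounds=5)
```

With stage functions that always return the same result, the allocation repeats in the second round, so the driver runs stage one twice and then stops, mirroring steps 4 to 9 of Algorithm 2.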

Simulation and Performance Analysis
This section simulates and analyzes the proposed joint channel and power allocation algorithm based on the Stackelberg game. The system model is shown in Figure 1a. The simulation is built in a disc area with a radius of 500 m. The channel gain is subject to large-scale fading based on distance loss and small-scale fading based on Rayleigh fading [16]. The large-scale fading can be modeled as $\kappa d^{-\alpha}$, where $d$ represents the transmit distance, and $\kappa$ and $\alpha$ represent the fading constant and the path loss exponent, respectively. The Rayleigh fading follows the exponential distribution with a mean of 1. The simulation parameters are shown in Table 1.
Figure 3 plots the utilities of the cellular and D2D users for different numbers of D2D groups. When the number of D2D groups increases, the utilities of both cellular and D2D users decline. This is because, as the number of D2D groups increases, the gap between the number of cellular users and the number of D2D groups is reduced. When performing channel matching, it is difficult to obtain an optimal match for each user because of the lack of channel resources. Consequently, the utilities of both cellular users and D2D users decline. In Reference [17], a joint optimization algorithm for channel allocation and power control was proposed to optimize the throughput of D2D users. However, that study did not consider the effect of the social relationship between cellular and D2D users and did not optimize the utility function based on the social relationship. Hence, the utility obtained with the algorithm in Reference [17] was not as high as that obtained with our algorithm.
Figure 4 plots the average throughput (rate) of the cellular and D2D users for different numbers of D2D groups. As the number of D2D groups increases, the average throughput of both cellular and D2D users shows a downward trend. This is because cellular users represent the subchannels available for allocation in the system. As in Figure 3, the number of cellular users is unchanged whereas the number of D2D groups increases. Hence, it is difficult to obtain an optimal match for each individual because of the lack of channel resources. Consequently, the average throughput is reduced for both cellular and D2D users. Furthermore, Reference [17] aimed at optimizing the throughput of all the D2D users without considering whether the cellular users were willing to cooperate with them. Hence, the average throughput of D2D users in Reference [17] was higher than that obtained with our algorithm, whereas the average throughput of cellular users in Reference [17] was lower than that obtained with our algorithm.
Figure 5 shows the impact of the social relationships on the utilities of the cellular and D2D users. With a closer social relationship, the utility of D2D users continues to increase and the utility of cellular users continues to decrease, which is determined by their respective utility functions. When the social relationship between D2D and cellular users is not close, cellular users are not willing to allow the D2D users who reuse their channels to increase their transmit power to improve their throughput. Hence, the cellular users have high utility whereas the D2D users have low utility. When the social relationship is close, the D2D users can increase the transmit power on the cellular channel at a small expense. Consequently, the utility of the D2D users increases, whereas the utility of the cellular users gradually decreases. As the social relationship becomes closer still, the D2D users can increase their transmit power while paying almost nothing to the cellular users. Hence, the utility of the cellular users drops sharply, even approaching zero. However, as Reference [17] did not consider the social relationship between cellular and D2D users, the utility function based on the social relationship was not optimized. Consequently, the utilities of both cellular and D2D users were lower than those obtained with our algorithm.
Figure 6 shows the impact of social relationships on the average throughput of the cellular and D2D users. As Reference [17] did not consider the influence of social relationships, its average throughput for the cellular and D2D users is unchanged. However, in our algorithm, the closer the social relationship between the cellular users and D2D users, the more willing the cellular users are to allow the D2D users who reuse their channels to increase their transmit power to improve their throughput. As the social relationship becomes closer, the D2D users only need to pay a small expense to achieve a high transmit power. Therefore, the average throughput of the D2D users continuously increases, whereas the average throughput of the cellular users gradually decreases. Moreover, as the social relationship becomes closer, the average throughput of the D2D group in this study approaches that in Reference [17] and the average throughput of the cellular users becomes higher than that in Reference [17].

Figure 7 plots the network complexity for different numbers of D2D groups under different convergence thresholds and compares the proposed algorithm S-JCPA with the algorithm proposed in Reference [17]. Figure 8 shows the impact of the convergence threshold on the utilities of the cellular and D2D users. The study in Reference [17] first solved the channel allocation problem with KM and then optimized the D2D transmit power with KKT. In S-JCPA, KM and PSO are used to solve the resource allocation problem. Consequently, the network complexity of the algorithm in Reference [17] is less than that of our algorithm. We also compare the network complexity of our algorithm under different convergence thresholds. The results show that, when the convergence threshold is small, the network complexity is higher, and the utilities of the cellular and D2D users are higher as well because PSO can search for more accurate results. Because the utilities change little as the convergence threshold decreases further, we choose 0.001 as the convergence threshold rather than continuing to reduce it.
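The KM matching step referred to above can be illustrated with a small sketch. The code below is a generic Kuhn-Munkres (Hungarian) assignment routine in pure Python, not the authors' implementation. The function name `km_match` and the weight matrix are hypothetical: each weight stands in for the utility obtained when a D2D group (row) reuses the channel of a cellular user (column), and it is assumed that each channel is reused by at most one D2D group and that there are no more D2D groups than cellular users.

```python
def km_match(weights):
    """Maximum-weight one-to-one matching of rows to columns (rows <= columns).

    Hypothetical helper: weights[i][j] stands in for the utility of pairing
    D2D group i with cellular user j. Returns (assignment, total), where
    assignment[i] is the column matched to row i.
    """
    n, m = len(weights), len(weights[0])
    assert n <= m, "needs at least as many columns as rows"
    inf = float("inf")
    # Solve the equivalent minimum-cost problem on negated weights.
    cost = [[-w for w in row] for row in weights]
    u = [0] * (n + 1)    # row potentials (1-indexed)
    v = [0] * (m + 1)    # column potentials (1-indexed)
    p = [0] * (m + 1)    # p[j] = row currently matched to column j (0 = none)
    way = [0] * (m + 1)  # back-pointers along the augmenting path
    for i in range(1, n + 1):
        p[0] = i
        j0 = 0
        minv = [inf] * (m + 1)
        used = [False] * (m + 1)
        while True:
            used[j0] = True
            i0, delta, j1 = p[j0], inf, 0
            for j in range(1, m + 1):
                if not used[j]:
                    cur = cost[i0 - 1][j - 1] - u[i0] - v[j]
                    if cur < minv[j]:
                        minv[j], way[j] = cur, j0
                    if minv[j] < delta:
                        delta, j1 = minv[j], j
            # Update potentials so that a new zero-reduced-cost edge appears.
            for j in range(m + 1):
                if used[j]:
                    u[p[j]] += delta
                    v[j] -= delta
                else:
                    minv[j] -= delta
            j0 = j1
            if p[j0] == 0:  # reached a free column: augmenting path found
                break
        # Flip the matching along the augmenting path.
        while j0 != 0:
            j1 = way[j0]
            p[j0] = p[j1]
            j0 = j1
    assignment = [-1] * n
    for j in range(1, m + 1):
        if p[j] != 0:
            assignment[p[j] - 1] = j - 1
    total = sum(weights[i][assignment[i]] for i in range(n))
    return assignment, total


if __name__ == "__main__":
    # Illustrative utilities for 2 D2D groups and 3 cellular users (made up).
    print(km_match([[1, 2, 3], [3, 2, 1]]))
```

The routine runs in polynomial time in the matrix size, which is why the matching stage contributes little to the overall cost compared with the iterative power optimization.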

Conclusion
In this paper, we propose a joint channel and power allocation algorithm based on the Stackelberg game. We first establish a system model that includes several cellular users and D2D groups. Cellular users communicate in the traditional cellular mode, while D2D groups communicate by reusing the channel resources of cellular users. In each D2D group, NOMA is adopted to improve throughput. We also set an SINR threshold for each user to ensure the QoS of the system. Second, we model a two-stage Stackelberg game in which the cellular users are the leaders and the D2D groups are the followers. The utility functions of the cellular users and the D2D groups are both defined in terms of social relationships. By using KM and penalty-function-based PSO, we finally obtain the optimal channel and power allocation. The convergence and computational complexity of the algorithm are also discussed. The simulation results show that our algorithm can successfully strengthen the cooperation between users and improve the utility of cellular and D2D users.
Author Contributions: W.G. organized and developed the proposal of the study, carried out the mathematical analysis, and performed simulations using MATLAB. Q.Z. provided guidance and key suggestions, and finalized the paper.

Conflicts of Interest: The authors declare no conflict of interest.

Algorithm 1. PSO based on penalty function
1: Initialization: population size N_pop, maximum number of iterations N_ITER, number of iterations N_iter, maximum speed of the particle V_max, search region [0, 1]. Initialize each particle's velocity and position.
2: For i = 1 : N_ITER
3:
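Since only the initialization step of Algorithm 1 is listed here, the following is a minimal, generic sketch of penalty-function-based PSO rather than the authors' procedure. It reuses the listed parameters (population size, maximum number of iterations, maximum particle speed, search region [0, 1]); everything else is an assumption made for illustration: the inertia and acceleration coefficients, the quadratic penalty with weight `rho`, the stall-based stopping rule with threshold `tol`, and the toy two-user rate objective with a unit power budget.

```python
import math
import random


def penalized(x, objective, constraints, rho):
    """Objective to maximize, minus a quadratic penalty for each violated g(x) <= 0."""
    violation = sum(max(0.0, g(x)) ** 2 for g in constraints)
    return objective(x) - rho * violation


def pso_penalty(objective, constraints, dim, n_pop=30, n_iter=200,
                v_max=0.2, rho=1e3, tol=1e-3, patience=20, seed=0):
    """Maximize `objective` over [0, 1]^dim subject to g(x) <= 0 for g in constraints."""
    rng = random.Random(seed)
    w, c1, c2 = 0.7, 1.5, 1.5  # assumed inertia and acceleration coefficients

    def fit(x):
        return penalized(x, objective, constraints, rho)

    # Initialize each particle's position and velocity.
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_pop)]
    vel = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_pop)]
    pbest = [p[:] for p in pos]
    pbest_val = [fit(p) for p in pos]
    g = max(range(n_pop), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]

    stall = 0
    for _ in range(n_iter):
        prev = gbest_val
        for i in range(n_pop):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))                  # speed limit
                pos[i][d] = max(0.0, min(1.0, pos[i][d] + vel[i][d]))   # stay in [0, 1]
            val = fit(pos[i])
            if val > pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val > gbest_val:
                    gbest, gbest_val = pos[i][:], val
        # Assumed stopping rule: stop once the best value has improved by less
        # than `tol` for `patience` consecutive iterations.
        stall = stall + 1 if gbest_val - prev < tol else 0
        if stall >= patience:
            break
    return gbest, gbest_val


def toy_rate(x):
    """Hypothetical sum rate of two users with made-up channel gains 4 and 2."""
    return math.log2(1 + 4 * x[0]) + math.log2(1 + 2 * x[1])


def power_budget(x):
    """Hypothetical constraint x0 + x1 <= 1, written in the form g(x) <= 0."""
    return x[0] + x[1] - 1.0


if __name__ == "__main__":
    print(pso_penalty(toy_rate, [power_budget], dim=2))
```

A smaller `tol` lets the swarm keep refining the solution at the cost of more iterations, which mirrors the trade-off between convergence threshold and complexity discussed for Figures 7 and 8.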

Figure 3. D2D (device-to-device) and cellular users' utility for different algorithms with different numbers of D2D groups, M = 20.

Figure 4. Average throughput of cellular and D2D users for different algorithms with different numbers of D2D groups, M = 20, $S_{m,n}^{k} \sim (0, 1)$.

Figure 6. Average throughput of cellular and D2D users for different algorithms with different social relationships, M = 20, N = 5.

Figure 7. Network complexity for different algorithms with different numbers of D2D groups, M = 20.

Funding: This work is supported by the National Natural Science Foundation of China (61971239, 61631020) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX19_0945).
, respectively. $g_{n,1}$ and $g_{m,n,1}$ are the channel gains between $DT_n$ and $DR_n^1$, and between $C_m$ and $DR_n^1$, respectively. $\zeta_n^1$ represents the AWGN at $DR_n^1$. If D2D receiver $DR^1$