Stackelberg Game Based Social-Aware Resource Allocation for NOMA Enhanced D2D Communications

Gu, Wenying; Zhu, Qi

doi:10.3390/electronics8111360


by Wenying Gu ¹,² and Qi Zhu ¹,²,*

¹ Jiangsu Key Laboratory of Wireless Communications, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

² Engineering Research Center of Health Service System Based on Ubiquitous Wireless Networks, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

* Author to whom correspondence should be addressed.

Electronics 2019, 8(11), 1360; https://doi.org/10.3390/electronics8111360

Submission received: 28 September 2019 / Revised: 9 November 2019 / Accepted: 14 November 2019 / Published: 16 November 2019

(This article belongs to the Special Issue Cooperative Communications for Future Wireless Systems)


Abstract:

Device-to-device (D2D) communication and non-orthogonal multiple access (NOMA) have been considered promising techniques to improve system throughput. In the NOMA-enhanced D2D scenario, a joint channel and power allocation algorithm based on the Stackelberg game is proposed in this paper. The social relationship between the cellular and D2D users is utilized to define their utility functions. In the two-stage Stackelberg game, the cellular user is the leader and the D2D group is the follower. Cellular users and D2D groups are matched via the Kuhn–Munkres (KM) algorithm to allocate channels for D2D groups in the first stage. The power allocation of D2D users is optimized through a penalty-function-based particle swarm optimization algorithm (PSO) in the second stage. The simulation results show that the proposed algorithm can effectively strengthen the cooperation between cellular and D2D users and improve their utility.

Keywords:

D2D; NOMA; Stackelberg game; social relationship

1. Introduction

With the rapid growth of mobile terminals and multimedia services, the demand for high-rate data transmission has increased and the traffic pressure on the core network has become extremely high. Device-to-device (D2D) communication is considered a key technology to relieve the pressure of the core network effectively [1]. It enables users to communicate with each other by reusing the resources of other users without passing through the base station, thus significantly improving spectrum utilization and system throughput. Non-orthogonal multiple access (NOMA) is also a recent research hot spot. Compared with orthogonal multiple access (OMA), it has higher spectral efficiency and can provide faster transmission rate and lower outage probability [2].

In wireless communication, resource allocation has attracted widespread attention as it considerably affects system performance [3,4]. Game theory is often used to solve such problems [5,6]. In D2D communication, many studies focused on the resource allocation problem to improve the system performance. In Reference [7], a joint optimization algorithm for channel and power allocation based on the Nash bargaining game was proposed. It decomposed the optimization problem into two sub-problems, which simplified the calculation and improved the system throughput. In Reference [8], D2D power allocation was studied under cooperative and non-cooperative games, and the D2D transmit power was optimized via sub-gradient methods. In Reference [9], a D2D power auction mechanism based on a stochastic game was proposed to reduce interference and optimize power allocation. In Reference [10], a student-project matching model of cellular, D2D, and relay users, which improved the system throughput, was proposed. However, none of the above algorithms considers NOMA and no further improvement in system throughput was achieved.

A D2D transmitter can send messages to multiple D2D receivers simultaneously through NOMA. The allocation of D2D communication resources assisted by NOMA was studied in Reference [11], and the D2D throughput was optimized while ensuring the quality of service (QoS) of cellular users. In Reference [12], NOMA-based D2D resource allocation was studied as a Nash bargaining game, and the power optimization problem was solved by using the Karush–Kuhn–Tucker (KKT) conditions. In Reference [13], a D2D-NOMA optimization algorithm combining sub-channel allocation, user matching, and power control was proposed to optimize the total transmit power by coordinating interference. However, mobile devices are carried by humans, and none of the above solutions considers the influence of social factors. In a practical environment, social relationships affect users’ decision-making and can be used to strengthen the cooperation between users, thereby effectively improving system throughput.

In this study, a cellular uplink network scenario is considered. Cellular users occupy independent sub-channels, and D2D groups, which consist of a D2D transmitter and two D2D receivers, reuse the uplink channels of the cellular users to communicate with each other. In D2D groups, NOMA is considered to make the D2D receivers demodulate the signal correctly from the mixed signal. The resource allocation is modeled as a two-stage Stackelberg game by defining the utility functions of cellular users and D2D groups. The main contributions of this paper can be summarized as follows:

The social relationship between cellular and D2D users is considered. When D2D users reuse the channel resources of cellular users for communication, their social relationship will affect the channel selection and transmit power of D2D users. Considering the social relationship can strengthen the cooperation between users, thus increasing the system throughput.
In the two-stage Stackelberg game model, the cellular user is the leader. In the first stage, maximum weight matching between cellular users and D2D groups is achieved using the Kuhn–Munkres (KM) algorithm while ensuring the QoS of all users on the sub-channel to allocate channels for D2D groups.
The D2D group is the follower. In the second stage, a penalty-function-based particle swarm optimization (PSO) algorithm is utilized to optimize the D2D transmit power. The final power allocation strategy is determined via the convergence of PSO.

The rest of this paper is organized as follows. Section 2 presents the system model. In Section 3, we define the utility functions of cellular users and D2D groups, establish the Stackelberg game model, and prove the convergence of the algorithm. In Section 4, the simulation results are presented and analyzed. In Section 5, we summarize the paper.

2. System Model

In the cellular communication system, a single-cell uplink transmission scenario is considered. As shown in Figure 1a, $M$ cellular users $\{C_{1}, C_{2}, \dots, C_{M}\}$ and $N$ D2D groups $\{D_{1}, D_{2}, \dots, D_{N}\}$ are randomly distributed in the cell. The BS allocates a dedicated subchannel to each cellular user, and the subchannels $\{SC_{1}, SC_{2}, \dots, SC_{M}\}$ are orthogonal to each other. Without loss of generality, we assume that the cellular user $C_{m}$ occupies the subchannel $SC_{m}$. The cellular users communicate with the BS in the traditional cellular mode. A D2D group differs from a traditional D2D pair: it contains one D2D transmitter and several D2D receivers. We consider the NOMA transmission protocol and successive interference cancellation (SIC) within D2D groups, so that the D2D transmitter can send messages to multiple D2D receivers simultaneously and each D2D receiver can correctly demodulate the message intended for it. As the number of D2D receivers increases, problems such as complex interference and high computational complexity arise. This paper therefore assumes that each D2D group consists of one D2D transmitter $DT_{n}$ and two D2D receivers $DR_{n}^{1}$ and $DR_{n}^{2}$, and that each D2D receiver is randomly distributed within a disc centred on $DT_{n}$.

Because mobile devices are carried by humans, the social relationship between cellular users and D2D users is taken into account. The social relationship affects the channel selection and power allocation of D2D users and thereby strengthens the cooperation between D2D users and cellular users. Since the system model is closely associated with the social relationship between users, the physical domain and the social domain are analyzed separately below.

The physical domain describes the impact of channel conditions and system interference in a practical network. In this paper, D2D groups communicate by reusing the uplink channel resources of cellular users. Therefore, cellular users may cause interference to D2D receivers, and the BS suffers interference from D2D transmitters. The physical domain can be represented as a graph $G(V_{p}, E_{p})$, where $V_{p}$ denotes the devices and $E_{p}$ indicates the channel quality for data transmission. The physical domain shows whether the channel can meet the communication requirements of users.

The social domain describes the users’ social attributes, as shown in Figure 1b. Similarly, the social domain can be represented as a graph $G(V_{s}, E_{s})$, where $V_{s}$ denotes the users and $E_{s}$ indicates the social relationship between cellular users and D2D receivers. The social relationship is defined as $S_{m, n}^{k} \in [0, 1]$, $k \in \{1, 2\}$. When two users have a very close social relationship, $S_{m, n}^{k}$ is close to one and they are more willing to cooperate with each other, which means that the cellular user is more willing to let the D2D user occupying its channel increase its transmit power.

This paper assumes that each cellular user occupies an independent subchannel and that each subchannel can be reused by only one D2D group. Meanwhile, each D2D group can reuse the channel of only one cellular user. Therefore, the signal received at the BS on the subchannel $SC_{m}$ can be expressed as

y_{m} = \sqrt{P_{c}} g_{m, B} x_{m} + \sum_{n} η_{m, n} \sqrt{P_{d}} g_{n, B} x_{n} + ζ_{m}

(1)

where $P_{c}$ and $P_{d}$ represent the transmit power of the cellular user and the D2D transmitter, respectively. $g_{m, B}$ and $g_{n, B}$ are the channel gains between $C_{m}$ and the BS and between $DT_{n}$ and the BS, respectively. $η_{m, n}$ indicates whether $D_{n}$ reuses $SC_{m}$: if $SC_{m}$ is reused by $D_{n}$, then $η_{m, n} = 1$; otherwise $η_{m, n} = 0$. $x_{m}$ and $x_{n}$ are the signals sent by $C_{m}$ and $DT_{n}$, respectively. $ζ_{m}$ represents the additive white Gaussian noise (AWGN) on the channel. Consequently, the signal-to-interference-plus-noise ratio (SINR) and transmission rate of $C_{m}$ at the BS can be defined as

γ_{m} = \frac{P_{c} g_{m, B}}{\sum_{n} η_{m, n} P_{d} g_{n, B} + N_{0}}

(2)

R_{m} = \log_{2} (1 + γ_{m})

(3)

where $N_{0}$ represents the noise power.
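As a rough numerical illustration of Equations (2) and (3), the following Python sketch evaluates the cellular SINR and rate with and without a reusing D2D group. The function names and all numerical values are assumptions chosen for illustration; they are not taken from the paper's simulation setup.

```python
import math


def cellular_sinr(p_c, g_mB, p_d, g_nB, n0, reused=True):
    """SINR of cellular user C_m at the BS, following Eq. (2).

    At most one D2D group reuses SC_m, so the interference sum has one term.
    """
    interference = p_d * g_nB if reused else 0.0
    return p_c * g_mB / (interference + n0)


def rate(sinr):
    """Rate in bit/s/Hz, following Eq. (3)."""
    return math.log2(1.0 + sinr)


# Illustrative (assumed) powers in W, channel gains, and noise power.
p_c, p_d, n0 = 0.2, 0.1, 1e-9
g_mB, g_nB = 1e-7, 1e-8

gamma_m = cellular_sinr(p_c, g_mB, p_d, g_nB, n0)
r_m = rate(gamma_m)
# Rate without any D2D reuse; this has the same form as R_m^0 in Eq. (12).
r_m0 = rate(cellular_sinr(p_c, g_mB, p_d, g_nB, n0, reused=False))
```

With these assumed values the D2D interference equals the noise power, so reuse halves the cellular SINR; the difference `r_m0 - r_m` is the throughput the cellular user gives up.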

Considering NOMA in D2D groups, we set the power allocation coefficients of the D2D transmitter $DT_{n}$ as $α_{n}$ and $β_{n}$, with $α_{n} + β_{n} \leq 1$. Therefore, the signal received by the D2D receiver $DR_{n}^{1}$ can be expressed as

y_{n}^{1} = (\sqrt{α_{n} P_{d}} x_{n}^{1} + \sqrt{β_{n} P_{d}} x_{n}^{2}) g_{n, 1} + \sqrt{P_{c}} g_{m, n, 1} x_{m} + ζ_{n}^{1}

(4)

where $x_{n}^{1}$ and $x_{n}^{2}$ are the signals sent to $DR_{n}^{1}$ and $DR_{n}^{2}$, respectively. $g_{n, 1}$ and $g_{m, n, 1}$ are the channel gains between $DT_{n}$ and $DR_{n}^{1}$ and between $C_{m}$ and $DR_{n}^{1}$, respectively. $ζ_{n}^{1}$ represents the AWGN at $DR_{n}^{1}$.

For the D2D receiver $DR_{n}^{1}$ to remove $x_{n}^{2}$ and demodulate $x_{n}^{1}$ properly through SIC, the following condition must be met [14]:

\frac{β_{n} P_{d} g_{n, 1}}{α_{n} P_{d} g_{n, 1} + P_{c} g_{m, n, 1} + N_{0}} \geq \frac{β_{n} P_{d} g_{n, 2}}{α_{n} P_{d} g_{n, 2} + P_{c} g_{m, n, 2} + N_{0}}

(5)

where $g_{n, 2}$ and $g_{m, n, 2}$ are the channel gains between $DT_{n}$ and $DR_{n}^{2}$ and between $C_{m}$ and $DR_{n}^{2}$, respectively.

Equation (5) can be simplified as

A (η) = (P_{c} g_{m, n, 2} + N_{0}) g_{n, 1} - (P_{c} g_{m, n, 1} + N_{0}) g_{n, 2} \geq 0

(6)

As shown in Equation (6), the inequality is independent of the power allocation coefficients $α_{n}$ and $β_{n}$ and depends only on the channel allocation $η_{m, n}$. Consequently, it can be expressed as a function of $η$.
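The step from Equation (5) to Equation (6) can be spelled out as follows. This is a sketch that assumes $β_{n} P_{d} > 0$, so the common factor can be cancelled, and uses the fact that both denominators are positive, so cross-multiplication preserves the inequality:

```latex
\begin{align*}
\frac{β_{n} P_{d} g_{n,1}}{α_{n} P_{d} g_{n,1} + P_{c} g_{m,n,1} + N_{0}}
  &\geq \frac{β_{n} P_{d} g_{n,2}}{α_{n} P_{d} g_{n,2} + P_{c} g_{m,n,2} + N_{0}} \\
g_{n,1} \left( α_{n} P_{d} g_{n,2} + P_{c} g_{m,n,2} + N_{0} \right)
  &\geq g_{n,2} \left( α_{n} P_{d} g_{n,1} + P_{c} g_{m,n,1} + N_{0} \right) \\
(P_{c} g_{m,n,2} + N_{0}) g_{n,1} - (P_{c} g_{m,n,1} + N_{0}) g_{n,2} &\geq 0
\end{align*}
```

The term $α_{n} P_{d} g_{n,1} g_{n,2}$ appears on both sides of the second line and cancels, which is why the power allocation coefficients drop out of the condition.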

Thus, the SINR and transmission rate at the D2D receivers $DR_{n}^{1}$ and $DR_{n}^{2}$ can be defined as

γ_{n, 1} = \frac{α_{n} P_{d} g_{n, 1}}{P_{c} g_{m, n, 1} + N_{0}}

(7)

γ_{n, 2} = \frac{β_{n} P_{d} g_{n, 2}}{α_{n} P_{d} g_{n, 2} + P_{c} g_{m, n, 2} + N_{0}}

(8)

R_{n, 1} = \log_{2} (1 + γ_{n, 1})

(9)

R_{n, 2} = \log_{2} (1 + γ_{n, 2})

(10)

The above results are all based on the assumption that $DR_{n}^{1}$ can remove $x_{n}^{2}$ and correctly demodulate $x_{n}^{1}$. The opposite case, in which $DR_{n}^{2}$ removes $x_{n}^{1}$ and correctly demodulates $x_{n}^{2}$, is similar and is not derived again.
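To make Equations (6) to (10) concrete, the sketch below first checks the SIC condition and then evaluates the two D2D rates for one group. The helper names and every numerical value are illustrative assumptions, not parameters from the paper.

```python
import math


def sic_margin(p_c, n0, g_n1, g_n2, g_mn1, g_mn2):
    """A(eta) from Eq. (6); a non-negative value means DR_n^1 can cancel x_n^2."""
    return (p_c * g_mn2 + n0) * g_n1 - (p_c * g_mn1 + n0) * g_n2


def d2d_rates(alpha, beta, p_d, p_c, n0, g_n1, g_n2, g_mn1, g_mn2):
    """Rates of DR_n^1 and DR_n^2 following Eqs. (7)-(10).

    DR_n^1 removes x_n^2 via SIC; DR_n^2 treats x_n^1 as interference.
    """
    gamma_1 = alpha * p_d * g_n1 / (p_c * g_mn1 + n0)
    gamma_2 = beta * p_d * g_n2 / (alpha * p_d * g_n2 + p_c * g_mn2 + n0)
    return math.log2(1.0 + gamma_1), math.log2(1.0 + gamma_2)


# Assumed example: DR_n^1 has the stronger link to the D2D transmitter.
params = dict(p_c=0.2, n0=1e-9, g_n1=1e-6, g_n2=2e-7, g_mn1=1e-9, g_mn2=1e-9)
margin = sic_margin(**params)
r_n1, r_n2 = d2d_rates(alpha=0.2, beta=0.8, p_d=0.1, **params)
```

In this assumed setting the margin is positive, so the decoding order used in Equations (7) and (8) applies; with a negative margin the roles of the two receivers would be swapped.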

3. Stackelberg Game Based Resource Allocation

According to the system model, this paper mainly studies the channel and power allocation of D2D groups under NOMA. The channel that a D2D group reuses affects the transmit power of its D2D transmitter, and the transmit power in turn affects the channel selection, so the problem fits the Stackelberg game. We therefore design a two-stage Stackelberg game model in which the cellular users are the leaders and the D2D groups are the followers. In the first stage, we use the KM algorithm to match the cellular users with the D2D groups in order to allocate subchannels to the D2D groups. In the second stage, a penalty-function-based PSO algorithm is used to optimize the transmit power of the D2D users.

3.1. Utility Model

The utility functions of cellular users and D2D groups are defined on the basis of their benefit and loss. The first stage mainly solves the channel allocation problem, that is, the matching problem between cellular users and D2D groups. When D2D users reuse the channel of a cellular user, they cause interference to the cellular user and reduce its throughput. Therefore, a D2D user needs to pay a certain price for using the cellular channel. Consequently, for cellular users, the incentive is mainly derived from the reward for allowing D2D groups to transmit at a given power, priced according to the social relationship. Meanwhile, they also sacrifice some of their throughput. Hence, the utility function of cellular users can be defined as

U_{m}^{c} = (1 - S_{m, n}^{1}) V * α_{n} P_{d} + (1 - S_{m, n}^{2}) V * β_{n} P_{d} - (R_{m}^{0} - R_{m})

(11)

where $S_{m, n}^{1}$ and $S_{m, n}^{2}$ are the social relationships between the cellular user $C_{m}$ and the D2D receivers $DR_{n}^{1}$ and $DR_{n}^{2}$, respectively. $V$ represents the price per unit power. $(1 - S_{m, n}^{k}) V$ is the actual price per unit power, which depends on the social relationship between the two users: the closer the social relationship, the lower the actual price. $R_{m}^{0}$ denotes the data rate of $C_{m}$ when no D2D user reuses $SC_{m}$ and can be expressed as

R_{m}^{0} = \log_{2} (1 + \frac{P_{c} g_{m, B}}{N_{0}})

(12)

The second stage mainly solves the power allocation for D2D users. We do not optimize the transmit power of the cellular users here and set it to a fixed value. Therefore, power allocation means optimizing how the D2D transmitter splits its transmit power between the two D2D receivers. For D2D users, the incentive is mainly derived from the increase in data rate after reusing the cellular channel. If the data rate does not improve after reusing the cellular channel, the utility is less than zero and the cellular mode is selected for communication; if the data rate increases, the D2D users pay for the transmit power. Consequently, we can obtain the utility functions of $DR_{n}^{1}$ and $DR_{n}^{2}$:

U_{n, 1}^{d} = (R_{n, 1} - R_{n, 1}^{c}) - (1 - S_{m, n}^{1}) V * α_{n} P_{d}

(13)

U_{n, 2}^{d} = (R_{n, 2} - R_{n, 2}^{c}) - (1 - S_{m, n}^{2}) V * β_{n} P_{d}

(14)

where $R_{n, 1}^{c}$ and $R_{n, 2}^{c}$ are the data rates achieved when the D2D users do not reuse the cellular channel and instead send messages via the BS in the traditional cellular mode. They can be defined analogously to Equation (12).

Hence, the utility function of the D2D groups is given by

U_{n}^{d} = U_{n, 1}^{d} + U_{n, 2}^{d}

(15)
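The utility definitions in Equations (11) and (13) to (15) can be sketched in a few lines of Python. The function and argument names are hypothetical, and the rates are passed in as plain numbers (assumed values) rather than computed from a channel model.

```python
def power_payment(s1, s2, price, alpha, beta, p_d):
    """Payment from the D2D group to the cellular user: sum of (1 - S) * V * power."""
    return (1.0 - s1) * price * alpha * p_d + (1.0 - s2) * price * beta * p_d


def cellular_utility(s1, s2, price, alpha, beta, p_d, r_m0, r_m):
    """Eq. (11): payment received minus the throughput lost to D2D interference."""
    return power_payment(s1, s2, price, alpha, beta, p_d) - (r_m0 - r_m)


def d2d_group_utility(s1, s2, price, alpha, beta, p_d, r_n1, r_n2, r_c1, r_c2):
    """Eqs. (13)-(15): rate gain over cellular mode minus the payment."""
    gain = (r_n1 - r_c1) + (r_n2 - r_c2)
    return gain - power_payment(s1, s2, price, alpha, beta, p_d)


# Assumed example: close tie to DR_n^1 (S = 0.5), no tie to DR_n^2 (S = 0).
common = dict(s1=0.5, s2=0.0, price=10.0, alpha=0.2, beta=0.8, p_d=0.1)
u_c = cellular_utility(r_m0=4.0, r_m=3.5, **common)
u_d = d2d_group_utility(r_n1=4.0, r_n2=2.0, r_c1=1.0, r_c2=0.5, **common)
```

Because the payment enters the two utilities with opposite signs, it cancels in the sum `u_c + u_d`; the social relationship therefore redistributes utility between the leader and the follower for a given power allocation.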

3.2. Analysis of Leaders

Cellular users are the leaders in the Stackelberg game. In the first stage, we mainly solve the matching problem among cellular users and D2D groups. Based on cellular user’s utility function defined in the previous section, the channel allocation problem can be formulated as the following:

\max_{η} \sum_{m} U_{m}^{c} (η, P)

(16a)

\text{s.t.} \quad γ_{m} \geq γ_{m}^{th}, \quad \forall m

(16b)

γ_{n, 1} \geq γ_{n, 1}^{th}, \quad γ_{n, 2} \geq γ_{n, 2}^{th}, \quad \forall n

(16c)

A (η) \geq 0, \quad \forall m, n

(16d)

η_{m, n} \in \{0, 1\}, \quad \forall m, n

(16e)

\sum_{m} η_{m, n} \leq 1, \quad \forall n

(16f)

\sum_{n} η_{m, n} \leq 1, \quad \forall m

(16g)

where Equation (16a) is the objective, which maximizes the utility of the cellular users through channel allocation. Constraint (16b) limits the interference that the D2D user causes to the cellular user and ensures the QoS of the cellular user. Constraint (16c) guarantees the QoS of D2D users. Constraint (16d) is the requirement that must be met when SIC is used. Constraint (16e) indicates that $η_{m, n}$ is either 1 or 0, representing whether $D_{n}$ reuses $SC_{m}$ or not. Constraint (16f) indicates that each D2D group can reuse the subchannel of only one cellular user. Constraint (16g) indicates that only one D2D group can be assigned to each subchannel.

Problem (16) is non-convex because it is a 0–1 integer program. It can be transformed into the optimal matching problem of a weighted bipartite graph. As shown in Figure 2, the cellular users and D2D groups form the two vertex sets of the bipartite graph, and the utility of the cellular user serves as the edge weight $w_{m, n}$. The principle of the matching process is that each vertex can be matched to only one vertex on the other side, and each vertex should select the edge with the largest weight if possible. Therefore, the optimization problem can be converted to $\max \sum_{m} \sum_{n} w_{m, n}$.

The KM algorithm can be used to solve problem (16) because it solves the maximum-weight matching problem under perfect matching via the Hungarian method. Specifically, it transfers the edge weights to vertex labels and searches for a perfect matching among the feasible edges via the Hungarian method. During the matching process, it continuously adjusts the vertex labels, which adds feasible edges, and then uses the Hungarian method to find the final matching. However, the KM algorithm requires that both sides of the bipartite graph contain the same number of vertices. We assume that the number of D2D groups is no more than the number of cellular users in this paper. To apply the KM algorithm in our scenario, it is therefore necessary to add virtual vertices to the D2D side. In addition, to avoid a non-conforming match, we reset the weight of an edge to zero if constraints (16b–d) are not met. Furthermore, the KM algorithm inherently complies with constraints (16e–g). Consequently, we can solve the channel allocation problem through the KM algorithm in the first stage.

Proposition 1.

KM algorithm converges to the optimal channel allocation strategy.

Proof.

During the matching process of the KM algorithm, whenever the match changes, the total utility of all the cellular users does not decrease and the utility of at least one cellular user increases, so the matching moves toward the optimal perfect matching. Since the numbers of cellular users and D2D groups participating in the match are finite, the number of possible matchings is also finite. Consequently, KM is bound to converge to the optimal match after a finite number of iterations. □

Proposition 2.

The computational complexity of KM is $O(M^{3})$.

Proof.

The computational complexity of KM is related to the number of vertices. As mentioned above, the number of vertices on each side of the bipartite graph in our scenario is $M$. Hence, the computational complexity of KM is $O(M^{3})$. □

3.3. Analysis of Followers

D2D groups are the followers in the Stackelberg game. In the second stage, we mainly solve the power allocation for D2D users. Based on the utility function of D2D groups defined in Section 3.1, the power allocation problem can be formulated as the following:

\max_{P_{n}} U_{n}^{d} (η, P)

(17a)

\text{s.t.} \quad γ_{m} \geq γ_{m}^{th}, \quad \forall m

(17b)

γ_{n, 1} \geq γ_{n, 1}^{th}, \quad γ_{n, 2} \geq γ_{n, 2}^{th}, \quad \forall n

(17c)

α_{n} \geq 0, \quad β_{n} \geq 0, \quad \forall n

(17d)

α_{n} + β_{n} \leq 1, \quad \forall n

(17e)

where Equation (17a) is the objective, which maximizes the utility of the D2D group through power allocation. Constraints (17b,c) ensure the QoS of all the users on $SC_{m}$. Constraints (17d,e) indicate that the total transmit power of the D2D transmitter $DT_{n}$ should not exceed the power threshold and that the power used to send signals to $DR_{n}^{1}$ and to $DR_{n}^{2}$ should each be non-negative.

Considering that Equation (17a) is a constrained optimization problem, we can transform it into an unconstrained optimization problem by the external penalty function method. The corresponding augmented objective function can be defined as

B (η, P, M) = U_{n}^{d} (η, P) - M [\begin{array}{l} \min^{2} (γ_{m} - γ_{m}^{t h}, 0) + \min^{2} (γ_{n, 1} - γ_{n, 1}^{t h}, 0) + \min^{2} (γ_{n, 2} - γ_{n, 2}^{t h}, 0) \\ + \min^{2} (α_{n}, 0) + \min^{2} (β_{n}, 0) + \min^{2} (1 - α_{n} - β_{n}, 0) \end{array}]

(18)
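A minimal sketch of the augmented objective in Equation (18) is given below. Each constraint of (17b) to (17e) is rewritten in the form $g_{i} \geq 0$, and only violated constraints contribute a quadratic penalty. The helper names, SINR values, thresholds, and penalty factor are illustrative assumptions; note that the penalty factor $M$ in Equation (18) is a different quantity from the number of cellular users.

```python
def d2d_constraints(alpha, beta, gamma_m, gamma_n1, gamma_n2, th_m, th_1, th_2):
    """Constraint values of (17b)-(17e), each written so that g_i >= 0 is required."""
    return [
        gamma_m - th_m,        # QoS of the cellular user, (17b)
        gamma_n1 - th_1,       # QoS of DR_n^1, (17c)
        gamma_n2 - th_2,       # QoS of DR_n^2, (17c)
        alpha,                 # non-negative power share, (17d)
        beta,                  # non-negative power share, (17d)
        1.0 - alpha - beta,    # total power budget, (17e)
    ]


def penalized_utility(utility, constraints, penalty_factor):
    """Eq. (18): B = U - M * sum(min(g_i, 0)^2); feasible points are not penalized."""
    violation = sum(min(g, 0.0) ** 2 for g in constraints)
    return utility - penalty_factor * violation


# Assumed SINRs and thresholds; only the power budget differs between the two cases.
feasible = d2d_constraints(0.3, 0.6, 12.0, 8.0, 3.0, 5.0, 2.0, 2.0)
infeasible = d2d_constraints(0.7, 0.5, 12.0, 8.0, 3.0, 5.0, 2.0, 2.0)
b_feasible = penalized_utility(3.6, feasible, penalty_factor=1000.0)
b_infeasible = penalized_utility(3.6, infeasible, penalty_factor=1000.0)
```

In the second case the budget is exceeded by 0.2, so the penalty is $1000 \times 0.2^{2} = 40$ and the augmented objective drops far below the feasible value, which steers an unconstrained search back into the feasible region.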

Based on the channel allocation in the previous section, Equation (18) mainly optimizes $α_{n}$ and $β_{n}$. This problem is non-convex and can be solved via PSO. PSO is a parallel algorithm. The main idea of PSO is to initialize a group of random particles within the search domain. In each iteration, each particle adjusts its position according to its fitness, which is determined by the objective function. Two factors affect a particle’s speed and position: the best solution found by the particle itself and the best solution found so far by the whole population. Through continuous iteration, all particles approach the global optimal solution.

On the basis of the main idea of PSO, the position of a particle can be expressed as $X_{id}$, where $i$ is the particle index and $d$ is the dimension. Equation (18) mainly optimizes $α_{n}$ and $β_{n}$, so each particle represents a pair of power allocation coefficients $(α_{n}, β_{n})$. Hence, it is a 2D optimization problem. The population $X_{id}$ can be expressed as $\{(α_{1}, β_{1}), (α_{2}, β_{2}), \dots, (α_{N_{pop}}, β_{N_{pop}})\}$, where $N_{pop}$ represents the size of the population. Each particle constantly adjusts its speed and position to approach the optimal value of (18) on the joint domain of $α_{n}$ and $β_{n}$.

The updated speed can be defined as

V_{id}^{'} = ω V_{id} + C_{1} \, \mathrm{random}(0, 1) \, (P_{id} - X_{id}) + C_{2} \, \mathrm{random}(0, 1) \, (P_{gd} - X_{id})

(19)

where $ω$ is the non-negative inertia weight, which determines the speed of finding the optimal solution. $C_{1}$ and $C_{2}$ are the acceleration constants that characterize cognitive behaviour and social behaviour, respectively. $\mathrm{random}(0, 1)$ is a random number in $[0, 1]$. $P_{id}$ represents the individual best position of particle $i$ in dimension $d$, and $P_{gd}$ represents the best position of the population in dimension $d$.

The updated position can be defined as

X_{i d}^{'} = X_{i d} + V_{i d}^{'}

(20)

The algorithm stops when the change in fitness of the optimal position is less than the convergence threshold $Δ$ or when the maximum number of iterations is reached. Through the continuous updating of the particles’ speeds and positions, the optimal value of the power allocation coefficients can be obtained. The proposed power allocation algorithm is shown in Algorithm 1.

Algorithm 1. PSO based on penalty function

1: Initialization: population size $N_{pop}$, maximum number of iterations $N_{ITER}$, iteration counter $N_{iter}$, maximum particle speed $V_{\max}$, search region [0, 1]. Initialize the velocity and position of each particle.
2: For i = 1 : $N_{ITER}$
3:     $N_{iter} = N_{iter} + 1$
4:     For j = 1 : $N_{pop}$
5:         Calculate the fitness according to (18)
6:         Compare and update $P_{id}$ and $P_{gd}$
7:         Update particle velocity according to (19)
8:         Update particle position according to (20)
9:     End for
10:     If $| P_{gd}(i) - P_{gd}(i - 1) | < Δ$
11:         Break
12:     End if
13: End for
14: Output: $(α_{n}^{*}, β_{n}^{*})$
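The listing above can be sketched in Python as follows. This is an illustrative implementation under stated assumptions rather than the authors' code: the fitness is a toy stand-in for Equation (18), the default parameter values are arbitrary, positions are clamped to the search region, and a `patience` counter (an addition not present in Algorithm 1) is used so that a single iteration without improvement does not stop the search prematurely.

```python
import math
import random


def pso_maximize(fitness, n_pop=30, n_iter=200, v_max=0.2, omega=0.7,
                 c1=1.5, c2=1.5, delta=1e-3, patience=10, seed=0):
    """Maximize fitness over (alpha, beta) in [0, 1]^2 with a basic PSO."""
    rng = random.Random(seed)
    dim = 2
    pos = [[rng.random() for _ in range(dim)] for _ in range(n_pop)]
    vel = [[rng.uniform(-v_max, v_max) for _ in range(dim)] for _ in range(n_pop)]
    pbest = [p[:] for p in pos]                  # individual best positions P_id
    pbest_fit = [fitness(p) for p in pos]
    g = max(range(n_pop), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]  # population best P_gd
    stall = 0
    for _ in range(n_iter):
        prev = gbest_fit
        for i in range(n_pop):
            for d in range(dim):
                # Velocity update, Eq. (19), clamped to [-v_max, v_max].
                v = (omega * vel[i][d]
                     + c1 * rng.random() * (pbest[i][d] - pos[i][d])
                     + c2 * rng.random() * (gbest[d] - pos[i][d]))
                vel[i][d] = max(-v_max, min(v_max, v))
                # Position update, Eq. (20), kept inside the search region.
                pos[i][d] = max(0.0, min(1.0, pos[i][d] + vel[i][d]))
            f = fitness(pos[i])
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
        # Stop once the best fitness has improved by less than delta
        # for `patience` consecutive iterations (assumption, see lead-in).
        stall = stall + 1 if gbest_fit - prev < delta else 0
        if stall >= patience:
            break
    return gbest, gbest_fit


def toy_fitness(x, penalty_factor=1e3):
    """Toy penalized utility: concave rate-like gain minus a linear power price."""
    a, b = x
    utility = math.log2(1 + 10 * a) + math.log2(1 + 5 * b) - 0.5 * (a + b)
    violation = min(a, 0.0) ** 2 + min(b, 0.0) ** 2 + min(1 - a - b, 0.0) ** 2
    return utility - penalty_factor * violation


best, best_fit = pso_maximize(toy_fitness)
```

The inertia weight and acceleration constants reported in the paper can be passed through `omega`, `c1`, and `c2`; in that case the velocity clamp `v_max` is what keeps the particles from leaving the search region.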

Proposition 3.

PSO based on penalty function converges to the optimal power allocation strategy.

Proof.

Reference [15] proves the convergence of PSO. The parameters of a convergent PSO should satisfy $\sqrt{2 [1 + ω - (C_{1} + C_{2})] - 4 ω} < 2$. In this paper, we set $ω = 1$ and $C_{1} = C_{2} = 1.8$ to satisfy the convergence requirement. In addition, although the power allocation for the D2D groups involves $N$ pairs of coefficients $α_{n}$ and $β_{n}$, the power allocations of different D2D groups are independent of each other, because a D2D group only causes interference to the corresponding cellular user on the reused channel. Therefore, the power allocation problem can be decomposed into $N$ sub-problems. Each sub-problem converges to a stable optimal solution through PSO. Consequently, the optimization problem in the second stage converges to a stable optimal solution. □

Proposition 4.

The computational complexity of PSO based on penalty function is $O(N \times N_{pop} \times N_{iter})$.

Proof.

The computational complexity of PSO is related to the number of particles $N_{pop}$ and the number of iterations $N_{iter}$. PSO must be executed once for each D2D group whose transmit power is optimized. Therefore, the computational complexity of each execution of PSO is $O(N_{pop} \times N_{iter})$ and the total computational complexity in the second stage is $O(N \times N_{pop} \times N_{iter})$. □

3.4. Joint Channel and Power Allocation Based on Stackelberg Game

We propose a two-stage Stackelberg game in which the cellular users are the leaders and the D2D groups are the followers. In the first stage, we find the optimal match between cellular users and D2D groups according to Section 3.2. In the second stage, we optimize the transmit power of the D2D transmitter in each D2D group according to Section 3.3. The two-stage Stackelberg game finally converges to a stable solution, which is proved later. The two-stage Stackelberg game based joint channel and power allocation algorithm (S-JCPA) is shown in Algorithm 2.

Algorithm 2. Stackelberg game based joint channel and power allocation (S-JCPA)

1: Initialization: set of cellular users $\{C_{1}, C_{2}, \dots, C_{M}\}$, set of D2D groups $\{D_{1}, D_{2}, \dots, D_{N}\}$, power allocation coefficients $\{α_{1}, α_{2}, \dots, α_{N}\}$ and $\{β_{1}, β_{2}, \dots, β_{N}\}$, set of historical channel allocations $\{H_{i}(t)\}_{i \in N} = \emptyset$, maximum number of iterations $K$.
2: For t = 1 : K
3:     Allocate channels for D2D groups via KM according to (16)
4:     If the channel allocation result already exists in $H_{i}(t)$
5:         For i = 1 : N
6:             Optimize transmit power for D2D users via PSO according to (18)
7:             Update $α_{n}$ and $β_{n}$
8:         End for
9:         Break
10:     Else
11:         Save the channel allocation result to $H_{i}(t)$
12:         For i = 1 : N
13:             Optimize transmit power for D2D users via PSO according to (18)
14:             Update $α_{n}$ and $β_{n}$
15:         End for
16:     End if
17: End for
18: Output: $(η^{*}, P^{*})$
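The control flow of the listing above, alternating between the leader's matching and the followers' power update until a channel allocation repeats, can be sketched as follows. The two stages are passed in as callables; `allocate_channels` and `optimize_power` are hypothetical placeholders for the KM and PSO stages, and the toy versions below exist only to exercise the loop.

```python
def s_jcpa(allocate_channels, optimize_power, initial_power, max_rounds=10):
    """Alternate channel matching and power optimization until the matching repeats.

    allocate_channels(power) must return a hashable/comparable allocation (e.g. a tuple);
    optimize_power(allocation) returns the updated power coefficients.
    """
    history = []
    power = initial_power
    allocation = None
    for _ in range(max_rounds):
        allocation = allocate_channels(power)   # stage 1: leaders (KM)
        seen = allocation in history
        if not seen:
            history.append(allocation)
        power = optimize_power(allocation)      # stage 2: followers (PSO)
        if seen:                                # allocation repeated: stop
            break
    return allocation, power


# Toy stand-ins (assumptions) that only mimic the interfaces of the two stages.
calls = []


def toy_allocate(power):
    calls.append("KM")
    return (1, 0) if sum(power) < 1.5 else (0, 1)


def toy_power(allocation):
    return [0.3, 0.6] if allocation == (1, 0) else [0.9, 0.9]


allocation, power = s_jcpa(toy_allocate, toy_power, initial_power=[1.0, 1.0])
```

In this toy run the second round reproduces the allocation of the first, so the loop terminates after two matching calls; the power update still runs once more before the break, as in the listing.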

According to Section 3.2 and Section 3.3, both stages converge to their optimal solutions. By the characteristics of the Stackelberg game, when the leader and the follower both have an equilibrium solution, the Stackelberg equilibrium is achieved. From the previous analysis, the overall complexity of the algorithm follows directly: combining the computational complexity of KM and PSO, the network complexity is $O(K \times (M^{3} + N \times N_{pop} \times N_{iter}))$.

4. Simulation and Performance Analysis

This section simulates and analyzes the proposed joint channel and power allocation algorithm based on the Stackelberg game. The system model is shown in Figure 1a. The simulation is carried out in a disc area with a radius of 500 m. The channel gain is subject to large-scale fading based on distance loss and small-scale Rayleigh fading [16]. The large-scale fading is modeled as $κ d^{-α}$, where $d$ represents the transmission distance, and $κ$ and $α$ represent the fading constant and the path-loss exponent, respectively. The Rayleigh fading follows the exponential distribution with a mean of 1. The simulation parameters are shown in Table 1.
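The channel model described above can be sampled with a few lines of Python. This is a sketch only: the values of `kappa` and `path_loss_exp` are assumptions made for illustration, since the contents of Table 1 are not reproduced here.

```python
import random


def channel_gain(distance_m, kappa=1e-2, path_loss_exp=4.0, rng=random):
    """Channel gain = large-scale term kappa * d^(-alpha) times a fading power.

    The small-scale (Rayleigh) power gain is exponentially distributed with mean 1.
    """
    large_scale = kappa * distance_m ** (-path_loss_exp)
    small_scale = rng.expovariate(1.0)   # rate 1.0, hence mean 1
    return large_scale * small_scale


# Average gain at an assumed distance of 100 m, using a seeded generator.
rng = random.Random(1)
samples = [channel_gain(100.0, rng=rng) for _ in range(20000)]
mean_gain = sum(samples) / len(samples)
```

Because the fading term has unit mean, the sample average approaches the large-scale value $κ d^{-α}$, which is $10^{-10}$ for the assumed parameters.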

Figure 3 plots the utilities of the cellular and D2D users for different numbers of D2D groups. When the number of D2D groups increases, the utilities of both cellular and D2D users decline. This is because, as the number of D2D groups increases, the gap between the number of cellular users and the number of D2D groups is reduced. When performing channel matching, it is difficult to obtain an optimal match for each user because of the lack of channel resources. Consequently, the utilities of both cellular users and D2D users decline. In Reference [17], a joint optimization algorithm for channel allocation and power control was proposed to optimize the throughput of D2D users. However, this study did not consider the effect of the social relationship between cellular and D2D users and did not optimize the utility function based on the social relationship. Hence, the utility obtained with the algorithm in Reference [17] was not as high as that obtained with our algorithm.

Figure 4 plots the average throughput (rate) of the cellular and D2D users for different numbers of D2D groups. As the number of D2D groups increases, the average throughput of both cellular and D2D users shows a downward trend. This is because cellular users represent the subchannels available for allocation in the system. As in Figure 3, the number of cellular users is unchanged whereas the number of D2D groups increases. Hence, it is difficult to obtain an optimal match for each individual because of the lack of channel resources. Consequently, the average throughput is reduced for both cellular and D2D users. Furthermore, Reference [17] aimed at optimizing the throughput of all the D2D users without considering whether the cellular users were willing to cooperate with them. Hence, the average throughput of D2D users in Reference [17] was higher than that obtained with our algorithm, whereas the average throughput of cellular users in Reference [17] was lower than that obtained with our algorithm.

Figure 5 shows the impact of the social relationships on the utilities of the cellular and D2D users. With a closer social relationship, the utility of D2D users continues to increase and the utility of cellular users continues to decrease, which is determined by their respective utility functions. When the social relationship between D2D and cellular users is not close, cellular users are not willing to allow the D2D users who reuse their channels to increase their transmit power to improve their throughput. Hence, the cellular users have high utility whereas the D2D users have low utility. When the social relationship is close, the D2D users can increase the transmit power on the cellular channel with a small expense. Consequently, the utility of the D2D users increases, whereas the utility of the cellular users gradually decreases. As the social relationship becomes closer, the D2D users can increase their transmit power without paying an expense to the cellular users. Hence, the utility of the cellular users drops sharply, even approaching zero. However, as Reference [17] did not consider the social relationship between cellular and D2D users, the utility function based on the social relationship was not optimized. Consequently, the utilities of both cellular and D2D users were lower than those obtained with our algorithm.

Figure 6 shows the impact of social relationships on the average throughput of the cellular and D2D users. As Reference [17] did not consider the influence of social relationship, the average throughput of the cellular and D2D users was unchanged. However, in our algorithm, the closer the social relationship between the cellular users and D2D users, the more cellular users are willing to allow the D2D users who reuse their channels to increase their transmit power to improve their throughput. As the social relationship becomes closer, the D2D users only need to pay a small expense to achieve a high transmit power. Therefore, the average throughput of the D2D users continuously increases, whereas the average throughput of the cellular users gradually decreases. Moreover, as the social relationship becomes closer, the average throughput of the D2D group in this study approaches that in Reference [17] and the average throughput of the cellular users becomes higher than that in Reference [17].

Figure 7 plots the network complexity for different numbers of D2D groups under different convergence thresholds and compares the proposed algorithm, S-JCPA, with the algorithm proposed in Reference [17]. Figure 8 shows the impact of the convergence threshold on the utilities of the cellular and D2D users. The study in Reference [17] first solved the channel allocation problem with KM and then optimized the D2D transmit power with KKT. In S-JCPA, KM and PSO are used to solve the resource allocation problem. Consequently, the network complexity of the algorithm in Reference [17] is lower than that of our algorithm. We also compare the network complexity of our algorithm under different convergence thresholds. The results show that a smaller convergence threshold leads to higher network complexity but also to higher utilities for the cellular and D2D users, because PSO can search for more accurate results. Because the utilities change little as the convergence threshold decreases further, we choose 0.001 as the convergence threshold instead of reducing it further.
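The trade-off between convergence threshold, iteration count, and solution quality can be illustrated with a small sketch. It is not the S-JCPA implementation: the toy utility, the `gain` and `price` values, the `patience` rule, and the evenly spaced start positions are all assumptions made for illustration, and only the penalty factor of 10^6 is taken from Table 1. The sketch assumes the threshold is applied to the change in the best utility between iterations.

```python
import math
import random


def penalized_utility(p, gain=50.0, price=2.0, p_max=0.1,
                      sinr_min=1.5, penalty=1e6):
    """Toy follower utility (hypothetical): rate minus payment, with a
    quadratic penalty for violating the power or SINR constraints."""
    rate = math.log2(1.0 + gain * max(p, 0.0))
    violation = (max(0.0, -p)                      # power must be >= 0
                 + max(0.0, p - p_max)             # power must be <= p_max
                 + max(0.0, sinr_min - gain * p))  # SINR must reach sinr_min
    return rate - price * p - penalty * violation ** 2


def pso_maximize(objective, lo, hi, threshold, n_particles=20,
                 max_iter=500, patience=5, seed=0):
    """One-dimensional PSO. Stops once the global best has improved by
    less than `threshold` for `patience` consecutive iterations."""
    rng = random.Random(seed)
    # Evenly spaced start positions keep the example reproducible.
    pos = [lo + (hi - lo) * (i + 0.5) / n_particles
           for i in range(n_particles)]
    vel = [0.0] * n_particles
    best_pos = pos[:]
    best_val = [objective(x) for x in pos]
    g = max(range(n_particles), key=lambda i: best_val[i])
    g_pos, g_val = best_pos[g], best_val[g]
    calm = 0
    for iteration in range(1, max_iter + 1):
        previous = g_val
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            vel[i] = (0.7 * vel[i]
                      + 1.5 * r1 * (best_pos[i] - pos[i])
                      + 1.5 * r2 * (g_pos - pos[i]))
            pos[i] += vel[i]
            val = objective(pos[i])
            if val > best_val[i]:
                best_pos[i], best_val[i] = pos[i], val
                if val > g_val:
                    g_pos, g_val = pos[i], val
        calm = calm + 1 if g_val - previous < threshold else 0
        if calm >= patience:
            return g_pos, g_val, iteration
    return g_pos, g_val, max_iter


if __name__ == "__main__":
    for eps in (1e-1, 1e-3, 1e-6):
        p, u, iters = pso_maximize(penalized_utility, 0.0, 0.2, eps)
        print(f"threshold={eps:g} power={p:.6f} utility={u:.6f} iterations={iters}")
```

With a fixed seed the particle trajectories are identical for every threshold, so a smaller threshold can only stop later and can only return an equal or better utility. This mirrors the qualitative behaviour reported for Figures 7 and 8, without reproducing their numbers.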

5. Conclusion

In this paper, we propose a joint channel and power allocation algorithm based on the Stackelberg game. First, we establish a system model that includes several cellular users and D2D groups. Cellular users communicate in the traditional cellular mode, while D2D groups communicate by reusing the channel resources of cellular users. In each D2D group, NOMA is adopted to improve throughput. We also set an SINR threshold for each user to ensure the QoS of the system. Second, we model a two-stage Stackelberg game in which cellular users are the leaders and D2D groups are the followers. The utility functions of cellular users and D2D groups are both defined in terms of their social relationships. Using KM and penalty-function-based PSO, we finally obtain the optimal channel and power allocation. The convergence and computational complexity of the algorithm are also discussed. The simulation results show that our algorithm can successfully strengthen the cooperation between users and improve the utility of cellular and D2D users.

Author Contributions

W.G. organized and developed the proposal of the study, carried out the mathematical analysis, and performed simulations using MATLAB. Q.Z. provided guidance, key suggestions, and finalized the paper.

Funding

This work is supported by the National Natural Science Foundation of China (61971239, 61631020) and the Postgraduate Research & Practice Innovation Program of Jiangsu Province (KYCX19_0945).

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Yan, J.; Wu, D.; Wang, R. Socially Aware Trust Framework for Multimedia Delivery in D2D Cooperative Communication. IEEE Trans. Multimed. 2019, 21, 625–635.
2. Ahmed, M.; Li, Y.; Waqas, M.; Sheraz, M.; Jin, D.; Han, Z. A Survey on Socially Aware Device-to-Device Communications. IEEE Commun. Surv. Tutor. 2018, 20, 2169–2197.
3. Tsinos, C.G.; Foukalas, F.; Tsiftsis, T.A. Resource Allocation for Licensed/Unlicensed Carrier Aggregation MIMO Systems. IEEE Trans. Commun. 2017, 65, 3765–3779.
4. Zhu, F.; Liu, A.; Lau, V.K. Joint Interference Mitigation and Data Recovery for Massive Carrier Aggregation via Non-Linear Compressive Sensing. IEEE Trans. Wirel. Commun. 2018, 17, 1389–1404.
5. Tsinos, C.; Galanopoulos, A.; Foukalas, F. Low-Complexity and Low-Feedback-Rate Channel Allocation in CA MIMO Systems with Heterogeneous Channel Feedback. IEEE Trans. Veh. Technol. 2017, 66, 4396–4409.
6. Mochaourab, R.; Holfeld, B.; Wirth, T. Distributed channel assignment in cognitive radio networks: Stable matching and Walrasian equilibrium. IEEE Trans. Wirel. Commun. 2013, 14, 3924–3936.
7. Liu, T.; Wang, G. Resource allocation for device-to-device communications as an underlay using Nash bargaining game theory. In Proceedings of the International Conference on Information and Communication Technology Convergence (ICTC), Jeju, Korea, 28–30 October 2015; pp. 366–371.
8. Baniasadi, M.; Maham, B.; Kebriaei, H. Power control for D2D underlay cellular communication: Game theory approach. In Proceedings of the International Symposium on Telecommunications (IST), Tehran, Iran, 7–28 September 2016; pp. 314–319.
9. Chang, M.; Chien, F.; Chen, T.; Li, K. Stochastic game-theoretical power allocation in D2D communications. In Proceedings of the IEEE International Symposium on Broadband Multimedia Systems and Broadcasting (BMSB), Nara, Japan, 1–3 June 2016; pp. 1–4.
10. Cao, L.; Yao, F.; Zhao, H.; Zhang, J. Distributed resource allocation for D2D-enabled two-tier cellular networks with channel uncertainties. In Proceedings of the IEEE International Conference on Communication Systems (ICCS), Shenzhen, China, 14–16 December 2016; pp. 1–5.
11. Pan, Y.; Pan, C.; Yang, Z.; Chen, M. Resource Allocation for D2D Communications Underlaying a NOMA-Based Cellular Network. IEEE Wirel. Commun. Lett. 2018, 7, 130–133.
12. Zheng, H.; Hou, S.; Li, H.; Song, Z.; Hao, Y. Power Allocation and User Clustering for Uplink MC-NOMA in D2D Underlaid Cellular Networks. IEEE Wirel. Commun. Lett. 2018, 7, 1030–1033.
13. Yoon, T.; Nguyen, T.H.; Nguyen, X.T.; Yoo, D.; Jang, B.; Nguyen, V.D. Resource Allocation for NOMA-Based D2D Systems Coexisting with Cellular Networks. IEEE Access 2018, 6, 66293–66304.
14. Sawyer, N.; Smith, D.B. Flexible Resource Allocation in Device-to-Device Communications Using Stackelberg Game Theory. IEEE Trans. Commun. 2019, 67, 653–667.
15. Guo, F.; Zhang, H.; Ji, H.; Li, X.; Leung, V.C.M. An Efficient Computation Offloading Management Scheme in the Densely Deployed Small Cell Networks with Mobile Edge Computing. IEEE ACM Trans. Netw. 2018, 26, 2651–2664.
16. Xu, C.; Song, L.; Han, Z.; Zhao, Q.; Wang, X.; Cheng, X.; Jiao, B. Efficiency Resource Allocation for Device-to-Device Underlay Communication Systems: A Reverse Iterative Combinatorial Auction Based Approach. IEEE J. Sel. Areas Commun. 2013, 31, 348–358.
17. Alemaishat, S.; Saraereh, O.A.; Khan, I.; Choi, B.J. An Efficient Resource Allocation Algorithm for D2D Communications Based on NOMA. IEEE Access 2019, 7, 120238–120247.

Figure 1. Two-layer system model: (a) Physical domain; (b) social domain.

Figure 2. Bipartite graph for matching problem.

Figure 3. D2D (Device-to-device) and cellular’s utility for different algorithms with different numbers of D2D groups, M = 20, S_{m, n}^{k} ~ (0, 1).

Figure 4. Average throughput of cellular and D2D users for different algorithms with different numbers of D2D groups, M = 20, S_{m, n}^{k} ~ (0, 1).

Figure 5. D2D and cellular’s utility for different algorithms with different social relationships, M = 20, N = 5.

Figure 6. Average throughput of cellular and D2D users for different algorithms with different social relationships, M = 20, N = 5.

Figure 7. Network complexity for different algorithms with different numbers of D2D groups, M = 20.

Figure 8. D2D and cellular’s utility with different convergence thresholds, M = 20.

Table 1. Simulation parameters.

Parameter	Value
Cellular radius	500 m
Maximum D2D communication range	30 m
Cellular transmit power	23 dBm
Maximum D2D transmit power	20 dBm
Noise power	−174 dBm
Cellular SINR threshold	1.8 dB
D2D SINR threshold	1.8 dB
Social relationship	[0, 1]
Penalty factor	10^6
Unit power price	25
Possible fading	0.01
Path loss exponent	4
Convergence threshold	0.001
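
Table 1 gives powers in dBm and thresholds in dB, whereas SINR ratios are evaluated in linear units. The helper functions below are a minimal sketch of those conversions; the function names and the sample signal and interference values are hypothetical and do not come from the paper's MATLAB simulation.

```python
def dbm_to_watt(power_dbm: float) -> float:
    """Convert a power level from dBm to watts."""
    return 10 ** ((power_dbm - 30) / 10)


def db_to_linear(ratio_db: float) -> float:
    """Convert a ratio from dB to a linear (unitless) ratio."""
    return 10 ** (ratio_db / 10)


def meets_sinr(signal_w: float, interference_w: float,
               noise_w: float, threshold_db: float) -> bool:
    """Check whether signal / (interference + noise) reaches the threshold."""
    return signal_w / (interference_w + noise_w) >= db_to_linear(threshold_db)


if __name__ == "__main__":
    # Values from Table 1.
    print("Cellular transmit power [W]:", dbm_to_watt(23))
    print("Maximum D2D transmit power [W]:", dbm_to_watt(20))
    print("Noise power [W]:", dbm_to_watt(-174))
    print("SINR threshold (linear):", db_to_linear(1.8))
    # Hypothetical received powers in watts, for illustration only.
    print("Threshold met:", meets_sinr(2e-9, 1e-9, dbm_to_watt(-174), 1.8))
```

Under these conversions the 20 dBm D2D power limit is 0.1 W and the 1.8 dB SINR threshold is a linear ratio of about 1.51.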

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Gu, W.; Zhu, Q. Stackelberg Game Based Social-Aware Resource Allocation for NOMA Enhanced D2D Communications. Electronics 2019, 8, 1360. https://doi.org/10.3390/electronics8111360

