Incentive and Penalty Mechanism For Power Allocation in Cooperative D2D-Cellular Transmissions

Abstract: In cellular communication systems, the introduction of device-to-device (D2D) communications provides a reasonable solution to facilitate high data rate services in short-range communication. However, it faces a challenging issue of interference management, where the cross-tier interference from D2D users to licensed cellular users (CUs) degrades their quality of service (QoS). D2D communications can also assist in offloading some nearby CUs to enhance the cellular operator’s benefit. To encourage the D2D transmitters (D2DTs) to provide service to CUs in the dead zone, the cellular base station (CBS) needs to incentivize them with some monetary benefits. In this paper, a Stackelberg game-based joint pricing framework for interference management and data offloading is presented to illustrate the effects of cooperation between the D2D user and CBS. Specifically, a singular price is used to incentivize the D2DT to share its resources with the far-off CUs along with penalizing it for interference created at CBS. Simulation results illustrate the performance of the proposed technique in terms of the utilities of CUs and D2D users for varying distances of D2DT.


Introduction
The rapid growth in Internet traffic is attributed to the proliferation of data-intensive applications that provide multimedia-rich services such as augmented reality, video over IP, and online gaming. According to Cisco, average Internet traffic is expected to grow 3.7-fold from 2017 to 2022 [1]. It is also estimated that mobile data traffic will increase to 77 exabytes per month by 2022, nearly a seven-fold rise above 2017 [2]. Furthermore, a huge number of intelligent devices are predicted to be connected to the Internet for the wide implementation of the Internet of Things (IoT). Cisco predicts that there will be around 1.5 billion IoT devices (with cellular connections) by 2020 [3]. As the most convenient data access method for mobile devices, cellular networks are therefore under immense pressure to meet the explosive data demand. The telecommunication industry and the research community need to investigate novel cellular architectures and paradigms to upgrade network capacity.

In this scenario, device-to-device (D2D) communication presents one favorable solution to enhance spectral efficiency and augment capacity with minimal operational and capital expenditures. In D2D communication, two or more nearby mobile users communicate directly, i.e., without the intermediary of any infrastructure or base station (BS), which reduces communication delay, increases network throughput, and creates a cyber-physical space in a cellular environment [4][5][6][7]. In D2D, a D2D device can use one of several mechanisms for resource exchange. In monetary-based schemes, virtual currency is used to reward nodes and thereby encourage them to cooperate. The monetary-based mechanism is therefore an appropriate choice and has been widely used to solve cooperation-stimulation problems [22].
The most well-known monetary schemes for scenarios in which operators control and charge for D2D communications and incentivize the cooperating nodes are based on game models [23]. Hence, in this paper, a Stackelberg game is used to develop a pricing model for the operator and D2D terminals. Specifically, in our scheme, a joint penalty-based power control and incentive-based cooperation scenario is considered. Previously, in [24], a discount on the interference price proportional to the time-sharing services was introduced. In this work, by contrast, a rate-based pricing scheme to incentivize the D2D users for providing offloading services to CUs is investigated. The main contributions of this paper are summarized as follows: (i) A joint penalty-based power control scheme and incentive-based data offloading are proposed for D2D communication in a D2D-underlain cellular network, where the CBS offers a rate-based incentive to the D2D user for offloading the CU. (ii) A singular pricing mechanism is used to penalize the D2D user for interference and to incentivize the D2D user to provide content to nearby CUs. A singular price enables the CBS to decide a fair price by taking into consideration the offloading of its contents using D2D users.
The remainder of the paper is arranged as follows. Section 2 presents the system model and problem formulation of the D2D-underlain cellular system. Section 3 explains the proposed optimal power allocation and pricing. Section 4 provides the analysis of the proposed game followed by simulation results in Section 5 and the conclusion in Section 6.

System Model
The operation of a cellular system underlain with D2D communications is considered, as shown in Figure 1. It is supposed that a set $\mathcal{K} := \{1, 2, \ldots, K\}$ of D2D users reuses the uplink spectrum of the cellular system. The uplink band is shared because it is easier to manage the interference at the CBS. In the D2D layer, D2D pairs coexist with CUs and are able to transmit multiple signals at the same time. At a specific time interval, a single CU uses a subchannel in the uplink spectrum, whereas a D2D pair uses the same subchannel for communication in underlay mode. However, when a D2DT sends data to its receiver, it causes harmful interference at the CBS (cross-tier interference). If $h_{d,c}$ denotes the path loss between the D2D pair and the CBS, then the interference created at the CBS is $p_d h_{d,c}$, where $p_d$ is the transmit power of the D2DT. The CBS guarantees that the total interference $p_d h_{d,c}$ it experiences is less than the tolerable interference threshold $Q_{th}$. To ensure this, the CBS imposes a price $\gamma$ on D2D users for creating interference on its licensed channels. The interference threshold is generally implemented at the base station in heterogeneous networks [25] and at the primary user in cognitive radio networks [26]. Furthermore, there exists a set of CUs $\mathcal{S} := \{1, 2, \ldots, S\}$ within the communication range of the D2DTs ($S \le K$); each attaches to a close-by D2DT when its requested content matches the D2DT content.
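The interference constraint described above can be illustrated with a minimal sketch. It checks whether a given D2DT transmit power keeps the interference at the CBS within the threshold and computes the largest admissible power. The function names and all numeric values are assumptions chosen for illustration; they are not parameters taken from the paper.

```python
def interference_at_cbs(p_d: float, h_dc: float) -> float:
    """Cross-tier interference p_d * h_dc seen at the CBS (linear units)."""
    return p_d * h_dc


def max_power_under_threshold(h_dc: float, q_th: float) -> float:
    """Largest transmit power for which p_d * h_dc does not exceed q_th."""
    if h_dc <= 0:
        raise ValueError("channel gain must be positive")
    return q_th / h_dc


# Hypothetical values: gain 1e-6, threshold 1e-7 W, transmit power 0.2 W.
h_dc, q_th, p_d = 1e-6, 1e-7, 0.2
violates = interference_at_cbs(p_d, h_dc) > q_th   # 2e-7 W exceeds 1e-7 W
p_max = max_power_under_threshold(h_dc, q_th)      # power cap for this link
```

With these illustrative numbers the D2DT would have to halve its power before the CBS tolerates the transmission.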
However, a D2DT may be reluctant to provide service to a nearby CU; therefore, it needs to be incentivized with some economic benefits from CBS. In this paper, a linear pricing function is used because of its low implementation complexity.
The achievable signal-to-interference-plus-noise ratio (SINR) $\Gamma_d^n$ of the nth D2D pair can be expressed as: where $p_d^n$ denotes the transmit power of the nth D2DT, $h_d^n$ represents the link gain of the nth D2D pair, $\sigma^2$ is the variance (power) of the zero-mean white Gaussian noise, and $I_c$ represents the interference from CUs to the D2D receiver (D2DR).
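As a hedged illustration of how these four quantities combine, the sketch below assumes the standard SINR form, received signal power divided by noise plus interference. This form is inferred from the definitions above and is not quoted from the paper; the function names and numbers are hypothetical.

```python
import math


def d2d_sinr(p_d: float, h_d: float, noise_power: float, i_c: float) -> float:
    """Assumed SINR of a D2D pair: p_d * h_d / (sigma^2 + I_c), linear scale."""
    return (p_d * h_d) / (noise_power + i_c)


def to_db(x: float) -> float:
    """Convert a linear power ratio to decibels."""
    return 10.0 * math.log10(x)


# Hypothetical values: 0.1 W transmit power, link gain 1e-3,
# noise power 1e-6 W, interference from CUs 9e-6 W.
gamma_lin = d2d_sinr(0.1, 1e-3, 1e-6, 9e-6)
gamma_db = to_db(gamma_lin)
```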

Problem Formulation
Without cooperation between D2DT and CBS, the data rate at the jth CU can be given as: where $h_v^j$ and $p_v^j$ represent the channel power gain and transmitted power between CBS and the jth CU. However, when successful cooperation occurs between CBS and D2DT, the D2DT sets aside a power fraction $\alpha_n$ for its signal to D2DR, and the data rate at the jth CU becomes the expression in Equation (3), where $h_c^j$ signifies the channel power gain between the jth D2DT and the offloaded CU, and $p_d^j$ is the transmit power of the jth D2DT.

The CBS aims to motivate D2DT to provide resources to the CU and ensure its QoS obligation. Therefore, D2DT should maintain the minimum transmission rate of the CU. Hence, cooperation can only take place if: or equivalently, $\alpha_n \le \alpha_n^0$, where $\alpha_n^0$ can be expressed as: where $\alpha_n^0$ represents the highest fraction of power that D2DT can allocate to its own signal. Therefore, the far-off CUs require at least $(1 - \alpha_j^0)\,p_d^j$ to meet the data rate constraint provided in Equation (4). The amount of resources allocated by D2DT to CU should also satisfy the following condition: where $R_{c,th}$ represents the required data rate of the CU. Correspondingly, The data rate achieved by the D2DR can then be expressed as: Moreover, it is assumed that D2DR has a rate requirement of $R_{d,th}$. To ensure the QoS of D2DR, the achievable power allocation should be more than the power demand of D2DR. Hence, where $\alpha_n^{low}$ and $\alpha_n^{high}$ denote the lower and upper limits of $\alpha_n$.
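The cooperation condition can be made concrete with a small sketch. It assumes Shannon-type rates $\log_2(1 + \text{SNR})$ with a noise-only denominator, which is a simplification introduced here and not necessarily the exact rate expressions of the paper. Under that assumption, requiring the offloaded rate to be no lower than the direct rate yields an upper bound on $\alpha$, and the D2DR rate requirement yields a lower bound. All names and values are hypothetical.

```python
import math


def rate(snr: float) -> float:
    """Shannon rate in bit/s/Hz for a linear SNR (assumed rate model)."""
    return math.log2(1.0 + snr)


def alpha_upper(p_v: float, h_v: float, p_d: float, h_c: float) -> float:
    """Largest alpha such that (1 - alpha) * p_d * h_c >= p_v * h_v,
    i.e. the offloaded CU is served at least as well as by the CBS."""
    return 1.0 - (p_v * h_v) / (p_d * h_c)


def alpha_lower(r_d_th: float, p_d: float, h_d: float, noise: float) -> float:
    """Smallest alpha such that log2(1 + alpha * p_d * h_d / noise) >= r_d_th."""
    return (2.0 ** r_d_th - 1.0) * noise / (p_d * h_d)


# Hypothetical dead-zone CU: weak CBS link (1e-9), stronger D2DT link (1e-8).
noise = 1e-10
alpha_0 = alpha_upper(p_v=0.2, h_v=1e-9, p_d=0.1, h_c=1e-8)
alpha_low = alpha_lower(r_d_th=1.0, p_d=0.1, h_d=1e-8, noise=noise)
cooperation_feasible = alpha_low <= alpha_0
```

Any $\alpha$ between the two bounds satisfies both rate requirements under these assumptions; an empty interval means cooperation is not possible for that channel realization.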

Stackelberg Game for Power Allocation and Pricing
A Stackelberg game [27] is a type of non-cooperative game, where the intelligent entities choose their actions/strategies selfishly without consulting each other. Every entity is interested in its own pay-off and makes all decisions individually. Specifically, a Stackelberg game is comprised of a leader and followers (entities). The leader decides first, and the followers decide afterward. The optimal action is established by the leader, presuming that the followers react by optimizing their pay-off functions based on the leader's strategy. The solution to the Stackelberg game model can be achieved by obtaining Stackelberg equilibrium (SE). SE corresponds to an equilibrium condition, where each entity is assumed to know the steady-state actions of the other entities, and no entity can receive more advantage by changing its action unilaterally.
In the proposed model, the primary concern of the CBS is to maximize its profit by adaptively determining the price. For the D2D users, the main problem is how to allocate their resources profitably. Based on the repeated interplay of CBS and D2D users, the Stackelberg game can achieve a globally desirable system performance. In this game, the CBS operates as the leader: it first anticipates the reaction (power allocation factor $\alpha$) of the D2D pairs and then charges them a price $\gamma$ corresponding to the interference they create at the CBS. The proposed Stackelberg game is represented in Figure 2.
Figure 2. Proposed Stackelberg game. Leader: the CBS specifies the price $\gamma$. Followers: each D2DT specifies the power allocation ratio $\alpha$ to satisfy the QoS requirement of D2DR and, correspondingly, $(1 - \alpha)$ for providing offloading services to the nearby CU.

Utility Functions In the Proposed Stackelberg Game Model
To mitigate the interference to a CU on a common subchannel, the CBS charges the nth D2DT an interference cost. It is supposed that the interference price $\gamma_n$ that the D2D pair needs to pay varies linearly with the power. Since the power allocated to the D2DT-D2DR signal is $\alpha_n P_d^n h_d^n$, the total payment charged on D2D user communication is $\gamma_n \alpha_n P_d^n h_{dc}^n$. For the D2D user, the price can be viewed as the penalty for exploiting the licensed channel. For CBS, the price can be considered as the payment for the probable deterioration of its service quality caused by the D2D user. The price also relates linearly to the power fraction $(1 - \alpha_j)$ reserved for the offloaded CU. The value of $\alpha_n$ lies in $[0, 1]$. If the D2DT cooperates with the CBS, then it will also earn an incentive proportional to the price announced by the CBS and the remaining power fraction. The utility of D2DT can be modeled as: At the CU side, CBS intends to motivate D2DT to open resources to the CU and ensure their QoS requirements. Similarly, the utility of CBS for the jth CU can be expressed as: If the cooperation is established between the CBS and D2DT, the utility of CBS will consist of the benefit achieved by the rate of CU and the interference penalty obtained from the D2DT. The cost represents the incentive price paid to the D2DT for providing service to CU.
The optimization problem at the D2DT side (P1) therefore maximizes the utility in Equation (11) subject to the constraints in Equations (12)-(15), which include $\alpha_n \le \alpha_n^0$. Here, $Q_{th}$ denotes the interference threshold sustained at CBS.
It can be noticed from P1 that D2DT can earn revenue by allocating power to CU, at the expense of lowering the transmission rate of its own D2DR.
The optimization problem at the CBS side can be expressed as Equation (16), subject to $\gamma_j > 0$. Equations (11) and (16) jointly form a Stackelberg game [10], which is solved by having D2DT and CBS act sequentially to achieve the Stackelberg equilibrium (SE), i.e., the optimal price and the optimal fractions of power allocated to the D2DT's own link and to the offloaded CU, each obtained by optimizing the respective utility.

Power Allocation Strategy of D2DT
Our goal is to search for the $\alpha_n$ that maximizes $U_d^n$ while satisfying the constraints in Equations (12)-(15). If $\gamma_n$ is given, D2DT determines the transmit power allotment factor $\alpha_n$, between its own user and the offloaded CU, as follows: The optimal $\alpha_n^*$ is derived by taking the first derivative of $U_d^n$ with respect to $\alpha_n$ and setting it to zero. The value of $\alpha_n^*$ must also fulfill the conditions in Equations (13) and (15), as well as: The second derivative of $U_d^n$ with respect to $\alpha_n$ yields: which shows that the utility function in Equation (11) is indeed concave in $\alpha_n$. Therefore, for a given price $\gamma$, the optimal value of $\alpha$ is given by Equation (19), provided that it satisfies the condition in Equation (4); otherwise, $\alpha$ is set to one, and no cooperation takes place.
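The first-order-condition argument can be sketched with an assumed utility that follows the verbal description: the D2DR rate, minus a linear interference payment, plus a linear incentive for the power left to the CU. The specific form $U_d(\alpha) = \log_2(1 + \alpha A) - \gamma \alpha c_{int} + \gamma (1 - \alpha) c_{inc}$ and the symbols $A$, $c_{int}$, $c_{inc}$ are assumptions made for this illustration and are not the paper's Equation (11). For this form the stationary point is $\alpha^* = 1 / (\gamma (c_{int} + c_{inc}) \ln 2) - 1/A$, and the second derivative is negative, so the point is a maximum.

```python
import math


def d2dt_utility(alpha, gamma, snr_gain, c_int, c_inc):
    """Assumed D2DT utility: own rate - interference payment + incentive."""
    own_rate = math.log2(1.0 + alpha * snr_gain)
    payment = gamma * alpha * c_int            # penalty for interference at CBS
    incentive = gamma * (1.0 - alpha) * c_inc  # reward for power given to CU
    return own_rate - payment + incentive


def best_response(gamma, snr_gain, c_int, c_inc, alpha_low, alpha_0):
    """Stationary point of the assumed concave utility. If it violates the
    cooperation bounds, fall back to alpha = 1 (no cooperation), mirroring
    the fallback rule stated in the text."""
    alpha_star = 1.0 / (gamma * (c_int + c_inc) * math.log(2.0)) - 1.0 / snr_gain
    if alpha_low <= alpha_star <= alpha_0:
        return alpha_star
    return 1.0


# Hypothetical normalized parameters.
params = dict(snr_gain=100.0, c_int=1.0, c_inc=1.0, alpha_low=0.1, alpha_0=0.8)
alpha_mid = best_response(1.0, **params)    # interior optimum, cooperation
alpha_high = best_response(2.0, **params)   # higher price, smaller alpha
alpha_none = best_response(0.5, **params)   # bound violated, no cooperation
```

In this sketch a higher price pushes the D2DT to keep less power for its own receiver, which matches the qualitative role of the price in the game.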

Incentive and Interference Strategy of CBS
Algorithm 1: Optimal Incentive Mechanism for D2D-Cellular Communication
1: D2DT measures the channel power gains of D2DR ($h_d$) and of the offloaded CU ($h_c$) and sends them to CBS.
2: CBS computes the optimal price $\gamma$.
3: Based on the price $\gamma$ and the interference condition, D2DT decides the optimal power allocation factor $\alpha$ according to Equation (19), provided that the conditions in Equations (12)-(15) are met.
CBS decides its interference and incentive pricing strategy by taking into account the predicted strategies of the D2D pairs. Equation (19) is substituted into Equation (16) to express the pay-off function of CBS as: The first derivative of Equation (21) with respect to the price $\gamma$ can be given as: Rearranging Equation (22), we get Equation (24).
To obtain the maximum, we set $\partial U_c^j / \partial \gamma_j = 0$. Since Equation (24) is quadratic in $\gamma_j$, it is solved for the price using the quadratic formula, where: and:
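The leader-follower order of Algorithm 1 can be sketched by backward induction. Note that this sketch replaces the closed-form quadratic solution with a simple grid search over candidate prices, and that both utility forms are assumptions made for illustration (they follow the verbal description of penalty and incentive, not the paper's equations). All names and numbers are hypothetical.

```python
import math


def follower_alpha(gamma, snr_gain, c_int, c_inc, alpha_low, alpha_0):
    """Follower step: stationary point of the assumed concave D2DT utility
    log2(1 + alpha*A) - gamma*alpha*c_int + gamma*(1 - alpha)*c_inc."""
    alpha = 1.0 / (gamma * (c_int + c_inc) * math.log(2.0)) - 1.0 / snr_gain
    cooperates = alpha_low <= alpha <= alpha_0
    return (alpha if cooperates else 1.0), cooperates


def cbs_utility(gamma, alpha, cooperates, cu_gain, c_int, c_inc):
    """Assumed CBS pay-off: CU rate + interference penalty - incentive paid.
    Without cooperation the pay-off is taken to be zero."""
    if not cooperates:
        return 0.0
    cu_rate = math.log2(1.0 + (1.0 - alpha) * cu_gain)
    return cu_rate + gamma * alpha * c_int - gamma * (1.0 - alpha) * c_inc


def leader_price(prices, snr_gain, cu_gain, c_int, c_inc, alpha_low, alpha_0):
    """Leader step: evaluate each candidate price against the follower's
    anticipated response and keep the best (utility, price, alpha)."""
    best = (float("-inf"), None, None)
    for gamma in prices:
        alpha, coop = follower_alpha(gamma, snr_gain, c_int, c_inc,
                                     alpha_low, alpha_0)
        u = cbs_utility(gamma, alpha, coop, cu_gain, c_int, c_inc)
        if u > best[0]:
            best = (u, gamma, alpha)
    return best


# Hypothetical normalized setting; prices searched on a 0.01 grid in [0.05, 5].
grid = [0.05 + 0.01 * i for i in range(496)]
u_opt, gamma_opt, alpha_opt = leader_price(
    grid, snr_gain=100.0, cu_gain=50.0, c_int=1.0, c_inc=1.0,
    alpha_low=0.1, alpha_0=0.8)
```

A grid search is used here only because it needs no assumptions about the algebraic form of the pay-off; when the derivative reduces to a quadratic, as in the analysis above, the closed-form root is cheaper.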

Simulation Results
In this section, the performance of the proposed Stackelberg game model is demonstrated with MATLAB. The parameter settings for the simulations are summarized in Table 1. For explanation purposes, we considered a D2D-underlain cell with one CBS located in the center and K = 35 D2D pairs located within the coverage of the cell. Here, it was considered that one CU was offloaded to one D2DT. The fading channels were modeled as $a \cdot L^{-\beta}$, where $a$ signifies the small-scale fading factor using the Rayleigh fading process and $L$ is the distance between the transmitter and receiver. $\beta$ represents the path loss exponent, set to three for the large-scale fading.

Figure 3a illustrates the optimal power allocation factor for varying distances between CBS and D2DT. As the distance between D2DT and CBS decreases (correspondingly, the SINR decreases and the interference created at CBS increases), $\alpha$ decreases, which means that the D2DT can use less power to transmit to D2DR. The value of $\alpha$ is plotted for varying interference thresholds, and it is observed that as the interference threshold becomes stricter, less interference is tolerated at CBS; hence, cooperation does not occur at shorter distances (low SINR). At longer distances between CBS and D2DT (high SINR), $\alpha$ should increase, as D2DT can now afford to transmit at higher power; however, since the utility function also includes the offloading power factor $(1 - \alpha)$ for a CBS user, $\alpha$ increases almost negligibly.

Figure 3b shows the relationship between the power allocation factor $\alpha$ and the price $\gamma$. Since CBS charges D2DT a price for the interference it creates at the CBS, the observed relationship is linear. When $\alpha$ increases, D2DT allocates more power to its D2DR, which naturally increases the interference at the CBS side; hence, the price increases.
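The channel model from the simulation setup, $a \cdot L^{-\beta}$ with $\beta = 3$, can be sketched as follows. The sketch assumes that $a$ acts as a power gain whose amplitude is Rayleigh distributed, so the power gain is drawn from a unit-mean exponential distribution; this interpretation, the function names, and the distances are assumptions, and the paper's MATLAB setup may differ.

```python
import random


def path_loss(distance_m: float, beta: float = 3.0) -> float:
    """Large-scale component L^(-beta) for a transmitter-receiver distance L."""
    return distance_m ** (-beta)


def channel_gain(distance_m: float, rng: random.Random, beta: float = 3.0) -> float:
    """One channel realization a * L^(-beta). A Rayleigh-distributed amplitude
    gives an exponentially distributed power gain (unit mean assumed)."""
    fading_power = rng.expovariate(1.0)
    return fading_power * path_loss(distance_m, beta)


# Average many realizations at a hypothetical distance of 100 m.
rng = random.Random(0)
samples = [channel_gain(100.0, rng) for _ in range(20000)]
mean_gain = sum(samples) / len(samples)
```

Averaged over many realizations, the gain approaches the deterministic path loss, while individual realizations fluctuate around it.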
Figure 4a,b depicts the impact of interference created at CBS on CBS and D2DT utility. The utilities of CBS and D2DT are plotted against increasing distances between CBS and D2DT. We notice that when the interference threshold $Q_{th}$ is not met, the utility of D2DT is zero, i.e., cooperation does not occur. This is because, in order to meet the requirement of $Q_{th}$ (set by CBS), the D2DT tries to reduce $\alpha$, and the required constraints are then not met. A similar scenario occurs for the CBS utility: when cooperation does not exist, the utility becomes zero. The correspondence between the utilities at different thresholds is attributed to the value of $\alpha$ at varying CBS-D2DT distances.

Figure 5 shows the impact of cooperation on CBS utility with varying $\alpha$. With cooperation between CBS and D2DT, CBS utility increases. The CBS utility is plotted for the case without cooperation, i.e., when $\alpha$ is zero, in which case the utility is zero as no power is allocated to CBS either, and for the case when cooperation happens at $\alpha = 0.6$, in which case the throughput of CBS increases, which is consistent with our analysis in Section 3.

As the distance starts to increase (SINR increases), D2DT tries to dedicate more power to its D2DR (by increasing $\alpha$), but in doing so, the CBS charges it a higher price to keep it from allocating too much power to its own user. Figure 7 depicts the influence of the price $\gamma$ on the total received interference. One can observe that the aggregate interference starts high but decreases as a higher price is charged.

Conclusions
In this paper, an incentive mechanism was proposed to motivate a D2D user to offload CU along with a penalty-based power control mechanism to manage the interference created at CBS. A single price scheme was used for both incentivizing and penalizing the D2D user. The cooperation between the CBS and D2D user was modeled using a Stackelberg game. In this game, the CBS acted as a leader purchasing the power resources needed by its CU from the D2D user, while the D2D user was considered as a follower, deciding its optimal fraction of power resources for both its own user and offloaded CU based on the released price (made known by CBS) subject to the QoS constraints of the D2D user. Stackelberg equilibrium for this game was derived using the backward induction method. The results presented demonstrated that the cooperation between CBS and the D2D user would provide acceptable performance in terms of achieved utilities. A D2D user could share power-based resources with far-off CU, ensuring their QoS-based requirements were met with cross-layer interference managed under a pre-defined threshold. The proposed power-sharing-based D2D-cellular cooperation scheme enabled CUs to obtain QoS-based throughput at the cost of a tolerable decrease in the achievable data rate of the D2D user (receiver).
Future work: Future work will consider rate-based sharing in mobile data offloading with an energy harvesting model. An energy interaction between CBS and the D2D user will be formulated using game theory. The D2D users will act as market leaders, determining the amount of energy they need to buy and the amount of resources they will share with CBS. CBS will then decide the discounted energy price.