Proactive Content Delivery with Service-Tier Awareness and User Demand Prediction

Cost-effective delivery of massive data content is a pressing challenge facing modern mobile communication networks. In the literature, two primary approaches to tackle this challenge are service-tier differentiation and personalized proactive content caching. However, these two approaches have not been integrated and studied in a unified framework. This paper proposes an integrated proactive content delivery scheme that jointly exploits the availability of multiple service tiers and multi-user behavior prediction. Three optimal algorithms and one heuristic algorithm are introduced to solve the cost-minimization problems of multi-user proactive content delivery under different modelling assumptions. The performance of the proposed scheme is systematically investigated to reveal the impacts of proactive window size, service-tier price ratio, and traffic cost model on the system performance.


Introduction
The rapid proliferation of smart phones and mobile Internet has driven an explosive growth of mobile data traffic demand. According to Cisco's report [1], global mobile data traffic will reach 49 exabytes per month by 2021. Among various types of mobile applications, content delivery (e.g., web browsing, video streaming) consumes the majority of the mobile data traffic. The same report [1] estimated that video content will account for 78% of the world's total mobile traffic in 2021. However, the high price of mobile data plans (e.g., cost per Mbyte) is still one of the main factors hindering the ubiquitous adoption of mobile video applications. Therefore, significant research interest has been devoted to designing a mobile content delivery network that is cost-friendly to massive content delivery services.
In contrast to the high price of mobile data plans, the overall utilization of the mobile communication network's capacity is relatively low. This is because the mobile traffic demand varies significantly across space and time [2][3][4][5], while the network is typically built to accommodate the peak traffic demand. Consequently, a large amount of "redundant capacity" (i.e., the difference between the actual traffic load and the network capacity) is not used during off-peak hours [6], resulting in a low overall utilization of the network. It is widely anticipated that improving the network utilization can help to reduce the cost per bit for mobile operators and ultimately the price per bit for mobile users.
Previous studies on mobile content delivery have taken either an ISP-centric perspective or a CP-centric perspective. To the best of our knowledge, studies that unify both perspectives are still rare. In this paper, we propose a content delivery scheme that integrates both perspectives. Our scheme can simultaneously exploit the availability of differentiated service tiers and the predictability of user behavior. The main contributions of our paper are as follows. First, we propose a proactive content delivery scheme with service-tier awareness and user behavior prediction for the purpose of cost reduction. Second, considering a baseline scheme of proactive content delivery with one time-slot, we derive the optimal content delivery policy that minimizes the long-term cost. Third, considering a generalized scheme of multi-time-slot proactive content delivery, we propose a near-optimal heuristic algorithm for cost reduction. The performance of the proposed schemes is systematically evaluated to reveal key insights into the impacts of various system parameters on the cost.
The remainder of this paper is organized as follows. Section 2 describes the system model. Sections 3 and 4 formulate and analyze the problems of proactive content delivery in single-time-slot and multi-time-slot cases, respectively. Numerical results are presented in Section 5. Finally, conclusions are drawn in Section 6.

Model of Communication Service Tiers
We consider a system consisting of a CP, an ISP, and N users. The content data is delivered from the CP to users via the ISP, as shown in Figure 1. For simplicity, we assume that the ISP offers two service tiers: a primary traffic (PT) service and a secondary traffic (ST) service. For concreteness, we further assume that the ST only utilizes the redundant capacity of the network [6]. This assumption has two implications. First, ST has a strictly lower priority than PT, so the unit cost of ST (e.g., dollars per kilobyte) is also lower than that of PT. The ratio of ST cost over PT cost is denoted as β, where 0 ≤ β ≤ 1. Second, the capacity of ST is upper bounded by the redundant capacity of the network.

The total system capacity depends on the infrastructure deployment and network planning of the ISP. Once a network is rolled out, the system capacity is relatively stable. Redundant capacity is given by the difference between the system capacity and the primary traffic volume. Because the primary traffic volume fluctuates over time, the redundant capacity also changes dynamically. In practice, redundant capacity can be estimated by subtracting the primary traffic load, which can be measured in real time, from the pre-defined system capacity. We note that our paper focuses on the problem of proactive content delivery, which has a time scale of seconds or minutes. Within such a time scale, the volume of redundant capacity can be treated as fixed. Therefore, our model captures the daily traffic fluctuation by a single parameter $Cr_t$, which indicates the currently available redundant capacity, i.e., the upper limit for ST at time t.

Within each service tier, the total traffic cost $C(L)$ is a function of the traffic load L. The cost is interpreted as the cost to the ISP for secondary service provision (i.e., transmitting more data using redundant capacity). It is assumed that this cost to the ISP is proportional to the cost incurred by the CP to access the communication services provided by the ISP. Two cost models are considered in our paper. One is the simple case of volume-based or linear cost, in which the cost per unit traffic remains unchanged regardless of the traffic load L. In this case, we have $C_l(L) = k_l L$, where the cost is linearly proportional to the traffic load. The other case is quadratic cost, where $C_q(L) = k_q L^2$. This model reflects the fact that the cost to the ISP of supporting higher data rates scales non-linearly with the data rate. Such non-linear scaling is rooted in Shannon's capacity formula: once the physical bandwidth is fixed, the data rate can be improved by increasing the transmit power, but with diminishing returns. In the literature, the cost-traffic volume function is commonly approximated by a quadratic function for analytical convenience [18].
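As a quick numerical illustration of the two cost models, the sketch below compares sending 400 MB in a single slot with spreading the same volume evenly over two slots. The constants k_l = 2 and k_q = 0.005 are the values quoted in the simulation section; the function names and the example loads are illustrative assumptions, not taken from the paper.

```python
def linear_cost(load, k_l=2.0):
    """Volume-based cost C_l(L) = k_l * L."""
    return k_l * load


def quadratic_cost(load, k_q=0.005):
    """Quadratic cost C_q(L) = k_q * L**2."""
    return k_q * load ** 2


# One burst of 400 MB versus the same volume spread over two slots.
burst = [400.0]
smooth = [200.0, 200.0]

linear_burst = sum(linear_cost(load) for load in burst)
linear_smooth = sum(linear_cost(load) for load in smooth)
quad_burst = sum(quadratic_cost(load) for load in burst)
quad_smooth = sum(quadratic_cost(load) for load in smooth)

print(linear_burst, linear_smooth)  # identical under the linear model
print(quad_burst, quad_smooth)      # smoothing halves the quadratic cost
```

Under the linear model only the total volume matters, whereas under the quadratic model the same volume is cheaper when it is spread out, which is why load smoothing matters for the quadratic case.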

Model of User Behavior
We assume that time is slotted into unit intervals indexed by t. The CP is assumed to be able to make probabilistic predictions of the users' content request behavior based on historical traces. The prediction states that user n (n ∈ {1, 2, ..., N}) will consume a total of $\xi_{n,t}$ amount of data at time slot t with probability $p_{n,t}$, where $0 \le \xi_{n,t} < \infty$ and $0 \le p_{n,t} \le 1$. A random binary variable $I_{n,t}$ is used to indicate whether the nth user's request actually occurs at time t, i.e.,

$$I_{n,t} = \begin{cases} 1, & \text{with probability } p_{n,t} \\ 0, & \text{with probability } 1 - p_{n,t} \end{cases} \tag{1}$$

It is assumed that the arrival and content consumption behaviors of different users are independent of each other. Furthermore, user demands are assumed to be cyclic-stationary. This assumption is supported by various measurements showing that the user demand fluctuates in a periodic pattern [40,43] (e.g., on a daily basis). As a result, we can group multiple time slots into a cyclic period. The number of time slots in a period is denoted as T. It follows that

$$\xi_{n,t} = \xi_{n,t+T}, \qquad p_{n,t} = p_{n,t+T} \tag{2}$$
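Because the request indicators $I_{n,t}$ are independent binary variables with $\Pr(I_{n,t} = 1) = p_{n,t}$, the first two moments of the requested load in a slot follow directly. The expressions below are a standard consequence of the stated assumptions rather than formulas quoted from the paper, and the symbol $L_t$ is introduced here only for illustration.

$$L_t = \sum_{n=1}^{N} \xi_{n,t} I_{n,t}, \qquad \mathbb{E}[L_t] = \sum_{n=1}^{N} p_{n,t}\,\xi_{n,t}, \qquad \mathrm{Var}[L_t] = \sum_{n=1}^{N} p_{n,t}(1 - p_{n,t})\,\xi_{n,t}^2$$

$$\mathbb{E}[L_t^2] = \mathrm{Var}[L_t] + \big(\mathbb{E}[L_t]\big)^2$$

The mean is what a linear cost depends on, while a quadratic cost depends on the second moment and is therefore also sensitive to the variance of the load.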

Protocols of Proactive Content Delivery
We propose a protocol that is simultaneously aware of the service tiers and user behavior predictions. This requires a certain degree of collaboration and information sharing between the ISPs and CPs. At time slot t, the protocol uses the PT service tier to satisfy users' instantaneous content demand in the current slot. This is called reactive content delivery (RCD). Meanwhile, if redundant capacity is available, the protocol proactively pushes a portion of the forecasted content demand of the upcoming time slots using the ST service tier. This is called proactive content delivery (PCD). As the process iterates, the content demand at time t is delivered partly by RCD via the PT service tier and partly by PCD via the ST service tier. The main difference from traditional proactive caching schemes is that RCD and PCD are associated with the PT service tier and the ST service tier, respectively.
Suppose that PCD is conducted over a length of W time-slots, where 1 ≤ W ≤ T and τ ∈ {1, 2, ..., W}. The degenerate case W = 0 corresponds to a purely reactive content delivery mechanism, which serves as our baseline. The case of W = 1 is called single-slot PCD (SPCD), while the more general case of 1 < W ≤ T is called multi-slot PCD (MPCD). We use $x_{n,t}(\tau)$ to denote the portion of data that is expected for user n at time t + τ but is proactively pushed to the user at time-slot t. Here τ denotes how many time slots ahead the data is proactively cached. The main parameters in this paper are summarized in Table 1.

$x_{n,t+1}$: portion of proactively delivered data to be consumed at time-slot t + 1 (unit: MB)
$x_{n,t}(\tau)$: portion of proactively delivered data to be consumed at time-slot t + τ (unit: MB)
β: ratio of the cost of the ST service tier over the PT tier

Problem Formulation
This section considers proactive content delivery with a single time-slot, where forecasted user demand can be sent one time-slot ahead using the ST service tier. At a given time-slot t, the cost is composed of two parts: the cost generated by RCD through the PT service tier, and the cost generated by PCD through the ST service tier. The time-average expected cost can be written as:

$$\bar{C}(\mathbf{x}) = \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}\left[ C\left(\sum_{n=1}^{N} (\xi_{n,t} - x_{n,t})\, I_{n,t}\right) + \beta\, C\left(\sum_{n=1}^{N} x_{n,t+1}\right) \right] \tag{3}$$

where we define an N × T matrix $\mathbf{x}$, the elements of which are $x_{n,t}$, ∀n, t. In Equation (3), $x_{n,t+1}$ represents the portion of proactively pushed data for the next time slot t + 1, and the expectation is taken over the random variables $I_{n,t}$. The proactively received data for each user should not exceed the user's demand at time t, i.e.,

$$0 \le x_{n,t} \le \xi_{n,t} \tag{4}$$

and the total amount of proactively pushed data cannot exceed the upper limit of the redundant capacity at the current time-slot t, i.e.,

$$\sum_{n=1}^{N} x_{n,t+1} \le Cr_t \tag{5}$$

The main objective is to minimize the total cost over the feasible space of $\mathbf{x}$. The optimization problem can be formulated as

$$\min_{\mathbf{x}} \; \bar{C}(\mathbf{x}) \quad \text{s.t.} \quad 0 \le x_{n,t} \le \xi_{n,t}, \quad \sum_{n=1}^{N} x_{n,t+1} \le Cr_t \quad \forall n, t \tag{6}$$

For comparison purposes, we also consider the baseline case of pure RCD. The time-average expected cost in this case is given by

$$\bar{C}_{\mathrm{RCD}} = \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}\left[ C\left(\sum_{n=1}^{N} \xi_{n,t} I_{n,t}\right) \right] \tag{7}$$

where $\sum_{n=1}^{N} \xi_{n,t} I_{n,t}$ is the actual traffic load requested at time t. In this case, the system is purely reactive to the users' requests and there is no decision variable to be optimized.

Linear Cost Model
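Assuming the linear cost model, we can substitute $C_l(L)$ into Equation (3) to yield

$$\bar{C}_l(\mathbf{x}) = \frac{k_l}{T}\sum_{t=1}^{T}\sum_{n=1}^{N}\left[ p_{n,t}(\xi_{n,t} - x_{n,t}) + \beta\, x_{n,t+1} \right] = \frac{k_l}{T}\sum_{t=1}^{T}\sum_{n=1}^{N}\left[ p_{n,t}\,\xi_{n,t} - (p_{n,t} - \beta)\, x_{n,t} \right] \tag{8}$$

We note that the property of cyclic-stationary user demand (i.e., $x_{n,t} = x_{n,t+T}$) is used in Equation (8) to give $\sum_{t=1}^{T} x_{n,t+1} = \sum_{t=1}^{T} x_{n,t}$. From Equation (8), we can see that the optimization problem in Equation (6) becomes a linear programming problem, which can be easily solved by classic methods such as the dual interior point method.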
A closer look at Equation (8) reveals a key insight: both the cost and the PCD decision variable $\mathbf{x}$ are determined by the relative difference between the cost ratio β and the users' arrival probabilities $p_{n,t}$. When $p_{n,t} > \beta$, PCD for the nth user is beneficial for cost reduction. When $p_{n,t} < \beta$, PCD for the nth user becomes harmful because there is a higher likelihood that the pushed data will not actually be consumed by the user, so that the resource used for PCD is wasted. When $p_{n,t} = \beta$, PCD for the nth user makes no difference.
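Since the linear objective and its constraints couple only the users that share a target slot, each slot can be treated as a small fractional knapsack with unit weights: capacity goes first to the users with the largest saving per MB, p - β, and users with p ≤ β receive nothing. The following is a minimal sketch of that observation, not the paper's solver; the function name and the sample numbers are hypothetical.

```python
def single_slot_linear_pcd(demand, prob, beta, capacity):
    """Greedy solution of one per-slot subproblem under the linear cost model.

    demand[n] : predicted demand of user n in the target slot (MB)
    prob[n]   : request probability of user n in the target slot
    beta      : ST/PT cost ratio
    capacity  : redundant capacity available when the data is pushed (MB)
    Returns the list of proactively pushed amounts, one per user.
    """
    pushed = [0.0] * len(demand)
    remaining = capacity
    # Visit users in order of decreasing request probability.
    for n in sorted(range(len(demand)), key=lambda i: prob[i], reverse=True):
        if prob[n] <= beta or remaining <= 0:
            break  # no further user can reduce the expected cost
        pushed[n] = min(demand[n], remaining)
        remaining -= pushed[n]
    return pushed


demand = [300.0, 200.0, 100.0]
prob = [0.9, 0.4, 0.7]
x = single_slot_linear_pcd(demand, prob, beta=0.5, capacity=350.0)
print(x)  # [300.0, 0.0, 50.0]
```

In this example the user with p = 0.4 is skipped because β = 0.5 exceeds its request probability, and with β = 1 nothing is pushed at all.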

Quadratic Cost Model
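When the cost is a quadratic function of the traffic load, the cost increases rapidly as the load increases. In this case, PCD becomes more useful because it helps to smooth the traffic load and reduce fluctuations over time. Substituting $C_q(L)$ into Equation (3) and using the independence of the $I_{n,t}$ yields:

$$\bar{C}_q(\mathbf{x}) = \frac{k_q}{T}\sum_{t=1}^{T}\left[ \sum_{n=1}^{N} p_{n,t}(\xi_{n,t} - x_{n,t})^2 + \sum_{n=1}^{N}\sum_{m \ne n} p_{n,t}\,p_{m,t}(\xi_{n,t} - x_{n,t})(\xi_{m,t} - x_{m,t}) + \beta \sum_{n=1}^{N}\sum_{m=1}^{N} x_{n,t+1}\, x_{m,t+1} \right] \tag{9}$$

We can see that in this case, we no longer have a simple intuitive solution for $\mathbf{x}$. However, it can be proved that the problem in Equation (9) is a convex optimization problem (see Appendix A). Hence, the optimal solution can be readily obtained using standard convex optimization techniques.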
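The expectation of the squared reactive load has a closed form because the request indicators are independent binary variables. The sketch below, a check written for illustration and not code from the paper, compares that closed form with a brute-force sum over all request outcomes. The residual demands (demand minus proactively delivered data) and the probabilities are hypothetical.

```python
from itertools import product


def expected_sq_load_closed_form(amount, prob):
    """E[(sum_n a_n I_n)^2] for independent binary I_n with P(I_n = 1) = p_n."""
    mean = sum(a * p for a, p in zip(amount, prob))
    var = sum(a * a * p * (1.0 - p) for a, p in zip(amount, prob))
    return var + mean ** 2


def expected_sq_load_enumerated(amount, prob):
    """Same expectation obtained by summing over all 2^N request outcomes."""
    total = 0.0
    for outcome in product((0, 1), repeat=len(amount)):
        weight = 1.0
        for occurred, p in zip(outcome, prob):
            weight *= p if occurred else 1.0 - p
        load = sum(a * occurred for a, occurred in zip(amount, outcome))
        total += weight * load ** 2
    return total


residual = [120.0, 80.0, 250.0]  # residual demand per user in one slot (MB)
prob = [0.9, 0.4, 0.7]
closed = expected_sq_load_closed_form(residual, prob)
brute = expected_sq_load_enumerated(residual, prob)
print(closed, brute)
```

The closed form costs O(N) per slot, whereas enumeration grows as 2^N, so the former is the practical choice when the objective has to be evaluated repeatedly inside a solver.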

Problem Formulation
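As a generalization of the single-time-slot case, portions of a user's predicted demand can be pushed to the user multiple time-slots ahead through the ST service tier. The time-average expected cost in this case is given by:

$$\bar{C}(\mathbf{x}) = \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}\left[ C\left(\sum_{n=1}^{N} \Big(\xi_{n,t} - \sum_{\tau=1}^{W} x_{n,t-\tau}(\tau)\Big) I_{n,t}\right) + \beta\, C\left(\sum_{n=1}^{N}\sum_{\tau=1}^{W} x_{n,t}(\tau)\right) \right] \tag{10}$$

where the decision variable $\mathbf{x}$ is an N × T × W matrix, whose elements are given by $x_{n,t}(\tau)$, ∀n, t, τ.

Here, what differs from the single-time-slot case is that user n's cached data at time t is the accumulated data pushed during the previous W time-slots. PCD for each user is constrained by the individual user demand, i.e.,

$$\sum_{\tau=1}^{W} x_{n,t-\tau}(\tau) \le \xi_{n,t}, \qquad x_{n,t-\tau}(\tau) \ge 0 \tag{11}$$

In addition, the total amount of PCD data of all users at any time-slot t cannot exceed the current redundant capacity, i.e.,

$$\sum_{n=1}^{N}\sum_{\tau=1}^{W} x_{n,t}(\tau) \le Cr_t \tag{12}$$

The optimization problem can then be formulated as:

$$\min_{\mathbf{x}} \; \bar{C}(\mathbf{x}) \quad \text{s.t.} \quad \sum_{\tau=1}^{W} x_{n,t-\tau}(\tau) \le \xi_{n,t}, \quad x_{n,t-\tau}(\tau) \ge 0, \quad \sum_{n=1}^{N}\sum_{\tau=1}^{W} x_{n,t}(\tau) \le Cr_t \quad \forall n, t, \tau \tag{13}$$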
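The bookkeeping behind the multi-slot constraints can be sketched as follows, assuming slot indices wrap around the cyclic period of length T and that the amount pushed at slot s for consumption at slot s + τ is stored as `x[n][s][tau - 1]`. The helper names and this storage layout are illustrative choices, not part of the paper.

```python
def cached_amount(x, n, t, W):
    """Data already cached for user n when slot t starts.

    Sums what was pushed at slots t-1, ..., t-W for consumption at slot t,
    with slot indices taken modulo the period length T.
    """
    T = len(x[n])
    return sum(x[n][(t - tau) % T][tau - 1] for tau in range(1, W + 1))


def is_feasible(x, demand, capacity, W, tol=1e-9):
    """Check non-negativity, the per-user demand bound and the capacity bound."""
    N, T = len(x), len(x[0])
    for n in range(N):
        for t in range(T):
            if any(v < -tol for v in x[n][t]):
                return False
            if cached_amount(x, n, t, W) > demand[n][t] + tol:
                return False
    for t in range(T):
        pushed_now = sum(sum(x[n][t]) for n in range(N))
        if pushed_now > capacity[t] + tol:
            return False
    return True


# One user, T = 3, W = 2: at slot 0 push 10 MB for slot 1 and 5 MB for slot 2.
x = [[[10.0, 5.0], [0.0, 0.0], [0.0, 0.0]]]
demand = [[0.0, 10.0, 5.0]]
print(cached_amount(x, 0, 1, 2), cached_amount(x, 0, 2, 2))
print(is_feasible(x, demand, capacity=[15.0, 0.0, 0.0], W=2))
```

Note that the capacity check is indexed by the slot in which data is pushed, while the demand check is indexed by the slot in which it is consumed.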

Linear Cost Model
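Substituting the linear cost function $C_l(L)$ into Equation (10), we get

$$\bar{C}_l(\mathbf{x}) \overset{(a)}{=} \frac{k_l}{T}\sum_{t=1}^{T}\sum_{n=1}^{N}\left[ p_{n,t}\Big(\xi_{n,t} - \sum_{\tau=1}^{W} x_{n,t-\tau}(\tau)\Big) + \beta \sum_{\tau=1}^{W} x_{n,t}(\tau) \right] \overset{(b)}{=} \frac{k_l}{T}\sum_{t=1}^{T}\sum_{n=1}^{N}\left[ p_{n,t}\,\xi_{n,t} - \sum_{\tau=1}^{W} (p_{n,t} - \beta)\, x_{n,t-\tau}(\tau) \right] \tag{14}$$

In Equation (14), the equality (b) follows from $\sum_{t=1}^{T} x_{n,t-\tau}(\tau) = \sum_{t=1}^{T} x_{n,t}(\tau)$, which holds by cyclic stationarity. We can see that the optimization problem reduces to a linear programming problem. Similar to the single-time-slot case, the effectiveness of PCD still depends on the relative difference between the traffic cost ratio β and user n's arrival probability $p_{n,t}$. However, the proactive data that user n receives from different time-slots, i.e., $x_{n,t-\tau}(\tau)$, depends on the redundant capacity of the previous W time-slots. This requires proper monitoring of the real-time redundant capacity over multiple time slots.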

Quadratic Cost Model
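Substituting the quadratic cost function $C_q(L)$ into Equation (10), we have

$$\bar{C}_q(\mathbf{x}) = \frac{k_q}{T}\sum_{t=1}^{T} \left\{ \mathbb{E}\left[ \left(\sum_{n=1}^{N} \Big(\xi_{n,t} - \sum_{\tau=1}^{W} x_{n,t-\tau}(\tau)\Big) I_{n,t}\right)^{2} \right] + \beta \left(\sum_{n=1}^{N}\sum_{\tau=1}^{W} x_{n,t}(\tau)\right)^{2} \right\}$$

This yields a complicated non-linear optimization problem, and there is no straightforward proof of its convexity. However, because the objective function can be easily evaluated in closed form, general-purpose heuristic search algorithms such as pattern search [44] can be used to solve the problem effectively.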
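To give a flavour of how a derivative-free pattern search proceeds, the sketch below implements a basic compass search with box constraints: it polls one coordinate at a time, accepts any improving point, and halves the step when no poll point improves. This is a generic textbook variant written for illustration, not necessarily the algorithm of [44]; coupled constraints such as the capacity limit would need extra handling (for example a penalty term), which is omitted here. The test function is hypothetical.

```python
def compass_search(f, x0, lower, upper, step=1.0, tol=1e-6, max_iter=10000):
    """Basic derivative-free compass (pattern) search on a box."""
    x = list(x0)
    fx = f(x)
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                # Clip the trial point to the box constraints.
                trial[i] = min(upper[i], max(lower[i], x[i] + delta))
                f_trial = f(trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step *= 0.5  # shrink the pattern when no poll point improves
    return x, fx


# Toy objective whose unconstrained minimum (3, -1) lies outside the box.
def objective(v):
    return (v[0] - 3.0) ** 2 + (v[1] + 1.0) ** 2


best_x, best_f = compass_search(objective, [5.0, 5.0], [0.0, 0.0], [10.0, 10.0])
print(best_x, best_f)
```

Only objective evaluations are needed, which is why this family of methods suits problems whose objective is cheap to evaluate but whose convexity is not established.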

Simulation Results
This section presents numerical results that support the preceding analysis. For illustration purposes, we set T = 10 and N = 3. User n's demand at time t is drawn from a uniform distribution on [0, 500]; the arrival probability of user n at time t follows a uniform distribution on [0, 1]. The scaling constants in the linear and quadratic cost models are given by $k_l = 2$ and $k_q = 0.005$, respectively. The case of pure RCD, where there is no proactive caching, is also presented as a performance benchmark.
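A minimal sketch of how such a random instance and the pure-RCD benchmark could be generated is given below. It uses the parameter values quoted above; the function names, the seed, and the use of Python's `random` module are assumptions made for illustration and do not reproduce the authors' simulation code.

```python
import random


def make_instance(N=3, T=10, seed=0):
    """Draw demands from U[0, 500] (MB) and request probabilities from U[0, 1]."""
    rng = random.Random(seed)
    demand = [[rng.uniform(0.0, 500.0) for _ in range(T)] for _ in range(N)]
    prob = [[rng.uniform(0.0, 1.0) for _ in range(T)] for _ in range(N)]
    return demand, prob


def rcd_cost_linear(demand, prob, k_l=2.0):
    """Time-average expected cost of pure RCD under C_l(L) = k_l * L."""
    N, T = len(demand), len(demand[0])
    return sum(k_l * demand[n][t] * prob[n][t]
               for n in range(N) for t in range(T)) / T


def rcd_cost_quadratic(demand, prob, k_q=0.005):
    """Time-average expected cost of pure RCD under C_q(L) = k_q * L**2."""
    N, T = len(demand), len(demand[0])
    total = 0.0
    for t in range(T):
        mean = sum(demand[n][t] * prob[n][t] for n in range(N))
        var = sum(demand[n][t] ** 2 * prob[n][t] * (1.0 - prob[n][t])
                  for n in range(N))
        total += k_q * (var + mean ** 2)  # E[L^2] = Var[L] + E[L]^2
    return total / T


demand, prob = make_instance()
print(rcd_cost_linear(demand, prob), rcd_cost_quadratic(demand, prob))
```

Averaging such costs over many seeds would give a benchmark curve against which any PCD policy can be compared.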

Case of Single Time-Slot
Using the linear cost model, Figure 2 shows how the time-average expected cost and the redundant capacity utilization change as a function of the ST/PT cost ratio β. The results are obtained by solving the linear optimization problem defined in Section 3.2 and averaging over 100 realizations. A smaller value of β leads to a lower cost and a higher utilization of the redundant capacity. This is expected because a smaller β more strongly encourages the use of PCD through the ST service tier. When β = 1, which means the two service tiers have the same cost, there is no performance gain from using PCD at all. Moreover, a larger amount of redundant capacity helps to reduce the cost because more user demand can be accommodated via the ST service tier.

Using the quadratic cost model, Figure 3 shows how the time-average expected cost and the redundant capacity utilization change as a function of the ST/PT cost ratio β. The results are obtained by solving the convex optimization problem defined in Section 3.3. The general trend observed in Figure 3 is similar to that in Figure 2, i.e., a smaller value of β leads to a lower cost and a higher utilization of the redundant capacity. However, a key difference from Figure 2 occurs when β approaches 1, where PCD is shown to be useful for cost reduction even when the costs of ST and PT are the same. For example, at $Cr_t$ = 400 and β = 1, the time-average cost can be reduced by nearly 32% (as opposed to 0% in Figure 2) and the redundant capacity utilization is about 43% (as opposed to 0% in Figure 2). This is because PCD helps to smooth the user demand in time, and a more balanced user demand yields a lower cost under the quadratic cost model.

Case of Multiple Time-Slot
Figure 4 shows three plots of the performance of multi-time-slot PCD under the linear cost model. The results are obtained by solving the linear optimization problem defined in Section 4.2. The general conclusion drawn from Figure 4 is the same as that from Figure 2, i.e., PCD is not useful when there is no cost difference between the ST and PT service tiers. Apart from this, Figure 4a-c further reveal the impact of the proactive window size on the performance. Increasing the window size does help to further reduce the cost, but the improvement is limited and becomes insignificant when W is greater than five. In Figure 4c, we can see that as β increases, the cost reduction gained by increasing W diminishes. This suggests that when the costs of the two service tiers are comparable, increasing the proactive window size W becomes less effective for cost saving.

Finally, Figure 5 shows three plots of the performance of multi-time-slot PCD under the quadratic cost model. The results are obtained by solving the non-linear optimization problem defined in Section 4.3 using pattern search. Compared with Figure 4, Figure 5 shows that using PCD is always useful for cost reduction regardless of the value of β. Even when β = 1, the cost can still be reduced by 53% thanks to the load smoothing effect. Moreover, increasing the window size also helps load smoothing, and is hence beneficial for all values of β. Table 2 further demonstrates the smoothing effect of multi-time-slot PCD on the network traffic load. Given $Cr_t$ = 400 and β = 0.5, the variance of the actual traffic across different time slots is shown as a function of the window size. We can see that increasing W helps to reduce the variance of the traffic load, but with diminishing returns, especially when W becomes greater than five.

The above simulation results show that both single-time-slot and multiple-time-slot PCD can bring a good performance gain for the CP. The performance gain increases with a lower cost ratio β and a larger window size W. However, the performance gain is fundamentally constrained by the volume of redundant capacity. In practice, this means close cooperation must be established between the CP and the ISP so that the volume of redundant capacity in the current network can be measured and shared in real time. For the ISP, our model helps to improve the overall utilization of network infrastructure and generate additional revenue. For the CP, our model helps to attract users and promote content consumption by reducing the cost of content delivery per bit. In summary, our model can offer a win-win situation for the ISP and the CP.

Conclusions
This paper proposes a personalized PCD scheme that aims to minimize the total cost of content delivery by means of multiple service-tier transmission and multi-user behavior prediction.

Figure 1. Illustration of the system model.

Figure 2. (a) The time-average expected cost as a function of the ST/PT cost ratio β; (b) the redundant capacity utilization as a function of the ST/PT cost ratio β (linear cost model, varying redundant capacity $Cr_t$).

Electronics 2018, 7, 15

Figure 3. (a) The time-average expected cost as a function of the ST/PT cost ratio β; (b) the redundant capacity utilization as a function of the ST/PT cost ratio β (quadratic cost model, varying redundant capacity $Cr_t$).

Figure 4. (a) The time-average expected cost as a function of the ST/PT cost ratio β; (b) the redundant capacity utilization as a function of the ST/PT cost ratio β; (c) the time-average expected cost as a function of the proactive window size W (linear cost model, $Cr_t$ = 400, varying window size W).

Figure 5. (a) The time-average expected cost as a function of the ST/PT cost ratio β; (b) the redundant capacity utilization as a function of the ST/PT cost ratio β; (c) the time-average expected cost as a function of the proactive window size W (quadratic cost model, $Cr_t$ = 400, varying window size W).

Table 1. Main parameters used in our model.

Table 2. The variance of traffic demand.
