Selective Offloading by Exploiting ARIMA-BP for Energy Optimization in Mobile Edge Computing Networks

Mobile Edge Computing (MEC) is an emerging technique that provides cloud-computing services near mobile devices at the edge of the network. Based on the MEC architecture, this paper proposes an ARIMA-BP-based Selective Offloading (ABSO) strategy, which minimizes the energy consumption of mobile devices while meeting the delay requirements. In ABSO, we exploit an ARIMA-BP model to estimate the computation capacity of the edge cloud, and then design a Selective Offloading Algorithm to obtain the offloading strategy. Simulation results show that ABSO significantly decreases the energy consumption of mobile devices in comparison with other offloading methods.


Introduction
With the popularity of mobile devices, a growing number of mobile applications demand computation capacity to provide various services. Nevertheless, mobile devices generally have limited computation resources and short battery lifetime, so some computationally intensive applications cannot be run successfully on mobile devices [1,2]. This conflict between resource-hungry applications and resource-limited mobile devices presents a formidable challenge.
Mobile Cloud Computing (MCC) is one way to address this challenge. Cloud computing [3] offers enormous storage space and computation resources. By transferring tasks from mobile devices to a resource-rich server, it can overcome the shortage of computation resources in mobile devices. However, because mobile devices are far away from the remote cloud, large delays for mobile users have become a critical challenge for cloud computing.
Mobile Edge Computing (MEC) is envisioned as an emerging technique to handle this challenge. It provides cloud-computing services at the mobile edge network, close to mobile devices [4]. The main advantages of MEC are as follows: (i) compared with local computing [5], it avoids the insufficient computation capacity of mobile devices; (ii) compared with Mobile Cloud Computing, it overcomes the large latency that occurs while tasks are transferred to the remote cloud. Thus, MEC presents a better compromise between delay-sensitive tasks and computation-intensive tasks.
Mobile devices can offload their tasks to the edge cloud for computing, or they can choose to finish them locally. The computation capability of the edge cloud is crucial for this decision. In existing works on task offloading, the computation capability of the edge cloud is assumed to be perfectly known and fixed. However, the computation capability of the edge cloud varies over time because the number of computation tasks is dynamic. For instance, the computation capability of the edge cloud declines when many more tasks are processed at the edge cloud.
In this paper, we focus on the problem of the computation capacity of the edge cloud, and propose an ARIMA-BP-based Selective Offloading strategy. The strategy minimizes the total energy consumption of mobile devices while meeting the tasks' delay constraints. The major contributions of this paper are summarized as follows:

• We propose a multi-device framework for task offloading in MEC networks, and we formulate an optimization problem that minimizes the energy consumption while meeting the delay constraints.
• To solve this problem, we devise an efficient strategy called ABSO (ARIMA-BP-based Selective Offloading). In ABSO, we propose an ARIMA-BP model to estimate the computation capacity of the edge cloud, and then design a Selective Offloading Algorithm to obtain the offloading strategy.
The rest of this paper is organized as follows. Section 2 reviews the related work. Section 3 presents the system model. In Section 4, we elaborate an ARIMA-BP model for estimating the computation capacity of the edge cloud and present a Selective Offloading Algorithm to solve the offloading problem. Section 5 provides the simulation results. Finally, Section 6 concludes the paper.
The goal of latency-based offloading strategies is to minimize the execution time of tasks. Liu et al. [10] formulated a latency minimization problem and proposed a search algorithm to obtain the optimal task scheduling strategy. Chen et al. [11] designed a distributed task offloading method based on game theory, which can achieve a Nash equilibrium. Yang et al. [12] proposed a technique based on compiler code analysis, which optimizes the execution time by dynamically offloading some of the code on the phone to the cloud. Mao et al. [13] presented a dynamic offloading strategy based on Lyapunov optimization to reduce execution time. Yang et al. [14] designed a heuristic partitioning method to minimize the average execution time. However, these works do not consider energy consumption.
The purpose of energy-based offloading policies is to decrease the energy consumption of mobile devices. Kamoun et al. [15] proposed a computation offloading strategy in which the optimization problem is expressed as a Markov decision process. Tao et al. [16] applied KKT conditions to solve the energy minimization problem and further proposed a request offloading method. Lyu et al. [17] developed a lightweight framework and then designed a selective offloading strategy to minimize the energy consumption of devices. Huang et al. [18] presented a dynamic offloading strategy that exploits Lyapunov optimization to reduce energy consumption. Munoz et al. [19] jointly optimized transmission time and data volume to minimize the mobile devices' energy consumption. However, these works only consider energy consumption and do not pay attention to delay.
Furthermore, only a few works have addressed computation offloading by jointly considering energy consumption and task execution time. Zhang et al. [20] introduced an online offloading strategy that minimizes energy consumption while keeping delay low. Guo et al. [21] proposed a distributed computation offloading algorithm for MEC networks that optimizes energy consumption and time delay simultaneously. Nevertheless, they ignored that the computation capability of the edge cloud (details are provided in Sections 3.1 and 4.2.4) varies with the number of computation tasks in mobile edge computing. Baccarelli et al. [22] analyzed the energy efficiency of big data streams and measured battery energy with a power monitor. Taherizadeh et al. [23] presented a distributed architecture that can manage dynamic IoT environments in which edge nodes may become overloaded because of increased workloads. Although they take into account workloads that change over time, they cannot describe the load of the cloud accurately.
Various models have been applied to resource prediction in the cloud environment. PRESS [24] adopts a Markov model and signature-driven methods for resource prediction, regarding it as a linear time prediction problem. Martin et al. [25] implemented a Recurrent Neural Network to estimate CPU utilization. In this paper, we consider the two aspects together. We develop an ARIMA-BP model to estimate the usage of the edge cloud and then calculate its computation capacity. Since the computation resources of the edge cloud can be obtained by ARIMA-BP in real time, the offloading strategy based on this estimate is relatively accurate and effective.

Scenario Description
The system model is shown in Figure 1. We consider a mobile-edge computing system that includes an edge cloud and multiple mobile devices with computation tasks. The edge cloud is a relatively large data center that provides computing capacity in proximity to mobile devices. Its computing capacity is smaller than that of a remote cloud and larger than that of mobile devices. Mobile devices offload their computing tasks to the edge cloud through a wireless base station. Suppose there are $N$ mobile devices and $K$ computation tasks in the MEC system. We define the sets of devices and tasks as $\mathcal{N} = \{1, 2, \ldots, N\}$ and $\mathcal{K} = \{1, 2, \ldots, K\}$. Time is divided into equal-sized slots. We assume that a mobile device can request only one task during a time slot, and that different mobile devices can request the same task. We also assume that the mobile devices are static. We define $u_{n,k}$ as the task $k$ that mobile device $n$ requests, and describe it by the tuple $\{w_k, s_k, D_{n,k}\}$, where $w_k$ is the amount of computation resource required for $u_{n,k}$ (i.e., the number of CPU cycles required to finish the task), $s_k$ is the data size of $u_{n,k}$ (i.e., the amount of data, such as program code and input parameters, transferred to the edge cloud), and $D_{n,k}$ is the completion time requirement of $u_{n,k}$. Furthermore, because the computation resources of the edge cloud are limited, we denote the computation capacity of the edge cloud by $c_s$.

Communication Model
When mobile device $n$ transfers task $k$ to the edge cloud, the uplink data transfer rate for mobile device $n$ is

$$r_n = B \log_2\left(1 + \frac{P_n H_n}{\sigma + I_n}\right),$$

where $B$ is the channel bandwidth, $\sigma$ is the noise power, $P_n$ is the transmit power of mobile device $n$, $H_n$ denotes the channel gain between the edge cloud and mobile device $n$, and $I_n$ indicates the interference between the edge cloud and mobile device $n$. Similar to many studies [16][17][18][26], we neglect the downlink transmission delay, because the output data of many applications is usually much smaller than the input data.
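As a rough illustration of the communication model, the following sketch evaluates an uplink rate under the assumption that it takes the usual Shannon-capacity form, $B \log_2(1 + P_n H_n / (\sigma + I_n))$. The function name `uplink_rate` and the linear channel gain value in the example are hypothetical and chosen only for illustration.

```python
import math


def uplink_rate(bandwidth_hz, tx_power_w, channel_gain, noise_power_w, interference_w=0.0):
    """Uplink rate in bit/s, assuming a Shannon-type formula.

    All quantities are linear (not dB). The signal-to-interference-plus-noise
    ratio is P * H / (sigma + I).
    """
    sinr = tx_power_w * channel_gain / (noise_power_w + interference_w)
    return bandwidth_hz * math.log2(1.0 + sinr)


# Illustrative values: B = 5 MHz, P_n = 0.5 W, sigma = 1e-10 follow the
# simulation settings; the channel gain 1e-6 is an assumed placeholder.
rate_bps = uplink_rate(5e6, 0.5, 1e-6, 1e-10)
print(f"uplink rate: {rate_bps / 1e6:.2f} Mbit/s")
```

Adding interference lowers the ratio inside the logarithm and therefore the rate, which is why $I_n$ appears in the denominator.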

Computation Model
Computation tasks can be processed either locally or at the edge cloud; we describe both cases in detail.

Local Computing
Let $f^{l}_{n}$ denote the CPU computation capacity of mobile device $n$. The local completion latency of $u_{n,k}$ is

$$T^{l}_{n,k} = \frac{w_k}{f^{l}_{n}}.$$

The computational energy consumption is expressed as

$$E^{l}_{n,k} = l \, w_k,$$

where $l$ is the energy consumption coefficient per CPU cycle. According to [27], $l$ can be obtained by

$$l = \kappa \left(f^{l}_{n}\right)^{2},$$

where $\kappa$ is the energy coefficient. We set $\kappa = 10^{-11}$ according to [28].
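The local computing model can be sketched as two small functions, assuming that latency is the cycle count divided by the CPU capacity and that the per-cycle energy coefficient grows with the square of the CPU capacity. The function names are hypothetical, and because the text does not state the units that go with the coefficient $10^{-11}$, the example uses unit-free toy numbers rather than the simulation values.

```python
def local_latency(cycles, cpu_capacity):
    """Local completion latency: required CPU cycles divided by CPU capacity."""
    return cycles / cpu_capacity


def local_energy(cycles, cpu_capacity, energy_coef=1e-11):
    """Local energy: (energy_coef * capacity^2) per cycle, times the cycle count."""
    per_cycle = energy_coef * cpu_capacity ** 2
    return per_cycle * cycles


# Toy numbers only, to show the scaling behaviour.
print(local_latency(10.0, 2.0))                  # latency halves if capacity doubles
print(local_energy(10.0, 2.0, energy_coef=0.5))  # energy grows with capacity squared
```

Under this model a faster local CPU shortens latency but raises energy quadratically, which is the trade-off that makes offloading attractive.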

Mobile Edge Cloud Computing
If mobile device $n$ chooses to offload task $k$ to the edge cloud, the process consists of task transmission and task execution. The total completion time therefore has two parts: (i) the task transmission time $T^{tra}_{n,k}$ and (ii) the task processing time $T^{pro}_{n,k}$. The total latency of $u_{n,k}$ for mobile-edge computing is

$$T^{c}_{n,k} = T^{tra}_{n,k} + T^{pro}_{n,k} = \frac{s_k}{r_n} + \frac{w_k}{f^{c}_{n,k}},$$

where $r_n$ is the uplink data transfer rate of mobile device $n$ and $f^{c}_{n,k}$ denotes the computation resource of the edge cloud assigned to $u_{n,k}$. Meanwhile, the energy consumption of mobile device $n$, denoted $E^{c}_{n,k}$, consists only of the transmission energy consumed while transferring the task to the edge cloud:

$$E^{c}_{n,k} = P_n T^{tra}_{n,k} = \frac{P_n s_k}{r_n}.$$
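A minimal sketch of the edge computing model, assuming that the transmission time is data size over uplink rate, the processing time is cycles over assigned edge resources, and the device only pays for transmission energy. Function and parameter names are hypothetical.

```python
def edge_latency(data_bits, cycles, rate_bps, edge_cpu_hz):
    """Total edge latency: transmission time plus processing time."""
    t_tra = data_bits / rate_bps      # time to upload the task
    t_pro = cycles / edge_cpu_hz      # time to execute it at the edge cloud
    return t_tra + t_pro


def edge_energy(data_bits, rate_bps, tx_power_w):
    """Device-side energy: transmit power times transmission time."""
    return tx_power_w * data_bits / rate_bps


# Illustrative numbers: an 8 Mbit task, 1e9 cycles, 4 Mbit/s uplink,
# 2 GHz assigned at the edge, 0.5 W transmit power.
print(edge_latency(8e6, 1e9, 4e6, 2e9))
print(edge_energy(8e6, 4e6, 0.5))
```

Note that the edge processing speed affects only the latency, not the device energy, so a slow uplink penalizes offloading on both metrics.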

Problem Formulation
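For the task offloading problem, we define the variable $x_{n,k} \in \{0, 1\}$ to express whether the task $u_{n,k}$ is offloaded to the edge cloud ($x_{n,k} = 1$) or not ($x_{n,k} = 0$).
Based on the above descriptions, the goal of the problem is to minimize the total energy consumption $E$ of the mobile devices:

$$\begin{aligned} \min_{\{x_{n,k}\}} \quad & E \\ \text{s.t.} \quad & \text{(C1)} \ \sum_{n \in \mathcal{N}} \sum_{k \in \mathcal{K}} x_{n,k} f^{c}_{n,k} \le c_s, \\ & \text{(C2)} \ x_{n,k} T^{c}_{n,k} + (1 - x_{n,k}) T^{l}_{n,k} \le D_{n,k}, \\ & \text{(C3)} \ x_{n,k} \in \{0, 1\}. \end{aligned}$$

Constraint (C1) guarantees that the total computation resources required by the offloaded tasks do not exceed the total capacity of the edge cloud during each time slot. Constraint (C2) requires that the task $u_{n,k}$ be completed within its time limit. Constraint (C3) indicates that the offloading decision variable is binary.
The objective function of the problem is defined as follows:

$$E = \sum_{n \in \mathcal{N}} \sum_{k \in \mathcal{K}} \left[ x_{n,k} E^{c}_{n,k} + (1 - x_{n,k}) E^{l}_{n,k} \right],$$

where $E^{c}_{n,k}$ and $E^{l}_{n,k}$ are the energy consumption of mobile device $n$ when $u_{n,k}$ is processed at the edge cloud and locally, respectively.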
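To make the formulation concrete, the sketch below evaluates the total energy of a given offloading decision and checks the three constraints as described in the text: capacity (C1), deadline (C2), and binary decisions (C3). The `TaskCost` container and its field names are hypothetical; the per-task energies and latencies are assumed to have been computed beforehand.

```python
from dataclasses import dataclass


@dataclass
class TaskCost:
    e_local: float   # energy if processed locally
    e_edge: float    # device energy if offloaded
    t_local: float   # latency if processed locally
    t_edge: float    # latency if offloaded
    deadline: float  # completion time requirement
    f_edge: float    # edge resources assigned if offloaded


def total_energy(tasks, x):
    """Objective: sum of edge energy for offloaded tasks and local energy otherwise."""
    return sum(xi * t.e_edge + (1 - xi) * t.e_local for t, xi in zip(tasks, x))


def is_feasible(tasks, x, capacity):
    """Check constraints C1 to C3 for a decision vector x."""
    if any(xi not in (0, 1) for xi in x):                      # C3: binary decisions
        return False
    if sum(xi * t.f_edge for t, xi in zip(tasks, x)) > capacity:  # C1: edge capacity
        return False
    return all((t.t_edge if xi else t.t_local) <= t.deadline      # C2: deadlines
               for t, xi in zip(tasks, x))


tasks = [TaskCost(5.0, 1.0, 0.5, 0.8, 1.0, 4.0),
         TaskCost(2.0, 3.0, 2.0, 0.9, 1.0, 4.0)]
print(total_energy(tasks, [1, 1]), is_feasible(tasks, [1, 1], capacity=8.0))
```

In this toy instance the second task is cheaper to run locally but would miss its deadline there, so the only feasible decision offloads both tasks.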

Research Motivation
In MEC, the edge cloud has relatively large storage and computation resources. Compared with the computation capacity of the mobile devices themselves, the computation capacity of the edge cloud is huge. Mobile edge computing brings computation resources to the mobile devices and reduces energy consumption in comparison with local execution, but it increases the transmission time. Hence, we need to consider both energy consumption and time delay. For this issue, the computation capacity of the edge cloud is a key factor.
In experimental environments, the computation capacity of the edge cloud is usually set to a fixed value. However, the computation capacity of the edge cloud is not static during task offloading. As the number of tasks processed at the edge cloud increases, its remaining computation resources decrease, reducing its computation capacity. When the number of tasks processed at the edge cloud decreases, its computation capacity increases. Therefore, obtaining the computation capacity of the edge cloud accurately is critical for task offloading.
In practice, the computation capacity of the edge cloud is hard to represent and is not available ahead of time because the tasks are dynamic. Hence, we can only obtain an estimate by prediction. Increasingly advanced big data analysis technology makes it possible to predict computation resources accurately. By predicting the resource usage of the edge cloud, we can estimate its computation capacity.

Estimation for Computation Capacity of Edge Cloud by ARIMA-BP
Resource prediction in the cloud can be regarded as a time-series prediction problem. So far, the weighted average method has been widely used in existing models to predict the trend of resource change. However, compound models that predict CPU resources by time-series analysis techniques are rarely used. Moreover, research has found that the time series of CPU resource usage is composed of both linear and non-linear structure.
The time-series prediction method mainly adopts a linear prediction model, which makes it unable to process nonlinear data accurately. Compared with the conventional time-series model, the BP neural network is an effective nonlinear modeling approach. It has obvious advantages in handling data with inconspicuous characteristics and much randomness and nonlinearity, but it is somewhat poorer than the conventional time-series model at processing linear data. Therefore, combining time-series prediction with a BP neural network can improve the prediction results.

ARIMA Model
The Auto-Regressive Integrated Moving Average (ARIMA) model combines the auto-regressive (AR) model and the moving average (MA) model. It is the most commonly used prediction model for non-stationary time series. The basic idea of the modeling is to use differencing to make the non-stationary time series stationary. The time series is then analyzed and predicted by determining three parameters from the truncation and tailing characteristics of the correlation functions: the auto-regressive order ($p$), the degree of differencing ($d$), and the moving average order ($q$). The structure of the ARIMA model is as follows:

$$x_t = \sum_{i=1}^{p} a_i x_{t-i} + \varepsilon_t + \sum_{j=1}^{q} b_j \varepsilon_{t-j},$$

where $x_i$ is the real value in time slot $i$, $p$ is the number of autoregressive terms, and $q$ is the order of the moving average. $a_i$ and $b_j$ are the weights of the model, and the terms $\varepsilon_t$ are the error terms of the model.
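The two ingredients described above, differencing and the AR plus MA combination, can be sketched in a few lines. This is a hand-rolled illustration with given coefficients, not a fitted model: the function names and the example coefficients are hypothetical, and the future error term is assumed to have zero mean, so it drops out of the one-step forecast.

```python
def difference(series, d=1):
    """Apply ordinary (integer-order) differencing d times."""
    out = list(series)
    for _ in range(d):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def arma_forecast(history, residuals, ar_coefs, ma_coefs):
    """One-step forecast: sum_i a_i * x_{t-i} + sum_j b_j * eps_{t-j}."""
    ar_part = sum(a * history[-i] for i, a in enumerate(ar_coefs, start=1))
    ma_part = sum(b * residuals[-j] for j, b in enumerate(ma_coefs, start=1))
    return ar_part + ma_part


print(difference([1, 4, 9, 16], d=2))  # second differences of a quadratic are constant
print(arma_forecast([1.0, 2.0, 3.0], [0.1, 0.2], ar_coefs=[0.5, 0.25], ma_coefs=[1.0]))
```

In practice the orders and coefficients would be estimated from data; here they are supplied by hand only to show how the terms of the model combine.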

Prediction of Resource Usage Using Fractional Differencing
PRESS [24] uses a Markov model and signature-driven methods for resource prediction. It divides the resource usage measurements into different intervals and then calculates the transition probability matrix. Finally, it estimates the probability of the next interval by exploiting the Chapman-Kolmogorov equations:

$$\pi_t = \pi_{t-1} P,$$

where $\pi_t$ and $\pi_{t-1}$ indicate the probabilities at time instants $t$ and $t-1$, and $P$ is the transition probability matrix. Moreover, AGILE [29] improves on PRESS by exploiting wavelets to predict the usage of resources in the cloud. ARIMA has also been used to estimate resource usage in the cloud. The methods introduced above assume that the time series do not change dynamically and have no memory. Nevertheless, cloud environments are constantly changing, and their workloads are highly dynamic. Traditional time-series models cannot capture real workload conditions efficiently. This creates a need for a more sophisticated time-series model with long memory. Thus, in this paper, the method of rescaled range analysis [30] is employed for fractional differencing [31]. The fractional difference can be expressed as

$$\tilde{x}_t = \sum_{i=0}^{\infty} (-1)^{i} \binom{d}{i} x_{t-i},$$

where $x_i$ denotes the value during interval $i$, $\tilde{x}_t$ represents the value of $x_t$ obtained after fractional differencing, and $d$ denotes the differencing parameter. The method is then applied to the multi-step-ahead predictions of the above-mentioned methods.
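A minimal sketch of fractional differencing, assuming the standard binomial expansion in which the weight on the value $i$ steps back is $(-1)^i \binom{d}{i}$. The weights are computed by the recurrence $w_0 = 1$, $w_i = w_{i-1}(i - 1 - d)/i$, and the infinite sum is truncated to the available history. The function names are hypothetical.

```python
def frac_diff_weights(d, n_terms):
    """First n_terms weights of the fractional difference expansion."""
    weights = [1.0]
    for i in range(1, n_terms):
        weights.append(weights[-1] * (i - 1 - d) / i)
    return weights


def frac_diff(series, d):
    """Fractionally difference a series, truncating the sum at the series start."""
    weights = frac_diff_weights(d, len(series))
    return [sum(weights[i] * series[t - i] for i in range(t + 1))
            for t in range(len(series))]


print(frac_diff_weights(0.5, 4))     # slowly decaying weights give long memory
print(frac_diff([1, 3, 6, 10], 1))   # d = 1 reduces to ordinary differencing
```

For non-integer $d$ the weights decay slowly instead of cutting off, so each differenced value still depends on distant past observations, which is the long-memory property the text refers to.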
The different models are validated on the Google cluster trace. The Google cluster trace is a cluster usage dataset divided into six tables. Among them, the task resource usage table records the usage of various resources (such as memory, disk, and CPU) during different time intervals. Here, we analyze CPU usage at different time intervals.
Table 1 shows the RMSE (Root Mean Squared Error) of the multi-step-ahead predictions of the above methods (PRESS, AGILE, and ARIMA) without and with fractional differencing. We can observe that fractional differencing significantly improves the prediction results of all models. As the prediction horizon increases, the RMSE of the CPU usage prediction increases, because errors accumulate gradually in the process of prediction. We can also observe that ARIMA has a smaller RMSE than the other models. Figure 2 shows the CPU usage predictions of PRESS, AGILE, and ARIMA. It can be seen from the figure that ARIMA reflects the trend of changes in CPU usage more accurately than PRESS and AGILE. Thus, we adopt the ARIMA model with fractional differencing for time-series prediction.
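For reference, the RMSE used to compare the predictors is the square root of the mean squared difference between actual and predicted values. The sketch below is a plain implementation; the function name and the sample numbers are illustrative and are not taken from the tables.

```python
import math


def rmse(actual, predicted):
    """Root Mean Squared Error between two equally long sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up CPU usage values, for illustration only.
print(rmse([0.30, 0.35, 0.40], [0.28, 0.36, 0.45]))
```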

Prediction of CPU Usage by ARIMA-BP
According to the ARIMA-BP method, the final prediction of CPU usage is obtained by adding the CPU usage estimated by ARIMA with fractional differencing to the residual sequence predicted by the BP neural network.
Table 2 gives the RMSE of the two methods (ARIMA and ARIMA-BP), both obtained with fractional differencing. We can see that the RMSE of ARIMA-BP is lower than that of ARIMA, leading to more accurate predictions. Figure 3 shows the CPU usage predictions of ARIMA and ARIMA-BP. It can be observed that the ARIMA-BP predictions are much closer to the actual CPU usage than the ARIMA predictions. Therefore, we adopt the ARIMA-BP model with fractional differencing to predict the computation capacity of the edge cloud in this paper.

ARIMA-BP for Prediction of Computation Capacity in Edge Cloud
In mobile edge computing, the computation capacity of the edge cloud is usually represented by the computation resources $f^{c}$ that can be allocated to mobile devices. We use $f^{c}(t)$ to indicate the computation resources that the edge cloud has allocated to mobile devices during the time slot $[t - \Delta t, t]$; the remaining computation resources of the edge cloud are then $\bar{f}^{c}(t) = c_s - f^{c}(t)$. Suppose that $m$ tasks $u_{n,k}$ are processed at the edge cloud during the time slot $[t - \Delta t, t]$. We have

$$f^{c}(t) = \sum_{u_{n,k}} f^{c}_{n,k},$$

where the sum runs over these $m$ tasks. We use the ARIMA-BP model to predict the usage of the computation resources of the edge cloud during a time slot. First, we build the ARIMA model. The usage of computation resources to be estimated during $[t, t + \Delta t]$, denoted by $f^{c}(t + \Delta t)$, is expressed below:

$$f^{c}(t + \Delta t) = \sum_{i=1}^{p} a_i f^{c}\big(t - (i - 1)\Delta t\big) + \varepsilon(t + \Delta t) + \sum_{j=1}^{q} b_j \varepsilon\big(t - (j - 1)\Delta t\big).$$

Then, we apply the fractional difference method to capture long-term dependencies in the data. Finally, we use the BP neural network to correct the residual error.
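The bookkeeping in this step is simple enough to sketch directly: the allocated resources are the sum of the resources assigned to the tasks running at the edge cloud, and the remaining resources are the total capacity minus that sum. The function names are hypothetical, and clamping the result at zero is an added safeguard for the case where a predicted usage overshoots the capacity.

```python
def allocated_resources(assigned):
    """Resources in use: sum of the resources assigned to each edge task."""
    return sum(assigned)


def remaining_resources(capacity, allocated):
    """Remaining edge resources, clamped at zero (assumed safeguard)."""
    return max(0.0, capacity - allocated)


used = allocated_resources([2.0, 3.5])
print(remaining_resources(10.0, used))
```

The same `remaining_resources` call applies whether `allocated` is a measured value for the current slot or a predicted value for the next one.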

A Selective Offloading Algorithm
Based on the estimated utilization of the computation resources of the edge cloud, $f^{c}(t)$, we can obtain the remaining computation resources of the edge cloud by $\bar{f}^{c}(t) = c_s - f^{c}(t)$. We then develop a selective offloading algorithm.
Initially, all mobile devices set their initial decisions during a time slot. We then introduce the condition below to prioritize urgent tasks that need to be offloaded. This is known as selective offloading.

Condition 1. If $T^{l}_{n,k} > D_{n,k}$, the task $u_{n,k}$ is offloaded to the edge cloud for processing.
Devices that satisfy Condition 1 are usually resource-constrained, and their tasks are latency-sensitive. We give priority to offloading these tasks because local computing cannot meet their delay constraints (i.e., $T^{l}_{n,k} > D_{n,k}$). Tasks must be accomplished within the specified time (i.e., the delay requirement) $D_{n,k}$. Therefore, requiring that the transmission time plus the processing time at the edge cloud not exceed $D_{n,k}$, we can calculate the minimum computation resources needed for each task by

$$f^{min}_{n,k} = \frac{w_k}{D_{n,k} - s_k / r_n},$$

where $r_n$ is the uplink data transfer rate of mobile device $n$. The edge cloud server obtains the remaining computation resources $\bar{f}^{c}(t)$ and broadcasts this value to the mobile devices. Mobile devices that do not meet Condition 1 are judged by Condition 2 after they receive $\bar{f}^{c}(t)$.
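The minimum resource requirement follows from asking that the upload time plus the edge processing time fit within the deadline. A small sketch, with a hypothetical function name, that also handles the case where the upload alone already exceeds the deadline (an added assumption, returning infinity so the task can never qualify):

```python
import math


def min_edge_resource(cycles, data_bits, rate_bps, deadline_s):
    """Smallest edge CPU allocation that still meets the deadline."""
    slack = deadline_s - data_bits / rate_bps  # time left after transmission
    if slack <= 0:
        return math.inf  # no allocation can meet the deadline
    return cycles / slack


# 1e9 cycles, 8 Mbit upload at 16 Mbit/s, 1 s deadline: 0.5 s left to compute.
print(min_edge_resource(1e9, 8e6, 16e6, 1.0))
```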
All tasks that satisfy Condition 2, i.e., whose minimum required resources can be met by the remaining computation resources of the edge cloud ($\bar{f}^{c}(t) \ge f^{min}_{n,k}$), are merged into the set $\Theta(t)$; tasks that do not satisfy Condition 2 are judged again in the next time slot.
While $\Theta(t) \neq \emptyset$, the tasks in $\Theta(t)$ compete for the decision-making opportunity by sending requests to the edge cloud. During each time slot, only one task can obtain the opportunity to make a decision. Suppose mobile device $n$ obtains the chance; it informs the other devices by broadcasting. Then, the energy consumption of local computing and of edge cloud processing for task $u_{n,k}$ are compared. If the energy consumption of edge cloud computing is smaller (i.e., $E^{c}_{n,k} < E^{l}_{n,k}$), the task is offloaded to the edge cloud for processing; otherwise, the task is processed locally. Mobile devices that do not obtain the opportunity are judged by Condition 2 again in the next time slot. Algorithm 1 shows the process of obtaining the offloading decision.
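The decision procedure described above can be sketched as follows. This is an interpretation under several stated assumptions, not a transcription of Algorithm 1: the initial decision is local execution, an offloaded task reserves exactly its minimum required resources, the competition among tasks is resolved by taking the first candidate in list order, and the rounds are run back to back against a single capacity snapshot. The `Task` fields are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    t_local: float   # local completion latency
    e_local: float   # local energy consumption
    e_edge: float    # device energy if offloaded
    deadline: float  # completion time requirement
    f_min: float     # minimum edge resources needed to meet the deadline


def selective_offloading(tasks, remaining):
    decision = {t.name: 0 for t in tasks}  # assumed initial decision: local
    pending = []
    # Condition 1: tasks that cannot meet the deadline locally go first.
    for t in tasks:
        if t.t_local > t.deadline:
            if t.f_min <= remaining:       # assumed: offload only if it fits
                decision[t.name] = 1
                remaining -= t.f_min
        else:
            pending.append(t)
    # Condition 2: one task per round gets the decision opportunity.
    while True:
        theta = [t for t in pending if t.f_min <= remaining]
        if not theta:
            break
        chosen = theta[0]                  # assumed tie-break: list order
        pending.remove(chosen)
        if chosen.e_edge < chosen.e_local: # offload only if it saves energy
            decision[chosen.name] = 1
            remaining -= chosen.f_min
    return decision, remaining


tasks = [Task("A", 2.0, 4.0, 1.0, 1.0, 3.0),
         Task("B", 0.5, 2.0, 1.0, 1.0, 4.0),
         Task("C", 0.5, 2.0, 3.0, 1.0, 2.0),
         Task("D", 0.5, 5.0, 1.0, 1.0, 5.0)]
print(selective_offloading(tasks, remaining=10.0))
```

In this toy run, task A is offloaded by Condition 1, B is offloaded because it saves energy, C stays local because offloading costs more, and D stays local because the remaining resources no longer cover its minimum requirement.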

Experiment Setup
We run simulations to evaluate the performance improvement that the presented computation offloading strategy can bring. We consider an edge computing system composed of the edge cloud and mobile devices, in which the mobile devices have relatively intensive computation tasks. The simulation settings are listed as follows. We set the transmission bandwidth $B = 5$ MHz, the transmit power $P_n = 0.5$ W, and the noise power $\sigma = 10^{-10}$. The channel gain $H_n$ is given by $H_n = 127 + 30 \times \log d$, where $d$ is the distance between the edge cloud and mobile device $n$.
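The channel expression above has the shape of a distance-dependent path loss in decibels. The sketch below evaluates it under assumptions the text does not state: the logarithm is taken as base 10, the distance is taken to be in kilometres, and the result is converted to a linear power gain before being used in a rate formula. The function names are hypothetical.

```python
import math


def path_loss_db(distance_km):
    """127 + 30 * log10(d); base-10 log and kilometres are assumptions."""
    return 127.0 + 30.0 * math.log10(distance_km)


def linear_gain(distance_km):
    """Assumed conversion from a dB loss to a linear power gain."""
    return 10.0 ** (-path_loss_db(distance_km) / 10.0)


for d in (0.1, 0.5, 1.0):
    print(d, path_loss_db(d), linear_gain(d))
```

Under these assumptions a device closer to the edge cloud sees a smaller loss and hence a larger linear gain.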
By default, the mobile devices have 50 computation tasks in total. For task $u_{n,k}$, the number of CPU cycles $w_k$ follows a normal distribution with an average of 1000 M cycles. The data size $s_k$ follows a normal distribution with an average of 3 MB, and the completion time $D_{n,k}$ is generated from a normal distribution with an average of 1 s. Furthermore, the computational capability of the mobile devices is assumed to be 1 GHz. These parameters are typical values in the research field and do not affect the conclusions drawn from the results.
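A workload with these means can be generated as sketched below. The text gives only the averages, so the standard deviation (10% of each mean), the lower clipping that keeps samples positive, the interpretation of 1 MB as $8 \times 10^{6}$ bits, and the function name are all assumptions made for illustration.

```python
import random


def generate_tasks(n_tasks=50, rel_std=0.1, seed=0):
    """Draw task parameters from normal distributions around the stated means."""
    rng = random.Random(seed)  # seeded for reproducibility

    def positive_normal(mean):
        # Clip at 1% of the mean so that no sample is zero or negative.
        return max(0.01 * mean, rng.gauss(mean, rel_std * mean))

    return [
        {
            "cycles": positive_normal(1000e6),    # mean 1000 M cycles
            "data_bits": positive_normal(3 * 8e6),  # mean 3 MB
            "deadline_s": positive_normal(1.0),   # mean 1 s
        }
        for _ in range(n_tasks)
    ]


workload = generate_tasks()
print(len(workload), workload[0])
```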

Task Offloading Evaluation
To evaluate the offloading performance, we compare our proposed scheme ABSO with the following task offloading strategies.

• Local execution: All computational tasks are processed locally.
• Full offloading: All computational tasks are offloaded to and executed on the edge cloud.
• Branch and Bound Algorithm (BBA): The objective function is transformed into a binary linear problem, which can be solved effectively by a branch-and-bound algorithm.
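To illustrate the third baseline, here is a generic depth-first branch-and-bound search over binary offloading decisions. It is a sketch of the general technique, not the BBA implementation used in the comparison: the tuple layout, the lower bound (the cheaper of the two energies for every undecided task, ignoring constraints), and the function name are assumptions.

```python
import math


def branch_and_bound(tasks, capacity):
    """tasks: list of (e_local, e_edge, local_ok, edge_ok, f_edge) tuples.

    local_ok / edge_ok say whether each option meets the task deadline.
    Returns (decisions, energy), or (None, inf) if no feasible decision exists.
    """
    n = len(tasks)
    # Optimistic bound on the energy of tasks i..n-1, ignoring all constraints.
    suffix = [0.0] * (n + 1)
    for i in range(n - 1, -1, -1):
        suffix[i] = suffix[i + 1] + min(tasks[i][0], tasks[i][1])
    best = {"energy": math.inf, "x": None}

    def visit(i, energy, used, x):
        if energy + suffix[i] >= best["energy"]:
            return  # prune: this branch cannot beat the incumbent
        if i == n:
            best["energy"], best["x"] = energy, list(x)
            return
        e_local, e_edge, local_ok, edge_ok, f_edge = tasks[i]
        if edge_ok and used + f_edge <= capacity:   # branch: offload task i
            visit(i + 1, energy + e_edge, used + f_edge, x + [1])
        if local_ok:                                # branch: run task i locally
            visit(i + 1, energy + e_local, used, x + [0])

    visit(0, 0.0, 0.0, [])
    return best["x"], best["energy"]


tasks = [(5.0, 1.0, True, True, 4.0),
         (5.0, 2.0, True, True, 4.0),
         (3.0, 1.0, True, True, 4.0)]
print(branch_and_bound(tasks, capacity=8.0))
```

With capacity for only two of the three tasks, the search keeps the task whose local execution is cheapest on the device. Local execution and full offloading correspond to the all-zeros and all-ones decision vectors, respectively.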
Figure 4 shows the energy consumption of mobile devices and the completion time of tasks under various numbers of computation tasks. The completion time and energy consumption gradually increase as the number of computation tasks increases. Local execution consumes the most energy, because the CPU computation capacity of a mobile device is relatively small. Full offloading has a higher completion time than the others, mainly because it needs to transmit data to the edge cloud before tasks can be processed. Moreover, the presented offloading strategy achieves better results than the others in reducing both energy and time, and its advantages become more obvious as the number of tasks increases. In terms of completion time, ABSO achieves up to 14%, 26%, and 7% improvement over local execution, full offloading, and BBA, respectively. In terms of energy consumption, ABSO achieves up to 42%, 22%, and 9% reduction compared to local execution, full offloading, and BBA, respectively.
Figure 5a depicts the influence of the data size of tasks on the energy consumption. We can observe that the presented ABSO outperforms the other methods. The energy consumption of local execution is relatively steady, and it becomes smaller than that of full offloading as $s_k$ increases. We can also see that as the data size of tasks increases, the energy consumption starts to increase. This is because, as the data size of tasks rises, the communication time becomes larger, and so does the energy consumption of edge cloud computing. Meanwhile, the difference in energy consumption between the strategies is subtle while the data size of tasks is small. Therefore, we infer that, other conditions being equal, tasks with a small amount of data should be processed at the edge cloud, whereas tasks with a large amount of data should be processed locally.
The energy consumption for different numbers of CPU cycles $w_k$ is shown in Figure 5b. The comparison shows that the energy consumption of the presented method ABSO is the lowest. The energy consumption of full offloading is relatively steady with little fluctuation. As the number of required CPU cycles increases, the energy consumption becomes larger, because the energy consumption of local execution grows with the required CPU cycles. From Figure 5b, we can conclude that, for a given data size, tasks that need few CPU cycles ought to be processed locally, whereas tasks that need many CPU cycles should be transferred to the edge cloud for processing.
In Figure 5c, we consider the energy consumption as the latency requirement varies. The energy consumption of local computing and full offloading changes little. This is because latency requirements are not taken into account when processing entirely locally or offloading entirely, so energy consumption is independent of the latency requirements. Offloading tasks to the edge cloud clearly helps to cut down energy consumption. When the tasks' delay deadlines are small, the performance of the proposed ABSO resembles that of full offloading and BBA, since most tasks are offloaded to the edge cloud for processing. The energy consumption of ABSO drops sharply and then levels off after 1.1 s. This is because, as the allowed latency becomes larger, ABSO uses an optimized strategy so that the energy consumption can be reduced slightly further.

Conclusions
In this paper, we designed an energy-efficient offloading strategy that also meets the users' delay requirements. We proposed the ARIMA-BP-based Selective Offloading (ABSO) strategy and designed it within a two-step framework. We first designed an ARIMA-BP model to estimate the computation capacity of the edge cloud. Then, we proposed a Selective Offloading Algorithm to obtain the offloading strategy. The simulation results show that ABSO achieves superior performance in reducing the energy consumption of mobile devices. In future work, we plan to take more general situations into account, such as the movement of mobile users during use.

Figure 1. A multi-user system for MEC.

Figure 2. Prediction on CPU usage for different models.

Modification of the Residual Error Correction by BP Neural Network

Back Propagation
Back Propagation (BP) is a multi-layer feedforward neural network trained by the error back propagation algorithm. It is the most widely used neural network training method in combination with an optimization method such as gradient descent. A BP neural network consists of two processes: forward propagation of information and back propagation of errors. The error of the output layer is used to estimate the error of the layer directly preceding it, and this error is in turn used to estimate the error of the previous layer. The network can continuously learn and store a large number of input-output mappings without describing the equation of the relationship directly.

Modifying the Residual Error Correction
After CPU usage has been fitted by the ARIMA prediction technique using fractional difference, we use Back Propagation to obtain the nonlinear residual. This involves the following steps:
1. Compute the residual sequence of the CPU usage.
2. Normalize the residual sequence.
3. Define the structure of the BP neural network.
4. Define the training parameters of the BP neural network.
5. Obtain the residual sequence by the network simulation.
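The residual correction steps can be sketched with a very small pure-Python network. Everything specific here is an assumption made for illustration: min-max normalization, a sliding window of past residuals as input, one hidden layer of tanh units with a linear output, plain stochastic gradient descent, and all function names and hyperparameters.

```python
import math
import random


def normalize(values):
    """Min-max normalization to [0, 1]; returns the scaled list, offset, and span."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0
    return [(v - lo) / span for v in values], lo, span


def make_windows(seq, lag):
    """Turn a sequence into (previous `lag` values -> next value) training pairs."""
    xs = [seq[i:i + lag] for i in range(len(seq) - lag)]
    ys = [seq[i + lag] for i in range(len(seq) - lag)]
    return xs, ys


def train_bp(inputs, targets, hidden=4, lr=0.1, epochs=200, seed=0):
    """Train a one-hidden-layer network by error back propagation."""
    rng = random.Random(seed)
    n_in = len(inputs[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    for _ in range(epochs):
        for x, y in zip(inputs, targets):
            # Forward propagation of information.
            h = [math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
                 for j in range(hidden)]
            err = sum(w2[j] * h[j] for j in range(hidden)) + b2 - y
            # Back propagation of the output error to the hidden layer.
            for j in range(hidden):
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                for i in range(n_in):
                    w1[j][i] -= lr * grad_h * x[i]
                b1[j] -= lr * grad_h
            b2 -= lr * err
    return w1, b1, w2, b2


def predict_bp(model, x):
    """Forward pass of the trained network on one input window."""
    w1, b1, w2, b2 = model
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b) for row, b in zip(w1, b1)]
    return sum(w * hj for w, hj in zip(w2, h)) + b2


# Synthetic residuals stand in for (actual usage - ARIMA prediction).
residuals = [math.sin(0.5 * i) for i in range(40)]
scaled, offset, span = normalize(residuals)
xs, ys = make_windows(scaled, lag=3)
model = train_bp(xs, ys)
next_residual = predict_bp(model, scaled[-3:]) * span + offset  # undo normalization
print(next_residual)
```

The predicted residual would then be added to the ARIMA forecast to form the combined ARIMA-BP prediction.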

Figure 3. Prediction on CPU usage by ARIMA and ARIMA-BP.

Figure 4. Energy consumption of mobile devices and completion time of tasks under various quantities of computation tasks.

Table 1. RMSE of Prediction on CPU Usage.

Table 2. RMSE of ARIMA and ARIMA-BP on CPU Usage.