Energy-Efficient Online Resource Management and Allocation Optimization in Multi-User Multi-Task Mobile-Edge Computing Systems with Hybrid Energy Harvesting

Mobile edge computing (MEC) has evolved into a promising technology that can relieve the computing pressure on wireless devices (WDs) in the Internet of Things (IoT) by offloading computation tasks to the MEC server. Resource management and allocation are challenging because task arrivals, wireless channel states and energy consumption are unpredictable. To address this challenge, we propose an energy-efficient joint resource management and allocation (ECM-RMA) policy that reduces the time-averaged energy consumption in a multi-user multi-task MEC system with hybrid energy harvesting WDs. We first formulate the time-averaged energy consumption minimization problem subject to both the data queue stability constraint and the energy queue stability constraint. To solve this stochastic optimization problem, we decompose it into two deterministic sub-problems, which can be solved by convex optimization and linear programming techniques. Correspondingly, we propose the ECM-RMA algorithm, which does not require a priori knowledge of stochastic processes such as channel states, data arrivals and green energy harvesting. Most importantly, the proposed algorithm achieves an energy consumption-delay trade-off of $[O(1/V), O(V)]$, where the non-negative weight $V$ controls the energy consumption-delay performance. Finally, simulation results verify the correctness of the theoretical analysis and the effectiveness of the proposed algorithm.


Introduction
As a ubiquitous computing paradigm, the Internet of Things (IoT) has seen explosive growth in computationally intensive mobile applications such as autonomous driving, virtual reality, and interactive online games [1]. Because of size and production cost considerations, wireless devices (WDs) (e.g., sensors) in IoT systems generally carry capacity-constrained batteries and energy-saving, low-performance processors. Therefore, the development of computationally intensive applications is severely limited by resource-constrained devices [2,3]. Consequently, eliminating this bottleneck is a key issue in the research and development of modern IoT technology.

The main contributions of this paper are summarized as follows:

• We consider a multi-user multi-task MEC system with hybrid energy harvesting WDs to investigate the joint resource management and allocation problem. To solve the stochastic optimization problem, we decompose it into two deterministic sub-problems, namely a transmission optimization sub-problem and a battery management sub-problem. The transmission optimization sub-problem determines the transmission power and the transmission time of the data, while the battery management sub-problem determines the amount of wireless energy needed.

• Based on Lyapunov optimization theory and convex optimization theory, we propose the energy-efficient joint resource management and allocation algorithm (ECM-RMA) to minimize energy consumption under delay guarantees. Without prior knowledge of the stochastic processes, the algorithm allocates energy and offloads data at the beginning of each time slot according to the amount of harvested energy, the remaining available time, the amount of data to be processed and other factors.

• The trade-off between energy consumption and delay is $[O(1/V), O(V)]$. The non-negative weight $V$ balances energy consumption against the length of the task data queues.

• To the best of our knowledge, we are the first in the field of MEC to propose devices with a hybrid energy harvesting method that integrates green energy and wireless energy.

Related Works
Recently, the academic community has done a lot of research on resource management and allocation strategies in MEC systems. These policies mainly manage and distribute the resources of local execution, the communication process, and edge servers. The scenarios of resource management and allocation in MEC systems are divided into single-user MEC systems [14][15][16][17][18][19], multi-user MEC systems [20][21][22][23][24][25][26] and heterogeneous server MEC systems [27,28]. In [14], the offloading ratio, transmission power, and CPU clock frequency are jointly optimized to minimize the latency subject to an energy consumption constraint, or to minimize the energy consumption subject to a latency constraint. In [19], Mahmoodi jointly optimizes the computational latency and energy consumption, but only optimizes the size of the offloaded data, based on Markov decision process (MDP) theory. To minimize the total energy consumption, an optimization strategy with a simple threshold structure for the size of the offloaded data and the time allocation is proposed in [20]. In [21], instead of controlling the size of the offloaded data and the time allocation, Barbarossa minimizes the overall energy consumption by controlling the transmission power and the server CPU clock. However, the above works did not consider a realistic application scenario, namely resource management and allocation with multiple users and multiple tasks.
To prolong the battery life of wireless IoT devices and to reduce their manual maintenance costs, some recent works have begun to explore renewable energy supplies and wireless energy transfer in MEC systems. In [29,30], researchers incorporate renewable energy into MEC. In [29], Xu and Ren propose an effective resource management algorithm based on reinforcement learning. Compared with standard reinforcement learning algorithms such as Q-learning, the learning speed and run-time performance of the proposed algorithm are significantly improved. In [30], Mao et al. proposed a low-complexity online algorithm that jointly determines the offloading strategy, CPU cycle frequency and transmission power to achieve an asymptotically optimal execution cost. For wireless powered MEC, Wang proposes an optimized resource allocation strategy in [6] that minimizes the total energy consumption of the AP under each user's computational delay constraint. In addition, for a multi-user wireless powered MEC network, a technique based on the alternating direction method of multipliers (ADMM) was proposed in [2] that maximizes the weighted total computation rate by jointly optimizing the computation mode and the transmission time allocation. However, the strong randomness of renewable energy cannot guarantee the reliability of local execution or offloading. In addition, in an MEC system powered by wireless energy transfer, the energy transmitter itself incurs energy consumption. To the best of our knowledge, no existing work has proposed that the devices in an MEC system use a hybrid energy harvesting method for local computing and offloading.
In addition, the number of available sub-channels in the IoT network also has a large impact on task offloading. In [1], Mao et al. use a joint computation allocation and resource management algorithm to study the fundamental trade-off between energy efficiency and delay. To minimize the energy consumption of a device while meeting the delay constraint of the user, Hao et al. formulate a joint task offloading and caching problem in [31]. Guo proposes an energy-efficient dynamic offloading and resource scheduling (eDors) policy to shorten the task completion time and reduce energy consumption [25]. However, the above works did not consider the number of available sub-channels in an MEC system.

The remainder of this paper is organized as follows: Section 3 presents the system model and formulates the problem of minimizing the average total energy consumption subject to latency. Section 4 proposes an efficient online resource management and allocation algorithm to solve the formulated problem and gives the performance analysis of the algorithm. Section 5 provides simulation results to verify the correctness of the theoretical analysis and the effectiveness of the proposed algorithm. Finally, Section 6 concludes this paper. The key mathematical notations used in this paper are listed in Table 1, and the abbreviations used in Table 1 are defined in Table 2.
Table 1. Key mathematical notations.

| Notation | Definition |
|---|---|
| $\mathcal{N}$ | The set of tasks |
| $\mathcal{T}$ | The set of time slots |
| $p_m(t)$ | The uplink transmission power between WD $m$ and the AP at time slot $t$ |
| $h_{m,ul}(t)$ | The uplink channel gain between WD $m$ and the AP at time slot $t$ |
| $w_0$ | The white noise power level of the uplink channel |
| $B_{m,ul}$ | The uplink channel bandwidth that the AP allocates to WD $m$ at time slot $t$ |
| $\nu_m(t)$ | The uplink transmission rate between WD $m$ and the AP |
| $\tau_m^n(t)$ | The duration of task $n$ partially offloaded from WD $m$ to the AP within time slot $t$ |
| $\tau_m(t)$ | The time allocation schedule of all tasks in WD $m$ during slot $t$ |
| $I_m^n(t)$ | The amount of computation task data arrived at user $m$ at the end of time slot $t$ |
| $f_m^n(t)$ | The CPU-cycle frequency assigned to task $n$ on WD $m$ at time slot $t$ |
| $L_m(t)$ | The CPU cycles required to compute one bit of data in WD $m$ at time slot $t$ |
| $D_{m,l}^n(t)$ | The locally computed data amount of task $n$ of WD $m$ at time slot $t$ |
| $e_{m,l}^n(t)$ | The energy consumption of local computation for task $n$ of WD $m$ during time slot $t$ |
| $D_{m,c}^n(t)$ | The offloaded data amount of task $n$ of WD $m$ at time slot $t$ |
| $e_{m,c}^n(t)$ | The energy consumption of WD $m$ offloading the data of task $n$ at time slot $t$ |
| $D_m^n(t)$ | The total computed data of task $n$ of WD $m$ at time slot $t$ |
| $Q_m^n(t)$ | The backlog of the data queue for task $n$ of WD $m$ at time slot $t$ |
| $e_{m,h}(t)$ | The energy obtained from the environment by WD $m$ at time slot $t$ |
| $e_{m,w}(t)$ | The energy obtained from the AP by WD $m$ at time slot $t$ |
| $e_{m,hw}(t)$ | The total harvested energy of WD $m$ at time slot $t$ |
| $\gamma(t)$ | The energy supply from the environment at time slot $t$ |
| $e_{m,total}^n(t)$ | The total energy consumed by WD $m$ for task $n$ within time slot $t$ |
| $E_m(t)$ | The dynamic change of the WD's energy queue |
| $M$ | The number of WDs |
| $N$ | The number of tasks in a WD |
| $T$ | The slot length |
| $S$ | The number of slots |
| $K$ | The number of available sub-channels |
| $p_{m,max}$ | The maximum uplink transmission power |
| $p_{m,min}$ | The minimum uplink transmission power |
| $I_m^{n,max}$ | The maximum arrival data |
| $\gamma_{max}$ | The maximum green energy supply |

Table 2. Definition of abbreviations used in Table 1.

| Abbreviation | Definition |
|---|---|
| WD | Wireless device |
| WDs | Wireless devices |
| AP | Access point |

System Model
As shown in Figure 1, we consider a multi-user multi-task MEC system with hybrid energy harvesting, which consists of an AP (integrated with an RF energy transmitter, an MEC server and a communication circuit) and $M$ WDs (each WD contains an energy harvesting circuit, a rechargeable battery, a computing unit and a communication circuit). We denote the set of WDs by $\mathcal{M} = \{1, 2, ..., M\}$ and use $\mathcal{N} = \{1, 2, ..., N\}$ to denote the set of computation tasks of WD $m$. Each WD has a completion deadline, and $J_m$ denotes the completion deadline of WD $m$. Moreover, the number of available sub-channels for the WDs is denoted by $K$. We divide the total time into slots of length $T$, indexed by $t \in \mathcal{T} = \{0, 1, ...\}$. Within each time slot, some computationally intensive tasks can be offloaded to the AP through uplink wireless links and computed at the MEC server, while others are computed locally.
We assume that the AP knows the channel state information (CSI), the task offloading information, and the remaining battery power of the WDs, which can be obtained through feedback [20]. Using this information, at the beginning of slot $t$, the AP determines the data transmission power, the transmission time and the energy that each device needs to harvest. The AP then sends these decisions to the WDs in the MEC system. In particular, we only need to focus on a single time slot.
Figure 2 shows that the WDs in a hybrid energy harvesting MEC system are powered by green energy from the environment and by wireless energy transferred from the AP, while the AP is powered by green energy and grid energy. In detail, when a device cannot get enough energy from the environment to fill its rechargeable battery, the AP transmits wireless energy to fill the device battery. For the AP, green energy is used as the primary source of energy, while stable grid energy serves as a backup energy supply [2]. The harvested energy of a WD is then used by the computing unit and the communication circuit to compute tasks locally and to offload tasks, respectively.

Communication Model
We first introduce the communication model between the WDs and the AP in the MEC system. All wireless channels follow small-scale Rayleigh fading. Therefore, the wireless channel states are independent and remain unchanged during each time slot. During time slot $t$, the uplink transmission rate between WD $m$ and the AP is quantified as

$$\nu_m(t) = B_{m,ul} \log_2\left(1 + \frac{p_m(t) h_{m,ul}(t)}{w_0}\right), \tag{1}$$

where $B_{m,ul}$ is the uplink channel bandwidth allocated to WD $m$ by the AP, and $p_m(t)$ is the transmission power between WD $m$ and the AP at time slot $t$. Moreover, $p_m(t)$ is bounded by the minimum value $p_{m,min}$ and the maximum value $p_{m,max}$, i.e.,

$$p_{m,min} \le p_m(t) \le p_{m,max}. \tag{2}$$

$h_{m,ul}(t)$ is the uplink channel gain between WD $m$ and the AP. Specifically, the channel gain is determined by small-scale Rayleigh fading and the distance-dependent path loss. In addition, $w_0$ is the white noise power level. A WD can occupy only one available sub-channel at a time. Let $\tau_m^n(t)$ denote the duration of task $n$ partially offloaded from WD $m$ to the AP within time slot $t$, i.e.,

$$0 \le \tau_m^n(t) \le T. \tag{3}$$

Moreover, we define $\tau_m(t) = \{\tau_m^1(t), \tau_m^2(t), ..., \tau_m^N(t)\}$ as the time allocation schedule of all tasks in WD $m$ during slot $t$. The total uplink transmission time of WD $m$ must not exceed the slot length:

$$\sum_{n \in \mathcal{N}} \tau_m^n(t) \le T, \quad \forall m \in \mathcal{M}. \tag{4}$$

In addition, the total uplink transmission time of all tasks on all devices must not exceed the sum of the time lengths of all available sub-channels:

$$\sum_{m \in \mathcal{M}} \sum_{n \in \mathcal{N}} \tau_m^n(t) \le KT, \tag{5}$$

where $K$ denotes the number of available sub-channels in the MEC system. Each WD can access only one available sub-channel in a time slot because the WDs in an IoT network are narrow-band and simple.
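The communication model can be made concrete with a short sketch. It assumes the standard Shannon form $B \log_2(1 + p h / w_0)$ for the uplink rate and treats the schedule as a nested list `tau[m][n]`; the function names and argument names are illustrative and do not come from the paper.

```python
import math


def uplink_rate(bandwidth_hz, tx_power_w, channel_gain, noise_power_w):
    """Uplink rate in bits/s, assuming the Shannon form B * log2(1 + p*h/w0)."""
    if noise_power_w <= 0:
        raise ValueError("noise power must be positive")
    snr = tx_power_w * channel_gain / noise_power_w
    return bandwidth_hz * math.log2(1.0 + snr)


def schedule_is_feasible(tau, slot_len_s, num_subchannels):
    """Check the time constraints for tau[m][n], the offloading time of task n on WD m."""
    # Each duration is non-negative and each WD transmits for at most one slot length.
    per_wd_ok = all(
        all(x >= 0.0 for x in tasks) and sum(tasks) <= slot_len_s for tasks in tau
    )
    # All WDs together cannot use more than K sub-channels for one slot length.
    total_ok = sum(sum(tasks) for tasks in tau) <= num_subchannels * slot_len_s
    return per_wd_ok and total_ok
```

For example, with an SNR of 3 the rate is $2B$ bits/s, and a schedule whose total airtime is 1.7 slot lengths fits on two sub-channels but not on one.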
In this paper, we do not consider the delay, energy consumption and packet loss of downlink transmission. This is because the size of the returned result is normally much smaller than that of the corresponding task input. In addition, the downlink transmission rate between the WDs and the AP is higher than the corresponding uplink transmission rate [6,31].

Task Computation Model
In this section, we introduce the task computation model. In practice, many computation tasks consist of multiple small procedures/components. Thus, it is necessary to perform partial computation offloading of a task. Specifically, a computation task can be divided into multiple parts, where some parts are executed at the WD and the others are offloaded to the AP for execution. Let $I_m^n(t)$ denote the data amount of computation task $n$ that has arrived at WD $m$ by the end of time slot $t$; the WD can continuously receive data within a time slot. The data arrives at rate $C_m^n(t)$, which is independent and identically distributed (i.i.d.) over time slots and independent across tasks, with maximum data arrival rate $C_m^{n,max}$ and average data arrival rate $\lambda_m^n$. Therefore, $I_m^n(t)$ has a maximum value $I_m^{n,max}$, i.e.,

$$0 \le I_m^n(t) \le I_m^{n,max}. \tag{6}$$

Moreover, each WD allocates the same amount of data storage space $R_D$ to each task, and the task data is stored in its own data storage space before it is processed.

Local Computing
For local computing, let $D_{m,l}^n(t)$ denote the data amount of task $n$ computed locally by WD $m$ within time slot $t$. The CPU-cycle frequency assigned to task $n$ in WD $m$ is denoted by $f_m^n$, and $L_m$ is the number of CPU cycles required to compute one bit of data. Therefore, the locally computed data amount of task $n$ in WD $m$ during time slot $t$ can be expressed as

$$D_{m,l}^n(t) = \frac{f_m^n(t)\, T}{L_m(t)}. \tag{7}$$

The corresponding energy consumption of local computation for task $n$ in WD $m$ during time slot $t$ is

$$e_{m,l}^n(t) = k \left(f_m^n(t)\right)^3 T, \tag{8}$$

where $k$ is the effective switched capacitance related to the chip architecture [1].
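As a small illustration of the local computing model, the sketch below assumes the common dynamic-voltage-scaling form in which a CPU running at frequency $f$ for a slot of length $T$ processes $fT/L$ bits and consumes $k f^3 T$ joules. The function names and the numeric values in the usage note are illustrative only.

```python
def local_bits(freq_hz, slot_len_s, cycles_per_bit):
    """Bits processed locally in one slot: (cycles available) / (cycles per bit)."""
    if cycles_per_bit <= 0:
        raise ValueError("cycles_per_bit must be positive")
    return freq_hz * slot_len_s / cycles_per_bit


def local_energy(switched_capacitance, freq_hz, slot_len_s):
    """Energy in joules for running at freq_hz during one slot, assuming power k * f^3."""
    return switched_capacitance * freq_hz ** 3 * slot_len_s
```

With a 1 GHz CPU, a 1 s slot and 1000 cycles/bit, `local_bits` gives $10^6$ bits; with an assumed capacitance of $10^{-27}$, `local_energy` gives about 1 J. The cubic dependence on $f$ is what makes offloading attractive when the channel is good.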

Mobile Edge Cloud Computing
On the other hand, because the WDs do not have enough computing power, some data needs to be offloaded to the AP for computation. This paper focuses on the energy consumption of the WDs in an MEC system, so we do not consider the energy consumption of the MEC server.
The offloaded data amount of task $n$ in WD $m$ at time slot $t$ is

$$D_{m,c}^n(t) = \nu_m(t)\, \tau_m^n(t), \tag{9}$$

and the corresponding energy consumption of WD $m$ offloading the data of task $n$ at time slot $t$ can be written as

$$e_{m,c}^n(t) = p_m(t)\, \tau_m^n(t). \tag{10}$$

After a task is offloaded to the MEC server, the MEC server processes the task. We assume that the computation capability of the MEC server is $f_{ap}$ and that the computing resource required for task $n$ of WD $m$ is $g_m^n$. Hence, the computation delay generated by processing task $n$ of WD $m$ at the MEC server can be expressed as

$$\frac{g_m^n}{f_{ap}}. \tag{11}$$

As mentioned before, each WD has a completion deadline $J_m$. Thus, the offloading time and the server computation delay of WD $m$ must together respect this deadline, which gives constraint (12). The total computed data of task $n$ in WD $m$ at time slot $t$, $D_m^n(t)$, is given by

$$D_m^n(t) = D_{m,l}^n(t) + D_{m,c}^n(t). \tag{13}$$

In this paper, each task has a data backlog queue. We use $Q_m^n(t)$ to denote the backlog of the data queue of task $n$ of WD $m$, which is updated over time. Based on the above analysis, the dynamics of the data backlog queue of task $n$ of WD $m$ can be expressed as

$$Q_m^n(t+1) = \max\{Q_m^n(t) - D_m^n(t),\, 0\} + I_m^n(t). \tag{14}$$
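The queue bookkeeping described above can be sketched as follows. The update assumes the usual form in which data served in a slot is removed first and arrivals at the end of the slot are then appended; the server delay assumes the ratio of required computing resource to server capability. Names are illustrative.

```python
def edge_delay(required_cycles, server_cycles_per_s):
    """Computation delay at the MEC server: required resource / server capability."""
    if server_cycles_per_s <= 0:
        raise ValueError("server capability must be positive")
    return required_cycles / server_cycles_per_s


def update_data_queue(backlog_bits, served_bits, arrived_bits):
    """One-slot update Q(t+1) = max(Q(t) - D(t), 0) + I(t)."""
    if min(backlog_bits, served_bits, arrived_bits) < 0:
        raise ValueError("queue quantities must be non-negative")
    # Served data (local plus offloaded) cannot drive the backlog below zero.
    return max(backlog_bits - served_bits, 0.0) + arrived_bits
```

For instance, a backlog of 100 bits with 30 bits served and 20 bits arriving becomes 90 bits, while a backlog of 10 bits with the same service is emptied and only the 20 new bits remain.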

Energy Consumption Model
In this section, we introduce the energy consumption model. Green energy is unpredictable and unstable. To be environmentally friendly and to improve the reliability of energy harvesting as much as possible, we use two energy sources to support the WDs in an MEC system. The first energy source is the green energy that the WDs harvest from the environment, and the second is the wireless energy transferred from the AP. In this paper, the AP provides wireless energy to a WD when green energy cannot supply the power that the WD requires.
In a given time slot, the energy that WD $m$ harvests from the environment is denoted by $e_{m,h}(t)$, a stochastic value determined by the state of the environment, and $e_{m,w}(t)$ denotes the energy that WD $m$ receives from the AP. Moreover, the energy supply from the environment is denoted by $\gamma_m(t)$, i.e., the green energy available to WD $m$ at time slot $t$. $\gamma_m(t)$ is i.i.d. across time slots, and its upper limit is $\gamma_{max}$. Since the energy harvested from the environment by any WD cannot exceed the energy supply in a time slot, the following inequality holds:

$$0 \le e_{m,h}(t) \le \gamma_m(t), \quad \forall m \in \mathcal{M}. \tag{15}$$

In addition, $e_{m,w}(t)$ is defined as

$$e_{m,w}(t) = \mu\, p_{ap}(t)\, h_{m,dl}(t)\, \ell_m(t), \tag{16}$$

where $\mu \in (0, 1)$ is the efficiency of harvesting energy from the AP, and $p_{ap}(t)$ and $h_{m,dl}(t)$ denote the energy transmission power of the AP and the downlink channel gain, respectively. $\ell_m(t)$ is the transmission time of wireless energy, and it satisfies $\ell_m(t) \le T$. To avoid mutual interference caused by sharing channels, wireless energy transfer and task offloading cannot be performed simultaneously. Thus, we impose

$$\ell_m(t) + \sum_{n \in \mathcal{N}} \tau_m^n(t) \le T, \quad \forall m \in \mathcal{M}. \tag{17}$$

Similarly, Equation (5) is converted to

$$\sum_{m \in \mathcal{M}} \Big[\ell_m(t) + \sum_{n \in \mathcal{N}} \tau_m^n(t)\Big] \le KT. \tag{18}$$

Therefore, the total energy harvested by WD $m$ at time slot $t$ is $e_{m,hw}(t)$, i.e.,

$$e_{m,hw}(t) = e_{m,h}(t) + e_{m,w}(t), \quad \forall m \in \mathcal{M}. \tag{19}$$
As mentioned before, the energy harvested by each WD is stored in a rechargeable battery. The energy queue of WD $m$ is denoted by $E_m(t)$, which is the energy available in the battery of WD $m$. Within time slot $t$, the total energy consumed by WD $m$ for task $n$ is $e_{m,total}^n(t)$, which consists of two parts, the local CPU energy consumption and the data offloading energy consumption, i.e.,

$$e_{m,total}^n(t) = e_{m,l}^n(t) + e_{m,c}^n(t), \quad \forall m \in \mathcal{M}, \forall n \in \mathcal{N}. \tag{20}$$

Based on the above analysis, the dynamics of the energy queue of WD $m$ can be expressed as

$$E_m(t+1) = E_m(t) - \sum_{n \in \mathcal{N}} e_{m,total}^n(t) + e_{m,hw}(t), \tag{21}$$

where $\sum_{n \in \mathcal{N}} e_{m,total}^n(t)$ satisfies

$$\sum_{n \in \mathcal{N}} e_{m,total}^n(t) \le E_m(t), \tag{22}$$

since the energy consumption of WD $m$ cannot exceed the length of its energy queue. Moreover, because the battery capacity $\Omega$ of the WD is limited, the sum of the available energy and the harvested energy of the WD cannot exceed the capacity of the battery:

$$E_m(t) + e_{m,hw}(t) \le \Omega. \tag{23}$$

The total energy consumption of the WDs in the MEC system at time slot $t$ is defined as

$$e(t) = \sum_{m \in \mathcal{M}} \sum_{n \in \mathcal{N}} e_{m,total}^n(t). \tag{24}$$
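The battery bookkeeping can be sketched in a few lines. The update below assumes the form "stored energy minus consumption plus harvest" and enforces the two constraints stated in the text: consumption cannot exceed the stored energy, and stored plus harvested energy cannot exceed the battery capacity. The function name is illustrative.

```python
def update_energy_queue(stored_j, consumed_j, harvested_j, capacity_j):
    """One-slot battery update E(t+1) = E(t) - consumed + harvested, with feasibility checks."""
    if consumed_j < 0 or harvested_j < 0:
        raise ValueError("energy amounts must be non-negative")
    if consumed_j > stored_j:
        # A WD cannot spend more than its battery currently holds.
        raise ValueError("consumption exceeds stored energy")
    if stored_j + harvested_j > capacity_j:
        # Stored plus harvested energy must fit in the battery.
        raise ValueError("harvest would overflow the battery")
    return stored_j - consumed_j + harvested_j
```

For example, a 10 J battery holding 6 J that spends 2 J and harvests 3 J ends the slot with 7 J; trying to spend 2 J from a battery holding 1 J is rejected.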

Optimization Problem Formulation
Based on the above model, we formulate the energy minimization problem as a stochastic optimization problem P1. With the help of Lyapunov optimization theory, we then turn P1 into a series of deterministic optimization problems, one per time slot, each of which can be solved by standard convex optimization techniques. In this way, the data transmission power, the transmission time and the amount of energy to harvest are jointly optimized to minimize the total energy consumption of the WDs in the MEC system.
First, recall from the previous section that the total energy consumption of the WDs in an MEC system consists of the local CPU energy consumption and the data offloading energy consumption. We denote the time-averaged energy consumption of the WDs in an MEC system as

$$\bar{e} = \lim_{S \to \infty} \frac{1}{S} \sum_{t=0}^{S-1} \mathbb{E}\{e(t)\}, \tag{25}$$

where $S$ is the total running time of the MEC system. Second, we model the problem of minimizing energy consumption. To simplify the description, we use $p(t)$, $\tau(t)$ and $e(t)$ to represent the sets of transmission powers $p_m(t)$, transmission times $\tau_m(t)$ and amounts of wireless harvested energy $e_{m,w}(t)$ at time slot $t$. Moreover, we use $\xi(t) = (p(t), \tau(t), e(t))$ to denote the set of all variables to be optimized at time slot $t$. Therefore, the problem of minimizing energy consumption can be modeled as problem P1 (26): minimize $\bar{e}$ over $\xi(t)$, subject to the stability of the data queues and energy queues and to constraints (12), (15), (17), (18), (22) and (23). In the next section, we give a detailed method to solve problem P1.

Online Resource Management and Allocation Optimization in MEC
In this section, to solve problem P1 with its queue stability constraints, we use Lyapunov theory to decompose the problem into two sub-problems within a single time slot and give the energy minimization framework. Based on this framework, we obtain an algorithm that guarantees the stability of the data queues and energy queues and achieves a near-optimal solution.

Lyapunov-Based Problem Decomposition
Within time slot $t$, the state of the MEC system consists of the data queues $Q(t)$ and energy queues $E(t)$ of the WDs, expressed as $U(t) = (Q(t), E(t))$. We define the Lyapunov function $L(t)$, which consists of the sum of squares of the data queue lengths and of the remaining battery capacities:

$$L(t) = \frac{1}{2}\Big[\sum_{m \in \mathcal{M}} \sum_{n \in \mathcal{N}} \big(Q_m^n(t)\big)^2 + \sum_{m \in \mathcal{M}} \big(\hat{E}_m(t)\big)^2\Big], \tag{27}$$

where $\hat{E}_m(t) = \Omega - E_m(t)$ represents the remaining battery capacity of WD $m$. $L(t)$ is a scalar measure of the length of the data queues and the size of the remaining battery capacities. Equation (27) shows that a smaller $L(t)$ corresponds to shorter data queues and a smaller remaining battery capacity (i.e., fuller batteries), and vice versa. In addition, we define the Lyapunov drift $\Delta L(t)$, which is the expected change of the Lyapunov function between time slot $t$ and time slot $t + 1$ given the network state $U(t)$, i.e.,

$$\Delta L(t) = \mathbb{E}\{L(t+1) - L(t) \mid U(t)\}. \tag{28}$$

To minimize the energy consumption of the WDs, we add the energy consumption to the Lyapunov drift $\Delta L(t)$ and obtain the drift-plus-energy-consumption function $\Delta_V L(t)$:

$$\Delta_V L(t) = \Delta L(t) + V\, \mathbb{E}\{e(t) \mid U(t)\}, \tag{29}$$

where $V$ is a non-negative weight that sets the importance of the energy consumption $e(t)$ in $\Delta_V L(t)$. The higher the value of $V$, the larger the share of $e(t)$ in $\Delta_V L(t)$, and vice versa. By minimizing $\Delta_V L(t)$, queue lengths are stabilized and energy consumption is minimized jointly. However, as the value of $V$ increases, the lengths of the data queues and energy queues also increase. In other words, the WDs require larger data storage and batteries to keep the MEC system running. By adjusting the value of $V$, we can achieve a trade-off between queue length and energy consumption in the MEC system. Lemma 1 gives an upper bound on $\Delta_V L(t)$ for any feasible decision $\xi(t)$.

Proof of Lemma 1. The proof is provided in Appendix A.

To minimize the total energy consumption of the WDs in the MEC system while keeping the queues stable, we minimize this upper bound of $\Delta_V L(t)$ rather than $\Delta_V L(t)$ itself. In addition, since we consider the queue state $U(t)$ within a single time slot $t$, the expectations can be removed. In other words, based on Lemma 1, we convert P1 into the per-slot problem P2. The objective of P2 is a sum of separable terms. Therefore, we decompose P2 into two sub-problems: transmission optimization and battery management. In the transmission optimization sub-problem, we optimize the transmission power $p(t)$ and the transmission time $\tau(t)$ used for task offloading, while in the battery management sub-problem we optimize the amount of harvested energy $e(t)$. After solving these two sub-problems, the WDs update their data queues and energy queues to prepare for the optimization of the next time slot.
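For readers who want to see where such a bound comes from, the following is a sketch of the standard drift-plus-penalty argument. It assumes the queue dynamics $Q_m^n(t+1) = \max\{Q_m^n(t) - D_m^n(t), 0\} + I_m^n(t)$ and $\hat{E}_m(t+1) = \hat{E}_m(t) + \sum_n e_{m,total}^n(t) - e_{m,hw}(t)$, and a Lyapunov function with a factor $1/2$. The constant $C$ is a placeholder for any finite bound on the second-order terms; the exact statement and constant are those of Lemma 1 and Appendix A, which may differ in form.

```latex
% Step 1: per-queue inequalities (D, I >= 0; the battery relation is an identity).
\begin{align*}
Q_m^n(t+1)^2 &\le Q_m^n(t)^2 + D_m^n(t)^2 + I_m^n(t)^2
               + 2\,Q_m^n(t)\,\bigl[I_m^n(t) - D_m^n(t)\bigr], \\
\hat{E}_m(t+1)^2 &= \hat{E}_m(t)^2
               + \Bigl[\sum_{n} e_{m,total}^n(t) - e_{m,hw}(t)\Bigr]^2
               + 2\,\hat{E}_m(t)\Bigl[\sum_{n} e_{m,total}^n(t) - e_{m,hw}(t)\Bigr].
\end{align*}
% Step 2: sum over m and n, halve, take E{ . | U(t)}, bound the squared terms by C,
% and add the penalty V E{e(t) | U(t)}.
\begin{align*}
\Delta_V L(t) \le C &+ V\,\mathbb{E}\{e(t) \mid U(t)\}
   + \sum_{m}\sum_{n} Q_m^n(t)\,\mathbb{E}\{I_m^n(t) - D_m^n(t) \mid U(t)\} \\
  &+ \sum_{m} \hat{E}_m(t)\,\mathbb{E}\Bigl\{\sum_{n} e_{m,total}^n(t) - e_{m,hw}(t)
     \,\Big|\, U(t)\Bigr\}.
\end{align*}
```

Under these assumptions, the right-hand side separates into terms that depend only on the transmission decisions (through $D_m^n(t)$ and $e_{m,total}^n(t)$) and a term that depends only on the harvested energy $e_{m,hw}(t)$, which is consistent with the two-way decomposition described in the text.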
We will study the solution of these two sub-problems in the following sections.

Transmission Optimization Problem
Considering the first and second terms in problem P2, the transmission optimization problem (34) can be designed to solve for the transmission power $p(t)$ and the transmission time $\tau(t)$. However, the maximization problem (34) is not a convex optimization problem because of constraint (20), in which the offloading energy contains the product $p_m(t)\tau_m^n(t)$. To make problem (34) easier to handle, we introduce a set of auxiliary variables $e^{var}(t)$, i.e., $e_m^{n,var}(t) = p_m(t)\tau_m^n(t)$. With this substitution, we convert problem (34) into problem (35).

Lemma 2. Problem (35) is a convex optimization problem.
Proof of Lemma 2. The proof is provided in Appendix B.
Since problem (35) is a convex optimization problem, it can be solved by a standard convex optimization technique, such as the interior point method. In this paper, we use the interior point method to optimize the objective function in problem (35), and the computational complexity is roughly $O(\max((MN + M + 1), F))$, where $(MN + M + 1)$ is the total size of the sets $e^{var}(t)$ and $\tau(t)$, while $F$ denotes the cost of evaluating the objective function and its first and second derivatives.
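The effect of the substitution $e^{var} = p\tau$ can be checked numerically. With it, the bits sent in time $\tau$ become $\tau \log_2(1 + c\, e^{var}/\tau)$ for a constant $c = h/w_0$ (bandwidth normalized to 1), which is the perspective of a concave function and therefore jointly concave in $(e^{var}, \tau)$. The sketch below is only a midpoint-concavity sanity check under that assumed rate form, not the proof in Appendix B; all names and sampling ranges are illustrative.

```python
import math
import random


def offloaded_bits(e_var, tau, gain_over_noise):
    """Bits sent in time tau using energy e_var = p * tau (zero if no time is allocated)."""
    if tau <= 0.0:
        return 0.0
    return tau * math.log2(1.0 + gain_over_noise * e_var / tau)


def midpoint_concave(trials=1000, gain_over_noise=10.0, seed=0, tol=1e-9):
    """Return True if f(midpoint) >= average of f at the endpoints for random pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        e1, e2 = rng.uniform(0.0, 0.2), rng.uniform(0.0, 0.2)
        t1, t2 = rng.uniform(0.01, 1.0), rng.uniform(0.01, 1.0)
        mid = offloaded_bits((e1 + e2) / 2, (t1 + t2) / 2, gain_over_noise)
        avg = 0.5 * (
            offloaded_bits(e1, t1, gain_over_noise)
            + offloaded_bits(e2, t2, gain_over_noise)
        )
        if mid < avg - tol:
            return False
    return True
```

Because the offloading energy is simply $e^{var}$ after the substitution, the energy term becomes linear while the served-data term stays concave, which is the structure an interior point solver can exploit.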

Battery Management Problem
Considering the third term in problem P2, a battery management problem can be constructed to solve for $e_{m,w}(t)$, the amount of wireless energy harvested by WD $m$ at time slot $t$. The battery management problem is a linear programming problem. To bring $e_{m,hw}(t)$ as close to $\Omega - E_m(t)$ as possible, let $e_{m,w}^*(t)$ be the optimal solution to the battery management problem and let $\ell_m^*(t)$ be the corresponding energy transfer time. If the battery of WD $m$ can accommodate more energy at the beginning of time slot $t$, i.e., $E_m(t) < \Omega$ or equivalently $\hat{E}_m(t) > 0$, then the WD needs the wireless energy $e_{m,w}(t) = \max\{\Omega - E_m(t) - \gamma_m(t), 0\}$ from the AP; otherwise, the AP does not transfer energy to WD $m$, meaning $e_{m,hw}(t) = 0$. When $E_m(t) < \Omega$, Equation (17) gives $\ell_m(t) \le T - \sum_{n \in \mathcal{N}} \tau_m^n(t)$, so we set

$$\ell_m^*(t) = \min\Big\{T - \sum_{n \in \mathcal{N}} \tau_m^n(t),\ \frac{\max\{\Omega - E_m(t) - \gamma_m(t), 0\}}{\mu\, p_{ap}(t)\, h_{m,dl}(t)}\Big\}$$

and $e_{m,w}^*(t) = \mu\, p_{ap}(t)\, h_{m,dl}(t)\, \ell_m^*(t)$. In short, by solving the battery management problem, the WD obtains as much energy as possible to fill its battery.
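The closed-form battery rule can be sketched as below. It assumes that the wireless energy received in time $\ell$ is $\mu\, p_{ap}\, h_{dl}\, \ell$, that green energy is used first, and that the energy transfer time is capped by the part of the slot not used for offloading. The function and parameter names are illustrative.

```python
def battery_management(stored_j, green_j, capacity_j, efficiency,
                       ap_power_w, downlink_gain, slot_len_s, offload_time_s):
    """Return (transfer_time_s, wireless_energy_j) that tops the battery up as far as allowed."""
    # Energy still missing after the green supply has been counted.
    deficit = max(capacity_j - stored_j - green_j, 0.0)
    # Power actually received by the WD while the AP is transmitting energy.
    received_power = efficiency * ap_power_w * downlink_gain
    # Energy transfer and offloading cannot overlap within a slot.
    time_left = max(slot_len_s - offload_time_s, 0.0)
    if deficit == 0.0 or received_power <= 0.0:
        return 0.0, 0.0
    transfer_time = min(deficit / received_power, time_left)
    return transfer_time, received_power * transfer_time
```

For example, a 10 J battery holding 9 J with 0.5 J of green supply is short by 0.5 J; at 1 W of received power this takes 0.5 s if that much time is free, and is cut to 0.25 s (0.25 J) if offloading already occupies 0.75 s of a 1 s slot.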

Energy Consumption Minimization Resource Management and Allocation Algorithm (ECM-RMA)
Based on the analysis of the above two sub-problems, we propose the ECM-RMA algorithm, which is described in Algorithm 1. The MEC system executes the ECM-RMA algorithm to obtain the optimal wireless transmission power $p^*(t)$, transmission time $\tau^*(t)$, and amount of wireless harvested energy $e^*(t)$ at each time slot. Then, the data queues $Q(t)$ and energy queues $E(t)$ are updated according to these values.
Because the battery management problem has a closed-form solution, its complexity is negligible. The transmission optimization problem therefore determines the complexity of the ECM-RMA algorithm. Hence, the time complexity of the ECM-RMA algorithm is $O(\max((MN + M + 1), F))$.
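The overall control flow of such an online algorithm (observe the queues, decide, update) can be sketched for a single WD with a single task. This is a generic skeleton under assumed queue dynamics, not a reproduction of Algorithm 1: `solve_slot` is a hypothetical placeholder for the two sub-problem solvers and must return the served bits, the energy spent and the wireless energy requested for the slot.

```python
def run_online(num_slots, arrivals, green, solve_slot, capacity_j,
               backlog0=0.0, stored0=None):
    """Run a per-slot decision loop and return (final backlog, final battery, mean energy)."""
    backlog = backlog0
    stored = capacity_j if stored0 is None else stored0
    total_energy = 0.0
    for t in range(num_slots):
        # Decide using only the current state and the current green supply.
        served_bits, used_j, wireless_j = solve_slot(backlog, stored, green[t])
        # A WD cannot spend more than its battery holds.
        used_j = min(used_j, stored)
        # Stored plus harvested energy must fit in the battery.
        harvested_j = min(green[t] + wireless_j, capacity_j - stored)
        # Update the data queue, then the energy queue.
        backlog = max(backlog - served_bits, 0.0) + arrivals[t]
        stored = stored - used_j + harvested_j
        total_energy += used_j
    return backlog, stored, total_energy / num_slots
```

A trivial policy such as `lambda q, e, g: (min(q, 100.0), 1.0, 0.0)` is enough to exercise the loop; no knowledge of future arrivals or future green energy is needed, which is the sense in which the scheme is online.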

Performance Analysis
We analyze the performance of the ECM-RMA algorithm in this section. As mentioned in the previous parts, minimizing the upper bound of $\Delta_V L(t)$ only indirectly minimizes $\Delta_V L(t)$ itself. Moreover, the proposed ECM-RMA algorithm solves problem P2 in (31), which is not identical to the original problem P1. This raises the question of how well the proposed ECM-RMA algorithm performs. We first characterize the performance of the ECM-RMA algorithm in Theorem 1. Then, we reveal the energy consumption-delay trade-off achieved by the ECM-RMA algorithm in Theorem 2.

The Performance of the Proposed ECM-RMA Algorithm
Lemma 3. Assume that problem P1 is feasible, i.e., for the given average data arrival rate $\lambda_m^n$, there is at least one transmission power and time allocation scheme that satisfies all the constraints in P1, and that P1 remains feasible for the average data arrival rate $\lambda_m^n + \varepsilon$, where $\varepsilon$ is a positive number. Then, for any $\varsigma > 0$, there is at least one power and time allocation scheme $\Theta^{alg.1} = (p_m(t), \tau_m^n(t))$ that satisfies restrictions on its energy consumption $e^{alg.1}(t)$ and on its processed data amount $D^{alg.1}(t)$, where $\bar{e}^{alg.1}$ and $\bar{e}^{opt}$ represent the time-averaged energy consumption under the ECM-RMA algorithm (Algorithm 1) and the optimal value of the original problem P1, respectively, and $e^{alg.1}(t)$ and $D^{alg.1}(t)$ represent the energy consumption and the amount of data processed in time slot $t$ under the scheme $\Theta^{alg.1}$, respectively.
Assumption 1. We assume that the time-averaged energy consumption under the ECM-RMA algorithm has an upper and a lower bound, i.e., $\bar{e}^{min} \le \bar{e}^{alg.1} \le \bar{e}^{max}$. Similar assumptions can be found in [32][33][34].
Using Lemma 3, we can obtain the gap between $\bar{e}^{alg.1}$ and $\bar{e}^{opt}$, as stated in Theorem 1.
Proof of Theorem 1. The proof is provided in Appendix C.

The Trade-Off between Energy Consumption and Delay
In order to reveal the energy consumption and delay trade-off achieved by the proposed ECM-RMA algorithm, we will describe the performance based on the time average data queue length, which is given in the following Theorem 2.

Theorem 2.
Assume that problem P1 is feasible. Then, in the considered MEC system, the ECM-RMA algorithm ensures that all data queues are stable, and the time-averaged data queue length is upper bounded, where $\bar{e}^{min}$ is the lower bound of $\bar{e}^{opt}$ from the assumption made in Section 4.3.1.
Proof of Theorem 2. The proof is provided in Appendix D.
Theorem 1 shows that the time-averaged energy consumption $\bar{e}^{alg.1}$ decreases at a rate of $O(1/V)$, and Theorem 2 shows that the time-averaged data queue length increases at a rate of $O(V)$. In other words, there is a trade-off between the time-averaged energy consumption and the time-averaged data queue length. In detail, a large value of $V$ yields a lower time-averaged energy consumption but a larger time-averaged data queue length. The trade-off can be described quantitatively as $[O(1/V), O(V)]$. In addition, let $\bar{T}$ denote the time-averaged queueing delay, which is proportional to the time-averaged data queue length at a given data arrival rate, i.e.,

$$\bar{T} = \lim_{S \to \infty} \frac{1}{S} \sum_{t=0}^{S-1} \frac{\sum_{m \in \mathcal{M}} \sum_{n \in \mathcal{N}} \mathbb{E}\{Q_m^n(t)\}}{\sum_{m \in \mathcal{M}} \sum_{n \in \mathcal{N}} \lambda_m^n}.$$

Hence, the proposed ECM-RMA algorithm also achieves an energy consumption-delay trade-off of $[O(1/V), O(V)]$. By controlling the system parameter $V$, the system resources can be used effectively while the service quality of the user is guaranteed.
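The link between queue length and delay is Little's law: the average delay equals the average backlog divided by the average arrival rate. A minimal sketch, assuming backlog samples taken once per slot and a known arrival rate (names are illustrative):

```python
def average_delay(backlog_samples_bits, arrival_rate_bits_per_slot):
    """Average queueing delay in slots via Little's law: mean backlog / arrival rate."""
    if not backlog_samples_bits:
        raise ValueError("need at least one backlog sample")
    if arrival_rate_bits_per_slot <= 0:
        raise ValueError("arrival rate must be positive")
    mean_backlog = sum(backlog_samples_bits) / len(backlog_samples_bits)
    return mean_backlog / arrival_rate_bits_per_slot
```

With backlogs of 100, 140 and 120 bits and an arrival rate of 120 bits per slot, the average delay is one slot. Since the arrival rate is fixed, an $O(V)$ bound on the backlog translates directly into an $O(V)$ bound on the delay.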

Simulation Results
In this section, we use MATLAB (R2016a, MathWorks, Natick, MA, USA) to run simulations and evaluate the performance of the proposed ECM-RMA algorithm. We verify the soundness and efficiency of the ECM-RMA algorithm in minimizing energy consumption.

Simulation Settings in All Simulations, the Energy Transmitter at the AP with p ap = 3 W
We consider a small-scale Rayleigh fading channel model with channel gain $h_m = \bar{h}_m \alpha$. Here, $\bar{h}_m$ is the average channel gain, determined by the geographical position of WD m, and $\alpha$ is an independent exponential random variable with unit mean. Specifically, $\bar{h}_m$ follows a free-space path loss model, where $A_d = 4.11$ is the antenna gain, $f_c = 915$ MHz is the carrier frequency, $d_m$ (in meters) is the distance between WD m and the AP, and $d_e \geq 2$ is the path loss exponent. Here, we set $d_e = 2.8$. For ease of explanation, we consider $h_{m,ul} = h_{m,dl}$, $d_m = 2.5 + 0.3(m - 1)$ meters and a static channel model with $\alpha = 1$. In this way, $h_{m,ul} = h_{m,dl} = \bar{h}_m$, and the channel gain decreases as m increases.
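As a rough illustration of these settings, the sketch below computes the average channel gain of each WD. The closed form $A_d \left( \frac{3 \times 10^8}{4 \pi f_c d_m} \right)^{d_e}$ is an assumption: it is the form commonly paired with these parameters in free-space path loss models, but the display equation itself is not reproduced in this section. The function names are hypothetical, and the WD count of 3 is taken from the default M = 3 used later in the simulations.

```python
import math

A_D = 4.11      # antenna gain (from the text)
F_C = 915e6     # carrier frequency in Hz (from the text)
D_E = 2.8       # path loss exponent (from the text)
C_LIGHT = 3e8   # speed of light in m/s


def distance(m):
    """Distance in meters between WD m (1-indexed) and the AP."""
    return 2.5 + 0.3 * (m - 1)


def avg_channel_gain(m):
    """Assumed form: A_d * (c / (4 * pi * f_c * d_m)) ** d_e."""
    return A_D * (C_LIGHT / (4 * math.pi * F_C * distance(m))) ** D_E


gains = [avg_channel_gain(m) for m in range(1, 4)]  # M = 3 WDs
for m, g in enumerate(gains, start=1):
    print(f"WD {m}: d = {distance(m):.1f} m, average gain = {g:.3e}")
```

With the static channel ($\alpha = 1$), these values would serve as both the uplink and downlink gains, and they shrink as the index m (and hence the distance) grows.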
For energy harvesting from the environment, we set the maximum green energy supply $\gamma_{\max} = 2$, and the green energy supply $\gamma_m$ is uniformly distributed in $[0, \gamma_{\max}]$. All WDs share the same parameters: $\Omega = 10$ J, $k = 10^{-18}$, $f^n_m = 2.4$ GHz and $L_m = 788$ cycles/bit. In addition, each WD has the same energy queue size $R_E = \Omega = 10$ J and allocates the same data storage space $R_D = 1000$ bits/Hz to each task, which is the available data queue length of that task. Moreover, all WDs require the same completion deadline, $J_m = 0.78$ s, and each task requires the same computing resources, $g^n_m = 0.5$ GHz. The computation capability of the MEC server is $f_{ap} = 4$ GHz/s [35]. For communication, $B_{m,ul} = 1$ MHz, $w_0 = 10^{-10}$ W, $p_{m,\min} = 0.1$ Wh and $p_{m,\max} = 0.2$ Wh.
The length of each time slot is T = 1 s, and the total duration of each simulation is S = 10,000 s. At the beginning of every simulation, we set $Q^n_m(0) = 0$ and $E_m(0) = R_E$, ∀m ∈ M, ∀n ∈ N. Without loss of generality, the data arrivals of all tasks follow random processes with equal average arrival rates, i.e., $\lambda^n_m = \lambda$. Unless otherwise stated, λ = 120 bits/time-slot, K = 2 and µ = 0.5.
To evaluate the efficiency of the proposed ECM-RMA algorithm, we compare it with two other strategies, Baseline 1 and Baseline 2. Baseline 1 assigns equal time to all tasks and optimizes only the transmission power allocation of the tasks. In Baseline 2, WDs harvest energy only from the environment. Unless otherwise stated, all other optimizations in Baseline 1 and Baseline 2 are the same as in the ECM-RMA algorithm.
Figure 3a shows the time-averaged energy consumption versus the system control parameter V, which ranges from 1 to 50. The time-averaged energy consumption decreases at a rate of O(1/V) as V increases, which is consistent with Theorem 1. In particular, when V is large enough, the time-averaged energy consumption converges to the optimal value of problem P1, indicating that the proposed ECM-RMA algorithm is asymptotically optimal for problem P1. We also observe that the time-averaged energy consumption increases with the number of WDs M and the number of tasks N. This is because, over a given period, more WDs and more tasks require a higher transmission rate to keep the task data queues stable, which results in higher time-averaged energy consumption. The energy consumption expression in (10) is consistent with this analysis.
Figure 3b shows the time-averaged task data queue length versus the system control parameter V, which ranges from 1 to 50. As expected, the time-averaged data queue length increases at a rate of O(V) as V increases, which is consistent with Theorem 2. Furthermore, the time-averaged task data queue length increases with the number of WDs M, the number of tasks N and the average data arrival rate λ. This is because more WDs, more tasks and a larger λ all mean more data to be processed, which lengthens the data queues.
Figure 4a shows the energy consumption-delay trade-off for the WDs in the MEC system. The larger the time-averaged delay, the smaller the time-averaged energy consumption. This is because a larger time-averaged delay corresponds to a lower quality of service (QoS), which requires only a smaller transmission rate $\nu_m(t)$. According to (1) and (10), only a small transmission power is then required, which results in less energy consumption. Based on the analysis in Figures 3-5, there is a trade-off between energy consumption and delay, quantified as [O(1/V), O(V)]. By adjusting the value of V, we can effectively control the energy consumption-delay performance of the MEC system. More specifically, if the devices in the system require low latency, a smaller V should be set; conversely, if the devices require low energy consumption, a larger V should be set.
Figure 4b shows the energy consumption-delay trade-off achieved by ECM-RMA, Baseline 1 and Baseline 2. As expected, the proposed ECM-RMA algorithm outperforms both baselines: at the same delay, it consumes less energy than Baseline 1 and Baseline 2, which also means that it achieves the lowest latency at the same energy consumption.
This is because, compared with Baseline 1, the ECM-RMA algorithm can allocate transmission time according to the channel dynamics and the data queue lengths. Under Baseline 2, the energy harvested by WDs is unstable, so WDs cannot compute tasks instantly, which results in long delays at the same energy consumption.

Impact of λ, K, µ
Now, we evaluate the impact of the system variables λ, K and µ on the time-averaged energy consumption. Unless otherwise stated, M = 3 and N = 4.
Figure 5 shows the time-averaged energy consumption versus the average data arrival rate λ, ranging from 100 to 200, for different values of the system control parameter V. The time-averaged energy consumption increases with λ. Meanwhile, its rate of increase slows down at higher λ because of the limited data queue size and the limited available battery capacity.
Figure 6 shows the time-averaged energy consumption versus the number of available sub-channels K for different values of V. Since more available sub-channels can support more WDs for data transmission, the time-averaged energy consumption increases with K when K ≤ M, i.e., when the number of available sub-channels does not exceed the number of WDs in the MEC system.
Figure 7 shows the time-averaged energy consumption versus the energy harvesting efficiency µ for different values of V. The time-averaged energy consumption increases monotonically with µ. This is because, according to (16), the wirelessly harvested energy of the WDs is a linear, increasing function of µ, so more energy can be used for wireless data transmission as µ increases. However, the growth rate of the time-averaged energy consumption slows down at higher µ, which indicates that, when the energy supply is sufficient, the time-averaged energy consumption is also limited by the battery capacity and the number of available sub-channels.

Conclusions
In this paper, we investigated the time-averaged energy consumption minimization problem in a multi-user multi-task MEC system with hybrid energy harvesting IoT devices. The problem of minimizing the time-averaged energy consumption subject to data queue stability and harvested energy availability constraints was formulated as an online stochastic optimization problem. We proposed the ECM-RMA algorithm, based on the Lyapunov optimization technique, which turns the stochastic optimization problem into two deterministic sub-problems that can be easily solved by convex optimization and linear programming. The performance analysis shows the asymptotic optimality of the proposed ECM-RMA algorithm and its energy consumption-delay trade-off. The simulation results verify the theoretical analysis and the effectiveness of the proposed ECM-RMA algorithm in minimizing the time-averaged energy consumption.

Appendix A. Proof of Lemma 1
We square both sides of (15) and (21) to obtain (A1) and (A2), respectively, where (A2) reads
$$\big(\hat{E}_m(t+1)\big)^2 \leq \big(\hat{E}_m(t)\big)^2 + \Big(\sum_{n\in\mathcal{N}} e^n_{m,total}(t)\Big)^2 + \big(e_{m,hw}(t)\big)^2 - 2\hat{E}_m(t)\Big(\sum_{n\in\mathcal{N}} e^n_{m,total}(t) - e_{m,hw}(t)\Big). \tag{A2}$$
Summing (A1) and (A2) over all WDs m ∈ {1, 2, ..., M} yields (A3) and (A4), respectively.
During a time slot, a fixed amount of data can be processed at a WD according to Equation (7). The amount of data to be transmitted is also bounded above and below, because the transmission power has upper and lower limits for a given transmission time. Therefore, $D^{n,\min}_m \leq D^n_m(t) \leq D^{n,\max}_m$. In addition, $\sum_{n\in\mathcal{N}} e^n_{m,total}(t)$ cannot exceed the length of the energy queue at time slot t, so the energy consumption has an upper bound, denoted $e_{\max}$. Likewise, $e_{m,hw}(t)$ cannot exceed the battery capacity, so the amount of harvested energy is bounded as $e_{m,hw}(t) \leq \Omega$. Using the upper bounds $D^n_m(t) \leq D^{n,\max}_m$, $I^n_m(t) \leq I^{n,\max}_m$, $e^n_{m,total}(t) \leq e_{\max}$ and $e_{m,hw}(t) \leq \Omega$, ∀m ∈ M, ∀n ∈ N, we define
$$c_{\max} = \frac{1}{2} \sum_{m\in\mathcal{M}} \sum_{n\in\mathcal{N}} \Big( (e_{\max})^2 + \Omega^2 + \big(D^{n,\max}_m\big)^2 + \big(I^{n,\max}_m\big)^2 \Big).$$
In addition, $I^n_m(t)$ is the amount of data arriving at WD m at the end of time slot t, so both the data queue length $Q^n_m(t)$ and the corresponding data queue state U(t) are independent of $I^n_m(t)$ in the current time slot t. As a result, $I^n_m(t)$ is independent of the current power allocation $p_m(t)$ and time allocation $\tau^n_m(t)$. In other words, $E\{I^n_m(t)|U(t)\}$ is a constant. We can define $C_{\max}$ as follows:
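As a side note on the squaring step that produces (A2), the following worked expansion is a sketch under the assumption that the virtual energy queue update (21) subtracts the consumed energy and adds the harvested energy in each slot. The shorthand a and b is introduced here for illustration only.

```latex
% Assumed form of the update (21):
%   \hat{E}_m(t+1) = \hat{E}_m(t) - a + b,
% with a = \sum_{n \in \mathcal{N}} e^{n}_{m,total}(t) \ge 0
% and  b = e_{m,hw}(t) \ge 0.
\begin{align*}
\bigl(\hat{E}_m(t+1)\bigr)^2
  &= \bigl(\hat{E}_m(t)\bigr)^2 + (a-b)^2 - 2\hat{E}_m(t)\,(a-b) \\
  &= \bigl(\hat{E}_m(t)\bigr)^2 + a^2 + b^2 - 2ab - 2\hat{E}_m(t)\,(a-b) \\
  &\le \bigl(\hat{E}_m(t)\bigr)^2 + a^2 + b^2 - 2\hat{E}_m(t)\,(a-b),
\end{align*}
% where the last step drops the term -2ab, which is nonpositive.
```

Substituting a and b back gives the four terms on the right-hand side of (A2).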

Appendix B. Proof of Lemma 2
First, we show that the bivariate function $f(x_1, x_2) = -x_1 \ln\left(1 + \frac{x_2}{x_1}\right)$ is jointly convex in $x_1$ and $x_2$ for $x_1 > 0$ and $x_2 \geq 0$. The Hessian matrix of $f(x_1, x_2)$ is
$$H = \frac{1}{(x_1 + x_2)^2} \begin{bmatrix} \dfrac{x_2^2}{x_1} & -x_2 \\ -x_2 & x_1 \end{bmatrix}.$$
The eigenvalues of H are $k_1 = 0$ and $k_2 = \frac{x_1^2 + x_2^2}{x_1^3 + 2x_1^2 x_2 + x_1 x_2^2}$, respectively, so H is positive semidefinite. Therefore, $f(x_1, x_2)$ is jointly convex in $x_1$ and $x_2$.
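The eigenvalue claim can be checked numerically. The sketch below uses a closed-form Hessian obtained by differentiating f twice (an assumption in the sense that it is derived here, not copied from the paper's display), compares its eigenvalues with $k_1 = 0$ and the stated $k_2$, and cross-checks one Hessian entry against a finite difference of f. All function names are hypothetical.

```python
import math


def f(x1, x2):
    """f(x1, x2) = -x1 * ln(1 + x2 / x1), for x1 > 0 and x2 >= 0."""
    return -x1 * math.log(1.0 + x2 / x1)


def hessian(x1, x2):
    """Closed-form Hessian of f, derived by direct differentiation."""
    s2 = (x1 + x2) ** 2
    return ((x2 * x2 / (x1 * s2), -x2 / s2),
            (-x2 / s2, x1 / s2))


def eigenvalues_2x2(h):
    """Eigenvalues of a symmetric 2x2 matrix via trace and determinant."""
    (a, b), (_, d) = h
    tr, det = a + d, a * d - b * b
    disc = math.sqrt(max(tr * tr - 4.0 * det, 0.0))
    return (tr - disc) / 2.0, (tr + disc) / 2.0


def k2_closed_form(x1, x2):
    """Nonzero eigenvalue as stated in the text."""
    return (x1**2 + x2**2) / (x1**3 + 2 * x1**2 * x2 + x1 * x2**2)


def fd_d2_dx1(x1, x2, step=1e-4):
    """Central finite-difference estimate of the second derivative in x1."""
    return (f(x1 + step, x2) - 2.0 * f(x1, x2) + f(x1 - step, x2)) / step**2


samples = [(0.5, 1.0), (1.0, 3.0), (2.0, 0.25)]
for x1, x2 in samples:
    lo, hi = eigenvalues_2x2(hessian(x1, x2))
    print(f"x=({x1}, {x2}): eigenvalues=({lo:.1e}, {hi:.6f}), "
          f"k2={k2_closed_form(x1, x2):.6f}, "
          f"H11={hessian(x1, x2)[0][0]:.6f}, fd={fd_d2_dx1(x1, x2):.6f}")
```

At each sample point the smaller eigenvalue is zero up to rounding and the larger one agrees with $k_2$, which is what positive semidefiniteness requires.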
For (35), let $x_1 = \tau^n_m(t)$ and $x_2 = \frac{e^{n,var}_m(t) h_{m,ul}(t)}{w_0}$. According to (A8), since $f(x_1, x_2)$ is convex, $\eta^n_m(\tau^n_m(t), e^{n,var}_m(t)) = -\frac{Q^n_m(t) B_{m,ul} f(x_1, x_2)}{\ln 2}$ is concave. Consequently, $\sum_{m\in\mathcal{M}} \sum_{n\in\mathcal{N}} \eta^n_m(\tau^n_m(t), e^{n,var}_m(t))$ is concave too. In addition, the first term in (35) is independent of the optimization variables, whereas the third and fourth terms are linear functions of $e^{n,var}_m(t)$. Therefore, the objective function in (35) is concave. Clearly, the constraints in (35) are linear in $\tau^n_m(t)$ and $e^{n,var}_m(t)$. Thus, the feasible region of the problem in (35) is convex.
In summary, the problem in (35) is a convex optimization problem, which proves Lemma 2.
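The concavity argument of this appendix can also be spot-checked numerically. The sketch below tests midpoint concavity of the rate-like term $\tau \log_2\left(1 + \frac{e h}{w_0 \tau}\right)$ on random pairs of points. The positive weights $Q^n_m(t)$ and $B_{m,ul}$ are omitted because positive scaling does not affect concavity; the channel gain `h` and the sampling ranges are hypothetical values chosen for illustration, while `w0` follows the simulation settings.

```python
import math
import random


def rate_term(tau, e, h=1.16e-5, w0=1e-10):
    """tau * log2(1 + e * h / (w0 * tau)) for tau > 0 and e >= 0."""
    return tau * math.log2(1.0 + e * h / (w0 * tau))


def midpoint_concave(g, p, q, tol=1e-9):
    """True if g at the midpoint is at least the average of g(p) and g(q)."""
    mid = tuple((a + b) / 2.0 for a, b in zip(p, q))
    return g(*mid) >= (g(*p) + g(*q)) / 2.0 - tol


rng = random.Random(0)  # fixed seed so the check is reproducible
points = [(rng.uniform(0.01, 1.0), rng.uniform(0.0, 0.2)) for _ in range(2000)]
violations = sum(
    not midpoint_concave(rate_term, p, q)
    for p, q in zip(points[::2], points[1::2])
)
print("midpoint-concavity violations:", violations)
```

A random check of this kind cannot prove concavity, but a single violation would have contradicted the Hessian-based argument.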

Appendix C. Proof of Theorem 1
Now, we show the performance gap introduced by Lyapunov optimization relative to the original problem P1. First, we use Algorithm 1 to minimize the right-hand side of (32), and we have