A Fast and Efficient Task Offloading Approach in Edge-Cloud Collaboration Environment

Abstract: Edge-cloud collaboration fully utilizes the advantages of the abundant computing resources of cloud computing and the low latency of edge computing, and thus better meets the needs of various Internet of Things (IoT) application scenarios. An important research challenge for edge-cloud collaboration is how to offload tasks to the edge and the cloud quickly and efficiently, taking into account different task characteristics, resource capabilities, and optimization objectives. To address this challenge, we propose a fast and efficient task offloading approach for edge-cloud collaboration systems that achieves a near-optimal solution with low time overhead. First, we propose an edge-cloud collaborative task offloading model that aims to minimize time delay and resource cost while ensuring the reliability requirements of the tasks. Then, we design a novel Preprocessing-Based Task Offloading (PBTO) algorithm to quickly obtain a near-optimal solution to the Task Offloading problem in Edge-cloud Collaboration (TOEC) systems. Finally, we conducted extensive simulation experiments comparing the proposed PBTO algorithm with the optimal method and two heuristic methods. The experimental results show that the total execution time of the proposed PBTO algorithm is reduced by 87.23%, while its total cost is increased by only 0.0004%, compared to the optimal method. The two heuristics, although faster than PBTO, have much lower solution quality; e.g., their total costs are increased by 69.27% and 85.54%, respectively, compared to the optimal method.


Introduction
With the rapid development of wireless communication and Internet of Things (IoT) technologies, novel applications with high computing power and low latency requirements have emerged, such as real-time monitoring, smart manufacturing, autonomous driving, augmented reality, and online gaming [1][2][3]. One way to perform these applications is to offload them to the cloud. However, Cloud Servers (CSs) are usually far away from IoT devices, making it difficult to meet the latency requirements of these applications [4,5]. As an extension of cloud computing at the network edge, edge computing deploys computational resources on Edge Servers (ESs) close to IoT devices [6]. Thus, offloading IoT tasks to ESs is beneficial for reducing service latency [7]. Unfortunately, due to resource constraints, it is difficult for ESs to fulfill large-scale task offloading requests.
Edge-cloud computing, as a new computing paradigm, combines the abundant computing power of CSs with the low-latency advantage of ESs, and can provide higher computing and transmission performance than edge computing or cloud computing alone [8,9]. Nevertheless, task offloading in Edge-cloud Collaborative Computing (ECC) is closely related to many factors, such as application goals, network environments, and the task offloading algorithm. Section 6 carries out simulation experiments to evaluate the performance of the proposed algorithm. Finally, the paper is concluded in Section 7.

Related Work
Recently, several efforts have been made in task offloading in cloud-edge collaborative computing. They typically consider delay, energy consumption, and the trade-off between delay and energy consumption as optimization objectives for task offloading.
Dai et al. [18] proposed a deep reinforcement learning-based task offloading scheme for vehicular edge computing in a cloud-edge environment, which minimizes the average processing latency of the tasks. Considering user delay satisfaction with services provided by Mobile Edge Computing (MEC), Li et al. [19] formulated the offloading problem with maximum cumulative satisfaction and presented an efficient heuristic algorithm to solve the problem. Tang et al. [20] proposed a dynamic resource allocation algorithm in a cloud-edge collaborative environment and obtained an optimal policy for task-resource matching, which can effectively reduce network latency. Mazouzi et al. [21] formulated the task offloading problem as a nonlinear binary integer programming problem and minimized energy consumption with a distributed linear relaxation heuristic algorithm based on the Lagrangian decomposition method. Su et al. [22] proposed a cloud-edge collaborative computing offloading model for transmission energy and computation energy consumption and designed a near real-time computing offloading algorithm that can effectively reduce total energy consumption and achieve good performance close to the optimal solution.
In addition to considering a single delay or energy consumption objective, some studies have treated time delay and energy consumption as a trade-off. Liu et al. [23] proposed an online task offloading and resource allocation method for edge-cloud collaborative computing. The proposed method reduces the average latency and energy consumption of tasks and achieves near-optimal performance. Xu et al. [24] used the NSGA-III algorithm to solve the multi-objective optimization problem of shortening execution time and reducing energy consumption for each mobile device. Laili et al. [25] proposed a large-scale task scheduling model in the edge-cloud collaboration environment, which is solved using a parallel swarm merge evolutionary algorithm and can effectively reduce system delay and energy consumption. Long et al. [26] proposed a task-offloading method based on the improved AR-MOEA algorithm for the task-offloading problem. Their work can reduce the total delay of task completion, energy consumption, and load difference among edge nodes.
To model heterogeneous application scenarios more realistically, several works have jointly considered the computation offloading problem with delay, energy, and cost objectives. Haber et al. [27] studied the optimal offloading problem with task energy and computation cost in a multi-tier edge architecture. Li et al. [11] investigated the task offloading and resource purchasing problem in an edge-cloud collaborative system and proposed a dual time scale Lyapunov optimization method, which purchases computing resources from the public cloud with different time scales and makes offloading decisions online. Hoseiny et al. [17] studied the task scheduling problem in a fog-cloud environment. They first formulated the problem as mixed-integer linear programming to minimize the computation, communication, and violation costs, and then they proposed two efficient heuristic algorithms that are able to obtain computationally inexpensive offloading schemes.
Fog or edge network-based IoT systems are prone to various failures as their distributed and open nature makes them vulnerable.These failures may occur during the operation of a server node, rendering it inoperable.In recent years, many research efforts have focused on task offloading problems that minimize time delay and energy consumption while enhancing reliability.
Wang et al. [28] studied the delay-sensitive and reliability-ensuring task offloading problem in fog computing and designed two algorithms, branch-and-bound and greedy heuristic, to solve the problem, which can realize shorter delay and lower energy consumption while ensuring reliability. Ghanavati et al. [29] proposed a task scheduling algorithm based on dynamic fault-tolerant learning automata, which achieves efficient allocation of IoT tasks to fog nodes and can ensure reliable execution of tasks while optimizing response time and energy consumption. Hou et al. [30] investigated the fog computing-assisted task allocation problem for Unmanned Aerial Vehicle (UAV) swarms, aiming to minimize the energy consumption of UAV swarms under the constraints of latency and reliability, and developed a fast distributed algorithm based on the proximal Jacobi multiplier alternating direction method to solve the problem. Dong et al. [31] proposed an optimal computation offloading and resource allocation method for reliability-aware mobile edge computing, which can effectively reduce system latency and energy consumption. Liang et al. [32] studied the reliability-constrained task offloading problem in edge computing systems, aiming at minimizing latency with reliability as a constraint. The problem is addressed by a distributed reliability-aware task processing and offloading algorithm. Siyadatzadeh et al. [33] proposed a machine-learning-based primary-backup task assignment method in fog computing systems, which achieves enhanced system reliability while meeting real-time constraints by assigning the primary-backup task instances to appropriate fog nodes.
In addition to considering delay, energy consumption, and reliability objectives, some research efforts have also focused on task offloading for joint cost and reliability objectives. Yao et al. [34] investigated the trade-off between maximizing reliability and minimizing the system cost of fog resource allocation and designed a modified best-fit decreasing algorithm that was able to achieve suboptimal solutions with better time efficiency. Mao et al. [35] studied the computational offloading problem in mobile-edge computing, where the optimization objective is to minimize execution cost and task failures under energy threshold constraints, and developed a dynamic computational offloading algorithm based on Lyapunov optimization to solve the problem.
The above studies focusing on reliability generally require that the expected reliability of the system does not fall below a specified threshold. However, since the failure probability of a task instance varies over time, it is difficult to accurately predict its reliability. To this end, Li et al. [36] converted the user's reliability requirements into user satisfaction with the services provided by the MEC network and improved the overall reliability of the system by deploying as many primary-backup instances as possible, but did not require that every primary-backup instance actually be deployed. As a result, the reliability of a specific task cannot be guaranteed.
Table 1 summarizes the related work and highlights the differences between this study and existing work. While researchers have done a great deal of useful work on task offloading for various optimization objectives, there is still much room for improvement in terms of system modeling and algorithmic efficiency.
As shown in Table 1, many previous works have investigated task offloading models in fog or edge networks, generally focusing on one or a few optimization objectives, but there is still a lack of work on task offloading in ECC that jointly considers time delay, resource cost, and reliability objectives. In our work, we propose an edge-cloud collaborative task offloading model that aims to minimize time delays and resource costs while ensuring reliability. Moreover, our work differs from related studies in terms of reliability-ensuring techniques. First, we improve reliability by increasing the number of backup instances and do not need to predict the failure rate of task instances. Second, we require that each primary-backup instance must be assigned to a different server for execution, thus ensuring the reliability of each task.
In terms of task-offloading algorithms, several research efforts have used heuristic algorithms, meta-heuristic algorithms, and machine learning algorithms to solve the task-offloading problem. Heuristic algorithms can quickly obtain solutions, but the quality of the solutions is not high. Meta-heuristic algorithms (e.g., genetic algorithms, evolutionary algorithms, etc.) and machine learning algorithms (e.g., Q-learning algorithms, deep learning algorithms, etc.) can produce high-quality solutions, but they generally require long runtimes, posing challenges in meeting real-time IoT task offloading requirements. Additionally, in edge-cloud collaboration scenarios, the number of tasks, the number of servers, and the network conditions often change dynamically. As a result, some machine learning algorithms need to collect new training samples to retrain the neural network and adapt it to the new offloading environment, which greatly limits their performance. Compared to previous studies, our proposed approach periodically collects and monitors relevant information about tasks and servers and invokes the offloading algorithm to solve an instance of the task offloading problem in each period, and thus can adapt to the dynamic changes of edge-cloud collaboration scenarios to a certain extent. The proposed preprocessing-based offloading algorithm demonstrates good performance in obtaining a near-optimal solution with low time overhead and is able to satisfy delay-sensitive large-scale IoT task offloading requirements.

Table 1. Summary of related work, comparing each reference's system model, delay, cost, reliability, task dependency, and algorithm (e.g., [18] targets an edge-cloud system model).

Real-World Scenario
To facilitate the upgrading of smart manufacturing levels, factory Y has deployed various IoT devices and machines in the workshops to collect real-time data and has developed a number of IoT applications to analyze the collected data for better condition monitoring and control. Alice and Bob are two important contributors to the IoT application scenario, where Alice is the Chief Executive Officer (CEO) of the factory, responsible for the planning and management of the IoT project, and Bob is the project director responsible for the specific implementation and execution of the IoT project. Alice plans to utilize the ECC architecture to provide task processing for the IoT applications. She appointed Bob to find an efficient task offloading method in ECC.
As shown in Figure 1, Bob designed a three-layer edge-cloud collaborative computing architecture containing a device layer, an edge layer, and a cloud layer. The device layer involves various IoT devices such as sensors, terminal devices, smart cameras, industrial robots, actuators, etc. The edge layer consists of a number of resource capacity-constrained computing devices, which are near the IoT devices and provide low-delay task processing capability. In the cloud layer, there are powerful servers that are far away from IoT devices and have a high communication overhead.
The core component of the proposed three-layer computing architecture is the Edge-Cloud Broker (ECB), which is placed at the edge layer. The ECB consists of four subcomponents: task receiver, resource monitor, task scheduler, and management controller. The task receiver receives all task requests from IoT devices through the gateway. The resource monitor is responsible for collecting and monitoring the available resources of ESs and CSs periodically. The task scheduler runs offloading algorithms to solve the TOEC problem based on the characteristics and attributes of the tasks and servers. The management controller is responsible for managing and coordinating the other components for task offloading.
In the computing architecture shown in Figure 1, it is assumed that the ECB and each ES have an associated Base Station (BS) deployed at the same location [22]. The ECB is wirelessly connected to the ESs through the associated BS and to the CSs through a high-speed fiber connection [17].
In Figure 1, the number and characteristics of the tasks and the number and resource capacity of the servers change dynamically and frequently. Therefore, the ECB periodically collects and monitors these data and runs a task offloading algorithm based on them to solve an instance of the task offloading problem in each period. Finally, it offloads each task to the appropriate ES or CS for execution based on the offloading solution. The detailed task offloading process is described below:
(1) IoT devices first submit their requests through the gateway to the task receiver, which analyzes the attributes of each submitted task, such as the size and type of the task, the deadline of the task, and the resource requirements of the task, and then sends them to the management controller, which places them in a list of tasks to be assigned.
(2) When a specific period arrives, the management controller first obtains the available resource information of the ESs and CSs from the resource monitor, then sets the reliability requirements based on experience as well as the characteristics and types of the tasks, and finally sends these data to the task scheduler.
(3) The task scheduler runs the offloading algorithm and forwards the offloading solution back to the management controller, which then assigns the tasks to the available ESs and CSs for execution.
(4) After the ESs and CSs have performed their tasks, they return the processing results to the controller, which checks whether the time delay of these results exceeds their deadlines. If not, it returns the results to the corresponding IoT devices.
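The four-step process above can be condensed into a single scheduling period. The following is a minimal Python sketch, not the paper's implementation: all function and field names (`run_offloading_period`, `free_cpu`, `mips`) are ours, and only the execution delay is modeled for brevity.

```python
# Minimal sketch of one ECB offloading period (steps 1-4 above).
# All names and field keys are illustrative, not from the paper.

def run_offloading_period(pending_tasks, servers, offload_algorithm):
    """Run one period: gather resources, schedule, execute, filter by deadline."""
    # Step (2): the management controller obtains available server resources.
    available = [s for s in servers if s["free_cpu"] > 0]
    # Step (3): the task scheduler produces a solution mapping task id -> server id.
    solution = offload_algorithm(pending_tasks, available)
    # Step (4): execute tasks and keep only results that meet their deadlines.
    results = {}
    by_id = {s["id"]: s for s in available}
    for task in pending_tasks:
        server = by_id[solution[task["id"]]]
        delay = task["length"] / server["mips"]  # execution delay only, for brevity
        if delay <= task["deadline"]:
            results[task["id"]] = delay
    return results

# Usage with a trivial "first server" stub in place of the real offloading algorithm:
greedy = lambda tasks, servers: {t["id"]: servers[0]["id"] for t in tasks}
tasks = [{"id": 0, "length": 100.0, "deadline": 1.0}]
servers = [{"id": 0, "free_cpu": 4, "mips": 200.0}]
print(run_offloading_period(tasks, servers, greedy))  # {0: 0.5}
```

In a full implementation, the stub would be replaced by the PBTO algorithm of Section 5, and transmission delay and resource constraints would also enter the decision.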
To address the task offloading problem, Bob formulated several reasonable requirements for achieving task offloading. First, to meet reliability requirements, each primary-backup instance of a task needs to be assigned to a different server. Second, when multiple tasks are assigned to the same server, the resources required by those tasks should not exceed the resource capacity of that server.
Bob realizes that he is facing a many-to-many assignment problem. Fortunately, he found E-CARGO and GMRA [40,43,46] to be effective ways to solve the assignment problem and decided to use them to solve the task offloading problem. However, the original GMRA problem only specifies the maximum number of tasks a server can take on and cannot directly deal with server resource capacity constraints. This is a new challenge for Bob. For this reason, we define the task offloading problem by extending GMRA and provide a solution for Bob in the following.
To better illustrate the task offloading scenario, we provide an example including 10 tasks and 4 servers, where t0-t4 are small tasks with low resource demand, t5-t9 are large tasks with high resource demand, s0 and s1 are ESs, and s2 and s3 are CSs. The number of primary-backup instances required by the tasks, the resource requirements and related attributes of the tasks, the resource capacity and unit price of the servers, and the cost incurred by the servers to perform the tasks are shown in Tables 2-7, respectively.
According to the reliability model in the literature [31], the reliability of a task is closely related to its length. This is because the longer the task length, the longer the execution time required on the server, which leads to an increased probability of failure. For large tasks with long task lengths, their reliability needs to be ensured more strongly, i.e., more task backups are required. Based on the above considerations, in Table 2, we set the number of primary-backup instances for small tasks (t0-t4) to 1-2, and the number of primary-backup instances for large tasks (t5-t9) to 2-3.
Based on Tables 3 and 4 and according to the delay cost model and resource cost model in Section 4, we can obtain the delay cost and resource cost required to offload tasks t0-t9 to servers s0-s3 for processing, as shown in Tables 5 and 6, respectively. We set the weight coefficient balancing the importance of delay cost and resource cost to 0.5 and normalize the delay cost and resource cost in Tables 5 and 6 to obtain the cost incurred by servers s0-s3 to execute tasks t0-t9, as shown in Table 7. Based on the reliability requirements in Table 2 and the costs in Table 7, we call the task offloading algorithm in Section 5 to obtain the task offloading result; the minimum total sum of the assigned evaluation values is 8.22, and the corresponding optimal solution is shown in Table 7 (bold underlined). Note that the elements in Table 7 represent the costs incurred by the servers to perform the tasks; the detailed evaluation process is described in Section 4.

Problem Formulation
A typical task offloading scenario in ECC involves n tasks and m servers. All servers consist of a set of ESs and a set of CSs, denoted as S = {s_0, s_1, ..., s_{g-1}, s_g, s_{g+1}, ..., s_{m-1}}, where S_E = {s_0, s_1, ..., s_{g-1}} is the set of ESs, S_C = {s_g, s_{g+1}, ..., s_{m-1}} is the set of CSs, and S = S_E ∪ S_C. It is assumed that, in each period, each IoT device's offload request contains only one indivisible task [9,38]. All tasks to be offloaded are denoted as T = {t_0, t_1, ..., t_{n-1}}.
Each task t_j ∈ T (0 ≤ j < n) is defined as an eight-tuple t_j = <in_j, out_j, lr_j, cr_j, mr_j, br_j, rr_j, dl_j>, where in_j and out_j denote t_j's input and output data sizes, respectively; lr_j denotes the length of t_j (in Million Instructions, MI); cr_j, mr_j, and br_j denote the amounts of computational, memory, and bandwidth resources required by t_j, respectively; rr_j denotes the reliability requirement of t_j, i.e., the number of backup instances required by t_j; and dl_j denotes the deadline of t_j, which specifies the maximum tolerable delay. The resource capacity and price of each server s_i ∈ S (0 ≤ i < m) are characterized by the six-tuple s_i = <cc_i, mc_i, bc_i, cp_i, mp_i, bp_i>, where cc_i, mc_i, and bc_i denote the CPU computational power, memory capacity, and bandwidth capacity provided by s_i, respectively; and cp_i, mp_i, and bp_i denote the unit prices of the computational, memory, and bandwidth resources of server s_i, respectively. The main notations used in this paper are given in Table 8.
Table 8. Main notations.
- in_j (out_j): the input (output) data size of task t_j
- lr_j: the length of task t_j
- rr_j: the reliability requirement of task t_j
- cr_j, mr_j, br_j: the computing, memory, and bandwidth resources required by task t_j
- cc_i, mc_i, bc_i: the computing, memory, and bandwidth resource capacities of server s_i
- cp_i, mp_i, bp_i: the unit prices of the computing, memory, and bandwidth resources of server s_i
- L: the role range vector
- R_h: the resource requirement vector; R_h[j] indicates the h-th type of resource required by t_j
- W_h: the resource capacity vector; W_h[i] indicates the h-th type of resource capacity provided by s_i
- P: the assignment matrix; P[i, j] indicates whether server i is assigned to task j
- σ: the total cost of the TOEC problem
- τ^t_{i,j}, τ^e_{i,j}, τ_{i,j}: the transmission delay, execution delay, and total delay cost for task j offloaded to server i
- ς^c_{i,j}, ς^m_{i,j}, ς^b_{i,j}, ς_{i,j}: the computing, memory, bandwidth, and total resource costs of executing task j on server i
- α: the weighting coefficient for the delay cost and resource cost objectives
- Auxiliary vectors used by the algorithm: the column vector of the elements at column j in Q; the ordered vector of the k smallest elements picked from it; the row index vector that records the corresponding row indices; the row vector of the elements at row i in Q; and V_i^R, the column index vector that keeps an ordered record of the corresponding column indices

The TOEC problem is essentially how to assign multiple tasks to multiple servers so as to minimize delay costs and resource costs while ensuring reliability. Inspired by the theory of E-CARGO and GMRA, we extend GMRA to give a formal definition of the TOEC problem, where tasks are roles and servers are agents.
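The task eight-tuple and server six-tuple defined above map directly onto simple record types. The following is a minimal Python sketch; the field names mirror the paper's symbols, while the example values are illustrative.

```python
from dataclasses import dataclass

# Direct transcription of the task eight-tuple and server six-tuple defined
# above; field names mirror the paper's notation.

@dataclass
class Task:          # t_j = <in_j, out_j, lr_j, cr_j, mr_j, br_j, rr_j, dl_j>
    in_size: float   # in_j: input data size
    out_size: float  # out_j: output data size
    lr: float        # lr_j: task length in Million Instructions (MI)
    cr: float        # cr_j: computational resource requirement
    mr: float        # mr_j: memory resource requirement
    br: float        # br_j: bandwidth resource requirement
    rr: int          # rr_j: reliability requirement (number of backup instances)
    dl: float        # dl_j: deadline (maximum tolerable delay)

@dataclass
class Server:        # s_i = <cc_i, mc_i, bc_i, cp_i, mp_i, bp_i>
    cc: float        # cc_i: CPU computational power
    mc: float        # mc_i: memory capacity
    bc: float        # bc_i: bandwidth capacity
    cp: float        # cp_i: unit price of computational resources
    mp: float        # mp_i: unit price of memory resources
    bp: float        # bp_i: unit price of bandwidth resources

# A small task with one backup instance and a 100 ms deadline (values are made up):
t0 = Task(in_size=2.0, out_size=0.5, lr=300.0, cr=1.0, mr=0.5, br=0.2, rr=1, dl=0.1)
```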
Definition 1 ([41,43]). A role range vector L is a vector representing the lower bound of the role ranges in the environment e of group g.
In our scenario, to enhance reliability, multiple instances of each task are created: a primary task instance and several backup task instances, where the number of backup instances reflects the reliability requirement of the task. L records the number of primary-backup instances of the tasks, i.e., L[j] = 1 + rr_j (0 ≤ j < n); if no backup is required, L[j] = 1, and if one backup is required, L[j] = 2. Both the primary and backup task instances need to be offloaded to different servers for execution. For example, each task in Table 2 specifies its number of primary-backup instances; hence, the L vector can be derived directly from Table 2. Note that at each task offloading period, it is difficult to accurately predict the failure rates. Therefore, it is not easy to set L[j] correctly, and an in-depth study of this topic is left for the future. Here, we propose a preliminary scheme for setting L[j]: initially, the ECB estimates the average failure rate of ESs and CSs based on experience; then, the ECB combines the average failure rate of the servers, the length of task t_j, and its criticality to determine L[j] accordingly.

Definition 2. A cost matrix Q is an m × n matrix, where Q[i, j] ∈ [0, 1] expresses the cost of agent i (0 ≤ i < m) to take on role j (0 ≤ j < n), with Q[i, j] = 0 meaning the lowest cost and 1 the highest.
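The construction of L from the tasks' reliability requirements can be sketched in a few lines; the dict-based task representation below is illustrative, not from the paper.

```python
# A minimal sketch of building the role range vector L from Definition 1:
# L[j] = 1 + rr_j, i.e., one primary instance plus rr_j backups.

def role_range_vector(tasks):
    """Return L, where L[j] is the number of primary-backup instances of task j."""
    return [1 + t["rr"] for t in tasks]

# Tasks in the style of Table 2: small tasks with 1-2 backups, large with 2-3.
tasks = [{"rr": 1}, {"rr": 2}, {"rr": 2}, {"rr": 3}]
print(role_range_vector(tasks))  # [2, 3, 3, 4]
```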
In the scenario described in Section 3, the Q matrix is evaluated based on the delay cost and resource cost incurred by a server in processing a task. Building the Q matrix for the TOEC problem is not trivial and requires calculating multiple types of time delays and multi-dimensional resource costs. Accordingly, we present a comprehensive evaluation method to establish Q in Section 4. Table 7 illustrates the normalized Q matrix for the example in Section 3.
To describe the resource requirements of the tasks and the capacity limits of the servers, we formalize the following two data structures.

Definition 3. A resource requirement vector R_h is an n-vector, where R_h[j] (0 ≤ j < n) indicates the amount of the h-th type of resource required by task t_j.

Note that in this paper, we mainly consider three types of resources: computational, memory, and bandwidth, i.e., h = 0, 1, or 2 denotes computational, memory, and bandwidth resources, respectively. Thus, for a task t_j (0 ≤ j < n), R_0[j] = cr_j, R_1[j] = mr_j, and R_2[j] = br_j.

Definition 4. A resource capacity vector W_h is an m-vector, where W_h[i] (0 ≤ i < m) indicates the capacity of the h-th type of resource provided by server s_i.
Similar to the resource requirement vector, the resource capacity vector also involves the three main types of resources, namely computational, memory, and bandwidth, i.e., for a server s_i (0 ≤ i < m), W_0[i] = cc_i, W_1[i] = mc_i, and W_2[i] = bc_i.

Definition 5 ([41,43]). An assignment matrix P is defined as an m × n matrix, where P[i, j] ∈ {0, 1} (0 ≤ i < m, 0 ≤ j < n) indicates whether or not agent i is assigned to role j, with P[i, j] = 1 meaning yes and 0 meaning no.
The assignment matrix represents an offloading solution in our scenario, where P[i, j] ∈ {0, 1} expresses an offloading decision, with P[i, j] = 1 meaning to offload task j to server i and 0 meaning no. For example, the task assignment matrix for the example in Section 3 is shown in Table 9.
Definition 6 ([41,43]). Role j is workable in the group if it has been assigned enough agents, that is,

∑_{i=0}^{m-1} P[i, j] = L[j].

Definition 7 ([41,43]). P is workable if each role in group g is workable, i.e., ∑_{i=0}^{m-1} P[i, j] = L[j] for all j (0 ≤ j < n). Only if P is workable is the group g workable.

Definition 8. The total cost σ is defined as the sum of the assigned agents' costs, that is,

σ = ∑_{i=0}^{m-1} ∑_{j=0}^{n-1} Q[i, j] × P[i, j].

Definition 9. Given L, Q, R_h, and W_h, the TOEC problem is to find a workable P to

min σ = ∑_{i=0}^{m-1} ∑_{j=0}^{n-1} Q[i, j] × P[i, j]

subject to

P[i, j] ∈ {0, 1} (0 ≤ i < m, 0 ≤ j < n),    (4)
∑_{i=0}^{m-1} P[i, j] = L[j] (0 ≤ j < n),    (5)
∑_{j=0}^{n-1} R_h[j] × P[i, j] ≤ W_h[i] (0 ≤ i < m; h = 0, 1, 2),    (6)

where (1) Expression (4) is the binary value indicating whether server i is assigned to task j or not; (2) Expression (5) indicates that each task must be assigned to enough servers to ensure the reliability requirements of the task; (3) Expression (6) depicts that the h-th resource required by all tasks assigned to s_i cannot exceed the capacity of its h-th resource.
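The conditions in Definitions 6-9 can be checked mechanically. Below is a minimal Python sketch of the workability, total cost, and capacity checks; the list-of-lists matrix encoding and function names are ours.

```python
# Mechanical checks for Definitions 6-9: workability of P, total cost, and the
# per-server capacity constraint. Matrix encodings (lists of lists) are ours.

def is_workable(P, L):
    """Definitions 6-7: every role j has exactly L[j] assigned agents."""
    m, n = len(P), len(L)
    return all(sum(P[i][j] for i in range(m)) == L[j] for j in range(n))

def total_cost(P, Q):
    """Definition 8: the sum of the assigned agents' costs."""
    m, n = len(P), len(P[0])
    return sum(Q[i][j] * P[i][j] for i in range(m) for j in range(n))

def capacity_ok(P, R, W):
    """Expression (6): for each resource type h and server i, the h-th resource
    demanded by the tasks assigned to s_i must not exceed W_h[i]."""
    m, n = len(P), len(P[0])
    return all(
        sum(R[h][j] * P[i][j] for j in range(n)) <= W[h][i]
        for h in range(len(R)) for i in range(m)
    )
```

A candidate offloading solution P is feasible exactly when `is_workable(P, L)` and `capacity_ok(P, R, W)` both hold, and its objective value is `total_cost(P, Q)`.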
The TOEC problem defined in Definition 9 can be formulated as an ILP problem, which we can transform into the standard form of ILP using the following steps: (1) The ILP problem [47] is the class of problems that finds a vector X = [x_0, x_1, ..., x_{n*-1}] to obtain Min (Max) Z = C·X, subject to A·X ≤ B with the entries of X restricted to integer values. (2) From Definition 9, we have L, Q, R_h, W_h, and P.
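To make the formulation concrete, the sketch below solves a tiny TOEC instance by brute force: it enumerates, for each task, every set of exactly L[j] distinct servers and keeps the cheapest capacity-feasible assignment. This is an illustrative reference baseline only; the paper's PBTO algorithm exists precisely to avoid this exponential search, and the encoding and function name are ours.

```python
from itertools import combinations, product

# Illustrative brute-force reference solver for tiny TOEC instances only.

def solve_toec_exhaustive(L, Q, R, W):
    """Return (minimum total cost, assignment matrix P) per Definition 9."""
    m, n = len(Q), len(Q[0])
    best_cost, best_P = float("inf"), None
    # Expression (5): each task j gets exactly L[j] distinct servers.
    choices = [list(combinations(range(m), L[j])) for j in range(n)]
    for pick in product(*choices):
        P = [[0] * n for _ in range(m)]
        for j, servers in enumerate(pick):
            for i in servers:
                P[i][j] = 1
        # Expression (6): per-type resource load on each server within capacity.
        feasible = all(
            sum(R[h][j] * P[i][j] for j in range(n)) <= W[h][i]
            for h in range(len(R)) for i in range(m)
        )
        if feasible:
            cost = sum(Q[i][j] for j in range(n) for i in pick[j])
            if cost < best_cost:
                best_cost, best_P = cost, P
    return best_cost, best_P

# Two tasks (the second needs one backup, so L[1] = 2), three servers, one
# resource type with per-task demand 1 and per-server capacity 2:
best, P = solve_toec_exhaustive([1, 2], [[0.1, 0.5], [0.2, 0.3], [0.9, 0.4]],
                                [[1, 1]], [[2, 2, 2]])
print(round(best, 4))  # 0.8
```

The optimum places task 0 on server 0 and the two instances of task 1 on servers 1 and 2, matching the minimum of Q's column sums under the constraints.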

Cost Evaluation
The optimization objective of the TOEC problem is to minimize the delay cost and resource cost.The delay cost model and the resource cost model are given below.

Delay Cost Model
In the ECC architecture shown in Figure 1, we mainly consider two types of time delay: (1) transmission delay and (2) execution delay. Their calculation processes are described in detail below.
(1) Transmission delay: When task t_j is offloaded to the ES s_i for processing, the ECB utilizes the wireless network to transmit the data of t_j to the BS near s_i, which in turn passes it to s_i for processing. The transmission delay includes the time incurred by transmitting the data from the ECB to s_i, the data delivery time between the BS and s_i, and the time incurred by returning the processing result to the ECB after t_j is executed on s_i. Considering that each ES and its associated BS are deployed at the same location, the data transfer time between them is ignored. Moreover, we ignore the result return time, since the size of the data in the processing result is usually small [19].
Similarly, when task t_j is offloaded to the CS s_i for processing, the ECB transmits the data of t_j to s_i over the wired network. The transmission delay includes the time incurred by transferring the data from the ECB to s_i, as well as the time incurred by s_i to return the processing result to the ECB after executing t_j. We again assume that the return time of the processing result is negligible.
According to Shannon's theorem [48], the data transmission rate of the channel between the ECB and the ES s_i can be expressed as follows:

r_{b,i} = ω log2(1 + SNR_{b,i}),    (10)

where b indicates the BS co-located with the ECB, ω is the bandwidth of the wireless channel, and SNR_{b,i} is the wireless channel's signal-to-noise ratio.
Based on Equation (10), the data transmission delay between the ECB and the server s_i for task t_j can be defined as follows:

τ^t_{i,j} = in_j / r_{b,i} if s_i ∈ S_E;    τ^t_{i,j} = in_j / ϖ_{b,i} if s_i ∈ S_C,    (11)

where ϖ_{b,i} is the bandwidth of the wired link between the BS b and the CS s_i.
(2) Execution delay: The delay for executing task t_j on a specific server (ES or CS) s_i can be calculated using

τ^e_{i,j} = lr_j / cc_i.    (12)

In summary, the delay cost for task t_j offloaded to server s_i for execution can be expressed as

τ_{i,j} = τ^t_{i,j} + τ^e_{i,j}.    (13)
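The delay model above can be sketched directly in code. This is a minimal Python illustration under simplified, unit-free assumptions; the parameter names (`omega`, `snr`, `wired_bw`) are our stand-ins for the paper's symbols.

```python
from math import log2

# Minimal sketch of the delay cost model: Shannon-rate wireless transmission
# for ESs, fixed wired bandwidth for CSs, plus execution delay.

def transmission_delay(in_size, is_edge, omega=None, snr=None, wired_bw=None):
    """Input-data transfer time; the result-return time is ignored, as in the text."""
    if is_edge:
        rate = omega * log2(1 + snr)  # Shannon rate of the wireless ECB-ES link
    else:
        rate = wired_bw               # fixed bandwidth of the wired ECB-CS link
    return in_size / rate

def execution_delay(lr, cc):
    """Task length (MI) divided by server speed (MIPS)."""
    return lr / cc

def delay_cost(in_size, lr, cc, **link):
    """Total delay cost: transmission delay plus execution delay."""
    return transmission_delay(in_size, **link) + execution_delay(lr, cc)

# A wireless link with bandwidth 1 unit and SNR = 3 gives rate log2(4) = 2 units,
# so 4 data units take 2 time units; executing 500 MI at 2000 MIPS adds 0.25:
print(delay_cost(4.0, 500.0, 2000.0, is_edge=True, omega=1.0, snr=3.0))  # 2.25
```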

Resource Cost Model
In an ECC system, offloading tasks to servers for processing consumes multiple resources and incurs various costs, including computational costs, memory costs, and bandwidth usage costs.
The computational cost is the product of server s_i's CPU price per unit time and the execution time of task t_j on s_i:

ς^{comp}_{i,j} = p^{comp}_i × τ^{exec}_{i,j}.    (14)

The memory cost is the product of server s_i's price per storage unit and the number of memory units required by task t_j:

ς^{mem}_{i,j} = p^{mem}_i × r^{mem}_j.    (15)

The bandwidth cost is the product of server s_i's price per data unit and the sum of the input and output data sizes of task t_j:

ς^{bw}_{i,j} = p^{bw}_i × (d^{in}_j + d^{out}_j).    (16)

According to Equations (14)-(16), the resource cost of executing task t_j on server s_i can be calculated as

ς_{i,j} = ς^{comp}_{i,j} + ς^{mem}_{i,j} + ς^{bw}_{i,j}.    (17)

Based on the above delay cost and resource cost models, we evaluate the cost of a server performing a task as follows. Specifically, each Q[i, j] is obtained as the weighted sum of the normalized delay cost and resource cost incurred by server s_i executing task t_j:

Q[i, j] = α × τ*_{i,j} + (1 − α) × ς*_{i,j},    (18)

where α ∈ [0, 1] is a weighting coefficient that balances the importance of delay cost and resource cost; e.g., α > 0.5 indicates that delay cost is more important than resource cost. τ*_{i,j} and ς*_{i,j} denote the normalized delay cost and resource cost, respectively, calculated as

τ*_{i,j} = τ_{i,j} / τ^{max}_j,    (19)        ς*_{i,j} = ς_{i,j} / ς^{max}_j,    (20)

where τ^{max}_j and ς^{max}_j denote the maximum possible delay cost and resource cost of task t_j, respectively.
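A minimal sketch of the resource cost and the weighted normalized cost entry Q[i, j], assuming prices and demands are given as plain doubles (parameter names are illustrative, not from the paper):

```java
// Hypothetical sketch of the resource cost model and the Q matrix entry.
public class CostMatrix {
    // Resource cost: compute cost + memory cost + bandwidth cost
    static double resourceCost(double cpuPrice, double execTime,
                               double memPrice, double memUnits,
                               double bwPrice, double dataIn, double dataOut) {
        return cpuPrice * execTime + memPrice * memUnits + bwPrice * (dataIn + dataOut);
    }
    // Q[i][j] = alpha * (delay / delayMax) + (1 - alpha) * (resource / resourceMax)
    static double qEntry(double alpha, double delay, double delayMax,
                         double resource, double resourceMax) {
        return alpha * (delay / delayMax) + (1.0 - alpha) * (resource / resourceMax);
    }
}
```

With α = 0.5, as in the example scenario, both normalized terms contribute equally to each Q entry.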
In the scenario outlined in Section 3, the lengths, input file sizes, and the amounts of computational, memory, and bandwidth resources required by tasks t_0-t_9 are shown in Table 3, and the servers' unit prices for computational, memory, and bandwidth resources are shown in Table 4. We assume that the transmission delays for tasks offloaded to the ESs and CSs vary randomly within the ranges of 5-15 ms and 50-150 ms, respectively. Based on the delay cost and resource cost models, Tables 5 and 6 show the delay cost and resource cost of offloading the tasks to the servers for processing, respectively.
Based on Tables 5 and 6, and setting the weight coefficient α to 0.5, the costs incurred by servers s_0-s_4 to perform tasks t_0-t_9 can be evaluated according to Equations (18)-(20), as shown in Table 7.

Solution to the TOEC Problem
In Section 3, the TOEC problem is formulated as an ILP. Solving it requires assigning n tasks to m servers. Since the requirements of the tasks and the properties of the servers differ, there are m^n possible assignments of tasks to servers.
In a large-scale scenario, n and m are usually on the scale of hundreds or thousands [17]. Meanwhile, to meet the low-delay requirements of IoT applications, offloading decisions need to be made immediately once requests arrive [49]. Therefore, designing an efficient offloading algorithm is challenging.
Fortunately, in the experimental evaluation, we observe that the optimal assignment for each task is generally distributed among its few lowest-cost candidate servers, despite the constraints on server resource capacity. Inspired by this observation, we propose the PBTO algorithm, which is based on the following ideas: (1) the candidate servers for each task are sorted by cost based on the Q matrix; (2) the Top k candidate servers are picked out for each task to reduce the size of the TOEC problem, and finally, CPLEX is invoked to solve the Reduced TOEC (RTOEC) problem.
To facilitate the algorithm description, the following auxiliary data structures are defined.
Definition 10. A column vector Q_{C_j} (0 ≤ j < n) is the vector of the elements in column j of Q.

Definition 11. The vector min_k Q_{C_j} consists of the k smallest elements of Q_{C_j}, sorted in ascending order.

Definition 12. The row index vector V^k_{C_j} keeps an ordered record of the corresponding row index number in Q of each element in min_k Q_{C_j}.

Definition 13. The column index vector V_{R_i} keeps an ordered record of the corresponding column index number in Q of each element remaining after Q_{R_i} has been reduced, where Q_{R_i} denotes the vector of the elements in row i of Q.

Definition 14. Based on V^k_{C_j} and V_{R_i}, the total cost of the RTOEC problem is defined as the sum of the costs Q[i, j] of the selected task-server pairs; the RTOEC problem is then to find a feasible assignment matrix P that minimizes this total cost, subject to the reliability constraint of each task and the computing, memory, and bandwidth capacity constraints of each server, where the candidate servers of each task t_j are restricted to those indexed by V^k_{C_j}.

The PBTO algorithm consists of two phases: the preprocessing phase and the task offloading phase. The specific process of the preprocessing phase is shown in Algorithm 1, and its main steps are as follows:

Step 1: First, copy the elements of the jth column of Q into Q_{C_j}, sort the elements of Q_{C_j} in ascending order, and pick out the Top k elements to be added to min_k Q_{C_j}; then obtain the corresponding row index number of each element in min_k Q_{C_j} and save them one by one into the row index vector V^k_{C_j}; finally, add V^k_{C_j} to the set of row index vectors RS.
Step 2: First, copy the elements of the ith row of Q into Q_{R_i}; then, traverse each element Q_{R_i}[j] and determine whether its row index number i exists in the row index vector V^k_{C_j}; if so, obtain the corresponding column index number of Q_{R_i}[j] in Q and save it into the column index vector V_{R_i}; finally, add V_{R_i} to the set of column index vectors CS.

Algorithm 1: Preprocessing Algorithm
Input: T: the task set; S: the server set; Q: the cost matrix.
Output: RS, CS: the row index vector set and the column index vector set.
01: for each t_j ∈ T do
02:   Copy all elements in column j of Q into Q_{C_j};
03:   Sort the elements of Q_{C_j} in ascending order;
04:   Save the Top k elements of Q_{C_j} into min_k Q_{C_j};
05:   Save the corresponding row index number in Q of each element in min_k Q_{C_j} into V^k_{C_j};
06:   Add V^k_{C_j} into RS;
07: end for
08: for each s_i ∈ S do
09:   Copy all elements in row i of Q into Q_{R_i};
10:   for each element Q_{R_i}[j] do
11:     if row index i exists in V^k_{C_j} then
12:       Save the corresponding column index number in Q of the element into V_{R_i};
13:     end if
14:   end for
15:   Add V_{R_i} into CS;
16: end for
17: return RS, CS

From the inputs of Algorithm 1, it can be observed that the number of tasks, the number of servers, and the cost matrix Q are key parameters affecting the efficiency of the PBTO algorithm. Due to the dynamic and open nature of edge-cloud collaboration systems, these parameters, including the number of tasks, the number of servers, the characteristics and resource requirements of tasks, and the available resources of servers, change dynamically. Therefore, the task offloading scheme in ECC should be adjusted dynamically according to these changes. To address this challenge, we present our task offloading scheme in Section 3. In this scheme, the ECB periodically collects and monitors these parameters; when each offloading period arrives, the ECB first updates the parameters based on the latest collected data and then invokes the PBTO algorithm to solve an instance of the offloading problem with the updated parameters.
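The preprocessing phase can be sketched compactly in Java. This is a simplified rendering of the two steps, not the authors' implementation; for clarity it sorts full index lists rather than using a partial Top-k selection:

```java
import java.util.*;

public class Preprocess {
    // Step 1: for each task (column j), record the row indices of the
    // Top-k lowest-cost servers in Q, in ascending order of cost.
    static List<List<Integer>> topKRows(double[][] q, int k) {
        int m = q.length, n = q[0].length;          // m servers, n tasks
        List<List<Integer>> rs = new ArrayList<>(); // one V^k_{C_j} per task
        for (int j = 0; j < n; j++) {
            final int col = j;
            List<Integer> rows = new ArrayList<>();
            for (int r = 0; r < m; r++) rows.add(r);
            rows.sort(Comparator.comparingDouble(r -> q[r][col]));
            rs.add(new ArrayList<>(rows.subList(0, Math.min(k, m))));
        }
        return rs;
    }

    // Step 2: for each server (row i), keep only the columns j for
    // which server i survived Step 1 (i.e., i appears in V^k_{C_j}).
    static List<List<Integer>> survivingCols(double[][] q, List<List<Integer>> rs) {
        int m = q.length, n = q[0].length;
        List<List<Integer>> cs = new ArrayList<>(); // one V_{R_i} per server
        for (int i = 0; i < m; i++) {
            List<Integer> cols = new ArrayList<>();
            for (int j = 0; j < n; j++)
                if (rs.get(j).contains(i)) cols.add(j);
            cs.add(cols);
        }
        return cs;
    }
}
```

A production version could use a bounded heap for the Top-k selection to avoid the full O(m log m) sort per column.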
In addition, in Section 4, the Q matrix is created by evaluating the delay cost and resource cost incurred by tasks executed on the server, which in turn depend on network conditions.In this paper, we do not currently consider the impact of dynamically changing network conditions on delay cost and resource cost, and we plan to investigate this issue in-depth in future work.
The task offloading phase solves the RTOEC problem by invoking CPLEX based on the preprocessing results of Algorithm 1. The detailed process is shown in Algorithm 2, and its main steps are as follows: Step 1: The matrices Q and P are converted into the corresponding one-dimensional arrays U and X.
Step 2: Based on U and X, first, an objective expression is declared; then, each task t_j in the task set T is traversed, its row index vector V^k_{C_j} is obtained, and each element of V^k_{C_j} is traversed to find its corresponding elements in U and X, which are added to the objective expression; finally, the minimization objective is established.
Step 3: Based on X and L, for each task t_j in the task set T, a reliability constraint expression is declared and its row index vector V^k_{C_j} is obtained; then, each of its corresponding elements in X is found and added to the reliability constraint expression; finally, the reliability constraint equation is established.
Step 4: Based on X, R_0, R_1, R_2, W_0, W_1, and W_2, for each server s_i in the server set S, first, the three constraint expressions for computing, memory, and bandwidth capacity are declared, and the column index vector V_{R_i} of s_i is obtained; then, each element V_{R_i}[j′] is traversed to find its corresponding element in X, whose demand for each type of resource is added to the corresponding resource constraint expression; finally, the three constraint inequalities for computing, memory, and bandwidth are established.
Step 5: Based on the above objective and constraints, the cplex.solve() method is invoked to obtain the optimal solution.
The complexity of the above algorithms depends on the number of tasks (n) and the number of servers (m). The time complexity of Algorithm 1 is determined as follows: (1) the time complexity of Step 1 is O(n × m log m), and (2) the time complexity of Step 2 is O(m × n).
Notably, Algorithm 1 preprocesses the TOEC problem so that only the Top k candidate servers are selected for each task, effectively reducing the number of decision variables. For example, the task offloading problem with n tasks and m servers has m × n decision variables, while in PBTO the number of decision variables is reduced to n × k.
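To make the reduction concrete, using the typical scales from the experiments (n = 500 tasks, m = 250 servers, k = 10), preprocessing shrinks the model from 125,000 to 5,000 binary variables, a 25× reduction. A trivial helper illustrates the counts:

```java
// Illustrative helper: decision variable counts before and after preprocessing.
public class ProblemSize {
    // Full TOEC model: one binary variable per (server, task) pair
    static long fullVariables(long n, long m) { return n * m; }
    // After preprocessing, each task keeps only its Top k candidate servers
    static long reducedVariables(long n, long k) { return n * k; }
}
```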
The time complexity of the first four steps of Algorithm 2 is determined as follows: (1) Step 1 is O(n × m), (2) Step 2 is O(n × k), (3) Step 3 is O(n × k), and (4) Step 4 is O(n × m). Step 5 of Algorithm 2 employs CPLEX (version 22.1.0) to solve the RTOEC problem. CPLEX is a commercial optimization software package supported by IBM, and we do not know its actual internal complexity. The simplex, ellipsoid, and interior point algorithms are commonly used for solving linear programming, and their worst-case time complexities are O(2^Φ), O(Φ^4 × Ω), and O(Φ^3.5 × Ω), respectively, where Φ is the number of decision variables and Ω is the length of a binary encoding of the input data [50], that is,

Ω = ∑_i ∑_j |a_{i,j}|,

where |a_{i,j}| denotes the number of binary bits of element a_{i,j} in the input data matrix A.

Algorithm 2: Task Offloading Algorithm
Input: Q: the cost matrix; L: the reliability requirement vector; R_0, R_1, R_2: the computing, memory, and bandwidth resource requirement vectors; W_0, W_1, W_2: the computing, memory, and bandwidth resource capacity vectors; RS, CS: the row index vector set and the column index vector set.

Output:
P: the task assignment matrix.
01: Convert the Q matrix into a one-dimensional array U;
02: Convert the P matrix into a one-dimensional array X;
03: IloLinearNumExpr obj = cplex.linearNumExpr();
04: for each t_j ∈ T do
05:   V^k_{C_j} ← Get the row index vector of column j from RS;
06:   for each element i′ ∈ V^k_{C_j} do
07:     Add the corresponding elements of U and X to obj;
08:   end for
09: end for
10: cplex.addMinimize(obj);
11: for each t_j ∈ T do
12:   IloLinearNumExpr cons_r = cplex.linearNumExpr();
13:   V^k_{C_j} ← Get the row index vector of column j from RS;
14:   for each element i′ ∈ V^k_{C_j} do
15:     Add the corresponding element of X to cons_r;
16:   end for
17:   cplex.addEq(cons_r, L[j]);
18: end for
19: for each s_i ∈ S do
20:   IloLinearNumExpr cons_c = cplex.linearNumExpr();
21:   IloLinearNumExpr cons_m = cplex.linearNumExpr();
22:   IloLinearNumExpr cons_b = cplex.linearNumExpr();
23:   V_{R_i} ← Get the column index vector of row i from CS;
24:   for each element j′ ∈ V_{R_i} do
25:     Add the resource demands of the corresponding element of X to cons_c, cons_m, and cons_b;
26:   end for
27:   cplex.addLe(cons_c, W_0[i]); cplex.addLe(cons_m, W_1[i]); cplex.addLe(cons_b, W_2[i]);
28: end for
29: Invoke cplex.solve() and return P

We assume that CPLEX integrates the simplex, ellipsoid, and interior point algorithms to solve linear programming problems. The input data for Algorithm 2 consist of Q, R_0, R_1, R_2, W_0, W_1, W_2, and L, where Q is an n × m double-type matrix; R_0, R_1, and R_2 are n-dimensional double-type vectors; W_0, W_1, and W_2 are m-dimensional double-type vectors; and L is an n-dimensional integer-type vector. Generally, an integer variable occupies 32 bits and a double variable occupies 64 bits. Moreover, since PBTO preprocesses the TOEC problem, the elements of Q are correspondingly reduced to n × k double-type variables. In summary, the number of bits required for the input data of Algorithm 2 is 8 × n × k + 31 × n + 24 × m. In the case of CPLEX using the simplex, ellipsoid, and interior point algorithms, the worst-case time complexities of PBTO are O(2^(n×k)), O((n×k)^4 × Ω), and O((n×k)^3.5 × Ω), respectively.

We assume that the optimal method for solving the TOEC problem also uses CPLEX. The optimal method does not reduce the size of the TOEC problem; its input data are L, Q, R_0, R_1, R_2, W_0, W_1, and W_2. Therefore, the number of decision variables of the optimal method is n × m, and the number of bits required for its input data is 8 × n × m + 31 × n + 24 × m. In the case of CPLEX using the simplex, ellipsoid, and interior point algorithms, the worst-case time complexities of the optimal method are O(2^(n×m)), O((n×m)^4 × Ω), and O((n×m)^3.5 × Ω), respectively. In the discussed scenario, k is a constant (typically less than 10). As a result, the time complexities of the PBTO and optimal methods, as well as their ratios, can be simplified, as shown in Table 10.
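As a worked illustration of how such ratios arise, under the assumption that the encoding lengths Ω of the two models are comparable, the interior-point worst-case bounds simplify to a ratio that depends only on k and m:

```latex
\frac{T_{\mathrm{PBTO}}}{T_{\mathrm{OPT}}}
  \approx \frac{(n k)^{3.5}\,\Omega}{(n m)^{3.5}\,\Omega}
  = \left(\frac{k}{m}\right)^{3.5},
\qquad
\text{e.g. } k = 10,\; m = 250
  \;\Rightarrow\; \left(\tfrac{1}{25}\right)^{3.5} \approx 1.3 \times 10^{-5}.
```

Since k is a small constant while m grows with the system, the ratio shrinks polynomially as the server pool scales up.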
Table 10.The time complexity of the PBTO, corresponding optimal methods, and their time complexity ratios.

Experiments
In this section, we conduct extensive experiments to simulate the task offloading process in ECC and evaluate the performance of the proposed approach as well as the impact of important parameters. All experiments were coded in Java and performed on a Windows platform equipped with an Intel Core i5-1240P @ 1.7 GHz and 16 GB RAM.

Experimental Settings
To evaluate the performance of PBTO, we compare it with three benchmark methods, an optimal method and two heuristics: (1) OPT: it uses the IBM CPLEX Optimizer to find the optimal solution to the optimization model introduced in Section 4 and offloads tasks based on that solution. (2) ETO [23]: tasks are preferentially offloaded at the edge as long as an ES exists with enough resources to complete the current task; otherwise, the task is offloaded to the cloud. (3) ATO [22]: tasks are equally distributed between the edge and the cloud according to the order in which requests are submitted, so as to balance the edge-cloud load.
ETO and ATO are two heuristic methods that employ prioritized edge and average edge-cloud policies for task offloading, respectively.As a result, ETO and ATO have low time overhead, but the quality of the obtained solutions is not high.We choose ETO and ATO as benchmark comparison methods to evaluate the effectiveness of the proposed PBTO algorithm, i.e., to compare how close they are to the optimal solution.
The efficiency and effectiveness of the PBTO algorithm are mainly affected by six key parameters: (1) the number of tasks (n); (2) the number of servers (m); (3) the proportion of ESs (θ); (4) the proportion of large tasks (δ); (5) the number of primary-backup instances of the tasks (L[j]); and (6) the number of Top k candidate servers (k). Among them, the numbers of tasks and servers are the two key parameters affecting the execution time of the PBTO algorithm. Moreover, the proportion of ESs affects the available resource capacity, and the proportion of large tasks affects the resource requirements. These two parameters, together with the number of tasks, the number of servers, the number of primary-backup instances, and the Top k candidate servers, collectively affect the effectiveness of the PBTO algorithm, i.e., delay cost, resource cost, and total cost.

Based on the six key parameters above, we conducted six sets of experiments to analyze their impact on the efficiency and effectiveness of the PBTO algorithm. Table 11 summarizes the experimental settings, which are discussed in the next section. Each experiment is repeated 100 times, and the results are averaged. From Section 4, some attributes of the tasks and servers are the key parameters for evaluating the Q matrix, such as the length of a task, the sizes of its input and output data, its resource requirements, and the servers' resource capacities and prices. Therefore, in experiment Sets #1-6, we employ these parameters to create the Q matrix. Tasks are categorized into two types, small tasks and large tasks; the specific parameter settings for each task type are shown in Table 12. The resource capacity and price attributes of the servers are set as shown in Table 13.
In recent years, researchers have carried out extensive work on the task offloading problem in ECC [17,37,38]; we refer to their parameter settings in the simulation experiments and combine them with the requirements of our application scenarios, as shown in Tables 11-13. All the attributes outlined in Tables 12 and 13 are generated randomly within the specified ranges. Moreover, the objective weighting coefficient α is set to 0.5, while the signal-to-noise ratio SNR for a BS is set to 100 dB [22]. The transmission delay from the ECB to the ESs is chosen randomly from 5 to 15 ms, while the transmission delay from the ECB to the CSs varies randomly from 50 to 250 ms [17].

Experimental Results
Figures 2-7 demonstrate the effectiveness and efficiency of all approaches in experiment Sets #1-6, respectively, in terms of (1) the total cost (sub-figures (a)), (2) the delay cost (sub-figures (b)), (3) the resource cost (sub-figures (c)), and (4) the execution time (sub-figures (d)). In addition, to measure the relative difference in total cost and execution time between PBTO, ETO, ATO, and the optimal approach OPT, we define two metrics, the total cost percent deviation and the execution time percent deviation:

λ*_σ = (σ* − σ_OPT) / σ_OPT × 100%,        λ*_τ = (τ* − τ_OPT) / τ_OPT × 100%,

where * denotes any of PBTO, ETO, and ATO; λ*_σ and λ*_τ denote the total cost percent deviation and the execution time percent deviation, respectively; σ* and τ* are the total cost and execution time of the corresponding approach; and σ_OPT and τ_OPT are the total cost and execution time of OPT.
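Both metrics share the same form and can be computed with a one-line helper (an illustrative sketch; negative values indicate a reduction relative to OPT):

```java
// Percent deviation of an approach's metric (total cost or execution time)
// relative to the optimal approach OPT.
public class Deviation {
    // value: metric of PBTO/ETO/ATO; optValue: the same metric for OPT
    static double percentDeviation(double value, double optValue) {
        return (value - optValue) / optValue * 100.0; // negative = lower than OPT
    }
}
```

For example, an approach whose execution time is 12.77 when OPT's is 100 shows a deviation of −87.23%, i.e., an 87.23% reduction.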
In general, OPT, being the optimal approach, clearly achieves the lowest total cost compared to all other approaches across all experiments.This comes at the cost of high time overhead and is therefore not suitable for large-scale scenarios where low latency is critical.PBTO is close to OPT in terms of total cost, delay cost, and resource cost and is much lower than OPT in terms of execution time.The other two baseline approaches have shorter execution times, but their total costs are significantly higher than those of OPT and PBTO.In conclusion, PBTO outperforms all the benchmark approaches.
(1) Impact of the number of tasks (Set #1)

In Set #1, we study the effect of different numbers of tasks on the total cost, delay cost, resource cost, and algorithm execution time. The ratio of the number of tasks to the number of servers, the proportion of ESs, the proportion of large tasks, the reliability requirements of tasks, and k are set to 2:1, 50%, 50%, 1-3, and 10, respectively.
As shown in Figure 2a, the total cost of all four approaches gradually increases with the number of tasks. In all cases, the total cost of OPT is the lowest. PBTO removes some of the higher-cost candidate servers and then uses CPLEX to solve the RTOEC problem, so its total cost is close to that of OPT and, in most cases, almost the same. Neither ETO nor ATO considers the optimization objective during task offloading; thus, their total costs are much higher than those of OPT and PBTO.
Figure 2b,c illustrates the effect of the number of tasks on delay cost and resource cost. In all cases, the delay cost and resource cost of PBTO and OPT are almost the same and significantly lower than those of ETO and ATO. It is worth noting that ETO has a lower delay cost than ATO due to its prioritization of offloading at the edge, but this also leads to the highest resource cost. On the contrary, ATO employs an offloading strategy that balances the edge and the cloud; hence, it has the highest delay cost among the approaches, but its resource cost is lower than that of ETO.

Figure 2d indicates the execution time of the four approaches. The execution time of OPT increases rapidly with the number of tasks, exceeding 5 s at 700 tasks and eventually reaching 12.5 s at 1000 tasks. PBTO greatly reduces the problem size by removing a large number of higher-cost candidate servers before offloading the tasks; thus, its time overhead is much lower than that of OPT. Even when the number of tasks reaches 1000, PBTO takes no more than 740 ms. Both ETO and ATO only need to locally determine whether resource capacity constraints are satisfied during task offloading, so their execution time is smaller than that of OPT and PBTO.
By observing Table 14, we find that PBTO deviates minimally from OPT in total cost and substantially in execution time. Over all cases, PBTO sacrifices 0.0003% of the total cost but gains an 87.84% reduction in execution time compared to OPT. ETO and ATO have much lower execution times than OPT; however, their total costs deviate substantially from OPT's. This indicates that our proposed PBTO is able to find a near-optimal solution to the TOEC problem very quickly and is fully capable of coping with large-scale delay-sensitive task offloading scenarios.
Note that in Table 14, when n = 300, the total cost deviation of PBTO is 0.00018, which is higher than that for n = 400 and 500. This is mainly because when n is relatively small, the number of candidate servers for each task is smaller than in cases with larger n. For instance, when n = 300 and m = 150, the number of candidate servers for each task is 150, whereas when n = 500 and m = 250, it is 250. Further, the PBTO algorithm removes some high-cost candidate servers during preprocessing, and this operation may erroneously remove candidate servers that should have been present in the optimal solution. The probability of such an error is relatively high when n is small, leading to fluctuations in the total cost deviation. Fortunately, although PBTO's total cost deviation is less stable for smaller n, it remains very close to the optimal solution overall.

(2) The impact of the number of servers (Set #2)

In Set #2, we investigate the impact of different numbers of servers on offloading performance with a fixed number of tasks. We fix the number of tasks at 500 and change the number of servers from 100 to 400 with a step of 50. The proportion of large tasks, the proportion of edge servers, the reliability requirements of tasks, and k are set to 50%, 50%, 1-3, and 10, respectively.
It is evident from Figure 3a that as the number of servers increases, the total cost of both OPT and PBTO improves slightly, mainly because a larger number of servers increases the chance of finding a better task-offloading solution; this is further confirmed by Figure 3b,c, where, as the number of servers increases, the time delay gradually decreases while the resource cost remains constant, yielding some improvement in the total cost of both OPT and PBTO. Similar to the observations in Figure 2a-c, the total cost, delay cost, and resource cost of ETO and ATO remain stable and much higher than those of OPT and PBTO.
In Figure 3d, the execution time of OPT and PBTO increases with the number of servers, and OPT's increases more rapidly than PBTO's. The reason is that the increase in the number of servers rapidly expands the search space of the TOEC problem; since OPT and PBTO seek the global optimal solution, their execution times grow accordingly. It is worth noting that PBTO's execution time does not increase as rapidly as OPT's because it reduces the search space of the problem. Similar to Figure 2d, the execution times of ETO and ATO do not change significantly.
By observing Table 15, we find that PBTO deviates minimally from OPT in total cost and significantly in execution time. Over all cases, PBTO sacrifices 0.0005% of the total cost but gains an 87.60% reduction in overall execution time compared to OPT. ETO and ATO exhibit results similar to those in Table 14, i.e., low time overhead but large total cost deviation.

(3) The impact of the proportions of ESs (Set #3)

In Set #3, we fix the number of tasks at 500 and increase the percentage of ESs among all servers from 0% to 100% to compare the performance of the four approaches. The ratio of the number of tasks to the number of servers, the proportion of large tasks, the reliability requirements of tasks, and k are set to 2:1, 50%, 1-3, and 10, respectively.
From Figure 4a-c, as the percentage of ESs increases, the total cost of all approaches first improves significantly and then stabilizes. This is because when the percentage of ESs first increases, many tasks migrate from CSs to ESs, resulting in a rapid decrease in time delay and a gradual increase in resource cost, which reduces the total cost. However, as the proportion of ESs increases further, no more tasks migrate to the ESs, and therefore the total cost stabilizes. Notably, ETO prioritizes offloading tasks at ESs, and once the ESs reach a certain percentage, it migrates as many tasks as possible to them; hence, the changes in its total cost, delay cost, and resource cost are more pronounced than in the other three approaches. Furthermore, ATO has no feasible solution when all servers are ESs because of its edge-cloud balanced offloading strategy.
Figure 4d demonstrates the variation in execution time for all approaches. OPT still takes a long time to obtain the optimal solution, and the execution time of PBTO is slightly higher than that of ETO and ATO but still much lower than that of OPT. Moreover, when the percentage of ESs is close to 100%, many ESs have difficulty accommodating more tasks due to resource capacity constraints, which causes OPT and PBTO to take longer to find a solution.
Observing Table 16, we find that PBTO deviates minimally from OPT in total cost and considerably in execution time. Over all cases, PBTO sacrifices 0.0004% of the total cost but gains an 87.75% reduction in overall execution time compared to OPT. ETO and ATO yield results similar to those in Table 14, i.e., low time overhead but large total cost deviation.

(4) The impact of the proportions of large tasks (Set #4)

In Set #4, we study the effect of the proportion of different task types, from all small tasks to all large tasks, on offloading performance. The total numbers of tasks and servers are fixed at 500 and 250, respectively. The proportion of ESs, the reliability requirements of the tasks, and k are set to 50%, 1-3, and 10, respectively.
Figure 5a demonstrates that the total cost of all approaches increases with the proportion of large tasks. This is because large tasks incur much greater time delay and resource cost than small tasks, regardless of whether they are offloaded to ESs or CSs. From Figure 5b,c, it can be seen that the delay cost and resource cost of all approaches increase with the proportion of large tasks. Moreover, in all cases, the total cost, delay cost, and resource cost of PBTO still closely follow those of OPT and outperform the other two approaches. In Figure 5d, there is no significant difference in how the execution times of the approaches change as the proportion of large tasks increases.
Observing Table 17, we find that PBTO deviates minimally from OPT in total cost and significantly in execution time. Over all cases, PBTO sacrifices 0.0008% of the total cost but gains an 89.90% reduction in overall execution time compared to OPT. ETO and ATO show results similar to those in Table 14, i.e., low time overhead but large total cost deviation.

(5) The impact of the reliability requirements of the tasks (Set #5)

In Set #5, we compare the performance of the four approaches while increasing the number of primary-backup task instances from 1 to 5. The total numbers of tasks and servers are fixed at 500 and 250, respectively. The proportion of ESs, the proportion of task types, and k are set to 50%, 50%, and 10, respectively.
Figure 6a shows that the total cost of all approaches increases with the number of primary-backup task instances. For a task, each additional backup, whether offloaded to an ES or a CS, incurs a certain time delay and resource cost. As shown in Figure 6b,c, the delay cost and resource cost of all four approaches increase with the number of primary-backup instances.
Figure 6d illustrates that as L[j] increases from 1 to 5, the execution time of OPT first decreases rapidly and then gradually increases. A possible reason is that when L[j] = 1, there are more assignable servers for each task, and OPT takes more time to obtain the optimal solution. When L[j] increases to 2, the number of assignable servers for each task is effectively halved; thus, the execution time of OPT decreases rapidly. However, as L[j] continues to increase, although the number of assignable servers decreases, the number of primary-backup instances that need to be offloaded increases, so the execution time of OPT gradually increases again. The execution time of PBTO changes similarly to that of OPT but is much smaller, for reasons similar to those explained for Figure 2d. The execution times of ETO and ATO change very slowly, with no significant difference.
Observing Table 18, we find that PBTO deviates minimally from OPT in total cost and substantially in execution time. Over all cases, PBTO sacrifices 0.0006% of the total cost but gains an 82.91% reduction in overall execution time compared to OPT. ETO and ATO yield results similar to those in Table 14, i.e., low time overhead but large total cost deviation.

(6) The impact of k (Set #6)

In Set #6, we evaluate the effect of k on the total cost and execution time of PBTO. Parameters n and m are fixed at 500 and 250, respectively. The reliability requirements of the tasks, the proportion of large tasks, and the proportion of ESs are set to 1-3, 50%, and 50%, respectively.
(6) The impact of k (Set #6) In set #6, we evaluated the effects of k on the total cost and execution time of PBTO. Parameters n and m are fixed at 500 and 250, respectively. The reliability requirements of the tasks, the proportion of large tasks, and the proportion of ESs were set to 1-3, 50%, and 50%, respectively.
Figure 7a shows that the total cost of PBTO does not change much as k increases. However, in Figure 7b, the execution time of PBTO decreases slightly and then increases gradually as k grows. The reason may be that when the number of candidate servers is relatively small, PBTO takes more time to obtain a feasible solution, and the execution time decreases as the number of candidate servers gradually increases. Nevertheless, when the number of candidate servers increases further, PBTO takes more time to select the best solution from the many feasible ones. It is worth noting that although the execution time of PBTO increases with k, the overall growth trend is relatively flat; even at k = 25, the execution time is only 385 ms, which is acceptable for general IoT tasks. Additionally, increasing the value of k does not improve the total cost much, as shown in Figure 7a; the total cost remains around 183 G$ in all cases. Therefore, it is not necessary to set k too large; in our scenario, setting k to 10 is appropriate. Since OPT does not reduce the number of candidate servers, its execution time is not affected by k.
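The trade-off above can be made concrete with a rough count of the ILP's decision variables. The sketch below is our own back-of-envelope illustration, not code from the paper: it compares the number of binary task-to-server assignment variables in the full problem against the problem after each task's candidate set is pruned to its k cheapest servers, using set #6's parameters n = 500, m = 250, and k = 10.

```python
def assignment_variable_counts(n_tasks: int, n_servers: int, k: int):
    """Count binary task-to-server assignment variables before and
    after pruning each task's candidates to its k cheapest servers."""
    full = n_tasks * n_servers            # one variable per (task, server) pair
    pruned = n_tasks * min(k, n_servers)  # k candidates per task after preprocessing
    saving = 1 - pruned / full            # fraction of variables eliminated
    return full, pruned, saving

# With set #6's parameters, pruning eliminates 96% of the variables.
full, pruned, saving = assignment_variable_counts(500, 250, 10)
```

This suggests why PBTO's execution time stays flat as k grows: the solver only ever sees about n × k assignment variables, so increasing k from 5 to 25 enlarges the problem far less than increasing m would in the unpruned ILP.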
Observing Table 19, we find that PBTO deviates minimally from OPT in total cost and significantly from OPT in execution time. In all cases, PBTO sacrifices 0.0004% of the total cost but gains an 87.36% reduction in overall execution time compared to OPT.

Conclusions
In an ECC system, offloading tasks to ESs or CSs quickly and efficiently is an important yet challenging problem. In this paper, we propose a task offloading method for heterogeneous scenarios involving multiple tasks, multiple ESs, and multiple CSs, which takes minimizing delay cost and resource cost as the optimization objective while considering the reliability requirements of tasks and the resource capacity constraints of servers. First, we formally define the TOEC problem by extending GMRA and formulating it as an ILP. Then, we design the PBTO algorithm to obtain a near-optimal solution quickly. Finally, we carry out a series of simulation experiments comparing the PBTO algorithm with the optimal method and two heuristic methods.
The experimental results show that the execution time of the proposed PBTO algorithm is reduced by 87.23%, while the total cost increases by only 0.0004% compared to the optimal method. The two heuristic methods, although better than PBTO in terms of time performance, produce much lower-quality solutions. In all cases, the total cost of both heuristic methods is more than double that of the optimal method, making them difficult to apply in real-world IoT task offloading scenarios.
In conclusion, the proposed method is suitable for large-scale IoT task offloading scenarios with strict low-delay requirements and can also be applied to other related scenarios, such as crowdsourcing task allocation, micro-services deployment, drone swarm scheduling, and task offloading in vehicle edge computing.
However, our approach does not take into account dynamically changing resource prices and network conditions, and it ignores the offloading scenario where an IoT task is composed of multiple interdependent subtasks. Our future research will focus on the following aspects: (1) extending the task offloading model to characterize energy consumption and exploring the impact of dynamically changing network conditions on delay, resource cost, and energy consumption; (2) considering a scenario where service providers dynamically change resource prices over time and demand in pursuit of greater benefits, studying a time-varying multidimensional resource dynamic pricing model and designing corresponding task offloading algorithms; and (3) since in real-world application scenarios an IoT task generally consists of multiple subtasks, studying how to reasonably divide a task into multiple subtasks and proposing an offloading algorithm for the edge-cloud collaboration environment based on the dependencies between subtasks.


Definition 11.
Let q_ij denote the element located in the ith row and jth column of the cost matrix Q, and let Q_j^C denote the jth column of Q, with |Q_j^C| = m. Let min_k Q_j^C be an ordered vector of the k smallest elements picked from Q_j^C, where |min_k Q_j^C| = k and k ≤ m. The row index vector V_{k,j}^C is used to record the corresponding row index in Q for each element in min_k Q_j^C.
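A minimal sketch of this candidate selection, assuming the cost matrix Q is stored row-major as a list of lists (the function name and data representation are ours, not the paper's):

```python
import heapq

def min_k_column(Q, j, k):
    """Return (min_k Q_j^C, V_{k,j}^C): the k smallest costs in column j
    of cost matrix Q, in ascending order, together with the row index in
    Q that each selected cost came from."""
    column = ((row[j], i) for i, row in enumerate(Q))  # (cost, row index) pairs
    smallest = heapq.nsmallest(k, column)              # k cheapest, ascending
    values = [cost for cost, _ in smallest]
    row_indices = [i for _, i in smallest]
    return values, row_indices
```

For example, with Q = [[5, 9], [1, 8], [3, 7], [2, 6]] and k = 2, column 0 yields the costs [1, 2] coming from rows [1, 3]. Using a heap-based selection keeps the per-column cost at O(m log k) rather than the O(m log m) of a full sort.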

Figure 3 .
Figure 3. Experimental results of experiment Set #2 (varying number of servers). (a) Total cost; (b) delay cost; (c) resource cost; (d) execution time. Note that the dotted lines in the graph indicate the increasing or decreasing trend of the costs.

Figure 6 .
Figure 6. Experimental results of experiment Set #5 (varying reliability requirements of the tasks). (a) Total cost; (b) delay cost; (c) resource cost; (d) execution time.

Table 1 .
Comparison of Related Works.

Table 2 .
Primary-backup instances of the tasks.

Table 3 .
Resource requirements and related attributes of the tasks.

Table 4 .
Resource capacity and unit price of the servers.

Table 5 .
The delay costs of offloading tasks to the servers for processing.

Table 6 .
The resource costs of offloading tasks to the servers for processing.

Table 7 .
The costs incurred by the servers to perform the tasks.

Table 12 .
Attributes of the Task Settings.

Table 13 .
Attributes of the Server Settings.

Table 14 .
The percentage deviation in total cost and execution time of experiment Set #1.

Table 15 .
The percentage deviation in total cost and execution time of experiment Set #2.

Table 16 .
The percentage deviation in total cost and execution time of experiment Set #3.

Table 17 .
The percentage deviation in total cost and execution time of experiment Set #4.

Table 18 .
The percentage deviation in total cost and execution time of experiment Set #5.

Table 19 .
The percentage deviation in total cost and execution time of experiment Set #6.
