Collaborative Task Offloading and Service Caching Strategy for Mobile Edge Computing

Mobile edge computing (MEC), which sinks the functions of cloud servers to the network edge, has become an emerging paradigm for resolving the conflict between delay-sensitive tasks and resource-constrained terminals. Task offloading assisted by service caching in a collaborative manner can reduce delay and balance the edge load in MEC. Owing to the limited storage resources of edge servers, developing a dynamic service caching strategy that adapts to actual, time-varying user demands during task offloading is a significant issue. Therefore, this paper investigates the collaborative task offloading problem assisted by a dynamic caching strategy in MEC. Furthermore, a two-level computing strategy called joint task offloading and service caching (JTOSC) is proposed to solve the formulated optimization problem. The outer layer of JTOSC iteratively updates the service caching decisions based on Gibbs sampling. The inner layer adopts a fairness-aware allocation algorithm and an offloading revenue preference-based bilateral matching algorithm to obtain a good computing resource allocation and task offloading scheme. The simulation results indicate that the proposed strategy outperforms four comparison strategies in terms of maximum offloading delay, service cache hit rate, and edge load balance.


Introduction
With the rapid development of wireless network technology, a large number of computation-intensive and delay-sensitive applications have emerged, such as autonomous driving, face recognition, and virtual/augmented reality (VR/AR) [1,2]. The restricted computing performance and storage resources of mobile terminals limit the further development of these emerging applications [3,4]. The traditional solution is to offload such application tasks to a cloud server for centralized processing, which leads to long transmission times because of the cloud's remote location [5]. Mobile edge computing (MEC) is an emerging paradigm that sinks the functions of cloud servers to the network edge and provides users with the required services and computing capacity there. As an important technology in mobile edge computing, task offloading overcomes the limitations caused by the insufficient capability of the terminal and relieves core network pressure [6].
As the infrastructure extending cloud services to the edge side, edge servers are required to be modular and miniaturized. To meet the needs of different application scenarios, edge servers should be fully decoupled into computation, storage, communication, management, and other components. Besides, edge servers are designed to be compact in size. All of these factors limit the resources of edge servers; compared with powerful cloud servers, the capability gap can reach several orders of magnitude [7,8]. When the number of users increases, on the one hand, a single server cannot support all user tasks, resulting in poor user experience; on the other hand, the load is unevenly distributed among multiple edge servers, causing some edge servers to be overloaded while others sit idle. Therefore, it has become a trend for multiple edge servers to perform task offloading collaboratively while considering the computation load balance among them [9,10]. However, these reported works do not consider the limitation that service caching imposes on task offloading, which can cause task execution to fail in practical scenarios.
Service caching refers to caching the application programs, databases, and libraries required for task execution; only edge servers with the relevant services cached can execute the corresponding user tasks [11]. These services can be downloaded from the remote cloud when user tasks arrive, or they can be cached in the MEC beforehand. Temporarily downloading a service from the cloud server can take tens of seconds [12]. Therefore, caching the various services in advance can effectively reduce the initial delay. Most reported task offloading works in MEC ideally assume that edge servers cache all required services, but actual edge servers have constrained storage resources, so the types of cached services must be chosen wisely [13,14]. Furthermore, fixed service caches are also unsuitable for users with dynamic requirements. Thus, it is necessary to devise an efficient and dynamic caching strategy according to the actual task requirements.
In addition, many current works focus on better overall benefits, such as less total delay [15], smaller energy consumption [16], or lower system cost [17]. A solution that only guarantees the overall system benefits may result in unfair treatment of the individual users, which will lead to poor user experience. Hence, fairness among users is also an important issue in MEC [18][19][20].
To solve the problems mentioned above, this paper investigates collaborative task offloading assisted by a dynamic caching strategy, considering user fairness and edge load balance in MEC.
The main contributions of this paper are summarized as follows.

1. We construct a two-layer collaborative MEC system model. To satisfy the feasibility constraints of task execution, the services of various emerging applications are dynamically cached in advance at the edge servers;
2. To ensure fairness among users to a certain extent, the optimization goal is to minimize the maximum delay over all users. A JTOSC algorithm that jointly considers adaptive dynamic service caching, efficient collaborative task offloading, and fair computation resource allocation is proposed;
3. To simplify the solution, JTOSC is decoupled into an outer and an inner subproblem. The outer layer iteratively updates the service caching decisions based on Gibbs sampling. The inner layer uses fairness-aware resource allocation and offloading-revenue-based preferences to obtain a sensible computing resource allocation and task offloading scheme, respectively. Simulation results verify the effectiveness of the proposed strategy.
The remainder of this paper is organized as follows. In Section 2, we review the related works. In Section 3, we describe the system model, and the optimizing problem is formulated. In Section 4, we detail the scheme design of joint task offloading and service caching based on edge collaboration. Section 5 evaluates and analyzes the performance of the proposed strategy. Finally, some conclusions for the work are drawn in Section 6.

Related Works
Currently, task offloading has become a critical issue in mobile edge computing. In [21], an efficient task offloading management scheme in a densely deployed small cell network was studied, using a genetic algorithm and a particle swarm algorithm to jointly optimize the offloading decision, spectrum resources, transmit power, and computing resource allocation to minimize the energy consumption of users. With the same optimization objective as [21], multi-user partial computation offloading based on Lyapunov optimization, integrating energy harvesting (EH) technology, was presented in [22] to achieve long-term operation of the terminal. The task dependency model for multiple users was considered in [23], which focused on addressing the combination of offloading decisions among tasks and their strong coupling with resource allocation to minimize the weighted sum of energy consumption and delay for users. It was pointed out in [24] that cooperation among MECs could yield huge performance gains while balancing the computational load. From the perspective of game theory, efficient vehicle task offloading was achieved in [25] through thermal-aware MEC collaboration based on the analysis of vehicle users' running trajectories, significantly reducing the task completion delay. The horizontal cooperation of multiple MEC-BSs was proposed in [26], further offloading additional tasks to the remaining MEC-BSs to enhance computation offloading performance. In [27], horizontal cooperation among edge servers and three-layer vertical cooperation were considered during task offloading; to reduce the average task duration, the offloading decisions and computing resource allocation were optimized using the alternating direction method of multipliers and difference-of-convex-functions programming. Deep reinforcement learning was applied to achieve privacy-preserving task offloading in mobile blockchains in [28].
The above research works assumed that each edge server caches all services and can handle any type of computing task. However, it is difficult for an actual edge server to cache all services, as its storage resources are limited. Therefore, it is necessary to develop a suitable service caching strategy according to the actual task requirements. Relevant research has been devoted to the edge service caching problem. In [19], service caching was used as a constraint limiting the computation offloading locations of user tasks, but the service types on each edge server were fixed, which did not fit dynamic task requirements. An adaptive edge caching scheme based on location awareness was designed in [29] to optimize the hit rate of the caching strategy by predicting the popularity of content. In [30], multi-dimensional features such as historical and future data information, social relationships, and geographical location were further considered to design the popularity model and reduce prediction errors. However, this causes all edge nodes to prefer caching popular services, while relatively unpopular services are handled only by the cloud server, resulting in high transmission delay. In [31], a service caching strategy and a task offloading policy based on the ε-greedy strategy and the Gibbs sampling principle, respectively, were proposed to reduce the computing delay. As horizontal collaboration among edge servers was not taken into account, resource utilization among edge devices remained low. In [32], a decentralized cooperative service placement algorithm (CSP) was proposed, improving Gibbs sampling as a service caching strategy to maximize the system utility under full and non-full cellular cooperation. However, the computing resource limitation of edge servers was not considered.
In contrast to the above works, this paper studies the collaborative task offloading problem assisted by a dynamic caching strategy in MEC, jointly considering MEC collaboration, wise service caching, balanced task offloading, and fair resource allocation, which guarantees strict execution delay under the constrained computation and storage resources of edge servers.

Network Model
As shown in Figure 1, we consider a two-layer collaborative MEC network model. It consists of N mobile terminal users (TUs) and M wireless base stations (BSs). Each TU is connected to its associated BS via a wireless link, and the BSs communicate with each other through wired links. Each BS is equipped with an MEC server, serving as an edge node that provides certain computing and storage resources. The execution of each user task depends on the required service, and the type of service corresponds to the type of task. Emerging applications are all treated as user tasks, so the whole system includes application service types such as cognitive assistance, autonomous driving, online games, security monitoring, VR/AR, video conferencing, 3D modeling, and so on. The concept of the BS is treated as equivalent to the MEC in the subsequent sections.
The continuous time is divided into T separate slots, where slot t represents the t-th slot. In each slot, the locations of the TUs and the transmission channel conditions are considered fixed [33]. To simplify the model analysis, it is assumed that each user has only one mobile terminal and generates one computing task per time slot. The task can either be processed locally or offloaded to an edge server for computing. If the TU makes an offloading decision, the task is first uploaded to its associated BS, and it can be handled there provided that the BS has sufficient computing resources and has cached the relevant service. Otherwise, the task is further forwarded to a nearby collaborative BS with the required service and computing capacity. Besides, the associated BS refers to the base station that is closest to a TU and has the best channel condition in the current time slot.

The sets of BSs and TUs are denoted by M = {1, 2, . . . , M} and N = {1, 2, . . . , N}, respectively. In a slot, the TU n generates a computation task, which is given by I_n = {D_n, λ_n, S_n, t_n^max}. D_n indicates the size of the input data of the task, and λ_n represents the number of CPU cycles required by the task.
S_n denotes the type of service required by the task, and t_n^max is the maximum delay limit of the task. The set of computing tasks generated by all TUs is I = {I_1, I_2, . . . , I_N}, and the set of service types available in the whole scenario is S = {S_1, S_2, . . . , S_L}. The set of TUs associated with base station m is N_m; if user n is associated with base station m, then n ∈ N_m. The main symbols and their definitions are summarized in Table 1.
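For concreteness, the task tuple above can be mirrored in a simple data structure (an illustrative sketch; the field values below are hypothetical, loosely following the simulation settings later in the paper):

```python
from dataclasses import dataclass

@dataclass
class Task:
    """Computation task I_n = {D_n, lambda_n, S_n, t_n_max}."""
    D_n: float      # input data size of the task (bits)
    lam_n: float    # number of CPU cycles required by the task
    S_n: int        # index of the service type required by the task
    t_max_n: float  # maximum delay limit of the task (seconds)

# Hypothetical example: a 420 KB task needing 1000 Megacycles and service type 2
I_n = Task(D_n=420 * 8 * 1024, lam_n=1e9, S_n=2, t_max_n=1.0)
```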

Communication Model
Each TU is connected to its associated BS via a wireless link. Orthogonal Frequency Division Multiple Access (OFDMA) is used in each cell, and each TU transmits its task over an orthogonal channel, so intra-cell interference can be ignored. Besides, to simplify the problem, inter-cell interference is not considered, since interference management is not the focus of this paper. We define R_nm as the uplink transmission rate from user n to its associated BS m; its value depends on the number of TUs associated with the BS. Assuming that the TUs connected to the same base station share the communication resources equally, R_nm can be expressed as

R_nm = (W / |N_m|) log2(1 + P_n h_nm / σ²), (1)

where W is the available spectrum bandwidth, P_n and h_nm represent the uplink transmission power and the channel gain between user n and its associated base station, respectively, σ² is the additive white Gaussian noise power, and |N_m| is the number of TUs associated with BS m.
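Under the equal-sharing assumption above, the uplink rate R_nm can be sketched as follows (a minimal sketch; the numeric arguments are illustrative, not the paper's simulation values):

```python
import math

def uplink_rate(W, P_n, h_nm, sigma2, num_users):
    """Shannon-capacity uplink rate with the bandwidth W shared equally
    among the |N_m| TUs associated with BS m."""
    return (W / num_users) * math.log2(1 + P_n * h_nm / sigma2)

# Illustrative: 20 MHz shared by 5 TUs, 100 mW transmit power
r = uplink_rate(W=20e6, P_n=0.1, h_nm=1e-7, sigma2=1e-13, num_users=5)
```

As the |N_m| term in the denominator suggests, the per-user rate shrinks proportionally as more TUs associate with the same BS.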

Computation Offloading Model
Assume the tasks generated by each TU are inseparable; each task is executed locally, offloaded to its associated BS, or further offloaded to a collaborative BS for computation. Define X = {x_{mk,I_n} | m ∈ M, k ∈ M ∪ {0}, n ∈ N_m} as the task offloading strategy of the system. x_{mk,I_n} ∈ {0, 1} is the offloading decision variable for user n, where x_{mk,I_n} = 1 indicates that the task I_n associated with BS m is executed by k; otherwise, x_{mk,I_n} = 0. In addition, k = 0 indicates I_n is performed locally, k = m indicates I_n is executed by its associated BS m, and k ∈ M\{m} indicates I_n is computed by a non-associated collaborative BS k. The task offloading decision must satisfy

∑_{k∈M∪{0}} x_{mk,I_n} = 1, ∀m ∈ M, n ∈ N_m. (2)

Local Computing
Assume the computing capability (i.e., the CPU cycles per second) of user n is denoted by f_n^L. Accordingly, the local computing delay of the task I_n can be expressed as

T_n^L = λ_n / f_n^L. (3)

Associated Base Station Computing
If a TU executes a task on its associated BS, the whole offloading delay includes three parts: the uploading time T_nm^tr = D_n / R_nm, the computing time T_nm^exe at the associated BS m, and the downloading delay of the computation results. Since the computation results are usually much smaller than the input data and the downlink transmission rate is very high, we ignore the downloading delay [18]. Besides, we define the computing resource allocation strategy of the edge servers as F = {f_mn | m ∈ M, n ∈ N_m^exe}, where f_mn represents the computing resources allocated by edge server m to user n, and N_m^exe represents the set of tasks performed by m. The tasks in N_m^exe include those hit by its local cache and those offloaded from other collaborative BSs. Due to the limited computing capabilities of edge servers, the resources allocated to users cannot exceed the total resources, i.e., ∑_{n∈N_m^exe} f_mn ≤ f_m must be satisfied. In this case, the computing time at the associated BS is T_nm^exe = λ_n / f_mn. Consequently, the total execution delay at the associated BS m can be expressed as

T_nm = T_nm^tr + T_nm^exe = D_n / R_nm + λ_n / f_mn. (4)

Non-Associated Collaborative Base Station Computing
The computation delay at a non-associated collaborative BS includes four parts: the uploading time T_nm^tr, the transmission time T_mk^tr from the associated BS m to the collaborative BS k, the computing time T_nk^exe at k, and the negligible downloading delay. Define the transmission rate between m and k as a fixed value R_mk; then T_mk^tr = D_n / R_mk. According to the computing resource allocation strategy, the computing resources allocated by the collaborative BS k to user n are f_kn, so T_nk^exe = λ_n / f_kn. Therefore, the total execution delay at the non-associated collaborative BS k can be expressed as

T_nk = T_nm^tr + T_mk^tr + T_nk^exe = D_n / R_nm + D_n / R_mk + λ_n / f_kn. (5)
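The three execution modes above (local, associated BS, collaborative BS) can be combined into one delay helper (a simplified sketch assuming the allocated resources are already known; queuing and result download are ignored, as in the model):

```python
def task_delay(k, m, D_n, lam_n, f_local, R_nm, R_mk, f_alloc=None):
    """Execution delay of task I_n when executed at location k:
    k == 0 -> local TU; k == m -> associated BS; otherwise -> collaborative BS k."""
    if k == 0:
        return lam_n / f_local                      # local computing only
    upload = D_n / R_nm                             # wireless hop TU -> BS m
    if k == m:
        return upload + lam_n / f_alloc             # compute at associated BS
    return upload + D_n / R_mk + lam_n / f_alloc    # extra wired hop m -> k
```

With the same allocated resources, the collaborative-BS delay exceeds the associated-BS delay by exactly the wired transfer time D_n / R_mk.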

Service Caching Model
Only when the relevant application services are cached in advance can the corresponding computing tasks be executed by an edge server. We define the service caching strategy of the edge servers as C = {c_{m,s} | m ∈ M, s ∈ S}. c_{m,s} is the service caching decision variable of server m, where c_{m,s} = 1 indicates that server m caches service s; otherwise, c_{m,s} = 0. Due to the limited storage resources of the MEC server, the total size of the services cached by each MEC cannot exceed its capacity. Therefore, we have the following caching decision constraint:

∑_{s∈S} c_{m,s} D_s ≤ K_m, ∀m ∈ M, (6)

where D_s is the data size of service s and K_m is the storage capacity of edge server m.
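Constraint (6) amounts to a per-server storage feasibility check (a minimal sketch; the service sizes and capacity below are hypothetical):

```python
def caching_feasible(c_m, service_size, K_m):
    """Check constraint (6): the total size of the services cached by
    server m must not exceed its storage capacity K_m."""
    return sum(D_s for s, D_s in service_size.items() if c_m.get(s, 0) == 1) <= K_m

# Hypothetical sizes (GB): caching services 1 and 3 on a 10 GB server
sizes = {1: 4.0, 2: 7.0, 3: 5.0}
ok = caching_feasible({1: 1, 3: 1}, sizes, K_m=10.0)
```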

Problem Formulation
A TU generates one computing task per time slot, which can be executed locally or offloaded to its associated BS or to a collaborative BS that has cached the required service and has sufficient computing resources. Assuming that the TUs can perform all tasks generated by themselves locally, the actual computation delay of the task I_n is

T_n = x_{m0,I_n} T_n^L + c_{m,s_n} x_{mm,I_n} T_nm + ∑_{k∈M\{m}} c_{k,s_n} x_{mk,I_n} T_nk. (7)

We develop the joint optimization of the collaborative offloading strategy X, the computation resource allocation strategy F, and the service caching strategy C with consideration of user fairness, where fairness is reflected by minimizing the maximum actual delay T_n over all users. Accordingly, the objective problem can be formulated as

P1: min_{X,F,C} max_{n∈N} T_n
s.t. C1: ∑_{s∈S} c_{m,s} D_s ≤ K_m, ∀m ∈ M
     C2: ∑_{k∈M∪{0}} x_{mk,I_n} = 1, ∀m ∈ M, n ∈ N_m
     C3: ∑_{n∈N_m^exe} f_mn ≤ f_m, ∀m ∈ M
     C4: f_mn ≥ 0, ∀m ∈ M, n ∈ N_m^exe
     C5: c_{m,s} ∈ {0, 1}, ∀m ∈ M, s ∈ S
     C6: x_{mk,I_n} ∈ {0, 1}, ∀m ∈ M, k ∈ M ∪ {0}, n ∈ N_m

where constraint C1 indicates that the total size of the services cached by each MEC cannot exceed its capacity; C2 ensures that a task is executed at exactly one of the local TU, the associated BS, or a collaborative BS; C3 denotes that the total computation resources allocated by an MEC cannot exceed its computing capability; C4 means the allocated computation resources are non-negative; C5 states that the service caching decision is a binary variable (a service is either cached or not); and C6 states that the task offloading decision is a binary variable (a task is either offloaded to a location or not).

Joint Optimization Strategy of Task Offloading and Service Caching
In this section, an efficient computation offloading strategy called JTOSC is proposed to achieve the goal of P1. Since the service caching and task offloading variables are binary while the computation resource allocation variables are continuous, problem P1 is a mixed-integer nonlinear programming problem. In addition, c_{m,s} and x_{mk,I_n}, as well as x_{mk,I_n} and f_mn, are coupled with each other, making the objective function non-convex and difficult to tackle. Thus, we decompose P1 into two subproblems, namely the service caching problem and the task scheduling problem, where the task scheduling problem is further divided into task offloading decision and fair resource allocation.

Service Caching Strategy Based on Gibbs Sampling
In the outer layer of JTOSC, the service caching decision of each MEC is determined iteratively based on Gibbs sampling, whose main idea is to sample one variable conditionally in each iteration while keeping the remaining variables fixed. Specifically, the update process of the service caching decision is regarded as an L-dimensional Markov chain. In each round of iteration, an edge server m ∈ M and a feasible caching strategy C*_m ∈ C satisfying the relevant constraints are randomly selected, while the caching strategies of the remaining edge servers remain unchanged. Based on the caching decisions of all edge servers in the previous round and the current round, the task offloading strategies X and X*, the computing resource allocation strategies F and F*, and the objective function values τ and τ* can be calculated for the previous and current rounds, respectively. The conditional probability distribution of the cache update is associated with the optimization goal of P1: the current caching strategy is accepted with probability ρ, and the previous caching strategy is kept with probability 1 − ρ. Eventually, the Markov chain converges to the optimal caching policy with high probability. The service caching strategy is shown in Algorithm 1.

Algorithm 1: Gibbs-sampling-based service caching
1: Initialize a feasible caching strategy for every edge server;
2: for each iteration l do
3:   Randomly select an edge server m ∈ M and a feasible caching strategy C*_m satisfying constraint C1;
4:   Based on the previous round caching policy {C^l_1, C^l_2, . . . , C^{l−1}_m, . . . , C^l_M}, compute the task offloading strategy X, the resource allocation strategy F, and the objective function values τ and τ_ave;
5:   Based on the current round caching policy {C^l_1, C^l_2, . . . , C*_m, . . . , C^l_M}, compute the task offloading strategy X*, the resource allocation strategy F*, and the objective function values τ* and τ*_ave;
6:   Let C^l_m = C*_m with probability ρ = 1 / (1 + e^{(τ*−τ)/w});
7:   Let C^l_m = C^{l−1}_m with probability 1 − ρ;
8: end for

When the outer-layer service caching decision is determined, the original optimization problem P1 reduces to the inner-layer task scheduling problem P2.

P2: min_{X,F} max_{n∈N} T_n, s.t. C2, C3, C4, C6.
In optimization problem P2, the task offloading strategy X is coupled with the computation resource allocation strategy F , where F depends on the result of X , and X needs to be further adjusted and optimized according to the result of F . We consider solving these two coupled problems alternatively by fixing one of the result terms.
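The outer-layer update (Algorithm 1) can be sketched as follows; `evaluate` is an assumed stand-in for solving the inner-layer problem and returning the objective value τ, `feasible_caches[m]` is assumed to list the cache configurations of server m satisfying constraint C1, and the acceptance probability follows ρ = 1/(1 + e^{(τ*−τ)/w}), with clamping added to avoid floating-point overflow:

```python
import math
import random

def gibbs_caching(servers, feasible_caches, evaluate, iters=300, w=0.01):
    """Outer-layer service caching via Gibbs sampling (sketch of Algorithm 1).

    In each iteration, one randomly chosen server proposes a new feasible
    cache while the others are held fixed; the proposal is accepted with
    probability rho = 1 / (1 + exp((tau_new - tau) / w)).
    """
    caching = {m: random.choice(feasible_caches[m]) for m in servers}
    tau = evaluate(caching)
    for _ in range(iters):
        m = random.choice(servers)                 # update one server, hold the rest
        candidate = dict(caching)
        candidate[m] = random.choice(feasible_caches[m])
        tau_new = evaluate(candidate)
        z = max(min((tau_new - tau) / w, 700.0), -700.0)  # clamp exp argument
        rho = 1.0 / (1.0 + math.exp(z))
        if random.random() < rho:
            caching, tau = candidate, tau_new
    return caching, tau
```

A small w makes the chain nearly greedy; a larger w lets it escape local optima at the cost of slower convergence.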

Computing Resource Allocation Based on Fairness Perception
We define the fairness of TUs from the perspective of user experience, reflected by minimizing the maximum actual delay T_n over all users. Specifically, we propose a fairness-aware computing resource allocation strategy that fairly allocates all computing resources to the TUs. By initializing the task offloading decision X, P2 is simplified to the computing resource allocation problem P3:

P3: min_F max_{n∈N} (∑_{k∈M} c_{k,s_n} x_{mk,I_n} λ_n / f_kn + Q_n).

Given the service caching decision and the task offloading decision, the second term Q_n in P3 is a fixed value, which can be expressed as Q_n = x_{m0,I_n} λ_n / f_n^L + c_{m,s_n} x_{mm,I_n} D_n / R_nm + ∑_{k∈M\{m}} c_{k,s_n} x_{mk,I_n} (D_n / R_nm + D_n / R_mk), where N^off is the set of all TUs offloaded to MECs, and N_k^exe is the set of TUs offloaded to MEC k. Meanwhile, since both the caching decision and the offloading decision are binary and only one of the offloading decision variables (x_{m0,I_n}, x_{mm,I_n}, and x_{mk,I_n}) equals 1, for an offloaded task we have ∑_{k∈M} c_{k,s_n} x_{mk,I_n} λ_n / f_kn + Q_n = λ_n / f_kn + Q_n ≤ τ, where λ_n / f_kn is the computation delay at MEC k and its value is non-negative. Then 0 ≤ λ_n / f_kn ≤ τ − Q_n, and this constraint on f_kn can be transformed into f_kn ≥ λ_n / (τ − Q_n) ≥ 0.
Only when all computing resources are put to work can each TU be allocated relatively more computing resources from the MEC and obtain higher-quality performance. Therefore, the resources of each MEC are fully utilized, i.e., ∑_{n∈N_k^exe} f_kn = f_k. At this point, the computing resource allocation problem is transformed into

min τ, s.t. C7: ∑_{n∈N_k^exe} λ_n / (τ − Q_n) ≤ f_k, ∀k ∈ M,

where the left-hand side of constraint C7 is a monotonically decreasing function of τ, τ_min = max_{n∈N_k^exe} Q_n, and τ_max = ∑_{n∈N_k^exe} (λ_n / f_k + Q_n). The bisection method is used to compute the optimal objective value τ within these upper and lower bounds. The computing resource allocation process is shown in Algorithm 2.
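The bisection search of Algorithm 2 can be sketched as follows (a minimal sketch; each task executed on MEC k is described by its pair (λ_n, Q_n), and `tol` is a hypothetical stopping tolerance):

```python
def allocate_resources(tasks, f_k, tol=1e-6):
    """Fairness-aware allocation on MEC k via bisection on the common
    deadline tau (sketch of Algorithm 2).

    tasks: list of (lam_n, Q_n); the demand sum(lam_n / (tau - Q_n)) is
    monotonically decreasing in tau, so we bisect between tau_min and tau_max.
    """
    lo = max(Q for _, Q in tasks)                    # tau_min: below this, demand blows up
    hi = sum(lam / f_k + Q for lam, Q in tasks)      # tau_max: always feasible
    while hi - lo > tol:
        tau = (lo + hi) / 2
        demand = sum(lam / (tau - Q) for lam, Q in tasks)
        if demand > f_k:
            lo = tau                                 # too tight: relax the deadline
        else:
            hi = tau                                 # feasible: try a tighter deadline
    tau = hi
    return tau, [lam / (tau - Q) for lam, Q in tasks]  # f_kn = lam_n / (tau - Q_n)
```

At the optimum the allocations exhaust the server, i.e., the returned f_kn sum to (approximately) f_k, matching the full-utilization condition above.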

Bilateral Matching Task Offloading Based on Revenue Preference
In the previous section, a fixed task offloading strategy was used to allocate computing resources. However, the offloading scheme must be adjusted continuously according to a reliable offloading strategy. At this point, the optimization problem is transformed into minimizing max_{n∈N} T_n over X, where the value of T_n is given in Equation (7). The set of BSs that cache the service required by task I_n is defined as M_n^candidate. The locations where the task can be executed include the local TU and any MEC m satisfying m ∈ M_n^candidate. Each TU sends an offloading request to its associated BS at the beginning of a time slot, and the set of offloading requests received by the associated BS is defined as N_m^req, which includes the tasks offloaded by the associated TUs and by collaborative BSs. If the associated BS m belongs to M_n^candidate, that is, its local cache hits the service required by task I_n, the task is added to N_m^candidate; otherwise, it is added to the set N_m^no. The initial task offloading scheme assumes that all tasks in N_m^candidate are executed by MEC m, each task in N_m^no sends its offloading request to the collaborative BS with the highest preference value in M_n^candidate, and the collaborative BS executes all tasks received. Meanwhile, the computing resource allocation of the TUs is computed by Formula (14). Thus, the initial service caching, task offloading, and computing resource allocation schemes are obtained.
With the updated service caching decision, the task offloading strategy adopts a preference-based bilateral matching algorithm to select appropriate offloading locations. The objective value T_n of each TU under the current offloading decision is calculated first. If all TUs meet their maximum delay requirements and no BS exceeds its computing resource constraint, the current offloading scheme is kept. Otherwise, the difference between the maximum latency limit of a task and its actual latency is defined as the task offloading revenue, that is, γ_nm = t_n^max − T_n. For each TU in N_m^off, a preference-based approach is adopted to select an appropriate offloading location. Each task to be further offloaded has a preference value for each candidate offloading location, related to the estimated delay at that location: the larger the estimated offloading delay, the smaller the preference value. Accordingly, a task I_n rejected by MEC m that needs to be further offloaded has a preference value for each collaborative BS k, and likewise a preference value for its local TU determined by the local computing delay. The task I_n sends its offloading request preferentially to the location with the highest preference value. If the requested location is the local TU, the request is accepted directly and x_{m0,I_n} = 1. If the requested location is a collaborative BS, the BS must reply; if the request is rejected, it is sent to the next best offloading location in the next iteration until it is accepted, upon which x_{mk,I_n} = 1. The above process is repeated until all offloading decisions are confirmed, and then the algorithm terminates. The preference-based bilateral matching task offloading process is shown in Algorithm 3.
Algorithm 3: Preference-based bilateral matching task offloading (outline)
1: for m ∈ M do
2:   Split the received requests N_m^req into the cache-hit set N_m^candidate and the cache-miss set N_m^no;
3:   Initial offloading strategy X^0: for tasks in N_m^candidate, let x_{mm,I_n} = 1; tasks in N_m^no request collaborative BSs according to user preferences, with full acceptance of offloading requests;
4:   Initial resource allocation strategy F^0 ← X^0, according to Algorithm 2;
5: end for
6: for n ∈ N do
7:   Compute T_n according to Equation (7);
8:   if the delay and resource constraints of MEC m are satisfied, keep N_m^exe and x_{mm,I_n} = 1;
9:   else compute γ_nm (∀n ∈ N_m^exe), sort γ_nm in descending order, and let the task with the smallest revenue offload to its next preferred location in turn;
10: end for
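The propose-and-reply loop of Algorithm 3 can be sketched in a deliberately simplified form (hypothetical names: `capacity_check` stands in for the BS-side delay and resource test, and each task's candidate list is assumed to be pre-sorted in descending preference value):

```python
def match_offloading(rejected_tasks, capacity_check):
    """Simplified bilateral matching (sketch of Algorithm 3): each task
    rejected by its associated BS proposes to candidate locations in
    descending preference order; a collaborative BS accepts only if
    capacity_check passes, and the local TU always accepts."""
    placement = {}
    for task, prefs in rejected_tasks:
        for loc in prefs:
            if loc == 'local' or capacity_check(loc, task):
                placement[task] = loc   # request accepted: decision confirmed
                break
    return placement

# Hypothetical run: BS 'bs2' only has room for task 't1'
plan = match_offloading(
    [('t1', ['bs2', 'local']), ('t2', ['bs2', 'local'])],
    capacity_check=lambda loc, task: task == 't1',
)
```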

Simulation Setting
Considering an edge computing scenario where four BSs and many users are randomly distributed, each BS is deployed with an MEC server. The system bandwidth is set to 20 MHz, and the background noise power is −100 dBm. The path loss model follows the setting of [17], i.e., L[dB] = 140.7 + 36.7 log10 d[km]. For computing tasks, we consider face detection and recognition applications for airport security and surveillance, which can benefit from collaboration between TUs and the MEC platform [34]. In most simulations, unless otherwise specified, the number of user tasks is 20, the input size of each task is D_n = 420 KB, the number of CPU cycles required by each task is λ_n = 1000 Megacycles, and the computing capability of each MEC is 20 GHz. We assume six types of services, which satisfy all task requirements in the system. The simulation is performed in MATLAB to evaluate the performance of the proposed joint optimization strategy of task offloading and service caching. The main simulation parameters are listed in Table 2.

Strategies Comparison
In order to better evaluate the performance of the proposed strategy, we compared it with the following four task offloading strategies.
(1) Computation Offloading and Resource Allocation algorithm (CORA) [18]. Tasks generated by TUs are computed locally or by the cloud, and the edge servers do not cache any services;
(2) Joint Task Offloading and Resource Allocation algorithm (JTORA) [17]. Task offloading and resource allocation in a multi-user, multi-server scenario are optimized without considering MEC collaboration, using the caching strategy of this paper for service caching;
(3) Optimizing Service Placement and Resource Allocation algorithm (OSPRA) [13]. Service placement and resource allocation are optimized without considering MEC collaboration, using service popularity to greedily cache the relatively more popular services;
(4) Collaborative Data Caching and Computation Offloading (CDCCO) [14]. MECs collaborate with each other for task offloading, and we adopt the dynamic programming algorithm that caches data in the original algorithm for service caching.
The performance of each strategy is evaluated by four indicators: the maximum execution delay of all users, the average execution delay, the number of load tasks, and the local service caching hit ratio of each edge server. The local service caching hit ratio refers to the ratio of hit services number to required services number about the associated BS and its users.

Analysis of Simulation Results
In Figure 2, the maximum delay of all users, which indirectly reflects user fairness, is compared. It can be seen from Figure 2 that TUs generate the largest delay under the CORA strategy because of the weak computing capability of the TUs themselves and the long distance between the TUs and the cloud, leading to high execution delay and high transmission delay, respectively. In the other four strategies, tasks can be offloaded to MEC servers, which offer more resources at a closer distance; hence, their maximum delay of all users is cut down as a result.
Simultaneously, the JTOSC and CDCCO strategies show better performance than the JTORA and OSPRA strategies; the reason is whether the collaboration between MECs is considered. Tasks not hit locally can be offloaded preferentially to the collaborative MECs satisfying their demands rather than directly to the remote cloud, which reduces the transmission delay and balances the edge load. Besides, the JTOSC and JTORA strategies use the iteratively updated, probability-based strategy in this paper to perform service caching, which is better than the dynamic programming cache in CDCCO and the greedy cache in OSPRA. Therefore, the JTORA strategy shows slightly better performance than OSPRA, and the JTOSC strategy displays the best performance.
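The probability-based caching update in JTOSC's outer layer (Gibbs sampling, per the abstract) can be sketched as a single proposal/acceptance step. Everything below is an illustrative sketch: `delay_of(cache)` stands in for the inner-layer evaluation (resource allocation plus offloading), and the temperature `tau` and acceptance rule are assumed forms, not the paper's exact ones.

```python
import math
import random

def gibbs_caching_step(cache, candidate, capacity, sizes, delay_of, tau=0.1):
    """One Gibbs-sampling update of a BS's service caching set (sketch of the
    JTOSC outer layer). Flipping `candidate` in/out of the cache is proposed,
    and the new configuration is accepted with a probability that favors
    lower system delay but still allows occasional uphill moves.
    """
    new_cache = set(cache)
    if candidate in new_cache:
        new_cache.remove(candidate)                 # propose evicting the service
    elif sum(sizes[s] for s in new_cache) + sizes[candidate] <= capacity:
        new_cache.add(candidate)                    # propose caching it if it fits
    else:
        return set(cache)                           # proposal violates capacity
    delta = delay_of(new_cache) - delay_of(cache)   # change in system delay
    # Accepting worse configurations with small probability helps escape
    # local optima; as tau -> 0 this becomes a greedy improvement rule.
    if random.random() < 1.0 / (1.0 + math.exp(delta / tau)):
        return new_cache
    return set(cache)
```

In the full algorithm such a step would be iterated over BSs and services, with the inner layer re-solving resource allocation and offloading for each evaluated caching configuration.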
In Figure 3, the impact of different numbers of users on the average delay of tasks is illustrated, where the average delay is the overall task delay divided by the number of tasks executed. With the increasing number of users, the average delay of all tasks presents an upward trend. More users lead to intensified communication competition among them, which in turn raises the delay slightly in the CORA strategy. Meanwhile, due to the constrained resources of MECs, queuing and further offloading cause redundant waiting delay and transmission delay, respectively, in the other four strategies, leading to more overall delay and average delay. From Figure 3, it can be concluded that the CDCCO and JTOSC strategies show better performance: as the MECs' collaboration provides more computing resources for task offloading, the delay is relatively reduced.
In Figure 4, the impact of the computing capabilities of MEC servers on the maximum delay of all users is illustrated. Improving the computing capabilities has no influence on the CORA strategy, since its edge servers do not cache computing services and cannot participate in computing any user tasks. In the remaining four strategies, as the computing capabilities of the edge servers increase, the computing resources allocated to each user increase, and the computing delay decreases. However, due to the limitation of the storage resources of the edge servers, they are unable to cache more services to perform more tasks, so the downward trend gradually stabilizes. In addition, it can be seen from Figure 4 that the performance difference between the MECs' non-collaboration (OSPRA and JTORA) and collaboration (CDCCO and JTOSC) strategies gradually decreases. This is because the number of user tasks that can be processed by the associated MEC itself increases with greater computing capabilities.
In Figure 5, the impact of the caching capacities of MEC servers on the maximum delay of all user tasks is illustrated. Similarly, increasing the storage capacities of the edge servers does not affect the maximum delay of all users in the CORA strategy, since its edge servers cannot participate in computing any user tasks.
In the remaining four strategies, as the storage capacities of the edge servers increase, the required services will be cached with a greater probability, reducing further offloading to collaborative MECs and the remote cloud, and the maximum delay decreases accordingly. Moreover, it can be seen from Figure 5 that the downward trend gradually becomes stable once the caching capacity reaches about 125 GB, which means that the edge servers are then limited mainly by their own computing resources.
In Figure 6, the number of load tasks executed by each edge server and the cloud under four strategies is compared. The CORA strategy is not compared, since all tasks will be offloaded to the remote cloud for execution under CORA. Neither the OSPRA nor the JTORA strategy considers the horizontal collaboration between edge servers, resulting in an unbalanced load among MECs.
On the contrary, the CDCCO and JTOSC strategies consider the horizontal collaboration among MECs, and their loads are relatively balanced. Besides, the number of tasks performed by each edge server is related to its own service cache hit rate. Most tasks were performed by MECs in JTOSC because of its better iteratively updated service caching strategy.
In Figure 7, the local service cache hit ratio of the edge servers under four strategies is compared. Similarly, the CORA strategy does not participate in the comparison. As we can see, the JTOSC strategy proposed in this paper possesses the highest hit ratio, and the second one is JTORA, indicating that the performance of the proposed caching strategy is excellent. The dynamic programming method for caching in CDCCO is better than the greedy cache in OSPRA, because the greedy cache preferentially chooses popular services, so relatively unpopular services can only be stored in the cloud, resulting in high transmission delay.
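The popularity-greedy baseline criticized here can be sketched in a few lines. This is an illustrative reconstruction in the spirit of the OSPRA-style cache, not the reference's exact algorithm; the popularity values, service sizes, and tie-breaking are assumptions.

```python
def greedy_popularity_cache(popularity, sizes, capacity):
    """Popularity-greedy service caching: repeatedly cache the most popular
    service that still fits in the remaining capacity. Unpopular services
    that are never cached must be served from the cloud, which is the
    weakness discussed in the text.
    """
    cache, used = set(), 0
    for service in sorted(popularity, key=popularity.get, reverse=True):
        if used + sizes[service] <= capacity:
            cache.add(service)
            used += sizes[service]
    return cache

# Six services with skewed popularity and a 100 GB cache budget (assumed values)
pop = {0: 0.35, 1: 0.25, 2: 0.15, 3: 0.12, 4: 0.08, 5: 0.05}
sizes = {0: 40, 1: 35, 2: 30, 3: 25, 4: 20, 5: 15}
cached = greedy_popularity_cache(pop, sizes, 100)
# -> {0, 1, 3}: service 2 is skipped because it no longer fits, and the
#    least popular services 4 and 5 remain cloud-only.
```

The example makes the drawback visible: regardless of how the residual capacity could be used, requests for the uncached services always incur cloud transmission delay.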


Conclusions
In this paper, a collaborative task offloading problem assisted by dynamical service caching in MEC is investigated to reduce the maximum delay of all users by jointly considering the service caching decisions, task offloading decisions, and computing resource allocation. A service caching strategy based on Gibbs sampling is proposed to select appropriate services for caching. Furthermore, a fairness-based computing resource allocation strategy is presented to improve the equity among users. Moreover, an offloading revenue preference-based bilateral matching strategy is introduced to determine offloading locations. The simulation results have demonstrated that the proposed JTOSC can effectively reduce the maximum delay of all users, improve the user experience, and balance the edge load. In this work, it is assumed that all users share communication resources equally, and inter-cell interference is ignored. Communication interference management will be studied in future work. This study can serve as a reference for task offloading in MEC.