1. Introduction
In recent years, with the rapid proliferation of wireless IoT applications such as connected vehicles, VR/AR, smart cities, and personalized video streaming, there has been a surge in computation-intensive, real-time services in mobile networks [1]. These services typically require completion within a short time frame (20–125 ms) [2].
Figure 1 illustrates the workflow of real-time services. Generally, three steps are involved: generation, processing, and transmission [3]. Firstly, mobile users (MUs) such as connected vehicles, AR/VR devices, mobile phones, or drones continuously generate sensor or user data (infrared, radar, video streams, health data, and so on) when requesting specific real-time services. Subsequently, MUs can process these tasks locally or transmit them to server units located in the cloud or at the edge to obtain computing services. The computing operations are carried out with the assistance of application data (executable programs), and the computation results are returned to the MUs before the end of the time frame. For example, vehicles requiring autonomous driving services detect environment data from fusion monitoring devices (in-vehicle cameras, millimeter-wave or ultrasonic radar, etc.) while driving. After preprocessing, these raw data are sent to computing units deployed with corresponding target detection algorithms (or processed directly by locally embedded computing units) to perceive road conditions and provide optimal driving decisions.
This necessitates an efficient, cost-effective service model that minimizes user energy consumption while ensuring service quality. Traditional cloud computing architectures, as a centralized solution, encounter challenges such as severe wide-area network latency and fluctuating service quality when handling these new types of services.
Multi-access edge computing (MEC) [4] emerges as a computing paradigm capable of addressing these issues. It leverages the distributed computing power and communication resources at the network edge, particularly at the edge of the radio access network (RAN), by deploying communication entities such as road side units (RSUs), smart gateways, and small cell base stations to construct a network topology. It deploys corresponding computing and storage units at wireless access points (WAPs) to provide personalized services to MUs. MUs can directly access computing services from edge servers located at WAPs, avoiding traffic bottlenecks in the core and backhaul networks during routing and thus partially alleviating the caching and computational burdens on data centers.
Additionally, MEC offers performance-optimizing technologies such as computation offloading (CO) [5] and edge caching (EC) [6]. Computation offloading, as a service optimization paradigm, allows MUs to delegate complex interactive computing tasks to edge nodes and receive the computation results via the wireless downlinks between WAPs and MUs, thereby greatly alleviating the high computational energy consumption of resource-constrained terminal devices. Edge caching allows application data to be cached at communication nodes, so that different MUs requesting the same service can directly upload only their locally generated real-time data for computation, avoiding redundant transmission of the same application data and thus alleviating high latency and heavy load in the fronthaul network.
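To make this trade-off concrete, the following minimal Python sketch compares the energy of computing a task locally with the energy of uploading it for edge execution; the parameter names, values, and energy models are illustrative assumptions rather than the formulation developed later in this paper.

```python
# Toy comparison of local execution energy versus offloading (upload) energy.
# Parameter names, values, and the energy models are illustrative assumptions,
# not the formulation developed later in this paper.

def local_energy(cycles, freq, kappa=1e-27):
    """Dynamic CPU energy under a common DVFS model: kappa * f^2 per cycle."""
    return kappa * (freq ** 2) * cycles

def offload_energy(bits, tx_power, rate):
    """Radio energy spent uploading the task input over the wireless uplink."""
    return tx_power * bits / rate

task_cycles = 5e8   # CPU cycles required by the task (assumed)
task_bits = 2e6     # input data to upload, in bits (assumed)

e_local = local_energy(task_cycles, freq=1e9)
e_offload = offload_energy(task_bits, tx_power=0.2, rate=10e6)

print(f"local: {e_local:.3f} J, offload: {e_offload:.3f} J")
# Offloading wins here; edge caching further shrinks the upload (and hence
# e_offload) by keeping the application data at the edge.
```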
As an evolution of traditional mobile base stations, the MEC architecture allows for the scheduling and management of parallel communication resources at the network edge layer. This provides a new solution for designing cache data-sharing strategies among MEC nodes over the backhaul network, and it also offers cloud service providers more flexible computation offloading solutions by changing the execution and storage locations of data, further optimizing system performance.
Although the MEC network paradigm can overcome some of the drawbacks of cloud computing and meet the service constraints of real-time applications, many challenges remain. Specifically, from the perspective of data sharing, the most well-known approach in the current edge computing domain is blockchain-based [7] distributed data storage and synchronized data sharing. However, this approach faces serious performance issues for high-throughput applications due to the low performance of blockchain networks, the high costs associated with storing block data, and the scalability challenges posed by large-scale data synchronization. These issues make it difficult to schedule real-time applications.
To address these issues, our work focuses on leveraging performance optimization methods in MEC networks to provide real-time services for MUs. The main contributions of this study can be summarized as follows:
This paper proposes a Distributed Edge Service Caching and Offloading (DESCO) architecture based on collaboration among edge servers to provide real-time services for mobile users. Within DESCO, the optimization problem of minimizing the long-term average energy consumption of users while satisfying latency constraints is formulated, and cache sharing, cache replacement, and computation offloading methods are leveraged to further reduce user costs.
A decentralized consistent hashing-based cache-sharing mechanism named “Cache Chord” is designed. This mechanism leverages the communication backhaul links at the edge layer and expands the DESCO framework by allowing edge servers to self-organize into a circular logical topology, facilitating application data sharing. Additionally, a cache IP list mechanism is devised to link application resource key-values in the overlay network with actual data in the real network.
The real-time computation offloading (RCO) problem is transformed into a multi-player static game among the MUs within the wireless coverage of each server in order to reduce the energy consumption in the current time slot. After proving the existence of a Nash equilibrium (NE) solution for the game, a multi-dimensional discrete particle swarm optimization (MDPSO) algorithm is applied to solve it. Furthermore, the exploration coefficients and iteration rules of the algorithm are designed to meet the environment constraints.
Finally, simulation experiments are conducted to evaluate the performance of the proposed framework and algorithms. The results demonstrate that the proposed offloading method effectively reduces the overall energy consumption at the user layer and achieves better convergence than the baseline algorithms.
The remainder of the paper is organized as follows: Section 2 reviews related work on computation offloading and cache-sharing technology. Section 3 describes the system model of DESCO and introduces the cache-sharing mechanism. In Section 4, we transform the RCO problem into a static game, prove the existence of NE solutions, and then design the MDPSO algorithm. Section 5 presents simulation experiments that demonstrate the advantages of the MDPSO algorithm and Cache Chord. Finally, Section 6 concludes the paper.
2. Related Works
With the development of real-time applications, resource requests generated by terminal devices tend to be independent, and the service requests from MUs exhibit more dynamic characteristics. Specifically, as noted in [8,9], within MU groups the popularity of services follows a Zipf distribution, implying that a single-server-based service must maintain a large cache database to improve hit rates, which significantly increases system redundancy. Simultaneously, computation offloading services impose considerable pressure on the transmission bandwidth of the wireless uplink from end to edge, while the computational capabilities of terminal MU devices are limited and consume a considerable amount of energy [10]. Therefore, coordinating energy consumption and service efficiency to maximize the system's offloading benefits is a worthy issue to explore. Given the limited infrastructure capacity and the limited computational capabilities of user devices in edge networks, the effective use of network resources to improve resource utilization and system performance has become an urgent challenge for network service providers.
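As a brief illustration of the Zipf popularity assumption, the following Python sketch generates a Zipf-shaped popularity profile for a set of services; the number of services and the skew parameter are assumed values.

```python
import numpy as np

def zipf_popularity(num_services: int, alpha: float = 0.8) -> np.ndarray:
    """Zipf popularity profile: P(rank k) is proportional to 1 / k**alpha."""
    ranks = np.arange(1, num_services + 1)
    weights = 1.0 / ranks ** alpha
    return weights / weights.sum()

popularity = zipf_popularity(num_services=50, alpha=0.8)
# The few most popular services attract most requests, so a modest shared
# cache can already serve a large fraction of the MU population.
print(popularity[:5], round(popularity[:5].sum(), 3))
```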
To address this situation, researchers primarily employ two methods: first, designing fine-grained and efficient computation offloading algorithms; and second, considering the heterogeneity and geographical distribution of edge devices, leveraging collaborative services between edge nodes to enable on-demand resource mobility and fully utilize existing computing devices to ensure service quality. The following two subsections explain these approaches in detail.
2.1. Computation Offloading Strategy
Real-time computation offloading (RCO) can enhance the QoE of interactive gaming [11]. In this scenario, MEC servers process real-time action data offloaded by players (MUs) and render them into corresponding in-game scenes. The rendered data are then compressed by a video encoder and transmitted back to users through video streaming. The RCO strategy allows servers to wisely offload service requests from multiple users, maximizing overall user satisfaction. RCO can also be applied in energy-constrained drones [12] and wearable devices [13] to reduce the computational overhead of local devices. Complex neural network or machine learning computations can quickly drain the battery of these devices. Additionally, requests from similar types of devices often exhibit popularity patterns (e.g., health monitoring, object detection, and path planning algorithms). The RCO algorithm can select a strategy that maximizes offloading benefits based on the application categories stored on the server and the user cluster's requests, thereby reducing the computational energy consumption of MUs and extending device battery life.
Multiple mobile users can achieve resource sharing and collaborative computing, and nearby MUs may request similar tasks [14]. Based on this scenario, a fine-grained collaborative computation offloading and caching strategy is proposed to minimize the overall execution latency of MUs within the network [15]. Additionally, the concept of a call graph is utilized to model the offloading and caching relationships among MUs. It is noteworthy that [16] considers software fetching and multicasting in network modeling, mathematically characterizing the processes of data uploading, task execution, and computation result downloading, with cache usage and weighted deadlines as the optimization objectives to be minimized. They employ a joint algorithm combining ADMM and a penalty convex–concave procedure to obtain the optimal offloading strategy.
The ADMM algorithm is also applied in [17] to obtain distributed offloading decisions. In this work, the authors propose a computation offloading scheme in which computational tasks generated by ground users can be computed locally, on Low Earth Orbit (LEO) satellites, or on cloud servers. The authors also consider the limited computational capabilities and coverage time of each LEO satellite. They then investigate the optimization problem of minimizing the total energy consumption of ground users, which is discrete and non-convex, and convert it into a linear programming problem.
In Time-Division Multiple Access (TDMA)-based MEC systems, a partial offloading strategy based on an iterative heuristic algorithm is proposed [18] to minimize the total energy consumption of MUs while ensuring the task delay constraints. This strategy jointly optimizes task offloading rates, channel allocation, and MEC computing resource allocation. The authors decompose the problem into a series of offloading subproblems and design a two-stage algorithm that iteratively solves the offloading task set until the minimum energy consumption is achieved.
When addressing performance optimization problems with computation offloading algorithms, it is observed that the discussion on the economies of scale brought by multi-server clusters is insufficient. Most articles only conduct research based on single-server scenarios, while in reality, MEC servers are often densely deployed in scenarios such as streets, malls, and schools to provide services to users. This cluster effect can be applied to the design of distributed network models and data-sharing algorithms. Wired backhaul links between edges can transmit large amounts of real-time data, which are also beneficial for the on-demand allocation of cached content related to real-time tasks.
2.2. Data-Sharing Mechanism
The authors of [19,20] focus on algorithm design to explore cache-sharing strategies in distributed edge environments. To address the difficulty MUs face in discovering required IoT resources due to device heterogeneity, a fog computing-based resource discovery solution named FDS-RD is proposed [19]. FDS-RD employs a Distributed Hash Table (DHT) to map resources onto a structured peer-to-peer (P2P) network, facilitating resource discovery in large-scale IoT environments.
VCNS [20] is a content-sharing network tailored to the vehicular ad hoc network (VANET) scenario. It presents an edge caching scheme based on cross-entropy and dynamically adjusts caching according to content popularity within the request scope. Additionally, it designs a collaborative content delivery mechanism for RSUs to further reduce system latency overhead. Meanwhile, the authors of [21,22] focus on network logical topology design, exploring both structured and unstructured network topologies. The emphasis is on managing resource nodes to organize decentralized sharing networks that provide cached content to users and optimize system overheads.
Considering cooperative caching among edge servers, ref. [21] proposes a distributed edge data indexing system called EDIndex. In this system, each server maintains a hierarchical counting Bloom filter (HCBF) tree index structure, which indexes the data stored on nearby edge servers, enabling fast querying at the edge. SA-Chord [22] is a purely edge computing-based adaptive distributed overlay network architecture. Based on the Chord protocol, it designs a two-layer circular routing overlay composed of peer nodes and super nodes. Peer nodes only participate in content reception and transmission without routing, while super-node clusters are responsible for maintaining routing tables for message forwarding, achieving decentralized content sharing based on this dual-layer structure.
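To illustrate the consistent-hashing idea underlying Chord-style overlays such as SA-Chord (and the Cache Chord mechanism proposed in this paper), the following minimal Python sketch maps edge servers and application keys onto an identifier ring and resolves each key to its successor node; the server names, hash width, and key names are illustrative assumptions.

```python
import hashlib
from bisect import bisect_right

M = 16  # identifier space of 2**M points (illustrative width)

def ring_id(key: str) -> int:
    """Hash a server address or application key onto the identifier ring."""
    return int(hashlib.sha1(key.encode()).hexdigest(), 16) % (2 ** M)

servers = ["edge-a", "edge-b", "edge-c", "edge-d", "edge-e"]  # assumed names
ring = sorted((ring_id(s), s) for s in servers)
ring_ids = [node_id for node_id, _ in ring]

def successor(app_key: str) -> str:
    """Return the server responsible for a key: the first node encountered
    clockwise from the key's position on the ring."""
    idx = bisect_right(ring_ids, ring_id(app_key)) % len(ring)
    return ring[idx][1]

print(successor("object-detection-v2"))  # edge server that would cache this app data
```

Because keys are distributed around the ring, adding or removing a server only remaps the keys between that server and its predecessor, which is what makes this style of overlay attractive for self-organizing edge caches.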
From the above works, it can be observed that decentralized application caching and retrieval based on distributed hashing can achieve efficient data sharing and effectively reduce system overheads, even under the limited storage capacity of geographically distributed edge service nodes. Therefore, this paper explores the optimal task scheduling strategy based on a heuristic algorithm and, additionally, leverages the backhaul links between edge servers in existing multi-server, multi-user MEC networks to construct a structured, DHT-based data-sharing mechanism.
5. Performance Evaluation
Simulation experiments for the DESCO network environment were developed on the Python platform, and the MDPSO algorithm was deployed in this distributed environment. The service coverage area of each distributed MEC node is a regular hexagonal region with a diagonal length of 200 m. The system time slots, the system operating cycle, the servers' wireless transmission bandwidth, the servers' and MUs' computing power, the users' channel gain, and the local computing energy consumption coefficient are all constants and remain unchanged in the subsequent comparative experiments. The specific values are shown in Table 2.
During the initialization phase, DESCO randomly generates the data for all services in the service set, with the values sampled from their respective intervals, whose maximum values are all set to 5. The service cache capacity S of each MEC server d is fixed at 2 GB, and the cache replacement policy is set to random caching by default. In the subsequent experiments, we investigate the average energy consumption of users under different numbers of MEC servers. The environment configuration is the same as in [3,31,33].
The state transition probabilities of user requests follow a Zipf distribution characterized by its Zipf parameter. L denotes the number of adjacent services that may be requested in the next stage, and R denotes the probability that user n will not request any service in the subsequent phase. We adjust the parameters L and R to evaluate the performance of the algorithm under varying transition probabilities.
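The following Python sketch shows one possible reading of this request-transition model, in which a user issues no request with probability R and otherwise picks one of the L adjacent services with Zipf-shaped weights; the exact transition rule and all parameter values here are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def next_request(current: int, num_services: int,
                 L: int = 3, R: float = 0.2, alpha: float = 0.8):
    """Draw a user's next service request.

    With probability R the user requests nothing; otherwise one of the L
    services adjacent to the current one is drawn with Zipf-shaped weights.
    """
    if rng.random() < R:
        return None                                   # empty request
    neighbors = [(current + k) % num_services for k in range(1, L + 1)]
    weights = 1.0 / np.arange(1, L + 1) ** alpha      # Zipf-shaped weights
    weights /= weights.sum()
    return int(rng.choice(neighbors, p=weights))

print([next_request(current=4, num_services=50) for _ in range(5)])
```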
(1) The Greedy–Random algorithm: combines greedy RCO with a random cache replacement strategy. In the initialization phase, MEC servers randomly generate the RCO strategies for the users. Then, based on this initial strategy, the algorithm traverses each user's offloading decision in the decision set and, at each step, searches for the channel occupancy strategy that minimizes the energy consumption for that user. Meanwhile, the MEC server adopts a random cache replacement strategy for the local cache space: once the cache space becomes saturated, it randomly replaces stored application data with the application data offloaded by users in the current time slot, satisfying Equation (18) during this process.
(2) Random cache replacement with multi-dimensional discrete particle swarm offloading (MDPSO-Random): The MDPSO strategy is used to solve the RCO problem. By assigning initial momentum to random multi-dimensional particles, the algorithm explores the optimal strategy in the solution space according to the fitness function (see the sketch after this list). It is worth noting that this algorithm is deployed only in DESCO networks with a single MEC server.
(3) MDPSO-Random with Cache Chord (MDPSO-Random with CC): builds upon the MDPSO-Random algorithm by adding the Cache Chord mechanism, which allows MUs to access application cache resources across the entire edge layer.
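The following Python sketch outlines a simplified multi-dimensional discrete PSO step of the kind used by MDPSO-Random, where each particle dimension encodes one MU's offloading decision; the fitness function is only a placeholder for the user-layer energy model of Section 4, and the swarm size, inertia, and exploration coefficients are assumed values.

```python
import numpy as np

rng = np.random.default_rng(1)
NUM_MUS, NUM_CHANNELS, NUM_PARTICLES = 10, 4, 20
CHOICES = NUM_CHANNELS + 1   # 0 = local computing, 1..K = offload on channel k

def fitness(position: np.ndarray) -> float:
    """Placeholder for the user-layer energy model (lower is better)."""
    return float(np.sum(position))

# Each particle is a discrete offloading vector: one decision per MU.
pos = rng.integers(0, CHOICES, size=(NUM_PARTICLES, NUM_MUS)).astype(float)
vel = np.zeros_like(pos)
pbest, pbest_val = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_val.argmin()].copy()

w, c1, c2 = 0.7, 1.5, 1.5    # inertia and exploration coefficients (assumed)
for _ in range(100):
    r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
    vel = w * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
    pos = np.clip(np.rint(pos + vel), 0, CHOICES - 1)  # keep dimensions discrete
    vals = np.array([fitness(p) for p in pos])
    improved = vals < pbest_val
    pbest[improved], pbest_val[improved] = pos[improved], vals[improved]
    gbest = pbest[pbest_val.argmin()].copy()

print("best offloading vector:", gbest.astype(int))
```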
The effectiveness of the MDPSO algorithm and Cache Chord is verified from three aspects: convergence of the MDPSO algorithm, energy consumption performance, and cache hit rate.
Firstly, the convergence analysis of the proposed algorithm is conducted.
Figure 4 illustrates the instantaneous energy consumption obtained over the iterations of both the MDPSO algorithm and the Greedy algorithm in the single-server scenario. The experiment is conducted with different numbers of MUs and a local MEC cache capacity of 2 GB. The horizontal axis represents the number of iterations of the particles or algorithms, while the vertical axis represents the energy consumption at the MEC server. In the experiment, the particle dimension L is set to 500 and the number of iterations i is 100.
The blue line represents the convergence curve of the MDPSO, while the orange line represents the Greedy algorithm. The green horizontal line, which remains constant throughout iterations, represents the energy consumption of local computation. Comparing the two algorithms, it is observed that both MDPSO and Greedy strategies converge within the first 40 rounds and can reduce user-level energy consumption by 18.75% to 41.17% compared to local computation. Moreover, MDPSO exhibits better convergence compared to the Greedy strategy, with an average reduction of 19.7% in user-level energy consumption per round. This indicates that the MDPSO algorithm can explore the solution space more comprehensively, avoiding local optima, and performs well when dealing with discrete vectors.
Figure 5 illustrates the average task completion time of the user layer per time slot under a task latency constraint of 20 ms. The red line represents the MDPSO algorithm with collaboration among five nodes, while the green line represents the Greedy offloading strategy with the same number of collaborating nodes. Since MUs can adopt DVFS to adjust the computational power of their local devices and ensure timely task completion, the task completion time in each round remains stable at the maximum latency constraint of 20 ms for local computing. After running for 1000 time slots, the proposed algorithm outperforms the others, with an average latency of 7.4 ms, while the Greedy strategy exhibits an average latency of 11.4 ms; both are lower than local computing. This implies that the proposed algorithm converges to the optimal offloading strategy and therefore applies the cache-based offloading mode as much as possible, reducing the data transmission overhead for MUs.
Figure 6 compares the energy consumption of local computation, Greedy–Random, MDPSO-Random, and MDPSO-Random with Cache Chord (CC) under different numbers of MUs and task richness conditions, using the default Zipf service request distribution. In Figure 6a, the number of users covered by each MEC server is fixed at 5. The experiments show that, relative to local computation, the other three algorithms effectively reduce the energy consumption overhead. In Figure 6a, the curve for local energy consumption remains relatively constant, because the energy consumption of local computation depends only on the number of tasks and their FLOPs; the increase in the number of service types only poses challenges for the cache-based algorithms. Moreover, through horizontal comparison, it is noted that the algorithm based on the cooperation of five distributed MEC servers with caching exhibits the best performance. This is attributed to the CC data-sharing mechanism, which allows users at any location to access resources across the entire MEC edge layer: the cache space grows linearly as servers are added to the CC network, eliminating the need for users to transmit application data and thus avoiding the associated transmission energy consumption.
Figure 7 compares the average energy consumption of a cluster of users under the collaboration of different numbers of MEC servers. As the number of MEC servers increases, the energy consumption decreases in all cases and converges to the same value, because as the total number of servers D increases, the cache pool becomes richer, allowing more requests to be converted from pure offloading to cache-based offloading until no MU needs to upload application data. It can also be noticed that the slopes of the three energy consumption curves flatten as the total number of tasks increases. The energy consumption curve decreases most rapidly when the total number of tasks is small and converges at around 20 servers, whereas the curves for larger task counts converge only with more servers, because when the total number of tasks is larger, the servers must collaborate on a larger scale to ensure that the corresponding application data are stored.
Figure 8 illustrates the average energy consumption of MUs per time slot under different Zipf parameters and various task attributes for MDPSO. As depicted in Figure 8a, with the increase in the maximum application data size, the average energy consumption of the Greedy algorithm and of the CC-assisted MDPSO continues to rise, because larger application data increase the transmission overhead of tasks and thereby reduce the offloading benefits. The proposed algorithm effectively reduces the average energy consumption by 10.9% to 25.17% compared to the Greedy offloading strategy. Additionally, under the same parameter conditions, MDPSO based on multi-server collaboration reduces the average energy consumption by 4.7% to 7.89% compared to the single-node mode.
Observing Figure 8b, we can conclude that the average energy consumption of all algorithms increases with the growth of the corresponding parameter. Moreover, comparing the Greedy algorithm with the multi-node collaborative MDPSO algorithm under the same Zipf R condition, MDPSO reduces the average energy consumption by 2.1% to 16.8%. Furthermore, as parameter R increases, the energy consumption decreases. This is because R is positively correlated with the probability that an MU's next-stage service request is empty: a larger R means more MUs with empty requests at any given time, and sparse service requests naturally lead to reduced transmission energy consumption for MUs.
Figure 9a,b, respectively, explore the impact of the number of servers in the CC mechanism on the cache hit ratio from the perspectives of user count and service types. The cache hit ratio is defined as the ratio of the number of service requests successfully served from the edge-layer cache to the total number of accesses to the edge layer by users.
By vertically comparing the curves in Figure 9, it is evident that under the same conditions, as the number of MEC servers increases, the cache space grows linearly, leading to a significant improvement in the cache hit ratio. Compared to a single server, increasing the number of servers to 10 raises the cache hit ratio considerably. Moreover, when the number of users is 20, the cache hit ratio achieved by the cooperation of 10 servers remains consistently above 91%. This improvement is attributed to the Cache Chord mechanism overcoming the bottleneck imposed by the storage space limitation of individual servers.
However, as the number of task types increases, the cache hit ratio decreases. This decline is due to the sparser distribution of user service requests as the number of task types grows: for an individual MEC server, the service data stored in the limited space of the local cache pool cover a relatively smaller fraction of the overall request volume, leading to a decrease in the cache hit ratio. Comparing the four models, it can be concluded that the Cache Chord mechanism effectively assists computation offloading and that, as the node scale increases, higher benefits can be achieved.