Cloud-Edge Collaboration-Based Data Processing Method for Distribution Terminal Unit Edge Clusters

Zeng, Ruijiang; Li, Zhiyong; Li, Sifeng; Zhang, Jiahao; Chen, Xiaomei

doi:10.3390/en19010269

Open AccessArticle

Cloud-Edge Collaboration-Based Data Processing Method for Distribution Terminal Unit Edge Clusters

by

Ruijiang Zeng

^1,2,

Zhiyong Li

^1,2,

Sifeng Li

³,

Jiahao Zhang

³ and

Xiaomei Chen

^3,*

¹

Electric Power Research Institute of Guangdong Power Grid Co., Ltd., Guangzhou 510663, China

²

China Southern Power Grid Key Laboratory of Power Grid Automation Laboratory, Guangzhou 510663, China

³

School of Electrical and Electronic Engineering, North China Electric Power University, Beijing 102206, China

^*

Author to whom correspondence should be addressed.

Energies 2026, 19(1), 269; https://doi.org/10.3390/en19010269

Submission received: 9 November 2025 / Revised: 5 December 2025 / Accepted: 24 December 2025 / Published: 4 January 2026

(This article belongs to the Special Issue Operation and Planning of Distribution Systems Under High Renewable Penetration)

Download

Browse Figures

Versions Notes

Abstract

Distribution terminal units (DTUs) play critical roles in smart grid for supporting data acquisition, remote monitoring, and fault management. A single DTU generates continuous data streams, imposing new challenges on data processing. To tackle these issues, a cloud-edge collaboration-based data processing method is introduced for DTU edge clusters. First, considering the load imbalance degree of DTU data queues, a cloud-edge integrated data processing architecture is designed. It optimizes edge server selection, the offloading splitting ratio, and edge-cloud computing resource allocation in a collaboration mechanism. Second, an optimization problem is formulated to maximize the weighted difference between the total data processing volume and the load imbalance degree. Next, a cloud-edge collaboration-based data processing method is proposed. In the first stage, cloud-edge collaborative data offloading based on the load imbalance degree, and a data volume-aware deep Q-network (DQN) is developed. A penalty function based on load fluctuations and the data volume deficit is incorporated. It drives the DQN to evolve toward suppressing the fluctuation of load imbalance degree while ensuring differentiated long-term data volume constraints. In the second stage, cloud-edge computing resource allocation based on adaptive differential evolution is designed. An adaptive mutation scaling factor is introduced to overcome the gene overlapping issues of traditional heuristic approaches, enabling deeper exploration of the solution space and accelerating global optimum identification. Finally, the simulation results demonstrate that the proposed method effectively improves the data processing efficiency of DTUs while reducing the load imbalance degree.

Keywords:

cloud-edge collaboration; computing resource allocation; DQN; adaptive differential evolution

1. Introduction

With the development of the smart grid, distribution terminal units (DTUs) have been largely deployed at substations, ring main units, and switchgear cabinets to realize data acquisition, remote monitoring, and fault management [1]. A single DTU generates continuous streams of line parameters, harmonics, equipment status signals, and protection measurements, causing new challenges in data processing. Local data processing can reduce transmission latency, but the amount of processed data is constrained by the limited computing and storage capabilities of terminals [2,3]. On the other hand, centralized cloud processing, despite its powerful computing resources, suffers from transmission latency, bandwidth limitations, and security issues [4]. To overcome these limitations, cloud-edge collaboration has emerged as a feasible solution by integrating the abundant resources of cloud with edge clusters. It enables edge-side execution of time-critical tasks while enabling data aggregation from multiple DTUs for advanced data analysis [5,6]. This novel architecture of data processing ensures low latency and high reliability while further improving data processing efficiency and resource utilization levels [7].

In the cloud-edge collaboration architecture, the load imbalance degree and the volume of processed data are two key indicators for measuring system performance. The load imbalance degree reflects the non-uniformity of the distribution of data queue backlogs among edge nodes and is typically defined as the standard deviation or coefficient of load variation on each node [8,9]. A lower load imbalance degree can prevent some nodes from being overloaded while others remain idle [10,11], thereby improving the overall data processing efficiency [12,13]. On the other hand, the volume of processed data directly reflects the real-time data processing capability of DTU edge clusters. A larger volume of processed data means higher computing throughput, which can support more complex analysis tasks, such as fault detection and power quality analysis. Scholars have determined how to reduce the load imbalance degree [14]. In [15], Binh et al. proposed a load-balanced routing algorithm for wireless mesh networks based on software-defined networking (SDN). By dynamically monitoring node loads and introducing transmission quality constraints, the algorithm achieved low-latency and high-reliability data transmission through multi-objective optimization, significantly reducing network load imbalance and resource utilization. In [16], Chen et al. proposed a two-stage intelligent load imbalance method for multi-edge collaboration scenarios in wireless metropolitan area networks. This method combined centralized deep neural networks (DNNs) for global task prediction and distributed deep Q-networks (DQNs) for local dynamic adjustment, effectively reducing system response latency. In [17], Yang et al. proposed a task offloading algorithm based on traditional deep Q-networks. By using reinforcement learning to selectively offload computing tasks to the optimal edge nodes or servers, the algorithm optimized task offloading and resource allocation. However, most existing research predominantly focuses on individually optimizing either load imbalance reduction or processed data volume, lacking a coordinated optimization.

Cloud-edge collaboration provides an effective solution for the efficient processing of data from DTUs, with its core lying in jointly optimizing the selection of edge servers, the decision on data offloading splitting ratios, and the allocation of computing resources to enhance system performance. Firstly, rational selection of edge servers can minimize network transmission latency while balancing the load distribution within the edge cluster. Secondly, optimizing the data offloading splitting ratio involves dynamically adjusting the proportion of raw data processed at the edge layer versus the cloud. This approach not only fully utilizes the real-time advantages of edge computing but also leverages the powerful batch processing capabilities of the cloud. Lastly, precise allocation of computing resources, through the rational distribution of resources such as CPU and memory, can significantly improve task processing efficiency and reduce energy consumption. However, the aforementioned optimization problem involves high-dimensional discrete variables and complex constraints, making it difficult for traditional convex optimization methods to solve effectively. There is an urgent need to introduce intelligent optimization algorithms to achieve efficient decision-making.

Multi-agent deep reinforcement learning (DRL) has gained widespread application in edge data processing due to its strong environment perception and decision-making capabilities. In terms of server selection, in [18], Bai et al. proposed a federated learning-based edge server selection method for dynamic edge networks, achieving efficient server-terminal matching by jointly optimizing communication link quality and computing resources utilization. In [19], Liu et al. introduced a mobile edge computing server selection algorithm based on DRL, which realized optimal server selection in dynamic mobile environments through the design of a multi-dimensional state space and reward functions. Regarding data offloading, in [20], Yan et al. proposed a joint optimization framework for task offloading and resource allocation in edge computing environments. By solving the optimal offloading splitting ratio and computing resource allocation simultaneously through mixed-integer nonlinear programming, this method reduced the system’s total processing delay while ensuring task dependency constraints. In [21], Alam et al. introduced a dynamic data offloading mechanism based on DRL. By designing a state prediction module containing a long short-term memory (LSTM) network, this method accurately predicted network congestion levels and edge node load states and used a double-delay deep deterministic policy gradient algorithm to optimize offloading decisions. However, traditional DRL methods, such as DQN, suffer from overestimation of Q-values and struggle to handle multi-dimensional continuous action spaces involving server selection, offloading splitting ratio decisions, and resource allocation simultaneously.

Genetic algorithms have demonstrated strong potential in the field of computing resource allocation optimization. Liu et al. proposed an improved genetic algorithm for multi-user computing offloading and resource allocation optimization in heterogeneous 5G mobile edge computing (MEC) networks [22]. In [23], Laili et al. proposed a cloud-edge-end collaborative computing resource allocation algorithm based on differential evolution. By combining genetic algorithms, deep learning, and differential evolution, this method achieved efficient task offloading and resource scheduling optimization. Zhu et al. introduced a genetic algorithm-based elastic resource scheduling method to optimize resource allocation of multi-stage stochastic tasks in hybrid cloud environments [24]. Farahnakian et al. proposed a genetic algorithm-based energy-efficient virtual machine consolidation scheme [25]. Utilizing time series analysis to predict future host loads, this scheme optimizes cloud computing resource allocation through multi-dimensional chromosome encoding and dynamic mutation mechanisms. However, traditional genetic algorithms suffer from slow convergence and a tendency to fall into local optima, making them inadequate for meeting the real-time optimization needs of cloud-edge collaboration.

Despite the progress made in the aforementioned studies, several issues remain. Firstly, traditional data processing architectures for DTU edge clusters do not consider the dynamic data processing queue evolution of the terminal, edge, and cloud sides. There is a lack of effective coordination between DTU local processing, edge cluster processing, and cloud processing. Second, in environments where data from multiple DTUs are processed concurrently, the load states fluctuate significantly across different time slots. Traditional DQN-based data offloading strategies are mostly optimized based on static reward functions, failing to account for dynamic load evolution characteristics. This results in a high load imbalance degree and a large data volume deficit accumulation over consecutive time slots. Last but not least, traditional genetic algorithms are prone to falling into local optima due to gene overlap issues. New genes generated during the mutation process overlap with existing ones, leading to slower convergence and inferior resource allocation performance.

To address the aforementioned challenges, this paper focuses on solving the following three problems:

Problem 1: How to establish an efficient data coordination processing mechanism in a cloud-edge integrated DTU edge cluster. The mechanism must be able to collaboratively optimize high-dimensional short-term decisions and satisfy long-term performance constraints to overcome the load imbalance and low efficiency caused by resource fragmentation and coupled decisions in traditional approaches.

Problem 2: How to maintain system stability under fluctuating distribution terminal loads. The mechanism must dynamically track load variations and virtual-queue evolution to link real-time offloading decisions with long-term stability constraints, thereby suppressing load oscillations and ensuring stable system performance, particularly the required total data-processing throughput.

Problem 3: How to achieve efficient resource allocation in a large-scale, dynamic cloud-edge environment. The key challenge is to overcome the tendency of traditional methods to become trapped in local optima. The objective is to enable globally efficient solutions for resource allocation strategies, thereby enhancing scheduling performance and resource utilization in dynamic scenarios.

To address the above three problems, we propose a cloud-edge collaboration-based data processing framework for DTU edge clusters. First, a cloud-edge integrated data processing architecture is established, promoting seamless collaboration among the data acquisition layer, DTU layer, edge cluster layer, and cloud server layer. Second, to enhance data processing capability and avoid load imbalance, a joint optimization problem is established. The objective is to maximize the weighted difference between the total data processing volume and the weighted load imbalance degree by jointly optimizing edge server selection, the offloading splitting ratio, and edge-cloud computing resource allocation. Then, by leveraging virtual queue theory, the long-term constraint of data processing volume is decoupled, and the original problem is decomposed into a data offloading selection subproblem and a cloud-edge resource allocation subproblem. Finally, a cloud-edge collaboration-based data processing algorithm for DTU edge clusters is proposed to address the decoupled subproblems. Unlike traditional DQN-based approaches, which exhibit significant fluctuations in load imbalance degree and long-term data volume constraint violations, the proposed algorithm incorporates load imbalance degree and data volume awareness to improve data offloading performance. It also incorporates adaptive differential evolution to enhance cloud-edge resource utilization efficiency. Extensive simulation experiments are carried out to confirm its effectiveness. The primary contributions are outlined below.

Cloud-edge integrated data processing architecture. The proposed architecture employs a collaboration mechanism between DTU local processing, edge cluster proximate processing, and cloud remote processing. The collected data are intelligently stored, split, offloaded, and parallel processed to combine the rapid response characteristics of the edge cluster and powerful computing capacities of cloud server.

Cloud-edge collaborative data offloading based on load imbalance degree and data volume-aware DQN. A penalty function based on load fluctuations and data volume deficit is incorporated into the reward function design. This drives DQN to evolve toward suppressing the fluctuation of load imbalance degree across time slots and ensures the satisfaction of differentiated long-term data volume constraints.

Cloud-edge computing resource allocation based on adaptive differential evolution. We incorporate an adaptive mutation scaling factor, which is dynamically adjusted during the iterative optimization process. This overcomes the gene overlapping issues of traditional heuristic approaches, enabling deeper exploration of the solution space and accelerating global optimum identification.

2. System Model

The cloud-edge integrated data processing architecture for DTU edge clusters is illustrated in Figure 1, comprising the acquisition layer, DTU layer, edge cluster layer, and cloud server layer. The acquisition layer includes various electrical equipment such as switchgear, current transformers, and voltage transformers. Each DTU is deployed on electrical equipment, receiving operational status data from the acquisition layer. The collected data are locally processed on each DTU or uploaded to the edge cluster layer and cloud server layer for further processing. The set of DTUs is denoted as

D = \{d_{1}, \dots, d_{m}, \dots, d_{M}\}

. The edge cluster layer consists of edge clusters and 5G base stations. The 5G base stations provide communication support for data interaction among the DTU layer, edge cluster layer, and cloud server layer. The edge clusters integrate multiple edge servers, which can assist DTUs in data processing. The set of edge servers is denoted as

S = \{s_{1}, \dots, s_{k}, \dots, s_{L}\}

. The cloud server layer includes a master station and cloud server

s_{0}

. The cloud servers possess abundant storage and computing resources, enabling rapid processing of uploaded data.

The optimization period is partitioned into

T

slots, each with a duration

τ

. Each slot includes four processes, namely, data acquisition, edge server selection, data offloading splitting ratio selection, and collaborative processing between the edge cluster and cloud server. During the data acquisition process, DTUs receive operational status data from the acquisition layer and await transmission. In the edge server selection process, DTUs make edge server selection decisions to determine which edge server to collaborate with. In the data offloading splitting ratio selection process, DTUs make decisions on the ratio of data offloading to the edge cluster and cloud server. During the edge cluster–cloud server collaborative processing, the edge cluster and cloud server utilize their computing resources to collaboratively process the data uploaded by the DTUs.

2.1. Edge Cluster–Cloud Server Collaboration Model

We introduce the edge cluster collaboration splitting ratio and cloud server collaboration splitting ratio to partition the data collected by DTUs. A portion of the data is retained at DTU for local processing, while another portion is offloaded to the edge layer for processing by edge clusters. The remaining data are then offloaded to the cloud layer for processing by cloud servers.

A DTU receives operational status data from the acquisition layer and awaits transmission. Subsequently, the DTU makes decisions regarding edge server selection and data offloading splitting ratio selection. The edge server selection decision variable is defined as

x_{m, k} (t) \in \{0, 1\}

, with

x_{m, k} (t) = 1

indicating that DTU

d_{m}

chooses the edge server

s_{k}

for collaborative data processing.

During the data offloading splitting ratio selection process, the data offloading splitting ratios are discretized into L levels, denoted as

\{φ_{1}, \dots φ_{l}, \dots, φ_{L}\}

. The data offloading decision of DTU

d_{m}

during slot

t

is denoted by

Y_{m} (t) = \{φ_{m}^{D} (t), φ_{m}^{E} (t), φ_{m}^{C} (t)\}

, satisfying

φ_{m}^{D} (t) + φ_{m}^{E} (t) + φ_{m}^{C} (t) = 1

. Here,

φ_{m}^{D} (t)

represents the local offloading splitting ratio of DTU

d_{m}

.

φ_{m}^{E} (t)

denotes the edge cluster offloading splitting ratio of

d_{m}

, representing the proportion of data offloaded to edge clusters, and

φ_{m}^{C} (t)

signifies the cloud server offloading splitting ratio of

d_{m}

, reflecting the proportion of data offloaded to cloud servers.

2.2. DTU Local Computing Model

This section aims to establish the dynamic update rules for the local data queues of DTUs and the calculation method for local data processing volume. This provides a foundation for the subsequent cloud-edge collaborative data processing model while also laying a quantitative basis for the optimal allocation of computing resources.

Let the data queue maintained locally by DTU

d_{m}

at slot

t

be denoted as

Q_{m}^{D} (t)

, which is updated as

Q_{m}^{D} (t + 1) = Q_{m}^{D} (t) - U_{m}^{a l l} (t) + {\hat{A}}_{m} (t) + \max \{U_{m}^{a l l} (t) φ_{m}^{D} (t) - U_{m}^{D} (t), 0\}

(1)

where

{\hat{A}}_{m} (t)

represents the amount of newly arrived data at

d_{m}

during slot

t + 1

.

U_{m}^{a l l} (t)

represents the total amount of data offloaded by DTU

d_{m}

.

U_{m}^{D} (t)

represents the amount of data processed locally by

d_{m}

.

The local processing data volume

U_{m}^{D} (t)

of

d_{m}

depends on its computing capability, denoted as

U_{m}^{D} (t) = \min \{\frac{τ f_{m}^{D} (t)}{λ_{m}}, U_{m}^{a l l} (t) φ_{m}^{D} (t)\}

(2)

where

λ_{m}

represents the computing complexity of data processing, indicating the CPU cycles required per bit of data processed.

f_{m}^{D} (t)

represents the computing resources allocated by

d_{m}

for local data processing.

2.3. Edge Cluster Data Processing Model

The data transmission rate between DTU

d_{m}

and edge server

s_{k}

is given by

R_{m, k}^{D, E} (t) = B_{m, k} (t) \log (1 + \frac{P_{m, k} (t) G_{m, k} (t)}{δ_{m, k} (t) + δ_{0}})

(3)

where

B_{m, k} (t)

,

P_{m, k} (t)

, and

G_{m, k} (t)

represent the communication bandwidth, transmission power, and channel gain, respectively.

δ_{m, k} (t)

represents the electromagnetic interference, and

δ_{0}

represents the noise power.

When making data offloading decisions,

d_{m}

must ensure that all data offloaded to the edge server

s_{k}

can be completely uploaded within the current slot; that is,

τ R_{m, k}^{D, E} (t) \geq x_{m, k} (t) U_{m}^{a l l} (t) φ_{m}^{E} (t)

(4)

Let the data queue maintained by the edge server

s_{k}

for DTU

d_{m}

at slot

t

be denoted as

Q_{m, k}^{E} (t)

, which is updated as

\begin{array}{l} Q_{m, k}^{E} (t + 1) = Q_{m, k}^{E} (t) \\ + x_{m, k} (t) U_{m}^{a l l} (t) φ_{m}^{E} (t) - U_{m, k}^{E} (t) \end{array}

(5)

where

U_{m, k}^{E} (t)

represents the amount of data processed by

s_{k}

for

d_{m}

.

The amount of data that an edge server can process during slot

t

is specifically expressed as

U_{m, k}^{E} (t) = \min \{Q_{m, k}^{E} (t) + x_{m, k} (t) U_{m}^{a l l} (t) φ_{m}^{E} (t), \frac{τ f_{m, k}^{E} (t)}{λ_{m}}\}

(6)

where

f_{m, k}^{E} (t)

represents the computing resources provided by

s_{k}

for processing data of

d_{m}

.

Define the total computing resources available at the edge server

s_{k}

during slot

t

as

f_{k}^{E, a l l} (t)

. Accordingly, the computing resources constraint of

s_{k}

can be expressed as

\sum_{m = 1}^{M} f_{m, k}^{E} (t) \leq f_{k}^{E, a l l} (t)

(7)

2.4. Cloud Server Data Processing Model

Fiber optics is used for data transmission between DTU

d_{m}

and cloud server

s_{0}

. Fiber optic communication is unaffected by electromagnetic interference, offering higher stability and reliability. For simplicity, the fiber optic data transmission rate is set as

R_{m}^{D, C} (t)

. Compared to edge servers, cloud servers possess more abundant storage and computing resources.

Similarly, when making data offloading decisions,

d_{m}

must ensure that all data offloaded to

s_{0}

can be completely uploaded within the current slot; that is,

τ R_{m}^{D, C} (t) \geq U_{m}^{a l l} (t) φ_{m}^{C} (t)

(8)

Let the data queue maintained by

s_{0}

for

d_{m}

at slot

t

be denoted as

Q_{m}^{C} (t)

, which is updated as

Q_{m}^{C} (t + 1) = Q_{m}^{C} (t) + U_{m}^{a l l} (t) φ_{m}^{C} (t) - U_{m}^{C} (t)

(9)

where

U_{m}^{C} (t)

represents the amount of data processed by

s_{0}

for

d_{m}

at slot

t

.

Considering the cloud server offloading splitting ratio

φ_{m}^{C} (t)

, the amount of data that the cloud server can process during slot

t

is specifically expressed as

U_{m}^{C} (t) = \min \{Q_{m}^{C} (t) + U_{m}^{a l l} (t) φ_{m}^{C} (t), \frac{τ f_{m}^{C} (t)}{λ_{m}}\}

(10)

where

f_{m}^{C} (t)

represents the computing resources provided by

s_{0}

for processing data of

d_{m}

.

2.5. Processed Data Volume Model

The total volume of processed data includes the amount of data processed locally, the amount of data processed by the edge cluster, and the amount of data processed by the cloud server, which is expressed as

U_{m}^{c a l} (t) = U_{m}^{D} (t) + U_{m}^{C} (t) + \sum_{k = 1}^{K} U_{m, k}^{E} (t)

(11)

2.6. Load Imbalance Degree Model

To ensure the timeliness of data processing, the data queues should be as balanced as possible across DTUs, edge clusters, and the cloud server. The load imbalance degree is employed to quantify the imbalance level of data queues. The load imbalance degrees for

d_{m}

are, respectively, expressed as

σ_{m}^{D} (t) = \frac{|w_{m} Q_{m}^{D} (t) - \frac{1}{M} \sum_{m = 1}^{M} w_{m} Q_{m}^{D} (t)|}{\frac{1}{M} \sum_{m = 1}^{M} w_{m} Q_{m}^{D} (t)}

(12)

σ_{m}^{E} (t) = \frac{|\sum_{k = 1}^{K} w_{m} Q_{m, k}^{E} (t) - \frac{1}{M} \sum_{m = 1}^{M} \sum_{k = 1}^{K} w_{m} Q_{m, k}^{E} (t)|}{\frac{1}{M} \sum_{m = 1}^{M} \sum_{k = 1}^{K} w_{m} Q_{m, k}^{E} (t)}

(13)

σ_{m}^{C} (t) = \frac{|w_{m} Q_{m}^{C} (t) - \frac{1}{M} \sum_{m = 1}^{M} w_{m} Q_{m}^{C} (t)|}{\frac{1}{M} \sum_{m = 1}^{M} w_{m} Q_{m}^{C} (t)}

(14)

where

w_{m}

represents the queue backlog weight used to balance the magnitude of queue backlogs among various DTUs.

Therefore, the weighted load imbalance degree of

d_{m}

is expressed as

σ_{m}^{a l l} (t) = μ_{m}^{D} σ_{m}^{D} (t) + μ_{m}^{E} σ_{m}^{E} (t) + μ_{m}^{C} σ_{m}^{C} (t)

(15)

where

μ_{m}^{D}

,

μ_{m}^{E}

, and

μ_{m}^{C}

represent the load imbalance weight coefficients for DTU, edge cluster, and cloud server, respectively.

2.7. Long-Term Data Processing Volume Model

Constraints are imposed on the long-term data processing volume of DTUs, ensuring that the data processing requirements of various power services are satisfied. Defining the minimum tolerable average data processing volume of

d_{m}

as

U_{m}^{\min}

, the long-term data processing volume constraint is expressed as

\lim_{T \to \infty} \frac{1}{T} \sum_{t = 1}^{T} U_{m}^{c a l} (t) \geq U_{m}^{\min}

(16)

3. Problem Formulation and Transformation

To meet differentiated processing requirements of various power services for DTUs and avoid load distribution imbalance among data queues of different DTUs, this paper balances data processing efficiency and load imbalance on the basis of satisfying the long-term data processing volume constraint of DTUs. Specifically, it maximizes the average value of the difference between total data processing volume and weighted load imbalance over

T

time slots, which essentially achieves the collaborative optimization of data processing efficiency improvement and load imbalance reduction. The optimization variables include edge server selection, data offloading splitting ratio selection, and cloud-edge collaborative computing resource allocation. The optimization problem is formulated as follows:

\begin{array}{l} P 1 : \max_{\{x_{m, k} (t)\}, \{Y_{m} (t)\}, \{f_{m, k}^{E} (t)\}, \{f_{m}^{C} (t)\}} \sum_{t = 1}^{T} \sum_{m = 1}^{M} U_{m}^{c a l} (t) - ϑ σ_{m}^{a l l} (t) \\ s.t. C_{1} : \sum_{k = 1}^{K} x_{m, k} (t) = 1, \forall d_{m} \in D, \forall s_{k} \in S \\ C_{2} : \{\begin{cases} x_{m, k} (t) \in \{0, 1\} \\ φ_{m}^{D} (t), φ_{m}^{E} (t), φ_{m}^{C} (t) \in {\{φ_{1}, \dots φ_{l}, \dots, φ_{L}\}}^{'} \end{cases} \\ \forall d_{m} \in D, \forall s_{k} \in S \\ C_{3} : φ_{m}^{D} (t) + φ_{m}^{E} (t) + φ_{m}^{C} (t) = 1, \forall d_{m} \in D, \forall s_{k} \in S \\ C_{4} : \sum_{m = 1}^{M} f_{m, k}^{E} (t) \leq f_{k}^{E, a l l} (t), \forall d_{m} \in D, \forall s_{k} \in S \\ C_{5} : \lim_{T \to \infty} \frac{1}{T} \sum_{t = 1}^{T} U_{m}^{c a l} (t) \geq U_{m}^{\min}, \forall d_{m} \in D, \forall s_{k} \in S \end{array}

(17)

where

ϑ

is a weighting parameter that balances the trade-off between reducing the load imbalance degree and improving the data processing efficiency.

C_{1}

restricts each DUT

d_{m}

to select at most one edge server.

C_{2}

defines the value range constraints for the local, edge, and cloud offloading splitting ratios.

C_{3}

ensures that the sum of the splitting ratios for data allocated to the local, edge, and cloud equals 1.

C_{4}

limits the amount of computing resources processed by an edge server to not exceed its maximum processing capacity.

C_{5}

ensures that the long-term average data processing volume of DTU

d_{m}

is not lower than the minimum tolerable average processing volume

U_{m}^{\min}

.

In the optimization problem

P 1

, the long-term data processing volume constraint

C_{5}

is mutually coupled with short-term decisions within a single time slot. This renders

P 1

unsolvable directly by traditional convex optimization methods, requiring transformation and decomposition into subproblems. Therefore, we employ Lyapunov optimization to construct dynamic virtual queues, converting

C_{5}

into a stability constraint on virtual queues. The virtual deficit queue associated with

C_{5}

is defined as

Z_{m} (t)

, which evolves dynamically as follows:

Z_{m} (t + 1) = \max \{Z_{m} (t) - U_{m}^{c a l} (t) + U_{m}^{\min}, 0\}

(18)

where

Z_{m} (t)

represents the deviation between the actual amount of processed data of

d_{m}

and the required data processing constraint. According to the virtual queue theory, when the virtual deficit queue for processed data remains stable, the long-term average processed data volume will exceed

U_{m}^{\min}

, thus satisfying

C_{5}

.

On this basis, the optimization problem

P 1

is transformed into optimization problem

P 2

.

P 2

converts the long-term constraint of

P 1

into a short-term observable virtual queue state, enabling the optimization problem to be solved sequentially on a per-time-slot basis and laying the foundation for subsequent decomposition into subproblems, which can be expressed as

\begin{array}{l} P 2 : \max_{\{x_{m, k} (t)\}, \{Y_{m} (t)\}, \{f_{m, k}^{E} (t)\}, \{f_{m}^{C} (t)\}} \begin{matrix} Θ (t) = \sum_{m = 1}^{M} U_{m}^{c a l} (t) - ϑ σ_{m}^{a l l} (t) \\ - V Z_{m} (t) [U_{m}^{c a l} (t) - U_{m}^{\min}] \end{matrix} \\ s.t. C_{1} \sim C_{4} \end{array}

(19)

where

V

is a weight coefficient used to balance the trade-off between the optimization objective and the stability of the virtual deficit queue.

P 2

can be categorized into mutually independent decision variables and resource variables. Therefore,

P 2

can be decomposed into a data offloading decision subproblem

P 3

and a cloud-edge collaborative computing resource allocation subproblem

P 4

.

P 3

aims to determine the edge server selection and data offloading splitting ratios for DTUs, thereby providing the demand basis for subsequent resource allocation.

P 3

is formulated as follows:

\begin{array}{l} P 3 : \max_{\{x_{m, k} (t)\}, \{Y_{m} (t)\}} Θ (t) \\ s.t. C_{1} \sim C_{3} \end{array}

(20)

Building upon the offloading splitting ratios determined by

P 3

,

P 4

is responsible for computing resource allocation, thus ensuring timely data processing and thereby preventing congestion in the edge queues.

P 4

is formulated as follows:

\begin{array}{l} P 4 : \max_{\{f_{m, k}^{E} (t)\}, \{f_{m}^{C} (t)\}} Θ ({\hat{x}}_{m, k} (t), {\hat{Y}}_{m} (t)) \\ s.t. C_{4} \end{array}

(21)

where

{\hat{x}}_{m, k} (t)

and

{\hat{Y}}_{m} (t)

represent the decisions on edge server selection decisions and data offloading splitting ratios selection decisions, respectively.

4. Cloud-Edge Collaboration-Based Data Processing Method for DTU Edge Clusters

We propose a cloud-edge collaboration-based data processing method for DTU edge clusters, with its overall framework corresponding to optimization objective

P 1

, as illustrated in Figure 2. The framework decomposes

P 1

into two key subproblems: the left part of Figure 2 is used to determine the edge server selection and offloading splitting ratio, corresponding to optimization subproblem

P 3

, and demonstrates the process of solving

P 3

using load imbalance degree and data volume-aware DQN. The right part of Figure 2 is employed to match computing resources with data volume demands, corresponding to optimization subproblem

P 4

, which is solved via an adaptive differential evolution algorithm. Specifically, the algorithm is divided into two stages. In the first stage, considering the load imbalance degree of DTUs across different time slots and the dynamic changes in virtual queues, a penalty function is constructed based on the fluctuation in load imbalance degree among time slots and the deficit of total processed data volume. This drives the DQN to evolve toward maximizing the reward function and minimizing the penalty function, thereby suppressing fluctuations in the load imbalance degree across different time slots and ensuring that the long-term constraint on the total processed data volume in cloud-edge collaboration is met. In the second stage, an adaptive mutation scaling factor is introduced during the mutation phase, which dynamically adjusts as the iterative optimization of cloud-edge computing resources’ collaborative allocation progresses. This enables more effective exploration of the solution space and the discovery of the global optimal solution in a shorter time, achieving efficient utilization and allocation of cloud-edge computing resources for DTUs. The algorithm execution flow is shown in Algorithm 1, with details provided below.

Algorithm 1: Cloud-Edge Collaboration-based Data Processing Method for Distribution Terminal Unit Edge Clusters

Input:

\{Q_{m}^{D} (t), Q_{m}^{E} (t), Q_{m}^{C} (t), σ_{m}^{a l l} (t - 1), Z_{m} (t - 1), {\hat{A}}_{m} (t) |\forall m\}

.
Output:

\{x_{m, k} (t), Y_{m} (t), f_{m, k}^{E} (t), f_{m}^{C} (t) |\forall m, \forall k}

.
1: For

t = 1, 2, \dots, T

do
Stage 1: Cloud-Edge Collaborative Data Offloading based on Load Imbalance Degree and Data Volume-Aware DQN
2:

d_{m}

selects action

{\hat{O}}_{m} (t)

based on (25).
3:

d_{m}

executes action and calculates the reward

r_{m} (t)

and penalty

α_{m} (t)

based on (24) and (26).
4:

d_{m}

updates

S_{m} (t)

to

S_{m} (t + 1)

, and store the experience data

[S_{m} (t), {\hat{O}}_{m} (t), r_{m} (t), α_{m} (t), S_{m} (t + 1)]

into the experience buffer

P_{m} (t)

.
5:

d_{m}

calculates the loss function of

ϖ_{m}^{e} (t)

based on (27) and (28).
6:

d_{m}

updates

ϖ_{m}^{e} (t)

and

ϖ_{m}^{o} (t)

based on (29) and (30).
Stage 2: Cloud-Edge Computing Resource Allocation based on Adaptive Differential Evolution
7: Map the cloud-edge collaborative computing resource allocation to chromosome

h_{k}

based on

q_{k} = \{h_{k} \Rightarrow {f_{m, k}^{E} (t), f_{m}^{C} (t)}\}

.
8: For

g = 1, 2, \dots, G

do
9: Randomly select three particles

q_{r 1} (t, g)

,

q_{r 2} (t, g)

, and

q_{r 3} (t, g)

to generate corresponding mutated gene

v_{k} (t, g)

based on (32) and (33).
10: Perform crossover between

q_{k} (t, g)

and

v_{k} (t, g)

based on (34).
11: Select the optimal particle between the current and historical best particles using the greedy strategy based on (35).
12: End For
13: End For

4.1. Cloud-Edge Collaborative Data Offloading Based on Load Imbalance Degree and Data Volume-Aware DQN

The DQN is a deep neural network model based on reinforcement learning. It learns the optimal mapping strategy between states and actions through a trial-and-error feedback mechanism and employs three key techniques-experience replay, target networks, and gradient descent optimization to achieve stable and efficient training. Traditional DQN approaches overlook the perception of load imbalance degree and data volume across different time slots for DTUs, leading to significant fluctuations in load imbalance degree and difficulty in meeting the long-term constraint on the total processed data volume. To address this issue, we construct a penalty function based on the fluctuation of load imbalance degree and the deficit of total processed data volume in cloud-edge collaboration, in addition to the traditional reward function. This drives the DQN to evolve toward maximizing the reward function and minimizing the penalty function, thereby suppressing fluctuations in load imbalance degree across different time slots and ensuring the long-term constraint.

The load imbalance degree and data volume-aware DQN algorithm in this paper adopts a dual-network architecture combined with a scenario-specific experience replay pool. This allows the primary network to learn from experiences that integrate scenario-specific information, while the target network ensures stable training through periodic parameter synchronization. The DQN topology follows a three-layer “input-hidden-output” structure. Each layer is responsible for quantizing local load states, queue backlogs, and data volume, extracting relevant features, and evaluating offloading action values, respectively. Through the close collaboration of these components, the system effectively suppresses load imbalance and ensures long-term data processing requirements are met.

Based on the trial-and-feedback decision-making mechanism, we formulate the subproblem

P 3

as a Markov decision process (MDP) [26], which includes three components: actions, states, and rewards. Specifically, interactions among the cloud server layer, edge cluster layer, and DTUs are used to obtain rewards, aiming to optimize the edge server selection and data offloading splitting ratio selection. The detailed algorithmic procedure is presented as follows.

MDP Problem Modeling:

(1): State $S_{m} (t)$ . To maximize the total data processing volume and avoid load imbalance, at the start of each slot, DTUs perceive real-time information, such as data queue backlog, load imbalance degree, virtual queue status, and collected data volume. This information is used to construct the state space $S_{m} (t)$ , expressed as

$S_{m} (t) = \{Q_{m}^{D} (t), Q_{m}^{E} (t), Q_{m}^{C} (t), σ_{m}^{a l l} (t - 1), Z_{m} (t - 1), {\hat{A}}_{m} (t)\}$

(22)
(2): Action $O_{m} (t)$ . At the start of slot $t$ , the DTU $d_{m}$ determines the data offloading decision based on the state space, specifically expressed as

$O_{m} (t) = \{X_{m} (t); Y_{m} (t)\}$

(23)

where $X_{m} (t) = \{x_{m, k} (t) ∣ \forall d_{m} \in D, \forall s_{k} \in S\}$ represents the edge server selection decision. $Y_{m} (t) = \{Y_{m} (t) ∣ \forall d_{m} \in D\}$ represents the data offloading splitting ratio decision, including the local offloading splitting ratio, edge server offloading splitting ratio, and cloud server offloading splitting ratio.
(3): Reward $r_{m} (t)$ . The reward value for $d_{m}$ is defined as the objective function of $P 3$ , expressed as

$\begin{array}{l} r_{m} (t) = U_{m}^{c a l} (t) - ϑ σ_{m}^{a l l} (t) \\ - V Z_{m} (t) [U_{m}^{c a l} (t) - U_{m}^{\min}] \end{array}$

(24)

Algorithm Solution:

d_{m}

simultaneously maintains two neural networks, namely, the primary network and the target network, with their respective parameters denoted as

ϖ_{m}^{e} (t)

and

ϖ_{m}^{o} (t)

. The proposed algorithm consists of three main steps: selection of DTU action, state transition and experience storage, and experiential learning and network updates. The detailed description is as follows.

(1): Selection of DTU Action

At the beginning of time slot

t

, the DTU

d_{m}

inputs the current time slot’s state space

S_{m} (t)

into the primary network

ϖ_{m}^{e} (t)

to obtain the Q-values corresponding to different actions within the time slot. Based on these Q-values, action

{\hat{O}}_{m} (t)

is selected, expressed as

{\hat{O}}_{m} (t) = \underset{O_{m} (t)}{argmax} Q [S_{m} (t), O_{m} (t) ∣ ϖ_{m}^{e} (t)]

(25)

(2): State Transition and Experience Storage

To avoid the issues of unsatisfied constraints and drastic fluctuations in weighted load imbalance degree, we construct a penalty function

α_{m} (t)

based on the fluctuations in load imbalance degree between time slots and the deficit in the total processed data volume, expressed as

α_{m} (t) = |σ_{m}^{c a l} (t) - σ_{m}^{c a l} (t - 1)| - ς [U_{m}^{c a l} (t) - U_{m}^{\min}]

(26)

where

ς

represents the weighting coefficient. During the optimization process, the DQN network evolves in the direction of minimizing the penalty function, thereby suppressing fluctuations in the load imbalance degree, reducing the cumulative data processing deficit, and ensuring the long-term constraint.

The DTU

d_{m}

executes action

{\hat{O}}_{m} (t)

, calculates the reward

r_{m} (t)

and penalty

α_{m} (t)

, updates the state space to

S_{m} (t + 1)

, and stores the experience data

[S_{m} (t), {\hat{O}}_{m} (t), r_{m} (t), α_{m} (t), S_{m} (t + 1)]

into the experience buffer

P_{m} (t)

.

(3): Experiential Learning and Network Updates

The DTU randomly samples

H

experience data from the experience pool

P_{m} (t)

and calculates the mean squared error loss function of the main network, expressed as

Υ (t) = \frac{1}{H} \sum_{h = 1}^{H} [Q_{m}^{h} (t) - Q (S_{m} (t), {\hat{O}}_{m} (t) ∣ ϖ_{m}^{e} (t))]

(27)

where

Q_{m}^{h} (t)

represents the target

Q

-value obtained by the target network

ϖ_{m}^{o} (t)

based on the

h

-th group of experience data, specifically expressed as

Q_{m}^{h} (t) = r_{m} (t) - α_{m} (t) + γ Q (S_{m} (t + 1), {\hat{O}}_{m} (t + 1) ∣ ϖ_{m}^{o} (t))

(28)

where

γ \in [0, 1]

is used to measure the influence of the next slot’s reward on the current action decision.

The main network parameters are updated based on the gradient descent method, expressed as

ϖ_{m}^{e} (t + 1) \leftarrow ϖ_{m}^{e} (t) - β \nabla_{ϖ_{m}^{e} (t)} Υ (t)

(29)

where

β

is the learning rate.

\nabla_{ϖ_{m}^{e} (t)}

denotes the Hamiltonian operator, a vector differential operator used to compute the gradient of the primary network’s loss function with respect to network parameters, enabling iterative updates of these parameters [27]. The meaning of the entire formula is as follows: update the primary network parameters

ϖ_{m}^{e} (t)

in the opposite direction of the loss function’s gradient to minimize the loss function

Υ (t)

, ultimately allowing the DQN to learn the optimal offloading decision strategy.

The target network is synchronized with the primary network every

T_{0}

slots. The update process can be expressed as

ϖ_{m}^{o} (t + 1) = \{\begin{matrix} ϖ_{m}^{e} (t + 1), & t \mod T_{0} = 0 \\ ϖ_{m}^{o} (t), & else \end{matrix}

(30)

where

t \mod T_{0}

represents the remainder of

t

divided by

T_{0}

.

Leveraging the DQN network’s robust perception and precise decision-making capabilities, the proposed algorithm constructs a penalty function based on the fluctuation of the DTU load imbalance degree between time slots and the deficit in the total processed data volume. During the optimization process, it continuously perceives the load imbalance degree of DTUs and data volume, evolving toward suppressing fluctuations in load imbalance degree across different time slots and reducing the deficit in the total processed data volume of cloud-edge collaboration.

4.2. Cloud-Edge Computing Resource Allocation Based on Adaptive Differential Evolution

Although traditional genetic algorithms can solve non-convex optimization problems like

P 4

, they may generate new genes during the mutation process that overlap with existing genes in the population, leading to a high likelihood of falling into local optima in the later stages of optimization. To address these issues, we propose a cloud-edge computing resource allocation algorithm based on adaptive differential evolution. Built upon the evolutionary framework of genetic algorithms, the algorithm introduces an adaptive mutation scaling factor to mitigate its tendency to converge to local optima. Specifically, an adaptive mutation scaling factor is incorporated during the mutation phase. This factor dynamically adjusts as iterations proceed, enabling a more efficient exploration of the solution space and facilitating the identification of the global optimum within a shorter timeframe. The detailed workflow of the proposed adaptive differential evolution algorithm is outlined in Stage 2 of Algorithm 1. We consider a total of

G

iterations, with the specific algorithm flow as follows.

Population Construction:

There are

K

particles in the population, and their set is denoted as

Q = \{q_{1}, \dots, q_{k}, \dots, q_{K}\}

.

q_{k}

represents the gene vector of particle

k

, denoted as

q_{k} = \{h_{k} \Rightarrow {f_{m, k}^{E} (t), f_{m}^{C} (t)}\}

, which maps the cloud-edge computing resource allocation of data processing tasks onto chromosome

h_{k}

.

h_{k}

consists of multiple genes, denoted as

h_{k} = \{h_{k}^{1}, \dots, h_{k}^{m}, \dots, h_{k}^{M}\}

, where

h_{k}^{m}

represents the computing resources allocated by edge server

s_{k}

and cloud server

s_{0}

to DTU

d_{m}

.

Fitness Function Design:

The fitness function can measure a particle’s adaptation to the environment. Since the essence of the adaptive differential evolution algorithm is to stochastically search for the minimum value, the fitness function of the differential evolution algorithm is defined as the optimization objective of

P 4

, expressed as

\begin{array}{l} F (h_{k} \Rightarrow {f_{m, k}^{E} (t), f_{m}^{C} (t)}) = \\ \sum_{m = 1}^{M} U_{m}^{c a l} (t) - ϑ σ_{m}^{a l l} (t) - V Z_{m} (t) [U_{m}^{c a l} (t) - U_{m}^{\min}] \end{array}

(31)

Population Initialization:

The main task of population initialization is to assign values to the chromosomes and genes of each particle. Taking particle

k

as an example, the gene

h_{k}

on its chromosome

h_{k}^{m}

should be greater than or equal to 0 and must satisfy the constraint

C_{4}

.

Adaptive Differential Evolution:

Differential evolution demonstrates robust performance in non-convex optimization through mutation, crossover, and selection operations. Building upon the traditional differential evolution algorithm, this study introduces an adaptive mutation scaling factor during the mutation phase. This factor dynamically adjusts as the iterative optimization of cloud-edge computing resource allocation progresses, enabling more effective exploration of the solution space and accelerating identification of the global optimum.

(1): Adaptive Genetic Mutation

Let

q_{k} (t, g)

,

v_{k} (t, g)

, and

c_{k} (t, g)

denote the original gene, mutated gene, and crossover gene of particle

k

in the

g

-th iteration, respectively. In the gene mutation phase, three randomly selected particles from the same generation are used to generate a corresponding mutated gene for each particle, denoted as

v_{k} (t, g) = q_{r 1} (t, g) + ξ (g) [q_{r 2} (t, g) - q_{r 3} (t, g)]

(32)

where

r 1, r 2, r 3 \in {1 \dots k \dots K}

represent the identifiers of different particles, and

ξ

is the adaptive mutation rate factor, denoted as

ξ (g) = ξ_{0} 2^{e^{(1 - \frac{G}{1 + G - g})}}

(33)

where

ξ_{0} \in (0, 1)

represents the preset constant mutation rate factor. During the early stages of iteration,

ξ

is relatively large, which helps maintain population diversity and enhances the global search capability. As the number of iterations increases,

ξ

gradually decreases, preserving the valuable information of the population, preventing the destruction of the optimal solution, and improving convergence speed. This leads to the effective utilization and allocation of cloud-edge computing resources for the DTU.

(2): Dynamic Gene Crossover Recombination

Crossover refers to the replacement of original genes with their mutated counterparts under a certain probability, with the specific crossover position randomly determined, expressed as

c_{k} (t, g) = \{\begin{cases} q_{k} (t, g), \\ random (0, 1) < CR or randint (0, K) = e \\ v_{m} (t, g), else \end{cases}

(34)

where

CR

represents the crossover probability, and

e

is a random integer.

(3): Greedy-based Optimal Particle Selection

In each iteration of the proposed algorithm, the best particle is selected and compared with the best particle from previous iterations stored in the population. The individual with the higher fitness is retained. Let variable

{\tilde{h}}_{k}

denote the current best particle, and a greedy strategy is used to select the optimal particle as follows:

{\tilde{h}}_{k} = \{\begin{cases} c_{k} (t, g), \\ F (\{c_{k} (t, g) \Rightarrow {f_{m, k}^{E} (t), f_{m}^{C} (t)}\}) \\ > F (\{{\tilde{h}}_{k} \Rightarrow {f_{m, k}^{E} (t), f_{m}^{C} (t)}\}) \\ {\tilde{h}}_{k}, else \end{cases}

(35)

(4): Adaptive Differential Iteration

Let

g = g + 1

, repeat steps (1) and (3), and continuously perform adaptive differential evolution iterations until the

G

-th iteration is completed.

5. Simulation

We construct a simulation area for the distribution network with dimensions of 300 m × 300 m, deploying eight DTUs randomly distributed on power equipment. These units collect voltage, current, and equipment status data at a frequency of 12.8 kHz. The edge server cluster consists of six edge servers with a time slot length of 1 s, and the unit data arrival rate follows a Poisson distribution. Additional simulation parameters are presented in Table 1 [28,29].

We validate the performance improvement of the proposed algorithm by comparing it with two traditional algorithms, which are configured as follows. Baseline 1 is a task offloading algorithm based on the traditional DQN. By using reinforcement learning to selectively offload computing tasks to optimal edge nodes or servers, it optimizes resource allocation and task offloading while minimizing system latency and resource consumption [17]. However, Baseline 1 does not consider the awareness of the load imbalance degree and data volume across different time slots or data offloading splitting ratio selection. Baseline 2 is a cloud-edge-terminal computing resource allocation algorithm based on differential evolution. By combining genetic algorithms with differential evolution, it achieves efficient task offloading and resource scheduling optimization, improving scheduling stability and the accuracy of resource allocation [23]. However, Baseline 2 does not consider adaptive factors. Meanwhile, the simulation parameter settings and optimization objectives of both baseline algorithms are the same as that of the proposed algorithm, but neither consider the load imbalance degree.

Figure 3 shows the data volume of cloud-edge-terminal collaborative processing over time slots. Compared to Baseline 2, the proposed algorithm improves the total data volume processed by cloud-edge collaboration by 45.7%. The reason is that the proposed algorithm considers the long-term data volume constraints of DTUs and dynamically adjusts strategies based on load imbalance degree across different time slots and virtual queue variations. A penalty function is introduced into the DQN network to maximize the network reward function. Under the premise of ensuring long-term constraints on the total data volume, it jointly optimizes edge server selection, data offloading splitting ratios, and cloud-edge computing resource allocation, thereby maximizing the total data volume.

Figure 4 compares the load imbalance degree over time slots. The proposed algorithm demonstrates superior performance in reducing the load imbalance degree, with its average load imbalance degree stabilizing between 0.02 and 0.08. Compared to Baseline 1 and Baseline 2, the average load imbalance degree is reduced by 28.6% and 85.3%, respectively. This is because a penalty function related to load imbalance degree fluctuations is considered during Q-value updates, thereby promptly correcting the learning direction of the DQN agent and improving the stability of the load imbalance degree. Additionally, the proposed algorithm incorporates an adaptive mutation scaling factor in differential evolution, enhancing the global search capability during the initial stages of computing resource allocation policy iteration and avoiding local optima in the later stages.

Figure 5 illustrates queue backlogs of different DTUs at time slot 400. The figure indicates that, compared with Baseline 1 and Baseline 2, the proposed algorithm achieves the most stable queue backlog. As indicated by the results in Figure 4, the proposed algorithm achieves optimal performance, ensuring balanced distribution of queue backlogs across DTUs, edge clusters, and the cloud server.

Figure 6 presents the variation in loss function values over time slots. The proposed algorithm demonstrates the fastest convergence speed and the least volatility in loss function values, significantly outperforming Baseline 1. This is attributed to the proposed algorithm’s use of penalty terms to promptly correct the learning direction of the DQN agent, enabling the algorithm to quickly converge to optimal edge server selection decisions and data offloading splitting ratio decisions.

In summary, the comparison of performance indicators of different algorithms is summarized in Table 2.

Table 3 shows the comparison of indicators of different algorithms in various scenarios. As the scale increases, the average data volume, load imbalance degree, and convergence speed increase approximately linearly without significant performance degradation. This indicates that the proposed algorithm’s load fluctuation penalty mechanism can still work effectively in large-scale nodes. On the other hand, the state dimension of the DQN in large-scale scenarios has expanded from the original “8 × 3 (DTU queue) + 6 × 8 (edge queue)” to “50 × 3 + 20 × 50”. However, through experience replay and target network soft updates, the accuracy of Q-value estimation can still be maintained.

6. Conclusions

To address the issues of inadequate cloud-edge integration, high load imbalance degree, and low efficiency in data processing for DTU edge clusters, we proposed a cloud-edge collaboration-based data processing method for DTU edge clusters. The formulated joint optimization problem was decomposed into a cloud-edge collaborative data offloading subproblem and a cloud-edge computing resource allocation subproblem. The first one is solved by the load imbalance degree and data volume-aware DQN, which can suppress the fluctuation of the load imbalance degree and guarantee differentiated long-term data volume constraints. The second one is solved by adaptive differential evolution, which facilitates deeper exploration of the solution space and accelerates global optimum identification by overcoming the gene overlapping issues. Compared to Baseline 1 and Baseline 2, the proposed method increases the total cloud-edge collaborative data processing volume by 10.2% and 15.4%, respectively, and reduces the average load imbalance degree by 35.1% and 23.7%, respectively.

The proposed algorithm addresses existing issues in DTU edge cluster data processing, such as insufficient cloud-edge integration, high load imbalance, and low processing efficiency. However, its optimization focuses solely on improving data processing efficiency and reducing load imbalance. It does not consider the grid energy management among electrical equipment, nor does it quantify the correlation between computing resource allocation and carbon emissions. In future research, we will develop a coupled grid energy management and carbon-emission model for collaborative edge servers and DTUs. This model will reveal how computing resource allocation affects grid energy management and carbon emissions. We will then improve the DQN algorithm by designing an intelligent reward function. This function will integrate data processing efficiency, energy scheduling strategies, and carbon emissions. A multi-objective electricity-carbon optimization framework will be established. It will enhance the algorithm’s adaptive decision-making under electric-carbon constraints. Ultimately, this will support low-carbon and intelligent distribution network operation.

Author Contributions

Conceptualization, R.Z., Z.L. and X.C.; methodology, R.Z., Z.L. and S.L.; software, S.L. and J.Z.; validation, R.Z. and Z.L.; formal analysis, R.Z., Z.L. and S.L.; investigation, R.Z. and Z.L.; data curation, S.L. and J.Z.; writing—original draft preparation, R.Z., Z.L., S.L. and J.Z.; writing—review and editing, R.Z., Z.L. and X.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by China Southern Power Grid Co., Ltd.’s science and technology program under Grant No. GDKJXM20231495 (030000KC23120094).

Data Availability Statement

The data that support the findings of this study are available from the corresponding author, Xiaomei Chen, upon reasonable request.

Conflicts of Interest

Authors Ruijiang Zeng and Zhiyong Li were employed by the Electric Power Research Institute of Guangdong Power Grid Co., Ltd. and China Southern Power Grid Key Laboratory of Power Grid Automation Laboratory. All authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest. The companies had no role in the design of the study, in the collection, analysis, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Nomenclature

Variable	Explanation	Variable	Explanation
$D$	Set of DTU	$Q_{m}^{h} (t)$	Target Q-value
$S$	Set of edge servers	$S_{m} (t)$	State
$s_{0}$	Cloud server	$O_{m} (t)$	Action
$φ_{m}^{D} (t)$	Local offloading splitting ratio of DTU $d_{m}$	$r_{m} (t)$	Reward
$φ_{m}^{E} (t)$	Edge cluster offloading splitting ratio of $d_{m}$	$X_{m} (t)$	Edge server selection decision vector
$φ_{m}^{C} (t)$	The cloud server offloading splitting ratio of $d_{m}$	$Y_{m} (t)$	Data offloading splitting ratio decision vector
$U_{m}^{D} (t)$	Local processing data volume of $d_{m}$	$α_{m} (t)$	Penalty function
$f_{m}^{D} (t)$	Computing resources allocated by $d_{m}$ for local data processing	$P_{m} (t)$	Experience pool
$U_{m, k}^{E} (t)$	Amount of data processed by $s_{k}$ for $d_{m}$	$ϖ_{m}^{o} (t)$	Target network
$f_{m, k}^{E} (t)$	Computing resources provided by $s_{k}$ for processing data of $d_{m}$	$β$	Learning rate
$U_{m}^{C} (t)$	Amount of data processed by $s_{0}$ for $d_{m}$ at slot $t$	$\nabla_{ϖ_{m}^{e} (t)}$	Hamiltonian operator
$f_{m}^{C} (t)$	Computing resources provided by $s_{0}$ for processing data of $d_{m}$	$Q$	Population set
$σ_{m}^{D} (t)$	Load imbalance degrees for $d_{m}$ of DTU	$ϑ$	Weighting parameter
$σ_{m}^{E} (t)$	Load imbalance degrees for $d_{m}$ of edge clusters	$ξ$	Adaptive mutation rate factor
$σ_{m}^{C} (t)$	Load imbalance degrees for $d_{m}$ of cloud server	$CR$	Crossover probability
$U_{m}^{\min}$	Minimum tolerable average data processing volume of $d_{m}$	${\tilde{h}}_{k}$	Current best particle
$h_{k}^{m}$	Computing resources allocated by edge server $s_{k}$ and cloud server $s_{0}$ to DTU $d_{m}$	$V$	Weight coefficient
${\hat{A}}_{m} (t)$	Amount of newly arrived data at $d_{m}$ during slot $t$	$δ_{m, k} (t)$	Electromagnetic interference
$Z_{m} (t)$	Deviation between the actual amount of processed data of $d_{m}$ and required data processing constraint	$δ_{0}$	Noise power
$λ_{m}$	Computing complexity of data processing	$w_{m}$	Queue backlog weight

References

Wu, B.; Wang, C.; Wei, J.; Pei, X. Intelligent inspection technology for substations based on multimodal sensing. Guangdong Electr. Power 2024, 37, 54–63. [Google Scholar]
Kurobe, A.; Nakajima, Y.; Kitani, K.; Saito, H. Audio-visual self-supervised terrain type recognition for ground mobile platforms. IEEE Access 2021, 9, 29970–29979. [Google Scholar] [CrossRef]
Wang, J.; Li, J.; Gao, Z.; Han, Z.; Qiu, C.; Wang, X. Resource management and pricing for cloud computing based mobile blockchain with pooling. IEEE Trans. Cloud Comput. 2023, 11, 128–138. [Google Scholar] [CrossRef]
Zhong, J.; Liu, B.; Yu, X.; Wong, P.; Wang, Z.; Xu, C.; Zhou, X. Enhancing voltage compliance in distribution network under cloud and edge computing framework. IEEE Trans. Cloud Comput. 2023, 11, 1217–1229. [Google Scholar] [CrossRef]
Jia, Q.; Jiao, W.; Chen, S.; Yan, Z.; Sun, H. A trustworthy cloud-edge collaboration framework for scheduling distributed energy resources in distribution networks. IEEE Trans. Smart Grid 2025, 16, 2691–2694. [Google Scholar] [CrossRef]
Gao, X.; Hou, L.; Chen, B.; Yao, X.; Suo, Z. Compressive-learning-based federated learning for intelligent IoT with cloud-edge collaboration. IEEE Internet Things J. 2025, 12, 2291–2294. [Google Scholar] [CrossRef]
Xu, Q.; Zhang, Z.; Zhuo, Y.; Li, S. Distributed resource intelligent control technology and testing validation for new power systems. Guangdong Electr. Power 2025, 38, 30–40. [Google Scholar]
Nguyen, H.; Usman, M.; Buyya, R. DRLQ: A deep reinforcement learning-based task placement for quantum cloud computing. In Proceedings of the 2024 IEEE 17th International Conference on Cloud Computing (CLOUD), Shenzhen, China, 7–13 July 2024; pp. 475–481. [Google Scholar]
Gao, C.; Zhang, J.; Xuan, D.; Deng, G.; Cai, W. Network operation and maintenance technology based on machine learning algorithms in cloud computing environment. In Proceedings of the 2024 International Conference Interactive Intelligent Systems and Techniques (IIST), Bhubaneswar, India, 4–5 March 2024; pp. 317–321. [Google Scholar]
Chen, Y.; Zhao, C. Improved approximation of dispatchable region in radial distribution networks via dual SOCP. IEEE Trans. Power Syst. 2023, 38, 5585–5597. [Google Scholar] [CrossRef]
Dang, S.; Zhang, J.; Lu, T.; Zhang, Y.; Song, P.; Zhang, J.; Liu, R. Cloud-edge collaborative high-frequency acquisition data processing for distribution network resilience improvement. Front. Energy Res. 2024, 12, 1440487. [Google Scholar] [CrossRef]
Bernstein, A.; Burton, M.; Ghenassia, F. How to bridge the abstraction gap in system level modeling and design. In Proceedings of the IEEE/ACM International Conference on Computer Aided Design, San Jose, CA, USA, 7–11 November 2004; pp. 910–914. [Google Scholar]
Li, J.; Chai, R.; Gui, K.; Liang, C. Joint task offloading and resource scheduling in low earth orbit satellite edge computing networks. Electronics 2025, 14, 1016. [Google Scholar] [CrossRef]
Li, Z.; Xu, Y.; Fang, S.; Wang, Y.; Zheng, X. Multi objective coordinated energy dispatch and voyage scheduling for a multienergy ship microgrid. IEEE Trans. Ind. Appl. 2020, 56, 989–999. [Google Scholar] [CrossRef]
Binh, L.; Duong, T. Load balancing routing under constraints of quality of transmission in mesh wireless network based on software defined networking. J. Commun. Netw. 2021, 23, 12–22. [Google Scholar] [CrossRef]
Chen, X.; Yao, Z.; Chen, Z.; Min, G.; Zheng, X.; Rong, C. Load balancing for multiedge collaboration in wireless metropolitan area networks: A two-stage decision-making approach. IEEE Internet Things J. 2023, 10, 17124–17136. [Google Scholar] [CrossRef]
Yang, G.; Sang, J.; Zhang, X.; He, X.; Liu, Y.; Sun, F. Sensing data aggregation in mobile crowd sensing: A cloud-enhanced-edge-end framework with DQN-based offloading. IEEE Internet Things J. 2024, 11, 31852–31861. [Google Scholar] [CrossRef]
Bai, J.; Chen, Y. The node selection strategy for federated learning in UAV-assisted edge computing environment. IEEE Internet Things J. 2023, 10, 13908–13919. [Google Scholar] [CrossRef]
Liu, H.; Cao, G. Deep reinforcement learning-based server selection for mobile edge computing. IEEE Trans. Veh. Technol. 2021, 70, 13351–13363. [Google Scholar] [CrossRef]
Yan, J.; Bi, S.; Zhang, Y.J.; Tao, M. Optimal task offloading and resource allocation in mobile-edge computing with inter-user task dependency. IEEE Trans. Wirel. Commun. 2020, 19, 235–250. [Google Scholar] [CrossRef]
Alam, M.; Jamalipour, A. Multi-agent DRL-based Hungarian algorithm (MADRLHA) for task offloading in multi-access edge computing internet of vehicles (IoVs). IEEE Trans. Wirel. Commun. 2022, 21, 7641–7652. [Google Scholar] [CrossRef]
Liu, L.; Yuan, X.; Chen, D.; Zhang, N.; Sun, H.; Taherkordi, A. Multi-user dynamic computation offloading and resource allocation in 5G MEC heterogeneous networks with static and dynamic subchannels. IEEE Trans. Veh. Technol. 2023, 72, 14924–14938. [Google Scholar] [CrossRef]
Laili, Y.; Wang, X.; Zhang, L.; Ren, L. DSAC-configured differential evolution for cloud-edge-device collaborative task scheduling. IEEE Trans. Ind. Inf. 2024, 20, 1753–1763. [Google Scholar] [CrossRef]
Zhu, J.; Li, X.; Ruiz, R.; Xu, X. Scheduling stochastic multi-stage jobs to elastic hybrid cloud resources. IEEE Trans. Parallel Distrib. Syst. 2018, 29, 1401–1415. [Google Scholar] [CrossRef]
Farahnakian, F.; Pahikkala, T.; Liljeberg, P.; Plosila, J.; Hieu, N.T.; Tenhunen, H. Energy-aware VM consolidation in cloud data centers using utilization prediction model. IEEE Trans. Cloud Comput. 2019, 7, 524–536. [Google Scholar] [CrossRef]
Kalagarla, K.C.; Kartik, D.; Shen, D.; Jain, R.; Nayyar, A.; Nuzzo, P. Optimal control of logically constrained partially observable and multiagent Markov decision processes. IEEE Trans. Autom. Control 2025, 70, 263–277. [Google Scholar] [CrossRef]
Li, C.; Jiang, K.; He, G.; Bing, F.; Luo, Y. A computation offloading method for multi-UAVs assisted MEC based on improved federated DDPG Algorithm. IEEE Trans. Ind. Inf. 2024, 20, 14062–14071. [Google Scholar] [CrossRef]
Lu, Y.; Liu, L.; Panneerselvam, J.; Gu, J.; Garraghan, P.; Min, G. CECF: A DNN-based energy-efficient cloud-edge collaboration framework for intelligent workload scheduling in 6G-enabled transportation systems. IEEE Trans. Intell. Transp. Syst. 2025, 26, 17889–17900. [Google Scholar] [CrossRef]
Yamansavascilar, B.; Baktir, A.C.; Sonmez, C.; Ozgovde, A.; Ersoy, C. DeepEdge: A deep reinforcement learning based task orchestrator for edge computing. IEEE Trans. Netw. Sci. Eng. 2023, 10, 538–552. [Google Scholar] [CrossRef]

Figure 1. Cloud-edge integrated data processing architecture for DTU edge clusters.

Figure 2. Schematic diagram of cloud-edge collaboration-based data processing method for DTU edge clusters.

Figure 3. The data volume of cloud-edge-terminal collaborative processing over time slots.

Figure 4. The load imbalance degree over time slots.

Figure 5. Queue backlogs of different DTUs at time slot 400.

Figure 6. Variation in loss function over time slots.

Table 1. Simulation parameters.

Parameter	Value	Parameter	Value
$I$	8	$δ_{0}$	−114 dBm
$T$	400	$f_{m, k}^{E} (t)$	[2.5, 4] GHz
$P_{m, k} (t)$	[0.2, 0.4] W	$f_{m}^{C} (t)$	[10, 20] GHz
$B_{m, k} (t)$	1 MHz	$f_{m}^{D} (t)$	[0.5, 1] GHz
$τ$	1 s	$δ_{m, k} (t)$	[−30, −14] dBm
$V$	5	$λ_{m}$	500
$β$	0.001	$ξ_{0}$	0.9
$K$	50	${\hat{A}}_{m} (t)$	[5, 10] Mbits

Table 2. Comparison of performance indicators of different algorithms.

Performance Indicator (t = 400)	Proposed	Baseline 1	Baseline 2	Performance Improvement vs. Baseline 1, 2
Total data volume of cloud-edge-terminal collaborative processing	61.9 Mb	61.4 Mb	47.8 Mb	0.8%/29.5%
Convergence load imbalance degree	0.05	0.19	0.34	28.6%/85.3%
Average terminal-side queue backlog	5.4 Mb	11.8 Mb	45.4 Mb	54.2%/88.1%
Average edge-side queue backlog	5.5 Mb	17.1 Mb	62.8 Mb	67.8%/91.2%
Cloud-side queue backlog	5.8 Mb	11.6 Mb	30.0 Mb	50.3%/80.7%
Convergence speed	211	310	Unconvergence	31.9%/-
Convergence loss function	0.18	0.23	/	21.7%/-

Table 3. Comparison of indicators of different algorithms in various scenarios.

Key Metrics (Average)	8 DTUs + 6 Edge Servers	30 DTUs + 15 Edge Servers	50 DTUs + 20 Edge Servers
Total data volume of cloud-edge-terminal collaborative processing	61.9 Mb	174.8 Mb	230.3 Mb
Convergence load imbalance degree	0.05	0.08	0.12
Convergence speed	211	233	251

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Zeng, R.; Li, Z.; Li, S.; Zhang, J.; Chen, X. Cloud-Edge Collaboration-Based Data Processing Method for Distribution Terminal Unit Edge Clusters. Energies 2026, 19, 269. https://doi.org/10.3390/en19010269

AMA Style

Zeng R, Li Z, Li S, Zhang J, Chen X. Cloud-Edge Collaboration-Based Data Processing Method for Distribution Terminal Unit Edge Clusters. Energies. 2026; 19(1):269. https://doi.org/10.3390/en19010269

Chicago/Turabian Style

Zeng, Ruijiang, Zhiyong Li, Sifeng Li, Jiahao Zhang, and Xiaomei Chen. 2026. "Cloud-Edge Collaboration-Based Data Processing Method for Distribution Terminal Unit Edge Clusters" Energies 19, no. 1: 269. https://doi.org/10.3390/en19010269

APA Style

Zeng, R., Li, Z., Li, S., Zhang, J., & Chen, X. (2026). Cloud-Edge Collaboration-Based Data Processing Method for Distribution Terminal Unit Edge Clusters. Energies, 19(1), 269. https://doi.org/10.3390/en19010269

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Menu

Cloud-Edge Collaboration-Based Data Processing Method for Distribution Terminal Unit Edge Clusters

Abstract

1. Introduction

2. System Model

2.1. Edge Cluster–Cloud Server Collaboration Model

2.2. DTU Local Computing Model

2.3. Edge Cluster Data Processing Model

2.4. Cloud Server Data Processing Model

2.5. Processed Data Volume Model

2.6. Load Imbalance Degree Model

2.7. Long-Term Data Processing Volume Model

3. Problem Formulation and Transformation

4. Cloud-Edge Collaboration-Based Data Processing Method for DTU Edge Clusters

4.1. Cloud-Edge Collaborative Data Offloading Based on Load Imbalance Degree and Data Volume-Aware DQN

4.2. Cloud-Edge Computing Resource Allocation Based on Adaptive Differential Evolution

5. Simulation

6. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

Nomenclature

References

Share and Cite

Article Metrics

Article Access Statistics

Further Information

Guidelines

MDPI Initiatives

Follow MDPI