A Study on the Cloud-Edge-Terminal Framework for Large Computing Models in New Power Systems

Fang, Hualiang; Feng, Ziyi; Li, Weibo

doi:10.3390/en19061501


by Hualiang Fang ^1,*, Ziyi Feng ^1 and Weibo Li ^2

^1 School of Electrical Engineering and Automation, Wuhan University, Wuhan 430072, China

^2 School of Automation, Wuhan University of Technology, Wuhan 430072, China

^* Author to whom correspondence should be addressed.

Energies 2026, 19(6), 1501; https://doi.org/10.3390/en19061501

Submission received: 3 January 2026 / Revised: 1 March 2026 / Accepted: 5 March 2026 / Published: 18 March 2026

(This article belongs to the Special Issue Advanced Techniques for Optimization and Energy Management in Smart Grids)


Abstract

With the rapid evolution of a new power system characterized by a high proportion of renewable energy, system operations have become increasingly random, variable, and uncertain. The system model exhibits features such as high dimensionality, multiple time scales, stochastic behavior, and nonlinearity. This paper proposes a large-scale computational power system model architecture based on cloud-edge-terminal collaboration. By defining functional roles within the cloud-edge-terminal structure and implementing a global model coordination mechanism, the approach enables an organic integration of global awareness, local adaptation, dynamic training, and online optimization for power system problem models. At the cloud level, various object models and the power grid topology are constructed. The edge generates typical problem models for the power system, while the terminal devices produce lightweight models adapted to local grids. This architecture supports collaborative modeling for key business scenarios such as power flow analysis, stability assessment, and reactive power optimization. The study focuses on the training methods of distilled parameters within the terminal models to enhance their adaptability for real-world deployment in power systems. Simulation results demonstrate that the cloud-edge-terminal model offers excellent scalability, adaptability, and real-time performance for computations in new power systems, effectively supporting localized, intelligent operations and decision-making within the system.

Keywords: new power system; large model; cloud-edge-terminal architecture; model distillation

1. Introduction

With the large-scale integration of renewable energy, the traditional power grid is gradually evolving into a new power system characterized by a more complex structure, larger scale, and more stochastic operation. The stochastic and fluctuating nature of renewable energy output poses heightened challenges to the stability, security, economic efficiency, and low-carbon operation of power systems. Meanwhile, the interactive randomness of generation and load, along with the integration of diverse energy storage technologies, has led to strong system coupling, multi-timescale dynamics, and high operational variability. Under these conditions, traditional power system models are increasingly inadequate to meet the analytical and computational needs of new power systems. To address this, this paper proposes a cloud-edge-terminal collaborative modeling framework based on AI large models.

After the integration of renewable energy, power systems exhibit much stronger uncertainty and nonlinearity, and the scale of grid nodes, computational burden, and overall complexity increase sharply. Modeling approaches that rely on priors, forecasts, or rule-based formulations provide only limited accuracy [1,2], and traditional methods based on physical models and linear assumptions are becoming increasingly inadequate. References [3,4] develop modeling and analysis methods for power system planning and simulation under the random characteristics of renewable generation and loads, using multi-source heterogeneous data after the integration of renewables. The authors of ref. [5] investigate a multi-agent collaborative optimization planning model for power systems with a high share of renewable energy, while ref. [6] considers modeling of multi-scale fault-diagnosis features under a unified time series framework. Refs. [3,7] analyze typical modeling issues in new power system operation and control, such as the stochastic behavior of distributed renewables and the modeling of power-electronic interfaces.

In modeling typical problems of new power systems, AI techniques have been widely applied to power flow calculation, fault diagnosis, and voltage/frequency stability analysis. Owing to their advantages in handling complex topologies, graph neural networks, combined with deep neural networks (DNNs), have been used for dynamic equivalent modeling of microgrids with high penetration of renewable energy for frequency stability studies, achieving significant speed-ups while maintaining high accuracy [8]. Reference [9] studies RMS modeling and control of a grid-forming E-STATCOM in isolated power systems, enabling prediction and assisted assessment of system stable operation. A symmetry-preserving dynamic equivalent modeling method for large power systems based on transfer learning is proposed in [10], allowing fast identification of dynamic operating conditions. Refs. [11,12] investigate XGBoost ensemble learning models and modular modeling schemes, improving both accuracy and convenience when analyzing complex systems in a multi-model framework.

From a system-architecture perspective, AI is driving a shift in power system modeling from traditional approaches toward a hybrid “data-driven + physics-guided” paradigm. Considering the randomness of renewable and load data in power system modeling, refs. [13,14] study data-driven modeling methods that use time-series segmentation. In line with the computational characteristics of power system models, the authors of [15] develop a carbon-emission AI modeling framework for new power systems with large-scale renewable integration, highlighting the joint features of high-performance computing and AI in power system modeling. Ref. [16] investigates knowledge distillation in neural networks, where compression techniques are used to distill the knowledge of an ensemble into a single model, and a new type of ensemble is introduced to distinguish fine-grained classes that are difficult for full models to separate. Ref. [17] proposes a sample-efficient OPF learning method based on annealing knowledge distillation, which integrates decoupled tasks and improves accuracy in small-data regimes. Ref. [18] studies the selection of eigenvalues for parallel analysis of large-scale power system models, and ref. [19] develops a GRU–attention-based ultra-short-term load forecasting model for large power systems, emphasizing feature extraction from load data. By extracting attention-based features that capture the variation in renewable generation, ref. [20] proposes a Seq2Seq Transformer–based optimization method for regulating resource capacity allocation in power grids with high penetration of renewable energy.

However, conventional approaches typically construct independent models for typical problems such as power flow calculation, transient stability, and short-circuit analysis, lacking global consistency and the ability for coordinated evolution. In essence, the power system is a high-dimensional, nonlinear, strongly coupled system, in which a disturbance at any node may trigger dynamic changes in the global state. This global cascading effect is reflected not only in the coupling relationships between major nodes such as generation and load, but also in the interactions among multidimensional variables such as voltage, frequency, and power flow. Therefore, given this global nature and strong coupling, it is necessary to establish a system model with a global perspective and dynamic response capability; however, such a model is extremely large-scale and cannot meet the fast-response requirements of power system operation.

To address how to construct a model structure that both captures the globally coupled characteristics of the power system and enables rapid response, this paper proposes a large-model system based on a cloud-edge-terminal collaborative architecture. The cloud focuses on global object modeling and learning of evolutionary trends, the edge focuses on modeling typical problems, and the terminals focus on lightweight inference models for fast local response. Altogether, this forms an intelligent modeling framework that provides global coverage while responding to local conditions, thereby meeting the multi-level requirements of new power system operation.

2. Organizational Structure of Large Models

In new power systems, factors such as a high penetration of renewable energy, demand-side interaction, and deep coupling among generation, load, and storage jointly drive system analysis toward higher complexity and autonomous intelligence. The power system is a highly coupled whole, and comprehensive, accurate computation requires taking all objects into account and constructing an integrated model with global perception capability. However, such a global model is extremely large in scale, stochastic and dynamic, and highly complex. Typical power system problems such as power flow and stability need to be modeled on the basis of global awareness and are also closely related to variations at local terminals. Accurate analysis of the local grid likewise needs to be built upon the global model, but this makes the computational workload too heavy to satisfy the fast-response requirements of local services. As shown in Figure 1, a cloud-edge-terminal collaborative architecture is an effective way to tackle this complexity challenge. Its essence lies in functionally layering the processes of modeling, training, inference, and optimization of large power system models and deploying them separately on the cloud, edge, and terminal, which then collaborate to complete the tasks of power system modeling, analysis, control, and optimization [20].

At the cloud level in Figure 1, unified object models are constructed for generation, network, load, and storage. Based on large-scale historical operation data of the grid, a family of large power system models is trained for global application and shared across different regions and tasks. The cloud not only provides model parameters, structures, and global feature information, but also performs model calibration and synchronized updates, thereby offering the edge and terminal layers the necessary modeling basis and data support.

The edge model in Figure 1 focuses on modeling and analysis of typical power system problems, such as optimal power flow, stability assessment, and reactive power optimization. By extracting relevant object models and topology information from the cloud, the edge layer couples system states, control objectives, and constraint conditions to form problem-oriented online multi-scale computational models, enabling real-time modeling of typical problems under globally aware online monitoring.

The terminal model in Figure 1 is deployed in local distribution networks, microgrids, and park-level grids to enable high-frequency, low-latency intelligent control responses. Due to resource constraints at the terminal level, models are distilled and pruned by the edge layer to generate lightweight inference models. These retain essential feature parameters and serve tasks such as local grid operation optimization and autonomous edge control.

Collaboration among the three layers is achieved through standardized model and data interfaces and conversion protocols, ultimately building a large power system model framework that integrates global perception, problem-oriented analysis, and local autonomy, thereby comprehensively enhancing the intelligent analysis and regulation capabilities of the power system.

3. Cloud Model

As the central hub of the large-model framework, the cloud model not only serves as the “central brain” for global cognition of the power system, but also acts as the “knowledge source” that links multi-region and cross-layer tasks of the grid. The behaviors of various entities—generation, network, load, and storage—are highly nonlinear, strongly spatiotemporally correlated, and subject to numerous dynamic constraints. Through cross-period learning and regionally integrated modeling, the cloud model extracts key operating features from a global perspective and abstracts a unified modeling framework for different types of devices (such as wind power, photovoltaics, conventional generating units, the grid, energy storage, and flexible loads).

At the same time, based on grid topology and dynamic coupling relationships, graph neural network (GNN) structures are constructed among different types of entities to achieve the fusion of physical correlations and information flows. The cloud layer thus aggregates a comprehensive system that includes standardized object modeling, graph-based representation of grid structures, unified model input–output interfaces, task-driven collaborative training, local model distillation, and model update mechanisms.

3.1. Object Models

Typical power system objects include various power sources, grid components, loads, and energy storage systems. Taking photovoltaic (PV) systems as an example, the cloud model integrates physical mechanisms with AI-based modeling. Parameters such as conversion efficiency, temperature derating coefficient, model weights, and output bias need to be periodically trained and dynamically updated based on operational data. Through techniques such as sliding-window training, residual compensation, federated learning, and multi-model management, the system builds an intelligent, sustainable, and adaptive PV modeling framework.

The conventional empirical formula for PV output can be expressed as:

P_{pv} = η \cdot A \cdot G_{t} \cdot [1 - γ (T_{c} - T_{ref})]    (1)

where:

P_{pv}: Actual output power (W)

η: Module efficiency (affected by aging, cleanliness)

A: Total module area (m²)

G_{t}: Solar irradiance (W/m²)

T_{c}: Real-time module temperature (°C)

T_{ref}: Reference module temperature (°C)

γ: Temperature derating coefficient (typically around 0.003–0.005/°C)

This model has a simple structure and strong physical interpretability, but it responds slowly to short-term weather changes and environmental impacts, and lacks predictive capability. The module efficiency η needs to be updated quarterly, as it is affected by factors such as PV panel aging and dust accumulation. The temperature coefficient γ requires annual recalibration.
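As a quick numerical check of Equation (1), the following minimal Python sketch evaluates the empirical PV output. The function name `pv_power`, the default reference temperature of 25 °C, and all sample values are illustrative assumptions, not values from this study.

```python
def pv_power(eta, area, irradiance, t_cell, t_ref=25.0, gamma=0.004):
    """Empirical PV output in watts, following Equation (1)."""
    derating = 1.0 - gamma * (t_cell - t_ref)
    return eta * area * irradiance * derating


# Hypothetical array: 18% efficiency, 100 m^2, 800 W/m^2, 45 degC cell temperature.
p = pv_power(eta=0.18, area=100.0, irradiance=800.0, t_cell=45.0)
print(f"{p:.1f} W")
```

With these assumed inputs the derating factor is 0.92, so a 20 °C rise above the reference costs about 8% of the output.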

To address these limitations, multivariate regression or time series forecasting models can be built using operational data:

\hat{P}_{t+1:t+h} = f_{AI}(X_{t}) = f_{model}(G_{t-n:t}, T_{t-n:t}, H_{t-n:t}, time, loc)    (2)

where the inputs include recent irradiance G, temperature T, humidity H, timestamps, location, etc. The output is the PV power output in the next h steps (rolling prediction). Model architectures such as LSTM, GRU, TFT, Conv1D+LSTM, and XGBoost can be used. These models capture temporal patterns, adapt to different regional characteristics, and offer high prediction accuracy—though their input features must be periodically updated.
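To make the indexing in Equation (2) concrete, the sketch below builds training samples that pair the inputs from t−n to t with the power values from t+1 to t+h. The record layout (dictionaries with `g`, `temp`, and `power` keys) and the function name `make_windows` are assumptions for illustration; any of the model families listed above could consume such samples.

```python
def make_windows(series, n, h):
    """Pair each input window (samples t-n..t) with the next h power targets."""
    samples = []
    for t in range(n, len(series) - h):
        inputs = series[t - n : t + 1]
        targets = [row["power"] for row in series[t + 1 : t + 1 + h]]
        samples.append((inputs, targets))
    return samples


# Hypothetical six-step record of irradiance, temperature, and measured power.
series = [{"g": 100.0 * k, "temp": 20.0 + k, "power": 10.0 * k} for k in range(6)]
samples = make_windows(series, n=2, h=1)
print(len(samples))
```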

(1) Sliding Window Incremental Training

Use data from the most recent few days or one week for mini-batch training to achieve short-term adaptive correction and update model weights:

θ_{t+1} = θ_{t} - η \cdot \nabla L(x_{t}, y_{t})    (3)

where:

η: Learning rate, controlling the step size of each update

\nabla L(x_{t}, y_{t}): Gradient of the loss function L with respect to parameters θ_{t}, evaluated on the current sample
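The update in Equation (3) can be sketched for a deliberately simple case: a one-feature linear model with a mean squared-error loss, re-fitted on only the most recent window of samples. The linear model, the function names, the learning rate, and the data are all assumptions chosen to keep the example short; they are not the study's configuration.

```python
def sgd_step(theta, batch, lr):
    """One update of Equation (3) for y ~ theta[0] + theta[1] * x
    with mean squared-error loss over the mini-batch."""
    g0 = g1 = 0.0
    for x, y in batch:
        err = (theta[0] + theta[1] * x) - y
        g0 += 2.0 * err / len(batch)
        g1 += 2.0 * err * x / len(batch)
    return [theta[0] - lr * g0, theta[1] - lr * g1]


def sliding_window_update(theta, history, window, lr=0.5, epochs=200):
    """Re-fit theta using only the most recent `window` samples."""
    recent = history[-window:]
    for _ in range(epochs):
        theta = sgd_step(theta, recent, lr)
    return theta


# Hypothetical normalized (irradiance, power) pairs; the last four samples
# follow a lower slope (0.8), e.g. after panel soiling.
history = [(0.2, 0.2), (0.4, 0.4), (0.6, 0.6),
           (0.2, 0.16), (0.5, 0.4), (0.8, 0.64), (1.0, 0.8)]
theta = sliding_window_update([0.0, 1.0], history, window=4)
print(theta)
```

Because only the last four samples are used, the fitted slope moves from the old value of 1.0 toward 0.8, which is the short-term adaptive correction described above.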

(2) Residual-Based Adaptive Correction Model

An auxiliary model is constructed to fit the prediction error ε(t), which is then used to compensate for the output of the base model:

P_{t}^{final} = P_{t}^{model} + ε(t), \quad ε(t) = f_{res}(G_{t}, T_{t}, t)    (4)

where:

P_{t}^{final}: Final corrected PV output

P_{t}^{model}: PV output predicted by the base physical or empirical model

ε(t): Prediction error between the forecast and actual output

f_{res}(\cdot): Residual model that fits the error term

Equation (4) is particularly suitable for handling sudden weather changes or model underperformance scenarios.
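A minimal sketch of Equation (4) follows. It assumes a linear residual model that depends on irradiance only, fitted by ordinary least squares; the source's f_res also takes temperature and time, and any regressor could replace the line fit. All names and numbers are hypothetical.

```python
def fit_residual_model(features, residuals):
    """Least-squares line eps ~ a + b * g fitted to past prediction errors.
    A linear f_res is an assumption made for brevity."""
    n = len(features)
    mean_g = sum(features) / n
    mean_e = sum(residuals) / n
    var_g = sum((g - mean_g) ** 2 for g in features)
    cov = sum((g - mean_g) * (e - mean_e) for g, e in zip(features, residuals))
    b = cov / var_g
    a = mean_e - b * mean_g
    return lambda g: a + b * g


def corrected_output(p_model, g, f_res):
    """Equation (4): base prediction plus the fitted error term."""
    return p_model + f_res(g)


# Hypothetical history: the base model over-predicts by an amount
# proportional to irradiance.
irradiance = [200.0, 400.0, 600.0, 800.0]
base_pred = [40.0, 80.0, 120.0, 160.0]
measured = [30.0, 60.0, 90.0, 120.0]
errors = [m - p for m, p in zip(measured, base_pred)]
f_res = fit_residual_model(irradiance, errors)
print(corrected_output(200.0, 1000.0, f_res))
```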

(3) Federated Learning and Edge-Side Fine-Tuning

Due to the sensitive and private nature of operational data at individual PV power plants, federated learning can be employed. In this approach, model parameters are trained locally at each edge node without uploading raw data. The cloud then aggregates model weights to update the global model:

θ^{global} = \sum_{i=1}^{N} \frac{n_{i}}{n} \cdot θ_{i}    (5)

where:

θ^{global}: Global model parameters (weight vector) aggregated in the cloud

n_{i}: Number of training samples at the i-th edge node

n: Total number of training samples across all edge nodes

θ_{i}: Model parameters trained locally at the i-th edge node
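The aggregation in Equation (5) is a sample-weighted average, which the short sketch below computes for plain Python lists. The function name `federated_average` and the three-plant example are illustrative assumptions.

```python
def federated_average(local_params, sample_counts):
    """Equation (5): sample-weighted average of per-node parameter vectors."""
    total = sum(sample_counts)
    dim = len(local_params[0])
    return [
        sum(n_i / total * theta_i[k]
            for theta_i, n_i in zip(local_params, sample_counts))
        for k in range(dim)
    ]


# Hypothetical: three PV plants, each with a two-parameter local model.
thetas = [[1.0, 0.0], [2.0, 1.0], [4.0, 2.0]]
counts = [100, 100, 200]
theta_global = federated_average(thetas, counts)
print(theta_global)
```

Only the parameter vectors and sample counts reach the aggregator in this sketch; the raw operational data stays at each node, as described above.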

On the basis of conventional physical models, the PV model further incorporates factors such as component aging, conversion efficiency, and external environmental influences during operation, while also taking into account daily, monthly, and seasonal variations in solar irradiance. By training the corresponding parameters using global data from the power grid, the generalization capability of the PV model under various operating conditions can be enhanced, enabling it to satisfy computational requirements arising from dynamic changes across different regions and time periods of the grid, and allowing direct invocation during modeling at the edge and terminal layers.

Wind power models share similar characteristics with PV models and can follow the same data-driven modeling and update mechanisms. Thermal power units involve parameters such as frequency regulation characteristics, upper and lower output limits, ramp rates, and heat rate curves, which vary over time and require periodic identification and adjustment using historical operational data. Hydropower units are influenced by water head, penstock dynamics, and reservoir scheduling plans, and the parameters of their governor systems must be adaptively tuned online to maintain accuracy. Energy storage systems (ESS) experience dynamic changes in parameters such as charge/discharge efficiency, capacity degradation coefficients, internal resistance, and voltage–power response characteristics due to time, temperature, and cycling. These parameters must be periodically identified and updated online. Load models must support high-accuracy short-term forecasting, long-term adaptability, and multi-scenario transferability. AI-based load models are constructed by integrating meteorological data, time-of-day features, and user behavior patterns. The accuracy of the power network model depends on real-time data such as network topology, line parameters (impedance, admittance), transformer status, and operating conditions. By integrating PMU and SCADA data and applying data-driven algorithms, the model parameters can be dynamically updated to reflect actual network behavior.

3.2. Model Organization

To support the multi-region, multi-task, and multi-scale computational requirements of power systems, the cloud model must be equipped with capabilities for task decoupling and unified interfacing. This enables the model to serve upward for system-level planning, assessment, and dispatch decision-making, while also supporting downward deployment of lightweight models to edge and terminal layers through distillation or pruning, ensuring adaptability to regional grids, microgrids, or local control systems.

To enable cross-scenario model transfer and composite training, all physical entities in the power system (generators, PV units, loads, storage systems, grid components, etc.) are abstracted using a standardized modeling approach. Each entity is defined as a “Model Meta-Object,” which includes the following components:

(1) Static Parameters: Rated capacity, response coefficients, controller parameters, device constraints, etc.

(2) Dynamic States: Frequency, voltage, power output, state of charge (SOC for storage), etc.

(3) Control Interfaces: Frequency regulation, voltage regulation, load response, storage control strategies, etc.

(4) Label Information: Region affiliation, equipment type, control hierarchy, etc.

All model meta-objects in the power system are encapsulated in a modular structure, facilitating the composition of edge-layer models and the invocation of models at the terminal layer.
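One way to picture the four components of a model meta-object is as a small data class. The sketch below is only an illustration of the structure listed above: the class name, field names, and the storage example are assumptions, not an interface defined in this study.

```python
from dataclasses import dataclass, field


@dataclass
class ModelMetaObject:
    """One power system entity, grouped into the four listed components."""
    name: str
    static_params: dict = field(default_factory=dict)       # e.g. rated capacity
    dynamic_state: dict = field(default_factory=dict)       # e.g. voltage, SOC
    control_interfaces: list = field(default_factory=list)  # e.g. storage control
    labels: dict = field(default_factory=dict)              # region, type, hierarchy

    def update_state(self, **measurements):
        """Overwrite dynamic states with new measurements."""
        self.dynamic_state.update(measurements)


# Hypothetical storage unit.
storage = ModelMetaObject(
    name="ess_01",
    static_params={"rated_capacity_kwh": 500.0},
    dynamic_state={"soc": 0.60},
    control_interfaces=["storage_control"],
    labels={"region": "north", "equipment_type": "storage"},
)
storage.update_state(soc=0.55, power_kw=-50.0)
print(storage.dynamic_state)
```

Keeping static parameters separate from dynamic states means an edge or terminal model can refresh measurements without touching the calibrated parameters.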

3.3. Graph-Structured Modeling

The cloud model serves as the central hub of the large-model architecture for new power systems. Building on the standardized representations of the various objects, it performs graph-structured modeling with unified inputs and outputs, supports collaborative training of different tasks, continual learning, model distillation, and model evolution. The cloud model establishes the global structural representation of the power system and, for online applications in local grids, provides model meta-objects and topological connectivity information. This enables the construction of efficient and accurate lightweight local models, supporting intelligent operation requirements across multiple regions, tasks, and scenarios.

The spatiotemporal coupling and connectivity among different objects can be organized for the entire network using graph-based methods, so as to meet the analysis needs of different grid areas and different types of problems.

The power grid is modeled as a graph G = (V,E), where:

Node set V: Physical system components (e.g., generators, storage units, loads, substations)

Edge set E: Electrical connections (transmission lines), control paths, or data communication channels

Node features X_v: Encapsulate object parameters and real-time states

Edge weights A_ij: Represent physical relationships such as electrical admittance, line length, etc.

This graph-based modeling approach enables a unified representation of the global power grid structure and can be used as input to a graph neural network (GNN) for node state prediction and regional-level grid operation assessment. It also supports rapid extraction and training of local subgraphs (for example, for regional grid dispatch optimization problems).
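The graph representation and the local subgraph extraction can be sketched with plain dictionaries. The node and edge layout, the `region` label used for selection, and the function name `extract_subgraph` are assumptions for illustration; a GNN library would normally consume an equivalent structure.

```python
def extract_subgraph(nodes, edges, region):
    """Return the nodes labelled with `region` and the edges joining them."""
    keep = {v for v, feat in nodes.items() if feat["region"] == region}
    sub_nodes = {v: nodes[v] for v in keep}
    sub_edges = {(i, j): w for (i, j), w in edges.items()
                 if i in keep and j in keep}
    return sub_nodes, sub_edges


# Hypothetical four-node grid; edge weights stand in for line admittance.
nodes = {
    "gen1": {"type": "generator", "region": "A", "p_mw": 50.0},
    "load1": {"type": "load", "region": "A", "p_mw": -30.0},
    "ess1": {"type": "storage", "region": "B", "soc": 0.6},
    "load2": {"type": "load", "region": "B", "p_mw": -20.0},
}
edges = {("gen1", "load1"): 4.0, ("load1", "ess1"): 2.5, ("ess1", "load2"): 3.0}
sub_nodes, sub_edges = extract_subgraph(nodes, edges, "B")
print(sorted(sub_nodes), sub_edges)
```

The tie line between regions A and B is dropped in this simple version; a practical extraction would likely keep boundary edges as equivalent injections.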

3.4. Unified Model Training

In new power systems, generation, network, load, and storage exhibit strong coupling, dynamic variability, and pronounced regional heterogeneity. To achieve collaborative optimization and dynamic updating of each model meta-object, it is essential to rely on a unified cloud-based data training framework. This framework aggregates multi-source heterogeneous data—such as wind and PV output, power grid flows, load time series, electricity prices, and energy storage SOC—under a unified standard. Through standardized processing and feature extraction, it provides consistent, high-frequency, and traceable inputs for all types of model meta-objects.

During training, wind and solar forecasting models depend on meteorological and output data; load models integrate factors such as temperature, time series characteristics, and user behavior; grid models require the combination of power flow and stability status; and storage models focus on SOC and charge–discharge response. The cloud data platform can simultaneously meet these multidimensional input requirements for all such models.

In addition, by leveraging training feedback and operational error analysis, the system can dynamically adjust parameters such as efficiency, capacity limits, and response delays, thereby enabling continuous evolution of model meta-objects under a unified data-driven paradigm. This unified data training mechanism not only enhances modeling consistency and collaborative performance but also lays a solid data foundation for integrated scheduling of generation, grid, load, and storage.

4. Edge Model

In the cloud-edge-terminal architecture, the cloud layer centrally builds and maintains the various “generation–grid–load–storage” object models (such as renewable energy models, load models, grid models, and energy storage models), enforcing unified standards, continuous training, and dynamic updating. The edge-layer models, by contrast, target typical engineering problems in power systems (such as power flow calculation, reactive power optimization, and stability analysis). Centered on these problems, they combine, reconstruct, and parameterize the model meta-objects to form edge computing models that are deployable and capable of fast response.

As the “intermediate logical hub” in the cloud-edge-terminal collaborative framework, the edge-layer model is mainly oriented toward specific typical power system problems and is responsible for decomposing system-level tasks, conducting regional modeling, and performing rapid problem solving. Its core role is to construct problem-oriented analytical models tailored to different tasks, based on the unified object models and structural parameters provided by the cloud.

In the edge-layer modeling process, the required model parameters and state information for generators, loads, energy storage, and grid topology in the target region are first retrieved from the cloud. These are then combined with problem objectives and control constraints to construct the corresponding mathematical models. This paper mainly analyzes the edge-layer models for three typical problems: optimal power flow, reactive power optimization, and stability.

4.1. Edge Model Architecture Analysis

(1) Optimal Power Flow (OPF)

For the Optimal Power Flow (OPF) problem, an optimization model is constructed over the entire regional network, considering nodal and line constraints as well as economic objectives:

\begin{array}{c}
\min \sum_{i} c_{i}(P_{G,i}) + λ \cdot P_{spilled} + γ \cdot C_{stor}(t) \\
\text{s.t.}
\begin{cases}
P_{G,i} + P_{i}^{dis} - P_{i}^{ch} - P_{D,i}^{pred} = \sum_{j} V_{i} V_{j} (G_{ij} \cos θ_{ij} + B_{ij} \sin θ_{ij}) \\
Q_{G,i} - Q_{D,i}^{pred} = \sum_{j} V_{i} V_{j} (G_{ij} \sin θ_{ij} - B_{ij} \cos θ_{ij}) \\
E_{i,t+1} = E_{i,t} + η_{c} P_{i}^{ch} Δt - \frac{P_{i}^{dis}}{η_{d}} Δt \\
E^{\min} \leq E_{i} \leq E^{\max}, \quad P^{ch/dis} \leq P^{\max}, \quad V_{i}^{\min} \leq V_{i} \leq V_{i}^{\max}
\end{cases}
\end{array}    (6)

where:

Q_{D,i}^{pred}: Forecasted load at node i, obtained from the cloud-based load AI model, updated every 5–15 min

P_{G,i}: Generator output, obtained from the edge-side OPF solution, real-time optimization variable

E_{i}, η_{c}, η_{d}: State of charge and efficiency of energy storage systems, obtained from the cloud-based storage model, calibrated daily or in real time

P_{i}^{ch}, P_{i}^{dis}: Charge/discharge power

G_{ij}, B_{ij}: Line parameters of the grid, obtained from the cloud-based grid model, considered quasi-static

P_{spilled}: Renewable energy curtailment, derived in real time from the cloud-based wind/PV models

c_{i}(P): Cost function of thermal generators, sourced from the cloud-based thermal unit model, updated periodically

λ: Penalty coefficient for curtailment, reflecting economic loss or carbon cost due to unutilized renewable energy

C_{stor}(t): Operational cost of energy storage, including charge/discharge loss, degradation, and price arbitrage

γ: Weighting factor for storage cost, reflecting its importance in the overall optimization objective

This model integrates cloud-provided parameters and predictions with real-time variables optimized at the edge, allowing for region-specific, adaptive OPF solutions.
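Two of the constraints in Equation (6) are easy to check numerically: the right-hand side of the active power balance and the storage energy update. The sketch below evaluates both. It does not solve the OPF, and the two-bus data, function names, and per-unit values are assumptions for illustration.

```python
import math


def active_injection(i, v, theta, g, b):
    """Right-hand side of the active power balance in Equation (6) at bus i."""
    return sum(
        v[i] * v[j] * (g[i][j] * math.cos(theta[i] - theta[j])
                       + b[i][j] * math.sin(theta[i] - theta[j]))
        for j in range(len(v))
    )


def storage_energy_next(e_now, p_ch, p_dis, eta_c, eta_d, dt):
    """Storage energy update from Equation (6)."""
    return e_now + eta_c * p_ch * dt - p_dis / eta_d * dt


# Hypothetical lossless two-bus system: line reactance 0.2 p.u., so B_01 = 5.
v = [1.0, 1.0]
theta = [0.1, 0.0]
g = [[0.0, 0.0], [0.0, 0.0]]
b = [[-5.0, 5.0], [5.0, -5.0]]
p0 = active_injection(0, v, theta, g, b)
p1 = active_injection(1, v, theta, g, b)
e_next = storage_energy_next(e_now=100.0, p_ch=10.0, p_dis=0.0,
                             eta_c=0.95, eta_d=0.95, dt=1.0)
print(p0, p1, e_next)
```

On the lossless line the two injections cancel, which is a convenient sanity check before the constraint is handed to a solver.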

(2) Volt-VAR Optimization (VVO)

For the Volt-VAR optimization problem, information from local inverters and reactive compensation devices is extracted to construct a local voltage-constrained optimization model. The objective is to minimize voltage deviations or reactive power losses, thereby ensuring voltage compliance and achieving optimal reactive power distribution:

\min \sum_{i \in N} w_{i} (V_{i} - V_{i}^{ref})^{2} + \sum_{(i,j) \in L} r_{ij} \cdot Q_{ij}^{2}    (7)

where:

V_{i}: Voltage magnitude at node i

V_{i}^{ref}: Target voltage at node i (set by cloud platform or dispatch center)

Q_{ij}: Reactive power flow on branch ij

r_{ij}: Reactive power transmission loss coefficient

w_{i}: Weighting factor for voltage importance at node i

This localized optimization model ensures voltage stability and efficient VAR support within the regional distribution or microgrid environment.
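The objective in Equation (7) can be evaluated directly for a candidate operating point, as in the sketch below. It only scores a given set of voltages and branch flows and does not perform the optimization; the dictionary layout and the two-node numbers are illustrative assumptions.

```python
def vvo_objective(v, v_ref, w, q_branch, r_branch):
    """Equation (7): weighted voltage deviation plus reactive flow loss term."""
    voltage_term = sum(w[i] * (v[i] - v_ref[i]) ** 2 for i in v)
    loss_term = sum(r_branch[ij] * q_branch[ij] ** 2 for ij in q_branch)
    return voltage_term + loss_term


# Hypothetical two-node feeder, all quantities in per unit.
v = {1: 1.02, 2: 0.97}
v_ref = {1: 1.0, 2: 1.0}
w = {1: 1.0, 2: 2.0}
q_branch = {(1, 2): 0.1}
r_branch = {(1, 2): 0.5}
cost = vvo_objective(v, v_ref, w, q_branch, r_branch)
print(cost)
```

A solver or heuristic search at the edge would compare this score across candidate inverter and compensator setpoints.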

(3) Transient Stability

For transient stability issues in power systems (such as rotor angle, frequency stability), a transient stability assessment model is constructed based on cloud-side dynamic models of generator control, excitation systems, renewable energy sources, loads, and energy storage. The structure of the edge-layer stability model is as follows:

\frac{d ΔX(t)}{dt} = \frac{1}{2H} \left( \sum P_{m,i}(t) - \sum P_{e,i}(t) - D \cdot ΔX(t) \right)    (8)

where:

P_{m,i}: Generator input power, determined by the cloud-side governor model and adjusted according to the control strategy

P_{e,i}: Electrical power

D: Overall system damping coefficient, pushed from the cloud-side model

H: System equivalent inertia (including virtual inertia), obtained by aggregating the cloud-side generator, renewable energy, and energy storage models and then pushed to the edge

ΔX(t): Deviations of stability-related variables
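To illustrate how Equation (8) behaves, the sketch below integrates it with a forward-Euler step under constant total mechanical and electrical power. The integration scheme, step size, and per-unit values are assumptions chosen for a short example; a production assessment would use the dynamic models pushed from the cloud.

```python
def simulate_deviation(p_m, p_e, inertia_h, damping_d,
                       dt=0.01, steps=4000, x0=0.0):
    """Forward-Euler integration of Equation (8) with constant power totals."""
    x = x0
    trajectory = [x]
    for _ in range(steps):
        dx_dt = (p_m - p_e - damping_d * x) / (2.0 * inertia_h)
        x += dt * dx_dt
        trajectory.append(x)
    return trajectory


# Hypothetical disturbance: a 0.1 p.u. generation deficit, H = 4 s, D = 2.
traj = simulate_deviation(p_m=0.95, p_e=1.05, inertia_h=4.0, damping_d=2.0)
print(traj[-1])
```

With these assumed values the deviation settles near (P_m − P_e)/D = −0.05 with a time constant of 2H/D = 4 s, so a larger equivalent inertia H slows the response while a larger damping D reduces the steady-state deviation.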

4.2. Edge Model Generation and Training

The edge-layer models are oriented toward the analysis needs of typical power system problems. By extracting relevant object models and global state information from the cloud-based global model, they can rapidly generate various problem-specific models (such as optimal power flow and stability analysis), improving response speed while ensuring computational accuracy, and thus offering good real-time performance and scalability.

In terms of modeling mechanisms, the edge-layer models are built on the graph structure of the cloud model and introduce a “task graph generation” mechanism. They select task-related device nodes (such as generation, load, and energy storage), extract the corresponding topology, and combine dispersed object models according to task requirements, thereby forming model structures tailored to typical power system problems.

The edge-layer models also need to be custom-trained and rapidly fine-tuned in conjunction with actual operating scenarios. When training data at the edge is insufficient, a few-shot learning strategy can be adopted: designing feature transfer modules based on the existing model structures for typical problems and coupling them with cloud pre-trained models to achieve rapid convergence of both structure and parameters, thereby enhancing the model’s generalization and adaptability in new scenarios.

To further accelerate analysis while maintaining result accuracy, the edge layer can deploy a “task template generation mechanism + fast optimization/decision engine,” such as heuristic search, approximate linear programming, and AI-assisted solvers. Edge-layer modeling should also support model pruning, compression, and distillation, providing training data and teacher-model outputs for the terminal side, and distributing “lightweight models” to enable low-cost deployment at terminals.

Within the cloud-edge-terminal architecture, the edge-layer models play a key bridging role in mapping “from global to local.” Through standardized composition of model components and graph-based modeling, the edge layer can rapidly construct models online that are consistent with actual operation, thereby supporting efficient, real-time analysis and practical deployment of typical power system applications.

5. Terminal Model

5.1. Terminal Model Distillation

The terminal model is deployed at the edge of the power system, such as in distribution automation systems, microgrid master controllers, or campus energy management systems (EMS). It is responsible for high-frequency, localized decision-making and control, and must meet strict requirements for low computational load, real-time responsiveness, and strong adaptability.

Due to limited computing resources, terminal devices cannot execute full-scale models. Therefore, a knowledge distillation mechanism, from edge model (teacher) to terminal model (student), is adopted to train lightweight models.

The terminal model distillation process includes the following steps:

(1) Training Dataset Generation at the Edge Layer

This includes input features (such as load, voltage, and disturbance information) and teacher model outputs (such as optimal dispatch outputs, reactive power responses, and stability classification results).

(2) Student Model Architecture Design

Lightweight neural network structures are selected based on task requirements, such as multi-layer perceptrons (MLPs), convolutional neural networks (CNNs), and attention-based models.

(3) Distillation Loss Function Definition

The loss function may include a combination of supervised error and distillation KL divergence, along with regularization terms.

(4) Local Training and Fine-Tuning

Model parameters are fine-tuned using on-site operational data at the terminal to enhance adaptability.

(5) Online Deployment and Update

The terminal model supports periodic or event-triggered synchronization and updates to maintain performance in dynamic environments.

The terminal model can achieve “edge-level approximation with rapid response,” providing the local power grid with fast and efficient autonomous operation optimization strategies.

5.2. Terminal Model Architecture Analysis

(1) Optimal Power Flow (OPF)

To enable fast analysis of local optimal power flow (OPF) scheduling problems at the terminal, a lightweight multilayer perceptron (MLP) network is trained using a distillation approach, with the edge-layer OPF model outputs serving as the teacher model. The student model takes as inputs the local node loads, voltage states, and topology encodings, and outputs the predicted optimal generation setpoints for each node. A power-balance regularization term is introduced into the loss function to reinforce physical constraints within the model, ensuring that the results are more consistent with actual operating conditions.

L_{OPF} = \sum_{i} (\hat{P}_{G,i} - P_{G,i}^{true})^{2} + λ \cdot |\sum_{i} \hat{P}_{G,i} - \sum_{i} P_{D,i}|

(9)

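The first term is the squared error between the predicted generation outputs and their reference values, summed over all nodes.

The second term, weighted by λ, penalizes any mismatch between total predicted generation and total demand, which encourages power balance.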

This model is trained using supervised learning on typical regional grid datasets and supports millisecond-level online inference.
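A minimal sketch of Equation (9) in plain Python may help make the two terms concrete. The function and argument names are assumptions for illustration; in practice the same expression would be written in the training framework's tensor operations.

```python
from typing import Sequence


def opf_loss(
    p_gen_pred: Sequence[float],
    p_gen_target: Sequence[float],
    p_demand: Sequence[float],
    lam: float,
) -> float:
    """Equation (9): squared-error term plus lam * |sum(P_G_hat) - sum(P_D)|."""
    if len(p_gen_pred) != len(p_gen_target):
        raise ValueError("prediction and target must have the same length")
    # First term: squared error against the teacher-provided setpoints.
    sq_err = sum((p - t) ** 2 for p, t in zip(p_gen_pred, p_gen_target))
    # Second term: absolute mismatch between total generation and total demand.
    imbalance = abs(sum(p_gen_pred) - sum(p_demand))
    return sq_err + lam * imbalance
```

As written, the balance term compares total generation with total demand only, so network losses are not represented in it.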

(2) Volt-VAR Optimization

For voltage/reactive power optimization, the teacher model adopts the full edge-layer V–Q optimization algorithm to solve for the optimal reactive power distribution. The student model is built using a shallow convolutional neural network combined with an attention mechanism, taking local voltage states and equipment configurations as inputs and outputting the reactive power control commands for the corresponding nodes. During training, apparent power limits and voltage bounds are introduced as regularization terms to ensure that the voltage control results are executable and compliant.

L_{VQ} = \sum_{i} (\hat{Q}_{i} - Q_{i}^{ref})^{2} + α \cdot Penalty_{limit} + β \cdot Penalty_{Vrange}

(10)

where:

Q_{i}^{ref}: Reactive power outputs from the edge-side Volt-VAR optimization model (teacher);

Penalty_{limit}: Apparent power limit violation penalty;

Penalty_{Vrange}: Voltage limit violation penalty.

This model can be deployed at substations or distribution automation nodes, enabling rapid reactive control in response to voltage disturbances. The CNN + Attention architecture enables fast and efficient decision-making for reactive power allocation (e.g., among inverters, SVCs, etc.).
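The structure of Equation (10) can be sketched as follows. The paper does not give the exact form of the two penalty terms, so the hinge-style penalties, the per-unit voltage bounds of 0.95 and 1.05, and all function names below are assumptions for illustration only.

```python
import math
from typing import Sequence


def limit_penalty(p: Sequence[float], q: Sequence[float], s_max: Sequence[float]) -> float:
    """Assumed hinge form: sum_i max(0, sqrt(P_i^2 + Q_i^2) - S_max_i)."""
    return sum(max(0.0, math.hypot(pi, qi) - si) for pi, qi, si in zip(p, q, s_max))


def voltage_penalty(v: Sequence[float], v_min: float, v_max: float) -> float:
    """Assumed hinge form: how far each voltage lies outside [v_min, v_max]."""
    return sum(max(0.0, v_min - vi) + max(0.0, vi - v_max) for vi in v)


def vq_loss(q_pred, q_ref, p, s_max, v, alpha, beta, v_min=0.95, v_max=1.05) -> float:
    """Equation (10): tracking error plus weighted limit and voltage penalties."""
    tracking = sum((qp - qr) ** 2 for qp, qr in zip(q_pred, q_ref))
    return (tracking
            + alpha * limit_penalty(p, q_pred, s_max)
            + beta * voltage_penalty(v, v_min, v_max))
```

Here `v` stands for the node voltages that result from the predicted commands; how those voltages are obtained during training (for example from a surrogate model) is left open.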

(3) Transient Stability

To enhance the terminal system’s online assessment capability for disturbances in frequency, voltage, and rotor angle, this paper develops a lightweight stability classification neural network that retains the key node objects and state variables of the terminal grid. The student model takes disturbance feature sequences and the system’s initial state as inputs, extracts temporal features via 1D convolution, and outputs a stability score through a shallow fully connected network. During training, in addition to label-supervised learning (cross-entropy loss), a temperature-scaled distillation KL-divergence term is incorporated to fully capture the boundary information of the more complex edge-layer stability model.

L_{stab} = (1 - γ) \cdot L_{CE}(\hat{y}, y) + γ \cdot T^{2} \cdot KL(Soft_{T}(p_{teacher}) ∥ Soft_{T}(p_{student}))

(11)

where:

L_{CE}(\hat{y}, y): Standard classification loss between the student prediction \hat{y} and the true label y;

γ: Distillation balance factor, adjusting the weight between the two losses (range: 0–1);

T: Temperature parameter for softening logits in distillation;

KL(⋅): Kullback–Leibler divergence measuring distributional distance;

Soft_{T}(p): Softmax probability distribution under temperature T;

p_{teacher}: Logits output from the teacher model, i.e., the unnormalized prediction scores.

This final model provides fast, local assessment of grid disturbance stability, enabling rapid decision-making for emergency control at the terminal level. It allows the system to quickly judge whether the current disturbance will result in stable or unstable behavior in terms of frequency, voltage, and rotor angle dynamics.
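For a single sample, Equation (11) can be evaluated as in the sketch below. It assumes that the cross-entropy term uses the student's ordinary (T = 1) softmax output and that the label is given as a class index; both choices, and the function names, are illustrative assumptions rather than details stated in the paper.

```python
import math
from typing import List, Sequence


def softmax_t(logits: Sequence[float], T: float) -> List[float]:
    """Softmax under temperature T; the max-shift is for numerical stability."""
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def stability_loss(
    student_logits: Sequence[float],
    teacher_logits: Sequence[float],
    label: int,
    gamma: float,
    T: float,
) -> float:
    """Equation (11): (1 - gamma) * CE + gamma * T^2 * KL(teacher || student)."""
    # Hard-label term: cross-entropy on the student's T = 1 probabilities (assumed).
    ce = -math.log(softmax_t(student_logits, 1.0)[label])
    # Soft-label term: KL divergence between temperature-softened distributions.
    p_t = softmax_t(teacher_logits, T)
    p_s = softmax_t(student_logits, T)
    kl = sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s) if pt > 0.0)
    return (1.0 - gamma) * ce + gamma * T ** 2 * kl
```

Setting `gamma = 0` recovers plain supervised training, while `gamma = 1` trains the student on the teacher's softened outputs alone.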

5.3. Distillation Parameter T

In model distillation, the temperature parameter T plays a crucial role as a hyperparameter that controls the smoothness of the soft labels. It adjusts the softmax probability distribution output by the teacher model:

P_{i} = \frac{\exp (z_{i} / T)}{\sum_{j} \exp (z_{j} / T)}

(12)

When T = 1: The output is the standard softmax. For a confident teacher it is typically close to hard labels (close to 0 or 1), with information highly concentrated.

When T > 1: The output becomes smoother, revealing more inter-class structure. This allows the student model to learn the teacher's relative preferences among classes, not just its final decision.

When T → ∞: The output approaches a uniform distribution.

When T < 1: The output becomes sharply peaked, overly biased toward the maximum class with reduced training signal.
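The effect of T in Equation (12) can be checked numerically with a few lines of Python. The logits below are arbitrary illustrative values, not taken from the paper, and the function name is an assumption.

```python
import math
from typing import List, Sequence


def softmax_with_temperature(logits: Sequence[float], T: float) -> List[float]:
    """Equation (12): P_i = exp(z_i / T) / sum_j exp(z_j / T)."""
    if T <= 0:
        raise ValueError("T must be positive")
    m = max(logits)  # subtracting the max avoids overflow; it cancels in the ratio
    exps = [math.exp((z - m) / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


logits = [4.0, 2.0, 1.0]  # arbitrary illustrative teacher logits
table = {T: softmax_with_temperature(logits, T) for T in (0.5, 1.0, 5.0, 100.0)}
for T, probs in table.items():
    print(f"T={T}: " + ", ".join(f"{p:.3f}" for p in probs))
```

For these logits the top-class probability falls from roughly 0.98 at T = 0.5 to about 0.34 at T = 100, where the three classes are nearly equiprobable.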

In smart modeling for new power systems, the temperature T is not only a training parameter but also a core mechanism for balancing accuracy and generalization. For highly volatile wind and solar resources, smooth soft labels help the model better perceive boundary states and fuzzy classifications. For systems with multi-source coupling (e.g., combinations of generators, storage, and loads), a higher T reveals relative relationships among suboptimal outputs, helping the student model understand complex couplings. In distributed power structures (e.g., microgrids or industrial parks), tuning T allows the model to adapt to heterogeneous local topologies, enhancing transferability and generalization.

(1) Softmax with Temperature

The teacher model produces soft labels using temperature T, and the student model follows:

p_{i}^{(s)} = \frac{\exp (z_{i}^{(s)} / T)}{\sum_{j} \exp (z_{j}^{(s)} / T)}

(13)

(2) Distillation Loss (KL Divergence)

KL Divergence measures the difference between the teacher’s and student’s softened output distributions:

L_{KD} = T^{2} \cdot \sum_{i} p_{i}^{(t)} \log (\frac{p_{i}^{(t)}}{p_{i}^{(s)}})

(14)

(3) Cross-Entropy Loss

Standard classification loss between student predictions and true labels:

L_{CE} = - \sum_{i} y_{i} \log ({\hat{y}}_{i})

(15)

where

{\hat{y}}_{i}

is the predicted probability for class i from the student model.

(4) Distillation Temperature Regularization

To prevent T from drifting too far from its optimal range during training:

L_{reg} = γ \cdot {(T - T_{0})}^{2}

(16)

(5) Total Loss Function

Combining all components into the overall loss:

L_{total} = λ_{KD} \cdot L_{KD} + λ_{CE} \cdot L_{CE} + L_{reg}

(17)

Training Steps per Batch:

(1) Forward

Teacher model: z^{(t)}

Student model: z^{(s)}

Compute softened outputs:

p^{(t)} = Softmax(\frac{z^{(t)}}{T}), p^{(s)} = Softmax(\frac{z^{(s)}}{T})

(18)

(2) Compute Loss L_{total} using Equation (17)

(3) Backpropagation and update model parameters:

θ_{s} \leftarrow θ_{s} - η_{1} \cdot \frac{\partial L_{total}}{\partial θ_{s}}

(19)

(4) Update T Parameter:

T \leftarrow T - η_{2} \cdot \frac{\partial L_{total}}{\partial T}

(20)

where η_{1} and η_{2} are the learning rates for the model weights and T, respectively.
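The per-batch procedure in Equations (14)–(20) can be sketched on a toy problem. To stay dependency-free, the example below treats the student logits themselves as the parameters θ_s and uses central finite differences in place of backpropagation. The default weights, the lower bound `t_min` that keeps T positive, and all names are assumptions for illustration, not the authors' settings.

```python
import math
from typing import Callable, List, Sequence, Tuple


def softmax_t(logits: Sequence[float], T: float) -> List[float]:
    """Temperature softmax; the max-shift is only for numerical stability."""
    m = max(logits)
    exps = [math.exp((z - m) / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def total_loss(theta, T, teacher_logits, one_hot,
               T0=5.0, gamma=0.01, lam_kd=0.5, lam_ce=0.5) -> float:
    """Equation (17) assembled from Equations (14), (15), and (16)."""
    p_t = softmax_t(teacher_logits, T)
    p_s = softmax_t(theta, T)
    l_kd = T ** 2 * sum(pt * math.log(pt / ps) for pt, ps in zip(p_t, p_s) if pt > 0.0)
    y_hat = softmax_t(theta, 1.0)
    l_ce = -sum(y * math.log(yh) for y, yh in zip(one_hot, y_hat))
    l_reg = gamma * (T - T0) ** 2
    return lam_kd * l_kd + lam_ce * l_ce + l_reg


def central_diff(f: Callable[[float], float], x: float, eps: float = 1e-6) -> float:
    """Numerical derivative, standing in for backpropagation in this toy."""
    return (f(x + eps) - f(x - eps)) / (2.0 * eps)


def train_step(theta, T, teacher_logits, one_hot,
               eta1=0.1, eta2=0.01, t_min=1.0) -> Tuple[List[float], float]:
    """One batch: gradients from the same forward pass, then Eqs. (19) and (20)."""
    def loss_wrt_theta_k(k: int) -> Callable[[float], float]:
        def f(v: float) -> float:
            trial = list(theta)
            trial[k] = v
            return total_loss(trial, T, teacher_logits, one_hot)
        return f

    grad_theta = [central_diff(loss_wrt_theta_k(k), theta[k]) for k in range(len(theta))]
    grad_T = central_diff(lambda t: total_loss(theta, t, teacher_logits, one_hot), T)

    new_theta = [th - eta1 * g for th, g in zip(theta, grad_theta)]  # Equation (19)
    new_T = max(t_min, T - eta2 * grad_T)  # Equation (20), clamped (assumption)
    return new_theta, new_T


theta0 = [0.0, 0.0]   # toy student logits used directly as parameters
teacher = [3.0, -1.0]  # toy teacher logits
label = [1.0, 0.0]     # one-hot true label
theta1, T1 = train_step(theta0, 5.0, teacher, label)
```

In a real model the gradients would come from automatic differentiation, and the clamp on T is only one possible safeguard; the regularization term in Equation (16) already pulls T toward T_0.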

A proper choice of the distillation temperature T can significantly enhance the terminal model’s adaptability to dynamic disturbances, boundary operating conditions, and cross-regional applications, and is therefore crucial for enabling the practical deployment of AI large models in new-type power systems. The parameter T depends on many factors. In practice, an appropriate value of T can be determined through a certain amount of pre-training, and then further adjusted via additional training using task-specific data, before being deployed in the lightweight terminal model.

In principle, to cover all possible dynamic variations in the terminal grid, T should be trained on the complete set of dynamic data according to Equations (13)–(20). However, it is difficult to acquire all such data in real applications. In particular, Equation (20) updates T using the gradient of the total loss, which is built from the loss terms between the teacher-model and student-model outputs over the training set, so the resulting T depends on how well that set covers the operating conditions. Therefore, in practice, T may be determined using limited routine operating data via pre-training to obtain a locally suitable value, and then adjusted as needed. For example, a commonly used value is T = 5. In this paper, we further evaluate T over the range 1–10 and select different T values according to the corresponding results so as to match different operating conditions. A lightweight model trained with a well-chosen T can balance speed and accuracy, making it suitable for deployment in edge/local grids that require fast response.

6. Case Study

6.1. Simulation Case Scenarios and Data Generation

In this paper, a regional power grid is used as the foundation, with the various object models and topological structure of the grid taken as references. The IEEE 33-bus distribution system, the MG-14 microgrid system, and the local grid of an industrial park are selected as three independent terminals, which are connected into a unified grid to form test cases, on the basis of which the cloud-edge-terminal model framework is constructed.

On the cloud side, unified construction and training of the various object models in the grid are carried out. Based on the cloud models, the edge side organizes the modeling of typical problems such as optimal power flow, reactive power/voltage optimization, and stability analysis. Finally, these typical problem models are distilled into lightweight models and deployed in the three local terminal grids. The three simulation scenarios are summarized in Table 1. Scenario S1 corresponds to the IEEE 33-bus distribution system, where a 3-layer MLP model is used to implement terminal applications for power flow prediction and stability analysis. Scenario S2 corresponds to the MG-14 microgrid system, where an MLP combined with a constraint module is used to implement terminal applications for reactive power optimization and stability. Scenario S3 corresponds to the industrial park system, where a CNN+LSTM model supports terminal applications for optimal power flow and stability analysis.

In the simulation studies, large-scale multi-scenario samples are first generated using Latin hypercube sampling (LHS) based on the probability distributions of wind/PV output and load demand, yielding sufficiently rich joint time-series data for renewables and loads [21]. This is then fused with historical grid operation data (such as power flows, transmission corridor constraints, and equipment status) to form input datasets for power flow, reactive power optimization, and stability analysis. For the energy storage component, SOC-based sliding scheduling logic and disturbance discharge commands are defined to simulate multi-strategy response processes, thereby characterizing the charging and discharging behavior of storage under different scenarios. Grid topology disturbances are constructed via corridor reconfiguration, islanding switching, generator tripping, load transfer, and other operations to emulate changes in operating modes and the propagation of contingency chains. Stability events are created by artificially injecting short-circuit faults, power steps, and load steps, producing frequency and voltage dynamic response data. In this way, a joint dataset combining “multi-scenario steady-state + dynamic events” is ultimately formed, which is used to train the cloud-edge-terminal models and to carry out case validation.
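The Latin hypercube sampling step can be sketched with the standard library alone. The version below draws stratified samples on the unit cube and scales them to hypothetical ranges for wind, PV, and load. The ranges, the uniform scaling, and the function names are assumptions; the paper samples from the actual probability distributions, which would replace the `scale` step with each variable's inverse CDF.

```python
import random
from typing import List


def latin_hypercube(n_samples: int, n_dims: int, rng: random.Random) -> List[List[float]]:
    """Return n_samples points in [0, 1)^n_dims such that, in every dimension,
    each of the n_samples equal-width strata contains exactly one point."""
    columns = []
    for _ in range(n_dims):
        # One uniform draw inside each stratum, then shuffle the stratum order.
        col = [(k + rng.random()) / n_samples for k in range(n_samples)]
        rng.shuffle(col)
        columns.append(col)
    return [[columns[d][i] for d in range(n_dims)] for i in range(n_samples)]


def scale(u: float, low: float, high: float) -> float:
    """Map a unit-interval sample to [low, high); placeholder for an inverse CDF."""
    return low + u * (high - low)


rng = random.Random(0)
unit = latin_hypercube(8, 3, rng)
# Columns: wind output, PV output, load demand (hypothetical per-unit ranges).
scenarios = [
    [scale(u[0], 0.0, 1.0), scale(u[1], 0.0, 1.0), scale(u[2], 0.6, 1.2)]
    for u in unit
]
```

Compared with plain random sampling, this guarantees that every stratum of each variable is covered once, which is the property that makes LHS attractive for building multi-scenario datasets from a limited number of samples.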

6.2. Development of the Cloud-Edge-Terminal Model Architecture in the Simulation System

As shown in Figure 2, the cloud layer acts as the global modeling hub of the power system, integrating real-time operational data, historical logs, and equipment status information from across the grid. Using deep modeling approaches such as GNNs and Transformers, it builds generalized model structures and parameter templates for a wide range of objects, including generators, energy storage units, PV systems, loads, and induction motors. Its core outputs include unified object models, trained parameter sets, and abstract graph representations of the grid structure, which serve as the foundation for the edge layer to construct typical problem models such as power flow and stability analysis.

The edge layer constructs models for typical problems not directly from raw data, but on the basis of the device models and parameters provided by the cloud. It builds problem-oriented model structures and dynamically generates models for specific tasks such as optimal power flow (OPF) and dynamic stability. In addition to performing global optimization, dispatch, and stability assessment for these typical problems, the edge layer also produces soft-label data and intermediate feature representations to guide the distillation training of lightweight terminal models.

Terminal models are deployed in local grids and focus on fast-response application scenarios such as real-time power flow prediction, reactive power optimization, and stability analysis. Based on the typical problem models at the edge and combined with local real-time operating data, the terminal performs parameter fine-tuning or rapid distillation to obtain lightweight neural networks (e.g., MLPs, CNNs). These terminal models preserve local regional characteristics while partially inheriting the structural knowledge of the cloud and edge models, thereby achieving efficient, high-accuracy, and locally aware control and decision-making capabilities. Terminal models are the final executors in the cloud-edge-terminal intelligent collaborative framework.

6.3. Computational Results and Analysis

Based on the cloud-edge-terminal framework, the final terminal models generated from the cloud and edge are used to carry out simulation calculations for three types of tasks—power flow prediction, reactive power optimization, and transient stability classification—in the three terminal grids, respectively. The results are then compared with those of traditional centralized models, as shown in Table 2.

In the power flow prediction task, the global generation–grid–load–storage model trained in the cloud achieves high accuracy, but its size is too large to be directly deployed at the terminal. Based on the edge-layer power flow model, a lightweight terminal model is obtained through data training and subsequent distillation, which yields a low error in terms of the MSE metric. As shown in Table 2, taking IEEE 33 as an example, the MSE of the traditional model for power flow prediction is 0.0221. As illustrated in Figure 3, for different values of T, the MSE varies across terminal models; for the distilled terminal model, when T = 5, the MSE is 0.0245, an increase of only about 10.9%. At the same time, the model size is reduced by 83% and the inference time is shortened by about 132 ms, significantly improving the real-time power flow response performance of the local grid.
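The relative MSE increase quoted above follows directly from the two values in the text, as this short check shows (variable names are illustrative):

```python
mse_traditional = 0.0221  # traditional model, IEEE 33 power flow (from the text)
mse_distilled = 0.0245    # distilled terminal model at T = 5 (from the text)

relative_increase = (mse_distilled - mse_traditional) / mse_traditional
print(f"{relative_increase:.1%}")  # prints 10.9%
```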

In the reactive power optimization task, the traditional model requires a relatively long time per iteration and cannot satisfy the frequent control demands of various industrial processes in an industrial park. As shown in Figure 4, after distillation at the terminal, the edge-layer GNN-based reactive power optimization model can quickly output node-level reactive power adjustment strategies, with an average control error of less than 3% and a response delay kept within 30 ms. Therefore, when the load in the industrial park changes frequently, the lightweight terminal model provides rapid control and better meets the optimized power-use requirements of industrial production.

If only minor changes occur in the nodes or branches of a terminal grid, the existing lightweight terminal model can be adapted through parameter fine-tuning. For example, in the IEEE 33-bus system, when a PV generation unit with a capacity equal to 40% of the bus capacity is added at one node while the overall grid topology remains unchanged, the input power at that node becomes stochastically varying, which affects power-flow prediction to some extent. This change can be accommodated by fine-tuning the model parameters. As shown in Table 2, the response time increases by 17 ms, while the model size remains unchanged and the accuracy is almost unaffected.

For the transient stability classification task, various model objects that have been uniformly trained in the cloud are combined with local microgrid operating characteristics to construct the edge-layer model, and a compact and effective terminal stability classifier is obtained via distillation. In the MG-14-bus microgrid, this model achieves a rapid stability classification accuracy of 94.7% for three types of disturbances (voltage sag, load disturbance, and source tripping), which is about 7.2% higher than that of traditional methods.

The advantages of the cloud-edge-terminal architecture are reflected not only in the flexibility of model deployment and the improvement in inference efficiency, but more importantly in its ability to rapidly adapt to the operating conditions of terminal grids. Global modeling on the cloud ensures consistency of model structures and forward scalability; the edge layer can quickly construct models for typical power system problems; and on the terminal side, lightweight predictors or classifiers are formed through model distillation, meeting diverse analytical requirements for power grids with different structures and target tasks.

6.4. Analysis of the Distillation Parameter T

As the final executor in the cloud-edge-terminal architecture, the terminal model depends on its distillation parameter T, which directly affects the tradeoff between result accuracy and response speed. In practical settings, when the output varies smoothly, T can be chosen from [1, 2]; for small variations, from [2, 4]; and for larger variations, from [4, 10]. In our case studies, power changes in power-flow results can sometimes be large; in reactive power optimization, voltage variations are relatively small but reactive power can vary substantially; and stability outcomes are even more irregular. Therefore, we analyze a relatively wide range of T ∈ [1, 10]. In what follows, different values of T are selected for the three terminal grid application scenarios to perform calculations and comparisons. Compared with Figure 3, more samples are used here, with a more uniform value distribution and wider coverage, which yields more accurate MSE estimates.
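The range guideline above can be captured as a small lookup that produces candidate T values for a sweep. The category labels (`"smooth"`, `"small"`, `"large"`), the step size, and the function name are assumptions introduced for illustration.

```python
from typing import Dict, List, Tuple

# Ranges quoted in the text, keyed by hypothetical labels for output variability.
T_RANGES: Dict[str, Tuple[float, float]] = {
    "smooth": (1.0, 2.0),
    "small": (2.0, 4.0),
    "large": (4.0, 10.0),
}


def candidate_temperatures(variation: str, step: float = 1.0) -> List[float]:
    """Evenly spaced T candidates covering the range for the given variability."""
    low, high = T_RANGES[variation]
    n = int(round((high - low) / step))
    return [low + k * step for k in range(n + 1)]
```

For example, `candidate_temperatures("large")` lists the integer temperatures from 4 to 10, which can then be evaluated one by one on held-out data.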

(1) IEEE 33-Bus System

After distilling lightweight terminal models from the cloud and edge, power flow prediction and stability analysis are carried out. The scenarios and task settings are shown in Table 3.

When the distillation temperature parameter T = 1, 2, 5, 10, each task is trained with 100,000 disturbance samples (covering both power flow and stability tasks). The results are as follows:

As shown in Figure 5, the voltage prediction MSE is minimized at T = 10, but the MSE values at the six points within the range T = 5–10 differ only slightly. The current prediction MSE reaches its minimum at T = 8, yet the MSE values across T = 5–10 are also very close, and the three points at T = 8–10 are almost identical. Therefore, in this case, for convenience, T = 5 can be uniformly adopted in practical computations.

As shown in Table 4 and Table 5, when the distillation temperature T = 5, the terminal model achieves the best performance on both the power flow prediction and stability classification tasks. Not only is the power flow prediction error significantly reduced, but the stability classification accuracy is also greatly improved. A moderate T value strikes a good balance between imitating the structural complexity and decision behavior of the cloud–edge teacher models and maintaining efficiency, thereby enhancing the model’s capability to identify disturbances and atypical operating states.

(2) Analysis of Terminal Models in the MG-14 Node Microgrid

The microgrid includes a high penetration of renewable energy (PV + wind), energy storage systems, and hybrid loads. The control objective is to predict the optimal SVG and SVC output strategies to maintain voltage compliance and minimize reactive power losses.

The teacher model for power flow is a jointly optimized cloud-edge OPF model, while the student model is a lightweight DNN regression model trained via knowledge distillation. For the stability classification task, the goal is to determine whether the system can maintain dynamic voltage stability after a disturbance. The teacher model consists of dynamic stability simulation combined with a rule-based boundary model, and the student model is a lightweight MLP-based stability classifier trained with distillation.

For distillation parameters T = 1, 2, 5, 10, each task is trained with 50,000 disturbance samples and tested with 10,000 samples.

According to Figure 6, the voltage-deviation MSE reaches its minimum at T = 6, but the MSE values at the six points within the range T = 5–10 differ only slightly. The reactive control error MSE is minimized at T = 7, yet the MSE values at the three points within the range T = 5–7 are also very close. Therefore, in this case, T = 5 can be uniformly adopted in practice for computation.

As shown in Table 6 and Table 7, when T = 5, the terminal model also achieves optimal performance in terms of reactive power control error and stability classification accuracy. It significantly reduces voltage deviation and reactive power control error, better realizing nonlinear control strategies and enhancing stability pre-assessment capability. An appropriately chosen distillation temperature helps to keep the terminal model lightweight while strengthening its ability to predict voltage and stability states under ambiguous conditions, thus meeting the fast response requirements for microgrid terminal deployment.

(3) Industrial Park System

In industrial parks, there are many impact loads with strong nonlinear fluctuations. The power supply typically adopts a hybrid configuration of PV + energy storage + utility grid, supporting both islanded operation and grid-connected modes. The control objectives are high-precision power flow prediction and fast stability identification. To avoid affecting actual industrial production, voltage, power, and fault control are required to respond in real time with very low latency.

For the terminal OPF prediction model, the inputs include current load, power source status, and power boundaries, while the output is the optimal power flow distribution. The teacher model is an OPF solver based on global constraints, and the student model is a lightweight DNN-based regression model.

For stability classification, the inputs are the system state before a disturbance and the disturbance type, while the output is a binary classification of system stability. The teacher model combines cloud-edge stability criteria, and the student model is a lightweight MLP classifier trained via distillation.

Distillation parameters were set as T = 1, 2, 5, 10. Each task used 100,000 disturbance samples for training and 20,000 samples for testing.

According to the results in Figure 7, the voltage-error MSE is minimized at T = 5, whereas the power-error MSE is minimized at T = 10. However, the power-error MSE values at the six points within the range T = 5–10 do not differ significantly. Therefore, in this case, T = 5 can be uniformly adopted in practice for computation.

As shown in Table 8 and Table 9, when the distillation temperature T = 5, the terminal model achieves a significant reduction in voltage and reactive power errors in the power flow prediction task. In the stability classification task, the accuracy improves to 95.9%, with an F1-score of 0.94, clearly outperforming locally trained standalone models. These results show that a moderate distillation parameter helps simplify the terminal model’s decision strategy under complex boundary conditions, enhancing its ability to handle multiple types of disturbances and atypical states in the industrial park power grid. At the same time, the model inference time remains around 18 ms, meeting the requirements for online rapid deployment in industrial power control scenarios within the park.

The distillation parameter is not just a training hyperparameter, but a key tuning factor for enhancing the practicality and deployability of terminal models in new power systems. Based on the results obtained for different T values, T = 5 generally yields the best performance, or performance close to the best, so the case studies in this paper adopt a unified setting of T = 5 for convenience. In practice, for a terminal grid, it is often only feasible to pre-train on limited operational data to obtain a value of T that suits the current operating conditions; to accommodate other operating scenarios, T needs to be adjusted accordingly, which is why T is evaluated here over the range 1–10. With a well-chosen T, the lightweight model can achieve a good balance between speed and accuracy, making it suitable for deployment in edge/local grids that require fast response.
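One possible way to formalize the "best or close to the best" reasoning is to pick the smallest T whose error lies within a tolerance of the minimum. The rule, the 5% tolerance, and the function name below are assumptions rather than the authors' procedure, and the MSE values are synthetic placeholders, not results from the paper.

```python
from typing import Dict


def pick_temperature(mse_by_t: Dict[float, float], rel_tol: float = 0.05) -> float:
    """Return the smallest T whose MSE is within rel_tol of the best MSE."""
    if not mse_by_t:
        raise ValueError("no candidate temperatures supplied")
    best = min(mse_by_t.values())
    acceptable = [t for t, mse in mse_by_t.items() if mse <= best * (1.0 + rel_tol)]
    return min(acceptable)


# Synthetic MSE values for illustration only; NOT results from the paper.
synthetic = {1.0: 0.040, 2.0: 0.032, 5.0: 0.0205, 8.0: 0.0200, 10.0: 0.0202}
chosen = pick_temperature(synthetic)
```

With these placeholder numbers the minimum is at T = 8, but T = 5 is within tolerance and is returned, mirroring the kind of choice described for Figures 5–7.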

7. Conclusions

Traditional power system models are typically developed separately for each canonical problem (such as power flow calculation and stability analysis), and thus lack global awareness of the overall coupling characteristics of the system. As a result, they struggle to meet the demand for fast analysis under the conditions of new power systems, which are characterized by multi-source, stochastic, heterogeneous data and highly complex, variable operating modes. In view of these challenges, this paper carries out an in-depth study and presents the following contributions:

(1) A cloud-edge-terminal large-model architecture for power systems is proposed to achieve unified global perception. On the cloud side, unified modeling and training are performed for various objects in generation, grid, load, and storage. On this basis, the edge side can uniformly deploy typical problem models, while at the terminal side, lightweight models are generated online via distillation to meet the requirements of fast and accurate computation in local grids. Taking typical problems such as power flow analysis, stability assessment, and reactive power optimization as examples, the paper investigates cloud-edge-terminal collaborative modeling methods for power systems.

(2) To address the need for fast and accurate modeling of local grids under the globally large-scale, high-dimensional, and dynamically stochastic structure of real-world power systems, a training method for setting the distillation parameter T is proposed. This method can effectively improve the accuracy and generalization capability of terminal grid models and exhibits strong adaptability across different application scenarios, making it a key means for enabling the practical deployment of intelligent terminal modeling in new power systems.

The cloud-edge-terminal model architecture for new power systems not only provides global perception, but also organizes edge-layer models and distills local models online, thereby meeting the requirements for fast and accurate local computation. It represents an important direction for the development of intelligent modeling in power systems and serves as a key enabling technology for the analysis of future power grids with a high penetration of renewable energy.

Author Contributions

Conceptualization, H.F.; methodology, W.L. and H.F.; simulation, Z.F.; validation, H.F.; formal analysis, Z.F.; investigation, Z.F. and H.F.; resources, H.F.; data curation, W.L.; writing—original draft preparation, H.F.; editing, W.L.; visualization, Z.F.; supervision, Z.F. and H.F.; project administration, H.F.; funding acquisition, H.F. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The original contributions presented in this study are included in the article material. Further inquiries can be directed to the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Figure 1. Cloud-Edge-Terminal Architecture.

Figure 2. Cloud-Edge-Terminal Model Architecture of the Simulation System.

Figure 3. MSE Comparison in Power System Tasks.

Figure 4. Response Time Comparison.

Figure 5. MSE vs. Distillation T in Power Flow in IEEE 33-Bus System.

Figure 6. MSE vs. Distillation T in Reactive Power Optimization in the MG-14 Node Microgrid.

Figure 7. MSE vs. Distillation T in Power Flow in Industrial Park System.

Table 1. Simulation Case Scenarios.

ID	Name	Number of Buses	Description
S1	IEEE 33-Bus Distribution System	33	Typical public grid topology with main and branch lines, includes distributed PV and EV loads
S2	MG-14 Microgrid System	14	Autonomously controlled local system with energy storage, PV, micro gas turbines, supporting grid-connected/islanded modes
S3	Typical Industrial Park System	18	Multi-load scenario with high-voltage inverters, induction furnaces, and large motors; characterized by large load fluctuations and fast response

Table 2. Computational Results Comparison.

Task Type	Test System	Model Type	MSE/Error	Response Time (ms)	Model Size (MB)	Accuracy (%)
Power Flow Forecasting	IEEE 33-Bus System	Traditional Centralized Model	0.0221	200	45	96.8
Power Flow Forecasting	IEEE 33-Bus System	Distilled Terminal Model	0.0239	28	7.4	95.1
Power Flow Forecasting	IEEE 33-Bus System	Fine-Tuning Model	0.0240	45	7.4	94.8
Reactive Power Optimization	Industrial Park System	OPF Iterative Method	-	340	21	92.1
Reactive Power Optimization	Industrial Park System	Distilled Edge Model	Control error: 3%	30	8.1	90.3
Reactive Power Optimization	Industrial Park System	Fine-Tuning Model	Control error: 3%	37	8.1	90.1
Stability Assessment	MG-14 Microgrid	SVM/Traditional ANN	-	430	23	87.5
Stability Assessment	MG-14 Microgrid	Distilled Terminal Model	-	40	5.2	94.7
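The response-time and model-size columns of Table 2 are easier to compare as ratios of the baseline to the distilled model. The following is a minimal sketch of that arithmetic; the numbers are copied from Table 2, while the names TABLE_2 and improvement_ratios are illustrative and not part of the study.

```python
# Values copied from Table 2:
# (task, baseline response ms, distilled response ms, baseline size MB, distilled size MB)
TABLE_2 = [
    ("Power Flow Forecasting", 200, 28, 45, 7.4),
    ("Reactive Power Optimization", 340, 30, 21, 8.1),
    ("Stability Assessment", 430, 40, 23, 5.2),
]


def improvement_ratios(rows):
    """Return {task: (speedup, size_reduction)} as baseline / distilled ratios."""
    return {
        task: (base_ms / dist_ms, base_mb / dist_mb)
        for task, base_ms, dist_ms, base_mb, dist_mb in rows
    }


ratios = improvement_ratios(TABLE_2)
for task, (speedup, shrink) in ratios.items():
    print(f"{task}: {speedup:.1f}x faster, {shrink:.1f}x smaller")
```

Under these figures the distilled models respond roughly 7 to 11 times faster than the baselines while being roughly 2.6 to 6 times smaller.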

Table 3. Computational Tasks and Descriptions for the IEEE 33-Bus System.

Item	Description
Power Flow Forecasting	Input: Node loads and generation states; Output: Predicted node voltages, currents, active/reactive power flows
Stability Classification	Determine whether the system remains voltage-stable after disturbances (e.g., load steps, distributed resource switching)
System Model	IEEE 33-bus distribution network (with distributed energy, storage, and typical load variations)
Teacher Models	Edge-layer high-accuracy models: OPF (power flow) + dynamic simulation (stability)
Student Models	LSTM (for power flow forecasting) + DNN (binary classifier: stable/unstable)
Distillation Method	Soft-label distillation with varying T to generate student training samples
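Table 3 names soft-label distillation with a varying parameter T. As a hedged illustration of what T does for the stable/unstable classifier, the sketch below assumes the common temperature-scaled softmax and a KL-divergence loss between teacher and student soft labels; the exact loss used in the study is not given here, and the function names and logit values are hypothetical.

```python
import math


def log_softmax_T(logits, T=1.0):
    """Log of softmax(logits / T), computed stably by subtracting the maximum."""
    if T <= 0:
        raise ValueError("T must be positive")
    scaled = [z / T for z in logits]
    m = max(scaled)
    log_total = math.log(sum(math.exp(s - m) for s in scaled))
    return [s - m - log_total for s in scaled]


def soft_labels(logits, T=1.0):
    """Soft labels at temperature T; a larger T gives a flatter distribution."""
    return [math.exp(v) for v in log_softmax_T(logits, T)]


def distillation_kl(teacher_logits, student_logits, T=1.0):
    """KL(teacher || student) between soft labels at temperature T (assumed loss)."""
    log_t = log_softmax_T(teacher_logits, T)
    log_s = log_softmax_T(student_logits, T)
    return sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))


# Hypothetical teacher scores for the classes (stable, unstable).
teacher = [2.0, 0.0]
student = [1.0, 0.5]

for T in (1, 2, 5, 10):
    p_stable = soft_labels(teacher, T)[0]
    loss = distillation_kl(teacher, student, T)
    print(f"T={T}: P(stable)={p_stable:.3f}, KL={loss:.4f}")
```

With these assumed logits, raising T moves the teacher's soft label for the stable class from about 0.88 at T = 1 toward 0.5, so the student sees more of the relative weight the teacher puts on the less likely class.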

Table 4. Power Flow Prediction Accuracy (MSE) in IEEE 33-Bus System.

Distillation Parameter T	Voltage Prediction MSE	Current Prediction MSE	Avg. Convergence Epochs
1	0.0089	0.0112	38
2	0.0061	0.0094	33
5	0.0042	0.0069	28
10	0.0041	0.0066	32
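One way to read Table 4 is to look at the relative MSE reduction between consecutive values of T. The sketch below does this with the values copied from Table 4; the helper relative_gains is an illustrative name and the analysis is only a reading aid, not the study's selection procedure.

```python
# Values copied from Table 4: T -> (voltage MSE, current MSE, avg. convergence epochs)
TABLE_4 = {
    1: (0.0089, 0.0112, 38),
    2: (0.0061, 0.0094, 33),
    5: (0.0042, 0.0069, 28),
    10: (0.0041, 0.0066, 32),
}


def relative_gains(table, column):
    """Relative reduction of one column between consecutive T values."""
    ts = sorted(table)
    gains = {}
    for prev, cur in zip(ts, ts[1:]):
        before, after = table[prev][column], table[cur][column]
        gains[(prev, cur)] = (before - after) / before
    return gains


voltage_gains = relative_gains(TABLE_4, 0)
for (prev, cur), gain in voltage_gains.items():
    print(f"T {prev} -> {cur}: voltage MSE reduced by {gain:.1%}")

# T with the fewest average convergence epochs.
fastest_T = min(TABLE_4, key=lambda t: TABLE_4[t][2])
print("Fewest convergence epochs at T =", fastest_T)
```

On these numbers the voltage MSE drops by about 31% at each of the first two steps but by only about 2% from T = 5 to T = 10, while convergence is fastest at T = 5.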

Table 5. Stability Classification Accuracy in IEEE 33-Bus System.

Distillation Parameter T	Accuracy	F1 Score	False Alarm Rate	Miss Rate	Convergence Epochs
1	91.0%	0.89	6.4%	2.6%	40
2	94.3%	0.92	3.5%	2.2%	35
5	96.8%	0.95	1.8%	1.4%	30
10	97.5%	0.91	2.9%	3.6%	34

Table 6. Reactive Power Optimization Regression Errors (MSE) in the MG-14 Node Microgrid.

Distillation Parameter T	Voltage Deviation MSE	Reactive Control Error MSE	Convergence Epochs
1	0.0092	0.0115	42
2	0.0068	0.0089	35
5	0.0046	0.0063	29
10	0.0045	0.0075	32

Table 7. Stability Classification Accuracy in the MG-14 Node Microgrid.

Distillation Parameter T	Accuracy	F1 Score	False Alarm Rate	Miss Rate
1	90.2%	0.88	7.3%	2.5%
2	92.7%	0.91	4.8%	2.1%
5	95.5%	0.94	2.1%	1.6%
10	93.2%	0.91	3.0%	3.4%

Table 8. OPF Prediction Model Accuracy (MSE) in Industrial Park System.

Distillation Parameter T	Voltage Error (MSE)	Power Distribution Error (MSE)	Inference Time (ms)
1	0.0071	0.0094	18
2	0.0052	0.0071	18
5	0.0037	0.0049	18
10	0.0036	0.0057	18

Table 9. Stability Classification Model Accuracy in Industrial Park System.

Distillation Parameter T	Accuracy	F1 Score	False Positive Rate	False Negative Rate
1	90.5%	0.88	6.5%	3.0%
2	92.8%	0.91	4.2%	2.6%
5	95.9%	0.94	2.0%	2.1%
10	96.1%	0.91	3.1%	3.8%

Share and Cite

MDPI and ACS Style

Fang, H.; Feng, Z.; Li, W. A Study on the Cloud-Edge-Terminal Framework for Large Computing Models in New Power Systems. Energies 2026, 19, 1501. https://doi.org/10.3390/en19061501