Research on Edge-Computing-Based High Concurrency and Availability “Cloud, Edge, and End Collaboration” Substation Operation Support System and Applications

Abstract: With the continuing digital transformation of power substations, the diversification of application scenarios, and the scaling-up of pilot construction, the real-time, concurrency, and security requirements for data fusion and application support across the power monitoring system, the management information system, and other business platforms keep rising. This paper proposes a high-concurrency, high-availability “cloud, edge, and end collaboration” architecture based on edge computing for substation operation support systems. First, this paper summarizes the development status of domestic substation operation support systems and analyzes the advantages and disadvantages of the various technical architectures. Then, a “cloud, edge, and end collaboration” substation operation support system architecture with “high real-time performance, high concurrency, high security, and high stability” is proposed, which focuses on remote inspection, remote operation, and remote safety control of substation businesses from the perspective of engineering applications. The architecture realizes transparent monitoring of equipment operation, unified management of operation data, and integration of production command and decision-making. It addresses the scattered coexistence of multiple systems for dispatching, monitoring, analysis, management, and other businesses, the need to switch between those systems, and their insufficient real-time performance and stability. It also helps to control grid risks, reduce potential safety hazards, and ease the tension between the continuous growth of grid equipment and the shortage of production personnel. The results of engineering application examples show that the proposed architecture has clear advantages over existing system architectures, can meet the requirements of large-scale substation access, and is feasible for wider adoption.


Introduction
With the deep integration of substation field-end control technology (OT) and information technology (IT), the digital transformation of substations continues to advance, and the degree to which equipment can be sensed, understood, and controlled continues to improve. Pilot applications of substation monitoring systems at different scales have appeared in different parts of the power grid, and a variety of technological routes and system architectures have been derived. However, because of limitations in the concurrency, security, and stability of these system architectures, the digital transformation of substations is still far from large-scale rollout [1].
China's preliminary research on power monitoring systems for the substation domain and intelligent substation transformation can be roughly divided into three stages. Stage 1 involves intelligent substation technology that uses RTU technology to collect station signals through secondary cables, thus realizing the "four-remote" function in a single substation. However, it is generally used in substation automation systems (SAS) at sites that do not require on-site staffing, and old RTU systems often cannot meet the requirements of long-term stable operation, reconstruction and expansion, access to new signals, and substation automation monitoring applications, so they need to be renovated [2,3]. Stage 2 comprises the application of centralized control technology, which is often used in the control of active distribution networks because of its high accuracy and which effectively improves the performance of the distribution monitoring system (DMS) in the region. However, centralized control has a slow response time, which is not conducive to real-time scheduling, and the system depends heavily on the central controller, which makes it vulnerable to a single point of failure and gives it poor reliability [4]. In Stage 3, with the popularization and application of the IEC 61850 specification, the power monitoring system is able to realize hierarchical distributed control [5]. Hierarchical distributed control can overcome the shortcomings of traditional centralized control mentioned above, and distributed control can realize cooperative control of each distributed substation with a good control effect; it is currently a research hot spot in the field of power transformation [6,7].
Although substation monitoring systems (SMSs) have gone through different stages of transformation, current substation monitoring and control management usually still requires people to carry out equipment inspection, monitoring, and maintenance [8]. As the automation level of equipment in the existing power system rises, different equipment requires different operation and management methods, which places higher demands on the skills of management personnel and on the effectiveness of management [9,10]. This approach has obvious defects. Firstly, labor costs are high: substation equipment needs to operate for long periods of time, requiring extensive manpower for regular inspection and maintenance. Additionally, challenging and risky environments, such as working at great heights, further increase the difficulty and risks for maintenance personnel. Secondly, operation and maintenance is difficult [11]. To solve these problems, advanced, reliable, integrated, low-carbon, and environmentally friendly intelligent equipment is being built into substations with the goal of improving the overall operation of the system on a network-wide basis, so that monitoring equipment becomes intelligent and fault information can be analyzed comprehensively [12].
Although the intelligent transformation of substations has been implemented, there are still some challenges in the operation and maintenance of power grid substations in China: (1) The system is complex. There are a large number of devices in the substation operation and maintenance system, and different devices differ significantly in functionality and performance, which increases the difficulty of maintenance management and the occurrence rate of faults. (2) Maintenance is difficult. Substation equipment is widely distributed, making centralized management difficult.
Therefore, it is urgent to optimize the traditional substation monitoring mode and promote the application of artificial intelligence terminals in safety monitoring at the operation and maintenance site, improve monitoring efficiency, and solve the problem of requiring a large amount of manpower and resources for safety monitoring at the operation and maintenance site [13].
Based on this, power monitoring and management systems based on cloud computing architecture have emerged [14]. Such systems can upload real-time data from power system operation and maintenance sites to the cloud platform for centralized processing, and then provide timely feedback on the processing results to management personnel through the platform. In the field of power system substations, the combination of the Grid Information Model (GIM) and Geographic Information System (GIS) is widely used for data computation, with cloud computing being the predominant approach. Cloud computing refers to the formation of a highly capable system through computer networks, which can store and aggregate relevant resources and be configured on demand to provide personalized services to users. With the continuous development of cloud computing technology, it is gradually being applied in power monitoring systems [15].
Ref. [16] proposed a power video surveillance platform based on cloud computing technology, providing a detailed analysis of its structural design and the application of its functionalities in the system, and verified the accuracy and efficiency of the platform compared to traditional surveillance systems. Ref. [17] builds upon digital image-processing methods for power video and addresses the challenges of a large number of devices and wide geographical coverage in power video surveillance systems. It proposes a new technology architecture for intelligent fault and status detection in power video based on cloud computing, effectively utilizing existing hardware resources to achieve network-wide video device fault diagnosis and status querying. This architecture provides reliable support for power grid operation and maintenance management. Ref. [18] presents a new web-enabled near-real-time power quality (PQ) monitoring system, I-Grid™, which was developed in the U.S. Its use of an internal modem to send event data over the internet to the I-Grid database server provides more reliable data for monitoring and management of the power system. Ref. [19] presents a system that can be used to monitor power consumption and, if connected over the internet to dedicated cloud services, can display the monitored quantities in real time. The monitoring system was tested with a programmable logic controller (PLC)-based system and with one based on field-programmable gate arrays (FPGAs). The monitoring system can disconnect parts of the monitored system's circuits in order to identify power-hungry components or to study the implementation of different power-saving modes. Ref. [20] proposes a transformer power monitoring system in India. It measures the current, voltage, and temperature of the transformer and uploads the recorded data to the cloud. These data help to detect any anomalies in the transformer and alert the authorities to take action. Ref. [21] proposes an IoT power monitor to fulfill the needs of three-phase power monitoring, which securely transmits real-time data to a cloud server, making the data available anytime, anywhere.
Table 1 presents a comparison of related research both domestically and internationally. Based on the comprehensive results in Table 1, it can be seen that cloud computing has not yet formed a widely applicable solution in the field of power substations. Moreover, there are numerous challenges in terms of real-time performance, concurrency, stability, and security when it comes to the unified and centralized migration of power monitoring systems to the cloud. Against this background, this paper enumerates the existing architectures of power monitoring systems and compares them in terms of real-time performance, concurrency, stability, and security. It proposes a novel architecture for a high-concurrency and highly available "cloud, edge, and end collaboration" substation operation support system based on edge computing. The test results demonstrate that, compared to existing power monitoring system architectures, the proposed architecture has significant advantages in terms of large-scale substation integration and reasonable distribution of computing resources, meeting the requirements for scalable deployment [22].

Current Mainstream Power Monitoring System Architecture
According to the spirit of the document "Notice on Promoting the Optimization of Substation Operation and Maintenance Models and Pilot Construction of Centralized Control Stations" (Equipment Substation [2020] No. 57) issued by the State Grid Corporation of China, the pilot construction of centralized control stations is a key annual task of the State Grid Corporation. It aims to integrate and improve existing resources, optimize operation and maintenance management models, ensure high-quality implementation of pilot projects, and formulate reasonable construction plans for the pilot stations to steadily advance the pilot work. The 1000 kV UHV Dongwu Station and Taizhou Station of State Grid Jiangsu Electric Power Co., Ltd. (Nanjing, China) have been accepted by the State Grid Corporation of China (Beijing, China), making them the first batch of pilot projects for digital substations (converter stations). The achievements of these projects will be replicated and promoted within the State Grid system. Intelligent inspection is one of the key aspects of digital substation construction. A total of 198 visible light and infrared cameras have been deployed within the UHV Taizhou Station, covering all primary and secondary equipment. The construction of digital converter stations not only significantly reduces the workload of frontline personnel but also improves the accuracy of equipment status perception. In addition to the "smart eyes" distributed throughout the entire station, digital substations also possess the "brain" to diagnose equipment faults.
Beyond the State Grid Corporation, the Southern Power Grid Guangxi Guigang Power Supply Bureau has completed the transition and handover period for intensifying the transmission and substation business of 35 kV and above at the two county-level power supply enterprises under its jurisdiction, reaching the target one month ahead of schedule and entering the pilot application stage of the intensification work. Transmission and substation business intensification is an important means for Guangxi Power Grid Company to adapt to the needs of reform and development and to promote high-quality development. Through the intensification work, the mature management practices of the prefecture and city power supply bureaus are extended to the whole network, which greatly improves the equipment level, resource allocation, and operation and maintenance quality of county-level power grids. Summarizing the domestic substation intensification pilots, the mainstream power monitoring of domestic substations can be divided into three modes: direct connection to the master station, layered and hierarchical, and edge-end collaboration.

Mode One: Master Direct Connection Mode
The master station direct connection mode refers to the mode in which edge devices in the substation are directly connected to the central master station. In this mode, edge devices communicate and exchange data with the central master station through communication networks such as Ethernet and wireless communication. Therefore, the communication protocols and methods between edge devices and the central master station need to be compatible and consistent to ensure the correct transmission and parsing of data. The master station direct connection mode is a simple and efficient substation operation system architecture mode, suitable for smaller-scale substations and simpler application scenarios. It can provide real-time data transmission and response, facilitating system management and maintenance, but it has relatively weak scalability [23]. As shown in Figure 1, the station-side substation intelligent gateway (without AI functionality) only collects and forwards data, while all task scheduling and intelligent analysis functions are performed by the regional platform. This mode does not rely on the capabilities of intelligent gateways and achieves functionality and resource consolidation through the regional platform. However, the underlying framework of the system is complex and places high demands on hardware resources such as communication networks, master station computing power, and storage.

Mode Two: Layered and Hierarchical Mode
The layered and hierarchical mode is a pattern that divides and organizes system functions into different levels. By categorizing system functions into a hardware layer, a data acquisition and transmission layer, a data processing and storage layer, a control and decision-making layer, and a management and application layer, the substation operation support system can achieve functional isolation and decoupling between different levels, making the system more stable and reliable. At the same time, this mode also facilitates system scalability and upgrades, allowing flexible adjustments and optimizations of different levels according to actual needs [24]. As shown in Figure 2, the station-side substation intelligent gateway (without AI functionality) only collects and forwards data, while task scheduling and intelligent operations are implemented by the regional platform. Several capability support modules are deployed at the regional level to achieve intelligent inspection and intelligent safety functions. This mode eases the pressure on bandwidth, computing power, and other resources through multi-tier deployment. However, it increases the operation and maintenance cost of each regional layer.

Mode Three: Edge-End Collaborative Mode
The edge-end collaborative mode refers to the mode in which data interaction and collaborative work are achieved between edge devices in the substation and the central master station. The central master station, as the core of the entire system, is responsible for receiving and processing data transmitted by edge devices. At the same time, edge devices also have certain edge computing capabilities, allowing them to perform local data processing and decision-making. Through edge computing, edge devices can perform real-time analysis and fault diagnosis on the collected data, enabling quick response to changes in the substation's operational status [25]. As shown in Figure 3, the station-side substation intelligent gateway (with AI functionality) not only has the functions of data acquisition and forwarding but also has task execution and intelligent analysis capabilities. The regional platform only handles task scheduling and operational functions. This mode greatly relieves the communication and computing-power scheduling pressure on the regional platform by realizing most of the functions at the station side. However, it relies heavily on intelligent gateway technology that is still immature, and the resulting system architecture is still confined to small-scale applications such as inspection and maintenance centers and has not been deployed at large scale.
In summary, each mode has its own advantages and disadvantages: (1) Mode 1: This mode is simple and easy to implement and maintain. The central master station is responsible for data processing and decision-making. However, it may face challenges such as high pressure on the master station, a single point of failure, high latency, and an inability to perform task execution. This mode is mainly suitable for simple data collection and transmission scenarios. (2) Mode 2: This mode uses distributed processing, which reduces the pressure on the central master station and enables task execution. It requires hardware resource configuration at different levels and needs to consider data transmission and synchronization issues. This mode is mainly suitable for large-scale data processing and decision-making scenarios. (3) Mode 3: The edge devices have certain computing capabilities and can perform data processing and decision-making locally, collaborating with the central master station. This improves real-time performance and enables task execution. However, it may face limitations in edge device processing capabilities and challenges in coordination and synchronization between edge devices. This mode is mainly suitable for edge computing and scenarios with high real-time requirements, especially when handling a large amount of data that is difficult or slow to process.
Comparing the modes, in Mode 1 and Mode 2 the station-side substation intelligent gateway only collects and forwards data and cannot execute tasks. Compared to Mode 3, they are simpler but less practical. Mode 3, on the other hand, relies on intelligent gateway technology, which processes data slowly or with difficulty when a large amount of data arrives, and so cannot meet the future demand for real-time processing in substation monitoring systems.

Business Requirements
The business requirements of domestic mega-city power grids (400-600 substations) are selected as a comparative prototype.The business requirements for the system architecture are formulated in terms of the dimensions of the core business in the production area, such as inspection, operation, security control, and storage requirements [26].
In the remote inspection scenario, with about 2000 inspection points per substation, the inspection tasks for 600 substations must be completed within 1.5 h. In the remote operation scenario, 15 operation tasks are carried out at the same time (3 simultaneous operation tickets, each involving 5 substations), so the system must recognize the position of 15 operated devices within 1 min.
In the remote security control scenario, during the autumn maintenance peak about 150 substations have work under way at the same time, with an average of 2 controlled jobs per substation and 2 cameras per job, so the system must sustain work safety recognition on 600 video channels continuously for 4 h.
In terms of storage, the system needs to keep 10 TB of structured data together with videos and pictures of defective equipment permanently, and to store 50 TB of unstructured data for more than 3 months.
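The figures above can be turned into rough load numbers. The following is a minimal sketch that only restates the arithmetic implied by the requirements; it assumes the upper end of the 400-600 substation range, and the variable names are illustrative rather than taken from the system itself.

```python
# Back-of-the-envelope load figures implied by the stated business requirements.
SUBSTATIONS = 600                 # upper end of the 400-600 range (assumption)
POINTS_PER_SUBSTATION = 2000      # "about 2000 inspection points per substation"
INSPECTION_WINDOW_H = 1.5         # inspection must finish within 1.5 h

# Remote inspection: total points and the sustained recognition rate needed.
total_points = SUBSTATIONS * POINTS_PER_SUBSTATION
points_per_second = total_points / (INSPECTION_WINDOW_H * 3600)

# Remote operation: 3 simultaneous tickets, 5 substations each.
concurrent_operations = 3 * 5

# Remote security control: 150 substations x 2 jobs x 2 cameras.
video_channels = 150 * 2 * 2

print(total_points)                  # 1200000
print(round(points_per_second, 1))   # 222.2
print(concurrent_operations)         # 15
print(video_channels)                # 600
```

Under these assumptions the inspection scenario alone implies a sustained rate of roughly 222 recognized points per second across the whole grid, which is the kind of figure the hardware comparison has to accommodate.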

Comparison and Analysis
According to the business requirements of mega-city power grids, the advantages and disadvantages of the three construction modes are analyzed and compared from the dimensions of network requirements, hardware requirements, system performance, system reliability, construction cycle, economic costs, and operation and maintenance difficulties, as shown in Table 2. In terms of network demand, Mode 1 occupies the platform-side to station-side network throughout the whole process, with high bandwidth demand; Mode 2 occupies the regional-side to station-side network throughout the whole process, with partly high bandwidth demand; and Mode 3 occupies the network only when uploading inspection results, with low bandwidth demand.
In terms of hardware requirements, Mode 1 is expected to require 34 20-core CPU streaming servers on the platform side, with AI power ≥ 21,760 T; Mode 2 is expected to require 3 20-core CPU streaming servers on the platform side, with AI power ≥ 1920 T, and 32 20-core CPU streaming servers on the regional layer, with AI power ≥ 21,500 T; Mode 3 is expected to require 3 20-core CPU streaming servers on the platform side, with AI power ≥ 1920 T, and a station-side configuration of the smart gateway with 4-core or more CPU resources, with AI power ≥ 128 T.
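To see how the quoted AI power is distributed across tiers, the minimum figures can be summed per mode. This is a sketch under two assumptions that the text does not state explicitly: the 128 T figure applies to each station-side gateway, and there is one gateway per substation in a 600-substation grid. The function name `total_ai_power` is hypothetical.

```python
def total_ai_power(platform_t, regional_t=0, gateway_t=0, stations=0):
    """Sum the minimum AI power (in T) across platform, regional, and station tiers."""
    return platform_t + regional_t + gateway_t * stations

STATIONS = 600  # assumption: one intelligent gateway per substation

mode1 = total_ai_power(21_760)
mode2 = total_ai_power(1_920, regional_t=21_500)
mode3 = total_ai_power(1_920, gateway_t=128, stations=STATIONS)

for name, total in (("Mode 1", mode1), ("Mode 2", mode2), ("Mode 3", mode3)):
    print(f"{name}: at least {total} T in total")
```

Read this way, Mode 3 needs the most aggregate AI power, but almost all of it sits at the stations, so the central platform carries only the 1920 T share.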
In terms of system performance, Mode 1 video decoding and analysis need to be completed by the platform side, which is less efficient; Mode 2 video decoding and analysis are completed by each regional platform, which is more efficient; and, in Mode 3, video decoding and analysis are completed by the intelligent gateway at the station end, which is highly efficient.
In terms of system reliability, Mode 1 has low reliability: production operations may be halted in the event of system downtime, which is not acceptable. Mode 2 has the highest reliability: in the event of system downtime, access can be switched to the regional platform. Mode 3 has relatively high reliability: in the event of system downtime, the cyclical tasks of the station-side smart gateway can still be carried out normally.
In terms of construction cycle, Mode 1 has a long construction cycle, high pressure on system logic strategy and resource scheduling, and high technical difficulty. Mode 2 also has a long construction cycle, but its multiple clusters can simplify the scheduling strategy, reduce the pressure of resource scheduling on the platform, and lower the technical difficulty of the platform. Mode 3 has a short construction cycle, and the dispersed station ends schedule their resources independently.
In terms of economic cost, Mode 1 has a low overall construction cost, but the number of substations that can be accessed is low, and the long-term operational efficiency is low.Mode 2 has a high overall construction cost, high number of substations that can be accessed, and moderate long-term operational efficiency.Mode 3 has a high overall construction cost, meets the number of substations accessed in mega-cities, and has high long-term operational efficiency.

Solutions and Optimizations
After comparative analysis, this paper integrates the three modes and carries out a three-point optimization: it moves the regional-layer capability support modules to the platform side, makes maximum use of the AI intelligent gateway at the station side, and designs the station-side NVR to push video streams to both the station-side intelligent gateway and the master station platform. On the basis of the edge-end collaboration architectural mode, it proposes a high-concurrency, high-availability "cloud, edge, and end collaboration" substation operation support system architecture based on edge computing, which can satisfy the business requirements of mega-city power grids.
Optimization 1: Integrate the regional-layer capability support modules into the platform side and adopt cluster deployment so that the upper and lower levels act as primary and backup for each other, constructing a highly available, highly stable, and highly concurrent system architecture. Under highly concurrent access, load balancing of access, AI computing, network communication, and other resources improves the efficiency of resource use and ensures that the system has high reliability and compatibility with the gateways. This addresses the complexity of the underlying framework of Mode 1 and its extremely high demands on the carrying capacity of hardware resources such as the communication network, master station computing power, and storage.
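The load balancing and mutual backup described in Optimization 1 can be illustrated with a small sketch. It assumes a least-loaded dispatch policy with health-based failover, which the paper does not specify; the `Node` and `Cluster` classes and the node names are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    capacity: int          # maximum concurrent tasks (illustrative unit)
    active: int = 0        # tasks currently assigned
    healthy: bool = True

    @property
    def load(self) -> float:
        return self.active / self.capacity

class Cluster:
    """Dispatch each task to the least-loaded healthy node that has spare capacity."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def dispatch(self, task: str) -> str:
        candidates = [n for n in self.nodes if n.healthy and n.active < n.capacity]
        if not candidates:
            raise RuntimeError(f"no healthy node has capacity for {task!r}")
        node = min(candidates, key=lambda n: n.load)  # ties go to the first listed node
        node.active += 1
        return node.name

cluster = Cluster([Node("platform-a", 4), Node("platform-b", 4)])
first = cluster.dispatch("inspect-1")    # both idle, so platform-a takes it
second = cluster.dispatch("inspect-2")   # platform-b is now the less loaded node
cluster.nodes[0].healthy = False         # simulate platform-a going down
third = cluster.dispatch("inspect-3")    # failover: only platform-b is eligible
```

A production scheduler would also track AI computing and network load separately, as the optimization describes; the single `load` ratio here stands in for all of them.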
Optimization 2: Building on current equipment-state and equipment-defect identification algorithms, whose accuracy exceeds 85%, intelligent gateways with AI expansion capabilities are configured at the station side, and interface protocols and standards are unified. To make full use of edge-side computing power, general-purpose algorithms are sent from the cloud side, trained iteratively at the edge side, and the training results are fed back to the cloud side for continuous algorithm tuning. This addresses the high operation and maintenance costs of each regional layer in Mode 2.
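The cloud-to-edge-to-cloud tuning loop in Optimization 2 might look roughly like the sketch below. The paper does not say how the fed-back training results are merged on the cloud side; the sample-weighted averaging used here (in the spirit of federated averaging) is a swapped-in assumption, and `local_update` and `aggregate` are hypothetical helper names.

```python
def local_update(weights, gradient, lr=0.1):
    """One edge-side training step on the model received from the cloud."""
    return [w - lr * g for w, g in zip(weights, gradient)]


def aggregate(results):
    """Cloud-side merge of edge results.

    `results` is a list of (weights, n_samples) pairs, one per station.
    Assumption: stations are weighted by their number of local samples.
    """
    total = sum(n for _, n in results)
    dim = len(results[0][0])
    return [sum(w[i] * n for w, n in results) / total for i in range(dim)]


# The cloud sends a general-purpose model; two stations refine it locally.
cloud_model = [1.0, 2.0]
station_a = (local_update(cloud_model, [0.0, 0.0]), 1)  # 1 local sample
station_b = ([3.0, 4.0], 3)                             # 3 local samples
tuned = aggregate([station_a, station_b])
```

Weighting by sample count keeps a station with little local data from dominating the tuned model; other merge rules would fit the same loop.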
Optimization 3: The station-side NVR pushes the video stream to the station-side intelligent gateway and to the master station platform at the same time, decoupling the business-critical video stream from the intelligent gateway, whose technology is not yet mature. This solves the problem that Mode 3 relies too heavily on the intelligent gateway and cannot be promoted on a large scale.
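The decoupling in Optimization 3 can be sketched as an independent fan-out: the NVR delivers each frame to every consumer separately, so a gateway fault does not interrupt delivery to the master station. `Sink` and `push_frame` are hypothetical names, and the in-memory frame list stands in for a real streaming protocol, which the paper does not detail.

```python
class Sink:
    """A consumer of the NVR video stream (hypothetical stand-in)."""

    def __init__(self, name, failing=False):
        self.name = name
        self.failing = failing
        self.frames = []

    def receive(self, frame):
        if self.failing:
            raise ConnectionError(f"{self.name} unavailable")
        self.frames.append(frame)


def push_frame(frame, sinks):
    """Deliver a frame to each sink independently; return the names that failed."""
    failed = []
    for sink in sinks:
        try:
            sink.receive(frame)
        except ConnectionError:
            failed.append(sink.name)  # one sink's fault does not stop the others
    return failed


gateway = Sink("intelligent-gateway", failing=True)  # simulate an immature gateway
master = Sink("master-station")
failed = push_frame("frame-0001", [gateway, master])
```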

Architecture of "Cloud, Edge, and End Collaboration" Substation Operation Support System
The proposed "cloud, edge, and end collaboration" substation operation support system is based on the "cloud collaborative" substation operation support architecture, with the introduction of edge computing for real-time data collection and processing in substations. With the "cloud, edge, and end collaboration" architecture, real-time monitoring, fault diagnosis, load forecasting, and optimization scheduling of substations can be achieved. Additionally, this architecture can provide remote access and management capabilities, improving the efficiency of maintenance personnel and the reliability of substation operation.
4.1. Architecture of the "Cloud, Edge, and End Collaboration" System
Comparing and analyzing the existing architectures and optimizing them from different aspects also shows that the proposed "cloud-side-end collaboration" system architecture is technically feasible. Traditional IoT technology is divided into a perception layer, a network layer, and an application layer, which are responsible, respectively, for recognizing and perceiving objects and collecting data, for transferring information, and for processing data in combination with demand and intelligent applications. However, the explosive growth of data volume and application scenarios puts great pressure on real-time network data transmission, bandwidth capacity, and so on [27]. The introduction of edge computing and edge intelligence effectively relieves the burden on the communication architecture and on data processing, making the operation of the architecture more multi-layered and efficient.
The essence of edge computing is to decentralize some of the functions of the cloud center to the edge side of the network, close to the data source, in order to process data and related applications locally or nearby. In the context of the massive data generated by the Internet of Everything, edge computing not only reduces the data traffic pressure on the cloud center but also improves data processing efficiency. With its low latency, low bandwidth consumption, and high real-time performance, it is one of the key technologies promoting the development of the Internet of Things in electric power [28]. In the "cloud-side-end cooperative" system architecture, the cloud center handles non-real-time, high-complexity, global data services; the edge side supports small, real-time, local data services, which do not incur high equipment costs for computation and storage and are therefore highly economical; and the user side serves as the data source, providing diversified and fine-grained information to support upper-level decision-making and precise analysis.
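The division of labor described above (non-real-time, high-complexity, global services in the cloud; small, real-time, local services at the edge) can be expressed as a simple placement rule. The `Task` fields and the `place` function are hypothetical; the paper gives the criteria only qualitatively, so the strict three-way conjunction below is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    name: str
    real_time: bool
    scope: str       # "local" or "global"
    complexity: str  # "low" or "high"


def place(task: Task) -> str:
    """Return "edge" for small real-time local work, otherwise "cloud"."""
    if task.real_time and task.scope == "local" and task.complexity == "low":
        return "edge"
    return "cloud"


tasks = [
    Task("meter-reading check", real_time=True, scope="local", complexity="low"),
    Task("regional load forecast", real_time=False, scope="global", complexity="high"),
]
placement = {t.name: place(t) for t in tasks}
```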

Key Technologies for "Cloud, Edge, and End Collaboration"
The substation operation support system architecture proposed in this paper utilizes IoT technology combined with cloud-side collaboration technology, which comprises four major capabilities: data collaboration, application collaboration, intelligent collaboration, and operation and maintenance collaboration [29]. Together they ensure data consistency, management consistency, and service consistency between the cloud side and the plant and station side.
(1) Data collaboration: The gateway synchronizes, in real time, the business management data on the cloud side with the real-time production monitoring data on the plant and station side through a unified physical model, API service, data service, and messaging service, so that the data and the application experience are consistent on and off the cloud.
(2) Application collaboration: The gateway, docked to the cloud side, should be able to manage plant-side applications in a unified way and to deploy, upgrade, and monitor them remotely. The cloud-side application store is the server side and supports full life-cycle management of applications: trial, upload, release, subscription, deployment, and operation and maintenance. The application store on the plant and station side is the client side, which unifies the operation framework under the cloud and can collaborate and interact with the cloud.
(3) Intelligent collaboration: The gateway, docked to the cloud side, supports unified training, distribution, deployment, and remote upgrade of algorithms, and it is desirable for the gateway to be able to receive algorithms and run inference on the plant side. Algorithms trained on the cloud side can be pushed online to the plant-side gateway for inference, and sample data generated on the plant side can be sent back to the cloud side. The download time of a model update should be less than 2 h, and the update process should interrupt only the interface service of the algorithm being updated, without affecting the interface services of algorithms that are not being updated, so that those services can continue to operate. The interface service being updated shall be interrupted for no more than ten minutes, and once the update is complete it shall be able to continue the operations left unfinished before the update.
(4) Operation and maintenance collaboration: The gateway, docked to the cloud side, should be able to operate, maintain, and monitor the equipment, intelligent inspection devices, and intelligent sensor terminals on the plant side in a unified way, and should provide self-test, alarm, and logging services for them.
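The model-update requirements under intelligent collaboration (download under 2 h, interruption of at most ten minutes, only the updated algorithm's interface paused, unfinished operations resumed afterwards) can be sketched as follows. `AlgorithmGateway` and its methods are hypothetical, and the timings are passed in as plain numbers rather than measured; this is an illustration of the constraints, not the gateway's actual interface.

```python
MAX_DOWNLOAD_S = 2 * 60 * 60   # download must take less than 2 h
MAX_INTERRUPT_S = 10 * 60      # interruption must not exceed ten minutes


class AlgorithmGateway:
    """Per-algorithm interface services on a plant-side gateway (hypothetical)."""

    def __init__(self):
        self.versions = {}   # algorithm name -> deployed version
        self.available = {}  # algorithm name -> is its interface serving?
        self.pending = {}    # algorithm name -> jobs queued during an update

    def register(self, name, version):
        self.versions[name] = version
        self.available[name] = True
        self.pending[name] = []

    def submit(self, name, job):
        """Run a job, or queue it if this algorithm is being updated."""
        if not self.available[name]:
            self.pending[name].append(job)
            return None
        return f"{name}@{self.versions[name]}:{job}"

    def begin_update(self, name, download_s):
        if download_s >= MAX_DOWNLOAD_S:
            raise ValueError("model download exceeds the 2 h limit")
        self.available[name] = False  # only this algorithm's interface pauses

    def finish_update(self, name, new_version, interrupt_s):
        if interrupt_s > MAX_INTERRUPT_S:
            raise ValueError("interface interruption exceeds ten minutes")
        self.versions[name] = new_version
        self.available[name] = True
        jobs, self.pending[name] = self.pending[name], []
        return [self.submit(name, job) for job in jobs]  # resume unfinished work


gw = AlgorithmGateway()
gw.register("defect", "v1")
gw.register("state", "v1")
gw.begin_update("defect", download_s=600)
other = gw.submit("state", "img1")     # unaffected algorithm keeps serving
queued = gw.submit("defect", "img2")   # queued until the update completes
resumed = gw.finish_update("defect", "v2", interrupt_s=120)
```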

Architecture of "Cloud, Edge, and End Collaboration" Substation Operation Support System
Based on the summary and analysis of the existing architectures, and combined with the actual problems in the field of substations, the "cloud-side-end" cooperative substation operation system architecture is proposed; it is shown in Figure 4. End-side data can drive the operational efficiency of various aspects of substation monitoring from the bottom up by mining the value of data. Combined with the traditional top-down operation and management mode of a power grid, the idea of cloud-side-end collaboration is used to design an application framework that realizes two-way interaction between end-side data and cloud-side decision-making and uses edge computing technology to apply data in a distributed and hierarchical manner [30]. Figure 5 shows the main framework of the system architecture design based on "cloud-side-end collaboration".
In edge-end collaboration, the end side uploads smart meter, geographic, meteorological, and other data to nearby edge nodes, which use the processed real-time data and stored historical data as inputs to assess the substation status. Whereas the traditional monitoring system can only be dispatched through the cloud center, the edge nodes can be regarded as multiple "monitors" assigned by the cloud center to each region to undertake fine-grained monitoring tasks.
Cloud collaboration is mainly responsible for the data interaction between the cloud center and special equipment. For example, important maintenance equipment should have a direct communication channel with the cloud center to ensure the safe and stable operation of the equipment.

Deployment Architecture of "Cloud, Edge and End Collaboration" Substation Operation Support System
Figure 6 shows the architecture of the "cloud, edge, and end collaboration" substation operation support system. On the platform side, the cloud platform allocates basic hardware and software resources, including application servers, database servers, data collection servers, streaming media servers, analysis servers, middleware servers, storage servers, load balancing servers, and cluster management servers. The system adopts a distributed architecture, and the basic application platform is built on the Spring Cloud technology stack. All servers are deployed in dual-machine hot standby or cluster mode, and the system adopts application-level disaster recovery in off-site server rooms, which ensures the reliability of system operation while greatly enhancing the system's concurrent access and bearing capacity.
In response to the demand for concurrent access by multiple users and concurrent data requests from multiple stations, load balancing is used to optimize the connection mechanism for user access and data concurrency, ensuring smooth system access and stability under a large number of concurrent connections. For the business requirements of video stream forwarding, video stream parsing, and image and video analysis, the streaming media servers and analysis servers are deployed in cluster mode, which improves video distribution performance, AI analysis capability, and operational reliability, and at the same time ensures that the bureau platform can be expanded flexibly in the long term [31].
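To make the load-balancing step concrete, the sketch below distributes connections across a server cluster with a least-connections rule. The paper does not state which balancing strategy the platform uses, so least-connections is an assumption, and the class and server names are hypothetical.

```python
class LeastConnectionsBalancer:
    """Route each new connection to the server with the fewest active ones."""

    def __init__(self, servers):
        self.active = {server: 0 for server in servers}

    def acquire(self):
        # Ties resolve to the server listed first (dict insertion order).
        server = min(self.active, key=self.active.get)
        self.active[server] += 1
        return server

    def release(self, server):
        self.active[server] -= 1


lb = LeastConnectionsBalancer(["stream-1", "stream-2"])
first = lb.acquire()
second = lb.acquire()
third = lb.acquire()
lb.release(second)
fourth = lb.acquire()  # goes to the server that just freed a connection
```

Least-connections suits long-lived video sessions better than simple round-robin, since session lengths vary; it is only one of several strategies that would fit the description.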

Technical Architecture of "Cloud, Edge and End Collaboration" Substation Operation Support System
The technical architecture of the "cloud, edge and end collaboration" substation operation support system is divided into a perception layer, edge layer, network layer, digital platform layer, and application layer [32]. The structure is shown in Figure 7.
The perception layer collects substation equipment data comprehensively through digital meters, online monitoring devices, sensors, cameras, and other intelligent terminals.
The edge layer carries out data forwarding, analysis, and other processing through the intelligent gateway (with AI function). The edge layer uses a cloud platform to allocate basic hardware and software resources, including application servers, database servers, data collection servers, etc. [33]. The system can adopt a distributed architecture to ensure the reliability of system operation. Enhancing video distribution performance, AI analysis capability, and operational reliability ensures that the bureau platform can expand capacity flexibly in the long term.
The network layer ensures stable, high-speed data transmission through communication protocols and specifications and improves the reliability of the system. The substation operation support system uses the integrated data network for data transmission.
The digital platform layer realizes the operation and maintenance monitoring of data and ensures the application of equipment infrastructure through IoT platforms, data centers, and cloud platforms.
The application layer uses artificial intelligence, media, and related technologies to present applications and security management and to ensure real-time monitoring.

Engineering Application Examples
The "cloud, edge, and end collaboration" substation operation support system introduces edge computing and edge intelligence, bringing cloud functionalities closer to the edge where data sources are located. This enables on-site data processing and related applications, effectively alleviating the burden on communication architecture and data processing. It reduces the data traffic pressure on the cloud center while improving data processing efficiency. This architecture is characterized by low latency, low bandwidth consumption, and high real-time performance. To validate the effectiveness of the edge-end collaborative structure on the system, the proposed architecture is subjected to practical simulation testing [34].
The hardware environment for testing consists of a Qitian M620 desktop computer equipped with a four-core Intel Core i3-9100 CPU @ 3.60 GHz, 16 GB of memory, and a 1 TB hard drive. As for the software environment, Windows 10 Enterprise Edition is used as the test client operating system and LoadRunner 11 as the performance efficiency testing tool. The testing network environment is a LAN.
Test content and method: Performance points such as environment monitoring, backup monitoring, and inspection result confirmation download are selected. Access to the system by 500 concurrent users is simulated, and the response time, transaction success rate, server CPU occupancy rate, memory utilization rate, and other indices are recorded.
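The test method can be approximated with a small concurrent harness that records average response time and pass rate. This is a stand-in written with the Python standard library, not a LoadRunner script; `run_load_test` and `sample_transaction` are hypothetical, and the simulated transaction does no real network I/O.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import mean


def timed_call(transaction):
    """Run one transaction; return (elapsed seconds, passed?)."""
    start = time.perf_counter()
    try:
        ok = bool(transaction())
    except Exception:
        ok = False  # an exception counts as a failed transaction
    return time.perf_counter() - start, ok


def run_load_test(transaction, users=500):
    """Issue `users` concurrent calls and summarize the outcome."""
    with ThreadPoolExecutor(max_workers=users) as pool:
        results = list(pool.map(lambda _: timed_call(transaction), range(users)))
    durations = [elapsed for elapsed, _ in results]
    passed = sum(1 for _, ok in results if ok)
    return {
        "avg_response_s": mean(durations),
        "pass_rate_pct": 100.0 * passed / users,
    }


def sample_transaction():
    """Placeholder for a request to one performance point."""
    time.sleep(0.01)
    return True


stats = run_load_test(sample_transaction, users=50)
```

Server-side indices such as CPU occupancy and memory utilization would have to be collected on the server itself and are outside this sketch.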
In Section 2 of this paper, the three mainstream architectures in current power systems are analyzed and compared in detail: the master connection mode, the hierarchical classification mode, and the side-end collaboration mode. Compared with Mode 1 and Mode 2, Mode 3 not only collects and forwards data but also executes tasks, and among the current mainstream architectures it offers relatively high performance and benefits. To better validate the feasibility of changing the architecture, this test therefore also simulates concurrent access to the system by 500 users in Mode 3 and compares the final test results. The results are shown in Table 3 and Figure 8.
Test results: In the simulation of 500 users' concurrent access, the average response time of the environment monitoring performance point of the new mode is 3.166 s, with a passing rate of 99.9%, while that of Mode 3 is 5.423 s, with a passing rate of 92.65%. The average response time of the device monitoring performance point of the new mode is 5.656 s, with a passing rate of 99.99%, while that of Mode 3 is 8.462 s, with a passing rate of 93.42%. The average response time for the smart alarms of the new mode is 11 s, with a passing rate of 99.99%, while that of Mode 3 is 16.452 s, with a passing rate of 93.36%. The event passing rate of the inspection results is 99.99% for the new mode and 92.33% for Mode 3. Therefore, compared with Mode 3, the new system architecture is better at meeting most of the mega-city power grid business requirements.
Compared with the traditional mainstream architecture Mode 3, the new system architecture shortens the response time of environment monitoring by 2.257 s, equipment monitoring by 2.806 s, intelligent alarm by 5.024 s, and inspection result confirmation by 1.647 s. Meanwhile, the success rate of key task execution increases by 7.34 percentage points for environment monitoring, 6.57 for equipment monitoring, 6.62 for intelligent alarm, and 7.66 for inspection result confirmation download. Because intelligent gateway technology is still immature, the side-end system architecture of Mode 3 remains confined to a small range of applications, such as inspection and maintenance centers, and has failed to scale up. The comparison of results shows that, relative to the side-end collaborative architecture of Mode 3, the new system architecture offers a marked improvement in real-time processing and accuracy.
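As a worked check, the reported savings for environment and equipment monitoring follow directly from the average response times quoted in the test results. The `improvement` helper is hypothetical, and the relative reductions are derived here from the reported figures rather than stated in the paper.

```python
def improvement(baseline_s, new_s):
    """Return (seconds saved, relative reduction in percent)."""
    saved = round(baseline_s - new_s, 3)
    relative_pct = round(100.0 * (baseline_s - new_s) / baseline_s, 1)
    return saved, relative_pct


# Average response times reported for Mode 3 and for the new architecture.
env = improvement(5.423, 3.166)      # (2.257, 41.6)
device = improvement(8.462, 5.656)   # (2.806, 33.2)

# Pass-rate gain for equipment monitoring, in percentage points.
device_gain_pts = round(99.99 - 93.42, 2)  # 6.57
```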
Under the proposed architecture, the intelligent inspection result confirmation and defect management are analyzed. The results are shown in Figures 9 and 10. The average time for inspection result confirmation is only 0.3 s, the average time for defect management is also only 0.3 s, and the correct rate is 100%. This shows that the architecture proposed in this paper can effectively support these services under concurrent access by 500 users.
Based on the comparison of simulation test and analysis results, the proposed "cloud-side-end coordination" substation operation support system architecture performs better. Compared with the traditional cloud coordination system architecture, which is only applicable to small-scale grid systems, the "cloud-side-end coordination" system architecture can better meet the business requirements of most mega-city grids.

Conclusions
By organizing and summarizing the existing technologies related to substation systems, the "cloud-edge-end" collaborative capability is analyzed, and a "high availability, high concurrency, high performance, high security" architecture for the "cloud, edge, and end collaboration" substation operation support system is proposed. In the simulation test section, the system architecture shows better real-time performance and accuracy, allowing it to better cope with the business requirements of large-scale power grids. Meanwhile, based on simulation test comparisons, the proposed architecture for the substation operation support system achieves precise monitoring of equipment operation, unified management of operational data, and integration of production command decision-making. It can effectively respond to grid operation risks and comprehensively support the digital transformation of substations. Application examples demonstrate that the proposed architectural approach has significant advantages in terms of large-scale substation integration and reasonable distribution of computing power. The passing rate under different performance conditions exceeds 99%, providing effective technical support for concurrent access by 500 users and meeting the requirements for the scaled-up promotion of digital substations. Compared with the traditional system architecture Mode 3, the substation operation architecture proposed in this paper can greatly improve the real-time performance and accuracy of the monitoring system and can play a better role in large-scale urban power grid operation and management.
In the proposed architecture for the substation operation support system, further analysis and research are required to consider system performance when there is concurrent access by a larger number of users. In subsequent research, based on a unified technical approach, the architecture of cloud-edge collaboration will be implemented by selecting pilot substations and converter stations to complete the coordination and verification of edge and end devices.

Conclusions
By organizing and summarizing the existing technologies related to substation systems, the "cloud-edge-end" collaborative capability is analyzed, and a "high availability, high concurrency, high performance, high security" architecture for the "cloud, edge, and end collaboration" substation operation support system is proposed. In the simulation tests, the system architecture shows better real-time performance and accuracy, and so copes better with the business requirements of large-scale power grids. Based on the simulation test comparisons, the proposed architecture achieves precise monitoring of equipment operation, unified management of operational data, and integration of production command and decision-making. It can effectively respond to grid operation risks and comprehensively support the digital transformation of substations. Application examples demonstrate that the proposed architecture has significant advantages in large-scale substation integration and reasonable distribution of computing power. The passing rate under different performance conditions exceeds 99%, providing effective technical support for concurrent access by 500 users and meeting the requirements for the scaled-up promotion of digital substations. Compared with the traditional system architecture Mode 3, the substation operation architecture proposed in this paper can greatly improve the real-time performance and accuracy of the monitoring system and can play a better role in large-scale urban power grid operation and management.
For the proposed substation operation support system architecture, further analysis and research are required on system performance under concurrent access by a larger number of users. In subsequent research, based on a unified technical approach, the cloud-edge collaboration architecture will be implemented at selected pilot substations and converter stations to complete the coordination and verification of edge and end devices.
side, which unifies the operation framework under the cloud and can collaborate and interact with the cloud. (3) Intelligent collaboration: It means that the gateway is docked to the cloud side with

Figure 6. Architecture diagram of "Cloud, Edge, and End Collaboration" substation operation support system.

Mode 3 and analyze the comparison through the final test results. The results are shown in Table

Figure 9. Chart confirming the results of inspections.

Table 1. Detailed comparison of the proposed method with published papers.

Table 3. Results of 500 concurrent user accesses to the system.
