Abstract
The subway power supply system, as a critical component of urban rail transit infrastructure, plays a pivotal role in ensuring operational efficiency and safety. However, current systems remain heavily dependent on manual interventions for fault diagnosis and recovery, limiting their ability to meet the growing demand for automation and efficiency in modern urban environments. While the concept of “self-healing” has been successfully implemented in power grids and distribution networks, adapting these technologies to subway power systems presents distinct challenges. This review introduces an innovative approach by integrating multi-agent systems (MASs) with advanced artificial intelligence (AI) algorithms, focusing on their potential to create fully autonomous self-healing control architectures for subway power networks. The novel contribution of this review lies in its hybrid model, which combines MASs with the IEC 61850 communication standard to develop fault diagnosis, isolation, and recovery mechanisms specifically tailored for subway systems. Unlike traditional methods, which rely on centralized control, the proposed approach leverages distributed decision-making capabilities within MASs, enhancing fault detection accuracy, speed, and system resilience. Through a thorough review of the state of the art in self-healing technologies, this work demonstrates the unique benefits of applying MASs and AI to address the specific challenges of subway power systems, offering significant advancement over existing methodologies in the field.
Keywords:
subway power supply systems; self-healing technologies; multi-agent systems (MASs); IEC 61850 standard; fault diagnosis and isolation; artificial intelligence (AI) algorithms; power grid self-healing; distribution networks; fault recovery; real-time fault detection; automated restoration processes; decentralized control systems; subway network resilience; predictive maintenance
1. Introduction
With the rapid pace of global urbanization, subways have become an essential solution to alleviate urban traffic congestion. According to the China Urban Rail Transit Association, subway operating mileage and passenger numbers continue to grow, cementing subways as a cornerstone of urban transportation [1]. As urban populations expand, ensuring the reliability of subway power supply systems has become increasingly crucial. Failures in the power supply system can lead to disruptions in subway operations, negatively impacting passenger safety and system efficiency. Traditionally, these systems have relied on manual interventions for fault diagnosis and recovery, limiting their ability to address the growing demand for automation and rapid response in modern urban environments.
To address these critical challenges, this review presents a novel approach for integrating multi-agent systems (MASs) with advanced artificial intelligence (AI) algorithms to enable fully autonomous self-healing capabilities in subway power systems. The integration of MASs with AI technologies aims to enhance subway power systems’ ability to detect, isolate, and recover from faults more efficiently than traditional methods, which heavily rely on centralized control. This hybrid system enables distributed decision-making, allowing for real-time, local fault detection and diagnosis without central authority, thus reducing response times and improving system resilience.
The main objectives of this review are as follows:
- (1) Investigate the historical development and current state of self-healing technologies in power supply systems, with a particular focus on their adaptation and application in subway power systems.
- (2) Analyze how MASs and AI enhance the capabilities of subway systems in fault detection, isolation, and recovery, enabling autonomous decision-making and real-time responses to system failures.
- (3) Examine the integration of the IEC 61850 communication standard with MASs [2], and how this contributes to decentralized control, improving fault recovery and enhancing the scalability of self-healing systems in subway power networks.
- (4) Address the unique challenges faced by subway systems, such as reliability, response times, fault management, and system resilience, and propose integrated solutions through the application of MASs and AI.
The reliability and efficiency of subway systems are tightly coupled with the performance and stability of their power supply systems. Ensuring uninterrupted service and the safety of passengers requires that power systems be able to self-heal, automatically recovering from faults and minimizing disruptions. While self-healing technology has been widely researched and implemented in power grids, adapting this technology to subway systems presents a distinct set of challenges due to the unique operational environment of urban rail systems. Traditional centralized control methods often fail to provide the level of speed, accuracy, and resilience required for the dynamic and complex environment of subway power systems.
Self-healing technologies in power systems allow for autonomous fault recovery without relying on human intervention, improving overall system reliability. First introduced in U.S. power grids, this technology enables the automatic identification and isolation of faults and the restoration of power, significantly improving system performance and minimizing downtime [3]. This review explores how MASs and AI, integrated with the IEC 61850 communication standard, offer a decentralized, autonomous approach that is a marked improvement over traditional fault recovery methods. This combination has the potential to transform subway power systems by providing faster and more efficient responses to faults.
The hybrid model proposed in this review utilizes MASs to decentralize decision-making, allowing each agent within the system to independently detect, diagnose, and resolve faults. The decentralized nature of MASs enhances fault detection by distributing decision-making processes across multiple system components, enabling quicker responses and improving fault isolation accuracy. Furthermore, the integration of AI enhances predictive capabilities, enabling the system to anticipate potential failures and proactively manage faults before they escalate into service disruptions. Compared to traditional centralized methods, this decentralized approach offers greater flexibility, scalability, and resilience, addressing the dynamic and increasingly complex demands of modern subway networks.
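The distributed decision-making described above can be illustrated with a minimal sketch. It assumes a radial feeder supplied from one end, one agent per section, and a fixed overcurrent pickup. The class `SectionAgent`, the function `locate_faulted_section`, and the 3000 A threshold are hypothetical names and values chosen for illustration and are not taken from the reviewed literature. Real DC traction feeders are often fed from both ends, so this logic would need to be extended for practical use.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class SectionAgent:
    """Hypothetical agent that guards one feeder section."""
    name: str
    current_a: float           # locally measured current, in amperes
    pickup_a: float = 3000.0   # assumed overcurrent pickup (illustrative value)

    def sees_fault_current(self) -> bool:
        # Local detection: uses only this agent's own measurement.
        return self.current_a > self.pickup_a

    def owns_fault(self, downstream_sees_fault: bool) -> bool:
        # Local diagnosis: fault current passes through this section
        # but does not reach the next one, so the fault lies here.
        return self.sees_fault_current() and not downstream_sees_fault


def locate_faulted_section(agents: List[SectionAgent]) -> Optional[str]:
    """Agents are ordered from the source outward along a radial feeder.

    Each agent only needs the flag of its immediate downstream neighbour,
    so no central controller has to collect every measurement.
    """
    for i, agent in enumerate(agents):
        has_next = i + 1 < len(agents)
        downstream_flag = agents[i + 1].sees_fault_current() if has_next else False
        if agent.owns_fault(downstream_flag):
            return agent.name
    return None


if __name__ == "__main__":
    feeder = [
        SectionAgent("A", 5200.0),
        SectionAgent("B", 5100.0),
        SectionAgent("C", 400.0),
    ]
    print(locate_faulted_section(feeder))  # prints: B
```

In this sketch the only information exchanged is a single Boolean flag between neighbouring agents, which is the kind of short status message that a peer-to-peer protocol could carry.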
The integration of MASs with the IEC 61850 communication standard represents a novel approach in self-healing technology, moving beyond conventional methods that primarily rely on centralized control systems. By empowering each agent within the system to independently detect, diagnose, and resolve faults, this hybrid architecture offers a significant improvement in speed, accuracy, and system resilience compared to traditional fault recovery methods. Through this novel integration, we aim to provide a comprehensive and scalable solution to enhance the reliability of subway power systems, setting the foundation for more autonomous urban transport networks.
Although subway power systems differ from traditional power grids in function and requirements, they similarly demand high reliability and fast response capabilities. Wang (2010) [4] and Du (2010) [5] explored fault diagnosis and protection methods in traction power systems, laying the foundation for research into self-healing technologies in subway systems. Subsequently, research into fault response and recovery in subway systems began incorporating MASs and AI to enhance the automation and intelligence of fault handling [6,7]. The application of MASs in subway power systems mainly focuses on optimizing fault detection, diagnosis, and system recovery. Song (2015) conducted an in-depth study of fault location in urban rail transit traction power systems, proposing MAS-based optimization strategies [8]. Additionally, AI technologies, particularly machine learning and deep learning, have been widely applied in fault data analysis and fault prediction [9,10].
Research on self-healing technology in subway power systems not only emphasizes rapid fault recovery but also explores how technological integration and innovation can improve overall system stability and reliability. For instance, Wang and Lv (2022) improved fault location accuracy and efficiency by studying fault point distance measurement methods in direct current (DC) traction power systems [11]. Another unique challenge faced by subway systems is how to restore power quickly without interrupting service. Wei et al. (2023) utilized global positioning system (GPS) time synchronization technology to enhance fault distance measurement accuracy, providing technical support for the rapid recovery of subway systems [12]. Additionally, Jin et al. (2017) conducted simulation studies on fault location in subway DC power systems using time-domain differentiation methods, improving fault response efficiency [13]. The integration of advanced AI technologies and real-time communication protocols like IEC 61850 has further enhanced the fault diagnosis process, enabling subway systems to predict and address multi-fault scenarios that traditional methods would struggle to handle. Reliability studies are also a crucial aspect of the development of self-healing technologies in subway power systems. Pei (2018) conducted an in-depth study on the reliability of subway traction power systems, identifying key technologies and methods for improving system reliability [14]. Meanwhile, Zhou (2012), in his master’s dissertation, analyzed online reliability assessments of subway power systems, providing scientific support for real-time monitoring and maintenance [15].
The subway power system is vast, with numerous risk points, and any fault can have widespread consequences, negatively impacting trains, passengers, and equipment. It could even lead to serious disruptions in traffic and social order. However, the current capabilities of subway power systems in fault analysis, handling, recovery, and prediction are relatively weak and inefficient. In the case of a failure, the system still relies heavily on emergency repairs and manual interventions, which fall short of the high service standards expected for modern subway systems.
Since 1999, when the United States’ “Consortium for Electric Infrastructure to Support a Digital Society (CEIDS)” [3] applied the concept of “self-healing” to the power grid, it has become a research focus and a key marker of grid intelligence. The “IntelliGrid” research project of the Electric Power Research Institute (EPRI) and the “Modern Grid Initiative” research project of the U.S. Department of Energy’s National Energy Technology Laboratory (NETL) have both made self-healing a primary research topic for the next generation of power grids. Similarly, in 2009, China’s State Grid Corporation proposed the development plan for a “robust smart grid”, emphasizing eight key characteristics for smart grids: self-healing, incentivizing and accommodating users, resisting attacks, providing power quality that meets the demands of the 21st century, allowing for the integration of various forms of power generation and storage, supporting a thriving electricity market, ensuring the optimal and efficient operation of assets, and utilizing high-speed communication and online monitoring. Thus, both domestic and international grids consider “self-healing” as a primary feature of next-generation smart grids [16].
The application of MASs and AI in subway power systems builds on extensive prior research in fault detection, location, and recovery within traditional power grids. Notable contributions to self-healing technologies in power systems have been made by initiatives such as the IntelliGrid project by the EPRI and the Modern Grid Initiative by the U.S. Department of Energy, which have explored the integration of self-healing technologies into grid systems [11,12]. These projects have demonstrated the effectiveness of self-healing technologies in improving grid stability, fault recovery speed, and reducing downtime, providing valuable insights into their potential application in subway systems.
Furthermore, the International Electrotechnical Commission (IEC) 61850 communication standard, i.e., Communication Networks and Systems for Power Utility Automation, initially developed for use in substation automation, is increasingly being adopted in subway power systems. This standard enables real-time data exchange and ensures interoperability between different devices within the subway power network. Its integration with MASs creates a highly responsive and adaptive environment that enhances the coordination of fault recovery efforts, optimizes energy distribution, and improves overall system resilience. IEC 61850 facilitates synchronized operations across multiple agents in the system, ensuring that fault recovery actions are carried out quickly and efficiently, with minimal impact on subway operations.
The need for autonomous fault management is particularly pressing in the context of subway systems. These systems are highly complex, with a large number of potential fault points across numerous subsystems. Faults in the power supply can cause significant disruptions, affecting train operations, passenger safety, and equipment functionality [17,18]. Despite advancements in fault management, current subway power systems still rely on manual interventions and are often slow to respond to issues. Self-healing technology, when integrated with MASs and AI, has the potential to address these challenges by providing autonomous, real-time responses to faults, thus improving system reliability and reducing downtime [19,20].
The unique operational environment of subway systems also presents several additional challenges. Power supply systems in subway networks must be able to maintain continuous service, even when faults occur, which is crucial for minimizing service interruptions and ensuring the safety of passengers [21,22]. Moreover, subway power systems often experience high levels of dynamic demand, with power requirements fluctuating throughout the day. This requires power systems to be highly adaptable and responsive to changes in demand, which traditional fault recovery methods are not equipped to handle. MASs and AI provide the necessary intelligence and flexibility to manage these dynamic conditions, enabling subway power systems to function more efficiently and reliably [15,23,24].
Overall, the self-healing technology of subway power systems plays a crucial role in enhancing the reliability and efficiency of these systems. As a key support technology for urban transportation, it not only ensures the safety and smooth operation of city traffic but also drives the modernization of subway systems through technological innovation and system optimization, enabling them to better meet the complex demands of modern urban development. By enabling faster fault detection and response, the self-healing technology in subway power systems reduces downtime, thereby improving service continuity and reliability. This technology is not limited to resolving existing faults but also aims at preventing potential issues, significantly boosting the overall operational efficiency of the subway system [25]. Moreover, the integration of self-healing technology facilitates real-time monitoring, which is crucial in predicting and preventing potential service interruptions [26]. Self-healing technology further contributes to the overall safety and efficiency of subway systems, offering real-time data insights for enhanced operational resilience and passenger safety. Safety is the top priority in urban rail transit systems. Self-healing technology significantly enhances the safety standards of subway power systems by enabling real-time monitoring and automatic adjustments of system settings. For instance, deploying advanced sensors and monitoring equipment allows for real-time detection of the power supply line’s status, enabling quick fault identification and isolation to prevent potential accidents [27]. The application of this technology greatly reduces train delays and accidents caused by power supply issues, providing passengers with a safer and more stable travel environment. Self-healing technology also plays a significant role in improving passenger convenience. 
By optimizing the self-healing capabilities of the subway power system, service interruptions due to power failures are minimized, ensuring smoother and more uninterrupted travel for passengers [28]. Furthermore, with the integration of self-healing technology and mobile connectivity, passengers can access real-time train operation statuses and fault recovery progress through smartphone applications, enhancing the transparency and convenience of the travel experience [29].
As urbanization accelerates, subway systems are expected to face increasing demands for higher capacity, greater operational efficiency, and faster response times. Self-healing technology in subway power systems is essential for meeting these demands, as it can improve service continuity, enhance system resilience, and reduce the need for manual intervention [30]. For example, through intelligent upgrades, subway systems can automatically adjust operating frequencies and power distribution during peak periods, optimizing resource utilization to meet the continuously changing passenger flow demands [31]. By integrating advanced monitoring technologies, AI-based fault diagnosis, and automated recovery mechanisms, self-healing systems can proactively address potential failures, enhancing system stability and ensuring seamless subway operations. Looking ahead, the self-healing capabilities of subway power systems will continue to improve with ongoing technological advancements and innovations. This improvement is not limited to technical innovations but also includes optimizing management strategies and operational models. Through these comprehensive measures, subway systems will be able to provide safer, more convenient, and more reliable services to passengers, while contributing to the sustainable development of cities [32]. Thus, in the process of urban development and modernization, subway systems, as an integral part of public transportation, play a crucial role. Subways not only significantly improve the efficiency of urban transportation but also help reduce road congestion and environmental pollution. However, the efficient operation of subway systems heavily relies on the reliability and stability of their power supply systems. In the event of a power system failure, operations can be disrupted, and safety incidents may occur, causing serious impacts on city operations and the daily lives of citizens. 
Therefore, researching and implementing self-healing technologies for subway power systems has become a critical necessity, both to enhance system reliability and to ensure passenger safety and improve the service experience. The necessity of this research can be summarized in the following points.
- (1) Enhancing the Reliability and Efficiency of Power Supply Systems: The self-healing technology in subway power systems can significantly improve the system’s automatic diagnosis and fault recovery capabilities, thereby reducing service interruptions caused by system failures. By incorporating advanced monitoring technologies and automation tools, the self-healing system can respond rapidly to faults, minimizing reliance on manual intervention, and increasing the speed and accuracy of fault resolution.
- (2) Ensuring Safe and Smooth Urban Transit: As a major public transportation system, the safety of subway operations directly impacts the lives of thousands of passengers and the overall public safety of the city. Self-healing technology helps prevent accidents caused by power instability or interruptions by promptly detecting and addressing power supply issues, significantly improving the safety of subway operations.
- (3) Adapting to the Needs of Modern Urban Development: As urbanization accelerates, subway systems face increasing challenges, including rising passenger numbers, higher service expectations, and more complex operating environments. Self-healing technology, through intelligent management and real-time data analysis, optimizes the performance of subway power systems, better meeting these evolving demands.
- (4) Improving Passenger Experience: The application of self-healing technology goes beyond enhancing technical performance; it directly improves the passenger experience by reducing faults and delays. For example, the system can automatically isolate and repair minor faults without disrupting the entire network, providing passengers with more stable and reliable service.
- (5) Driving Technological Innovation and Industry Progress: Research into self-healing technology for subway power systems has spurred innovations in related technologies, including applications in artificial intelligence, the Internet of Things (IoT), and big data analytics. The integration and innovation of these technologies not only optimize subway power systems but also promote the development of intelligent transportation and smart city technologies.
Research on self-healing technology in subway power systems is of great significance for ensuring the safe, efficient, and reliable operation of urban rail transit. It not only meets the demands of modern cities for high-standard public transportation systems but also provides an effective means to enhance the technological level and service quality of public transit systems. On this basis, this paper provides a detailed summary of the research progress on self-healing technology in subway power systems, with a particular focus on the integrated application of MASs and the IEC 61850 standard. The main contents of this review are summarized as follows.
- (1) Introduction to the Concept of Self-Healing: This paper begins by introducing the basic concept of self-healing, which originally stems from biological systems. It explains how this concept has been adapted for use in power systems. The primary function of self-healing technology in power systems is to reduce human intervention by automating the processes of fault detection, isolation, and recovery. This, in turn, enhances the reliability and efficiency of the overall system. In terms of historical background, this paper reviews the evolution of the self-healing concept within power systems. It highlights notable initiatives such as EPRI’s IntelliGrid project and the U.S. Department of Energy’s Modern Grid Initiative, both of which signify the integration of self-healing technologies as essential components of modern intelligent energy systems.
- (2) Self-Healing Control Architectures: The discussion then shifts to various control architectures employed in self-healing technology for distribution networks. This paper compares hierarchical control systems with MASs, illustrating the shift from centralized systems to more decentralized and faster-responding systems. This paper emphasizes that, although self-healing technologies have been extensively researched and developed in traditional power systems, they are relatively new in the context of subway power systems. It advocates for adapting self-healing technology to subway systems by leveraging the unique characteristics of these systems and incorporating both MASs and the IEC 61850 standard.
- (3) Fault Diagnosis and Recovery Technologies: This paper provides a comprehensive review of current technologies used for fault location, isolation, and recovery in distribution networks and railway systems. Special attention is given to the application of these technologies in subway systems, where both direct judgment methods and computational analysis approaches for fault location are explored. In terms of innovation, this paper discusses the potential of using AI for fault diagnosis and recovery. The integration of AI is seen as a promising way to significantly enhance the system’s ability to address complex, multi-point faults.
- (4) Development of New Technologies and Challenges: This paper forecasts the future application of hybrid augmented intelligence and generative AI in subway power systems. These emerging technologies are expected to be effective tools for solving complex fault scenarios. However, this paper also discusses the technical challenges posed by the introduction of flexible direct current (DC) technology into subway power systems. It examines how this development may introduce new challenges for implementing self-healing technologies in these systems.
Through a comprehensive review and analysis, this paper not only clarifies the current state of research and future directions for self-healing technology in subway power systems but also provides a theoretical foundation and technical guidance for achieving intelligent and autonomous subway operations. This review offers an in-depth exploration of self-healing technologies for subway power systems, with a particular focus on the application of MASs and the IEC 61850 standard. This research is of significant academic value and offers critical insights and impetus for related fields of study and practice. The key contributions of this paper are summarized as follows:
- (1) Enhancing the Stability and Reliability of Subway Power Systems: The application of self-healing technologies can significantly reduce service interruptions and accidents caused by power issues in subway operations. This, in turn, improves the overall stability and reliability of the subway power system, which is crucial for meeting the growing demand for urban public transportation, ensuring the safety and efficiency of travel for millions of passengers.
- (2) Promoting the Development of Intelligent Transportation Systems: By integrating advanced information technologies and communication standards such as IEC 61850, self-healing technology in subway power systems not only improves the efficiency of fault management but also accelerates the development of intelligent transportation systems. These integrated technologies provide vital support for building smart cities.
- (3) Optimizing Energy Management and Environmental Sustainability: Self-healing technologies contribute to optimizing energy distribution and usage efficiency, helping reduce energy consumption and environmental impact. When applied globally, these technologies can positively influence energy conservation, emissions reduction, and environmental protection.
- (4) Inspiration and Advancement for Related Fields. (i) Cross-application of smart grid technologies: The self-healing technology in subway power systems draws from key aspects of smart grid technology, such as real-time data monitoring and automated fault response. This not only enhances the level of automation in subway systems but also provides new approaches and methodologies for applying smart grid technologies in other fields. (ii) Fostering multidisciplinary integration: This paper emphasizes the integration of MASs and artificial intelligence in self-healing systems for subway power supply. This multidisciplinary fusion promotes collaboration across fields such as computer science, electrical engineering, and transportation engineering, opening up new research and application areas. (iii) Inspiring new business models and policy development: The advancement of self-healing technologies in subway power systems may inspire new business models, such as performance-based service contracts and advanced maintenance services. It may also encourage governments and industries to establish relevant standards and policies to support the widespread deployment and application of such technologies.
This review aims to provide a comprehensive synthesis of the research landscape on self-healing subway power supply systems, focusing specifically on the integration of MASs and AI technologies. We examine how these technologies contribute to improving fault detection, diagnosis, and recovery in subway power supply systems, and explore the challenges and future directions for further development. Through this exploration, we aim to present a clear understanding of the existing advancements in this field and propose avenues for future research to enhance the reliability and efficiency of subway power systems. The review is intended to provide a theoretical foundation and technical roadmap for self-healing technologies in subway power systems, and to offer useful insights for researchers in related fields.
In summary, integrating MASs, AI, and the IEC 61850 communication standard represents a promising approach to self-healing subway power systems. This combination offers autonomous, distributed decision-making capabilities that can significantly improve the speed and accuracy of fault detection, isolation, and recovery. By addressing the unique challenges of subway systems, these technologies can play a pivotal role in enhancing the resilience, efficiency, and safety of urban transit infrastructure. The following sections of this review explore these advancements in detail, providing a comprehensive overview of the state of the art in self-healing technologies and their potential to transform subway power supply systems.
The rest of this article is organized as follows: Section 2 provides a comprehensive review of the current state of self-healing technologies within electrical and subway power systems, detailing the historical development and recent advancements that establish the foundation for subsequent discussions. This section also introduces the key concepts and terminologies used throughout this paper, setting the stage for deeper exploration. Section 3 delves into the specific challenges faced by subway power supply systems, distinguishing them from general electrical grids, with a focus on their unique operational demands and the critical need for reliable power delivery. It presents a critical analysis of existing fault diagnosis and protection methodologies as explored by significant studies in the field. Section 4 explores the integration of MASs and the IEC 61850 standard into subway power systems. This section outlines how these technologies converge to enhance the self-healing capabilities of subway systems, offering a detailed discussion on the synergy between advanced control architectures and standardized communication protocols. Section 5 presents novel research findings and practical applications of self-healing techniques in subway systems. It discusses various case studies and experimental results that demonstrate the effectiveness of MASs and AI algorithms in improving fault detection, isolation, and recovery processes. Section 6 discusses the implications of these technologies for future developments in subway power systems. It highlights the potential for broader applications of AI and MASs in enhancing the automation and intelligence of urban transit systems, proposing a roadmap for future research. Finally, Section 7 concludes the article with a summary of the key findings and their implications for the field of subway power supply systems. 
It reiterates the importance of advancing self-healing technologies to meet the growing demands of modern urban transportation, calling for continued research and collaboration within the field.
Taken together, these sections form a single line of argument. Section 2 lays the groundwork by reviewing the history and current state of self-healing technologies; Section 3 addresses the unique challenges and needs of subway systems; and Section 4 builds on both by discussing the integration of MASs and the IEC 61850 standard, which are crucial for enhancing fault management capabilities. This technical exploration feeds into Section 5, where case studies illustrate the practical effectiveness of these technologies. Section 6 then explores broader implications and future potential, and Section 7 synthesizes the discussion, summarizing key insights and affirming the importance of continued research.
2. Review of Self-Healing Technologies Within Electrical and Subway Power Systems
In this section, we first introduce the concept of self-healing (Section 2.1). Then, by examining its historical evolution (Section 2.2), the key architectural components of self-healing frameworks (Section 2.3), and their emerging adoption and future prospects in subway power systems (Section 2.4), we establish a foundation for understanding how self-healing technologies can, and increasingly do, shape modern electrical and traction networks. This exploration underscores the technical sophistication, interdisciplinary nature, and forward-looking research opportunities that define self-healing as a transformative force in the pursuit of reliable, efficient, and intelligent urban power infrastructures.
2.1. The Concept of Self-Healing in Metro Power Supply Systems
The concept of “self-healing” originates from biology, where it refers to the intrinsic ability of living organisms to maintain homeostasis and recover from external disturbances and damage. In the context of power grids, self-healing generally denotes the capability of an electrical network to identify and isolate faults rapidly and to restore power supply to critical loads, ideally with minimal or no human intervention [31]. This process is often likened to the immune system in biological organisms, which functions to detect threats, isolate them, and re-establish normal operation. The earliest formal definition of self-healing in the electric grid was provided by the Electric Power Research Institute (EPRI) in the SPID (Strategic Power Infrastructure Defense System) project [32], which emphasized adaptive measures (e.g., intentional islanding, adaptive protection, information, and sensing) to mitigate various threats, including natural disasters, communication failures, market perturbations, and deliberate acts of sabotage. While theoretical advancements in self-healing systems are promising, real-world applications remain underexplored, and empirical studies will be necessary to test the efficacy of these techniques in operational subway power systems.
In China’s power industry, self-healing has been similarly characterized as the procedure by which problematic or failed components in the grid are detected and isolated automatically, or with minimal operator interaction, such that the broader system swiftly returns to normal operational conditions. In international practice, the essential self-healing functionality is broadly summarized as FLISR (Fault Location, Isolation, and Service Restoration). Research efforts worldwide have focused on developing self-healing control architectures, algorithms for fault identification and analysis, and strategies for fault isolation and rapid system reconfiguration.
2.1.1. Relevance to Metro Power Supply Systems
Metro power supply systems are generally concentrated in densely populated urban centers, where operational reliability and continuity of service are paramount for passenger safety, transit efficiency, and broader socio-economic stability. A power supply disruption in a metro system can cause immediate and extensive societal impacts, ranging from passenger inconvenience to significant losses in productivity and heightened safety risks. Consequently, implementing self-healing mechanisms in metro power supply systems involves adopting fast, automated, and robust control strategies that can isolate malfunctioning lines or equipment and restore power within seconds or even on sub-second timescales. By leveraging real-time monitoring, advanced fault detection, and switchgear automation, metro systems aim to ensure seamless service despite faults or disturbances. Although these concepts have been widely proposed in the literature, empirical validation is required to confirm their effectiveness under real-world operating conditions. Pilot studies and simulations will be essential to demonstrate the practical viability of self-healing systems in metro environments.
2.1.2. Mathematical Representation of Self-Healing in Metro Power Supply
To encapsulate the key aspects of self-healing, namely fault detection, fault isolation, and supply restoration, quantitative models are often developed [33,34,35]. These models facilitate the design and evaluation of strategies that minimize fault impact on metro operations while satisfying stringent safety and reliability constraints. Several studies have implemented fault detection and isolation algorithms in simulation environments, but empirical validation through real-world data is necessary to assess the accuracy of these models in practice. In Ref. [35], researchers present the functioning mechanisms of five different strategies for implementing self-healing capability into cement-based materials. Future efforts will involve testing these algorithms in operational subway systems to calibrate and refine the mathematical models based on actual fault occurrences and system behavior. Below are illustrative formulations that can be adapted for more detailed analyses:
1. Fault Detection Probability
Let t represent the time elapsed since a fault occurrence, and let Pd(t) denote the probability of having successfully detected and located the fault by time t. A widely adopted model assumes that the detection probability rises exponentially toward 1 over time, given by
$$P_d(t) = 1 - e^{-\alpha t}$$
where α is a positive parameter that captures the sensitivity of the monitoring equipment or detection algorithm. A higher α implies faster and more reliable fault detection. This formula plays a crucial role in the modeling of fault detection performance. The exponential relationship signifies that the probability of detecting a fault increases rapidly as time progresses, highlighting the importance of efficient fault detection in minimizing service interruptions. The sensitivity parameter α emphasizes how the system’s responsiveness can be improved through advanced fault detection algorithms and more sensitive monitoring equipment.
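The role of α can be illustrated numerically. The following is a minimal sketch, assuming the saturating exponential form P_d(t) = 1 − e^(−αt); the function names and the two α values are hypothetical and chosen only for illustration. Inverting the model gives the time needed to reach a target detection probability, t = −ln(1 − p)/α.

```python
import math


def detection_probability(t: float, alpha: float) -> float:
    """Assumed model: P_d(t) = 1 - exp(-alpha * t) for t >= 0, alpha > 0."""
    if t < 0 or alpha <= 0:
        raise ValueError("t must be non-negative and alpha must be positive")
    return 1.0 - math.exp(-alpha * t)


def time_to_detect(p_target: float, alpha: float) -> float:
    """Invert the model: time (s) at which P_d reaches p_target."""
    if not 0.0 <= p_target < 1.0 or alpha <= 0:
        raise ValueError("p_target must be in [0, 1) and alpha must be positive")
    return -math.log(1.0 - p_target) / alpha


# Hypothetical sensitivities (1/s): a slower and a faster detection scheme.
for alpha in (5.0, 20.0):
    t99 = time_to_detect(0.99, alpha)
    print(f"alpha = {alpha:4.1f} 1/s -> 99% detection after {t99 * 1000:.0f} ms")
```

Under this assumed model, quadrupling α cuts the time to any given detection probability by a factor of four, which is why faster sensing and detection algorithms translate directly into shorter interruptions.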
2. Load Restoration Ratio
How effectively the system restores loads after a disturbance can be quantified by the load restoration ratio η:
$$\eta = \frac{L_{\mathrm{restored}}}{L_{\mathrm{total}}}$$
where Lrestored is the total load that is successfully restored following reconfiguration and Ltotal is the total pre-disturbance load. This ratio is vital for assessing the efficacy of the self-healing system. A higher η value indicates a system that is more effective in recovering from faults by restoring a higher proportion of the total load. The load restoration ratio provides a tangible measure of the system’s robustness in handling disruptions, with significant implications for the overall operational stability of subway power supply networks.
3. Self-Healing Objective Function
In faulted conditions, the metro supply system typically seeks to optimize both restoration speed and the proportion of recovered loads, subject to safety constraints. One may frame the self-healing problem as minimizing an objective function of the following form:
$$\min_{u} \; J(u) = w_1 \, T_{\mathrm{interrupt}}(u) + w_2 \, \bigl(1 - \eta(u)\bigr)$$
where Tinterrupt is the duration of service interruption, η is the load restoration ratio, and u is a vector of decision variables (e.g., breaker switching states and power flow allocations). The weights w1 and w2 reflect the relative importance assigned to minimizing interruption time versus maximizing load restoration. This optimization framework captures the essence of a self-healing system by balancing the competing goals of minimizing service disruptions and ensuring effective load recovery. The decision variables u represent the system’s control parameters, which can be adjusted to achieve the desired outcomes. The objective function quantifies the trade-offs involved in system reconfiguration, providing a mechanism for dynamically adapting to fault conditions.
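The trade-off can be made concrete with a small enumeration. The sketch below assumes a weighted-sum objective J(u) = w1·T_interrupt/T_ref + w2·(1 − η), where the normalization by T_ref is an added assumption so that both terms are dimensionless. The loads, tie switches, switching times, and backup capacity are all hypothetical. Plans that restore nothing or overload the backup feeder are treated as infeasible.

```python
from itertools import product

# Hypothetical de-energized loads (kW) and normally open tie switches.
LOAD_KW = {"A": 400.0, "B": 300.0, "C": 500.0}
# tie name -> (loads it can pick up, switching time in seconds)
TIES = {"T1": ({"A", "B"}, 2.0), "T2": ({"C"}, 5.0)}
BACKUP_CAPACITY_KW = 1000.0   # assumed limit of the backup feeder
T_REF_S = 10.0                # assumed reference time for normalization


def evaluate(u, w1=0.5, w2=0.5):
    """Return (J, eta, t_interrupt) for switching vector u, or None if infeasible."""
    closed = [name for name, state in zip(TIES, u) if state]
    restored = set().union(*(TIES[name][0] for name in closed)) if closed else set()
    restored_kw = sum(LOAD_KW[load] for load in restored)
    if not closed or restored_kw > BACKUP_CAPACITY_KW:
        return None
    eta = restored_kw / sum(LOAD_KW.values())
    t_interrupt = sum(TIES[name][1] for name in closed)  # sequential switching
    j = w1 * t_interrupt / T_REF_S + w2 * (1.0 - eta)
    return j, eta, t_interrupt


def best_plan(w1=0.5, w2=0.5):
    """Exhaustively search all switching vectors and return the cheapest feasible one."""
    feasible = {}
    for u in product((0, 1), repeat=len(TIES)):
        result = evaluate(u, w1, w2)
        if result is not None:
            feasible[u] = result
    u_best = min(feasible, key=lambda u: feasible[u][0])
    return u_best, feasible[u_best]


u_best, (j, eta, t_interrupt) = best_plan()
print(f"u = {u_best}, J = {j:.3f}, eta = {eta:.2f}, T_interrupt = {t_interrupt:.1f} s")
```

Exhaustive search is only workable for a handful of switches; realistic networks would need heuristic or mathematical-programming methods, but the structure of the decision (feasibility check, then weighted cost) stays the same.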
These formulations highlight the primary objectives and constraints associated with designing self-healing strategies for metro power supply systems. In practice, more sophisticated models can integrate various operational constraints (such as power quality, thermal limits, or protection coordination) to accurately represent the system’s behavior under disturbance. Based on the above, Table 1 synthesizes key similarities and differences of the self-healing concept in general power/energy systems and in metro power supply systems. It examines at least eight distinct dimensions—ranging from scope and control hierarchy to fault characteristics and implementation status—to provide a high-level comparative overview.
Table 1. Comparative summary of self-healing in power/energy systems vs. metro power supply systems.
Overall, while the metro power supply setting shares the self-healing principles of broader power and energy systems (e.g., FLISR), it imposes more stringent real-time performance requirements and heightened safety standards. Future advancements in metro self-healing are anticipated to involve the deeper integration of sensing technologies, sophisticated fault isolation and reconfiguration algorithms, and more robust data communication protocols. These developments aim to ensure that, even under adverse conditions, metro systems can autonomously detect and isolate faults, reconfigure feeder networks, and restore power with minimal disruption to passenger transit and operational stability.
2.2. Historical Evolution of Self-Healing Strategies in Electrical Power Systems
The concept of self-healing within electrical power systems emerged in tandem with the growing importance of system reliability, stability, and automation [14,36,37]. Historically, power utilities faced ever-increasing demands for seamless electricity provision while grappling with the technical and economic challenges posed by grid expansion and complexity. Early engineering solutions were typically aimed at enhancing robustness through redundancy measures and improved protective devices [38]; however, the notion of a system that could detect, isolate, and autonomously recover from faults without significant human intervention was not fully articulated until the latter half of the 20th century. While the evolution of self-healing strategies has been extensively documented in the literature, the actual implementation and operational effectiveness of these strategies in large-scale power systems remain under-verified. It is crucial to conduct field trials to evaluate the real-world performance of these strategies in various environmental and operational contexts, including urban subway power systems. This subsection traces the foundational developments that led to contemporary self-healing frameworks, emphasizing the evolution of control paradigms, the role of technological innovations such as supervisory control and data acquisition (SCADA) systems, and the gradual shift toward intelligent, automated solutions.
2.2.1. Early Concepts and Precursor Technologies
Prior to the advent of fully computerized control centers, power systems relied on mechanical relays, manual switchgear, and onsite personnel to handle contingencies. Protective relays were designed to operate when specific fault conditions exceeded threshold limits, thus providing a basic isolation mechanism. While these early solutions prevented catastrophic equipment damage, they were reactive in nature and limited by a lack of real-time data or predictive insight. Operators could only respond to disturbances once alarms were triggered or visible signs of failure became apparent.
The introduction of SCADA technology during the mid-20th century marked a major milestone in laying the groundwork for self-healing strategies [39]. SCADA systems allowed for remote monitoring and control of substations, transforming operational practices by enabling operators to gather near-real-time data regarding voltage levels, current flows, breaker statuses, and other critical parameters [40,41]. This shift created a platform for more advanced computational tools that could process large volumes of data and inform decision-making at control centers.
Simultaneously, power systems research began to address dynamic stability problems, frequency control, and load forecasting. The emergent field of power system stability studies spearheaded by various research groups underscored the need for adaptive, real-time techniques to maintain system equilibrium after disturbances. These developments foreshadowed the modern idea of “self-healing”, wherein the system would respond to perturbations with minimal external intervention.
2.2.2. The Emergence of Self-Healing Principles in the Late 20th Century
As computational capabilities expanded in the 1970s and 1980s, power engineers and researchers explored ways to automate fault detection and isolation [42,43]. For example, Ref. [42] discusses the integration of microprocessor-based digital relays and their application to self-healing systems. Digital relays supplanted their purely electromechanical counterparts, offering more flexible protection schemes and the ability to communicate detailed fault information back to central control systems. Starting in the 1980s, the wider use of microprocessors in substations and control centers also allowed for real-time data analytics, an essential enabler for the advanced functionalities that characterize self-healing systems.
During this period, the term “self-healing” began to appear in power system discourse, reflecting a move from static reliability concepts (such as N-1 contingency planning) to dynamic resilience and adaptability. Early academic studies proposed hierarchical control architectures that would locally identify faults, isolate affected segments, and promptly reconfigure the network to restore service. Yet, practical implementation faced significant barriers, including the complexity of coordinating multiple control agents, the limited bandwidth and reliability of communication channels, and the computational cost of running real-time algorithms on then-current hardware.
The hierarchical control architecture is a well-established framework for organizing complex control tasks into multiple layers, ensuring efficient management and operation across large-scale systems. Initially proposed by Professor G.N. Saridis of Purdue University in 1977, the hierarchical control architecture has been widely applied in various fields, including electrical power systems. This approach divides control responsibilities into three distinct levels: the organizational level, the coordination level, and the execution level. The adaptability and clear structure of this architecture have made it a key tool for managing and controlling the complex, distributed systems found in modern electrical grids and transit networks.
In the context of subway power systems, the hierarchical control architecture offers a structured solution to the challenges of fault recovery and system stability. Subway power systems, characterized by their intricate, multi-layered distribution network, benefit from the clear division of responsibilities inherent in this architecture. The organizational level oversees strategic decision-making and global optimization, while the coordination level manages the distribution of tasks and ensures coordination between different subsystems. The execution level, where actual control and operational adjustments occur, is responsible for fault isolation, reconfiguration, and system recovery.
1. Application of Hierarchical Control in Subway Power Systems
The hierarchical control framework aligns well with the natural structure of a subway power system, which spans from the main power stations to substations and finally to the traction systems. Each level of the architecture performs specific functions tailored to the subway’s operational requirements:
- (1) Organizational Level: At this highest level, the central control center formulates global power strategies, ensuring the continuous operation and safety of the system. It is responsible for strategic planning, long-term optimization, and high-level decision-making. The decisions made at this level influence the overall performance and resilience of the subway power network.
- (2) Coordination Level: The substation level corresponds to the coordination level, where individual regions or sub-networks are managed. This level coordinates the activities of various subsystems, ensuring that resources are effectively allocated, especially during fault conditions. Coordination includes dynamic load balancing, fault recovery task prioritization, and optimization of energy distribution across the network. The system is capable of rapid response during fault events, adjusting operational parameters to restore normal conditions as quickly as possible.
- (3) Execution Level: The execution level consists of intelligent devices and control units, such as circuit breakers, switches, and sensors. These devices perform specific actions, such as disconnecting faulty areas, restoring power from backup sources, and adjusting load distribution. The execution level’s effectiveness is critical for minimizing the impact of faults, as it directly influences the speed and precision of the fault recovery process.
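The division of responsibilities across the three levels can be sketched in code. This is a minimal illustration rather than a description of any deployed system: the class names, the single normally open tie breaker, and the spare-capacity policy are all hypothetical assumptions. The organizational level decides whether restoration is allowed, the coordination level chooses which devices to operate, and the execution level only carries out commands.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Breaker:
    """Execution level: a switching device that only carries out commands."""
    name: str
    closed: bool

    def operate(self, close: bool) -> None:
        self.closed = close


@dataclass
class SubstationCoordinator:
    """Coordination level: manages the breakers of one substation zone."""
    name: str
    breakers: dict[str, Breaker]
    tie: str  # name of the normally open tie breaker (hypothetical)

    def isolate(self, section: tuple[str, str]) -> None:
        # Open the two breakers that bound the faulted section.
        for breaker_name in section:
            self.breakers[breaker_name].operate(close=False)

    def restore_via_tie(self) -> None:
        self.breakers[self.tie].operate(close=True)


@dataclass
class ControlCenter:
    """Organizational level: sets the policy and delegates the actions."""
    zones: dict[str, SubstationCoordinator]
    spare_capacity_kw: float

    def handle_fault(self, zone: str, section: tuple[str, str],
                     lost_load_kw: float) -> bool:
        coordinator = self.zones[zone]
        coordinator.isolate(section)                 # isolation always comes first
        if lost_load_kw <= self.spare_capacity_kw:   # global policy check
            coordinator.restore_via_tie()
            return True
        return False


zone = SubstationCoordinator(
    name="S1",
    breakers={n: Breaker(n, closed=(n != "TIE")) for n in ("B1", "B2", "TIE")},
    tie="TIE",
)
center = ControlCenter(zones={"S1": zone}, spare_capacity_kw=800.0)
restored = center.handle_fault("S1", section=("B1", "B2"), lost_load_kw=600.0)
print(restored, {n: b.closed for n, b in zone.breakers.items()})
```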
2. Hierarchical Control Architecture Scheme
Based on the above, a hierarchical control architecture scheme is demonstrated in Figure 1, which visually represents the division of control responsibilities across the different levels of a power system. The diagram illustrates how control tasks are organized, starting from the strategic decisions made at the organizational level, which cascade down to the coordination level where operational tasks are distributed and managed. Finally, at the execution level, the schematic shows the physical devices that carry out the control actions necessary for fault recovery and system stabilization.
Figure 1. Hierarchical control architecture for subway power systems: a structured approach to fault recovery and system optimization.
The hierarchical control architecture in Figure 1 organizes power system management into three levels: organizational, coordination, and execution. At the organizational level, global strategies and resource allocation are determined. The coordination level manages inter-subsystem cooperation, ensuring efficient fault recovery. The execution level handles direct actions, such as fault isolation and load reconfiguration, through intelligent devices. In subway power systems, this architecture enhances operational stability and fault recovery by clearly delineating responsibilities across levels, enabling rapid response to disruptions, optimizing energy distribution, and improving overall system resilience, crucial for maintaining continuous and reliable service in complex transit networks.
This layered structure is designed to optimize the management of complex power systems by ensuring that each level focuses on specific tasks, with minimal overlap. The organizational level ensures that global objectives are met, while the coordination level handles the real-time adjustment and distribution of tasks across the system. The execution level then implements these decisions, taking direct action to address faults and restore normal operations.
In summary, the hierarchical control architecture offers a comprehensive, adaptable framework for managing subway power systems, providing significant advantages in terms of fault detection, isolation, and recovery. By clearly dividing control responsibilities into multiple layers, it enhances system stability, improves recovery times, and allows for more efficient management of resources, making it an ideal solution for the highly complex and dynamic environment of subway power systems.
2.2.3. Influence of the Smart Grid Paradigm
The evolution of self-healing strategies gained further momentum with the emergence of the “smart grid” paradigm in the early 21st century. Smart grid initiatives emphasized digitalization, two-way communication, and integration of decentralized energy resources to enhance sustainability and efficiency. Within this paradigm, self-healing became a vital functionality, aiming to maintain power quality and reliability amid a growing proliferation of distributed energy resources (DERs), such as photovoltaic systems and wind farms, and increasing load volatility caused by electric vehicle charging and other new demands.
With greater sensor deployment—ranging from phasor measurement units (PMUs) in transmission systems to intelligent electronic devices (IEDs) in distribution networks—operators acquired a richer set of real-time measurements. Coupled with advanced analytics, these data streams opened the door to automated fault management protocols. Self-healing functions within the smart grid context typically entailed the following:
- (1) Wide-Area Monitoring, Protection, and Control (WAMPAC): PMUs measuring voltage and current phasors synchronized to a global positioning system (GPS) time reference provided near-instantaneous snapshots of system conditions [44]. Such granular visibility enabled early fault detection and advanced protection schemes that adapt to changing conditions [45].
- (2) Distributed Intelligent Control: The shift from monolithic control architectures to decentralized or distributed approaches, wherein local controllers or agents communicate and collaborate, accelerated. This setup was regarded as crucial for self-healing, as localized intelligence can isolate faults closer to their source and coordinate reconfiguration strategies quickly.
- (3) Predictive and Preventive Measures: Smart grids embraced a shift from reactive fault management to proactive asset management and system planning. Machine learning models and robust optimization techniques were developed to predict equipment failures, forecast load patterns, and identify vulnerabilities in the network topology.
By integrating these elements, the modern power industry envisioned systems capable of maintaining stability and continuity of service under a variety of operational threats.
2.2.4. Convergence with Multi-Agent Systems and Artificial Intelligence
Although early self-healing concepts relied heavily on centralized approaches, the limitations of single-point decision-making—such as communication bottlenecks and slower response times—spurred interest in MASs. In MAS frameworks, multiple intelligent agents (e.g., at substations, feeder lines, or distributed generators) interact, negotiate, and collaborate to detect, isolate, and remedy faults. Each agent typically possesses partial knowledge of the system but is capable of local decision-making, thus distributing the computational burden and avoiding single points of failure.
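The idea of agents acting on partial knowledge can be illustrated with a simple fault-location rule on a radial feeder. In the sketch below, which is a hypothetical simplification and not a method taken from the cited literature, each switch agent knows only whether it saw fault current and can query only its downstream neighbour. The faulted section is the one bounded by the last agent that saw fault current and the first that did not, and those two agents open their switches.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class SwitchAgent:
    """One agent per sectionalizing switch on a radial feeder (illustrative)."""
    name: str
    saw_fault_current: bool = False
    closed: bool = True
    downstream: Optional[SwitchAgent] = None

    def negotiate(self) -> Optional[tuple[str, str]]:
        """Query only the downstream neighbour; claim the section if it is faulted."""
        if not self.saw_fault_current:
            return None
        neighbour = self.downstream
        if neighbour is None:
            self.closed = False              # fault lies beyond the last switch
            return (self.name, "END")
        if not neighbour.saw_fault_current:
            self.closed = False              # open both ends of the faulted section
            neighbour.closed = False
            return (self.name, neighbour.name)
        return None                          # fault is further downstream


def build_feeder(names: list[str]) -> list[SwitchAgent]:
    agents = [SwitchAgent(n) for n in names]
    for upstream, downstream in zip(agents, agents[1:]):
        upstream.downstream = downstream
    return agents


def locate_and_isolate(agents: list[SwitchAgent],
                       last_switch_before_fault: int) -> tuple[str, str]:
    # On a radial feeder, fault current flows through every switch upstream of the fault.
    for index, agent in enumerate(agents):
        agent.saw_fault_current = index <= last_switch_before_fault
    reports = [agent.negotiate() for agent in agents]
    return next(r for r in reports if r is not None)


feeder = build_feeder(["S1", "S2", "S3", "S4"])
section = locate_and_isolate(feeder, last_switch_before_fault=1)
print(section, [a.closed for a in feeder])
```

No agent needs a model of the whole feeder, so there is no central point whose failure would block fault location. A restoration step, such as closing a tie switch to re-supply the healthy section downstream, would follow the same neighbour-to-neighbour pattern.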
AI further revolutionized self-healing strategies by enabling more sophisticated fault detection, classification, and system optimization [46,47]. Techniques such as artificial neural networks [46], support vector machines, deep learning [47], and fuzzy logic controllers facilitated rapid and accurate fault diagnosis, particularly in complex or noisy scenarios. As AI algorithms matured, they began to provide real-time decision support for reclosing sequences, sectionalizing, and load transfer operations [48]. Additionally, advanced machine learning models that rely on historical data and real-time sensor inputs could predict incipient failures in cables, transformers, or switchgear, thereby allowing proactive or condition-based maintenance to avert large-scale outages. Despite promising theoretical results, empirical testing is crucial to verify the real-world performance of MASs in self-healing systems. Simulations and pilot studies in operational environments will be key to refining MAS frameworks, especially when applied to metro power systems, which pose unique challenges such as rapid load fluctuations and complex network topologies.
By the early 2010s, industrial pilot projects began to demonstrate the feasibility of full or partial self-healing systems at distribution and sub-transmission voltage levels. These systems responded autonomously to single-phase or multi-phase faults by performing fault isolation and service restoration within seconds or minutes. Some utilities reported substantial improvements in reliability indices such as the system average interruption duration index (SAIDI) and the system average interruption frequency index (SAIFI).
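Both indices are simple ratios over the customers served: SAIFI is the total number of customer interruptions divided by the number of customers served, and SAIDI is the total customer interruption duration divided by the same denominator. The sketch below computes them for hypothetical event data. It assumes that customers restored automatically within the sustained-interruption threshold are not counted as sustained interruptions, which is how self-healing reduces both indices in this toy example.

```python
def saifi(events, customers_served):
    """System average interruption frequency index (interruptions per customer)."""
    return sum(customers for customers, _ in events) / customers_served


def saidi(events, customers_served):
    """System average interruption duration index (minutes per customer)."""
    return sum(customers * minutes for customers, minutes in events) / customers_served


CUSTOMERS_SERVED = 10_000

# Each event: (customers with a sustained interruption, duration in minutes).
# All figures are hypothetical.
before = [(1200, 90.0), (800, 45.0)]   # whole feeder out until manual switching
after = [(300, 90.0), (200, 45.0)]     # only the faulted section stays out

for label, events in (("before", before), ("after", after)):
    print(f"{label:>6}: SAIFI = {saifi(events, CUSTOMERS_SERVED):.2f}, "
          f"SAIDI = {saidi(events, CUSTOMERS_SERVED):.1f} min")
```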
2.2.5. Lessons Learned and Ongoing Challenges
Decades of technological progress show that self-healing strategies can significantly enhance power system resilience. Yet, challenges remain. Among the key lessons learned from these historical developments are the following:
- (1) Communication Infrastructure: Adequate, reliable, and secure data exchange is critical for successful self-healing. Historically, the absence of high-bandwidth, low-latency communication hampered early initiatives, underscoring the need for robust communication standards and architectures, such as IEC 61850, to ensure interoperability among devices and systems.
- (2) Coordination Complexity: The transition from centralized to distributed control paradigms introduces complexity in coordination among multiple agents. The necessity of robust algorithms for consensus, negotiation, and conflict resolution remains an important area of ongoing research.
- (3) Scalability: Early demonstration projects often took place on relatively small-scale feeders. Scaling up self-healing solutions to entire distribution networks or interlinked systems involving numerous microgrids requires careful architectural design that balances local autonomy with central oversight.
- (4) Cybersecurity Concerns: Increased digitalization raises the threat of cyberattacks, data tampering, and privacy breaches. Protecting self-healing frameworks from malicious interventions or denial-of-service attacks presents a nontrivial challenge that requires sophisticated security protocols and risk assessment methodologies.
- (5) Economic Viability: Self-healing systems can be capital-intensive to implement, especially in existing grids with aging infrastructure. The cost-effectiveness of retrofits, the complexity of new device installation, and the required training of operational staff are all factors influencing widespread adoption.
Overall, the historical evolution of self-healing in electrical power systems reflects the progression from manual, reactive fault handling toward a digitally enabled, data-driven, and intelligent control paradigm. This evolution underscores the potential for similar developments in niche application areas, most notably subway power systems, which share many of the reliability and safety imperatives that have historically shaped the broader electrical grid.
2.3. Key Components and Architecture of Self-Healing Mechanisms in Modern Power Networks
Modern self-healing power networks are defined by sophisticated hardware and software components designed to ensure rapid fault detection, isolation, and system reconfiguration. These systems rely on the synergy of advanced sensors, protection devices, communication protocols, and intelligent algorithms to deliver an automated, efficient, and highly resilient energy supply. However, despite the promise of these technologies, validation through pilot studies is essential to assess the operational effectiveness of self-healing mechanisms in practice. In particular, metro power systems, with their unique load dynamics and safety requirements, require extensive testing of fault detection and isolation algorithms in operational trials. This subsection dissects the primary building blocks of self-healing mechanisms as they have evolved in contemporary electrical grids, focusing on the architectural arrangements, control paradigms, and underlying standards, particularly IEC 61850, that facilitate interoperability and real-time responsiveness.
2.3.1. Hardware Foundation: Intelligent Electronic Devices, Sensors, and Switchgear
At the physical layer of a self-healing network, intelligent electronic devices (IEDs) and sensors form the backbone of measurement and protection. IEDs are microprocessor-based controllers that perform multiple functions, such as protective relaying, metering, and local automation. They gather high-resolution data on current, voltage, frequency, and harmonic content, enabling sophisticated fault detection schemes. When integrated with remote terminal units (RTUs) or SCADA systems, these IEDs relay detailed status updates and measurements to central or distributed controllers.
Equally important are the automated switchgear components (reclosers, sectionalizers, and circuit breakers) that physically isolate faulty segments and reconfigure network topology. The switchgear must respond rapidly and reliably to commands generated by the control logic. Advancements in switchgear design, including the use of vacuum or SF6 interrupting media, have improved operational speed and reduced maintenance requirements. In many modern systems, these components can be triggered either by local protective relays or by higher-level controllers orchestrating broader reconfiguration strategies.
2.3.2. Communication Protocols and Standards
A robust communication framework is vital for self-healing. Power utilities have historically used proprietary protocols, which often hindered interoperability. However, industry-wide acceptance of open communication standards, such as IEC 61850, has greatly facilitated multi-vendor interoperability and laid the groundwork for integrated, system-wide self-healing solutions.
IEC 61850 delineates a comprehensive data model and communication framework for substation automation. Its object-oriented design structures data into logical nodes that represent devices, measurements, and control functions. This approach allows for seamless exchange of information among relays, protection devices, and supervisory systems. Notably, IEC 61850 supports generic object-oriented substation event (GOOSE) messaging, which provides high-priority, low-latency data transfer for critical protection and control signals. Through GOOSE, devices can publish or subscribe to messages on the network, enabling rapid relay coordination and sophisticated interlocking schemes.
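The publish/subscribe pattern behind GOOSE can be sketched with a toy in-process model. This is not an IEC 61850 stack: real GOOSE frames are multicast directly over Ethernet and retransmitted at defined intervals, whereas the bus, class names, dataset reference, and `trip` value below are hypothetical. The sketch borrows only two ideas from the protocol: a state number (stNum) that increases on every state change and a sequence number (sqNum) that counts retransmissions of the same state, which lets a subscriber ignore repeats.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Callable


@dataclass
class GooseMessage:
    dataset: str   # hypothetical dataset reference
    st_num: int    # state number: increases on every state change
    sq_num: int    # sequence number: counts retransmissions of one state
    values: dict


class GooseBus:
    """Toy in-process stand-in for the multicast network."""

    def __init__(self) -> None:
        self._subscribers = defaultdict(list)

    def subscribe(self, dataset: str, handler: Callable[[GooseMessage], None]) -> None:
        self._subscribers[dataset].append(handler)

    def publish(self, message: GooseMessage) -> None:
        for handler in self._subscribers[message.dataset]:
            handler(message)


class Publisher:
    def __init__(self, bus: GooseBus, dataset: str) -> None:
        self.bus, self.dataset = bus, dataset
        self.st_num, self.sq_num, self.values = 0, 0, {}

    def update(self, **values) -> None:
        """A state change: bump st_num, restart sq_num, send immediately."""
        self.st_num += 1
        self.sq_num = 0
        self.values = values
        self._send()

    def retransmit(self) -> None:
        """Repeat the current state so late or lossy subscribers still get it."""
        self.sq_num += 1
        self._send()

    def _send(self) -> None:
        self.bus.publish(GooseMessage(self.dataset, self.st_num, self.sq_num,
                                      dict(self.values)))


class InterlockSubscriber:
    """Blocks closing of its own breaker while the publisher reports a trip."""

    def __init__(self) -> None:
        self.last_st_num = None
        self.block_close = False
        self.state_changes = 0

    def on_goose(self, message: GooseMessage) -> None:
        if message.st_num == self.last_st_num:
            return                      # retransmission of a known state
        self.last_st_num = message.st_num
        self.state_changes += 1
        self.block_close = bool(message.values.get("trip", False))


bus = GooseBus()
relay = InterlockSubscriber()
bus.subscribe("Feeder1/Protection", relay.on_goose)
feeder_relay = Publisher(bus, "Feeder1/Protection")
feeder_relay.update(trip=True)
feeder_relay.retransmit()
print(relay.block_close, relay.state_changes)
```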
Furthermore, modern systems may employ protocols like DNP3 (Distributed Network Protocol) or Modbus for backward compatibility, while layering advanced cybersecurity measures (e.g., encryption and authentication) to safeguard communications. Where wide-area coordination is necessary, particularly in transmission-level self-healing or large-scale distribution automation schemes, telecommunication technologies such as fiber optics, wireless mesh networks, or 5G solutions can be employed to achieve the latency and reliability thresholds required for real-time control.
2.3.3. Control Hierarchies: Centralized, Decentralized, and Distributed Approaches
One of the most critical aspects of self-healing architecture is the organizational structure of control. Historically, centralized approaches dominated, wherein control centers collected measurements from the entire network, executed fault detection and isolation algorithms, and issued commands to field devices. This approach can be effective in relatively small or well-defined systems, but it risks single points of failure and communication bottlenecks, which become problematic as the network grows in complexity.
In contrast, decentralized (or hierarchical) approaches distribute decision-making authority closer to the field level, granting local controllers the autonomy to detect and respond to faults. A commonly adopted structure is a three-tier hierarchy:
- (1)
- Primary Control (Local/Device Level): Protective relays and IEDs that execute overcurrent detection, undervoltage protection, or distance protection. They can isolate faults locally with minimal latency.
- (2)
- Secondary Control (Feeder or Zone Level): Substation-based controllers that coordinate reconfiguration among multiple feeders or zones. They receive aggregated data from local devices and can implement advanced reconfiguration strategies such as switching feeder ties or transferring loads.
- (3)
- Tertiary Control (Control Center Level): Higher-level supervision that oversees the entire utility network, optimizing long-term planning, load balancing, and restoration procedures when local measures are insufficient.
A fully distributed or multi-agent architecture further refines the decentralized approach by allowing intelligent agents—each equipped with localized sensing, decision-making, and communication capabilities—to collaborate with one another. This MAS approach is particularly powerful for fault restoration in complex distribution networks, as it can reduce computation time and enhance system-wide resilience. Agents may employ consensus algorithms, negotiation protocols, or artificial intelligence techniques to optimize reconfiguration in real time.
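As one illustration of the consensus algorithms mentioned above, the following sketch shows average consensus, in which each agent repeatedly nudges its value toward those of its direct neighbors until all agents agree on the network-wide mean. The line topology, the step size, and the spare-capacity figures are assumptions made for the example and are not taken from any cited system.

```python
def consensus_step(values, neighbors, step=0.3):
    """One synchronous update: each agent moves toward its neighbors' values."""
    return [
        v + step * sum(values[j] - v for j in neighbors[i])
        for i, v in enumerate(values)
    ]


# Four substation agents connected in a line: 0 - 1 - 2 - 3 (assumed topology).
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
spare_capacity_mw = [4.0, 1.0, 3.0, 0.0]  # illustrative local estimates

state = list(spare_capacity_mw)
for _ in range(200):
    state = consensus_step(state, neighbors)

network_mean = sum(spare_capacity_mw) / len(spare_capacity_mw)
```

No agent ever sees the full network, yet every entry of `state` approaches `network_mean`. A step size below the reciprocal of the largest neighbor count keeps the update stable.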
2.3.4. Core Functions of Self-Healing Mechanisms
Despite differences in architectural preferences and technology stacks, most self-healing systems revolve around a set of shared core functions:
- (1)
- Fault Detection and Classification: High-speed relays, coupled with modern sensor networks, identify abnormal conditions (e.g., short circuits, overcurrent, or voltage collapse) and classify the type and location of the fault. AI-based classifiers often enhance accuracy under noisy conditions or complex fault scenarios.
- (2)
- Fault Isolation: Once a fault is identified, circuit breakers, reclosers, or sectionalizers operate to isolate only the affected section. This isolation must be performed quickly to mitigate damage and maintain stability in the healthy portions of the system.
- (3)
- Service Restoration (Reconfiguration): The most distinctive feature of self-healing systems is their capacity to reroute power around the fault, restoring service to the greatest extent possible. Automated reconfiguration may involve closing tie switches or adjusting feeder topology. MASs can play a significant role in coordinating these reconfigurations autonomously.
- (4)
- System Optimization: Beyond restoring service, many self-healing frameworks incorporate optimization functions that ensure voltage profiles, line loading, and overall reliability are improved. Techniques such as dynamic voltage regulation, reactive power compensation, and automated load shedding contribute to system stability and performance.
- (5)
- Predictive and Preventive Maintenance: Self-healing extends beyond fault handling to proactively safeguard system health. Condition monitoring of critical assets (e.g., transformers and cables) and AI-driven anomaly detection can reduce the incidence of unexpected failures and optimize maintenance scheduling.
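The isolation and restoration functions listed above can be sketched for the simplest possible case: a radial feeder whose far end is connected to a neighboring source through a normally open tie switch. The function name and the section labels are hypothetical, and capacity checks on the backup source are deliberately left out.

```python
def isolate_and_restore(sections, faulted, tie_available=True):
    """Return (isolated, fed_from_source, fed_from_tie, unserved) for a radial feeder.

    `sections` is ordered from the normal source toward the normally open tie switch.
    """
    idx = sections.index(faulted)
    upstream = sections[:idx]        # still supplied by the normal source
    downstream = sections[idx + 1:]  # lost supply once the fault was isolated
    fed_from_tie = downstream if tie_available else []
    unserved = [] if tie_available else downstream
    return [faulted], upstream, fed_from_tie, unserved


sections = ["S1", "S2", "S3", "S4"]
isolated, from_source, from_tie, unserved = isolate_and_restore(sections, "S2")
no_tie_result = isolate_and_restore(sections, "S2", tie_available=False)
```

Only the faulted section stays de-energized when the tie is available; without it, every section downstream of the fault remains unserved, which is why reconfiguration options matter as much as fast isolation.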
2.3.5. Role of Artificial Intelligence and Advanced Analytics
Modern power networks leverage AI and big data analytics to implement adaptive, predictive, and real-time self-healing solutions. AI methods excel at interpreting the vast influx of sensor data, enabling the following [49,50,51,52,53]:
- (1)
- Fault Pattern Recognition: Neural networks and machine learning models can detect subtle fault precursors by analyzing waveform distortions, harmonic anomalies, or partial discharge data.
- (2)
- Real-Time Contingency Analysis: AI-driven simulators can run contingency analyses in parallel, evaluating various switching actions or load transfers under multiple fault scenarios.
- (3)
- Adaptive Protection: In networks with high penetration of distributed generation, fault levels and power flows can vary significantly. AI-based adaptive protection adjusts relay settings dynamically to accommodate changing conditions.
- (4)
- Asset Health Forecasting: Machine learning algorithms parse historical failure data, meteorological records, and real-time measurements to predict the residual life of components, supporting proactive replacement or refurbishment decisions.
Such capabilities significantly bolster the autonomy and responsiveness of self-healing. Nevertheless, the adoption of AI necessitates robust verification, validation, and interpretability measures—especially for mission-critical power system applications.
2.3.6. Security and Reliability Considerations
Because self-healing systems rely on extensive data exchange and automated decision-making, ensuring cybersecurity and reliability is paramount. Malicious actors could theoretically disrupt or manipulate automated functions, leading to unnecessary outages or, worse, physical damage to infrastructure. Consequently, modern architectures integrate the following [54,55,56]:
- (1)
- Intrusion Detection Systems (IDSs): Deployed at the substation level to monitor suspicious network traffic or unauthorized system access.
- (2)
- Encryption and Authentication: Communication protocols incorporate cryptographic methods to protect data integrity and confidentiality.
- (3)
- Access Control Policies: Role-based access, multi-factor authentication, and stringent authorization policies limit the potential attack surface.
- (4)
- Redundant Pathways: Networks are often designed with diverse communication paths and backup control systems, preventing single points of failure from compromising the entire self-healing mechanism.
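The authentication measure in item (2) can be illustrated with a keyed message authentication code from the Python standard library. This is a minimal sketch under simplifying assumptions: the pre-shared key and the command format are hypothetical, and key management and replay protection, which a real deployment would need, are not shown.

```python
import hashlib
import hmac

SHARED_KEY = b"demo-key-not-for-production"  # hypothetical pre-shared key


def sign(message: bytes, key: bytes = SHARED_KEY) -> bytes:
    """Compute an HMAC-SHA256 tag for a control message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(message: bytes, tag: bytes, key: bytes = SHARED_KEY) -> bool:
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(sign(message, key), tag)


command = b"OPEN breaker=CB-12 seq=1041"   # illustrative command format
tag = sign(command)
tampered = b"CLOSE breaker=CB-12 seq=1041"

genuine_ok = verify(command, tag)
tampered_ok = verify(tampered, tag)
```

A receiving device that checks the tag before acting would reject the altered command, because a valid tag cannot be produced without the key.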
In parallel, reliability assessments must consider the possibility of simultaneous equipment failures and communication outages. Scenario-based testing, hardware-in-the-loop simulations, and stress testing are commonly used to validate self-healing performance under extreme or cascading fault conditions.
2.3.7. Outlook: Convergence with Distributed Energy Resources and Microgrids
A contemporary trend shaping self-healing architectures is the rising prevalence of distributed energy resources (DERs). As more consumers install rooftop solar panels or adopt electric vehicles, distribution feeders can experience bidirectional power flow and dynamic load/generation profiles. Self-healing systems therefore require new algorithms capable of balancing local generation and consumption while maintaining system voltage and frequency stability.
Microgrids, localized energy networks that can operate autonomously, also introduce novel opportunities for self-healing [57,58]. In islanded mode, a microgrid’s self-healing mechanism can isolate internal faults and redispatch generation resources to preserve critical loads. When connected to the main utility grid, microgrids serve as controllable cells that bolster overall system resilience. Coordinating self-healing at the microgrid level with higher-level grid control is an active area of research, promising future improvements in reliability and energy efficiency.
In summary, the core components and architecture of modern self-healing mechanisms reflect a multifaceted interplay of advanced protective devices, intelligent data sharing governed by standards such as IEC 61850, distributed or multi-agent control paradigms, and AI-powered analytics. Together, these technologies underpin robust, flexible, and future-proof electrical power networks, setting the stage for specialized applications in subway systems, where the imperatives of safety, operational continuity, and rapid fault response are particularly pronounced.
2.4. Emerging Self-Healing Solutions in Subway Power Systems and Future Directions
Subway power systems, often referred to as traction power supply systems (illustrated in Figure 2), present a unique environment in which reliability, safety, and operational efficiency are paramount. Compared to traditional distribution networks, subway systems typically exhibit higher load densities over shorter distances, frequent load variations due to train acceleration and deceleration, and stringent safety requirements for passengers. These systems also incorporate specialized equipment such as rectifier transformers, third-rail or overhead catenary structures, and robust protective relays calibrated for traction loads. Although theoretical models and early-stage simulations show promise, empirical data from pilot studies and real-world subway systems are essential to validate the proposed self-healing solutions. Such studies will enable a comprehensive evaluation of the impact of self-healing on subway network reliability, operational efficiency, and safety. This subsection discusses how self-healing solutions are being adapted and refined to address the particular challenges of subway environments, highlighting key technological innovations, best practices, and prospects for future development.
Figure 2. A typical electrified railway traction power supply system, which serves as the foundational architecture for providing electrical power to urban subway transit systems.
Figure 2 illustrates an electrified railway traction power supply system structure, which is designed to provide the necessary power to trains using a single-phase or three-phase alternating current (AC) or DC system. The system includes a transformer substation that converts high-voltage electricity to a suitable level for rail operations. The design is highly reliable and ensures constant power supply to the trains while mitigating power losses. A significant feature is the integration of the return current rail, which provides a path for the current to flow back, ensuring that the system is both efficient and stable. The power supply network is typically designed for single-side (or single-arm) distribution, offering fault tolerance and making the system easy to monitor and maintain.
1. Function of the Traction Substation
The traction substation in the electrified railway system plays a critical role in power conversion and distribution. It receives high-voltage electricity from the main grid and steps it down to lower voltages suitable for the traction system. The traction substation regulates the power supply, ensuring the voltage and frequency meet the requirements for the trains to operate smoothly. In addition, it handles the distribution of electricity across various sections of the rail network, ensuring continuous power delivery and operational reliability for trains.
2. Function of the Catenary and Track
The catenary system in the electrified railway system provides the overhead line through which electrical power is transmitted to the trains. It is connected to the traction substation and supplies electricity to the train’s pantograph, ensuring consistent voltage and current flow. The track, often referred to as the return current rail, serves as the return path for the electrical current. It ensures that the electrical loop is closed, allowing the traction system to function effectively and ensuring the safe operation of the railway network by maintaining the flow of electricity and reducing the risk of electrical faults.
3. Summary of a Typical Electrified Railway Traction Power Supply System
The electrified railway traction power supply system depicted in Figure 2 serves as the foundational architecture for providing electrical power to urban transit systems, such as subways. This system efficiently converts high-voltage electricity from the main grid through traction substations, which step down the voltage to levels suitable for traction operations. Key components include the catenary system, which supplies power to the trains, and the return current rail, which closes the electrical loop by guiding the current back. The system’s key advantages lie in its robust reliability, adaptability to varying operational loads, and its fault-tolerant design that ensures continuous power delivery under normal and fault conditions. A notable feature is the integration of both AC and DC systems, offering the flexibility to accommodate different types of trains and operational demands. The system’s hybrid nature allows it to efficiently manage energy distribution, enhance operational stability, and minimize power losses, ensuring the sustained operation of the subway network.
2.4.1. Characteristics and Challenges of Subway Power Systems
Unlike conventional distribution grids, subway power networks are designed to handle high transient currents caused by accelerating trains and regenerative braking. They also feature multiple traction substations spaced along the railway line to ensure stable voltage supply. Key challenges include the following [59,60,61]:
- (1)
- Rapid Fluctuations in Load Demand: Train movements impose high-power draws within seconds, necessitating real-time monitoring of current and voltage profiles. Self-healing mechanisms must thus accommodate frequent load spikes without triggering false alarms.
- (2)
- Critical Safety Requirements: Failure in a subway power circuit can strand trains in tunnels or disrupt essential ventilation and signaling systems. Any self-healing strategy must prioritize passenger safety, ensuring that fault isolation or network reconfiguration does not inadvertently disconnect essential loads or violate safety protocols.
- (3)
- Limited Redundancy and Topological Constraints: While overhead distribution networks can add tie-lines or reconfigure feeders relatively easily, subway systems often have limited alternatives for routing power around a fault due to space constraints and rigid corridor layouts. This places heavier emphasis on pinpoint fault localization and targeted restoration strategies.
- (4)
- Integration with Signaling and Control Systems: Subway power infrastructure is closely interlinked with signaling, communications, and station facilities. Coordinating self-healing events with traction power protection, passenger information systems, and operational schedules can be complex, requiring robust communication and control architectures.
2.4.2. Adapting Self-Healing Functions for Subway Applications
Subway power systems have begun to adopt many of the core self-healing functions originally developed for electrical distribution grids, albeit with tailored modifications to meet stringent traction needs.
- (1)
- Fault Detection and Localization: Traditional overcurrent or distance protection relays, combined with advanced sensor arrays, are complemented by traction-specific detection algorithms that account for the distinctive waveforms and power electronics used in subway systems. For instance, in systems equipped with regenerative braking, fault signals can overlap with normal operational signals. Intelligent algorithms, often grounded in AI-based pattern recognition, can distinguish these conditions more accurately than conventional threshold-based relays.
- (2)
- Isolation Strategies: Unlike overhead feeders, subway power rails or catenaries cannot always be sectionalized as flexibly. Self-healing solutions typically rely on specialized disconnect switches or breaker arrangements at traction substations. These devices must isolate the faulted segment while retaining power to adjacent segments, preventing a single fault event from cascading into large-scale service interruptions. The isolation strategy may also consider the dynamic location of trains, ensuring that no train is left in an unsafe or dark tunnel segment during the isolation process.
- (3)
- Rapid Service Restoration: Given the high passenger throughput in urban subway networks, restoring power promptly is a top operational priority. Some subway operators deploy ring or loop architectures, allowing the line to be fed from multiple substations. When a fault occurs, the system automatically opens circuit breakers to isolate the fault and closes alternate pathways so that power can still be supplied from another substation. Adaptive algorithms within a multi-agent framework can further refine the restoration sequence, minimizing inrush currents and voltage dips when re-energizing lines.
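The difficulty of separating fault currents from normal traction transients, noted in item (1), can be made concrete with a deliberately simple rule-based stand-in for the pattern-recognition methods mentioned there. The sketch flags a fault only when the rate of rise of current stays above a limit for a minimum duration. All thresholds and waveforms are illustrative assumptions and not settings from any real relay.

```python
def classify_event(samples_a, dt_s, rise_limit_a_per_s, hold_s):
    """Flag a fault when the current rise rate stays above a limit for `hold_s` seconds."""
    needed = max(1, round(hold_s / dt_s))  # consecutive samples required
    run = 0
    for prev, cur in zip(samples_a, samples_a[1:]):
        rate = (cur - prev) / dt_s
        run = run + 1 if rate > rise_limit_a_per_s else 0
        if run >= needed:
            return "fault"
    return "normal"


dt = 0.001  # 1 ms sampling interval (assumed)
train_start = [1000 + 20 * k for k in range(50)]     # ramp of 20 A per ms
short_circuit = [1000 + 200 * k for k in range(50)]  # ramp of 200 A per ms

start_label = classify_event(train_start, dt, rise_limit_a_per_s=60_000, hold_s=0.005)
fault_label = classify_event(short_circuit, dt, rise_limit_a_per_s=60_000, hold_s=0.005)
```

A fixed rule of this kind breaks down when fault and load signatures overlap, for example with distant faults or regenerative braking, which is the gap that adaptive or learned classifiers are intended to close.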
2.4.3. Leveraging IEC 61850 and Multi-Agent Systems in Subway Contexts
Building on the architectural insights gained from larger distribution systems, subway operators are increasingly looking to IEC 61850 for standardizing communications among traction substations, protective devices, and control centers. The flexibility of IEC 61850 logical nodes permits the modeling of traction-specific devices—like rectifier units or track section switches—ensuring that relevant fault signals and control commands can be shared efficiently. While these integration strategies have been discussed extensively in the literature, field trials and pilot programs are necessary to assess the true effectiveness of IEC 61850 in operational subway systems. Empirical case studies will provide valuable insights into the challenges and opportunities of implementing these technologies in metro environments.
MASs are proving especially promising in this domain. Agents deployed at each traction substation or track section can monitor local conditions (e.g., voltage levels, breaker states, and train locations) and communicate with neighboring agents to coordinate fault isolation and reconfiguration. By distributing intelligence throughout the power network, MASs can significantly reduce dependence on a central control center, thereby mitigating single-point failures and communication latency issues. In scenarios where partial or full communication loss occurs—an unfortunate but conceivable event in underground tunnels—MAS agents can resort to fallback strategies or local heuristics to maintain at least basic service levels.
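The fallback behavior described above can be sketched as an agent that tracks when it last heard from its neighbors and switches to a conservative local rule once those messages become stale. The class name, the timeout value, and the action labels are hypothetical and only indicate the structure of such logic.

```python
class SubstationAgent:
    """Illustrative agent that degrades to local decisions when neighbors go silent."""

    def __init__(self, name, heartbeat_timeout_s=2.0):
        self.name = name
        self.heartbeat_timeout_s = heartbeat_timeout_s
        self.last_heard = {}  # neighbor name -> time of last message (seconds)

    def receive_heartbeat(self, neighbor, now_s):
        self.last_heard[neighbor] = now_s

    def mode(self, now_s):
        """Return 'coordinated' if any neighbor was heard recently, else 'local_fallback'."""
        fresh = [
            n for n, t in self.last_heard.items()
            if now_s - t <= self.heartbeat_timeout_s
        ]
        return "coordinated" if fresh else "local_fallback"

    def decide(self, local_fault, now_s):
        if not local_fault:
            return "no_action"
        if self.mode(now_s) == "coordinated":
            return "isolate_and_request_backfeed"
        # Without communication, act only on local measurements.
        return "isolate_locally"


agent = SubstationAgent("TSS-2")
agent.receive_heartbeat("TSS-1", now_s=0.0)
with_comms = agent.decide(local_fault=True, now_s=1.0)
without_comms = agent.decide(local_fault=True, now_s=5.0)
```

Time is passed in explicitly rather than read from a clock, which keeps the decision logic deterministic and easy to test.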
2.4.4. AI-Driven Fault Prediction and Maintenance
Artificial intelligence techniques are increasingly adopted for condition-based maintenance and fault prediction in subway power systems, complementing their role in real-time restoration. For instance, traction power cables and switchgear can be equipped with sensors measuring temperature, partial discharge activity, and vibration. Machine learning models process these data to predict the health status of components and forecast the likelihood of imminent failure.
This predictive approach is especially valuable in subways where service disruptions can affect thousands of passengers in a short timeframe. By scheduling maintenance during off-peak hours or proactively replacing aging components, subway operators reduce the risk of disruptive breakdowns. Furthermore, advanced data analytics can optimize maintenance budgets by prioritizing interventions on components with the highest criticality and most pronounced signs of deterioration.
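The prioritization idea above can be sketched as a simple risk ranking in which each asset is scored by the product of its criticality and its observed deterioration. The scoring rule, the 0 to 1 scales, and the asset names are assumptions for illustration; a real scheme would derive both inputs from condition-monitoring data and failure-consequence studies.

```python
from dataclasses import dataclass


@dataclass
class Asset:
    name: str
    criticality: float    # 0..1, consequence of failure (assumed scale)
    deterioration: float  # 0..1, e.g. from a partial discharge trend (assumed scale)


def rank_for_maintenance(assets):
    """Sort assets by a simple risk score: criticality * deterioration."""
    return sorted(assets, key=lambda a: a.criticality * a.deterioration, reverse=True)


assets = [
    Asset("feeder cable T3", 0.9, 0.4),         # score 0.36
    Asset("rectifier unit R1", 0.7, 0.8),       # score 0.56
    Asset("station lighting panel", 0.3, 0.9),  # score 0.27
]
ranked = [a.name for a in rank_for_maintenance(assets)]
```

The most deteriorated asset is ranked last here because its failure consequence is low, which shows why deterioration alone is a poor basis for allocating a maintenance budget.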
2.4.5. Case Studies and Pilots
A growing number of urban transit authorities have undertaken pilot programs to test self-healing functionalities:
- (1)
- Pilot A deployed a multi-agent system spanning multiple traction substations on a busy metropolitan rail line. Each substation agent automatically adjusted feeder connections when localized faults were detected. Early results showed a drastic reduction in fault clearance times and improved power quality during reconfiguration.
- (2)
- Pilot B focused on AI-based fault prediction for critical power components. By combining historical data on cable insulation failures with real-time temperature and partial discharge sensors, the pilot achieved a substantial decrease in unexpected cable faults, enhancing overall system availability.
- (3)
- Pilot C explored the integration of IEC 61850-based control architecture in a newly built subway extension. Standardized communication protocols allowed different vendors’ substations, protective relays, and SCADA systems to interoperate. The pilot demonstrated that advanced GOOSE messaging could achieve fault isolation within milliseconds, significantly reducing service interruptions.
While these pilots demonstrate the feasibility and benefits of self-healing solutions, they also highlight the importance of robust training programs for maintenance staff and control center operators, clear design guidelines for applying standards like IEC 61850 to traction scenarios, and thorough cybersecurity audits to safeguard the system from unauthorized access.
2.4.6. Future Directions and Research Opportunities
The evolution of subway power systems toward self-healing architectures continues to present numerous opportunities for further innovation and refinement. One of the primary research areas is the validation of self-healing technologies through pilot studies and real-world data collection. These studies will play a crucial role in demonstrating the practical applicability of these technologies in operational metro systems. Furthermore, research in advanced AI-based fault prediction and maintenance, coupled with real-time data analytics, will be essential to optimize self-healing processes and reduce the occurrence of service disruptions. These opportunities are summarized as follows.
- (1)
- Integration with Smart Mobility and Energy Management: As urban areas adopt more holistic “smart city” strategies, subway power systems may be integrated with other mobility solutions, such as electric buses or shared autonomous vehicles, forming an interconnected transportation energy ecosystem. Coordinated energy management across these systems could unlock novel self-healing and load balancing capabilities, for example by rerouting excess regenerative braking energy to nearby electrical loads or EV charging stations.
- (2)
- Enhanced Sensor Deployment and Data Analytics: Future subways could leverage high-resolution sensors for continuous waveform monitoring, partial discharge analysis, and real-time location tracking of trains. With the advent of 5G and edge computing, massive data streams can be processed at the substation or trackside in near real time, facilitating ultra-fast fault detection and system reconfiguration. Research in advanced analytics, such as deep neural networks or reinforcement learning, promises to further refine these capabilities.
- (3)
- Holistic Resilience Frameworks: Beyond electrical faults, subway systems may face a variety of disruptions, from extreme weather events (e.g., flooding in tunnels) to cyberattacks targeting control systems. Expanding self-healing to encompass multi-hazard resilience would involve integrated monitoring of infrastructure conditions (e.g., water leakage and track integrity) and dynamic adaptation of protective or evacuation measures. This comprehensive approach would require new interdisciplinary collaborations among electrical engineers, civil engineers, cybersecurity experts, and urban planners.
- (4)
- Human-in-the-Loop vs. Full Autonomy: Although the end goal for many operators is to minimize human intervention, achieving full autonomy in critical infrastructure raises important questions regarding reliability, liability, and public acceptance. Ongoing research could investigate hybrid frameworks that allow human supervisors to override or guide self-healing decisions when system states deviate significantly from normal operating conditions. This “human-in-the-loop” paradigm can bolster operator trust while still leveraging the speed and efficiency of AI-driven automation.
- (5)
- Regulatory and Standardization Needs: Uniform guidelines for applying IEC 61850 or similar standards to traction power systems remain in their infancy. Multiple national and international standard-setting bodies may need to coordinate new protocols specific to subway environments. Moreover, regulators must evaluate safety and reliability metrics in the context of self-healing performance, ensuring that subway operators maintain rigorous compliance with established norms.
2.4.7. Synthesis and Outlook
In summary, the adoption of self-healing technologies in subway power systems signifies a pivotal shift from reactive fault response to proactive, intelligent, and resilient operations. By leveraging principles from larger electrical grids—such as advanced sensor networks, IEC 61850-based communications, multi-agent coordination, and AI-enhanced analytics—subway operators can substantially improve service reliability and safety. The distinct challenges posed by subterranean environments, constrained topology, and high-density loads necessitate careful customization of these technologies, but successful pilot programs demonstrate their feasibility and benefits.
Looking forward, the continued urbanization of metropolitan centers and the increasing importance of mass transit solutions position subway systems as prime candidates for next-generation self-healing research. Ongoing studies will likely explore deeper integration with other urban energy systems, broader resilience frameworks that account for climate and security risks, and innovative control architectures balancing automated intelligence with prudent human oversight. As these efforts mature, self-healing subway power systems will not only advance the overall reliability of urban rail transit but will also serve as an exemplary application domain, pushing the boundaries of intelligent control and automation in critical infrastructures worldwide.
3. Specific Challenges Faced by Subway Power Supply Systems
Subway power supply systems present a unique set of technical, operational, and regulatory challenges that distinguish them from traditional power grids or typical distribution networks. These challenges arise not only from the complex topology and confined operating environment in underground rail systems but also from increasingly stringent requirements for reliability, safety, and real-time fault management. Furthermore, the push toward higher levels of automation and intelligence introduces additional layers of complexity and integration needs, including communication standards, multi-agent coordination, and data-driven algorithms.
In this section, we discuss three principal categories of challenges, each meriting a dedicated subsection. First, in Section 3.1, we analyze the complex topology and operational constraints that impede straightforward adoption of conventional self-healing solutions. Second, in Section 3.2, we delve into fault diagnosis, isolation, and recovery in real time, focusing on the technological barriers and performance metrics that must be addressed to achieve rapid response. Finally, in Section 3.3, we examine regulatory, safety, and integration barriers with emerging technologies, emphasizing the interplay between standards compliance (e.g., IEC 61850), safety-critical requirements, and the integration of MASs and AI. These three dimensions are interlinked, collectively shaping the reliability, intelligence, and overall feasibility of self-healing subway power supply systems.
3.1. Complexity of Topology and Operational Constraints
Subway power supply systems typically feature intricate power distribution architectures, including multiple substations and complex feeder lines that operate under high load density and limited physical space [62]. This complexity is driven by high passenger demand, the need for continuous operation, and safety requirements such as mandatory power redundancy to allow safe evacuation of passengers in case of a single-point failure [63,64]. These design constraints make it difficult to implement conventional self-healing solutions, which are often based on more flexible or widely distributed networks. In the following subsections, we present these technical complexities in simplified form so that they remain accessible to interdisciplinary and non-specialist readers.
3.1.1. Unique Structural Layout and Load Characteristics
Unlike standard distribution networks, subway systems employ ring-like or radial topologies, often in combination, to ensure redundancy. These are designed to provide backup power in the event of a fault, but the limited space in tunnels restricts the addition of extra cables or protective devices. Furthermore, subway systems experience rapid fluctuations in load due to train acceleration, deceleration, and regenerative braking. This variability creates challenges for monitoring and controlling the system in real time. To make this concrete, the load behavior is described next using a simplified model.
Mathematically, we can describe the power flow through a given feeder segment $i$ using a simplified representation of a DC traction power system (assuming DC electrification, as used in many modern subway systems). If $P_i$ is the power demanded by the train(s) on feeder $i$ and $V_i$ is the operating voltage, then the current $I_i$ is as follows:
$$I_i = \frac{P_i}{V_i} \qquad (4)$$
Equation (4) relates the feeder current $I_i$ to the power demand $P_i$, which changes as the train moves and accelerates. Traditional fault detection models assume that the load is stable, whereas in a subway system it fluctuates rapidly.
Because of these frequent changes in $P_i$ arising from train movement and acceleration patterns, the power balance equation must be continually updated in real time. This results in a time-varying load profile:
$$P_i(t) = F(\cdot) \qquad (5)$$
where $F(\cdot)$ is a function capturing the instantaneous power consumption influenced by operational and environmental factors. Such dynamic load behavior complicates fault detection algorithms that rely on stable or quasi-steady-state load signatures, necessitating advanced, predictive, or adaptive methods.
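A short numerical sketch may help to show the scale of the fluctuations implied by the relation between feeder power, voltage, and current described above (for a DC feeder, current equals power divided by voltage). The 750 V level is a common DC traction voltage and is assumed here, and the power profile is invented for illustration.

```python
def feeder_current_a(power_w: float, voltage_v: float) -> float:
    """Current drawn on a DC feeder: I_i = P_i / V_i."""
    return power_w / voltage_v


V_NOMINAL = 750.0  # volts; assumed DC traction voltage

# Illustrative demand over successive instants: coasting, acceleration,
# peak acceleration, cruising, regenerative braking (negative = power returned).
power_profile_w = [0.3e6, 1.5e6, 3.0e6, 1.2e6, -0.6e6]

currents_a = [feeder_current_a(p, V_NOMINAL) for p in power_profile_w]
peak_to_min_ratio = max(currents_a) / min(c for c in currents_a if c > 0)
```

With these values the peak demand of 3 MW corresponds to 4000 A, ten times the coasting current, and the sign of the current reverses during braking. A protection threshold tuned to a steady load would therefore be either too sensitive or too slow.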
3.1.2. Space Constraints and Infrastructure Limitations
Owing to dense urban environments, subway substations and power cables often share limited underground space with other utilities. This limited space makes it difficult to install additional equipment or sensors, which are crucial for real-time fault detection and system monitoring. The physical layout can restrict the addition of redundant cables or the implementation of standard protective equipment, such as circuit breakers and advanced switchgear. The deployment of sensors for condition monitoring and the retrofitting of intelligent devices also become more challenging in tight spaces. These infrastructural limitations underscore the importance of designing compact yet reliable control modules.
Moreover, specialized ventilation and cooling requirements, imposed by the underground setting, introduce additional operational constraints. Protective devices must be designed to handle higher ambient temperatures and humidity levels, while also meeting stringent fire and smoke control regulations. Consequently, hardware designed for standard aboveground distribution networks is not always directly applicable to subway environments. In summary, the confined environment of subway systems demands compact, yet highly reliable equipment for efficient self-healing mechanisms.
3.1.3. Operational Demands and Safety Considerations
Safety is of paramount concern in subway operations. Any fault or power interruption could endanger passengers, particularly in tunnels with limited escape routes. Subway operators need systems that can quickly detect faults, isolate them, and restore power to critical systems like ventilation and lighting to ensure passenger safety. In a typical self-healing system, power restoration is based on load importance. In subways, however, safety-critical loads such as lighting and ventilation must always be prioritized, even if they conflict with restoring power to non-essential areas. This places additional demands on the system to function optimally under high stress.
To meet these heightened safety requirements, self-healing solutions must incorporate advanced features, such as automatic rerouting of power and dynamic load shedding for non-essential systems. The complexity of these solutions, combined with the underground operational constraints, highlights the need for more refined, adaptive control strategies. Based on this, Table 2 summarizes eight key aspects of complex topological and operational constraints in power and energy systems, along with a special focus on subway power supply systems [65,66,67]. This table encapsulates the multifaceted nature of operational constraints and network topologies in subway systems compared to broader power and energy frameworks. From the heightened fault tolerance requirements necessary to safeguard human lives, to severe spatial limitations and challenging environmental conditions, subway environments significantly amplify conventional distribution network complexities. These constraints underscore the urgent need for research into compact, resilient, and adaptive technologies, including advanced sensor networks, real-time analytics, and robust communication protocols. Our viewpoint is that developing a holistic approach—one that integrates hardware design, data analytics, and regulatory compliance—will be essential for effectively addressing these challenges in self-healing subway systems.
Table 2.
Complex topological and operational constraints in subway power supply systems.
Below is Table 3, which further expands on the same constraints but from the perspective of potential technological and research-based interventions aimed at overcoming them. This table highlights the range of existing and emerging technological interventions that address the unique constraints of subway power supply systems. While many solutions are at a pilot or early-adoption stage, they collectively represent promising avenues for achieving greater reliability and smarter operations. Crucially, the success of these innovations depends not only on technological feasibility but also on regulatory frameworks, cost-effectiveness, and the availability of skilled personnel. Our view is that a holistic, lifecycle approach—one that spans design, implementation, maintenance, and upgrade—will be the linchpin for ensuring these interventions evolve into robust, standardized solutions tailored to underground rail environments.
Table 3.
Potential technological and research interventions for complex operational constraints.
3.2. Fault Diagnosis, Isolation, and Recovery in Real-Time
Real-time fault diagnosis, isolation, and system recovery form the technical core of any self-healing power network. In subway power supply systems, rapid fault response is even more imperative because of high safety requirements, the potential for substantial passenger disruptions, and the confined nature of subway tunnels. This section delves into the distinctive aspects of fault management in subway systems, focusing on the integration of advanced diagnostics, multi-agent coordination, and communication protocols suitable for underground conditions. We will now break down these concepts into simpler terms, explaining the steps involved in fault detection and recovery.
3.2.1. High-Speed Fault Detection and Localization
Subway power systems typically rely on a combination of protective relays and local measurement devices (e.g., current transformers and voltage sensors) to identify fault conditions [68,69,70]. However, the high-speed nature of subway systems requires these devices to detect faults almost instantly, before they fully propagate through the network. Traditional systems can detect faults within milliseconds, but subway systems must detect and respond within fractions of a power-frequency cycle to prevent accidents and minimize damage. Formula (6) provides a mathematical framework for fault detection using wavelet transforms, enabling the detection of rapid fault transients with high precision.
Recent advances in algorithms, such as wavelet-based methods, have shown promise in identifying fault signals almost instantly [71,72,73]. These methods work by detecting sudden changes in voltage or current signals. The key challenge is ensuring that these algorithms can handle the variability caused by rapid load changes, which are common in subway systems. Wavelet transforms are particularly effective because they can analyze signals at multiple scales, allowing for the detection of abrupt changes in voltage or current in real time. Formula (6) defines the wavelet transform used to capture such changes. By applying this transform to the measured signals, the system can identify fault occurrences almost immediately, thus reducing detection time.
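For instance, a wavelet transform approach can identify abrupt changes in current or voltage signals within fractions of a cycle. Let Ia(t) be the fault current signal measured at a feeder location a. A wavelet-based algorithm may compute the wavelet coefficients W(τ, s) at scale s and time shift τ, which in the standard continuous wavelet transform take the form:

$$W(\tau, s) = \frac{1}{\sqrt{s}} \int_{-\infty}^{\infty} I_a(t)\, \psi^{*}\!\left(\frac{t-\tau}{s}\right) dt \qquad (6)$$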
where ψ is the mother wavelet. By identifying large coefficient magnitudes in certain frequency bands, the fault can be detected and localized almost instantaneously. Ensuring that these algorithms remain robust to variable load levels and potential measurement noise is an ongoing challenge, especially in the subterranean environment. This formula provides a key methodology for real-time fault detection in subway power systems, leveraging wavelet-based analysis to quickly respond to fault conditions while accounting for the dynamic nature of subway systems. To further enhance detection accuracy, it is important to consider the operational context and environmental variability of subway networks. This consideration highlights the complexity of fault detection in subway power systems, where environmental factors such as electromagnetic interference and fluctuating load demands must be accounted for in the design of detection algorithms.
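To illustrate the principle, the sketch below uses a much simpler stand-in for the continuous transform: single-level, undecimated Haar detail coefficients, which are scaled differences between adjacent samples. A coefficient whose magnitude far exceeds the median magnitude is treated as a fault transient. The function names, the threshold multiplier k, and the test signal (64 samples per cycle, a step of 5 units at sample 100) are hypothetical choices for illustration only; a practical detector would use several scales and account for measurement noise.

```python
import math


def haar_detail(signal):
    """Undecimated single-level Haar detail: scaled adjacent differences."""
    return [(signal[i] - signal[i + 1]) / math.sqrt(2)
            for i in range(len(signal) - 1)]


def detect_transient(signal, k=8.0):
    """Return the index of the first sample after an abrupt change, or None.

    A change is declared when a detail coefficient exceeds k times the
    median coefficient magnitude (k is an assumed tuning value).
    """
    if len(signal) < 2:
        return None
    details = haar_detail(signal)
    magnitudes = sorted(abs(d) for d in details)
    median = magnitudes[len(magnitudes) // 2]
    threshold = k * max(median, 1e-9)  # guard against an all-flat signal
    for i, d in enumerate(details):
        if abs(d) > threshold:
            return i + 1
    return None


# Hypothetical feeder current: three cycles of a unit sinusoid.
healthy = [math.sin(2 * math.pi * n / 64) for n in range(192)]
# Same signal with an abrupt offset starting at sample 100.
faulted = [v + (5.0 if n >= 100 else 0.0) for n, v in enumerate(healthy)]
```

Because the smooth sinusoid produces only small adjacent differences, the healthy signal yields no detection, while the step in the faulted signal stands out at the sample where it begins.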
3.2.2. Isolation Strategies in Constrained Environments
Once a fault is detected, it must be isolated quickly to prevent further damage. In subway systems, space constraints make it difficult to deploy additional circuit breakers. However, MASs offer a solution by using distributed sensors and devices that communicate with each other to decide the best strategy for isolating the faulted section. In simpler terms, instead of relying on a central controller, the system uses a network of intelligent devices that work together to identify and isolate faults.
MASs offer a promising solution by allowing distributed relays or intelligent electronic devices (IEDs) to communicate and coordinate their actions [74,75,76]. When a fault is detected, these agents negotiate which segment should be isolated, balancing the safety, load requirements, and operational constraints. Such decision-making can be modeled as an optimization problem under real-time constraints, often solvable through heuristics or simplified linear programming approaches to ensure the solution is computed fast enough for practical deployment.
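The coordination described above can be sketched for the simplest possible case: a radial feeder fed from one end, with one agent per breaker, ordered from the source. Under this assumption (which does not hold for double-end-fed DC traction sections), every agent upstream of the fault sees fault current and every agent downstream does not, so each agent can decide from its own indication and those of its immediate neighbors. The function name and the "open"/"hold" vocabulary are hypothetical.

```python
def agent_decision(i, sees_fault):
    """Decide whether agent i should open its breaker.

    sees_fault[j] is True if agent j measured fault current. Agents are
    ordered from the source; a single-source radial feeder is assumed.
    """
    me = sees_fault[i]
    upstream = sees_fault[i - 1] if i > 0 else False
    downstream = sees_fault[i + 1] if i + 1 < len(sees_fault) else False
    if me and not downstream:
        return "open"   # last agent seeing the fault: upstream boundary
    if not me and upstream:
        return "open"   # first agent not seeing it: downstream boundary
    return "hold"


# Hypothetical indications: fault between agents 1 and 2.
indications = [True, True, False, False]
decisions = [agent_decision(i, indications) for i in range(len(indications))]
```

Only the two breakers bounding the faulted section open, which leaves the healthy downstream sections available for restoration from an alternative feed.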
3.2.3. Rapid Service Restoration and Self-Healing Techniques
After isolation, the objective is to restore service to as many segments as possible while keeping critical loads operational. To do this, the system may reroute power along alternative feed paths, draw on available backup power sources, or, in DC traction systems, transfer supply from one substation to another where multiple substations serve overlapping regions. The process must prioritize safety-critical systems and avoid overloading other sections of the network. MASs can help coordinate this process by allowing different agents to communicate and adjust their actions based on real-time conditions. This adaptive approach helps minimize downtime and prevent further faults from occurring during the restoration process. Key considerations for restoration strategies include the following:
- (1) Prioritization of Essential Loads: Station lighting, ventilation fans, and communication systems typically take precedence.
- (2) Gradual Re-Energization: Inrush currents from multiple loads can lead to secondary faults if not controlled.
- (3) Adaptive Coordination: Agents update each other on the status of circuit breakers and load demands, recalculating the optimal restoration path dynamically.
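The first two considerations can be illustrated with a small planning sketch. It orders loads by priority, stops once the spare capacity of the alternative feed is exhausted so that a lower-priority load never jumps ahead of a more critical one, and groups the restored loads into steps so that no single step re-energizes more than a set amount of demand. The load names, priorities, demands, and limits are hypothetical values chosen for illustration.

```python
def plan_restoration(loads, capacity_kw, max_step_kw):
    """Plan staged restoration.

    loads: list of (name, priority, demand_kw); a lower priority number
    means a more critical load. Returns a list of steps, each a list of
    load names to re-energize together.
    """
    ordered = sorted(loads, key=lambda load: (load[1], load[2]))
    steps, current = [], []
    used_kw = step_kw = 0.0
    for name, _priority, demand_kw in ordered:
        if used_kw + demand_kw > capacity_kw:
            break  # strict priority: stop at the first load that does not fit
        if current and step_kw + demand_kw > max_step_kw:
            steps.append(current)          # close the current step
            current, step_kw = [], 0.0
        current.append(name)               # an oversized load gets its own step
        step_kw += demand_kw
        used_kw += demand_kw
    if current:
        steps.append(current)
    return steps


# Hypothetical station loads: (name, priority, demand in kW).
station_loads = [
    ("ventilation", 1, 60),
    ("lighting", 1, 40),
    ("comms", 1, 10),
    ("escalators", 2, 80),
    ("advertising", 3, 30),
]
plan = plan_restoration(station_loads, capacity_kw=150, max_step_kw=70)
```

Here the safety-critical loads are restored in two steps, and the escalators and advertising remain shed because the assumed spare capacity is insufficient.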
Fault-tolerant communication remains pivotal here, as real-time data exchange among IEDs, substation controllers, and train control centers is critical for coordinated restoration. Table 4 describes the key dimensions of real-time fault management and how they compare between general power systems and subway-specific applications. This table clarifies how fault management considerations in subways differ from those in broader power systems. While the need for speedy detection and isolation exists universally, the stakes in a subway environment are amplified by passenger safety and the confined space. Traditional solutions must often be miniaturized, accelerated, or re-engineered for subterranean use. Our viewpoint is that a concerted push toward distributed, intelligent solutions that integrate seamlessly with robust communication frameworks is vital. Given the urgency and operational constraints, these solutions should also incorporate redundancy in both hardware and decision-making processes.
Table 4.
Dimensions of real-time fault management in subway vs. general power systems.
Further, below is Table 5, focusing on specific technological enablers and approaches that enhance fault detection, isolation, and recovery in real time. This table spotlights the technological solutions that promise to revolutionize real-time fault management in subway power systems. From high-speed relays and advanced signal processing techniques to multi-agent coordination and AI-driven prediction, the options are diverse yet complementary. In our assessment, the challenge lies in harmonizing these approaches into a unified architecture that can meet the stringent safety and reliability benchmarks specific to subway operations. As communications improve and the costs of sensors and computational hardware decline, the feasibility of sophisticated real-time schemes will only increase, reinforcing the need for standardization and robust testing in live subway environments.
Table 5.
Technological enablers for real-time fault management in subway power networks.
3.3. Regulatory, Safety, and Integration Barriers with Emerging Technologies
While technology plays a vital role in self-healing subway power systems, it must align with strict regulatory frameworks and safety standards. Subways are subject to multiple layers of oversight from local governments and safety authorities, making the certification and adoption of new technologies a complex process.
3.3.1. Safety Standards and Compliance Requirements
This oversight typically involves railway authorities, local governments, and international standards bodies. For example, any modifications to power infrastructure may require compliance with IEC 62443 (for industrial communication networks and security), local traction power guidelines, and specialized rail transit codes. IEC 61850, although originally developed for substation automation in power grids, is increasingly recognized for its potential in rail environments, yet it must be adapted to handle traction power specifics and integrated with existing railway safety protocols (e.g., EN 50126/50128/50129 in Europe) [77,78,79].
Attaining certification for newly introduced protective devices or software modules can take years, owing to the rigorous testing processes mandated for passenger safety. This elongated timeframe impacts the agility with which subway operators can adopt emerging self-healing technologies and necessitates thorough planning from the inception of any R&D initiative.
3.3.2. Interoperability and Integration with Legacy Systems
Many subway power systems were installed decades ago and lack the modern communication interfaces required to support new technologies. Integrating these systems with MAS and AI solutions requires overcoming significant challenges such as protocol mismatches and hardware limitations. One solution is to use gateways and middleware to bridge the gap between old and new technologies, allowing legacy systems to communicate with modern devices. This approach can help transition subway systems to self-healing architectures without needing a complete overhaul. The major hurdles include the following [80,81,82]:
- (1) Data Format Incompatibility: Legacy devices may not generate standardized digital outputs necessary for AI-based analysis.
- (2) Protocol Mismatch: Communication standards, such as IEC 61850, must be layered on top of older SCADA systems or even analog control signals.
- (3) Hardware Limitations: Legacy switchgear may lack the control interfaces to enable external agent-based decisions or real-time reconfiguration.
For example, the integration of modern automation and intelligent control systems into legacy subway power infrastructures presents significant challenges due to outdated equipment and communication protocols. Mbango (2009) [80] highlights the difficulties of retrofitting SCADA-based legacy systems with modern communication standards like IEC 61850, emphasizing compatibility issues with aging switchgear and transformers. Dutta Pramanik and Upadhyaya (2025) [81] further explore how advanced IoT solutions, including motorized actuators and standardized communication protocols, can be layered onto older grid systems to bridge protocol mismatches and ensure interoperability while mitigating vendor lock-in. Additionally, their study underscores the necessity of updating legacy data formats and communication systems to enable AI-driven and MAS applications, as modern digital outputs and real-time decision-making capabilities often exceed the capabilities of older infrastructure. Together, these studies provide a comprehensive analysis of the technical and financial challenges associated with modernizing legacy subway power networks while ensuring reliability and efficiency.
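The gateway and middleware idea can be made concrete with a small sketch that converts raw counts read from a legacy device into records labeled with IEC 61850-style object references. Everything specific here is an assumption for illustration: the register addresses, scale factors, the logical device name Feeder1, and the record layout are hypothetical, and the references only imitate the naming pattern of the MMXU measurement logical node rather than reproducing a conformant data model.

```python
from dataclasses import dataclass


@dataclass
class LegacyPoint:
    register: int    # hypothetical legacy register address
    scale: float     # engineering units per raw count (assumed)
    reference: str   # IEC 61850-style object reference (illustrative)
    unit: str


# Hypothetical mapping table maintained by the gateway.
POINT_MAP = [
    LegacyPoint(40001, 0.1, "Feeder1/MMXU1.A.phsA", "A"),
    LegacyPoint(40002, 0.01, "Feeder1/MMXU1.PhV.phsA", "kV"),
]


def translate(raw_registers, timestamp):
    """Convert raw register counts into records keyed by object reference.

    raw_registers: dict mapping register address to raw integer count.
    Missing registers are reported with quality "invalid" rather than dropped.
    """
    records = []
    for point in POINT_MAP:
        raw = raw_registers.get(point.register)
        records.append({
            "reference": point.reference,
            "value": None if raw is None else raw * point.scale,
            "unit": point.unit,
            "quality": "invalid" if raw is None else "good",
            "timestamp": timestamp,
        })
    return records


# One register is present, the other is missing from this poll.
records = translate({40001: 1234}, timestamp=0)
```

Keeping the mapping in a table rather than in code is one way to let a staged deployment add legacy points incrementally without changing the conversion logic.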
As a result, achieving a unified self-healing architecture often involves partial overhauls or staged deployments, which complicate operational continuity and budget planning. Figure 3 illustrates the process of integrating advanced MAS or AI solutions into aging subway power infrastructures. This flowchart highlights key challenges (data format incompatibility, protocol mismatch, and hardware constraints) and proposes a phased approach to achieve a unified self-healing architecture while preserving operational continuity. The steps shown in Figure 3 are summarized as follows.
Figure 3.
A flowchart of the legacy equipment integration process.
1. Identify Existing Legacy Devices
A thorough survey of legacy switchgear and control equipment is conducted to determine their current functionality, communication interfaces, and overall compatibility with modern data acquisition and control standards.
2. Evaluate Key Constraints
- (1) Data Format Incompatibility: Older devices may provide analog or proprietary digital signals, which necessitate specialized conversion or encapsulation.
- (2) Protocol Mismatch: Historic SCADA platforms or purely analog signaling can diverge significantly from contemporary standards like IEC 61850.
- (3) Hardware Limitations: Legacy switchgear often lacks the necessary interfaces for remote actuation or real-time reconfiguration, hindering direct MAS or AI control.
3. Data and Protocol Adaptation
- (1) Protocol Gateways/Bridges: Gateways facilitate communication between legacy systems and modern platforms without requiring a wholesale replacement.
- (2) IEC 61850 Encapsulation: Wrapping legacy SCADA or analog signals in IEC 61850-compliant structures enables standardized management and interoperability.
- (3) Middleware for Data Format Conversion: Dedicated software tools unify disparate data formats, facilitating seamless integration into MAS or AI analytics.
4. Hardware Retrofits and Expansions
- (1) Upgraded Control Interfaces: Introducing new control boards or modules into legacy switchgear equips these devices with real-time monitoring and remote operation capabilities.
- (2) Additional Real-Time Monitoring Modules: Enhancing measurement accuracy and granularity via sensors or digital metering units provides critical data for AI-driven decision-making.
- (3) Partial Preservation of Analog Devices: When a full replacement is not immediately feasible, integrating digital solutions alongside retained analog components ensures a gradual transition.
5. Formulate Phased Deployment
- (1) Prioritize Critical Nodes: Target the most failure-prone or operationally significant components for early-stage retrofitting.
- (2) Assess Feasible Investment and Downtime: Balance the need for system reliability with available funding and permissible service interruptions.
- (3) Define a Technological Evolution Path: Implement an overarching plan that anticipates future standards and protects long-term compatibility.
6. Implementation and Testing
- (1) Incremental Equipment Replacement/Installation: Execute hardware upgrades and software integration in a series of controlled deployments to minimize risk.
- (2) Protocol and Data Interface Validation: Conduct rigorous testing of gateways, interfaces, and data conversion processes to ensure coherence and reliability.
- (3) MAS/AI-Integrated Testing: Validate the interaction between upgraded devices and AI-driven control systems, confirming that self-healing mechanisms function effectively under real-world conditions.
7. Unified Self-Healing Architecture Online
Upon successful validation, the modernized system with integrated MAS/AI solutions is commissioned, enabling comprehensive automated fault detection, isolation, and service restoration.
8. Ongoing Optimization and Maintenance
- (1) Technological Upgrades: Continuously refine the integrated system in response to emerging digital standards and novel AI algorithms.
- (2) Scheduled Device Renewal: Replace aging assets as part of routine maintenance, gradually increasing the proportion of modern, digitally enabled equipment.
- (3) Adoption of Emerging Standards: Align future developments with evolving industry protocols to maintain long-term interoperability and performance excellence.
As demonstrated in Figure 3, this phased methodology ensures the progressive transformation of legacy subway power systems into fully integrated, self-healing networks. By systematically identifying key technological constraints, implementing both hardware and software retrofits, and employing protocol adaptation strategies, stakeholders can maintain robust operational continuity while incrementally modernizing their infrastructures. Through the careful prioritization of critical nodes and the adoption of specialized tools for data conversion, legacy devices can be seamlessly incorporated into cutting-edge MAS or AI frameworks. The outcome is a resilient power network characterized by intelligent fault management, real-time monitoring, and long-term adaptability to new technological standards.
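One way to make the prioritization in step 5 concrete is a simple greedy ranking, sketched below. Each candidate node is scored by its failure rate multiplied by a criticality weight and divided by its retrofit cost, and nodes are selected in score order until the phase budget is used up. The scoring rule, field names, and all numerical values are hypothetical; an operator would substitute its own risk model and also account for permissible downtime.

```python
def plan_phase(nodes, budget):
    """Select nodes for one retrofit phase by risk reduction per unit cost.

    nodes: list of dicts with keys name, failure_rate, criticality, cost.
    Returns (selected_names, total_cost). Greedy heuristic, not an optimum.
    """
    ranked = sorted(
        nodes,
        key=lambda n: n["failure_rate"] * n["criticality"] / n["cost"],
        reverse=True,
    )
    chosen, spent = [], 0.0
    for node in ranked:
        if spent + node["cost"] <= budget:
            chosen.append(node["name"])
            spent += node["cost"]
    return chosen, spent


# Hypothetical candidates: failures per year, criticality weight, cost units.
candidates = [
    {"name": "A", "failure_rate": 0.3, "criticality": 5, "cost": 100},
    {"name": "B", "failure_rate": 0.1, "criticality": 3, "cost": 50},
    {"name": "C", "failure_rate": 0.2, "criticality": 4, "cost": 200},
]
selected, total_cost = plan_phase(candidates, budget=160)
```

With these assumed figures, nodes A and B are retrofitted in the first phase and node C is deferred to a later one.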
3.3.3. Balancing Innovation, Cost, and Public Acceptability
Public transportation authorities must balance the costs of upgrading infrastructure with the need to maintain service affordability. The cost of implementing self-healing systems—along with the specialized training and software licenses required—can be a barrier to adoption. Pilot projects provide an opportunity to gather performance data and build confidence in the technology before full-scale implementation. Demonstrating a clear return on investment (ROI)—usually via reductions in service interruptions and associated penalties—is often critical to securing funding. Notably, public acceptance serves as a critical prerequisite for system innovation, given that substantial operational modifications may elicit concerns over service reliability and cost escalations.
From a strategic perspective, such pilots offer a low-risk entry point for emerging technologies. However, scaling from pilot to system-wide deployment introduces additional complexities, underscoring the need for stable, well-documented, and standardized solutions.
Below is Table 6, summarizing major regulatory, safety, and integration issues, contrasting their treatment in general power systems versus specialized subway networks. This table highlights the interplay between stringent regulatory environments, public safety imperatives, and the integration of legacy systems that typify subway power networks. Unlike typical power utilities, where incremental modernization is possible with relatively less public scrutiny, subway systems face direct accountability to commuters and municipal governments. Consequently, adopting a self-healing paradigm requires a thorough demonstration of reliability and compliance. Our perspective is that the regulatory dimension should not be viewed merely as a constraint but as an essential guideline to ensure passenger well-being and system robustness. Collaborative efforts between standardization bodies and railway authorities, coupled with well-defined pilot projects, can help accelerate the adoption of advanced technologies.
Table 6.
Regulatory, safety, and integration barriers in subway power systems.
Further, below is Table 7, focusing on specific measures, strategies, and research directions that can alleviate regulatory, safety, and integration bottlenecks. In this table, we map out a variety of strategic pathways by which regulatory, safety, and technological barriers can be mitigated. From unified standardization to pilot sandbox environments and modular retrofitting approaches, there are numerous tactics available to ease the transition toward self-healing architectures in subway systems. Our viewpoint underscores the importance of synergy between technology developers, railway authorities, and public stakeholders. The potential payoffs—enhanced passenger safety, improved operational efficiency, and a more resilient transit system—amply justify the investment and effort required to navigate these constraints.
Table 7.
Strategies to overcome regulatory, safety, and integration barriers.
Overall, Sections 3.1 to 3.3 have surveyed the multifaceted challenges unique to subway power supply systems. In Section 3.1, we covered the intricate topological constraints and the difficulty of operating within confined underground environments. Section 3.2 then examined the technical hurdles in implementing real-time fault diagnosis, isolation, and self-healing restoration, emphasizing the importance of speed, coordination, and robust communication. Finally, Section 3.3 analyzed the broader regulatory, safety, and integration issues that govern how new technologies can be adopted and scaled within subway networks. These three sections collectively highlight that any self-healing strategy for subway power supply systems must be holistically designed, encompassing engineering solutions, operational practices, and regulatory frameworks, to meet the stringent reliability and safety demands of modern urban rail transit.
3.4. Advancing Fault Management and Self-Healing Capabilities in Subway Power Supply Systems
In this section, we delve into the key technological innovations that are reshaping fault management and self-healing capabilities within subway power supply systems. As subway systems face increasing operational demands and stringent safety requirements, the need for more intelligent, adaptive, and efficient fault management systems has never been more critical. By leveraging emerging technologies such as MASs, AI-based algorithms, and real-time data analytics, subway power networks can achieve faster fault detection, isolation, and recovery, ultimately enhancing both operational reliability and passenger safety.
As presented in Table 8, the comparison between traditional self-healing techniques and the proposed MAS-based strategy highlights the distinct advantages of adopting AI-driven solutions in managing faults within the complex and confined environments of subway systems. This table presents a detailed comparison, illustrating how the MAS-based approach addresses critical challenges, such as fault detection speed, real-time adaptability, and recovery efficiency, that conventional methods struggle to overcome.
Table 8.
Comparison between current self-healing techniques and the suggested MAS-based strategy.
1. Key Advantages
The comparison in Table 8 highlights the significant advancements that MAS-based self-healing systems offer over traditional techniques. By leveraging AI-based fault detection and distributed decision-making, MASs offer more precise, rapid, and adaptive fault management compared to conventional methods that rely on static algorithms and manual intervention [83,84,85]. Key advantages include the following:
- (1) Improved Detection Speed: MAS-based systems are capable of detecting faults in near real time, a critical feature for high-speed subway systems where rapid fault detection can minimize service disruptions and enhance passenger safety.
- (2) Increased Flexibility: Unlike traditional systems that follow fixed algorithms, MASs adapt to real-time network conditions, allowing for dynamic fault isolation and recovery strategies. This is especially important in complex subway topologies, where conventional systems struggle to handle intricate network configurations.
- (3) Enhanced Recovery Efficiency: MASs can reconfigure power distribution networks on the fly, ensuring that critical loads like lighting and ventilation are restored first, which is crucial for subway systems where passenger safety is paramount.
- (4) Space and Maintenance Benefits: Traditional systems require significant hardware installations, which can be difficult in the confined underground spaces of subway systems. MASs, by contrast, use distributed sensors and intelligent agents, reducing the need for additional hardware and simplifying system maintenance.
2. Future Outlook and Research Directions
The adoption of MAS-based self-healing systems in subway power supply networks is a promising avenue for overcoming the unique challenges faced by these systems. However, several hurdles remain:
- (1) Integration with Legacy Systems: Integrating MASs with older subway power infrastructures presents significant challenges due to outdated communication protocols and hardware limitations. Future research should focus on developing seamless integration frameworks that allow for gradual modernization of legacy systems without major disruptions to existing operations.
- (2) Regulatory Challenges: The deployment of MAS-based systems in subway networks requires updates to regulatory frameworks, especially in terms of safety certifications and standards compliance. Ongoing collaboration between AI experts, regulatory bodies, and transit authorities will be essential to harmonize new technologies with existing safety protocols.
- (3) Cost and Implementation Feasibility: While MASs offer significant benefits, the initial cost of implementation may be prohibitive for many subway systems, especially those in less economically developed regions. Future research should focus on developing cost-effective solutions that make MAS adoption more accessible, including low-cost sensors and cloud-based processing frameworks.
- (4) Real-Time Data Processing: Edge computing and machine learning algorithms will play a critical role in processing the massive amounts of data generated by subway power supply systems [86,87,88,89,90]. Future advancements in these technologies will be crucial for achieving the real-time decision-making required for effective self-healing.
Overall, this section has examined the technological innovations that underpin self-healing subway power supply systems, with a focus on the MAS-based strategy. A comparison with traditional systems underscores the substantial benefits of MASs in terms of fault detection, isolation, recovery efficiency, and system scalability. However, challenges related to legacy system integration, regulatory compliance, and cost remain, and further research is necessary to address these barriers. As technology evolves, MASs will become an increasingly integral component of resilient and adaptive power networks, offering enhanced reliability and safety for subway operations.
4. The Integration of MASs and the IEC 61850 Standard into Subway Power Systems
Before delving into the detailed discussions in this chapter, it is crucial to establish a coherent overview of how each subsection will contribute to our central theme: integrating MASs with the IEC 61850 standard to achieve enhanced self-healing capabilities in subway power networks. Section 4.1 lays the theoretical and conceptual groundwork by examining the principles, operational frameworks, and core algorithms underlying MAS-based self-healing approaches, thereby clarifying the motivations for distributing intelligence and control across multiple agents in complex subway power infrastructures. Section 4.2 shifts the focus to the IEC 61850 standard itself, detailing its data modeling techniques, communication protocols, and engineering methodologies. This portion provides clarity on why IEC 61850 is pivotal for creating standardized data structures and high-speed communication channels in modern traction power systems. Finally, Section 4.3 synthesizes the findings of the previous two sections by proposing a convergent MAS–IEC 61850 architecture, highlighting how the synergy between a distributed agent framework and a standardized communication protocol can revolutionize fault detection, isolation, and restoration (FDIR) processes. These three subsections collectively illustrate how MASs and IEC 61850 can be cohesively integrated to bolster both the intelligence and interoperability of next-generation subway power systems.
4.1. MAS-Based Approaches to Self-Healing in Subway Power Systems
MASs have gained prominence as a robust paradigm to address the growing complexities within electric power networks, including distribution grids, transmission infrastructures, and railway/metro traction power systems. In subway power networks, the non-trivial combination of AC (e.g., 25 kV or 35 kV) and DC (often 750 V or 1500 V) segments, along with extensive feeder lines, makes centralized architectures prone to single-point failures, communication bottlenecks, and slow response times. By contrast, MAS-based frameworks distribute intelligence among localized agents that monitor and control subsets of the network. These agents can collaborate with one another to execute real-time fault diagnosis, isolation, and restoration decisions, thereby enabling faster self-healing actions and minimizing disruptions to subway operations.
4.1.1. Conceptual Foundations and Control Philosophies of MASs
From a theoretical standpoint, an MAS can be viewed as an ensemble of autonomous entities, termed agents, each responsible for a specific functional or geographical segment of the power system. Agents are designed to perceive local states (such as voltage, current, power flows, or device status) and communicate with peer agents (or higher-level controllers) to reach optimized global or semi-global control objectives [91,92,93]. These studies collectively explore the control and management methodologies of MASs in power systems, emphasizing the role of autonomous agents in local perception, global optimization, and coordinated control. Logenthiran (2012) [91] investigates the application of MASs in distributed power systems, proposing a real-time management and optimization strategy based on agents to enhance system flexibility and autonomy. Dou et al. (2014) [92] develop a decentralized coordinated control method based on MASs, improving the transient stability of large-scale power systems through information exchange and collaboration among agents. Farid (2015) [93] focuses on the design principles of MASs for resilient coordination and control in future power systems, introducing a framework that enables more efficient responses to sudden failures and dynamic load changes. Together, these works indicate that MAS technology, through the collaborative work of distributed intelligent agents, can achieve optimized control and enhance the reliability and self-healing capabilities of power systems.
A key advantage lies in the ability of agents to respond locally to disturbances while still coordinating across the network to avoid suboptimal or contradictory actions. Formally, let us denote the network by a set of buses B and lines L. Each agent Ai (where i ∈ {1, 2, …, N}) monitors a subset of buses Bi ⊆ B and lines Li ⊆ L. In a distributed control algorithm, an agent’s decision ui may be formulated as follows:
$$u_i = \arg\min_{u \in U_i} J_i(x_i, x_{-i}, u)$$
where
- xi is the local state vector observed by agent i;
- x−i represents the states observed by neighboring agents or those shared through communication;
- Ji is the cost function capturing local objectives (e.g., minimize power loss, ensure safe operation under fault conditions, or maintain voltage within permissible limits);
- Ui is the feasible action space for agent i.
Through an iterative or event-triggered communication protocol, each agent refines its control decision ui, coordinating with adjacent agents until a convergent solution is reached or a deadline for fast self-healing control expires. This iterative process, facilitated by MASs, ensures real-time adaptation to evolving operating conditions in the subway power network.
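The iterative refinement described above can be illustrated with a minimal sketch. It assumes a quadratic cost that is not taken from the reviewed literature: each agent balances a local target against agreement with its neighbors, and the feasible set Ui is a simple interval. The names `best_response`, `coordinate`, `coupling`, and `max_rounds` are hypothetical; `max_rounds` stands in for the self-healing deadline.

```python
from typing import Dict, List, Tuple


def best_response(target: float, neighbor_values: List[float],
                  coupling: float, bounds: Tuple[float, float]) -> float:
    """Minimize (u - target)^2 + coupling * sum_j (u - u_j)^2 over bounds.

    This quadratic J_i is an illustrative assumption, not a cost function
    prescribed by the reviewed works.
    """
    n = len(neighbor_values)
    u = (target + coupling * sum(neighbor_values)) / (1.0 + coupling * n)
    lo, hi = bounds
    # The cost is convex in the scalar u, so clipping the unconstrained
    # minimizer yields the minimizer over the feasible interval U_i.
    return min(max(u, lo), hi)


def coordinate(targets: List[float], neighbors: Dict[int, List[int]],
               coupling: float = 1.0,
               bounds: Tuple[float, float] = (0.0, 1.0),
               tol: float = 1e-6, max_rounds: int = 100):
    """Run synchronous best-response rounds until decisions settle or the
    round budget (a stand-in for the control deadline) is exhausted."""
    lo, hi = bounds
    u = [min(max(t, lo), hi) for t in targets]  # start from local optima
    for round_no in range(1, max_rounds + 1):
        new_u = [
            best_response(targets[i], [u[j] for j in neighbors[i]],
                          coupling, bounds)
            for i in range(len(targets))
        ]
        change = max(abs(a - b) for a, b in zip(new_u, u))
        u = new_u
        if change < tol:
            return u, round_no, True   # converged
    return u, max_rounds, False        # deadline reached first


if __name__ == "__main__":
    decisions, rounds, converged = coordinate(
        [0.0, 1.0], {0: [1], 1: [0]})
    print(decisions, rounds, converged)
```

With two agents whose targets are 0 and 1, the decisions settle near 1/3 and 2/3: each agent moves toward its neighbor without abandoning its local objective. If the round budget is too small, the function returns the latest decisions with the convergence flag set to false, mirroring the deadline case in the text.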
4.1.2. MAS-Based Fault Detection and Diagnosis
In the context of subway power systems, MAS-based fault detection methodologies frequently combine local sensor data with higher-level decision-making processes. Each agent employs local measurements (such as overcurrent readings, voltage dips, traveling-wave signals, or negative sequence components) to detect anomalies. When a local agent suspects a fault, it initiates a distributed consensus mechanism to confirm that the disturbance is genuine and not merely a sensor malfunction or transient event. In a simple but illustrative approach, each agent i maintains a binary indicator αi(t) that equals 1 when the agent detects an anomaly at time t and 0 otherwise.
Agents exchange their αi(t) values; if a sufficient fraction of neighboring agents also detect anomalies (αi(t) = 1), the network of agents collectively raises a system-wide fault alarm. This distributed confirmation reduces false positives and removes reliance on a single central controller. Once a fault is confirmed, specialized diagnostic agents implement advanced signal processing or pattern recognition algorithms to classify the fault type (e.g., single-phase-to-ground, short-circuit between phases, or DC traction line grounding fault) and localize the faulted segment within the network’s geographic or topological mapping.
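As a minimal sketch of this neighbor confirmation, the snippet below assumes that the anomaly indicator is a simple deviation test against a nominal value and that "a sufficient fraction" means a fixed quorum. The names `anomaly_flag`, `fault_confirmed`, `tolerance`, and `quorum` are hypothetical, and the voltages are illustrative rather than measured data.

```python
from typing import Dict, List


def anomaly_flag(measured: float, nominal: float, tolerance: float) -> int:
    """alpha_i(t): 1 if the local measurement deviates beyond tolerance.

    A plain deviation test is an assumption; real agents may use
    overcurrent, traveling-wave, or sequence-component criteria.
    """
    return 1 if abs(measured - nominal) > tolerance else 0


def fault_confirmed(agent: int, flags: Dict[int, int],
                    neighbors: Dict[int, List[int]],
                    quorum: float = 0.5) -> bool:
    """An agent that flags an anomaly confirms a fault only if at least
    `quorum` of its neighbors also report alpha = 1."""
    if flags[agent] == 0:
        return False
    peers = neighbors[agent]
    if not peers:
        return False  # nobody can corroborate, so stay conservative
    agreeing = sum(flags[p] for p in peers)
    return agreeing / len(peers) >= quorum


if __name__ == "__main__":
    neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
    # Illustrative DC feeder voltages around a 1500 V nominal level.
    readings = {0: 900.0, 1: 950.0, 2: 1490.0}
    flags = {i: anomaly_flag(v, 1500.0, 300.0) for i, v in readings.items()}
    print(flags, fault_confirmed(0, flags, neighbors))
```

A single deviating sensor does not trigger an alarm in this sketch, because none of its neighbors corroborate the reading.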
4.1.3. MAS-Based Fault Isolation and Restoration
Fault isolation and restoration processes in MASs rely on cooperative control actions among protective devices such as breakers, switches, and reclosers. In a typical subway distribution scenario, the primary objective is to de-energize only the faulted section while maintaining power supply to all unaffected sections. The MAS approach can be summarized as a threshold rule. Suppose that an agent controlling a switch SWj needs to decide whether to open or close the switch after a fault is detected and located. The agent weighs the load that opening the switch would disconnect against the risk of leaving it closed, and compares the outcome with a threshold.
Here, ΔPloss represents the power that would be disconnected if the switch is opened, ΔPrisk quantifies the operational or safety risk of leaving the switch closed, and λ is a threshold that the agent dynamically adjusts based on the system’s real-time operational context (e.g., passenger load demands or train frequency). Once the faulted section is isolated, restoration agents reconfigure network topology to reroute power through alternative paths or feeder lines. Each agent reevaluates the line capacities, voltage levels, and breaker statuses, ensuring that the newly configured network operates within acceptable thermal limits.
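The switch decision can be sketched as follows. The exact rule is an assumption: here the switch opens when the risk term exceeds the disconnected-load term by more than λ, with both terms expressed on a common normalized scale. The functions `adjusted_threshold` and `switch_decision`, and the choice to raise λ with passenger load, are hypothetical illustrations.

```python
def adjusted_threshold(base_lambda: float, passenger_load: float) -> float:
    """Scale the threshold with passenger load in [0, 1].

    Assumption: at peak load the agent should be more reluctant to
    disconnect a section, so lambda grows with load.
    """
    if not 0.0 <= passenger_load <= 1.0:
        raise ValueError("passenger_load must lie in [0, 1]")
    return base_lambda * (1.0 + passenger_load)


def switch_decision(delta_p_loss: float, delta_p_risk: float,
                    lam: float) -> str:
    """Return "open" when risk outweighs lost load by more than lam.

    Both inputs are assumed to be normalized to comparable units; the
    comparison itself is an illustrative stand-in for the rule.
    """
    return "open" if delta_p_risk - delta_p_loss > lam else "closed"


if __name__ == "__main__":
    off_peak = adjusted_threshold(0.5, passenger_load=0.0)
    peak = adjusted_threshold(0.5, passenger_load=1.0)
    print(switch_decision(0.2, 0.9, off_peak))  # risk clearly dominates
    print(switch_decision(0.2, 0.9, peak))      # higher bar at peak load
```

The same fault data can lead to different actions depending on the operating context, which is the point of letting each agent adjust λ in real time.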
4.1.4. Evaluation of MASs in Subway Environments
Empirical studies and field trials suggest that MAS-based approaches reduce fault-clearing times and curtail the scope of outages in urban rail power networks. Moreover, their distributed nature inherently accommodates incremental system expansions. However, challenges remain, notably the standardization of agent communication protocols and the design of robust agent negotiation schemes that handle complex couplings between AC traction, DC traction, and station loads. These gaps emphasize the importance of coupling MASs with standardized frameworks such as IEC 61850, which we examine more thoroughly in subsequent sections.
Based on the above, Table 9 provides a comparative overview of key application areas where MAS approaches can significantly enhance the self-healing capabilities of subway power systems [94,95,96]. The reviewed literature highlights the integration of MASs in the optimization and control of power systems, particularly in enhancing self-healing capabilities and fault management. Herrera et al. (2020) [94] provide a comprehensive review of MAS applications in complex networks, emphasizing their potential in resilient design and self-healing, which can significantly improve fault isolation and service restoration. Sharifi (2015) [95] focuses on energy-aware service provisioning in peer-to-peer cloud ecosystems, discussing how MASs can optimize energy flow and enhance system reconfiguration. Meanwhile, Irfan et al. (2017) [96] examine the role of MASs in the control of smart grids, stressing their contribution to predictive maintenance, fault detection, and self-healing, thus strengthening the overall reliability of the system. Collectively, these works underscore the strategic importance of MASs in advancing the fault management and resilience of modern power networks, particularly in urban infrastructure like subway power systems. Table 9 summarizes potential growth trends, technological barriers, and strategic importance across ten dimensions. Overall, it indicates that fault isolation, restoration, and system reconfiguration hold the most immediate promise for broad adoption, while areas such as predictive maintenance, microgrid integration, and power quality monitoring remain underexplored but present high potential for future research and implementation. Our viewpoint is that a collaborative, standardized approach, supported by robust communication protocols, will be crucial in advancing MAS-based solutions from pilot demonstrations to large-scale deployments.
Table 9.
Comparative applications of MASs in subway power systems.
Table 9 compares the applications of MASs in subway power systems across various scenarios, including fault diagnostics, system reconfiguration, and energy management, highlighting adoption levels, challenges, and future directions. Building on this, Table 10 delves deeper into emergent research directions for MASs in subway power systems, covering topics such as scalable architectures, agent-based security, big data integrations, and varying hierarchical designs. A salient point from this table is the growing interest in hybrid hierarchical–distributed frameworks and integrated intrusion detection solutions that enhance both system reliability and cybersecurity. In our assessment, real-time simulation and hardware-in-the-loop testing remain critical to validate new MAS concepts in an environment that faithfully reflects the complexities of day-to-day subway operations.
Table 10.
Research directions and emerging trends in MASs for subway networks.
Overall, this section underscores that MASs offer a potent framework for distributed self-healing control in subway power systems. Yet, to unlock their full potential, they must be reinforced by standardized communication infrastructures, particularly those specified in IEC 61850. The next section provides an in-depth exposition of how IEC 61850 can facilitate high-speed, interoperable, and secure data exchange across a broad spectrum of power system devices, thereby complementing and enhancing MAS-based strategies.
4.2. Implementation of the IEC 61850 Standard in Subway Power Systems
The IEC 61850 standard was originally devised for substation automation, enabling standardized object models, data structures, and communication protocols that support interoperability and vendor-neutral engineering [97,98,99,100,101]. Over the past decade, it has been extended to encompass distribution automation, renewable energy integration, and even railway electrification contexts. In subway power systems, the standard addresses complex challenges such as seamless multi-vendor integration, real-time protective relaying, and advanced automation functionalities essential for self-healing. This section thoroughly explores the protocols, models, and engineering techniques that make IEC 61850 a cornerstone for modernizing subway power networks.
4.2.1. Foundations of IEC 61850 and Its Relevance to Subway Networks
IEC 61850 provides a comprehensive approach encompassing data modeling, communication stacks, and configuration languages [102,103]. Its fundamental building blocks include the following:
- (1)
- Logical Nodes (LNs): Abstract representations of power system functions (e.g., measurement, protection, and control).
- (2)
- Data Objects and Data Attributes: Structured to capture various aspects of a function’s state, measurement readings, and control parameters.
- (3)
- Communication Services: Such as GOOSE for high-speed event transfer, and MMS (Manufacturing Message Specification) for client–server communications.
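The hierarchy of logical nodes, data objects, and data attributes listed above can be made concrete with a small sketch of how a single value is addressed. It assumes the common reference pattern of logical device, logical node, data object, and data attribute; the logical device names `TSS1PROT` and `TSS1CTRL` are hypothetical, while `PTOC.Op.general` and `XCBR.Pos.stVal` follow the standard naming of the overcurrent operate signal and the breaker position status.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAttributeRef:
    """One addressable value in the IEC 61850 data model."""
    logical_device: str   # hypothetical name chosen by the engineer
    logical_node: str     # class plus instance number, e.g. "PTOC1"
    data_object: str      # e.g. "Op" (operate) or "Pos" (position)
    data_attribute: str   # e.g. "general" or "stVal"

    def __str__(self) -> str:
        return (f"{self.logical_device}/{self.logical_node}"
                f".{self.data_object}.{self.data_attribute}")


def parse_ref(text: str) -> DataAttributeRef:
    """Split 'LD/LN.DO.DA' into its parts.

    Nested attributes (for example 'cVal.mag.f') stay together in the
    data_attribute field, because only the first two dots are split.
    """
    logical_device, rest = text.split("/", 1)
    logical_node, data_object, data_attribute = rest.split(".", 2)
    return DataAttributeRef(logical_device, logical_node,
                            data_object, data_attribute)


if __name__ == "__main__":
    trip = DataAttributeRef("TSS1PROT", "PTOC1", "Op", "general")
    print(trip)
    print(parse_ref("TSS1CTRL/XCBR1.Pos.stVal"))
```

An agent that reads or subscribes to such references does not need vendor-specific register maps, which is the interoperability benefit the data model is meant to provide.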
For subway systems, the dual AC/DC supply lines, the presence of intricate protective schemes (like distance protection for AC lines, and overcurrent or undervoltage protection for DC traction feeders), and the need for reliable real-time data exchange across stations render IEC 61850 uniquely valuable. In particular, the GOOSE mechanism allows for “peer-to-peer” communication [104,105]. This ensures that protective relays and control IEDs (Intelligent Electronic Devices) can exchange trip signals, blocking commands, or reclose instructions with minimal latency—a critical requirement when trains must be continuously powered and passenger safety is paramount.
4.2.2. IEC 61850 Network Redundancy and Communication Protocols
Reliability is a non-negotiable criterion in traction power systems. IEC 61850 accommodates various redundancy protocols to ensure minimal downtime in the event of network faults:
- (1)
- Parallel Redundancy Protocol (PRP) sends duplicate packets over independent LANs, eliminating single points of failure.
- (2)
- High-availability Seamless Redundancy (HSR) adopts a ring topology, where each node forwards frames in both directions around the ring, ensuring zero recovery time in the case of link interruption.
- (3)
- Rapid Spanning Tree Protocol (RSTP) [106,107,108] provides a loop-free topology but may involve small reconvergence delays.
In large-scale subway systems, a combination of PRP and HSR is often deemed optimal for process-level communications to guarantee near-instant failover. Stations, wayside cabinets, and centralized operation control centers thus rely on robust ring or mesh topologies that incorporate specialized switches supporting IEC 61850 traffic priorities. Latency, jitter, and packet-loss thresholds must be carefully specified to accommodate the stringency of traction power automation.
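The duplicate-and-discard idea behind PRP can be sketched in a few lines. This is a simplified model, not the wire protocol: frames are identified by a (source, sequence number) pair, and the receiver keeps an unbounded set of seen pairs, whereas real devices use a bounded window. The class and function names are hypothetical.

```python
from typing import List, Optional, Set, Tuple


class DuplicateDiscardReceiver:
    """Accept the first copy of each frame and drop the later duplicate."""

    def __init__(self) -> None:
        # Unbounded for simplicity; an actual device ages entries out.
        self._seen: Set[Tuple[str, int]] = set()

    def receive(self, source: str, seq: int,
                payload: bytes) -> Optional[bytes]:
        key = (source, seq)
        if key in self._seen:
            return None          # second copy from the other LAN
        self._seen.add(key)
        return payload


def send_over_both(receiver: DuplicateDiscardReceiver, source: str,
                   seq: int, payload: bytes,
                   lan_a_up: bool = True,
                   lan_b_up: bool = True) -> List[bytes]:
    """Send one frame on LAN A and LAN B; return what the application
    layer actually sees after duplicate discard."""
    delivered = []
    for lan_up in (lan_a_up, lan_b_up):
        if lan_up:
            result = receiver.receive(source, seq, payload)
            if result is not None:
                delivered.append(result)
    return delivered


if __name__ == "__main__":
    rx = DuplicateDiscardReceiver()
    print(send_over_both(rx, "relay1", 1, b"trip"))
    print(send_over_both(rx, "relay1", 2, b"trip", lan_a_up=False))
```

The application receives exactly one copy whether both LANs are healthy or one has failed, so no switchover delay is visible to the protection logic.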
4.2.3. System Configuration Language (SCL) and Engineering
One of the distinguishing features of IEC 61850 is its system configuration language (SCL), defined in Part 6 of the standard. SCL allows power engineers to describe a substation’s single-line diagram, the communication architecture, and the functions hosted on each IED in a vendor-neutral XML (eXtensible Markup Language)-based format. For a typical subway traction substation, the SCL file might therefore include the station’s single-line topology, its communication network, and the logical nodes assigned to each IED.
By unifying the engineering process, SCL lowers the risk of misconfigurations and ensures that future expansions or modifications can be accommodated with minimal re-engineering efforts [109]. In self-healing contexts, the tight correlation of logical nodes (e.g., PDIS for distance protection and PTOC for time overcurrent protection) with the real physical layout helps in automatically updating agent-based restoration strategies whenever the station’s layout changes.
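To illustrate how an agent might derive its view of the station from SCL, the sketch below parses a heavily trimmed fragment with Python's standard XML library. The fragment is hypothetical and not schema-valid (it omits the header, data type templates, and communication section); the names `TSS1`, `FeederRelay1`, and the `lnType` values are invented for illustration. The element and attribute names follow the SCL structure of substations, voltage levels, bays, IEDs, logical devices, and logical nodes.

```python
import xml.etree.ElementTree as ET

SCL_NS = {"scl": "http://www.iec.ch/61850/2003/SCL"}

# Hypothetical, heavily trimmed fragment for a traction substation.
SCL_FRAGMENT = """<SCL xmlns="http://www.iec.ch/61850/2003/SCL">
  <Substation name="TSS1">
    <VoltageLevel name="AC35kV">
      <Bay name="Incomer1"/>
    </VoltageLevel>
    <VoltageLevel name="DC1500V">
      <Bay name="Feeder1"/>
    </VoltageLevel>
  </Substation>
  <IED name="FeederRelay1">
    <AccessPoint name="AP1">
      <Server>
        <LDevice inst="PROT">
          <LN0 lnClass="LLN0" inst="" lnType="LLN0_T"/>
          <LN lnClass="PTOC" inst="1" lnType="PTOC_T"/>
          <LN lnClass="XCBR" inst="1" lnType="XCBR_T"/>
        </LDevice>
      </Server>
    </AccessPoint>
  </IED>
</SCL>
"""


def bays(scl_text: str):
    """List (voltage level, bay) pairs from the Substation section."""
    root = ET.fromstring(scl_text)
    return [
        (level.get("name"), bay.get("name"))
        for level in root.findall("scl:Substation/scl:VoltageLevel", SCL_NS)
        for bay in level.findall("scl:Bay", SCL_NS)
    ]


def logical_nodes_by_ied(scl_text: str):
    """Map each IED name to its logical nodes as 'LDevice/LNname'."""
    root = ET.fromstring(scl_text)
    result = {}
    for ied in root.findall("scl:IED", SCL_NS):
        names = []
        for ldev in ied.findall(".//scl:LDevice", SCL_NS):
            for ln in ldev.findall("scl:LN", SCL_NS):
                names.append(
                    f'{ldev.get("inst")}/{ln.get("prefix", "")}'
                    f'{ln.get("lnClass")}{ln.get("inst")}'
                )
        result[ied.get("name")] = names
    return result


if __name__ == "__main__":
    print(bays(SCL_FRAGMENT))
    print(logical_nodes_by_ied(SCL_FRAGMENT))
```

Because the topology and the function allocation come from the same file, regenerating these structures after an engineering change is one way an agent's restoration logic could stay aligned with the station layout.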
4.2.4. IEC 61850 Services for Self-Healing
IEC 61850 offers several key services that directly underpin self-healing operations:
- (1)
- GOOSE Messaging for Fast Trip Signals: Agents or protective relays can disseminate critical messages throughout the local substation or extended feeder within milliseconds, facilitating prompt fault isolation.
- (2)
- Reporting and Logging Services: These allow an MAS or central authority to monitor system states in near real-time, capturing event sequences needed for diagnosing deeper network issues.
- (3)
- MMS for Agent–IED Interactions: An MAS architecture can rely on MMS-based client–server communications to read or write device parameters, retrieve trending data, or orchestrate switchgear commands at a slower but more comprehensive timescale.
4.2.5. Challenges and Limitations in Subway Contexts
Despite its merits, deploying IEC 61850 in subway power contexts is not without hurdles. Ensuring electromagnetic compatibility in high-voltage/low-voltage mixed environments, training a workforce specialized in substation automation, and retrofitting older devices that do not inherently support standard object models are formidable tasks. Moreover, bridging DC traction equipment with AC-based LN definitions often requires custom or extended logical nodes, although the industry is gradually developing specialized profiles for railway electrification that align with standard IEC 61850 principles. Table 11 enumerates the current landscape of IEC 61850 applications within subway power systems along multiple dimensions. Noteworthy takeaways include the relatively high adoption rates for protection and SCADA integration in newer installations, as well as the emerging interest in condition-based monitoring and MAS–IEC 61850 hybrid solutions. While the standard provides robust functionalities for AC traction, bridging the gap with DC traction elements remains a work in progress. Even so, ongoing refinements and extension efforts position IEC 61850 as an increasingly indispensable backbone for any advanced self-healing architecture in subway settings.
Table 11.
IEC 61850 applications in subway systems: status and outlook.
Based on Table 11, we further summarize the challenges and future potential of IEC 61850 in subway power systems in Table 12. This table emphasizes ongoing issues in retrofitting legacy DC equipment, securing GOOSE/MMS communications, and grappling with the complexity of LN/DO/DA (logical node/data object/data attribute) mapping. Despite these obstacles, active research and standardization efforts continue to refine IEC 61850 for railway electrification contexts. In the long term, better synergy with MAS frameworks, improved security protocols, and a cohesive approach to LN expansions will yield a robust environment where advanced self-healing functions can be reliably deployed.
Table 12.
Key challenges and research potential for IEC 61850 in subway environments.
Overall, this section demonstrates that IEC 61850 serves as an enabling framework for achieving real-time, interoperable communications in subway power systems. Its emphasis on standardized data models, high-speed messaging, and robust engineering languages paves the way for advanced control schemes—particularly those based on MASs. In the subsequent section, we will elaborate on how MASs and IEC 61850 can be harmonized into a convergent architecture that maximizes the advantages of both approaches.
4.3. Convergent MAS–IEC 61850 Architecture for Fault Diagnosis, Isolation, and Restoration
While Section 4.1 and Section 4.2 have, respectively, presented the strengths of MASs and IEC 61850, the fusion of these two technologies represents a paradigm shift for urban rail power systems. By harnessing agent intelligence in tandem with standardized communication, subway operators can establish an ecosystem where self-healing actions are triggered, coordinated, and verified in a manner that is both robust and scalable. This final section outlines how a convergent MAS–IEC 61850 architecture can be implemented, highlighting design considerations, operational workflows, and potential obstacles along the integration pathway.
4.3.1. Architectural Overview and Design Considerations
A convergent MAS–IEC 61850 architecture introduces distributed “agent brains” into the well-defined communication and modeling backbone offered by IEC 61850. Each protective device or Intelligent Electronic Device (IED) can be associated with an “agent” that interprets local measurements (modeled under IEC 61850 logical nodes), exchanges GOOSE messages with neighboring agents, and cooperates with a supervisory agent at the substation or control center level via MMS. The system-level design can be broken down as follows:
- (1)
- Agent Layer: Comprising local device agents (e.g., relay agents and switchgear agents) and a station-level agent (or aggregator) that coordinates sectional restoration.
- (2)
- IEC 61850 Communication Layer: Facilitating high-speed GOOSE transmissions among local agents for real-time protective functions, and employing MMS for configuration, monitoring, and slower control commands.
- (3)
- Coordinated Control Layer: A higher-level mechanism, often at the control center, that merges data from multiple stations or lines. This layer might also integrate advanced AI algorithms for centralized oversight and strategic decisions.
From a modeling standpoint, each local agent references LN objects for measurement and protection (e.g., MMXU for power/voltage measurement and PTOC for time overcurrent protection) and thus can directly read or manipulate the data attributes under these logical nodes. The agent’s internal logic can be abstractly formulated as follows:
$$u_i(t) = f_i(X, Y, t)$$
where ui(t) is the action selected by agent i, X is the set of local LN data attributes (such as measured currents, voltages, and breaker statuses), Y is the set of GOOSE signals received from neighbors (e.g., protective trip commands and alarm states), and t indicates time or event trigger references.
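The mapping from X, Y, and t to an action can be sketched as a pure function. Everything specific here is an assumption made for illustration: the field names in `LocalState`, the string signals `"TRIP"` and `"BLOCK"`, the three possible actions, and the fixed trip delay are not defined by IEC 61850 or by the reviewed works.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LocalState:
    """X: local logical-node data attributes (hypothetical field names)."""
    current_a: float                    # measured feeder current
    pickup_a: float                     # overcurrent pickup setting
    breaker_closed: bool                # breaker position
    overcurrent_since: Optional[float]  # time the pickup was first exceeded


def agent_logic(x: LocalState, y: List[str], t: float,
                trip_delay_s: float = 0.1) -> str:
    """Return "OPEN", "ALARM", or "NONE" from X, Y, and t."""
    if not x.breaker_closed:
        return "NONE"          # nothing left to open
    if "BLOCK" in y:
        return "NONE"          # a neighbor asked this agent to hold off
    if "TRIP" in y:
        return "OPEN"          # follow a neighbor's trip command
    if x.current_a > x.pickup_a and x.overcurrent_since is not None:
        if t - x.overcurrent_since >= trip_delay_s:
            return "OPEN"      # overcurrent has persisted long enough
        return "ALARM"         # suspected fault, keep watching
    return "NONE"


if __name__ == "__main__":
    state = LocalState(current_a=5000.0, pickup_a=3000.0,
                       breaker_closed=True, overcurrent_since=10.0)
    print(agent_logic(state, [], t=10.05))   # still inside the delay
    print(agent_logic(state, [], t=10.2))    # delay elapsed
```

Keeping the logic free of side effects makes it straightforward to replay recorded events against the function during offline testing.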
4.3.2. Fault Diagnosis and Localization Workflow
A typical fault diagnosis scenario under a convergent MAS–IEC 61850 architecture proceeds as follows:
- (1)
- Initial Detection (Local Agent): Upon sensing an abnormal current or voltage signature (modeled by LN PTOC or PDIS), the local agent increments an internal fault counter. If the magnitude exceeds a set threshold, the agent broadcasts a GOOSE-based “suspected fault” message to adjacent nodes.
- (2)
- Peer Confirmation (Neighboring Agents): Neighboring agents also evaluate their local signals. If they detect correlated anomalies, they respond with a GOOSE “confirmation” message. Weighted voting can be employed to mitigate false positives.
- (3)
- Station-Level Aggregation: The station-level or aggregator agent (connected via MMS and local GOOSE) collects these events. Using a pre-defined topology map (SCL-based), it identifies the line segment or bus location with the highest probability of fault development.
- (4)
- Refined Diagnostics: Optionally, advanced AI modules or traveling-wave-based algorithms can run at the aggregator level to further pinpoint the fault location.
- (5)
- Isolation Instruction: Once the aggregator agent validates the fault location, it issues GOOSE open commands to the relevant breakers or switches, ensuring minimal disruption to unaffected lines.
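The aggregation steps of this workflow (peer confirmation by weighted voting, localization on a topology map, and selection of the breakers to open) can be sketched as below. The scoring rule is a crude stand-in for "highest probability of fault": it simply counts how many end agents of each segment flagged an anomaly. Segment names, agent names, weights, and the 0.5 vote threshold are hypothetical.

```python
from typing import Dict, Optional, Set, Tuple


def weighted_vote(confirmations: Dict[str, bool], weights: Dict[str, float],
                  threshold: float = 0.5) -> bool:
    """Confirm the fault if the weighted share of 'yes' votes is high enough."""
    total = sum(weights[agent] for agent in confirmations)
    in_favor = sum(weights[agent]
                   for agent, ok in confirmations.items() if ok)
    return total > 0 and in_favor / total >= threshold


def locate_fault(segments: Dict[str, Tuple[str, str]],
                 flagged: Set[str]) -> Optional[str]:
    """Pick the segment with the most flagged end agents (first wins ties)."""
    best, best_score = None, 0
    for name, ends in segments.items():
        score = sum(1 for agent in ends if agent in flagged)
        if score > best_score:
            best, best_score = name, score
    return best


def isolation_targets(segments: Dict[str, Tuple[str, str]],
                      flagged: Set[str],
                      confirmations: Dict[str, bool],
                      weights: Dict[str, float]):
    """Return (segment, end agents to open), or None if nothing is confirmed."""
    if not weighted_vote(confirmations, weights):
        return None
    segment = locate_fault(segments, flagged)
    if segment is None:
        return None
    return segment, segments[segment]


if __name__ == "__main__":
    segments = {"S1": ("A", "B"), "S2": ("B", "C"), "S3": ("C", "D")}
    result = isolation_targets(
        segments, flagged={"B", "C"},
        confirmations={"A": False, "B": True, "C": True},
        weights={"A": 1.0, "B": 2.0, "C": 2.0})
    print(result)
```

In this example only the breakers at both ends of segment S2 are selected, so the neighboring segments remain energized.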
4.3.3. Restoration and Reconfiguration Strategies
After isolating the fault, agents collaborate to restore power to the maximum possible portion of the subway network. The station-level agent consults the SCL topology to identify alternate feeding paths. If the AC ring or DC feeder lines can accommodate the extra load without violating thermal or voltage constraints, a reconfiguration command is broadcast. Restoration might unfold in a multi-step process:
Step 1: $SW_{\text{healthy}} \leftarrow \text{Close}$;
Step 2: Check $P_{\text{capacity}} \geq P_{\text{demand}}$;
Step 3: Activate the line if all constraints are satisfied.
Each local agent (switchgear or feeder agent) acknowledges the command, rechecks local conditions, and then closes or opens respective switches. This sequence is governed by multi-agent consensus, ensuring that no single device performs an unsafe action. Detailed logging and reporting (via MMS) guarantee thorough post-event analysis.
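A minimal sketch of the reconfiguration search follows. It assumes a simplified model in which buses are connected by lines that are either closed or open, faulted lines are never reclosed, and a tie line may be closed only if its capacity covers the demand of the de-energized island it would pick up. The sketch performs the capacity check before closing a switch, and it ignores upstream loading and voltage limits. All names and numbers are hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Line:
    a: str
    b: str
    capacity_kw: float
    closed: bool
    faulted: bool = False


def reachable(start: Set[str], lines: List[Line]) -> Set[str]:
    """Buses connected to `start` through closed, healthy lines."""
    seen = set(start)
    queue = deque(start)
    while queue:
        bus = queue.popleft()
        for ln in lines:
            if ln.closed and not ln.faulted and bus in (ln.a, ln.b):
                other = ln.b if bus == ln.a else ln.a
                if other not in seen:
                    seen.add(other)
                    queue.append(other)
    return seen


def restore(lines: List[Line], sources: Set[str],
            demand_kw: Dict[str, float]) -> List[Line]:
    """Close open healthy ties one at a time while capacity allows.

    Mutates `lines` in place and returns the lines that were closed.
    """
    closed_now: List[Line] = []
    progress = True
    while progress:
        progress = False
        live = reachable(sources, lines)
        for ln in lines:
            if ln.closed or ln.faulted:
                continue
            a_live, b_live = ln.a in live, ln.b in live
            if a_live == b_live:
                continue  # both live (would form a loop) or both dead
            dead_end = ln.b if a_live else ln.a
            island = reachable({dead_end}, lines)
            if ln.capacity_kw >= sum(demand_kw.get(b, 0.0) for b in island):
                ln.closed = True
                closed_now.append(ln)
                progress = True
                break  # recompute the live set before the next closure
    return closed_now


if __name__ == "__main__":
    net = [Line("S1", "A", 1000.0, True),
           Line("A", "B", 1000.0, False, faulted=True),
           Line("B", "C", 1000.0, True),
           Line("C", "S2", 800.0, False)]
    closed = restore(net, {"S1", "S2"}, {"B": 300.0, "C": 400.0})
    print([(ln.a, ln.b) for ln in closed])
```

In this example the faulted line between A and B stays open, and buses B and C are re-supplied from the second source through the tie line, because its 800 kW rating covers their combined 700 kW demand.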
4.3.4. Security and Redundancy Considerations
When MAS intelligence relies on an Ethernet-based IEC 61850 network, cybersecurity and redundancy are paramount. Agents must handle the authentication and, where latency budgets allow, the encryption of GOOSE messages. Meanwhile, ring or dual-network topologies ensure that a single link failure does not compromise the entire self-healing process. Key security approaches include the following:
- (1)
- Role-Based Access Control (RBAC) [110]: Agents only process control instructions from authenticated roles recognized by the substation system.
- (2)
- Encrypted Tunneling of GOOSE/MMS: Emerging solutions propose TLS-based encryption for MMS, though GOOSE typically remains unencrypted for performance reasons.
- (3)
- Backup Communication Channels [111]: For critical commands, multiple GOOSE subscriptions may be created in parallel networks (e.g., PRP + HSR) to reduce the risk of packet loss or delay.
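The combination of message authentication and role-based filtering can be sketched with Python's standard `hmac` and `hashlib` modules. This is an illustration of the principle only: the message layout, the shared key, the role names, and the permission table are hypothetical, and the sketch does not reproduce any standardized security format for GOOSE or MMS.

```python
import hashlib
import hmac
from typing import Dict, Set

# Hypothetical role table: which commands each role may issue.
ROLE_PERMISSIONS: Dict[str, Set[str]] = {
    "station_agent": {"open_breaker", "close_breaker"},
    "monitor": set(),
}


def encode(sender_role: str, command: str) -> bytes:
    """Bind the claimed role and the command into one message."""
    return f"{sender_role}:{command}".encode("utf-8")


def sign(key: bytes, message: bytes) -> bytes:
    """Compute an HMAC-SHA256 tag over the message."""
    return hmac.new(key, message, hashlib.sha256).digest()


def accept_command(key: bytes, sender_role: str, command: str,
                   tag: bytes) -> bool:
    """Accept only authentic messages from roles allowed to send them."""
    expected = sign(key, encode(sender_role, command))
    if not hmac.compare_digest(expected, tag):
        return False  # forged or altered message
    return command in ROLE_PERMISSIONS.get(sender_role, set())


if __name__ == "__main__":
    key = b"shared-secret-for-illustration-only"
    tag = sign(key, encode("station_agent", "open_breaker"))
    print(accept_command(key, "station_agent", "open_breaker", tag))
    print(accept_command(key, "station_agent", "close_breaker", tag))
```

Because the tag covers both the role and the command, an attacker who replays a valid tag with a different command, or a legitimate but unprivileged role that signs a control command, is rejected.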
4.3.5. Practical Challenges and Future Outlook
In practice, the synergy of MASs and IEC 61850 faces numerous engineering, organizational, and financial challenges. Notably, older DC traction systems lack standardized LN definitions, requiring the use of proxy or extended LN models. Additionally, debugging multi-agent logic in a live subway environment with thousands of daily passengers demands rigorous offline testing (hardware-in-the-loop simulations) prior to rollout. Nonetheless, the promise of real-time, distributed intelligence—coordinating with a vendor-agnostic, standards-based communication framework—strongly indicates that convergent MAS–IEC 61850 solutions will shape the next generation of subway power systems.
Based on the above, Table 13 outlines various deployment modalities—ranging from full greenfield implementations to more conservative station-focused upgrades. Each scenario entails different levels of complexity, initial expenditure, and performance demands. The success of these deployments hinges on a confluence of factors, including the readiness of legacy systems for integration, the expertise of stakeholders, and the clarity of standard definitions for DC traction contexts. Nonetheless, incremental or phased approaches can systematically unlock the benefits of MAS–IEC 61850 synergy.
Table 13.
Practical deployment cases for MAS–IEC 61850 in subway systems.
Based on Table 13, we further summarize the future R&D themes for MAS–IEC 61850 convergence in subway networks in Table 14. The themes listed there reflect the evolving nature of MAS–IEC 61850 convergence, including the integration of edge computing, AI-driven fault forecasting, and the development of new LN classes for DC traction applications. Each theme demands multidisciplinary collaborations that range from cryptography for secure GOOSE messaging to advanced hardware engineering for resilient edge-based agents. The ultimate payoff, a fully autonomous, self-healing subway power system that leverages standardized communication, justifies the complexity of these endeavors.
Table 14.
Future R&D themes for MAS–IEC 61850 convergence in subway networks.
Overall, this section has presented a comprehensive view of how MASs and the IEC 61850 standard can be synthesized into a single, cohesive architecture that elevates fault management and self-healing to new levels of efficiency and reliability in subway power networks. While operationalizing this synergy demands substantial effort in areas such as engineering, cybersecurity, and standardization, the strategic advantages are evident: higher system resiliency, minimized downtime, and an adaptive framework capable of meeting future urban transportation demands. By situating MAS intelligence within the standardized data and communication protocols of IEC 61850, subway operators can accelerate fault response; reduce manual interventions; and pave the way for a truly smart, autonomous rapid transit infrastructure. This integrated approach stands at the forefront of the ongoing “intelligence revolution” in power systems, offering a compelling vision for the next generation of urban rail electrification.
5. Practical Applications of Self-Healing Techniques in Subway Systems
In this chapter, we will examine the practical applications of self-healing techniques within subway power systems, specifically focusing on the integration of MASs, AI, and relevant standards such as IEC 61850. The application of these techniques is vital for improving the reliability and efficiency of subway networks, ensuring rapid fault detection, isolation, and recovery. This chapter will be organized into four key sub-sections:
- (1)
- Substation-level self-healing applications in subway power systems;
- (2)
- Line-level self-healing mechanisms and network reconfiguration;
- (3)
- Cross-layer fault recovery techniques and strategies;
- (4)
- AI-driven fault diagnosis and recovery in complex scenarios.
Each sub-section will delve into the practical applications of self-healing mechanisms in subway systems, highlight the technological advancements, and provide an in-depth analysis of the benefits and challenges involved in implementing these strategies. The sub-sections are connected logically, starting from individual substations and their self-healing capabilities; progressing to line-level self-healing; and finally addressing complex, multi-layer fault recovery strategies with the help of AI algorithms.
5.1. Substation-Level Self-Healing Applications in Subway Power Systems
Substations play a crucial role in the overall operation of subway power systems. They are the primary points of interaction between the high-voltage grid and the subway’s internal power distribution network. Their responsibility includes converting and distributing electrical power to subway stations and trains, making them a critical point of failure in any power outage event [112,113,114]. These references provide insights into the role of substations in fault detection, isolation, and reconfiguration within self-healing smart grid systems, and highlight how these processes are integrated into critical infrastructure, such as subway power systems, for reliability and operational efficiency. Because these functions are essential for the continuous operation of subway systems, integrating self-healing mechanisms at the substation level enhances system reliability by enabling rapid fault identification and isolation, thereby restoring power to unaffected areas more efficiently.
However, scalability concerns arise when considering the application of these technologies in larger subway systems. While substation-level self-healing systems work effectively in smaller networks, the implementation of these systems in expansive urban subway networks requires careful planning. Specifically, as the number of substations increases, communication overheads and real-time data processing requirements increase, which may necessitate substantial investment in communication infrastructure and AI-driven control systems.
Cost considerations are also significant in large-scale implementation. While automated fault detection, isolation, and reconfiguration systems can minimize downtime and reduce maintenance costs over time, the initial capital expenditure for deploying advanced communication systems like IEC 61850 and AI-driven algorithms can be high. Therefore, a phased deployment strategy is recommended, starting with pilot implementations in smaller, manageable sections of the subway network before scaling up.
5.1.1. Fault Detection and Isolation
In substation-level self-healing systems, fault detection is the first critical step, followed by rapid isolation to prevent further system disruption. The application of digital fault recorders, current transformers, and protection relays with IEC 61850 protocols enables real-time fault detection and automated isolation. These systems are essential for minimizing the mean time to repair (MTTR) and improving overall network resilience.
Scalability issues may emerge when these systems are applied to larger subway networks with more substations, as the complexity of managing real-time communication between all devices increases. Effective system integration across substations becomes critical, and ensuring data consistency across multiple network layers will require robust data management systems and high-capacity communication infrastructure.
Once a fault is detected, the system must isolate the affected area quickly to prevent it from spreading and impacting other parts of the network. IEC 61850 standards enable real-time communication between devices within the substation, allowing for automated decision-making in the isolation process. Remote control devices and automated switches help to disconnect faulty sections from the rest of the grid, ensuring that only the impacted area is affected, and power continues to flow to other critical sections of the subway system.
5.1.2. Automated Reconfiguration
After isolating the faulted section, automated reconfiguration becomes necessary to restore service to the remaining sections. The key challenge here lies in optimizing the network’s configuration dynamically, based on real-time load conditions and fault location. The integration of AI-based reconfiguration allows the system to predict network behavior based on historical data and current system conditions, ensuring that power is rerouted efficiently.
Cost issues arise in the deployment of AI-based reconfiguration systems. While AI models require substantial computational resources and high-quality data for training, the long-term benefits—such as reduced downtime, optimized system performance, and predictive maintenance—can outweigh these initial costs. Predictive maintenance algorithms, for instance, can help prevent system failures before they occur, reducing unplanned maintenance costs significantly.
For example, Refs. [115,116,117] collectively explore the role of AI and machine learning in optimizing network performance, enhancing reconfiguration processes, and improving system efficiency. Alabi (2023) [115] discusses how AI methodologies such as reinforcement learning and deep learning contribute to network optimization in telecommunications, particularly through autonomous reconfiguration. Similarly, Umoga and Sodiya (2024) [116] delve into AI-driven optimization for dynamic network performance, focusing on how machine learning algorithms can facilitate adaptive network configurations in response to fluctuating conditions. Cruz et al. (2024) [117] provide a comprehensive review of AI applications for self-reconfiguration in smart manufacturing systems, emphasizing the integration of machine learning techniques to optimize operational efficiency and predict system behavior. Together, these studies underline the transformative impact of AI on system reconfiguration, predictive maintenance, and overall optimization in complex network environments.
The scalability of AI-based reconfiguration is also worth noting. As subway networks grow and expand, the demand for real-time processing power and advanced predictive models will increase, making it necessary to implement scalable cloud-based solutions and distributed computing frameworks. The ability of AI systems to scale will depend on the development of more efficient algorithms and distributed learning techniques that can be deployed across different sections of the network.
5.1.3. Key Technologies for Substation-Level Self-Healing
Several critical technologies contribute to substation-level self-healing, each addressing a specific aspect of fault detection, isolation, and recovery, as summarized in Table 15. These technologies include advanced communication protocols, remote control and monitoring systems, machine learning-based fault detection algorithms, and automated reconfiguration systems. These systems work in tandem to ensure that substations can respond quickly and efficiently to faults, minimizing downtime and ensuring continuous operation.
Table 15.
Key technologies for substation-level self-healing.
The integration of self-healing technologies at the substation level has proven to significantly enhance the overall reliability and efficiency of subway power systems. Automated detection, isolation, and reconfiguration ensure that faults are managed with minimal manual intervention, which not only reduces the risk of human error but also speeds up recovery times. Despite the advantages, challenges such as communication delays and system integration need to be addressed to further improve these systems’ effectiveness.
5.2. Line-Level Self-Healing Mechanisms and Network Reconfiguration
Line-level self-healing mechanisms are designed to handle faults along power distribution lines, which are often exposed to stresses such as storms, wildlife interference, and overloads. In this context, ring network configurations are particularly valuable, as they allow power to be rerouted through an alternative path when a fault occurs.
However, implementing these systems in large subway networks involves scalability concerns. Ring networks may become complex and inefficient if the number of interconnected lines increases. Additionally, line-level self-healing systems must be able to handle the increased number of fault detection sensors, automated switches, and MAS agents required to manage the additional complexity. Data communication between these components needs to be highly synchronized, and large-scale deployments will require robust communication protocols to handle the increased volume of data.
Cost and practical implementation issues also arise. While automated switches and MAS-based decision-making systems can significantly enhance fault detection and recovery, the high initial investment costs and the need for continuous maintenance of these systems could pose a barrier to widespread adoption in large urban transit systems. Therefore, the implementation of modular systems, where critical components are first deployed in high-priority areas, followed by gradual expansion, could offer a cost-effective solution.
5.2.1. Ring Network Configuration
A ring network is a network topology in which multiple power paths are connected in a loop. This configuration allows power to be rerouted if one segment of the network fails, so that supply continues without major interruptions. When a fault occurs, the ring network can detect the faulted section and automatically isolate it, while rerouting power through the alternative path to the healthy sections that would otherwise lose supply. This feature is particularly valuable in subway systems, where uninterrupted service is crucial.
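The rerouting behavior of a ring can be checked with a small connectivity sketch. It assumes a single source at node 0 and one normally open tie switch, which is a simplification; the node count, edge numbering, and function name are hypothetical.

```python
from collections import deque
from typing import Set


def energized_nodes(n: int, open_edges: Set[int], source: int = 0) -> Set[int]:
    """Breadth-first search over a ring of n nodes.

    Edge k joins node k and node (k + 1) % n. Edges listed in
    open_edges carry no power (open switch or isolated faulted line).
    """
    seen = {source}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        # Each node touches edge `node` (towards the next node) and
        # edge `node - 1` (towards the previous node).
        for edge, neighbour in ((node, (node + 1) % n),
                                ((node - 1) % n, (node - 1) % n)):
            if edge not in open_edges and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


N, TIE, FAULT = 6, 3, 1  # ring size, normally open edge, faulted edge
normal = energized_nodes(N, {TIE})
isolated = energized_nodes(N, {TIE, FAULT})   # fault opened, tie still open
restored = energized_nodes(N, {FAULT})        # tie closed to back-feed
print(sorted(isolated))  # [0, 1, 4, 5]
print(len(restored))     # 6
```

After isolation, nodes 2 and 3 are de-energized; closing the tie switch supplies them from the other side of the loop while the faulted edge stays open.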
5.2.2. Automated Switches and MASs for Fault Detection and Isolation
Automated switches play a critical role in line-level self-healing. These switches can automatically disconnect faulty sections from the grid, ensuring that faults do not propagate further. They are equipped with sensors and communication devices that relay information about fault conditions to the central control system. These switches are often controlled through MASs, where multiple agents (representing different network components) collaborate to determine the best course of action based on real-time data.
5.2.3. Reconfiguration and Rerouting Power
When a fault is detected and isolated, the network must quickly reconfigure to restore power to the affected areas. Automated systems, powered by MASs and AI, help in making decisions about the most efficient way to reroute power to ensure continuity of service. This includes balancing loads across unaffected sections and optimizing power flow to reduce stress on the remaining parts of the network.
Based on this, Table 16 presents an in-depth comparative analysis of several key line-level self-healing mechanisms within subway power networks. These mechanisms, including ring network reconfiguration, automated switches, MAS-based decision-making, fault detection sensors, real-time load balancing, and adaptive rerouting, play critical roles in enhancing the resilience, reliability, and overall performance of subway power supply systems. The table evaluates these mechanisms across multiple dimensions, including their degree of implementation, differences between systems, future prospects, issues to address, research potential, reliability impact, cost considerations, maintenance requirements, and impact on power quality.
Table 16.
Comparative analysis of line-level self-healing mechanisms in subway power networks.
A detailed summary and evaluation of Table 16 follows.
1. Ring Network Reconfiguration
Ring network reconfiguration is implemented at a high degree in most subway networks. It provides significant advantages, particularly in reducing the risk of power surges. However, the main challenge lies in the variability of its design across different systems, which can complicate integration. The future prospects for ring network reconfiguration are high, especially in terms of standardization. While it has a high research potential and reliability impact, it requires moderate costs and maintenance. Its impact on power quality is significant, making it an essential feature for enhancing power system robustness.
2. Automated Switches
Automated switches are widely implemented, particularly for fault isolation and rerouting, and are available in most subway systems, where they contribute significantly to power reliability. However, operating time delays remain a challenge, and their future prospects are rated only moderate. Their research potential is moderate, while their high reliability impact makes them indispensable in critical power situations. Cost is medium and maintenance requirements are relatively low. Their contribution to power quality is notable, though not as high as that of some of the more advanced technologies.
3. MAS-Based Decision Making
The implementation of MAS-based decision-making is currently moderate, focusing on real-time fault management. This mechanism holds great promise for the future, particularly in smart grids, where its innovative potential can lead to more dynamic and adaptive responses. Despite the challenges in managing data communication overhead, this mechanism’s research potential remains high. The system’s impact on reliability is significant, especially for maintaining consistent power quality, but it faces moderate cost and maintenance requirements. Furthermore, MAS-based decision-making has high potential to improve system efficiency in the long term.
4. Fault Detection Sensors
Fault detection sensors are crucial for detecting and locating faults in subway networks. Their implementation is high, and they are critical for the efficiency of the systems. However, the occurrence of false negatives remains a challenge. These sensors provide medium research potential but contribute significantly to the reliability of power networks. The systems require moderate to high costs for implementation but have medium maintenance requirements. Their high impact on power quality makes them an essential part of a well-functioning self-healing system.
5. Real-time Load Balancing
Real-time load balancing plays a critical role in dynamically balancing the network load. This mechanism exhibits very high scalability but also presents challenges in coordinating complex systems. Real-time load balancing systems are highly promising in terms of enhancing power quality, though they require a significant investment in coordination and data throughput. Despite its high initial costs, this system shows great promise for optimizing the power system’s performance in the long term. Its research potential is high, as it can enable real-time adjustments to prevent overloads.
6. Adaptive Rerouting
Adaptive rerouting mechanisms, which adjust power flow dynamically, are based on cutting-edge technology and hold very high future prospects. These mechanisms offer significant advantages in terms of real-time adaptability, with the potential to dramatically reduce power disruptions. However, adaptive rerouting requires high data throughput and advanced technologies, making it one of the more complex systems to implement. Despite the high costs and medium maintenance requirements, adaptive rerouting mechanisms have a very high impact on power quality, which is crucial for ensuring the continuous operation of subway networks.
In conclusion, line-level self-healing mechanisms such as ring network reconfiguration, automated switches, and MAS-based decision-making enable quick fault isolation and recovery (Figure 4). These technologies ensure that power disruptions are minimized, and service can be restored quickly. The main challenges lie in the complexity of coordinating multiple agents across the network, and the requirement for continuous communication. Nonetheless, these mechanisms significantly improve the robustness and resilience of the system.
Figure 4.
Fault isolation and recovery process in self-healing control for subway power supply systems.
The fault isolation and recovery flowchart presented in Figure 4 illustrates the essential processes involved in self-healing control within a subway power supply system, highlighting the crucial steps from fault detection to system restoration. The flowchart comprises four steps, described below.
1. Real-Time Monitoring and Data Collection
In this first step, the system continuously monitors operational parameters such as voltage, current, and temperature through sensors and monitoring devices. Once an anomaly is detected, such as voltage fluctuations or current imbalances, the system automatically triggers the self-healing control process. The use of the IEC 61850 communication protocol ensures the efficient and reliable transmission of data, allowing for the dynamic monitoring of critical system parameters. This early detection phase is crucial for initiating a rapid and accurate response to potential faults.
2. Fault Diagnosis and Location Analysis
After identifying an abnormal condition, the system utilizes historical data and real-time sensor readings to quickly diagnose the fault type and its precise location. AI algorithms, including pattern recognition and deep learning, process these data to create diagnostic models that pinpoint the fault accurately. The integration of these advanced AI techniques, coupled with the use of zero-sequence and differential current monitoring, enhances the efficiency and precision of fault identification, enabling the system to respond rapidly to issues in the subway power supply network.
3. Fault Isolation and Protection Actions
Once the fault is diagnosed, automated isolation devices are triggered to disconnect the affected area, preventing the fault from spreading and causing further damage. The multi-agent system (MAS) plays a critical role in coordinating resources across the entire network, dynamically adjusting load distribution to stabilize the system. The MAS ensures that the fault isolation is executed in an optimized sequence, minimizing the overall impact on the system’s functionality and ensuring that non-faulted regions maintain power. This coordination enhances system resilience and ensures minimal service disruption.
4. Recovery of Power Supply and Load Optimization
After the fault is isolated, the system focuses on restoring power to any non-faulted regions that lost supply during isolation. This is accomplished through the automatic switching of circuits to reconnect these areas to the power supply. To prevent overloads and secondary faults, intelligent optimization algorithms, such as game theory-based optimization, are applied to balance resource distribution across the network, so that power is recovered without overloading critical lines. The dynamic load optimization enhances the overall system stability and speeds up the recovery process, minimizing the downtime of the subway network.
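Step 2 above mentions zero-sequence and differential current monitoring. The sketch below computes both quantities in their textbook form from complex phasors. The current magnitudes are illustrative values chosen for the example, and the reference direction (current entering and leaving a protected zone measured in the same sense) is an assumption stated here rather than a detail from the reviewed systems.

```python
import cmath
import math


def phasor(magnitude: float, angle_deg: float) -> complex:
    """Build a complex phasor from magnitude and angle in degrees."""
    return cmath.rect(magnitude, math.radians(angle_deg))


def zero_sequence(ia: complex, ib: complex, ic: complex) -> complex:
    """I0 = (Ia + Ib + Ic) / 3; close to zero for a balanced, healthy line."""
    return (ia + ib + ic) / 3


def differential_current(i_in: complex, i_out: complex) -> float:
    """Magnitude of the mismatch between current entering and leaving a zone.

    With both currents measured in the same direction, a healthy zone
    gives a value near zero; an internal fault diverts current and the
    mismatch grows.
    """
    return abs(i_in - i_out)


# Balanced three-phase set: the three phasors cancel.
balanced = (phasor(100, 0), phasor(100, -120), phasor(100, 120))
# Phase A carries extra current, as in a single-phase-to-ground fault.
unbalanced = (phasor(400, 0), phasor(100, -120), phasor(100, 120))

print(round(abs(zero_sequence(*balanced)), 6))    # 0.0
print(round(abs(zero_sequence(*unbalanced)), 6))  # 100.0
print(differential_current(phasor(400, 0), phasor(100, 0)))  # 300.0
```

In a protection scheme each quantity would be compared against a pickup setting; those settings depend on the installation and are deliberately left out of the sketch.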
As illustrated in Figure 4, this process begins with real-time monitoring and data collection, where sensors continuously gather operational parameters such as voltage and current. Upon detecting abnormalities, such as voltage fluctuations or current imbalances, the system triggers the self-healing procedure. The fault diagnosis and location analysis stage utilizes AI algorithms, including pattern recognition and deep learning, to swiftly identify the fault type and pinpoint its exact location based on both historical and real-time data. Following diagnosis, fault isolation and protection actions are automatically implemented, where automated isolation devices disconnect the faulted area, preventing further damage. The multi-agent system (MAS) coordinates resources across the network, dynamically adjusting load distribution to maintain system stability. Finally, in the recovery and load optimization phase, the system restores power to the non-faulted areas by automatically switching circuits, and smart optimization algorithms ensure balanced resource allocation, minimizing the risk of overloading or secondary faults.
This flowchart exemplifies the advantages of modern self-healing systems, combining AI-driven diagnostics, automated fault isolation, and dynamic load management to rapidly restore service and maintain network stability. Its key strengths lie in its high-speed response to faults, minimal disruption to non-faulted areas, and the intelligent optimization of resources, making it highly efficient for managing complex subway power networks.
Overall, this fault isolation and recovery flowchart is integral to a self-healing control system, showcasing an intelligent and automated approach to managing faults in complex subway power supply networks. The system’s reliance on real-time data collection, AI-based diagnostics, automated fault isolation, and dynamic recovery processes ensures a rapid, accurate, and minimal-impact response to power system disruptions. This approach enhances both operational reliability and efficiency, crucially maintaining continuous service while minimizing power loss and service downtime.
The mechanisms discussed in Table 16 (ring network reconfiguration, automated switches, MAS-based decision-making, fault detection sensors, real-time load balancing, and adaptive rerouting) each contribute uniquely to the enhancement of subway power systems’ resilience and reliability. While each mechanism has its own set of challenges, particularly in terms of integration complexity, coordination, and data management, their future potential remains high. The mechanisms with higher scalability and dynamic response capabilities, such as real-time load balancing and adaptive rerouting, are particularly important for adapting to the growing demands of modern subway networks.
It is clear that while some systems, like automated switches, are already well integrated and have a proven track record, others like MAS-based decision-making and adaptive rerouting present substantial opportunities for future research and development. These technologies will become increasingly vital as the complexity of urban transit systems continues to evolve, making continuous innovation and research investment in this area critical.
5.3. Cross-Layer Fault Recovery Techniques and Strategies
Cross-layer fault recovery involves the coordination of fault management across different layers of the power network, from generation to distribution. This approach is essential for ensuring that faults are handled in a synchronized manner, minimizing the risk of cascading failures across multiple network layers.
Scalability challenges arise when attempting to implement multi-layer coordination systems in large subway networks. The complexity of cross-layer communication and data synchronization increases as the number of network layers grows. Hierarchical recovery systems, which involve different levels of control, must be carefully designed to ensure that they can scale without overwhelming the system’s computational resources.
Cost issues are particularly relevant here. The deployment of cross-layer recovery systems often requires substantial investment in advanced communication infrastructure and real-time data processing technologies. However, as these systems improve the overall resilience and efficiency of the subway power network, the long-term savings from reduced downtime, improved power quality, and predictive fault management can justify the initial costs.
5.3.1. Hierarchical Recovery Systems
Hierarchical recovery systems involve different levels of fault management, from local fault detection and isolation to higher-level network-wide coordination. The first level involves the detection and isolation of faults at the substation or line level, while the next level includes coordinating recovery actions across multiple substations or even across the entire power grid. At the highest level, centralized control centers monitor the overall system status and ensure that recovery actions are coordinated across the network.
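The three-level structure can be pictured as an escalation chain in which a fault is handed upward only when the current level cannot clear it. The following sketch is one possible reading of that idea; the level names, the dictionary keys, and the numeric limits are hypothetical and chosen only to make the example runnable.

```python
from typing import Callable, List, Tuple

# A handler returns True when its level can clear the fault by itself.
Handler = Callable[[dict], bool]


def local_level(fault: dict) -> bool:
    # Substation or line level: handles a fault confined to one section.
    return fault["sections_affected"] == 1


def regional_level(fault: dict) -> bool:
    # Coordination across a small group of substations (assumed limit: 3).
    return fault["substations_affected"] <= 3


def central_level(fault: dict) -> bool:
    # The control centre is the last resort and always takes ownership.
    return True


LEVELS: List[Tuple[str, Handler]] = [
    ("local", local_level),
    ("regional", regional_level),
    ("central", central_level),
]


def escalate(fault: dict) -> str:
    """Return the name of the lowest level able to handle the fault."""
    for name, handler in LEVELS:
        if handler(fault):
            return name
    raise RuntimeError("no level accepted the fault")


print(escalate({"sections_affected": 1, "substations_affected": 1}))  # local
print(escalate({"sections_affected": 4, "substations_affected": 2}))  # regional
print(escalate({"sections_affected": 9, "substations_affected": 5}))  # central
```

Ordering the levels from local to central keeps most faults out of the control center, which is the scalability argument usually made for hierarchical designs.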
5.3.2. Coordinated Fault Isolation Across Layers
Coordinated fault isolation is crucial to ensure that faults at one level do not affect other levels of the system. For example, if a fault occurs at the substation level, the system must isolate the fault while maintaining the overall operation of the grid. At the same time, automated decision-making processes must be in place to restore service as quickly as possible, whether through reconfiguration, load balancing, or rerouting.
5.3.3. MASs for Multi-Layer Coordination
Multi-agent systems are particularly useful in multi-layer fault recovery. They enable distributed decision-making, where each agent is responsible for coordinating recovery actions within its designated layer. These agents communicate with each other to ensure that the system-wide recovery actions are optimized. For example, an agent at the substation level may isolate a fault, while an agent at the transmission level reroutes power from other sources to ensure continuous service [118,119,120].
These studies highlight the significant role of MASs in enhancing the resilience and efficiency of power systems, particularly in the context of multi-layer fault recovery. Lin and Bie (2018) [118] provide a comprehensive analysis of strategies for achieving power system resilience, emphasizing decentralized decision-making in MASs. Yu et al. (2020) [119] explore survivability-aware routing restoration mechanisms, demonstrating how MASs can optimize network communication during large-scale failures. Furthermore, Moradi et al. (2016) [120] examine the application of MASs in power engineering, emphasizing their ability to coordinate distributed recovery actions across different layers of the power network, such as substations and transmission systems. Collectively, these studies underscore the importance of MASs in ensuring rapid fault detection, isolation, and system restoration in complex, multi-layered infrastructures.
Based on this, Table 17 outlines key cross-layer fault recovery mechanisms aimed at enhancing the resilience of subway power systems. These mechanisms—hierarchical recovery, cross-layer isolation, MASs for multi-layer coordination, adaptive load balancing, data integration systems, and predictive recovery algorithms—represent various approaches to fault isolation, decision-making, and recovery coordination across multiple layers of the power system. The table assesses each mechanism based on its degree of implementation, differences between systems, future prospects, issues to address, research potential, reliability impact, cost considerations, maintenance requirements, and impact on power quality.
Table 17.
Comparative analysis of cross-layer fault recovery mechanisms in subway power networks.
A detailed summary and evaluation of Table 17 follows.
1. Hierarchical Recovery
Hierarchical recovery is implemented at a high level in subway power systems, particularly for coordinated fault recovery. The main challenge here is the system complexity, as the approach varies significantly across different systems. Hierarchical recovery shows very high future prospects due to its potential for seamless fault management. The research potential is high, and its impact on reliability is very significant. However, its cost considerations and maintenance requirements are high, which is a common challenge in large-scale systems. Despite these challenges, hierarchical recovery mechanisms provide very high benefits in terms of power quality, making them essential for enhancing overall system robustness.
2. Cross-Layer Isolation
Cross-layer isolation mechanisms are moderately implemented and are particularly focused on isolating faults across multiple layers of the system. These systems are critical for new and emerging subway power networks. The key issue to address is maintaining data consistency, which is essential for effective fault detection and recovery. The research potential is high, and the impact on reliability is substantial. While the costs are moderate, the complexity of integrating multiple layers of data can be a challenge. Cross-layer isolation mechanisms have a high impact on power quality, making them crucial for efficient and resilient power systems.
3. MASs for Multi-Layer Coordination
MASs are used for distributed decision-making in recovery processes. They are implemented to a high degree and require data consistency across multiple layers of the power system. The main challenge is overcoming communication delays between agents, which can hinder decision-making efficiency. Despite this, MASs for multi-layer coordination hold very high future prospects, particularly in enhancing system resilience. Their research potential remains high, as they are central to modern smart grids and large-scale systems. While they demand high costs and maintenance, they have a very high impact on power quality, especially when applied to complex networks requiring adaptive decision-making.
4. Adaptive Load Balancing
Adaptive load balancing is critical for balancing load across multiple levels of the subway network. Its implementation is high, especially in large systems where dynamic load adjustment is needed. The key issue in this approach is the requirement for real-time data processing and accurate system monitoring. The research potential for adaptive load balancing is high, with very high prospects for improving system efficiency. While it has high costs and maintenance needs, it offers a very high impact on power quality by ensuring that the network can dynamically adjust to load fluctuations. This makes adaptive load balancing a vital tool for maintaining power stability in large subway networks.
5. Data Integration Systems
Data integration systems are essential for integrating data across multiple layers of the subway network, particularly for decision-making processes. These systems are moderately implemented, with ongoing challenges in ensuring data synchronization across different layers. The research potential remains high, with a significant focus on overcoming data consistency and synchronization issues. Data integration systems contribute highly to the reliability and efficiency of subway power systems, although their cost and maintenance requirements are also high. Despite these challenges, the systems have a very high impact on power quality, making them a key area for continued development.
6. Predictive Recovery Algorithms
Predictive recovery algorithms are in the experimental stage and focus on anticipating faults and recovery actions. These systems are particularly useful in providing proactive solutions to prevent faults from escalating. However, they face challenges related to the accuracy of fault models, which can limit their effectiveness. The research potential is moderate, as this area is still evolving, and real-world applications are limited. While the costs for implementing predictive recovery algorithms are moderate, their maintenance needs are high due to the complexity of the algorithms. Despite the challenges, these systems offer a very high impact on power quality by enhancing the system’s ability to preemptively address potential disruptions.
Overall, the cross-layer fault recovery techniques ensure that faults are managed seamlessly across different levels of the subway power network. By using hierarchical recovery systems, coordinating fault isolation, and leveraging MASs for multi-layer coordination, these techniques help improve overall system resilience. The integration of real-time data and predictive algorithms further enhances the effectiveness of these systems. However, issues related to data synchronization, communication, and system complexity remain challenges to be addressed.
The cross-layer fault recovery mechanisms outlined in Table 17 (hierarchical recovery, cross-layer isolation, MASs for multi-layer coordination, adaptive load balancing, data integration systems, and predictive recovery algorithms) are all integral to improving the robustness and resilience of subway power systems. Each mechanism offers distinct advantages in terms of fault isolation, dynamic recovery, and decision-making, although they also present various challenges related to system complexity, data synchronization, and real-time operation.
Among these, hierarchical recovery and MASs for multi-layer coordination show the most promise for long-term development, as they provide coordinated and adaptive solutions for large-scale systems. However, issues related to data consistency, communication delays, and high implementation costs remain significant barriers that need to be addressed. Predictive recovery algorithms, while still experimental, represent a transformative approach to proactive fault management and could become crucial as AI and machine learning technologies advance.
Ultimately, the integration of these mechanisms into subway power systems will enhance operational efficiency, reduce downtime, and improve overall power quality, making them indispensable for the future of smart and resilient subway networks.
5.4. AI-Driven Fault Diagnosis and Recovery in Complex Scenarios
AI is playing an increasingly critical role in the diagnosis and recovery of faults in complex subway power systems. Machine learning, deep learning, and predictive analytics allow for faster and more accurate identification of faults, even in challenging scenarios where traditional methods may struggle.
Scalability remains a significant challenge when applying AI-based fault diagnosis to large-scale systems. The number of sensors, data points, and computational requirements for deep learning models increases as the subway network expands. Therefore, a cloud-based approach or distributed AI systems may be necessary to handle the data processing demands of larger systems. These AI models must be continuously updated with real-time operational data to ensure their accuracy and adaptive capabilities in dynamic environments [121,122,123].
The cost of AI-driven systems can be high, particularly in terms of computational resources and data storage requirements. However, the implementation of AI-based systems can significantly reduce manual intervention and improve operational efficiency, leading to long-term savings. Additionally, AI-based reconfiguration systems can reduce downtime and optimize network recovery, providing a high return on investment over time.
5.4.1. Machine Learning for Fault Prediction
Machine learning algorithms can be trained using historical fault data to predict potential future faults based on patterns and trends [124,125,126]. By analyzing large volumes of data, these algorithms can identify early warning signs of potential faults before they occur. This proactive approach allows for better planning and mitigation strategies, reducing the overall impact of faults.
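To make the idea of learning from historical fault data concrete, the sketch below fits a nearest-centroid classifier, one of the simplest supervised methods, in pure Python. It is a stand-in chosen for brevity and is not the method of any cited study. The two features (equipment temperature in degrees Celsius and load ratio) and every sample value are synthetic and serve only to illustrate the workflow.

```python
import math
from typing import Dict, List, Sequence, Tuple

Sample = Tuple[Sequence[float], str]


def fit_centroids(samples: List[Sample]) -> Dict[str, List[float]]:
    """Average the feature vectors of each class."""
    sums: Dict[str, List[float]] = {}
    counts: Dict[str, int] = {}
    for features, label in samples:
        if label not in sums:
            sums[label] = [0.0] * len(features)
            counts[label] = 0
        sums[label] = [s + x for s, x in zip(sums[label], features)]
        counts[label] += 1
    return {lab: [s / counts[lab] for s in vec] for lab, vec in sums.items()}


def predict(centroids: Dict[str, List[float]],
            features: Sequence[float]) -> str:
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda lab: math.dist(centroids[lab], features))


# Synthetic history: (temperature in deg C, load ratio) -> outcome label.
history: List[Sample] = [
    ((55.0, 0.50), "normal"),
    ((60.0, 0.60), "normal"),
    ((58.0, 0.55), "normal"),
    ((85.0, 0.90), "pre_fault"),
    ((90.0, 0.95), "pre_fault"),
    ((88.0, 0.92), "pre_fault"),
]
model = fit_centroids(history)
print(predict(model, (86.0, 0.90)))  # pre_fault
print(predict(model, (59.0, 0.50)))  # normal
```

Because the features are not scaled, temperature dominates the distance here; a practical model would normalize the inputs and be validated on held-out fault records before any early-warning use.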
5.4.2. Deep Learning for Real-Time Fault Diagnosis
Deep learning models, which are a subset of machine learning, can be particularly useful for real-time fault diagnosis [127,128]. These models analyze data from multiple sensors and sources, identifying complex patterns that may indicate a fault. The advantage of deep learning is its ability to process large amounts of data and learn from it, enabling quicker diagnosis and recovery times.
5.4.3. AI for Automated Reconfiguration
AI can also assist in the automated reconfiguration of subway power systems after a fault has been isolated [129]. By considering a range of factors such as network load, fault severity, and environmental conditions, AI can determine the most effective configuration for restoring power. This dynamic reconfiguration is crucial in ensuring that service is restored as quickly as possible without overloading other parts of the network.
Based on this, Table 18 presents an analysis of several AI-driven technologies that are applied to fault diagnosis and automated reconfiguration in subway power systems. The technologies highlighted include machine learning algorithms, deep learning models, AI-based reconfiguration, predictive analytics, fault pattern recognition, and automated diagnostic systems. These systems aim to predict faults, diagnose issues in real time, and optimize recovery strategies to ensure the rapid restoration of power in the event of system disruptions.
Table 18.
AI-driven fault diagnosis and reconfiguration in subway power systems.
A detailed summary and evaluation of Table 18 follows.
1. Machine Learning Algorithms
Machine learning algorithms are widely implemented in modern systems to predict faults before they occur [130]. The key issue with this technology is ensuring data accuracy, as accurate data input is critical for the effective prediction of faults. The future prospects for machine learning in fault prediction are very high due to their ability to enhance real-time diagnostics and preemptively manage faults. These algorithms have a very high reliability impact as they improve the system’s overall resilience. Although the research potential remains high, the associated costs are moderate, with medium maintenance requirements. Their impact on power quality is also very high due to the proactive nature of fault management and early intervention.
2. Deep Learning Models
Deep learning models are increasingly used for real-time fault diagnosis, with moderate implementation in subway systems [127]. These models require large datasets to effectively identify faults, which poses challenges in data consistency and accuracy. Despite these challenges, deep learning models have very high future prospects, as they can learn from vast amounts of historical data to provide highly accurate diagnostics. The reliability impact is very high, and the research potential remains significant, especially with advancements in neural networks and computational capabilities. However, deep learning models require high computational power, making them costly and high-maintenance. Despite these requirements, the impact on power quality is very high due to their accuracy and real-time capabilities.
3. AI-based Reconfiguration
AI-based reconfiguration focuses on optimizing network recovery after a fault is isolated. While the technology is still evolving, it holds very high future prospects due to its potential to improve recovery times and efficiency. One of the primary challenges with AI-based reconfiguration is the need for fast computation to optimize decisions in real time, particularly in large subway systems. The research potential remains high, and the technology’s reliability impact is also significant. The cost considerations are high, as the system requires substantial computational resources and integration with existing infrastructure. Despite the costs, the impact on power quality is very high due to its ability to dynamically restore power to affected parts of the network.
4. Predictive Analytics
Predictive analytics is widely used in advanced grids to forecast network conditions and potential faults. The technique is already highly implemented and has very high future prospects, given its ability to anticipate issues before they occur. The major challenge lies in algorithm complexity, as the models must process and analyze large volumes of real-time data. The research potential is high, and the impact on reliability is significant because power system management can be optimized before faults emerge. Costs remain high and maintenance requirements are medium, while the impact on power quality is very high, especially in maintaining system stability.
5. Fault Pattern Recognition
Fault pattern recognition identifies complex fault scenarios from recurring signatures in measured data. It is highly implemented in AI-driven systems, especially for scenarios where faults may be difficult to detect manually. The key challenge is the need for large amounts of training data to identify fault patterns accurately. The future prospects are very high, and the reliability impact is also substantial, as the technique enhances fault detection accuracy. The research potential is high, although the technology requires significant investment in training data and integration effort. Despite these challenges, the impact on power quality is very high, as it can significantly reduce downtime and system disruptions.
6. Automated Diagnostic Systems
Automated diagnostic systems perform fault detection and troubleshooting without manual intervention, and their implementation is moderate [131,132]. The technology is relatively new in subway systems, which can lead to integration complexity. Despite this challenge, automated diagnostic systems offer high research potential due to their ability to quickly identify and address faults. Integration and maintenance costs are medium to high, but the technology significantly improves the system’s ability to respond to faults efficiently. The impact on power quality is very high, as automated diagnostics help maintain continuous operation and prevent larger system failures.
As summarized above, AI-driven fault diagnosis and recovery technologies are central to enhancing the resilience, efficiency, and quality of subway power systems. Each of the technologies listed in Table 18—machine learning algorithms, deep learning models, AI-based reconfiguration, predictive analytics, fault pattern recognition, and automated diagnostic systems—contributes uniquely to fault management and network recovery. While they all present some challenges related to data accuracy, integration complexity, and computational demands, their future prospects remain very high.
Machine learning and deep learning technologies, in particular, have the potential to revolutionize fault prediction and diagnosis, providing advanced solutions to issues related to system reliability. AI-based reconfiguration and predictive analytics are poised to improve recovery times and power system optimization, ensuring that power disruptions are minimized. Fault pattern recognition and automated diagnostic systems are essential for improving fault detection and reducing downtime.
Overall, the continued development and integration of these technologies are crucial for the advancement of modern subway power networks. However, issues related to data synchronization, computational requirements, and system integration must be addressed to fully realize their potential. Taken together, the subsections of Section 5 provide a comprehensive overview of the practical applications of self-healing technologies in subway power systems, focusing on fault recovery, network reconfiguration, and the integration of advanced AI and MASs for improved system resilience and efficiency. Applying these techniques enhances the reliability and sustainability of urban transportation networks and helps maintain service in the event of faults.
6. Implications of AI Technologies for Future Subway Power Systems
In this section, we examine the wide-ranging implications of emerging AI technologies for the future development of subway power systems. Building on previous discussions of self-healing architectures, multi-agent frameworks, and IEC 61850-based communication protocols, we focus here on how AI, particularly machine learning (ML), deep learning (DL), and reinforcement learning (RL), can transform operational strategies, maintenance paradigms, data governance, and broader socio-economic outcomes in the context of subway power supply. Specifically, we present five key subtopics that capture the multifaceted opportunities and challenges posed by AI in this domain.
First, Section 6.1 explores AI-enhanced fault diagnosis and prognostics, illustrating how cutting-edge data analytics can enable predictive maintenance and near-real-time failure detection. We present a distinct approach to predictive maintenance by incorporating AI methodologies that combine historical data with real-time operational signals to improve failure prediction accuracy in complex subway power systems. This expands upon traditional techniques by incorporating newer AI-based predictive models that integrate physics-driven insights and data fusion for improved fault detection and prognosis, as seen in recent applications within smart grid and industrial automation systems. Next, Section 6.2 investigates reinforcement learning and decision-making in self-healing processes, illustrating how adaptive algorithms can optimize power restoration and system resilience. Section 6.3 addresses the integration of AI and MASs under IEC 61850, highlighting the ways in which standardized communication protocols can be leveraged to support distributed intelligence and collective fault management. Moving beyond the purely technical dimensions, Section 6.4 focuses on cybersecurity, privacy, and data management for AI-driven subway power systems, considering how the influx of real-time data demands novel governance frameworks. Finally, Section 6.5 concludes with next-generation operational strategies and socio-economic implications, discussing how AI-enabled subway power systems may reshape workforce development, economic modeling, and stakeholder engagement.
These five sections provide a comprehensive perspective on how AI innovations are expected to evolve and influence subway power systems. Through the inclusion of novel approaches, particularly in AI-driven decision-making processes for self-healing, we introduce emerging techniques and applications that have not been extensively explored in prior studies. These techniques, including hybrid machine learning methods and real-time adaptive models, are at the forefront of advancing subway power infrastructure, offering new contributions to the field and distinguishing this review from earlier works.
6.1. AI-Enhanced Fault Diagnosis and Prognostics
One of the most significant applications of artificial intelligence in subway power systems is the enhancement of fault diagnosis and prognostics. Traditional fault detection methods—often reliant on predefined thresholds and manual inspections—are increasingly inadequate in the face of complex, dynamic operating conditions. AI-based approaches, by contrast, can leverage advanced data analytics to identify subtle patterns in real-time signals, historical maintenance logs, and contextual environmental data. The implications for subway power systems are profound: improved detection speed, greater accuracy, reduced downtime, and the ability to predict failures before they occur.
AI-based fault diagnosis now incorporates advanced methodologies that allow for dynamic analysis of both sensor data and environmental context. A shift from traditional rule-based models to AI-enhanced diagnostics involves the fusion of deep learning and machine learning techniques with domain-specific knowledge. This approach, particularly when applied to hybrid AI-physics models, helps identify fault signatures that might otherwise remain undetected using conventional methods. Recent research has shown the successful application of such hybrid models to predictive maintenance in industrial and energy systems, which suggests that they could also enhance the robustness of subway power systems.
6.1.1. Transition from Reactive to Predictive Maintenance
AI-empowered strategies for fault diagnosis enable a shift from reactive to predictive maintenance paradigms. Historically, failures in power components such as transformers, switchgear, and traction substations could lead to severe operational disruptions, significant repair costs, and compromised passenger safety. Predictive maintenance uses AI to process multi-source data (e.g., sensor readings, operational logs, images from thermal cameras, vibrations, and acoustic signals) and identify early warning signs.
- (1) Machine Learning Models: Supervised learning algorithms (e.g., Support Vector Machines and Random Forests) can detect anomalies in high-dimensional data, building predictive models that correlate subtle parameter shifts, such as partial discharges or fluctuating voltage profiles, to impending failures.
- (2) Deep Learning Architectures: Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) offer enhanced feature extraction and sequence analysis, making it possible to detect complex non-linear relationships in sensor streams, identify irregularities in real time, and precisely localize potential faults within specific subsystems.
By incorporating these AI methods, subway operators can anticipate maintenance needs and schedule repairs in a manner that minimizes disruption. As a result, subway power systems can achieve higher reliability while overall operational expenditures are significantly reduced.
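The idea of spotting early warning signs in a sensor stream can be illustrated with a deliberately simple statistical baseline. The sketch below uses a rolling z-score, not one of the supervised or deep learning models named above; a trained classifier would take the place of the threshold test. The window size, threshold, and temperature values are illustrative assumptions.

```python
from collections import deque
from statistics import mean, stdev


def rolling_zscore_alerts(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent baseline.

    Each reading is compared with the mean and standard deviation of the
    preceding `window` accepted readings (window must be at least 2).
    Flagged readings are kept out of the baseline so that a developing
    fault does not mask itself.
    """
    baseline = deque(maxlen=window)
    for i, value in enumerate(readings):
        if len(baseline) == window:
            mu, sigma = mean(baseline), stdev(baseline)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                yield i, value
                continue
        baseline.append(value)


# Hypothetical transformer winding temperatures in degrees Celsius.
temps = [60.1, 60.3, 59.9, 60.2, 60.0, 60.1, 60.2, 66.5, 60.0]
print(list(rolling_zscore_alerts(temps)))
```

In this example only the abrupt rise at index 7 is flagged; slow drifts would require a longer window or a model that uses additional features.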
6.1.2. Real-Time Monitoring and Edge Analytics
Real-time monitoring is essential for the effective operation of subway power systems, given their mission-critical nature. Deploying AI-driven edge analytics in local controllers allows for immediate processing of data without the latency introduced by centralized cloud computing [133]. This decentralized approach, similar to advancements in the industrial IoT sector, improves fault response times by processing sensor data locally [134]. Both studies emphasize integrating real-time monitoring with AI-based edge analytics to enhance the performance and reliability of critical infrastructure, including subway power systems. Running deep learning models at the edge also allows fault detection to continue during communication breakdowns, which reduces system vulnerabilities and keeps critical components functional despite disruptions.
In practice, an edge device near a substation might run a compact deep learning model trained to recognize specific fault signatures (e.g., harmonic distortions indicative of insulation breakdown). Upon detecting an anomaly, it could initiate an automated diagnostic routine or communicate with a higher-level control center for further analysis. Such distributed intelligence aligns closely with the principles of self-healing networks and multi-agent systems, wherein localized decisions can contain or mitigate faults quickly.
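A minimal sketch of such an edge-side check follows. Instead of a compact deep learning model, it computes total harmonic distortion (THD) with a direct discrete Fourier transform and escalates when a limit is exceeded. The 8% limit, the function names, and the assumption that the sample window spans exactly one fundamental period are all illustrative.

```python
import cmath
import math


def harmonic_magnitudes(samples, max_harmonic=7):
    """Magnitudes of harmonics 1..max_harmonic via a direct DFT.

    Assumes `samples` covers exactly one fundamental period.
    """
    n = len(samples)
    mags = []
    for k in range(1, max_harmonic + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(samples))
        mags.append(2 * abs(coeff) / n)
    return mags


def total_harmonic_distortion(samples, max_harmonic=7):
    """THD: RMS of harmonics 2..max_harmonic relative to the fundamental."""
    mags = harmonic_magnitudes(samples, max_harmonic)
    return math.sqrt(sum(m * m for m in mags[1:])) / mags[0]


def edge_decision(samples, thd_limit=0.08):
    """Escalate to the control center when THD exceeds a hypothetical limit."""
    thd = total_harmonic_distortion(samples)
    return ("escalate" if thd > thd_limit else "normal"), thd


n = 64
clean = [math.sin(2 * math.pi * i / n) for i in range(n)]
# Same waveform with a 20% third harmonic superimposed.
distorted = [c + 0.2 * math.sin(3 * 2 * math.pi * i / n)
             for i, c in enumerate(clean)]
print(edge_decision(clean)[0], edge_decision(distorted)[0])
```

Because the computation needs only the local sample window, the decision can be taken on the device itself and reported upstream afterwards.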
6.1.3. Challenges and Future Directions
While AI-enhanced fault diagnosis offers considerable potential, several challenges remain. One key obstacle is the quality and integrity of data, which are essential for generating accurate predictions. Recent advancements in AI, such as the application of generative models for data augmentation, have been proposed to address the challenge of limited datasets, especially for rare fault scenarios. These techniques, which have proven successful in industries such as healthcare and autonomous vehicles, could greatly enhance the robustness of AI models used for fault prediction in subway power systems. Furthermore, the interpretability of deep learning models continues to be a concern. Research is actively addressing this issue by developing explainable AI techniques that provide insight into the decision-making process, fostering operator trust and ensuring model transparency. Looking ahead, addressing these challenges will involve the following:
- (1) Data Fusion: Combining structured data (e.g., SCADA logs) with unstructured data (e.g., images and audio signals) for more holistic fault models.
- (2) Transfer Learning: Leveraging knowledge from related domains (e.g., electrified railways) to build robust classifiers even when local fault data are scarce.
- (3) Explainable AI: Integrating interpretable model architectures or post hoc interpretability frameworks to ensure that operators understand and trust AI-driven fault detection decisions [135,136].
Table 19 highlights various fault diagnosis and prognostics scenarios in subway power systems, mapping each to its implementation stage, unique features, and prospective impact on overall system performance. This table demonstrates that AI-based methods yield significant improvements in fault isolation speeds and predictive maintenance accuracy, while persistent challenges remain in addressing data scarcity and system complexity. Notably, bridging these gaps requires both domain-oriented strategies—like hybrid physics-data approaches—and advanced AI methods such as transfer learning, federated learning, and explainable AI. Overall, AI-enhanced fault diagnosis and prognostics offer a transformative path forward, enabling subway power systems to move beyond reactive maintenance and toward a resilient, future-ready operational model. As data quality, model interpretability, and real-time decision-making capabilities improve, these techniques will become integral to enhancing reliability, lowering operational costs, and ensuring passenger safety.
Table 19. Key dimensions of AI-enhanced fault diagnosis and prognostics in subway power systems.
6.2. Reinforcement Learning and Decision-Making in Self-Healing Processes
Reinforcement learning (RL) presents a compelling framework for dynamic decision-making in self-healing subway power systems. Unlike traditional control methods that rely on static, rule-based logic, RL algorithms learn optimal policies through continuous interactions with the environment. This approach holds immense promise for managing complex power distribution networks, where real-time reconfiguration, load balancing, and fault recovery must be orchestrated under varying operational constraints.
Recent advancements in deep reinforcement learning (DRL), particularly in the form of multi-agent RL frameworks, have demonstrated significant promise for complex infrastructure management. These frameworks enable distributed decision-making among multiple agents operating in parallel, each responsible for managing specific subsystems. This approach is well suited for subway power systems, where different agents can independently optimize substation operations, power rerouting, and fault recovery while ensuring overall system stability. These developments represent a significant evolution from traditional RL approaches and align closely with emerging smart grid technologies.
6.2.1. Core Principles of Reinforcement Learning
RL algorithms are typically defined by the concepts of agents, environments, states, actions, and rewards. In a subway power system context, these concepts map as follows:
- (1) Agent: The AI controller (or controllers) responsible for adjusting switch positions, redirecting power flows, or prioritizing fault isolation.
- (2) Environment: The subway power system, inclusive of its multi-bus topology, traction substations, feeders, and protective devices.
- (3) State: The real-time status of the system, including voltage levels, load demands, equipment health indicators, and fault locations.
- (4) Action: Any operational command the RL agent can perform, such as opening or closing circuit breakers, adjusting converter setpoints, or initiating fault isolation protocols.
- (5) Reward: A numerical signal representing the quality of each action, often linked to performance metrics like minimized outage duration, voltage stability, or reduced energy losses.
Through iterative exploration and exploitation, RL algorithms can discover dynamic strategies for reconfiguring the power system in response to faults or varying load conditions, thereby supporting self-healing functionality.
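The agent, environment, state, action, and reward concepts can be made concrete with a toy tabular Q-learning example. Everything here is a simplifying assumption: the two-state fault model, the three actions, the reward values, and the hyperparameters are hypothetical and far smaller than any real subway power network.

```python
import random

ACTIONS = ["open_breaker", "close_tie", "wait"]
STATES = ("fault", "isolated")


def step(state, action):
    """Toy environment: returns (next_state, reward, done)."""
    if state == "fault":
        if action == "open_breaker":
            return "isolated", -1.0, False
        if action == "close_tie":
            return "fault", -10.0, False   # closing onto a fault is penalized
        return "fault", -5.0, False        # outage continues
    # state == "isolated"
    if action == "close_tie":
        return "restored", 10.0, True      # healthy section re-energized
    if action == "open_breaker":
        return "isolated", -1.0, False
    return "isolated", -5.0, False


def train(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    for _ in range(episodes):
        state = "fault"
        for _ in range(20):  # cap on episode length
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)                      # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action)
            future = 0.0 if done else max(q[(nxt, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (reward + gamma * future
                                           - q[(state, action)])
            state = nxt
            if done:
                break
    return q


def greedy_policy(q):
    """Best known action for each non-terminal state."""
    return {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in STATES}


print(greedy_policy(train()))
```

With these rewards the learned policy isolates the fault first and only then closes the tie switch, which is the ordering a self-healing scheme requires.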
6.2.2. Adaptive Self-Healing Under Uncertainty
One of the critical challenges in self-healing networks is dealing with uncertainty, both in terms of partial system observability and rapidly changing demand patterns. RL is well suited to these conditions, as it can learn to balance exploration (trying new configurations) with exploitation (using known successful actions). Relevant method families include the following:
- (1) Deep Q-Networks (DQN): These incorporate neural networks to approximate the action-value function, enabling RL agents to handle high-dimensional state spaces [137,138].
- (2) Policy Gradient Methods: Algorithms like Proximal Policy Optimization (PPO) or Advantage Actor-Critic (A2C) learn continuous control policies, facilitating more nuanced actions such as incremental power flow adjustments.
- (3) Model-Based RL: By integrating predictive models of system behavior (e.g., power flow equations describing the network), RL agents can plan ahead, simulating potential outcomes of different actions before implementation.
This adaptability allows RL-driven self-healing systems to isolate faults more rapidly and re-route power with minimal operator intervention, enhancing overall reliability and resilience.
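The planning idea behind model-based approaches, simulating candidate actions before applying them, can be sketched without any learning at all. The example below performs an exhaustive two-step lookahead over a hypothetical load-transfer model; the feeder rating, the 200 kW transfer block, and the action names are assumptions, and a model-based RL agent would typically learn or refine such a model rather than have it hard-coded.

```python
from itertools import product

RATING_KW = 1000.0
ACTIONS = ("shift_to_f2", "shift_to_f1", "hold")


def model(state, action, block_kw=200.0):
    """Predict (next_state, reward) for a two-feeder system.

    state is (feeder1_load_kw, feeder2_load_kw); the reward is the negative
    total overload after the action.
    """
    f1, f2 = state
    if action == "shift_to_f2" and f1 >= block_kw:
        f1, f2 = f1 - block_kw, f2 + block_kw
    elif action == "shift_to_f1" and f2 >= block_kw:
        f1, f2 = f1 + block_kw, f2 - block_kw
    overload = max(0.0, f1 - RATING_KW) + max(0.0, f2 - RATING_KW)
    return (f1, f2), -overload


def plan(state, depth=2):
    """Simulate every action sequence of length `depth`; return the best."""
    best_seq, best_return = None, float("-inf")
    for seq in product(ACTIONS, repeat=depth):
        s, total = state, 0.0
        for a in seq:
            s, r = model(s, a)
            total += r
        if total > best_return:
            best_seq, best_return = seq, total
    return best_seq, best_return


print(plan((1300.0, 500.0)))
```

Exhaustive search is only feasible for very small action sets; larger systems would need sampling-based or learned planners.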
6.2.3. Multi-Agent Coordination
In the context of subway power systems, different functional areas—traction substations, feeders, and signaling equipment—can be managed by specialized RL agents. A multi-agent RL approach allows these subsystems to coordinate actions effectively. The integration of cooperative multi-agent RL models ensures that decisions made by individual agents are aligned with broader system-wide goals, such as minimizing power outages and maximizing operational efficiency. Recent progress in cooperative RL, particularly in optimizing energy distribution across urban power networks, shows promise for enhancing coordination among subway power system agents, ensuring both fault isolation and optimal power restoration.
6.2.4. Challenges in RL-Based Self-Healing
Despite its promise, RL faces notable hurdles in practical railway power scenarios:
- (1) Safety Constraints: Subways operate under strict safety regulations, so RL actions must never compromise passenger safety. Techniques like safe RL or reward shaping can incorporate safety margins.
- (2) Scalability: Large systems with dozens of substations and thousands of sensors result in vast state-action spaces, necessitating advanced function approximation and distributed training architectures.
- (3) Learning Speed: RL algorithms may require numerous interactions or simulated “episodes” to learn effective policies. Building high-fidelity digital twins for training is thus essential.
- (4) Generalization: Policies learned under certain load patterns or fault conditions may not generalize well to unseen scenarios, underscoring the need for robust domain adaptation and online learning strategies.
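One common way to enforce hard safety constraints, complementary to the reward shaping mentioned in item (1), is to filter the agent's action set before the greedy choice is made (often called action masking or shielding). The sketch below assumes hypothetical state fields, action names, and rules.

```python
def no_close_onto_fault(state, action):
    """Forbid closing a tie switch while the fault is not yet isolated."""
    return not (action == "close_tie" and not state["fault_isolated"])


def keep_minimum_supply(state, action):
    """Hypothetical rule: never open the last energized feeder."""
    return not (action == "open_feeder" and state["energized_feeders"] <= 1)


def safe_actions(state, actions, constraints):
    """Keep only actions that satisfy every safety constraint."""
    return [a for a in actions
            if all(check(state, a) for check in constraints)]


def select_action(state, q_values, constraints, fallback="hold"):
    """Pick the highest-valued action among those that pass all checks."""
    allowed = safe_actions(state, q_values.keys(), constraints)
    if not allowed:
        return fallback
    return max(allowed, key=lambda a: q_values[a])


state = {"fault_isolated": False, "energized_feeders": 1}
q_values = {"close_tie": 9.0, "open_feeder": 4.0, "isolate_fault": 3.0}
rules = [no_close_onto_fault, keep_minimum_supply]
print(select_action(state, q_values, rules))
```

The two highest-valued actions are rejected by the rules, so the agent falls back to the lower-valued but permissible one; the learned values never override the constraints.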
Table 20 outlines the primary RL approaches, benefits, and challenges for self-healing processes in subway power systems. While feeder reconfiguration and load balancing show high promise, scaling these solutions requires addressing safety, computational complexity, and real-time adaptability. Cooperative multi-agent RL emerges as a powerful paradigm, but it demands seamless data exchange and robust coordination protocols. Each scenario also highlights the necessity of sophisticated simulation tools (digital twins) for training RL agents under realistic conditions.
Table 20. Core aspects of RL-driven self-healing in subway power systems.
In conclusion, RL offers a dynamic, adaptive framework for enhancing self-healing capabilities in subway power systems. By learning from experience, RL algorithms can optimize fault isolation and power restoration in uncertain and evolving conditions. Although technical challenges—particularly in the realms of safety, scalability, and domain adaptation—remain, continued research and pilot implementations will refine RL-driven approaches, ushering in a new era of intelligent, resilient, and self-reconfiguring subway power infrastructures.
6.3. Integration of AI and Multi-Agent Systems Under IEC 61850
The third focal area explores how AI methods, when paired with MASs and standardized by IEC 61850 protocols, can unlock advanced self-healing, interoperability, and collaborative decision-making in future subway power systems. While MAS architectures enable the distribution of tasks and knowledge among specialized agents, IEC 61850 provides a common language for data exchange. AI techniques further enrich these frameworks by injecting predictive capabilities, adaptive optimization, and real-time learning.
Recent work building on IEC 61850 has focused on integrating AI-driven functions to improve real-time fault management, for example through enhanced predictive maintenance capabilities and adaptive load balancing. Furthermore, applying AI-enhanced decision-making algorithms within these MAS frameworks can make fault recovery processes both faster and more reliable. The increasing integration of AI and MASs with IEC 61850 is driving the development of self-healing networks that rely on adaptive, data-driven approaches to enhance system reliability.
6.3.1. Roles of MASs and IEC 61850 in Subway Power Systems
MASs are collections of semi-autonomous entities—“agents”—capable of independent decision-making. In a subway environment, each agent might represent a specific subsystem or physical asset (e.g., a traction transformer, a protection relay, or a signaling interface). The MAS approach decentralizes control, enhancing flexibility, scalability, and fault tolerance.
IEC 61850, originally developed for substation automation in utility power grids, offers a rich object-oriented data model and standardized communication services (GOOSE, MMS, etc.) [139,140]. As subway power systems become more sophisticated, adopting IEC 61850 ensures consistent naming conventions, standardized data structures, and event-driven messaging. Consequently, agents in an MAS architecture can seamlessly exchange data, coordinate actions, and maintain a shared situational awareness.
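To illustrate how consistent naming helps agents exchange data, the sketch below splits an IEC 61850 style object reference of the form LogicalDevice/LogicalNode.DataObject.DataAttribute into its parts. The device name `TSS1LD0` is hypothetical, and the parser is a simplification: it does not validate names against the standard and does not distinguish sub-data objects from attributes.

```python
from typing import NamedTuple


class ObjectRef(NamedTuple):
    logical_device: str
    logical_node: str
    data_object: str
    data_attribute: str


def parse_object_reference(ref: str) -> ObjectRef:
    """Split 'LD/LN.DO.DA' into named parts.

    Anything after the data object is kept as one dotted string.
    """
    ld, sep, rest = ref.partition("/")
    parts = rest.split(".")
    if not sep or not ld or len(parts) < 3 or not all(parts):
        raise ValueError(f"unexpected object reference: {ref!r}")
    return ObjectRef(ld, parts[0], parts[1], ".".join(parts[2:]))


# XCBR is the circuit breaker logical node class; Pos.stVal is its position status.
ref = parse_object_reference("TSS1LD0/XCBR1.Pos.stVal")
print(ref.logical_node, ref.data_object, ref.data_attribute)
```

An agent that receives such a reference can route the value to the right asset model without vendor-specific mapping tables.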
6.3.2. AI-Driven Coordination and Decision-Making
AI plays a pivotal role in orchestrating multi-agent cooperation under IEC 61850. By analyzing system-wide telemetry and status signals, AI algorithms can identify potential conflicts or synergies among agents. For instance, a traction substation agent anticipating overload conditions might request a neighboring substation agent to reroute power flows. AI-driven coordination ensures that these negotiations happen quickly and optimally, considering constraints like safety margins, service priorities, or economic dispatch rules.
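The overload example above can be sketched as a simple request and grant exchange between agents. This is not a full negotiation protocol: the sequential polling order, the 100 kW reserve, and the capacities are illustrative assumptions, and message transport (for instance over IEC 61850 services) is omitted.

```python
from dataclasses import dataclass


@dataclass
class SubstationAgent:
    name: str
    capacity_kw: float
    load_kw: float

    @property
    def headroom_kw(self) -> float:
        return max(0.0, self.capacity_kw - self.load_kw)

    def handle_request(self, requested_kw: float,
                       reserve_kw: float = 100.0) -> float:
        """Accept as much of the request as fits while keeping a reserve."""
        granted = max(0.0, min(requested_kw, self.headroom_kw - reserve_kw))
        self.load_kw += granted
        return granted


def relieve_overload(agent, neighbors, forecast_kw):
    """Ask neighbors in turn to absorb the forecast excess; return unmet kW."""
    excess = max(0.0, forecast_kw - agent.capacity_kw)
    for nb in neighbors:
        if excess <= 0:
            break
        excess -= nb.handle_request(excess)
    return excess


a = SubstationAgent("TSS-A", capacity_kw=2000.0, load_kw=1800.0)
b = SubstationAgent("TSS-B", capacity_kw=2000.0, load_kw=1750.0)
c = SubstationAgent("TSS-C", capacity_kw=2500.0, load_kw=1500.0)
unmet = relieve_overload(a, [b, c], forecast_kw=2300.0)
print(unmet, b.load_kw, c.load_kw)
```

Each neighbor decides locally how much it can accept, so no central controller needs a complete view of every substation.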
Moreover, advanced AI algorithms—such as Graph Neural Networks [141] or complex optimization solvers—can leverage the structured data from IEC 61850 to model the relationships among power system components. Agents can then perform local computations while concurrently feeding results into a global optimization layer, leading to emergent, system-wide intelligence.
6.3.3. Interoperability and Standardization Benefits
A key advantage of integrating AI with MASs under IEC 61850 is interoperability. Legacy subway systems often use proprietary communication protocols, leading to device incompatibility and vendor lock-in. IEC 61850 breaks down these barriers, enabling cross-vendor and cross-application integration. From the perspective of AI, having a standardized data schema enhances the portability and scalability of models, as consistent data structures streamline data ingestion and model training processes.
As a result, operators can incorporate new AI-driven applications—like advanced fault analytics or dynamic reconfiguration—without overhauling existing hardware. The MAS approach further compartmentalizes tasks, so if a new AI module is added, only the relevant agents need updating, preserving overall system stability.
6.3.4. Potential Obstacles and Evolution
The integration of AI and MASs under IEC 61850 faces several potential obstacles:
- (1) Communication Latency: While IEC 61850 supports high-speed messaging, real-time AI inference may still demand edge computing infrastructure to avoid round-trip delays.
- (2) Cybersecurity: The standard’s emphasis on connectivity raises cybersecurity concerns. AI modules and MAS agents could become targets of sophisticated cyberattacks, necessitating robust encryption, authentication, and intrusion detection schemes.
- (3) Complexity of Agent Interactions: As the number of agents grows, orchestrating their interactions can become unwieldy. AI-based supervision layers must handle negotiation protocols, conflict resolution, and consistency checks.
- (4) Operational Validation: Formal verification and testing of AI-driven MAS solutions remain challenging, given the high-stakes nature of subway operations.
In the future, ongoing standardization efforts for advanced functionalities—like the IEC 61850 extensions for distributed energy resources—could incorporate guidelines specific to AI integration. Initiatives aimed at real-time digital twins and 5G communication might also enhance the synergy between MASs and IEC 61850, broadening the horizon for intelligent subway power systems. Based on this, Table 21 enumerates key dimensions of integrating AI, MASs, and IEC 61850. Each row highlights how agent-based architectures benefit from standardized communication and AI-driven analytics, describing both advantages (e.g., swift fault isolation and improved energy routing) and obstacles (e.g., vendor constraints and cybersecurity risks). Notably, the long-term potential for each dimension tends to be high or very high, underscoring the transformative power of this integration.
Table 21. Implications of integrating AI, MASs, and IEC 61850 in subway power systems.
In closing, merging AI with MASs under the IEC 61850 umbrella paves the way for a more coordinated, interoperable, and adaptive subway power ecosystem. By distributing intelligence across multiple agents, harnessing standard communication protocols, and leveraging AI for data-driven decisions, subway systems can evolve toward more agile and resilient operations. Future developments will likely emphasize refined cybersecurity policies, real-time digital twins, and advanced scheduling algorithms, collectively forming the backbone of next-generation intelligent rail networks.
6.4. Cybersecurity, Privacy, and Data Management for AI-Driven Subway Power Systems
With the integration of AI, IoT devices, and multi-agent architectures, the volume and sensitivity of data in subway power systems are growing at an unprecedented rate. While these data streams fuel advanced analytics and self-healing capabilities, they also introduce heightened risks related to cybersecurity, data privacy, and governance. A robust data management framework is thus imperative to ensure reliable operations and maintain public trust.
To safeguard AI-driven systems, technologies such as blockchain for secure data logging and federated learning for data privacy are being explored. These technologies, proven in fields such as digital finance and healthcare, allow for secure data sharing and model training while preserving data privacy. Blockchain provides a transparent, append-only ledger that makes tampering with operational data evident, which is particularly valuable in the event of cyberattacks. Additionally, federated learning enables AI models to be trained locally, reducing data exposure risks and allowing for privacy-preserving data analytics, which is critical for managing sensitive infrastructure data.
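The hash-linking idea that underlies a blockchain ledger can be shown in a few lines. The sketch below is only a local hash chain: it has no consensus mechanism, replication, or signatures, so it demonstrates tamper evidence rather than a complete blockchain. The record fields are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64


def _digest(prev_hash, payload):
    """SHA-256 over the previous hash and the payload in canonical JSON."""
    body = json.dumps({"prev": prev_hash, "payload": payload}, sort_keys=True)
    return hashlib.sha256(body.encode()).hexdigest()


def append_entry(chain, payload):
    """Append a record whose hash covers the previous record's hash."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"prev": prev_hash, "payload": payload,
                  "hash": _digest(prev_hash, payload)})
    return chain


def verify_chain(chain):
    """Recompute every hash; any edited or reordered record breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if (entry["prev"] != prev_hash
                or entry["hash"] != _digest(prev_hash, entry["payload"])):
            return False
        prev_hash = entry["hash"]
    return True


log = []
append_entry(log, {"device": "XCBR1", "event": "trip", "t": 1})
append_entry(log, {"device": "XCBR1", "event": "open", "t": 2})
append_entry(log, {"device": "TIE3", "event": "close", "t": 3})
print(verify_chain(log))
log[1]["payload"]["event"] = "closed"   # simulated tampering
print(verify_chain(log))
```

Because each hash depends on all earlier records, altering one operational event invalidates every later entry, which is what makes after-the-fact manipulation detectable.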
6.4.1. Cyber Threat Landscape
AI-driven subway power systems, connected through IEC 61850 or other communication protocols, present an attractive target for cybercriminals or malicious state actors. Potential attacks include the following:
- (1) Data Poisoning: Manipulating training datasets to degrade AI model performance or trigger erroneous system decisions.
- (2) Ransomware: Encrypting critical operational data and demanding payment to restore access, thus threatening system continuity.
- (3) Denial of Service (DoS): Flooding communication channels with spurious traffic, hindering real-time control signals.
- (4) Sensor Spoofing: Feeding corrupted sensor data into AI models, leading to incorrect fault diagnoses or false alarms.
In the context of a critical infrastructure like a subway system, even minor disruptions can have severe social, economic, and safety repercussions. Consequently, cybersecurity must become an integral component of AI system design, not an afterthought.
6.4.2. Data Privacy and Ethics
Beyond technical vulnerabilities, subway operators and city authorities must navigate privacy concerns. AI systems might aggregate detailed operational data, video feeds, passenger flow metrics, or location patterns. While these data points are essential for predictive maintenance or advanced analytics, storing and analyzing them also raise ethical questions. For instance, if camera feeds are used to measure passenger loads to predict electrical demand, they might inadvertently collect personally identifying information. Ensuring compliance with data protection laws (such as the General Data Protection Regulation, GDPR, in the European context) is critical for maintaining public trust.
Operators should consider adopting privacy-preserving AI techniques, including differential privacy or secure multiparty computation, which allow for collaborative model training or data analysis without exposing sensitive details. Implementing role-based data access controls and robust de-identification procedures further reduces the risk of misuse or accidental disclosure.
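As an illustration of differential privacy, the sketch below applies the Laplace mechanism to a passenger count before it is released for demand forecasting. The privacy budget `epsilon`, the count, and the function names are illustrative assumptions; a sensitivity of 1 reflects that adding or removing one passenger changes the count by at most one.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, rng, sensitivity=1.0):
    """Release a count with epsilon-differential privacy (Laplace mechanism)."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    return true_count + laplace_noise(sensitivity / epsilon, rng)


rng = random.Random(42)
# Hypothetical platform count released with a privacy budget of 0.5.
print(round(private_count(500, epsilon=0.5, rng=rng), 1))
```

A smaller `epsilon` adds more noise and gives stronger privacy; because the noise has zero mean, aggregate load forecasts built from many such releases remain usable.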
6.4.3. Comprehensive Data Governance
Data governance encompasses the policies, processes, and technical measures necessary for responsible data handling. A holistic framework would include the following:
- (1) Data Ownership: Defining clear ownership structures for sensor data, operational logs, and passenger metrics, potentially involving multiple stakeholders (public transport authorities, private operators, and technology vendors).
- (2) Data Lifecycle Management: Establishing guidelines for data collection, storage, access, retention, and deletion, and ensuring that data archiving practices meet both regulatory requirements and system operational needs.
- (3) Metadata and Standardization: Maintaining standardized metadata to enhance data discoverability and interoperability, vital for multi-agent systems reliant on consistent data schemas.
- (4) Quality Assurance: Integrating data validation protocols and anomaly detection to safeguard against corrupted or incomplete data inputs that could compromise AI-driven decisions.
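The quality assurance item above can be illustrated with a small validation routine that screens a sensor reading before it reaches an AI model. This is only a sketch under stated assumptions: the `Reading` and `Limits` types, the check names, and the numeric limits in the usage note are hypothetical and would need to be set from the actual equipment ratings.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Reading:
    sensor_id: str
    value: float      # measured quantity, e.g. volts
    timestamp: float  # seconds since epoch


@dataclass(frozen=True)
class Limits:
    low: float        # smallest physically plausible value
    high: float       # largest physically plausible value
    max_step: float   # largest plausible change between consecutive samples
    max_age_s: float  # readings older than this are treated as stale


def validate(reading: Reading, previous: Optional[Reading],
             limits: Limits, now: float) -> List[str]:
    """Return a list of problems found; an empty list means the reading passed."""
    if not math.isfinite(reading.value):
        return ["non-finite value"]  # no further checks are meaningful
    problems: List[str] = []
    if not limits.low <= reading.value <= limits.high:
        problems.append("out of range")
    if now - reading.timestamp > limits.max_age_s:
        problems.append("stale")
    if previous is not None and abs(reading.value - previous.value) > limits.max_step:
        problems.append("implausible jump")
    return problems
```

For example, with assumed limits `Limits(low=0.0, high=2000.0, max_step=200.0, max_age_s=2.0)`, a reading that jumps from 1500 to 2500 between consecutive samples would be reported as both out of range and an implausible jump, and could be withheld from the diagnostic model.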
6.4.4. Strategies for Resilience and Compliance
To fortify cybersecurity and privacy in an AI-driven environment, operators must adopt a layered security approach, including encryption, anomaly detection, intrusion detection systems (IDSs), and zero-trust architectures. Specific measures may involve the following:
- (1) Secure AI Pipelines: Implementing code-signing, containerization, and version control to prevent tampering with ML models or inference services.
- (2) Federated Learning: Training AI models locally on devices or substations and then aggregating only model parameters. This approach minimizes data movement and reduces exposure risks.
- (3) Incident Response and Recovery: Developing well-rehearsed contingency plans that detail how to isolate compromised systems, restore operational data, and communicate effectively with stakeholders.
- (4) Certification and Audits: Conducting regular third-party audits and penetration testing to validate the integrity of software components and to ensure ongoing compliance with evolving regulatory standards.
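The aggregation step in the federated learning measure above can be sketched as a sample-weighted average of locally trained parameter vectors, in the style of federated averaging. The function name `federated_average` and the substation figures in the usage note are illustrative assumptions; a practical deployment would also need secure transport and protection against malicious updates, which this sketch omits.

```python
from typing import List, Sequence, Tuple


def federated_average(
    updates: Sequence[Tuple[Sequence[float], int]]
) -> List[float]:
    """Combine local parameter vectors, weighting each by its sample count.

    Each update is (parameters, n_samples) from one substation; only these
    parameters leave the site, never the raw measurements.
    """
    if not updates:
        raise ValueError("no updates to aggregate")
    total = sum(n for _, n in updates)
    if total <= 0:
        raise ValueError("total sample count must be positive")
    dim = len(updates[0][0])
    averaged = [0.0] * dim
    for params, n in updates:
        if len(params) != dim:
            raise ValueError("parameter vectors differ in length")
        for i, p in enumerate(params):
            averaged[i] += p * n / total
    return averaged
```

With two hypothetical substations contributing `([1.0, 2.0], 100)` and `([3.0, 4.0], 300)`, the second site has three times the data and therefore three times the weight, giving a global vector of `[2.5, 3.5]`.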
Here, Table 22 provides a structured overview of cybersecurity and data governance dimensions in AI-driven subway power systems. Each row pinpoints the major threats, required security measures, privacy implications, and governance challenges. Notably, risk levels vary from medium to very high, underscoring the criticality of robust security frameworks. The table also emphasizes that future technological developments—from zero-trust network architectures to advanced privacy-preserving analytics—will heavily influence how effectively operators can safeguard these systems.
Table 22. Core dimensions of cybersecurity, privacy, and data management in AI-driven subway power systems.
In sum, as AI and big data analytics become ubiquitous in subway power systems, cybersecurity, privacy, and data management must be addressed comprehensively. Achieving a holistic solution involves aligning technical safeguards, organizational protocols, and regulatory mandates. While the challenges are significant, so are the rewards: with well-managed data and secure AI pipelines, subway power networks can harness the full promise of advanced analytics without compromising safety or public trust.
6.5. Next-Generation Operational Strategies and Socio-Economic Implications
Beyond the immediate technical benefits of AI in monitoring, diagnosis, and self-healing, these technologies also herald broader changes in operational strategies and socio-economic landscapes. By reducing maintenance costs, improving reliability, and enabling flexible energy management, AI-driven subway power systems are poised to influence workforce development, system financing, and urban planning in meaningful ways.
AI technologies are expected to foster a shift in workforce skills towards data science, cybersecurity, and AI model interpretation, as traditional roles in power engineering evolve. This transition reflects broader trends seen in industries adopting AI for system optimization and automation. Moreover, AI-enhanced decision-making is likely to introduce new policy and regulatory frameworks, particularly in relation to the management of public transportation infrastructure and energy resources. As cities adopt AI-driven subway systems, there is an opportunity to enhance collaboration between technology developers, transit authorities, and urban planners to create more sustainable and resilient urban environments.
6.5.1. Evolving Role of the Workforce
- (1) As AI-driven analytics and semi-autonomous systems take on routine tasks, such as fault detection or reconfiguration, human roles are likely to shift toward oversight, strategic decision-making, and specialized technical functions.
- (2) Upskilling and Reskilling: Engineers and technicians will need new skill sets, bridging power engineering with data science, cybersecurity, and AI model interpretation.
- (3) Collaborative Decision-Making: Operators will collaborate more closely with AI recommendations, requiring training in human–machine interfaces and explainable AI solutions to bolster trust and accountability.
- (4) New Roles: AI ethicists, data stewards, and cybersecurity specialists will emerge as essential staff for managing the complex socio-technical ecosystem.
In this context, it is imperative for subway authorities and vocational institutions to align educational programs with these new requirements, fostering a workforce that can manage and continually refine AI-enabled subway power systems.
6.5.2. Financial and Economic Dimensions
AI-driven efficiency gains—such as reduced unplanned downtime, lower energy losses, and improved asset utilization—can translate into substantial cost savings. These savings can be redirected toward infrastructure upgrades or used to reduce the fiscal burden on local governments. Additionally, more reliable services may boost ridership, generating indirect economic benefits for the city (e.g., increased retail sales near stations and improved labor mobility).
However, the initial investment required for AI tools, sensor networks, and data infrastructures can be considerable. This may prompt public–private partnerships or alternative financing models to share risks and rewards among multiple stakeholders. Over time, data generated by these systems might even be monetized (e.g., through analysis services for third parties), creating new revenue streams. While this can strengthen financial sustainability, it also necessitates robust governance frameworks to ensure data privacy and equitable value distribution.
6.5.3. Urban Planning and Sustainable Development
Intelligent subway power systems can play a critical role in shaping sustainable urban growth. By optimizing energy consumption and integrating with other urban infrastructure—such as electric vehicle (EV) charging networks or district heating—subway systems can support a more holistic approach to urban energy management. For example, advanced forecasting of passenger flows, combined with dynamic power distribution, can reduce peak loads on city grids. This synergy fosters better urban planning, reduced carbon emissions, and improved overall quality of life for residents.
Moreover, AI-driven fault detection and rapid incident response can bolster public perception of subways as safe and reliable. In turn, cities may be more inclined to expand rail transit networks, encouraging a modal shift away from cars and thereby reducing congestion and air pollution.
6.5.4. Policy and Regulatory Considerations
Governments and regulatory bodies will need to modernize policies to keep pace with AI’s rapid integration. Potential areas of focus include the following:
- (1) Standardization: Expanding IEC 61850 or similar standards to cover next-generation AI requirements (e.g., real-time data streaming and advanced analytics models).
- (2) Safety and Liability: Clarifying who is responsible when AI-driven systems make decisions that lead to incidents, particularly if they deviate from conventional operator guidelines.
- (3) Incentive Structures: Providing tax breaks, grants, or other incentives for subway operators investing in advanced AI technologies, especially if these innovations yield public benefits such as reduced CO₂ emissions or improved accessibility.
A forward-looking regulatory environment will ensure that AI enhancements align with public interest objectives, balancing innovation with risk management and equity considerations. For example, Refs. [142,143] underscore the crucial intersection of AI regulation, public interest, and risk management in ensuring the responsible and equitable deployment of artificial intelligence. Concretely, Alex-Omiogbemi et al. (2024) [142] present a framework for enhancing regulatory compliance and mitigating risks in emerging markets through digital innovations, illustrating how policy frameworks can support the responsible use of AI. Furthermore, Wang and Wu (2024) [143] address the need to strike a balance between fostering AI-driven innovation and maintaining robust regulatory oversight, highlighting the ethical and social implications of generative AI technologies. Together, these works contribute to the growing discourse on ensuring that AI development serves societal well-being while managing associated risks effectively.
Based on this, Table 23 outlines key socio-economic and operational considerations that arise from deploying AI in subway power systems. Gains in reliability and cost-effectiveness can translate into broader economic and environmental benefits, but they also introduce transitions in labor markets, regulatory frameworks, and urban planning. Each dimension involves interplay between technical innovations and societal factors, underscoring the necessity for multidisciplinary collaboration.
Table 23. Socio-economic and operational dimensions of AI-driven subway power systems.
Overall, AI-driven subway power systems have the potential to redefine how metropolitan regions plan, finance, and operate their mass transit infrastructures. Policymakers, industry leaders, and community stakeholders should collaborate to craft strategies that maximize public benefit, minimize negative externalities, and ensure equitable access to these transformative technologies. By doing so, cities worldwide can harness AI’s power to create cleaner, safer, and more efficient transportation systems that support sustainable growth for generations to come.
Through the five sections above, we have examined the comprehensive implications of AI technologies for future subway power systems. Beginning with the role of AI in fault diagnosis and prognostics, we moved to reinforcement learning applications in self-healing and then explored the synergy of AI, MASs, and IEC 61850 for interoperable infrastructures. Subsequently, we addressed critical issues in cybersecurity, privacy, and data management and finally evaluated the broader socio-economic and operational transformations likely to emerge. This holistic coverage underscores not only the technical possibilities of AI-driven subway power systems but also the regulatory, workforce, and societal shifts required to bring about a truly intelligent, secure, and sustainable urban rail future.
6.6. Potential Security Flaws in AI-Driven Subway Power Systems and Mitigation Strategies
As subway power systems increasingly adopt AI technologies and MASs for self-healing, fault detection, and optimization, they become more vulnerable to cybersecurity threats. The integration of real-time data streams, edge computing, and AI-driven analytics introduces significant risks related to data integrity, system privacy, and overall network security. This section discusses the potential security flaws associated with AI-enhanced subway power systems and provides comprehensive strategies to mitigate these risks, ensuring the robustness, resilience, and safety of urban rail infrastructures.
6.6.1. Key Security Threats in AI-Driven Subway Power Systems
AI-enabled subway power systems, especially those incorporating machine learning (ML), deep learning (DL), and reinforcement learning (RL), increase both the complexity and attack surface of the system. The following are the primary security vulnerabilities identified in such systems:
1. Data Poisoning Attacks
AI models rely heavily on large datasets for training and decision-making. Data poisoning occurs when attackers intentionally manipulate training datasets to degrade the performance of the AI model, leading to incorrect predictions and compromised decision-making processes.
2. Sensor Spoofing
AI-driven systems depend on real-time sensor data to make decisions. In sensor spoofing, malicious actors manipulate sensor outputs—such as voltage, current, and temperature readings—to create false information that can trigger inappropriate actions by the system, such as misidentifying faults or failing to isolate them properly.
3. Ransomware and Denial of Service (DoS) Attacks
Ransomware attacks target critical infrastructure systems, encrypting operational data and demanding payment for restoring access. In DoS attacks, attackers flood communication channels, preventing timely data exchange between system components, potentially leading to failures in real-time control and communication.
4. Unauthorized System Access
AI-enabled subway systems are susceptible to unauthorized access, especially when communication protocols such as IEC 61850 are deployed without adequate authentication and access control. Cybercriminals could exploit vulnerabilities in these communication protocols to gain control over the system, affecting the decision-making capabilities of MAS- and AI-driven controllers.
5. Communication Latency and Spoofing
AI algorithms, especially those based on reinforcement learning, rely on real-time data for decision-making. Any latency or interruption in communication, whether through network failures or malicious interference, can degrade the performance of the self-healing system and delay fault isolation and recovery.
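One simple way to reason about the sensor spoofing threat listed above is to cross-check redundant measurements of the same quantity and flag any sensor that disagrees with the group. The sketch below uses a median comparison; the function name `flag_outlier_sensors`, the sensor IDs, and the tolerance are hypothetical, and the approach only helps when the spoofed sensors are a minority.

```python
from statistics import median
from typing import Dict, List


def flag_outlier_sensors(readings: Dict[str, float], tolerance: float) -> List[str]:
    """Return IDs of sensors deviating from the group median by more than tolerance.

    The median is used because it is not shifted by a single extreme value,
    unlike the mean.
    """
    if len(readings) < 3:
        raise ValueError("need at least three redundant sensors for a majority view")
    centre = median(readings.values())
    return sorted(sid for sid, value in readings.items()
                  if abs(value - centre) > tolerance)
```

For instance, given three assumed voltage sensors reporting 1498 V, 1502 V, and 900 V with a 50 V tolerance, only the third would be flagged, and a fault-isolation decision based on it alone could be deferred until the discrepancy is explained.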
6.6.2. Mitigation Strategies for Enhancing Security
To safeguard AI-driven subway power systems, a multi-layered security approach is essential. The following mitigation strategies are proposed:
1. Advanced Encryption and Secure Communication Protocols
To prevent unauthorized access and data breaches, all communication between the system’s components, including sensors, agents, and controllers, should be protected with well-established mechanisms such as the Advanced Encryption Standard (AES) for encryption and Transport Layer Security (TLS) for securing communication channels. Because IEC 61850 itself focuses on data modeling and communication services, its deployments should be paired with complementary security provisions for power system communication (for example, the IEC 62351 series) to ensure data integrity and confidentiality.
2. Intrusion Detection and Prevention Systems (IDSs/IPSs)
Implementing IDSs/IPSs can help detect and prevent malicious activities such as sensor spoofing, unauthorized access, and data tampering. These systems monitor the network for unusual activities and trigger alerts for potential security threats, allowing operators to take timely action.
3. Federated Learning for Decentralized AI Models
Federated learning allows for the training of AI models without exposing sensitive data to centralized systems. By keeping data locally at each station or substation and only aggregating model updates, federated learning mitigates the risks associated with data breaches and ensures that privacy concerns are addressed while maintaining AI model performance.
4. Data Validation and Integrity Checks
Ensuring the integrity of the data fed into AI models is critical for maintaining accurate predictions. AI models should be equipped with built-in data validation checks to identify inconsistencies or anomalies in real-time data. Moreover, regular audits and updates of sensor calibration and maintenance schedules are necessary to ensure the continued accuracy of the system.
5. AI-Powered Anomaly Detection and Secure AI Pipelines
AI models should be trained to recognize and alert the system when abnormal patterns—indicative of attacks such as data poisoning—are detected. Additionally, secure AI pipelines, where each model update or decision is validated and signed by a trusted authority, can protect against tampering and unauthorized changes.
6. Backup and Recovery Mechanisms
To mitigate the risks of ransomware and DoS attacks, subway power systems must implement robust backup and recovery mechanisms. Regular backups of system configurations, AI model parameters, and critical operational data should be maintained, and recovery plans should be established to restore system functionality swiftly in case of an attack.
7. Zero-Trust Security Architecture
The zero-trust model assumes that no device or user is inherently trustworthy, whether inside or outside the network. In the context of subway power systems, this approach would involve strict access control policies, continuous monitoring of all system interactions, and multi-factor authentication (MFA) for all users and devices.
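To make the secure AI pipeline idea in item 5 above more concrete, the following sketch attaches an integrity tag to a serialized model and verifies it before the model is loaded. It deliberately substitutes a shared-key HMAC for the asymmetric code signing that a trusted authority would normally use, purely to keep the example within the Python standard library; the function names, the key, and the model bytes are all hypothetical.

```python
import hashlib
import hmac


def sign_model(model_bytes: bytes, key: bytes) -> str:
    """Return a hex HMAC-SHA256 tag over the serialized model."""
    return hmac.new(key, model_bytes, hashlib.sha256).hexdigest()


def verify_model(model_bytes: bytes, key: bytes, tag: str) -> bool:
    """Check the tag in constant time before the model is deployed."""
    return hmac.compare_digest(sign_model(model_bytes, key), tag)


if __name__ == "__main__":
    key = b"demo-key-not-for-production"     # assumed shared secret
    model = b"serialized-model-parameters"   # placeholder for real model bytes
    tag = sign_model(model, key)
    print(verify_model(model, key, tag))         # untampered model
    print(verify_model(model + b"x", key, tag))  # modified model
```

An inference service following this pattern would refuse to load any model update whose tag fails verification, which limits the damage an attacker can do by altering model files in transit or at rest.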
6.6.3. Summary of the Key Security Threats in AI-Driven Subway Power Systems
Based on the above, Table 24 summarizes the key security threats in AI-driven subway power systems, along with the corresponding mitigation strategies and their implementation priorities. The table covers data poisoning, sensor spoofing, ransomware, unauthorized access, and communication latency, all of which can compromise system performance and safety. For each threat, it lists specific countermeasures, including secure data pipelines, anomaly detection, encryption, multi-factor authentication, and edge computing, together with their implementation complexity. Data poisoning and ransomware are treated as high-priority threats because of their potential to disrupt system functionality, while anomaly detection and secure AI pipelines are identified as essential for maintaining data integrity and system resilience. Taken together, the entries point to a multi-layered, adaptive security approach that addresses both technical and operational vulnerabilities. In our view, these mitigation strategies should be implemented incrementally, with particular emphasis on continuous monitoring and system updates to stay ahead of emerging threats in smart transportation infrastructure.
Table 24. Summary of security threats, mitigation strategies, and their implementation priorities.
Overall, the integration of AI and MASs in subway power systems significantly enhances system efficiency and self-healing capabilities but also introduces new security risks that must be addressed comprehensively. The proposed security measures, including advanced encryption, intrusion detection systems, federated learning, and AI-powered anomaly detection, are critical in safeguarding subway power systems from malicious threats. Ensuring that these systems are robust against potential cyberattacks will require continuous monitoring, adaptive security measures, and ongoing research to stay ahead of emerging threats.
By prioritizing cybersecurity and data integrity, operators can safeguard the reliability and safety of AI-driven subway power systems, thus fostering a secure and resilient urban transportation infrastructure that can meet the demands of future smart cities. As AI and cybersecurity technologies evolve, these systems will need to be periodically reassessed and upgraded to maintain their effectiveness in protecting public infrastructure.
7. Conclusions and Policy Implications
7.1. Conclusions
This review has surveyed advances in the self-healing capabilities of subway power supply systems, with a particular focus on the integration of MASs and AI algorithms. As a critical component of urban rail transit, the reliability and safety of the subway power supply are paramount, and traditional manual interventions for fault diagnosis and recovery have become insufficient to meet the increasingly complex demands of modern urban transportation systems. This paper has explored the evolving concept of self-healing technology, which has found successful applications in power grids and distribution networks, and it has examined how these technologies can be adapted to subway power supply systems.
The integration of MASs and the IEC 61850 standard offers a promising approach to building an autonomous, adaptive, and intelligent self-healing control framework. By leveraging the strengths of MASs in decentralized control, coordination, and decision-making, subway power systems can respond dynamically to faults in ways that minimize the impact on service continuity and operational safety. The IEC 61850 standard, a globally recognized communication protocol for power systems, provides the interoperability and flexibility needed to implement these complex, decentralized self-healing mechanisms effectively. Such a hybrid model can not only enhance the reliability of subway power systems but also lay the foundation for more robust and scalable self-healing systems in urban rail infrastructure.
This review has also demonstrated that MASs combined with AI-driven fault diagnosis algorithms can drastically improve the speed, accuracy, and efficiency of fault detection, analysis, isolation, and recovery. Specifically, AI algorithms have the capacity to handle complex, multi-fault scenarios that may overwhelm traditional control methods. Furthermore, through advanced machine learning techniques, the system can continuously learn and adapt to new fault patterns, improving efficiency over time, which distinguishes this approach from conventional methods.
One of the most promising findings is the application of hybrid architectures that combine MASs with the IEC 61850 framework to support critical functions such as fault localization, isolation, and recovery in subway power systems. These hybrid architectures facilitate seamless communication between various subsystems, enabling a holistic view of the system’s health, making them ideal for real-time fault management in complex, large-scale networks like those of modern subway systems. This innovative integration is presented as a unique contribution, offering greater resilience and adaptability in fault management than existing technologies.
The potential for intelligent fault recovery strategies, supported by AI, is also highlighted in this review. These strategies, capable of quick adaptation to various fault conditions, can drastically reduce the time required for recovery, thereby improving the overall reliability of subway operations. Through continuous monitoring and real-time decision-making, AI-based recovery systems enhance the ability of subway power supply systems to self-heal, ensuring operational resilience even in the face of unforeseen challenges. This ability to adapt in real time, without requiring manual intervention, marks a significant shift in how subway systems handle faults.
In conclusion, the research presented in this review indicates that self-healing technologies, underpinned by MASs and AI, represent a crucial evolution in the design and operation of subway power systems. By reducing the need for manual intervention, enhancing fault detection and recovery processes, and improving system efficiency, these technologies will play a key role in shaping the future of urban transportation infrastructure. The integration of these innovative technologies not only holds promise for improving the resilience and performance of subway power supply systems but also sets the stage for broader applications of self-healing technologies in other critical infrastructure systems, marking a significant contribution to the ongoing intelligent infrastructure revolution.
7.2. Policy Implications
The findings of this research highlight several critical policy implications for the advancement and deployment of self-healing technologies in subway power systems, particularly those driven by MASs and AI. As subway systems worldwide face increasing demands for reliability, efficiency, and automation, the adoption of self-healing technologies is becoming an essential step toward achieving these goals. To fully realize the potential of MASs and AI in self-healing subway power systems, policymakers will need to consider a range of strategic actions and regulatory frameworks to facilitate the integration of these advanced technologies.
First and foremost, policymakers must recognize the need for substantial investment in research and development (R&D) to continue advancing the capabilities of MASs and AI algorithms in self-healing applications. While promising, the implementation of these technologies in subway power systems requires overcoming technical challenges, including data acquisition, system integration, and real-time decision-making capabilities. Governments and industry stakeholders should collaborate to fund R&D initiatives that focus on refining the algorithms, improving system interoperability, and testing the performance of these technologies in real-world environments. Public–private partnerships can play a crucial role in accelerating the development and deployment of these innovations.
Another significant policy consideration is the establishment of regulatory standards that ensure the compatibility and interoperability of self-healing systems across different subway networks and urban environments. The IEC 61850 standard, which has already been proposed as a framework for integration, is a step in the right direction. However, to facilitate the widespread adoption of self-healing technologies, policymakers must ensure that these standards are continuously updated to reflect the rapid advancements in AI and MASs. This includes promoting global standardization efforts to ensure that subway power systems in different regions can communicate seamlessly, share data, and collaborate in real time.
Furthermore, policymakers must address the training and upskilling of the workforce to manage and maintain the advanced self-healing systems. As AI and MASs take a more prominent role in subway power system management, there will be a need for a skilled workforce capable of operating and troubleshooting these complex systems. Educational institutions, in collaboration with industry experts, should develop specialized training programs to equip engineers, operators, and maintenance personnel with the necessary knowledge and skills. Additionally, governments can incentivize workforce development through grants, scholarships, and industry partnerships.
In addition to technological and workforce considerations, policymakers should ensure that the implementation of self-healing technologies aligns with broader sustainability and resilience goals. Subway power supply systems play a key role in reducing urban congestion and greenhouse gas emissions. By integrating self-healing technologies, these systems can become more energy-efficient, reducing the overall environmental footprint of urban transportation infrastructure. Policies promoting the adoption of green technologies and a reduction in carbon emissions in subway networks will further incentivize the integration of advanced self-healing solutions.
Finally, the policy landscape should encourage the collection and sharing of data for ongoing performance analysis. Self-healing systems require continuous data input for machine learning algorithms to adapt and optimize. Therefore, policies promoting data transparency, privacy, and security will be critical to ensuring the safe and efficient operation of self-healing technologies. Regulations must balance the need for open data sharing with the protection of sensitive information, particularly regarding the security of critical infrastructure.
In conclusion, the successful deployment of self-healing technologies in subway power supply systems requires comprehensive policy support that encompasses investment in R&D, the establishment of standards, workforce development, sustainability considerations, and data governance. Policymakers must take proactive steps to create an enabling environment for these innovations to thrive, ensuring that subway systems are not only more resilient and efficient but also more adaptable to future challenges. Through targeted policy initiatives, governments can play a vital role in shaping the future of urban transportation infrastructure and ensuring its continued evolution in an increasingly intelligent, autonomous, and sustainable direction.
Author Contributions
Conceptualization, J.F., T.Y., K.Z. and L.C.; methodology, J.F., T.Y., K.Z. and L.C.; formal analysis, J.F., K.Z. and L.C.; investigation, J.F., T.Y., K.Z. and L.C.; resources, J.F., T.Y., K.Z. and L.C.; data curation, J.F., T.Y., K.Z. and L.C.; writing—original draft preparation, J.F., T.Y., K.Z. and L.C.; writing—review and editing, J.F., T.Y., K.Z. and L.C.; visualization, J.F., T.Y., K.Z. and L.C.; supervision, L.C.; project administration, L.C.; funding acquisition, L.C. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded in part by the Guangzhou Education Bureau University Research Project - Graduate Research Project, grant number 2024312278 (funder: L.C.), and in part by the STU Scientific Research Initiation Grant (SRIG), grant number STF23021 (funder: K.Z.).
Data Availability Statement
No new data were created or analyzed in this study.
Acknowledgments
We sincerely thank the associate editor and invited anonymous reviewers for their kind and helpful comments on our paper.
Conflicts of Interest
Author Jianbing Feng was employed by the company Guangzhou Metro Construction Management Co., Ltd. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Abbreviations
| Abbreviation | Full Form |
| --- | --- |
| AI | Artificial Intelligence |
| AC/DC | Alternating Current/Direct Current |
| CI | Condition-based Inspection |
| DA/DO | Data Attribute/Data Object |
| DC | Direct Current |
| DER | Distributed Energy Resources |
| DO | Data Object |
| DoS | Denial of Service |
| EMS | Energy Management System |
| FLISR | Fault Location, Isolation, and Service Restoration |
| GOOSE | Generic Object-Oriented Substation Event |
| IED | Intelligent Electronic Device |
| IEC 61850 | International Electrotechnical Commission 61850 Standard |
| IDS | Intrusion Detection System |
| IoT | Internet of Things |
| LN | Logical Node |
| MAS | Multi-Agent System |
| MMS | Manufacturing Message Specification |
| MMXU | Measurement Unit |
| MTTR | Mean Time to Repair |
| PDIS | Distance Protection (IEC 61850 logical node) |
| PMU | Phasor Measurement Unit |
| PRP | Parallel Redundancy Protocol |
| PTOC | Protection Overcurrent Unit |
| RSTP | Rapid Spanning Tree Protocol |
| SAIDI | System Average Interruption Duration Index |
| SAIFI | System Average Interruption Frequency Index |
| SCADA | Supervisory Control and Data Acquisition |
| SCL | System Configuration Language |
| VAR | Volt-Ampere Reactive |
| WAMPAC | Wide-Area Monitoring, Protection, and Control |
| XML | eXtensible Markup Language |
References
- China Association of Metros. Annual Report on Statistics and Analysis of Urban Rail Transit 2023. Available online: https://www.camet.org.cn/xytj/tjxx/14894.shtml (accessed on 29 March 2024).
- IEC 61850; Communication Networks and Systems in Substations. International Electrotechnical Commission (IEC): Geneva, Switzerland, 2011.
- Shang, W.L.; Lv, Z. Low carbon technology for carbon neutrality in sustainable cities: A survey. Sustain. Cities Soc. 2023, 92, 104489. [Google Scholar] [CrossRef]
- Wang, L. Study on the Fault Diagnosis and Protection of Energy-Fed Supply System in Urban Mass Transit. Ph.D. Thesis, Beijing Jiaotong University, Beijing, China, 2010. [Google Scholar]
- Du, F. Modeling for Metro Locomotive and Analysis of Fault Condition of DC Traction Power Supply System. Ph.D. Thesis, Beijing Jiaotong University, Beijing, China, 2010. [Google Scholar]
- Zeng, B.; Zhang, J.; Yang, X.; Wang, J.; Dong, J.; Zhang, Y. Integrated planning for transition to low-carbon distribution system with renewable energy generation and demand response. IEEE Trans. Power Syst. 2013, 29, 1153–1165. [Google Scholar] [CrossRef]
- Allegretti, G.; Montoya, M.A.; Bertussi, L.A.S.; Talamini, E. When being renewable may not be enough: Typologies of trends in energy and carbon footprint towards sustainable development. Renew. Sustain. Energy Rev. 2022, 168, 112860. [Google Scholar] [CrossRef]
- Song, X. Research on Fault Location Methods for City DC Railway Traction System; Beijing Jiaotong University: Beijing, China, 2015. [Google Scholar]
- Qin, B.; Wang, H.; Wang, Z.; Xiong, Z.; Zhao, J.; Lu, H.; Wang, M. Integrated development of urban rail transit and energy systems supported by underground space. Strateg. Study CAE 2023, 25, 45–59. [Google Scholar] [CrossRef]
- Serdar, M.Z.; Koç, M.; Al-Ghamdi, S.G. Urban transportation networks resilience: Indicators, disturbances, and assessment methods. Sustain. Cities Soc. 2022, 76, 103452. [Google Scholar] [CrossRef]
- Wang, K.K.; Lv, Y. Fault Location Method of Metro DC Traction Power Supply System Catenary. Urban Mass Transit 2022, 7, 222–224, 229. [Google Scholar]
- Wei, R.; Shi, G.; Zhuang, K.; Xia, J. Research on Fault Location of Subway DC Traction Power Supply System Based on GPS Time Synchronization. Mar. Electr. Electron. Eng. 2023, 43, 85–88. [Google Scholar]
- Jin, X.; Li, Z.; Hu, Z. Simulation of Fault Location for Subway DC Power Supply System Based on Time Domain Differential. Mar. Electr. Electron. Eng. 2017, 37, 67–70. [Google Scholar]
- Pei, W. Research on the Reliability of Subway Traction Power Supply Systems; Nanjing University of Science and Technology: Nanjing, China, 2018. [Google Scholar]
- Zhou, J. Research on Online Reliability Evaluation for Traction Power Supply System of Metro Network; Shanghai Jiaotong University: Shanghai, China, 2012. [Google Scholar]
- Sheng, S.; Li, K.K.; Chan, W.L.; Xiangjun, Z.; Xianzhong, D. Agent-based self-healing protection system. IEEE Trans. Power Deliv. 2006, 21, 610–618. [Google Scholar] [CrossRef]
- Ji, X.; Jian, L.; Yan, X.; Wang, H. Research on self-healing technology of smart distribution network based on multi-agent system. In Proceedings of the 2016 Chinese Control and Decision Conference (CCDC), Yinchuan, China, 28–30 May 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 6132–6137. [Google Scholar]
- Xiang, G.; Xin, A. The application of self-healing technology in smart grid. In Proceedings of the 2011 Asia-Pacific Power and Energy Engineering Conference, Wuhan, China, 25–28 March 2011. [Google Scholar]
- Zhang, R.; Bie, Z. Distributed cluster-level cooperative control of dynamic virtual microgrid cluster for active distribution network. Autom. Electr. Power Syst. 2022, 46, 55–62. [Google Scholar]
- Zhao, Y.; Rieger, C.; Zhu, Q. Multi-agent learning for resilient distributed control systems. arXiv 2022, arXiv:2208.05060. [Google Scholar]
- Pang, Y.; Lodewijks, G. Agent-based intelligent monitoring in large-scale continuous material transport. In Proceedings of the 2012 9th IEEE International Conference on Networking, Sensing and Control, Beijing, China, 11–14 April 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 79–84. [Google Scholar]
- Mayorov, G.; Stennikov, V.; Barakhtenko, E. Application of the multiagent approach to the research of integrated energy supply systems. In E3S Web of Conferences 2019; EDP Sciences: Les Ulis, France, 2019; Volume 114, p. 01006. [Google Scholar]
- Sujil, A.; Verma, J.; Kumar, R. Multi agent system: Concepts, platforms and applications in power systems. Artif. Intell. Rev. 2018, 49, 153–182. [Google Scholar] [CrossRef]
- Yu, H.; Wang, Y.; Chen, Z. A novel renewable microgrid-enabled metro traction power system—Concepts, framework, and operation strategy. IEEE Trans. Transp. Electrif. 2021, 7, 1733–1749. [Google Scholar] [CrossRef]
- Saray, M.; Saray, M.; Kazan, C.; Guner, S. Optimization of renewable energy usage in public transportation: Mathematical model for energy management of plug-in PV-based electric metrobuses. J. Energy Storage 2024, 78, 109946. [Google Scholar] [CrossRef]
- Kilic, B.; Dursun, E. Integration of innovative photovoltaic technology to the railway trains: A case study for Istanbul airport-M1 light metro line. In Proceedings of the IEEE EUROCON 2017-17th International Conference on Smart Technologies, Ohrid, Macedonia, 6–8 July 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 336–340. [Google Scholar]
- Kumar, G.M.S.; Cao, S. Leveraging energy flexibilities for enhancing the cost-effectiveness and grid-responsiveness of net-zero-energy metro railway and station systems. Appl. Energy 2023, 333, 120632. [Google Scholar] [CrossRef]
- Yu, J.; Wang, J.; Tong, F. Research and analysis of power supply load forecasting and self-healing control in urban rail transit system. In IOP Conference Series: Earth and Environmental Science 2021; IOP Publishing: Bristol, UK, 2021; Volume 769, p. 042093. [Google Scholar]
- Zheng, S.; Liu, Y.; Lin, Y.; Wang, Q.; Yang, H.; Chen, B. Bridging strategy for the disruption of metro considering the reliability of transportation system: Metro and conventional bus network. Reliab. Eng. Syst. Saf. 2022, 225, 108585. [Google Scholar] [CrossRef]
- Kalyvas, M.; McCracken, A. Doha Metro Novel Building Automation and Control System (BACS). J. Ind. Integr. Manag. 2024, 9, 571–596. [Google Scholar] [CrossRef]
- Longo, M.; Bramani, M. The automation control systems for the efficiency of metro transit lines. In Proceedings of the 2015 AEIT International Annual Conference (AEIT), Naples, Italy, 14–16 October 2015; IEEE: Piscataway, NJ, USA, 2015; pp. 1–6. [Google Scholar]
- National Energy Technology Laboratory. The Modern Grid Initiative; US: Department of Energy: Washington, DC, USA, 2008; pp. 26–30. [Google Scholar]
- Liu, C.; Jung, J.; Heydt, G.T.; Vittal, V.; Phadke, A. The Strategic Power Infrastructure Defense (SPID) System: A Conceptual Design. IEEE Control Syst. Mag. 2000, 20, 40–52. [Google Scholar]
- Li, T. Research on the Self-Healing Functions of Smart Distribution Grid and Its Benefits Evaluation Model; North China Electric Power University: Beijing, China, 2012. [Google Scholar]
- Amin, M. Toward self-healing energy infrastructure systems. IEEE Comput. Appl. Power 2001, 14, 20–28. [Google Scholar] [CrossRef]
- Shittu, E.; Tibrewala, A.; Kalla, S.; Wang, X. Meta-analysis of the strategies for self-healing and resilience in power systems. Adv. Appl. Energy 2021, 4, 100036. [Google Scholar] [CrossRef]
- Arefifar, S.A.; Alam, M.S.; Hamadi, A. A review on self-healing in modern power distribution systems. J. Mod. Power Syst. Clean Energy 2023, 11, 1719–1733. [Google Scholar] [CrossRef]
- Madani, V.; Novosel, D.; Horowitz, S.; Adamiak, M.; Amantegui, J.; Karlsson, D.; Imai, S.; Apostolov, A. IEEE PSRC report on global industry experiences with system integrity protection schemes (SIPS). IEEE Trans. Power Deliv. 2010, 25, 2143–2155. [Google Scholar] [CrossRef]
- Maqsood, M.; Masood, A. Integration of Wireless HART and STK600 Development Kit for Data Collection in Wireless Sensor Networks. Master’s Thesis, Universitetet i Agder/University of Agder, Kristiansand, Norway, 2013. [Google Scholar]
- Morais, B.T.P. Emerging Technologies and Future Trends in Substation Automation Systems for the Protection, Monitoring and Control of Electrical Substations; PQDT-Global: Ann Arbor, MI, USA, 2013. [Google Scholar]
- Majhi, A.A.K.; Mohanty, S. A comprehensive review on Internet of Things applications in power systems. IEEE Internet Things J. 2024, 11, 34896–34923. [Google Scholar] [CrossRef]
- Wang, L.; Bo, Z.; Wang, Q.P.; Liu, R.T.; Fan, W. Design of integrated wide area protection and control for power grid. DPI Proc. 2018, 1, 206–214. [Google Scholar] [CrossRef] [PubMed]
- Hellman, C.; Aronson, M.; Tom, N.; Quan, W. The microprocessor and the minicomputer for earth terminal and network control. ITC Proc. 1981, 1, 529–548. [Google Scholar]
- Terzija, V.; Valverde, G.; Cai, D.; Regulski, P.; Madani, V.; Fitch, J.; Skok, S.; Begovic, M.M.; Phadke, A. Wide-area monitoring, protection, and control of future electric power networks. Proc. IEEE 2010, 99, 80–93. [Google Scholar] [CrossRef]
- Rahman, W.U.; Ali, M.; Mehmood, C.A.; Khan, A. Design and implementation for wide area power system monitoring and protection using phasor measuring units. WSEAS Trans. Power Syst. 2013, 8, 57–64. [Google Scholar]
- Cheng, L.F.; Yu, T. A new generation of AI: A review and perspective on machine learning technologies applied to smart energy and electric power systems. Int. J. Energy Res. 2019, 43, 1928–1973. [Google Scholar] [CrossRef]
- Cheng, L.F.; Wei, X.; Li, M.; Tan, C.; Yin, M.; Shen, T.; Zou, T. Integrating evolutionary game-theoretical methods and deep reinforcement learning for adaptive strategy optimization in user-side electricity markets: A comprehensive review. Mathematics 2024, 12, 3241. [Google Scholar] [CrossRef]
- Cheng, L.F.; Yu, T.; Zhang, X.S.; Yang, B. Parallel cyber-physical-social systems based smart energy robotic dispatcher and knowledge automation: Concepts, architectures and challenges. IEEE Intell. Syst. 2019, 34, 54–64. [Google Scholar] [CrossRef]
- Nyangon, J. Climate-proofing critical energy infrastructure: Smart grids, artificial intelligence, and machine learning for power system resilience against extreme weather events. J. Infrastruct. Syst. 2024, 30, 03124001. [Google Scholar] [CrossRef]
- Ahmad, T.; Madonski, R.; Zhang, D.; Huang, C.; Mujeeb, A. Data-driven probabilistic machine learning in sustainable smart energy/smart energy systems: Key developments, challenges, and future research opportunities in the context of smart grid paradigm. Renew. Sustain. Energy Rev. 2022, 160, 112128. [Google Scholar] [CrossRef]
- Dick, K.; Russell, L.; Souley Dosso, Y.; Kwamena, F.; Green, J.R. Deep learning for critical infrastructure resilience. J. Infrastruct. Syst. 2019, 25, 05019003. [Google Scholar] [CrossRef]
- Nama, P.; Reddy, P.; Pattanayak, S.K. Artificial Intelligence for Self-Healing Automation Testing Frameworks: Real-Time Fault Prediction and Recovery. Artif. Intell. 2024, 64 (Suppl. S3), 111–141. [Google Scholar]
- Plevris, V.; Papazafeiropoulos, G. AI in Structural Health Monitoring for Infrastructure Maintenance and Safety. Infrastructures 2024, 9, 225. [Google Scholar] [CrossRef]
- Manoharan, A.; Sarker, M. Revolutionizing Cybersecurity: Unleashing the Power of Artificial Intelligence and Machine Learning for Next-Generation Threat Detection. Int. Res. J. Mod. Eng. Technol. Sci. 2022, 4, 2151–2164. [Google Scholar] [CrossRef]
- Fadi, O.; Karim, Z.; Mohammed, B. A survey on blockchain and artificial intelligence technologies for enhancing security and privacy in smart environments. IEEE Access 2022, 10, 93168–93186. [Google Scholar] [CrossRef]
- Tooki, O.O.; Popoola, O.M. A critical review on intelligent-based techniques for detection and mitigation of cyberthreats and cascaded failures in cyber-physical power systems. Renew. Energy Focus 2024, 51, 100628. [Google Scholar] [CrossRef]
- Ahmad, S. Real-Time Control and Power Management for Interconnected Microgrids with Self-Healing Capability. Ph.D. Thesis, University of Malaya, Kuala Lumpur, Malaysia, 2022. [Google Scholar]
- Rath, S.; Nguyen, L.D.; Sahoo, S.; Popovski, P. Self-healing secure blockchain framework in microgrids. IEEE Trans. Smart Grid 2023, 14, 4729–4740. [Google Scholar] [CrossRef]
- Watuwa, B. Power Reliability Analysis of DC Traction Power Supply System: A Case Study of Addis Ababa Light Rail Transit; Addis Ababa University: Addis Ababa, Ethiopia, 2019; Available online: https://scholar.googleusercontent.com/scholar?q=cache:Q3mqEIVgZ1kJ:scholar.google.com/&hl=zh-CN&as_sdt=0,5&scioq=Power+Reliability+Analysis+of+DC+Traction+Power+Supply+System:+A+Case+Study+of+Addis+Ababa+Light+Rail+Transit (accessed on 10 March 2025).
- Ogunsola, A.; Mariscotti, A. Electromagnetic Compatibility in Railways: Analysis and Management; Springer Science & Business Media, Springer Publishing Company, Incorporated, 1 July 2013; pp. 1–528. Available online: https://books.google.com/books?hl=zh-CN&lr=&id=N5B3S13cPpIC&oi=fnd&pg=PR2&dq=related:YcJun2vRVgIJ:scholar.google.com/&ots=EecduOIHre&sig=teucUbqoCLzfPK-9XlVwRKHvlp4#v=onepage&q&f=false (accessed on 10 March 2025). [CrossRef]
- López, D.Á.J.L. Optimising the Electrical Infrastructure of Mass Transit Systems to Improve the Use of Regenerative Braking. Ph.D. Thesis, Universidad Pontificia Comillas, Madrid, Spain, 2016. [Google Scholar]
- Parizad, A.; Baghaee, H.R. Overview of smart cyber-physical power systems: Fundamentals, Challenges, and Solutions. Wiley Online Libr. 2025, 1, 157–178. Available online: https://onlinelibrary.wiley.com/doi/abs/10.1002/9781394191529.ch1 (accessed on 10 March 2025).
- Castro Gómez, A. Feasibility for the Introduction of Current Limiting Impedance for a Previously Solid Grounded Medium Voltage Distribution Network. Master’s Thesis, Politecnico di Milano, Milan, Italy, 28 April 2017. Available online: https://www.politesi.polimi.it/retrieve/a81cb05c-3dc7-616b-e053-1605fe0a889a/Thesis%20Alex%20Castro.pdf (accessed on 10 March 2025).
- Haque, A.; Malik, A.; Shah, N.; Malik, J.A.; Ahmad, R.; Arif, M. Fundamentals of power electronics in smart cities. Taylor Fr. 2024, 1, 77–89. Available online: https://www.taylorfrancis.com/chapters/edit/10.1201/9781032669809-1/fundamentals-power-electronics-smart-cities-ahteshamul-haque-naila-shah-junaid-ahmad-malik-azra-malik (accessed on 10 March 2025).
- Iovanovici, A. Designing Low Latency, Fault-Tolerant Sensor Networks Using Complex Networks Analysis. Timişoara: Editura Politehnica. 2015. ISBN 9786065549623, 6065549622. Available online: https://search.worldcat.org/zh-cn/title/1288695767 (accessed on 10 March 2025).
- Raghunath, K.; Rengarajan, N. Response time optimization with enhanced fault-tolerant wireless sensor network design for on-board rapid transit applications. Clust. Comput. 2019, 22 (Suppl. 4), 9737–9753. [Google Scholar] [CrossRef]
- Kumari, S.; Tyagi, A.K. Wireless sensor networks: An introduction. Digit. Twin Blockchain Smart Cities 2024, 1, 12–22. Available online: https://scholar.google.com/citations?user=RIgaVmUAAAAJ&hl=en&num=20&oi=sra (accessed on 10 March 2025).
- Hernandez, J.C.; Sutil, F.S.; Vidal, P.G. Protection of a multiterminal DC compact node feeding electric vehicles on electric railway systems, secondary distribution networks, and PV systems. Turk. J. Electr. Eng. Comput. Sci. 2016, 24, 3123–3143. [Google Scholar] [CrossRef]
- Swain, A.; Abdellatif, E.; Mousa, A.; Pong, P.W. Sensor technologies for transmission and distribution systems: A review of the latest developments. Energies 2022, 15, 7339. [Google Scholar] [CrossRef]
- Georgilakis, P.S.; Hatziargyriou, N.D. Optimal distributed generation placement in power distribution networks: Models, methods, and future research. IEEE Trans. Power Syst. 2013, 28, 3420–3428. [Google Scholar] [CrossRef]
- Muzzammel, R.; Raza, A.; Hussain, M.R.; Abbas, G. MT-HVdc systems fault classification and location methods based on traveling and non-traveling waves—A comprehensive review. Appl. Sci. 2019, 9, 4760. [Google Scholar] [CrossRef]
- Hamidi, R.J.; Livani, H. A recursive method for traveling-wave arrival-time detection in power systems. IEEE Trans. Power Deliv. 2018, 33, 1097–1106. [Google Scholar]
- Costa, F.B.; Miranda, V.; Leite, H. Wavelet-based analysis and detection of traveling waves due to DC faults in LCC HVDC systems. Int. J. Electr. Power Energy Syst. 2019, 105, 158–165. [Google Scholar]
- Esmail, E.M.; Elsadd, M.A.; Elkalashy, N.I. A review: Smart distribution grid management using agents. WSEAS Trans. Power Syst. 2020, 1, 348234782. [Google Scholar] [CrossRef]
- Liu, C.; Chen, Z.; Bak, C.L. Multi-agent system based adaptive protection for dispersed generation integrated distribution systems. Trans. Power Syst. 2013, 1, 270506087. [Google Scholar] [CrossRef]
- Rahman, M.S.; Muyeen, S.M.; Ghosh, A.; Islam, S.M. Multi-agent systems in ICT enabled smart grid: A status update on technology framework and applications. IEEE Trans. Power Deliv. 2019, 1, 8765552. [Google Scholar]
- Alstom. Towards the First Railway Cybersecurity International Standard: Why Standards Are Important to Secure Railways; Alstom: Saint-Ouen-sur-Seine, France, 2024; Available online: https://www.alstom.com/press-releases-news/2024/3/towards-first-railway-cybersecurity-international-standard-why-standards-are-important-secure-railways (accessed on 10 March 2025).
- Radiflow. Securing Railway Operations from OT Cyberattacks; Radiflow: Mahwah, NJ, USA, 2024; Available online: https://www.radiflow.com/white-papers/securing-railway-operations-from-ot-cyberattacks/ (accessed on 11 March 2025).
- REPLIL. Cybersecurity in Railway Digital Transformation Journey; REPLIL: Dubai, United Arab Emirates, 2024; Available online: https://www.replil.com/cybersecurity-in-railway-digital-transformation-journey/ (accessed on 11 March 2025).
- Mbango, F. Investigation into Alternative Protection Solutions for Distribution Networks. Ph.D. Thesis, Cape Peninsula University of Technology, Bellville, TX, USA, 2009. Available online: https://core.ac.uk/download/pdf/148365012.pdf (accessed on 11 March 2025).
- Dutta Pramanik, P.; Upadhyaya, B.; Kushwaha, A.; Bhowmik, D. Harnessing IoT: Transforming Smart Grid Advancements. In IoT for Smart Grid: Revolutionizing Electrical Engineering 2025, Chapter 7, 127–174; Wiley Online Library: Hoboken, NJ, USA, 2025; Available online: https://onlinelibrary.wiley.com/doi/abs/10.1002/9781394279401.ch7 (accessed on 11 March 2025).
- Pandiyan, P.; Saravanan, S.; Kannadasan, R.; Krishnaveni, S.; Alsharif, M.; Kim, M. A comprehensive review of advancements in green IoT for smart grids: Paving the path to sustainability. Energy Rep. 2024, 11, 5504–5531. [Google Scholar] [CrossRef]
- Baroud, S.Y.; Yahaya, N.A.; Elzamly, A.M. Cutting-Edge AI Approaches with MAS for PdM in Industry 4.0: Challenges and Future Directions. J. Appl. Data Sci. 2024, 5, 455–473. [Google Scholar] [CrossRef]
- Chouhan, S.; Mohammadi, F.D.; Feliachi, A.; Solanki, J.M.; Choudhry, M.A. Hybrid MAS Fault Location, Isolation, and Restoration for Smart Distribution System with Microgrids. In Proceedings of the 2016 IEEE Power and Energy Society General Meeting (PESGM), Boston, MA, USA, 17–21 July 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1–5. [Google Scholar]
- Han, Y.; Zhang, K.; Li, H.; Coelho, E.A.A.; Guerrero, J.M. MAS-Based Distributed Coordinated Control and Optimization in Microgrid and Microgrid Clusters: A Comprehensive Overview. IEEE Trans. Power Electron. 2017, 33, 6488–6508. [Google Scholar] [CrossRef]
- Hua, H.; Li, Y.; Wang, T.; Dong, N.; Li, W.; Cao, J. Edge Computing with Artificial Intelligence: A Machine Learning Perspective. ACM Comput. Surv. 2023, 55, 1–35. [Google Scholar] [CrossRef]
- Wang, F.; Zhang, M.; Wang, X.; Ma, X.; Liu, J. Deep Learning for Edge Computing Applications: A State-of-the-Art Survey. IEEE Access 2020, 8, 58322–58336. [Google Scholar] [CrossRef]
- Wang, X.; Han, Y.; Leung, V.C.M.; Niyato, D.; Yan, X.; Chen, X. Convergence of Edge Computing and Deep Learning: A Comprehensive Survey. IEEE Commun. Surv. Tutor. 2020, 22, 869–904. [Google Scholar] [CrossRef]
- Chen, J.; Ran, X. Deep Learning with Edge Computing: A Review. Proc. IEEE 2019, 107, 1655–1674. [Google Scholar] [CrossRef]
- Murshed, M.G.S.; Murphy, C.; Hou, D.; Khan, N.; Ananthanarayanan, G.; Hussain, F. Machine Learning at the Network Edge: A Survey. ACM Comput. Surv. 2021, 54, 1–37. [Google Scholar] [CrossRef]
- Logenthiran, T. Multi-Agent System for Control and Management of Distributed Power Systems. Ph.D Thesis, National University of Singapore, Singapore, 2012. [Google Scholar]
- Dou, C.; Hao, D.; Jin, B.; Wang, W.; An, N. Multi-agent-system-based decentralized coordinated control for large power systems. Int. J. Electr. Power Energy Syst. 2014, 63, 814–821. [Google Scholar] [CrossRef]
- Farid, A.M. Multi-agent system design principles for resilient coordination & control of future power systems. Intell. Ind. Syst. 2015, 1, 13–34. [Google Scholar]
- Herrera, M.; Pérez-Hernández, M.; Parlikad, A.; Izquierdo, J. Multi-Agent Systems and Complex Networks: Review and Applications in Systems Engineering. Processes 2020, 8, 312. [Google Scholar] [CrossRef]
- Sharifi, L. Economics Inspired Energy Aware Service Provisioning in P2P Assisted Cloud Ecosystems. Technico.Ulisboa.Pt 2015, 1, 72–98. Available online: https://web.tecnico.ulisboa.pt/~ist14191/repository/Leila-Sharifi-CAT.pdf (accessed on 13 March 2025).
- Irfan, M.; Iqbal, J.; Iqbal, A.; Riaz, R.A. Opportunities and Challenges in Control of Smart Grids—Pakistani Perspective. Renew. Sustain. Energy Rev. 2017, 7, 652–674. [Google Scholar] [CrossRef]
- Aftab, M.A.; Hussain, S.M.S.; Ali, I.; Ustun, T.S. IEC 61850-Based Communication Layer Modeling for Electric Vehicles: Electric Vehicle Charging and Discharging Processes Based on the International Electrotechnical Commission 61850 Standard and Its Extensions. IEEE Ind. Electron. Mag. 2020, 14, 4–14. [Google Scholar] [CrossRef]
- Mackiewicz, R.E. Overview of IEC 61850 and Benefits. In Proceedings of the 2006 IEEE Power Engineering Society General Meeting, Montreal, QC, Canada, 18–22 June 2006; IEEE: Piscataway, NJ, USA, 2006; p. 8. [Google Scholar]
- Youssef, T.A.; El Hariri, M.; Bugay, N.; Mohammed, O.A. IEC 61850: Technology Standards and Cyber-Threats. In Proceedings of the 2016 IEEE 16th International Conference on Environment and Electrical Engineering (EEEIC), Florence, Italy, 7–10 June 2016; IEEE: Piscataway, NJ, USA, 2016; pp. 1–6. [Google Scholar]
- Aftab, M.A.; Hussain, S.M.S.; Ali, I.; Ustun, T.S. IEC 61850-Based Substation Automation System: A Survey. Int. J. Electr. Power Energy Syst. 2020, 120, 106008. [Google Scholar] [CrossRef]
- Shin, I.J.; Song, B.K.; Eom, D.S. International Electronical Committee (IEC) 61850 Mapping with Constrained Application Protocol (CoAP) in Smart Grids Based European Telecommunications Standard Institute Machine-to-Machine (M2M) Environment. Energies 2017, 10, 393. [Google Scholar] [CrossRef]
- Ozansoy, C.R.; Zayegh, A.; Kalam, A. Object Modeling of Data and Datasets in the International Standard IEC 61850. IEEE Trans. Power Deliv. 2009, 24, 1140–1147. [Google Scholar] [CrossRef]
- Kostic, T.; Preiss, O.; Frei, C. Understanding and Using the IEC 61850: A Case for Meta-Modelling. Comput. Stand. Interfaces 2005, 27, 679–695. [Google Scholar] [CrossRef]
- Ihle, C.; Trautwein, D.; Schubotz, M.; Meuschke, N.; Gipp, B. Incentive Mechanisms in Peer-to-Peer Networks—A Systematic Literature Review. ACM Comput. Surv. 2023, 55 (Suppl. S14), 1–69. [Google Scholar] [CrossRef]
- Reckerd, D.; Vico, J. Application of Peer-to-Peer Communication, for Protection and Control, at Seward Distribution Substation. In Proceedings of the 58th Annual Conference for Protective Relay Engineers, College Station, TX, USA, 5–7 April 2005; IEEE: Piscataway, NJ, USA, 2005; pp. 40–45. [Google Scholar]
- Wojdak, W. Rapid Spanning Tree Protocol: A New Solution from an Old Technology. Reprinted from CompactPCI Systems March 2003. Available online: http://pdf.cloud.opensystemsmedia.com/advancedtca-systems.com/PerfTech.Mar03.pdf (accessed on 13 March 2025).
- Marchese, M.; Mongelli, M. Simple Protocol Enhancements of Rapid Spanning Tree Protocol Over Ring Topologies. Comput. Netw. 2012, 56, 1131–1151. [Google Scholar] [CrossRef]
- Pallos, R.; Farkas, J.; Moldovan, I.; Lukovszki, C. Performance of Rapid Spanning Tree Protocol in Access and Metro Networks. In Proceedings of the 2007 Second International Conference on Access Networks & Workshops, Ottawa, ON, Canada, 22–24 August 2007; IEEE: Piscataway, NJ, USA, 2007; pp. 1–8. [Google Scholar]
- Li, Q.; Wang, D.; Huang, X.; Zhang, H. A System Configuration Description Language (SCL) Complied File Based Configuration Method for Bridges in Smart Substation. In Proceedings of the 2023 7th International Conference on Smart Grid and Smart Cities (ICSGSC), Lanzhou, China, 22–24 September 2023; IEEE: Piscataway, NJ, USA, 2023; pp. 136–141. [Google Scholar]
- Cruz, J.P.; Kaji, Y.; Yanai, N. RBAC-SC: Role-Based Access Control Using Smart Contract. IEEE Access 2018, 6, 12240–12251. [Google Scholar] [CrossRef]
- Zeeshan, M.; Manzoor, M.F.; Qadir, J. Backup Channel and Cooperative Channel Switching On-Demand Routing Protocol for Multi-Hop Cognitive Radio Ad Hoc Networks (BCCCS). In Proceedings of the 2010 6th International Conference on Emerging Technologies (ICET), Islamabad, Pakistan, 18–19 October 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 394–399. [Google Scholar]
- Ahmad, A.; El Haffar, A.; Lavanya, P. Moving towards reliable and fault-tolerant smart grid systems. Int. J. Adv. Res. Electr. Electron. Instrum. Eng. 2023, 1, 508. [Google Scholar] [CrossRef]
- Essackjee, I.A. Leveraging disruptive technologies to realize the smart grid. ResearchGate 2023, 1, 346402311. Available online: https://www.researchgate.net/profile/Ismael-Essackjee/publication/346402311_Leveraging_Disruptive_Technologies_to_Realize_the_Smart_Grid/links/62a23ada55273755ebe07e71/Leveraging-Disruptive-Technologies-to-Realize-the-Smart-Grid.pdf (accessed on 14 March 2025).
- De Almeida, L.F.F.; Pereira, L.A.M.; Sodré, A.C. Control networks and smart grid teleprotection: Key aspects, technologies, protocols, and case-studies. IEEE Access 2020, 1, 9200485. [Google Scholar] [CrossRef]
- Alabi, M. The Impact of Artificial Intelligence on Network Optimization in Telecommunications. ResearchGate 2023, 1, 384664972. Available online: https://www.researchgate.net/profile/Moses-Alabi/publication/384664972_The_Impact_of_Artificial_Intelligence_on_Network_Optimization_in_Telecommunications/links/6701933d9e6e82486f0549d5/The-Impact-of-Artificial-Intelligence-on-Network-Optimization-in-Telecommunications.pdf (accessed on 14 March 2025).
- Umoga, U.J.; Sodiya, E.O.; Ugwuanyi, E.D.; Jacks, B.S.; Lottu, O.A.; Daraojimba, O.D.; Obaigbena, A. Exploring the potential of AI-driven optimization in enhancing network performance and efficiency. Magna Sci. Adv. Res. Rev. 2024, 10, 368–378. [Google Scholar] [CrossRef]
- Cruz, Y.J.; Castaño, F.; Haber, R.E.; Villalonga, A.; Ejsmont, K.; Gladysz, B.; Flores, Á.; Alemany, P. Self-Reconfiguration for Smart Manufacturing Based on Artificial Intelligence: A Review and Case Study. In Artificial Intelligence in Manufacturing: Enabling Intelligent, Flexible and Cost-Effective Production Through AI; Springer Nature: Cham, Switzerland, 2024; pp. 121–144. [Google Scholar]
- Lin, Y.; Bie, Z. A review of key strategies in realizing power system resilience. Glob. Energy Interconnect. 2018, 1, 2096511718300094. Available online: https://www.sciencedirect.com/science/article/pii/S2096511718300094 (accessed on 14 March 2025).
- Yu, P.; Shi, L.; Liu, B. Survivability-aware routing restoration mechanism for smart grid communication network in large-scale failures. EURASIP J. Wirel. Commun. Netw. 2020, 1, 104. Available online: https://link.springer.com/article/10.1186/s13638-020-1653-4 (accessed on 14 March 2025).
- Moradi, M.H.; Razini, S.; Hosseinian, S.M. State of the art of multi-agent systems in power engineering: A review. Renew. Sustain. Energy Rev. 2016, 58, 814–824. [Google Scholar] [CrossRef]
- Cheng, L.; Yu, T. Smart Dispatching for Energy Internet with Complex Cyber-Physical-Social Systems: A Parallel Dispatch Perspective. Int. J. Energy Res. 2019, 43, 3080–3133. [Google Scholar] [CrossRef]
- Cheng, L.; Yu, T.; Zhang, X.; Yin, L. Machine Learning for Energy and Electric Power Systems: State of the Art and Prospects. Autom. Electr. Power Syst. 2019, 43, 15–43. [Google Scholar] [CrossRef]
- Renugadevi, R.; Shobana, J.; Arthi, K.; AV, K.; Satishkumar, D.; Sivaraja, M. Real-Time Applications of Artificial Intelligence Technology in Daily Operations. In Using Real-Time Data and AI for Thrust Manufacturing; IGI Global: Hershey, PA, USA, 2024; pp. 243–257. [Google Scholar]
- Cen, J.; Yang, Z.; Liu, X.; Xiong, J.; Chen, H. A Review of Data-Driven Machinery Fault Diagnosis Using Machine Learning Algorithms. J. Vib. Eng. Technol. 2022, 10, 2481–2507. [Google Scholar] [CrossRef]
- Diez-Olivan, A.; Del Ser, J.; Galar, D.; Sierra, B. Data Fusion and Machine Learning for Industrial Prognosis: Trends and Perspectives Towards Industry 4.0. Inf. Fusion 2019, 50, 92–111. [Google Scholar] [CrossRef]
- Fernandes, M.; Corchado, J.M.; Marreiros, G. Machine Learning Techniques Applied to Mechanical Fault Diagnosis and Fault Prognosis in the Context of Real Industrial Manufacturing Use-Cases: A Systematic Literature Review. Appl. Intell. 2022, 52, 14246–14280. [Google Scholar] [CrossRef]
- Saufi, S.R.; Ahmad, Z.A.B.; Leong, M.S.; Lim, M.H. Challenges and Opportunities of Deep Learning Models for Machinery Fault Detection and Diagnosis: A Review. IEEE Access 2019, 7, 122644–122662. [Google Scholar] [CrossRef]
- Leite, D.; Martins Jr, A.; Rativa, D.; De Oliveira, J.F.; Maciel, A.M. An Automated Machine Learning Approach for Real-Time Fault Detection and Diagnosis. Sensors 2022, 22, 6138. [Google Scholar] [CrossRef]
- Jung, K.H.; Kim, H.; Ko, Y. Network Reconfiguration Algorithm for Automated Distribution Systems Based on Artificial Intelligence Approach. IEEE Trans. Power Deliv. 1993, 8, 1933–1941. [Google Scholar] [CrossRef]
- Shakiba, F.M.; Azizi, S.M.; Zhou, M.; Abusorrah, A. Application of Machine Learning Methods in Fault Detection and Classification of Power Transmission Lines: A Survey. Artif. Intell. Rev. 2023, 56, 5799–5836. [Google Scholar] [CrossRef]
- Bruton, K.; Raftery, P.; Kennedy, B.; Keane, M.M.; O’sullivan, D.T.J. Review of Automated Fault Detection and Diagnostic Tools in Air Handling Units. Energy Effic. 2014, 7, 335–351. [Google Scholar] [CrossRef]
- Fenton, W.G.; McGinnity, T.M.; Maguire, L.P. Fault Diagnosis of Electronic Systems Using Intelligent Techniques: A Review. IEEE Trans. Syst. Man Cybern. Part C 2001, 31, 269–281. [Google Scholar] [CrossRef]
- Wu, D.; Zheng, A.; Yu, W.; Cao, H.; Ling, Q.; Liu, J.; Zhou, D. Digital Twin Technology in Transportation Infrastructure: A Comprehensive Survey of Current Applications, Challenges, and Future Directions. Appl. Sci. 2025, 15, 1911. [Google Scholar] [CrossRef]
- Arora, S.; Tewari, A. AI-Driven Resilience: Enhancing Critical Infrastructure with Edge Computing. Int. J. Curr. Eng. Technol. 2022, 12, 151–157. [Google Scholar]
- Nasarian, E.; Alizadehsani, R.; Acharya, U.R.; Tsui, K.L. Designing interpretable ML system to enhance trust in healthcare: A systematic review to propose responsible clinician-AI-collaboration framework. Inf. Fusion 2024, 108, 102412. [Google Scholar] [CrossRef]
- KN, K.; Perrusquia, A.; Tsourdos, A.; Ignatyev, D. Integrating Explainable AI into Two-Tier ML Models for Trustworthy Aircraft Landing Gear Fault Diagnosis. In AIAA SCITECH 2025 Forum; American Institute of Aeronautics and Astronautics: Reston, VA, USA, 2025; p. 1928. [Google Scholar]
- Nguyen, T.T.; Nguyen, N.D.; Nahavandi, S. Deep reinforcement learning for multiagent systems: A review of challenges, solutions, and applications. IEEE Trans. Cybern. 2020, 50, 3826–3839. [Google Scholar] [CrossRef]
- Cross, L.; Cockburn, J.; Yue, Y.; O’Doherty, J.P. Using deep reinforcement learning to reveal how the brain encodes abstract state-space representations in high-dimensional environments. Neuron 2021, 109, 724–738. [Google Scholar] [CrossRef]
- Altaher, A. Implementation of a Dependability Framework for Smart Substation Automation Systems: Application to Electric Energy Distribution. Ph.D. Thesis, Université Grenoble Alpes, Grenoble, France, 2018. [Google Scholar]
- Baigent, D.; Adamiak, M.; Mackiewicz, R.; Sisco, G.M.G.M. IEC 61850 Communication Networks and Systems in Substations: An Overview for Users; SISCO Systems: Sterling Heights, MI, USA, 2004. [Google Scholar]
- Cappart, Q.; Chételat, D.; Khalil, E.B.; Lodi, A.; Morris, C.; Veličković, P. Combinatorial optimization and reasoning with graph neural networks. J. Mach. Learn. Res. 2023, 24, 1–61. [Google Scholar]
- Alex-Omiogbemi, A.A.; Sule, A.K.; Omowole, B.M. Conceptual framework for advancing regulatory compliance and risk management in emerging markets through digital innovation. World J. Adv. Res. Rev. Dec. 2024, 24, 1155–1162. [Google Scholar] [CrossRef]
- Wang, X.; Wu, Y.C. Balancing innovation and regulation in the age of generative artificial intelligence. J. Inf. Policy 2024, 14, 93–112. [Google Scholar] [CrossRef]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).



