Enhancing Substation Protection Reliability Through Economical Redundancy Schemes

Samkari, Husam S.

doi:10.3390/electronics14204097


by Husam S. Samkari ¹,²

¹ Electrical Engineering Department, University of Tabuk, Tabuk 47713, Saudi Arabia

² Artificial Intelligence and Sensing Technologies Research Center, University of Tabuk, Tabuk 47713, Saudi Arabia

Electronics 2025, 14(20), 4097; https://doi.org/10.3390/electronics14204097

Submission received: 24 September 2025 / Revised: 17 October 2025 / Accepted: 17 October 2025 / Published: 19 October 2025

(This article belongs to the Special Issue Advances in MIMO Communication)


Abstract

This paper proposes an economical scheme that provides redundancy for protection in digital power sub-transmission and distribution substations. The scheme is based on Ethernet communication networks and uses International Electrotechnical Commission (IEC) 61850 sampled values (SV). It gives feeder and bus relays alternative sources of SV measurements. The objective is to improve the overall reliability of substation protection while using the same number of intelligent electronic devices (IEDs), here referred to as merging units (MUs). The proposed multisource-based scheme does not require two sets of MUs for redundancy. Instead, each MU is used to back up an adjacent MU. For instance, if an MU fails in a substation using IEC 61850, the protection relay can automatically switch to another available SV stream without interrupting the protection function. This dynamic reconfiguration capability allows real-time adaptation to changing conditions within the substation and is particularly valuable for maintaining system reliability during equipment failures. The paper evaluates the reliability of the proposed scheme using fault tree analysis (FTA). For demonstration, commercially available MUs and relays are connected to a Real-Time Digital Simulator (RTDS) for hardware-in-the-loop testing.

Keywords: relay; IEC 61850; reliability; power substation; intelligent electronic device; protection

1. Introduction

Fully redundant protection systems are common practice in most electrical power substations. The main goals of redundancy are to improve system reliability and availability and to reduce overall operating costs by limiting the size and duration of power outages [1]. Guidelines for substation protection system redundancy are being considered by the North American Electric Reliability Corporation (NERC) and the Western Electricity Coordinating Council (WECC) [2,3]. The guidelines are intended for transmission systems, and some utilities apply the same criteria to sub-transmission and distribution systems [1]. The importance of effective redundancy schemes grows with the demand for a dependable and uninterrupted power supply. The proposed scheme addresses this need, particularly given the increasing integration of renewable energy into sub-transmission and distribution systems, by offering an economical and reliable solution for substation protection. Unlike previous studies that primarily addressed full or partial hardware redundancy, this work introduces a novel multisource-based digital redundancy scheme that maintains equivalent reliability using the same number of devices. This innovation directly addresses cost constraints in medium-voltage substations while preserving operational performance.

An effective redundant system is defined as two or more protection elements that function in parallel and overlap to protect a specific part of the system. An example of local component redundancy is to have two sets of protective relays, current transformers (CTs), voltage transformers (VTs), and power supplies. Protection system redundancy is commonly applied in one of two ways. The first approach treats one system as the primary system and the other as a backup system with a time delay. The second approach runs both systems in parallel without time delays and designates them systems A and B. In this case there is no preferred system, and both should race to trip. In both approaches, the two systems should be independent of each other and available to protect the specified system [1].

However, the financial implications of these methods can be substantial, particularly for sub-transmission and distribution systems, where cost constraints are a significant consideration. Utilities have often implemented partial redundancy schemes for sub-transmission and distribution substations, which involve duplicating only the most critical components while relying on robust design and maintenance practices. In response, recent research has focused on developing more economical redundancy schemes that provide similar levels of reliability while reducing costs [4]. The idea is to create independent protection systems that operate in parallel, ensuring that if one system fails, the other can take over without interruption [5]. Although partial redundancy can reduce costs compared to full redundancy, it still involves a significant financial expenditure and does not eliminate the risk of system failures [6]. This has led to the exploration of alternative redundancy strategies that can achieve similar levels of reliability at a lower cost.

Using the International Electrotechnical Commission (IEC) standard 61850 in substation protection over Ethernet networks has gained acceptance for substation applications [7]. The standard defines the power substation’s communication protocol, data format, and configuration language [7]. The IEC 61850 standard has positive impacts, such as increasing power quality, reducing outages, and reducing substation costs. Protection schemes that apply the standard have advantages such as reduced wiring and easy adaptation to changing bus configuration in the substation [8,9,10,11,12].

A typical IEC 61850 networked digital secondary system consists of sampled value (SV) subscriber and generic object-oriented substation event (GOOSE) publisher relays in the control house; SV publisher and GOOSE subscriber intelligent electronic devices (IEDs), also referred to as merging units (MUs), in the yard; a time source providing a common time reference; and a network of Ethernet switches [7,8]. While networked digital secondary systems offer flexibility, redundant components are necessary to prevent loss of protection from a single component failure. The published data can be accessed by the entire process bus network, allowing subscriber relays to remain secure and dependable during an MU, time source, or network failure without modifying existing physical connections [11,13,14].

In other words, protection relays can subscribe to multiple SV streams from different MUs, ensuring continuous protection even if one data source fails. This approach significantly reduces the need for physical redundancy, lowering costs and simplifying the system architecture. Digital redundancy schemes based on IEC 61850 offer several advantages over traditional methods. These include reduced installation costs, easier upgrades, and greater adaptability to changing substation configurations. Moreover, the flexibility of digital systems makes them particularly suitable for sub-transmission and distribution substations, where economic considerations are most important [15,16,17].

This paper proposes an economic multisource-based redundancy scheme for protection in digital power sub-transmission and distribution substations using the IEC 61850 standard. The proposed scheme does not require two sets of protection components for redundancy. Instead, each MU from a feeder is used to back up an adjacent MU in a daisy chain configuration. The proposed scheme uses the same number of MUs and CTs needed in a no-redundancy scheme while improving the overall reliability of protection systems. The reliability of the proposed scheme is evaluated using fault tree analysis (FTA) [18,19]. For demonstration, commercially available MUs and relays are connected to a Real-Time Digital Simulator (RTDS) (RTDS Technologies Inc., Winnipeg, MB, Canada) for hardware-in-the-loop (HIL) testing.

The main contributions of this work can be summarized as follows. First, the paper introduces an economical redundancy scheme that enables adaptive switching between multiple MUs without requiring duplicate sets of hardware, thereby maintaining full protection reliability while reducing system cost. Second, a quantitative reliability assessment is performed using the FTA method to demonstrate the improvement in system availability and identify critical components influencing protection performance. Third, an HIL validation using RTDS and commercially available relays is presented to verify the proposed redundancy mechanism under real-time operating conditions. Together, these contributions establish a practical and scalable approach to enhance protection reliability in medium-voltage sub-transmission and distribution substations. The core mechanism of the proposed redundancy algorithm is an adaptive switching logic integrated into the protection relays. Each relay subscribes to multiple SV streams from adjacent MUs through the IEC 61850 process bus. When a primary MU fails, the relay automatically switches to an alternative SV source within one to two cycles, maintaining protection continuity without adding extra hardware.

2. Reliability Analysis and Proposed Scheme

As discussed earlier, a complete redundancy scheme is essential for transmission-level substations, where every component should be backed up. However, utilities might not consider full redundancy for sub-transmission and distribution substations because of the high cost. In contrast, the redundancy scheme proposed in this paper uses adjacent MUs for backup.

Since Ethernet networks are inherently nondeterministic, measuring analog data for protection applications requires devices to use a common high-accuracy time reference. In addition, a properly configured process bus Ethernet network requires adequate bandwidth, security, redundancy, and traffic management. While Ethernet networking and time source configuration are vital to an IEC 61850-9-2 system, this paper focuses exclusively on the configuration of MUs [18,19].

The reliability of protection systems is evaluated using FTA introduced in [18,19]. Section 2.1 and Section 2.2 provide a brief background on FTA and IEC 61850. A two-feeder system with a no-redundancy scheme is analyzed using FTA and discussed in Section 2.3. Section 2.4 introduces the multisource-based redundancy scheme. The scheme is also analyzed using FTA and compared to the no-redundancy scheme.

2.1. Fault Tree Analysis

The FTA method is applied across various industries, particularly in power system protection, due to its effectiveness in identifying and mitigating potential system failures, such as single points of failure in the protection system [5]. The method is well-established for assessing the reliability of protection schemes by modeling the logical relationships between different failure modes and calculating the overall probability of system failure. The FTA method has been applied in evaluating both traditional and digital redundancy schemes in substations [4,6,18,19,20].

FTA is a top-down, deductive analysis method used to systematically determine the causes of system failures. The analysis begins with a specific system failure and works downwards through a logical structure to identify all possible events that could contribute to this failure. These events, represented in a fault tree diagram, typically include component failures, human errors, or environmental conditions [18,19,21]. Static fault tree analysis (SFTA) is commonly used in power system protection, specifically when failures are independent and time-invariant [5]. However, there are other types of FTA, such as dynamic fault tree analysis (DFTA), fuzzy fault tree analysis (FFTA), and binary decision diagram-based FTA (BDD-FTA) [6,21].

The unavailability of protection components is a critical metric in the FTA method. It represents the fraction of time a protection component cannot perform its intended function. It is equal to the device's failure rate multiplied by the average downtime per failure, as shown in (1). The notations in the equation are defined as follows: q is the unavailability, λ is the constant failure rate, T is the average downtime per failure, and the mean time between failures (MTBF) is the average time between successive failures, so that λ = 1/MTBF.

q = \lambda T = \frac{T}{\mathrm{MTBF}} \qquad (1)

The unavailability of a component is unitless and is calculated from the failure rates and repair times of individual components. These values can come from theoretical calculations, such as the U.S. Department of Defense Military Handbook (MIL-HDBK-217F) parts-count procedures, or from field experience [18,19].
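As a rough illustration of Equation (1), the sketch below computes unavailability from an MTBF and a mean downtime. The function name and the numeric inputs (a 100-year MTBF and 48 h of downtime) are hypothetical values chosen for the example; they are not taken from Table 1.

```python
HOURS_PER_YEAR = 8760.0  # 365 days; leap years are ignored in this sketch


def unavailability(mtbf_years: float, downtime_hours: float) -> float:
    """Return q = lambda * T = T / MTBF (unitless), following Equation (1)."""
    if mtbf_years <= 0:
        raise ValueError("MTBF must be positive")
    if downtime_hours < 0:
        raise ValueError("downtime must be non-negative")
    failure_rate_per_hour = 1.0 / (mtbf_years * HOURS_PER_YEAR)  # lambda
    return failure_rate_per_hour * downtime_hours


# Hypothetical device: one failure per 100 years on average, 48 h to restore.
q_example = unavailability(mtbf_years=100.0, downtime_hours=48.0)
print(f"q = {q_example * 1e6:.1f} x 10^-6")
```

The product form is a good approximation only when the downtime is much shorter than the MTBF, which is the usual situation for protection equipment.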

FTA drives decisions on whether to add redundancy. However, the benefit of redundancy must be weighed against the additional cost. Failure rates of relays and other power protection components are estimated in Table 1. The unavailabilities of the individual components in the protection system are then aggregated to assess the overall system's reliability [18,22,23].

FTA breaks down the top events into lower-level events. Logic gates in FTA show the relationship between lower-level and top events. An OR gate represents a failure caused by any of several possible lower-level failures, where the unavailability of a subsystem equals the total of the unavailabilities of the individual devices. An AND gate represents a failure caused when all lower-level failures occur, where the unavailability of a subsystem is determined by multiplying the unavailabilities of the individual devices [19].
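The two gate rules described above can be sketched in a few lines. The function names and the sample unavailabilities are illustrative assumptions, not values from the paper.

```python
from math import prod
from typing import Iterable


def or_gate(unavailabilities: Iterable[float]) -> float:
    """OR gate: any single input failure causes the output failure.

    Uses the sum of the inputs, which is the rare-event approximation.
    """
    return sum(unavailabilities)


def and_gate(unavailabilities: Iterable[float]) -> float:
    """AND gate: all independent inputs must fail, so multiply the inputs."""
    return prod(unavailabilities)


# Hypothetical unavailabilities for two devices.
q_a, q_b = 100e-6, 200e-6
print(f"OR : {or_gate([q_a, q_b]):.3e}")
print(f"AND: {and_gate([q_a, q_b]):.3e}")
```

The sum used for the OR gate slightly overestimates the exact value, 1 - (1 - q_a)(1 - q_b), but the difference is negligible when the individual unavailabilities are on the order of 10^-6 to 10^-3. The AND result shows why redundancy is effective: two devices in parallel yield an unavailability several orders of magnitude smaller than either device alone.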

Over the past decades, the codebase for protective relays has grown significantly. In the context of protective relay systems, KLOC (thousands of lines of code) is an essential metric for evaluating software complexity introduced in [22]. For instance, early microprocessor-based relays operated with just 20 KLOC, primarily focused on essential protection functions.

However, modern relays, with advanced features such as Ethernet, SV, and GOOSE, have expanded to 600 KLOC. While this increase in lines of code enables improved protection, automation, and communication features, it also introduces the potential for software-related defects and human configuration errors. Table 1 shows several protection components’ unavailability and considers the firmware’s KLOC factor [22].

2.2. Redundancy with IEC 61850

IEC 61850 in power protection systems establishes digital communication over Ethernet networks between the protection components, enabling resilient protection schemes. Traditional protection systems relied heavily on physical redundancy, requiring duplicate relays, CTs, and VTs. IEC 61850 allows protection relays to subscribe to multiple SV streams from different MUs. This digital approach ensures continuous protection even if one data source fails, significantly reducing the need for physical redundancy [16,24].

The standard’s flexibility also simplifies upgrading or expanding substation equipment, as new devices can be easily integrated into the existing digital framework without requiring extensive rewiring or additional hardware [17]. Moreover, it provides various options, including the centralization of protection and the digitization of secondary systems [23].

2.3. Fault Tree Analysis for a System with No Redundancy

In this study, one MU per feeder is assumed to simplify the FTA for a no-redundancy scheme and clarify the comparison with the proposed redundancy scheme. In all FTAs throughout the paper, components are noted with acronyms for the clarity and consistency of the diagrams: CT for a current transformer, DC for a direct current (DC) power system, CB for a circuit breaker, MU for a merging unit, HW for hardware, and FW for firmware. Also, the unavailability estimates in Table 1 are used as baseline values for all components in the FTA, and the human configuration errors are ignored since they are irrelevant to the analysis.

Figure 1a shows a two-feeder substation system with no redundancy scheme for bus and feeder protection systems. The system has two CTs per feeder and two MUs, one per feeder. The current transformer CT-F1 is wired to the merging unit MU-1, and relay-1 is subscribed to MU-1 for feeder one protection. Similarly, CT-F2 is wired to MU-2, and relay-2 is subscribed to MU-2 for feeder two protection. CT-B1 is wired to MU-1, and CT-B2 is wired to MU-2 for bus protection. Relay-B is subscribed to both MU-1 and MU-2.

The unavailability of feeder protection is 1094 × 10⁻⁶, as shown in Figure 2. For bus protection, the unavailability is 1456 × 10⁻⁶, as shown in Figure 3. The higher unavailability observed in the bus protection system compared to feeder protection is largely due to its increased complexity and larger number of components. Section 2.4 uses this simple two-feeder substation system as a reference case to compare the same system with the proposed multisource-based redundancy scheme. Communication equipment will also be considered.

2.4. Fault Tree Analysis for Proposed Redundancy

The primary goal of the proposed redundancy scheme is to improve overall system reliability by reducing the unavailability of feeder and bus protection with the same number of MUs and CTs. In [25], a similar method is applied at Snohomish County Public Utility District to improve the reliability of their protective microprocessor-based relay systems. However, the utility is not using communication systems between the yard and the control house.

Instead, physical copper wirings connect relays and CTs to provide current measurements for each feeder. Each relay provides primary protection and control for an assigned feeder, plus simple backup protection for an adjacent feeder. The backup relay is set with a slightly higher pickup and longer time lever than the primary to ensure relay selectivity, which means the primary relay trips only the corresponding circuit breaker. The proposed scheme builds on such concepts by leveraging digital communications to eliminate the need for physical backup wiring, further enhancing flexibility and reducing costs.

The proposed scheme uses SV and GOOSE over Ethernet networks to provide redundant analog data and circuit breaker control without needing backup MUs. This is achieved by daisy-chaining the CT and trip circuits of each merging unit with those of adjacent MUs.

An example of this configuration is shown for a two-feeder substation in Figure 4. In this configuration, each MU publishes analog data and issues trip commands for two different circuit breakers (CBs). Under these conditions, losing a single MU will not compromise protection performance. There is no need for an intentional delay or different pickup values between the primary and the backup systems. Both MUs provide a data stream, and the relay can switch between them if failures occur. Compared with the scheme described in [25], this allows the subscriber relay to alternate between multiple current sources, enabling full backup protection.

Figure 1b shows an example of a two-feeder substation. Feeder one is protected by relay-1, which subscribes to merging units MU-1 and MU-2 for SV streams. MU-1 is wired to current transformers CT-F1 and CT-B2, whereas MU-2 is wired to CT-F2 and CT-B1. The primary source of the SV stream for relay-1 is MU-1 through CT-F1, whereas the alternative source is MU-2 through CT-B1. Also, MU-1 and MU-2 subscribe to relay-1 for GOOSE messages and are wired to circuit breakers CB-1 and CB-2.

Similarly, feeder two is protected by relay-2, which also subscribes to MU-1 and MU-2. However, the primary source of the SV stream for relay-2 is MU-2 through CT-F2, whereas the alternative source is MU-1 through CT-B2.

The same MUs used for feeder protection are used for bus protection by relay-B. The only difference is that the roles are swapped: for bus data acquisition on feeder one, MU-2 and CT-B1 are the primary sources, and MU-1 and CT-F1 are the alternative sources.

The terms primary and main are used interchangeably to refer to the same concept, which denotes the source from which the protection relays receive their current measurements during normal operation. The terms alternative and backup are also used interchangeably to refer to the same concept, which indicates the redundant source of data a relay uses if the primary source fails.
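The primary/alternative selection can be pictured with a minimal sketch. The `SvStream` type, its `healthy` flag, and the `select_source` function are hypothetical names introduced here for illustration; the actual relay logic is vendor-specific and is not reproduced in this paper. Returning no source when both streams are lost is likewise an assumption of the sketch.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SvStream:
    """Hypothetical view of one subscribed sampled-value stream."""
    source: str    # publishing merging unit, e.g. "MU-1"
    healthy: bool  # True while valid samples are being received


def select_source(primary: SvStream, backup: SvStream) -> Optional[str]:
    """Prefer the primary stream, fall back to the backup, else no source."""
    if primary.healthy:
        return primary.source
    if backup.healthy:
        return backup.source
    return None  # both streams lost: no measurement source is available


# Relay-1 in the two-feeder example: primary MU-1, alternative MU-2.
normal = select_source(SvStream("MU-1", True), SvStream("MU-2", True))
mu1_failed = select_source(SvStream("MU-1", False), SvStream("MU-2", True))
print(normal, mu1_failed)
```

Because both streams are already subscribed, the fallback is a selection between data that is present at the relay rather than a reconfiguration of the network.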

Unlike full redundancy schemes, the proposed scheme has a dynamic protection zone. Figure 5b–d illustrate the dynamic changes in the feeder one, feeder two, and bus protection zones when CT-B1, CT-F1, and MU-1 fail. The protection zones during normal conditions are shown in Figure 5a. When CT-B1 fails, the bus protection zone dynamically contracts because the other CT is used as an alternative source of current, as shown in Figure 5b. Also, when CT-F1 fails, the feeder protection zone dynamically contracts, as in Figure 5c. The last example is when MU-1 fails, which changes the bus protection zone in feeder two and the feeder protection zone in feeder one, as in Figure 5d.

Figure 6 shows FTA for feeder protection, and Figure 7 shows FTA for bus protection, with the addition of the proposed multisource-based redundancy scheme. The FTA results show the advantage of the proposed scheme compared to the no-redundancy scheme discussed earlier. The unavailability of feeder protection is 732.1111 × 10⁻⁶, which is about 33% less than the no-redundancy scheme. For bus protection, the unavailability is 732.1138 × 10⁻⁶, which is about 49% less than that of the no-redundancy scheme. The bus protection system shows notable gains from the redundancy scheme since it benefits from the larger number of redundant components.
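The quoted percentages follow directly from the unavailability values. The short check below reproduces the arithmetic; the helper name `percent_reduction` is introduced only for this example, and all inputs are the figures stated in the text, in units of 10^-6.

```python
def percent_reduction(baseline: float, improved: float) -> float:
    """Relative reduction of `improved` with respect to `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - improved) / baseline


# Unavailabilities in units of 10^-6, as reported for the two schemes.
feeder_reduction = percent_reduction(1094.0, 732.1111)
bus_reduction = percent_reduction(1456.0, 732.1138)

print(f"feeder: {feeder_reduction:.1f} %")
print(f"bus   : {bus_reduction:.1f} %")
```

The feeder figure comes out close to 33% and the bus figure slightly below 50%, in line with the rounded values given above.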

Figure 8 provides a detailed breakdown of the failure paths within a protection system that does not include redundancy for the communication system. Feeder and bus protection unavailabilities are considered with and without redundancy. The FTA shows that the overall system’s unavailability is impacted by two key factors: the failure of communication systems and the failure to clear faults in the protection system. The unavailability due to communication failures is calculated to be 1564 × 10⁻⁶. This failure can come from multiple factors introduced earlier in Table 1.

The comparison illustrates the importance of communication systems in maintaining reliable protection operations. Without redundancy in the communication system, unavailability can increase significantly, as shown by the high clock and switch failure values.
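To see why the communication branch matters, the sketch below combines it with the feeder protection branch. It assumes that the two branches feed a top-level OR gate, so their unavailabilities add; this is an assumption made for illustration, and the resulting totals should be checked against Figure 8 and Table 2 rather than read as reported results. The constant and function names are hypothetical.

```python
# All values in units of 10^-6, taken from the text.
COMM_Q = 1564.0           # communication system branch
FEEDER_Q_NO_RED = 1094.0  # feeder protection, no-redundancy scheme
FEEDER_Q_RED = 732.1111   # feeder protection, proposed scheme


def top_event(comm_q: float, protection_q: float) -> float:
    """Assumed top-level OR gate: sum of the two branch unavailabilities."""
    return comm_q + protection_q


total_no_red = top_event(COMM_Q, FEEDER_Q_NO_RED)
total_red = top_event(COMM_Q, FEEDER_Q_RED)
overall_reduction = 100.0 * (total_no_red - total_red) / total_no_red

print(f"without redundancy: {total_no_red:.1f} x 10^-6")
print(f"with redundancy   : {total_red:.1f} x 10^-6")
print(f"overall reduction : {overall_reduction:.1f} %")
```

Under this assumption the non-redundant communication branch is unchanged by the scheme, so the relative improvement at the top event is noticeably smaller than the improvement in the protection branch alone.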

Table 2 and Figure 9 compare protection system unavailabilities under two scenarios: with and without redundancy. The proposed redundancy scheme improves the overall system reliability by reducing the unavailability of the protection systems in sub-transmission and distribution substations. However, the magnitude of improvement varies based on system complexity and the components involved. Bus protection systems show notable gains from redundancy, while systems that integrate communication components could benefit from further refinements beyond basic redundancy, focusing on communication system resilience.

As illustrated in Figure 10, the proposed multisource-based redundancy scheme demonstrates a general form of the daisy chain wiring method for MUs connected to CTs across multiple feeders. The design indicates that the multisource-based redundancy approach can be scaled to an arbitrary number of feeders (m) and MUs (n), making it adaptable for sub-transmission and distribution substations of varying sizes. The design ensures that each feeder has multiple redundant sources of current measurements.

Table 2 and Figure 9 summarize the reduction in unavailability achieved by the proposed multisource-based redundancy scheme. The results show a 33% improvement for feeder protection and a 49% improvement for bus protection compared with the non-redundant configuration. These results confirm that the proposed method significantly enhances system reliability while maintaining the same hardware count.

Recent studies have examined redundancy and reliability enhancement in digital substation protection systems using different architectural approaches. The authors in [26] proposed a centralized multiple back-up protection scheme based on IEC 61850, where sampled values are shared between adjacent substations through additional central protection and control units. The authors in [27] analyzed the reliability of protection, automation, and control systems (PACS) using integrated architectures with parallel redundant IEDs and demonstrated measurable reductions in protection unavailability through hardware duplication. In contrast, the proposed multisource-based redundancy scheme achieves comparable or higher reliability improvement through algorithmic redundancy and adaptive switching logic without increasing the number of IEDs or MUs. This distributed approach eliminates single points of failure associated with centralized designs and provides a cost-effective, scalable solution for medium-voltage substation protection.

Figure 10 illustrates the general scalability of the daisy-chain MU configuration and its applicability to substations with an arbitrary number of feeders and merging units.

The FTA results directly inform the configuration of the proposed redundancy scheme by identifying the components with the highest contribution to overall unavailability. The design leverages these findings to minimize single points of failure, ensuring that the relays and MUs most critical to reliability are interconnected digitally. This step-by-step connection between FTA and the redundancy architecture reinforces the methodological consistency of the proposed approach. This study applies a similar principle used in [28] to enhance substation protection reliability through redundancy and adaptive data sharing rather than voltage regulation.

3. Modeling and Simulation

An RTDS model is developed to demonstrate the scheme’s operation [29]. The model includes a three-feeder bus system with two loads, one power source, two sets of CTs, and a CB per feeder, as shown in Figure 11. Then, three commercially available MUs [30], three feeder protection relays [31], and one bus protection relay [32] are connected to the HIL simulation model.

The RTDS model is configured to use the same number of IEDs, MUs, and CTs as the reference no-redundancy system to ensure a consistent basis for comparison. The proposed scheme achieves redundancy through digital interconnection and adaptive data sharing rather than by adding additional hardware devices. This configuration demonstrates that improved reliability can be obtained without increasing equipment count or altering the system’s physical structure.

The CBs are labeled CB-1, 2, and 3 for feeders 1, 2, and 3, respectively. The CTs used as the main source of current for feeder protection are labeled CT-F1, F2, and F3, and the CTs used as the main source for bus protection are labeled CT-B1, B2, and B3 for feeders 1, 2, and 3, respectively. The diagram also indicates four faults in different locations simulated to evaluate the performance of the relays during MU-1 failure.

3.1. RTDS Model and Testbed

The RTDS generates analog voltage and current signals, which are sent to MUs via the RTDS wiring interface, as shown in Figure 12. The interface consists of two main components: a gigabit-transceiver front panel interface (GTFPI) card for low voltage interface, which sends trip signals from the relays to the CBs, and a gigabit transceiver analog output (GTAO) card for high voltage interface, which receives analog voltage and current signals from the RTDS model and sends them to MUs. The interface cards allow for HIL testing, where the commercially available MUs and relays process signals from the simulated power system.

The IEC 61850 process bus network carries the communication between the MUs and the relays, enabling the transmission of SV and GOOSE messages, as shown in Figure 12. The MUs digitize the analog signals from the RTDS model and publish SV streams, which are transmitted over the process bus network to the relays. The relays send trip signals back to the MUs using GOOSE messages. A precision time protocol (PTP) clock provides the common time reference for the relays and MUs.

The RTDS testbed was interfaced with commercial protection relays using GTAO and GTFPI cards through calibrated analog channels. All voltage and current outputs were scaled and verified with an accuracy of ±0.5%. Each fault scenario was executed three times, and the measured relay trip times showed a maximum variation within ±1 ms, confirming the repeatability and stability of the hardware-in-the-loop setup.

Each MU is connected to two associated CTs through the interface cards to obtain current measurements for feeder and bus protection systems using GTAO and connected to CBs for trip control using GTFPI. The proposed multisource-based redundancy with the daisy chain technique is implemented as introduced earlier and shown in Figure 11.

Table 3 summarizes the primary sources and the backup sources for both feeder protection relays and bus relays. The primary current sources for feeder protection relays 1, 2, and 3 are MU-1, MU-2, and MU-3, wired to CT-F1, F2, and F3. On the other hand, the backup current sources MU-3, 1, and 2 are wired to CT-B1, B2, and B3 using alternative connection points within each MU. Similarly, for bus protection, the primary current sources for relay-B are MU-3, 1, and 2, wired to CT-B1, B2, and B3. On the other hand, the backup current sources MU-1, 2, and 3 are wired to CT-F1, F2, and F3.
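The source assignment described above follows a regular ring pattern, which the sketch below generates for any number of feeders. It assumes one MU per feeder, with CT-Fi wired to MU-i and CT-Bi wired to the preceding MU (wrapping around at feeder one). The function names and the dictionary layout are illustrative assumptions, not part of the tested configuration.

```python
from typing import Dict, Tuple

Source = Tuple[str, str]  # (merging unit, current transformer)


def feeder_sources(n_feeders: int) -> Dict[str, Dict[str, Source]]:
    """Primary and backup (MU, CT) pair for each feeder relay in the ring."""
    if n_feeders < 2:
        raise ValueError("at least two feeders are needed for mutual backup")
    sources: Dict[str, Dict[str, Source]] = {}
    for i in range(1, n_feeders + 1):
        prev_mu = n_feeders if i == 1 else i - 1  # wrap around the ring
        sources[f"relay-{i}"] = {
            "primary": (f"MU-{i}", f"CT-F{i}"),
            "backup": (f"MU-{prev_mu}", f"CT-B{i}"),
        }
    return sources


def bus_sources(n_feeders: int) -> Dict[str, Dict[str, Source]]:
    """Relay-B sources per feeder: the feeder roles with primary/backup swapped."""
    return {
        f"feeder-{i}": {"primary": entry["backup"], "backup": entry["primary"]}
        for i, entry in enumerate(feeder_sources(n_feeders).values(), start=1)
    }


three_feeder = feeder_sources(3)
for relay, entry in three_feeder.items():
    print(relay, entry)
```

For three feeders this reproduces the assignment summarized in Table 3, and for two feeders it reduces to the pairing of MU-1 and MU-2 described in Section 2.4.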

The MUs’ currents are denoted as I_{(Phase) (Feeder #) (MU #)}. For example, Table 3’s first row and second column indicate currents for phase A from feeder one and MU 1, the primary current source for relay-1, which protects feeder one.

The RTDS model includes fault controls at four locations to test the system. Also, the failure of MU-1 is simulated by controlling the unit's power supply from the RTDS model. Figure 13 shows relay-1 data acquisition from the MU-1 primary current source in (a) and the MU-3 backup current source in (b) during the simulation of MU-1 failure. The three-phase current signals are no longer available from MU-1, whereas MU-3 continues to provide reliable signals.

3.2. Case Study

Four study cases are conducted, with single line-to-ground (SLG) phase A faults applied in four locations, as illustrated in Figure 11. MU-1 is chosen to fail to demonstrate the proposed multisource-based redundancy scheme. The main difference between the four cases lies in the fault locations, which were strategically chosen to illustrate the dynamic change in protection zones. In feeder one, the feeder protection zone by relay-1 is contracted. In feeder two, the bus protection zone by relay-B is contracted.

Each case focuses on a specific demonstration goal. In case 1, the fault is applied at feeder one, and relay-1 trips for the fault using signals from the backup source MU-3. The dynamic zone change did not impact the case. However, the case demonstrates that relay-1 tripped during MU-1 failure. In case 2, the fault is applied at the overlap feeder-bus protection zone of feeder two, but only relay-2 trips due to the dynamic zone change using the primary source MU-2. In normal conditions, both relay-2 and relay-B would trip for this fault.

In case 3, the fault is applied to the bus protection zone and relay-B trips for the fault. The dynamic zone change did not impact the case. However, the case demonstrates that relay-B tripped during MU-1 failure. In case 4, the fault is applied at the overlap feeder-bus protection zone at feeder one, but only relay-B trips for the fault because of the dynamic zone change. In normal conditions, both relay-1 and relay-B would trip for this fault.

In the system setup, feeder protection relays 1, 2, and 3 are configured to operate on the instantaneous overcurrent element (PIOC) [31], whereas bus protection relay-B is configured to use the bus bar differential element (PDIF) [32].
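
As a rough illustration of the two element types named above, the sketch below reduces each to its textbook decision rule: an instantaneous overcurrent element asserts when a current magnitude exceeds a pickup, and a differential element asserts when the sum of the currents entering a zone exceeds a threshold. The function names, the pickup values, the use of signed scalar currents instead of phasors, and the unrestrained differential rule are all simplifying assumptions for illustration; they are not the settings or algorithms of the relays used in this study [31,32].

```python
# Simplified, hypothetical decision rules. Real relay elements add filtering,
# restraint, and supervision logic that is not modeled here.

def pioc_asserts(current_ka, pickup_ka):
    """Instantaneous overcurrent: assert when |I| exceeds the pickup."""
    return abs(current_ka) > pickup_ka


def pdif_asserts(zone_currents_ka, pickup_ka):
    """Unrestrained differential: assert when the magnitude of the sum of
    currents into the zone exceeds the pickup (positive = into the bus)."""
    operate_ka = abs(sum(zone_currents_ka))
    return operate_ka > pickup_ka


# Illustrative numbers only (not measurements from the test bed).
through_load = [1.0, -0.6, -0.4]     # currents balance: no fault inside zone
internal_fault = [5.0, -0.6, -0.4]   # 4 kA unaccounted for inside the zone

print(pioc_asserts(5.0, pickup_ka=2.0))
print(pdif_asserts(through_load, pickup_ka=0.5))
print(pdif_asserts(internal_fault, pickup_ka=0.5))
```

The differential rule only sees a fault when the currents it sums bound the faulted point, which is why changing the current source for one feeder effectively moves the zone boundary.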

Although this study focuses on four representative fault locations and a single MU failure to demonstrate the concept, the redundancy logic is scalable to larger systems with multiple feeders and concurrent fault conditions. The same RTDS test structure can be extended to additional simulation and hardware-in-the-loop scenarios, enabling experimental verification of multi-fault and simultaneous MU failure events. Future work will expand the test matrix accordingly to further confirm the robustness and adaptability of the proposed scheme.

4. Results and Discussion

The results in this section validate the application of the proposed multisource-based redundancy method and demonstrate the relays’ ability to switch between MUs. Table 4 summarizes, from left to right, which relay tripped, the protection element used, the corresponding MU utilized, and the figure number displaying the results for each case. The event records are extracted from the relays as Common Format for Transient Data Exchange (COMTRADE) files and then analyzed in MATLAB® R2024b (MathWorks, Natick, MA, USA).

Figures 14 to 19 show the event records obtained from the relays, including the phase A current signals published by the main and backup MUs and the relay responses. Subplot (a) displays the current signal, in kiloamperes (kA), published by the main MU, to which the relay subscribes as its main source of current measurements. Subplot (b) displays the current signal, in kA, published by the backup MU, to which the relay subscribes as its alternative source of current measurements. Subplot (c) presents the assertion and de-assertion of the relay word bits, from top to bottom: the availability of the main MU, the availability of the backup MU, the element-enable word bit, and the element-trip word bit.

Moreover, the x-axis in the figures represents the time in milliseconds (ms). Phase A faults and MU failure simulation are applied at 25 ms. The system and notation are consistent with Figure 11 and Figure 12, Table 3 and Table 4.

The results across all figures consistently demonstrate the effectiveness of the multisource-based redundancy scheme and dynamic zone changes. In Case 1, relay-1 successfully trips for a phase A fault applied to feeder one using backup MU-3 signals during MU-1 failure, as shown in Figure 14. In Case 2, relay-2 operates for a phase A fault applied to the overlap zone using the main source MU-2, and relay-B does not trip due to the dynamic zone change, as shown in Figure 15 and Figure 16.

Figure 17 shows Case 3, which validates that relay-B trips for faults in the bus zone during MU-1 failure. In Case 4, relay-B alone trips for a phase A fault applied to the overlap zone at feeder one using the backup source MU-2, and relay-1 does not trip because of the dynamic zone change at feeder one, as shown in Figure 18 and Figure 19.

The results show a brief delay in the relay decision because of the switching between the MUs. The de-assertion of word bits PIOC Enable and PDIF Enable illustrates the delay during the switching. This delay results from the momentary data loss from the main MU, causing the relay’s buffer to clear the previously stored data and acquire data from the backup MU.
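
The switching delay described above can be read off a recorded word bit as the length of its de-asserted interval. The sketch below does this for a sampled boolean trace. The function name, the 0.125 ms sample step, and the trace itself are synthetic placeholders for illustration; they are not taken from the COMTRADE records of this study.

```python
def deassertion_intervals(times_ms, bit):
    """Return (start_ms, end_ms, duration_ms) for every interval in which an
    asserted word bit drops to 0 and later re-asserts."""
    intervals = []
    start = None
    for i in range(1, len(bit)):
        if bit[i - 1] and not bit[i]:
            # Falling edge: the bit de-asserts at this sample.
            start = times_ms[i]
        elif not bit[i - 1] and bit[i] and start is not None:
            # Rising edge: the bit re-asserts, closing the interval.
            intervals.append((start, times_ms[i], times_ms[i] - start))
            start = None
    return intervals


# Synthetic trace: the enable bit drops at 25 ms and returns at 55 ms.
step_ms = 0.125
times = [k * step_ms for k in range(801)]          # 0 ms to 100 ms
enable = [not (25.0 <= t < 55.0) for t in times]

for start, end, duration in deassertion_intervals(times, enable):
    print(f"de-asserted from {start} ms to {end} ms ({duration} ms)")
```

Applied to an element-enable word bit, the reported duration would correspond to the buffer-clearing time discussed in this section.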

More specifically, the results show buffer-clearing time between one and a half and two cycles when relays switch from the main MU to the backup MU. Table 5 summarizes the buffer-clearing times in ms. The buffer-clearing time is about two cycles when the PIOC element is used in relay-1 and relay-2, as shown in Figure 14c and Figure 19c [31]. On the other hand, the buffer-clearing time is about 1.5 cycles when the PDIF element is used in relay-B, as shown in Figure 16c, Figure 17c, and Figure 18c [32].
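
To relate the millisecond values in Table 5 to the cycle counts quoted above, each time can be divided by the duration of one power-system cycle. The sketch below assumes a 60 Hz nominal frequency, which this section does not state explicitly; the assumption is used here because it reproduces the quoted figures of about two cycles and about one and a half cycles.

```python
NOMINAL_HZ = 60.0                   # assumed; not stated in this section
CYCLE_MS = 1000.0 / NOMINAL_HZ      # about 16.67 ms per cycle

# Buffer-clearing times copied from Table 5: (case, relay, element, ms).
table5 = [
    ("Case 1", "Relay-1", "PIOC", 33.63),
    ("Case 2", "Relay-2", "PIOC", 0.0),    # main source in use, no switching
    ("Case 2", "Relay-B", "PDIF", 26.38),
    ("Case 3", "Relay-B", "PDIF", 26.50),
    ("Case 4", "Relay-B", "PDIF", 26.38),
    ("Case 4", "Relay-1", "PIOC", 33.50),
]


def to_cycles(ms):
    """Convert a duration in milliseconds to power-system cycles."""
    return ms / CYCLE_MS


for case, relay, element, ms in table5:
    print(f"{case} {relay} ({element}): {ms:.2f} ms = {to_cycles(ms):.2f} cycles")
```

Under the 60 Hz assumption, the PIOC cases come out at roughly 2.0 cycles and the PDIF cases at roughly 1.6 cycles.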

Relays 1, 2, and 3 [31] are of a different type from relay-B [32], so different buffer-clearing times are observed: the delay introduced by buffer clearing depends on the digital relay architecture. The buffer-clearing duration did not affect the relay decisions, and all relays tripped for in-zone faults.

5. Conclusions

The proposed multisource-based redundancy scheme offers a cost-effective way to improve reliability in sub-transmission and distribution substations by using adjacent MUs to back each other up and allowing relays to switch between MUs during a failure event. The scheme is assessed using the FTA method, which shows that overall system reliability improves because the unavailability of the protection systems is reduced.

Furthermore, commercially available MUs and relays are connected to a three-feeder RTDS substation model for HIL testing to evaluate the practicality of the proposed scheme. The results validate the applicability of the proposed method and prove the relays’ ability to switch between MUs and trip for in-zone faults smoothly with a maximum unintentional delay of two cycles for buffer clearing in relays. The delay introduced has minimal impact on most sub-transmission and distribution substation applications and does not compromise protection.

While the proposed scheme represents an advancement in potential cost-saving redundant protection schemes, several challenges remain. Integrating digital redundancy schemes into existing substation infrastructure can be complex, requiring careful planning and coordination. The next step for this research includes testing several commercially available relays from different vendors, which is important to validate the capability of switching between MUs. Another factor to check is the impact of switching between two sources on the relay decision and the unintentional delay imposed.

The scientific originality of this work lies in achieving enhanced protection reliability without adding new IEDs or MUs. The proposed multisource-based scheme utilizes algorithmic redundancy and adaptive data sharing to maintain system dependability during MU or communication failures. This represents a shift from traditional hardware-based redundancy toward a more intelligent, software-driven protection design, providing both technical innovation and economic advantage for medium-voltage digital substations.

Beyond protective relays, the proposed redundancy concept can be applied to other embedded systems such as intelligent controllers and PMUs. The algorithm is compatible with different deterministic communication protocols as long as time synchronization and data integrity are maintained. While the RTDS-based hardware-in-the-loop validation provides a high level of system accuracy and reliability, future studies may extend the evaluation to pilot substations to assess long-term stability under real operating conditions.

Funding

This research was funded by the Deanship of Research and Graduate Studies at the University of Tabuk, grant number 0283-1443-S.

Data Availability Statement

Data is contained within the article.

Acknowledgments

The author expresses his gratitude for the technical support from the Artificial Intelligence and Sensing Technologies Research Center at the University of Tabuk, and the financial support from the Deanship of Research and Graduate Studies at the University of Tabuk, grant number 0283-1443-S.

Conflicts of Interest

The author declares no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

CB	Circuit Breaker
COMTRADE	Common Format for Transient Data Exchange
CT	Current Transformer
DC	Direct Current
DFTA	Dynamic Fault Tree Analysis
FFTA	Fuzzy Fault Tree Analysis
FTA	Fault Tree Analysis
FW	Firmware
GOOSE	Generic Object-Oriented Substation Event
GTAO	Gigabit Transceiver Analog Output
GTFPI	Gigabit-Transceiver Front Panel Interface
HIL	Hardware-in-the-Loop
HW	Hardware
IA	Current (phase designation in RTDS simulation)
IEC	International Electrotechnical Commission
IED	Intelligent Electronic Device
KLOC	Thousands of Lines of Code
MIL-HDBK-217F	U.S. Department of Defense Military Handbook
MU	Merging Unit
MTBF	Mean Time Between Failures
NERC	North American Electric Reliability Corporation
PDIF	Bus Bar Differential Element
PIOC	Instantaneous Overcurrent Element
PTP	Precision Time Protocol
RTDS	Real-Time Digital Simulator
SFTA	Static Fault Tree Analysis
SLG	Single Line-to-Ground (fault)
SV	Sampled Values
VT	Voltage Transformer
WECC	Western Electricity Coordinating Council

References

1. Kasztenny, B.; Hunt, R.; Vaziri, M. Protection and Control Redundancy Considerations in Medium Voltage Distribution Systems. In Proceedings of the 2007 60th Annual Conference for Protective Relay Engineers, College Station, TX, USA, 27–29 March 2007.
2. North American Electric Reliability Corporation (NERC). Available online: https://www.nerc.com/ (accessed on 6 August 2024).
3. Western Electricity Coordinating Council (WECC). Available online: https://www.wecc.org/ (accessed on 6 August 2024).
4. Saleh, A.; Harris, R.B. Application of the Fault Tree Analysis for Assessment of Power System Reliability. Int. J. Electr. Power Energy Syst. 2015, 67, 613–626.
5. Pan, K.; Liu, H.; Gou, X.; Huang, R.; Ye, D.; Wang, H.; Glowacz, A.; Kong, J. Towards a Systematic Description of Fault Tree Analysis Studies Using Informetric Mapping. Sustainability 2022, 14, 11430.
6. Andrews, J.; Dunnett, T. Fault Tree Analysis: A Survey of the State-of-the-Art in Modeling, Analysis, and Tools. Reliab. Eng. Syst. Saf. 2009, 94, 503–508.
7. International Electrotechnical Commission (IEC). Communication Networks and Systems for Power Utility Automation—All Parts; IEC 61850:2025 SER; IEC: Geneva, Switzerland, 2025; Available online: https://webstore.iec.ch/en/publication/6028 (accessed on 18 October 2025).
8. Mackiewicz, R.E. Overview of IEC 61850 and benefits. In Proceedings of the IEEE Power Engineering Society General Meeting, Montreal, QC, Canada, 18–22 June 2006.
9. Apostolov, A. IEC 61850-based bus protection—Principles and benefits. In Proceedings of the IEEE Power & Energy Society General Meeting, Calgary, AB, Canada, 26–30 July 2009.
10. Scheer, G.; Dolezilek, D. Comparing the Reliability of Ethernet Network Topologies in Substation Control and Monitoring Networks. In Proceedings of the 2nd Annual Western Power Delivery Automation Conference, Spokane, WA, USA, 27–30 July 2000.
11. Kanabar, M.G.; Sidhu, T.S. Performance of IEC 61850-9-2 Process Bus and Corrective Measure for Digital Relaying. IEEE Trans. Power Deliv. 2011, 26, 725–735.
12. Lim, I.; Sidhu, T.S. Design of a Backup IED for IEC 61850-Based Substation. IEEE Trans. Power Deliv. 2013, 28, 2048–2055.
13. García-Gracia, M.; Borroy, S.; de Urtasun, L.G.; Comech, M. Novel protection scheme based on IEC61850. Electr. Power Syst. Res. 2011, 81, 2178–2187.
14. Chawla, A.; Aftab, M.A.; Hussain, S.S.; Panigrahi, B.; Ustun, T.S. Cyber–physical testbed for Wide Area Measurement System employing IEC 61850 and IEEE C37.118 based communication. Energy Rep. 2022, 8 (Suppl. S10), 570–578.
15. Ahmad, S.; Asar, A.U. Reliability Enhancement of Electric Distribution Network Using Optimal Placement of Distributed Generation. Sustainability 2021, 13, 11407.
16. Kumar, S.; Abu-Siada, A.; Das, N.; Islam, S. Toward a Substation Automation System Based on IEC 61850. Electronics 2021, 10, 310.
17. Cai, Y.; Chen, Y.; Li, Y.; Cao, Y.; Zeng, X. Reliability Analysis of Cyber-Physical Systems: Case of the Substation Based on the IEC 61850 Standard in China. Energies 2018, 11, 2589.
18. Schweitzer, E.; Fleming, B.; Lee, T.; Anderson, P. Reliability Analysis of Transmission Protection Using Fault Tree Methods. In Proceedings of the Western Protective Relay Conference, Spokane, WA, USA, 21–23 October 1997.
19. Anderson, P.M.; Henville, C.; Rifaat, R.; Johnson, B.; Meliopoulos, S. (Eds.) Fault Tree Analysis of Protective Systems. In Power System Protection; Wiley: Hoboken, NJ, USA, 2022; pp. 1261–1310.
20. Chen, Y. Risk Assessment of Rail Transit System with Photovoltaic and Energy Storage Based on Operation Characteristics. CSEE J. Power Energy Syst. 2020, 6, 750–759.
21. Shooman, M.L. An Overview of Fault Tree Analysis and Its Application in Model-Based Approaches. Reliab. Eng. Syst. Saf. 2009, 94, 677–698.
22. Schweitzer, E.O., III; Whitehead, D.E. Resetting Protection System Complexity. In Proceedings of the Western Protective Relay Conference, Spokane, WA, USA, 21–24 October 2019.
23. Byerly, J.; Thakur, M.; Hostetler, J.; Burger, C.; Wenke, S. Distribution Digital Substation—Consolidated Protection and Digital Secondary Systems. In Proceedings of the Western Protective Relay Conference, Spokane, WA, USA, 9–12 October 2023.
24. Silos, Á.; Señís, A.; De Pozuelo, R.M.; Zaballos, A. Using IEC 61850 GOOSE Service for Adaptive ANSI 67/67N Protection in Ring Main Systems with Distributed Energy Resources. Energies 2017, 10, 1685.
25. Oens, M.; Lange, C. Improvements in Feeder Protection Providing a Primary and Backup Relay System Utilizing One Relay per Feeder. In Proceedings of the 33rd Annual Western Protective Relay Conference, Spokane, WA, USA, 17–19 October 2006.
26. Kim, M.-S.; Kang, S.-H. Centralized Multiple Back-Up Protection Scheme with Sharing Data between Adjacent Substations Based on IEC 61850. Energies 2022, 15, 4195.
27. Chen, X.; Jin, L. Study on Reliability of PACSs with Integrated Consideration of Both Basic and Mission Reliability. Energies 2024, 17, 365.
28. Wang, L.; Xie, L.; Yang, Y.; Zhang, Y.; Wang, K.; Cheng, S.-J. Distributed Online Voltage Control with Fast PV Power Fluctuations and Imperfect Communication. IEEE Trans. Smart Grid 2023, 14, 3681–3695.
29. Real-Time Digital Simulator (RTDS). Available online: https://www.rtds.com/ (accessed on 6 August 2024).
30. SEL-401 Protection, Automation, and Control Merging Unit. Available online: https://selinc.com/products/401/ (accessed on 6 August 2024).
31. SEL-451 Protection, Automation, and Bay Control System. Available online: https://selinc.com/products/451/ (accessed on 6 August 2024).
32. SEL-487B Bus Differential and Breaker Failure Relay. Available online: https://selinc.com/products/487B/ (accessed on 6 August 2024).

Figure 1. An example of (a) a two-feeder substation system with no redundancy scheme, and (b) the proposed daisy chain MUs wiring to CTs.

Figure 2. Fault tree analysis for feeder protection in a two-feeder substation system with no redundancy scheme.

Figure 3. Fault tree analysis for bus protection in a two-feeder substation system with no redundancy scheme.

Figure 4. Reliability block diagram of the proposed multisource-based redundancy scheme.

Figure 5. The dynamic change in feeder and bus protection zones when (a) no failures, (b) CT-B1 fails, (c) CT-F1 fails, and (d) MU-1 fails.

Figure 6. Fault tree analysis for feeder protection in a two-feeder substation system with the proposed multisource-based redundancy scheme.

Figure 7. Fault tree analysis for bus protection in a two-feeder substation system with the proposed multisource-based redundancy scheme.

Figure 8. Fault tree analysis for the protection system without including redundancy for the communication system.

Figure 9. A comparison of the unavailability of protection systems in a two-feeder substation with no redundancy and with the proposed multisource-based redundancy scheme.

Figure 10. A general form of the proposed daisy chain MUs wiring to CTs in substations with an arbitrary number of feeders (m) and MUs (n).

Figure 11. A one-line diagram of the three-feeder bus system RTDS model with two loads, one power source, two sets of CTs, and a CB per feeder.

Figure 12. Data flow between the RTDS model and the MUs, and between the MUs and the relays.

Figure 13. Relay-1 data acquisition from (a) MU-1 primary current source and (b) MU-3 backup current source during the simulation of MU-1 failure.

Figure 14. Case 1: Relay-1 trips for phase A fault applied to feeder one using the backup source MU-3. (a) MU-1 published signals, (b) MU-3 published signals, and (c) relay-1 word bits.

Figure 15. Case 2: Relay-2 trips for phase A fault applied to the overlap zone using the main source MU-2. (a) MU-2 published signals, (b) MU-1 published signals, and (c) relay-2 word bits.

Figure 16. Case 2: Relay-B does not trip for phase A fault applied to the overlap zone at feeder two using the backup source MU-2. (a) MU-1 published signals, (b) MU-2 published signals, and (c) relay-B word bits.

Figure 17. Case 3: Relay-B trips for phase A fault applied to bus zone, showing signals from feeder two. (a) MU-1 published signals, (b) MU-2 published signals, and (c) relay-B word bits.

Figure 18. Case 4: Relay-B trips for phase A fault applied to overlap zone at feeder one using the backup source MU-2. (a) MU-1 published signals, (b) MU-2 published signals, and (c) relay-B word bits.

Figure 19. Case 4: Relay-1 does not trip for phase A fault applied to the overlap zone at feeder one using the backup source MU-3. (a) MU-1 published signals, (b) MU-3 published signals, and (c) relay-1 word bits.

Table 1. Unavailabilities of Several Protection Components.

Component	KLOC	Unavailability × 10⁻⁶
Relay hardware	-	100
Relay firmware	800	282
Merging unit hardware	-	100
Merging unit firmware	500	232
DC Power System	-	50
Circuit breaker	-	300
Current transformer (three-phase)	-	30
Fiber channel	-	100
Clock hardware	-	100
Clock firmware	4000	632
Switch hardware	-	100
Switch firmware	4000	632
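
As a consistency check, the no-redundancy values reported in Table 2 can be reproduced by adding Table 1 entries, which is the usual rare-event approximation for components combined through an OR gate. The component groupings below are inferred from the numbers rather than quoted from the fault trees, and the dictionary keys and function name are illustrative only.

```python
# Unavailability values (x 1e-6) copied from Table 1.
U = {
    "relay_hw": 100, "relay_fw": 282,
    "mu_hw": 100, "mu_fw": 232,
    "dc": 50, "cb": 300, "ct": 30,
    "fiber": 100, "clock_hw": 100, "clock_fw": 632,
    "switch_hw": 100, "switch_fw": 632,
}


def series_unavailability(parts):
    """OR gate (any single failure disables protection), using the
    rare-event approximation: the sum of the individual unavailabilities."""
    return sum(U[p] for p in parts)


# Assumed groupings (inferred from the totals, not quoted from the paper).
feeder = ["relay_hw", "relay_fw", "mu_hw", "mu_fw", "dc", "cb", "ct"]
bus = feeder + ["mu_hw", "mu_fw", "ct"]      # adds a second MU and CT set
comm = ["fiber", "clock_hw", "clock_fw", "switch_hw", "switch_fw"]

print(series_unavailability(feeder))          # 1094
print(series_unavailability(bus))             # 1456
print(series_unavailability(feeder + comm))   # 2658
print(series_unavailability(bus + comm))      # 3020
```

The four printed totals match the "No Redundancy" column of Table 2, which suggests, without confirming, that the groupings reflect the series structure of the corresponding fault trees.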

Table 2. Unavailability of Protection Systems.

System	No Redundancy Unavailability × 10⁻⁶	Proposed Redundancy Unavailability × 10⁻⁶	Unavailability Reduced by
Feeder	1094	732.1111	33%
Bus	1456	732.1138	49%
Feeder and communication	2658	2294.1111	13%
Bus and communication	3020	2294.1138	24%
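
The "Unavailability Reduced by" column of Table 2 follows directly from the two unavailability columns. The sketch below recomputes it; the dictionary layout and function name are illustrative. The computed values are truncated to whole percentages for comparison, since that appears to be how the table reports them (an observation from the numbers, not a stated convention).

```python
import math

# Unavailability values (x 1e-6) copied from Table 2:
# system -> (no redundancy, proposed redundancy)
table2 = {
    "Feeder": (1094, 732.1111),
    "Bus": (1456, 732.1138),
    "Feeder and communication": (2658, 2294.1111),
    "Bus and communication": (3020, 2294.1138),
}


def reduction_percent(no_redundancy, proposed):
    """Relative reduction in unavailability, in percent."""
    return 100.0 * (no_redundancy - proposed) / no_redundancy


for system, (base, proposed) in table2.items():
    pct = reduction_percent(base, proposed)
    print(f"{system}: {pct:.2f}% (whole percent: {math.floor(pct)}%)")
```

The relative benefit shrinks once the communication system is included, because its unavailability is common to both columns and is not covered by the MU redundancy.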

Table 3. Relays Main and Backup MU Sources.

Relay	Main Source	Alternative Source
Relay-1—Feeder 1	MU-1 (I_A11)	MU-3 (I_A13)
Relay-2—Feeder 2	MU-2 (I_A22)	MU-1 (I_A21)
Relay-3—Feeder 3	MU-3 (I_A33)	MU-2 (I_A32)
Relay-B—Feeder 1	MU-3 (I_A13)	MU-1 (I_A11)
Relay-B—Feeder 2	MU-1 (I_A21)	MU-2 (I_A22)
Relay-B—Feeder 3	MU-2 (I_A32)	MU-3 (I_A33)
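
The source assignments in Table 3 can be read as a simple lookup: each relay input uses its main MU when that MU is available and falls back to the alternative otherwise. The sketch below encodes the table and applies it to the MU-1 failure studied here. The function name and the set of failed MUs are hypothetical; in the actual relays, switching is driven by the availability of the subscribed SV streams rather than by this code.

```python
# Main and alternative SV sources per relay input, copied from Table 3.
SOURCES = {
    "Relay-1/Feeder 1": ("MU-1", "MU-3"),
    "Relay-2/Feeder 2": ("MU-2", "MU-1"),
    "Relay-3/Feeder 3": ("MU-3", "MU-2"),
    "Relay-B/Feeder 1": ("MU-3", "MU-1"),
    "Relay-B/Feeder 2": ("MU-1", "MU-2"),
    "Relay-B/Feeder 3": ("MU-2", "MU-3"),
}


def select_source(relay_input, failed_mus):
    """Return the MU a relay input should use, or None when both its main
    and alternative sources are unavailable."""
    main, alternative = SOURCES[relay_input]
    if main not in failed_mus:
        return main
    if alternative not in failed_mus:
        return alternative
    return None


failed = {"MU-1"}   # the failure simulated in the case study
for relay_input in SOURCES:
    print(relay_input, "->", select_source(relay_input, failed))
```

With MU-1 failed, relay-1 falls back to MU-3 and relay-B takes its feeder-two current from MU-2, so relay-B ends up using only MU-2 and MU-3.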

Table 4. Summary of the Results.

Study Case	Tripped Relay	Enabled Element	Utilized MUs	Results Shown in Figures
Case 1	Relay-1	PIOC	MU-3	Figure 14
Case 2	Relay-2	PIOC	MU-2	Figure 15 and Figure 16
Case 3	Relay-B	PDIF	MU-2 and MU-3	Figure 17
Case 4	Relay-B	PDIF	MU-2 and MU-3	Figure 18 and Figure 19

Table 5. Buffer-Clearing Times.

Study Case	Relay	Enabled Element	Buffer Clearing (ms)	Results Shown in Figures
Case 1	Relay-1	PIOC	33.63	Figure 14
Case 2	Relay-2	PIOC	0.0	Figure 15
Case 2	Relay-B	PDIF	26.38	Figure 16
Case 3	Relay-B	PDIF	26.50	Figure 17
Case 4	Relay-B	PDIF	26.38	Figure 18
Case 4	Relay-1	PIOC	33.50	Figure 19

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the author. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
