Reliability Assessment of a Fault-Tolerant PV Multistring Inverter

In photovoltaic (PV) systems, the reliability of the system components, especially the power converters, is a major concern in obtaining cost-effective solutions. To guarantee service continuity when elements of the PV converter fail, in particular semiconductor switching devices, one solution is to design a power converter with fault-tolerance capability. This can be realized by adding hardware redundancy to an existing converter, which makes it possible to replace faulty elements. This paper evaluates the reliability of a fault-tolerant power electronics converter for PV multistring applications. The considered fault-tolerant design includes a single redundant switching leg, which is used to reconfigure the structure in case of a switch failure in either the DC-AC or the DC-DC stage. The reliability estimation of the considered converter is detailed and compared with that of a conventional structure without fault-tolerant capability. The results show that the introduction of a single redundant leg improves the converter mean time to failure by a factor of almost two and reduces by half the power loss due to system-failure shutdowns in PV applications, while only increasing the converter cost by 2–3%.


Introduction
System reliability is one of the key parameters for obtaining cost-effective and sustainable solutions in photovoltaic (PV) systems. Indeed, an unexpected shutdown due to the failure of system components can cause significant production loss [1]. In PV applications, the elements with the highest failure rate are the power electronic converters [2]. In this regard, the power inverter is responsible for approximately 50% of the total failures in a PV plant [3,4], which translates into an average power loss of more than 35% [3,5], with a larger impact in central-inverter solutions. Therefore, designing a more reliable power electronic converter is very important. One way to improve the converter reliability is to design a fault-tolerant solution that is able to maintain operation when elements of the converter fail [2].
Capacitors, printed-circuit boards, and semiconductor switching devices are the weakest elements of power electronic systems [2]. The failure of insulated-gate bipolar transistors (IGBTs) in power electronics converters is therefore of major importance and has been widely studied, for example in [6,7], where the existing methods to handle the failure and achieve fault-tolerant operation are also reviewed. Faults that occur in the switching devices can be classified into short- and open-circuit faults [8].
A short-circuit fault results in a short circuit across the DC-link capacitor and would lead to a shutdown of the system. However, a short circuit can be detected by the switch gate driver. Commercial drivers now include, as a standard feature, a protection that prevents a shoot-through fault by suppressing the switching control signals to the healthy switching device on the same leg [8]. Short-circuit faults then result in an open circuit on the leg. On the other hand, open-circuit faults cannot be detected by the driver, and an additional circuit is required to obtain open-circuit fault-tolerant operation [9].
Many fault-tolerant techniques have been proposed in the literature, as reviewed in [10]. Fault-tolerant solutions are always based on hardware redundancy and associated control strategies [10]. They can be classified into switch-level, leg-level, module-level, and system-level solutions, depending on the type of hardware redundancy [10], with the redundant-leg topology being considered the optimal compromise between system cost, performance, and reliability [10]. Leg-redundant fault-tolerant solutions have been widely proposed for three-phase inverters [11,12] and single-phase DC-AC converters [13]. Other studies have focused on leg-redundant fault-tolerant operation of DC-DC converters, such as [14,15]. In [16], fault detection and identification for both DC-DC and DC-AC converters are proposed, but without fault-tolerant operation.
In this paper, a fault-tolerant PV multistring inverter is proposed. It consists of a conventional two-stage DC-AC PV multistring converter with the introduction of a single redundant switching leg. The redundant leg can be used to reconfigure the system and replace a faulty leg in either the DC-DC or the DC-AC converter stage. The proposed converter can then continue operating after the failure of one switching device of the system. This paper provides a reliability assessment of the proposed converter and compares it with the conventional non-redundant converter, in order to quantify the benefit of introducing one additional leg. This paper is organized as follows. First, Section 2 describes the proposed fault-tolerant PV multistring inverter and its principle of operation, calculates the converter component failure rates, and then obtains the reliability model of the converter. Section 3 provides the methodology followed to simulate the proposed fault-tolerant system. In order to verify the converter feasibility, Section 4 presents the simulation results of the converter switching model under different faulty switches. It also presents the converter reliability results obtained from the reliability model, compares them with the reliability of the conventional non-fault-tolerant topology, and discusses the results. Section 5 provides a view on the economic aspects of the proposed fault-tolerant converter and its impact on PV applications. Finally, Section 6 provides some conclusions of this work.

Theory
This section describes the structure and principle of operation of the proposed fault-tolerant PV multistring inverter, as well as the reliability model of the converter components and of the overall system, which allows the estimated reliability of the converter to be computed. Figure 1 shows the considered fault-tolerant PV multistring inverter. It consists of two stages. The first stage comprises three DC-DC boost converters that are connected to the PV strings, and the second stage consists of a two-level three-phase DC-AC converter that is connected to the grid. The fault-tolerant part consists of a single redundant switching leg ($S_{RT}$, $S_{RB}$, and their corresponding anti-parallel diodes) and six bidirectional triode thyristors (TRIACs: $T_{pv1}$, $T_{pv2}$, $T_{pv3}$, $T_A$, $T_B$, and $T_C$) that connect the output of the redundant leg to the outputs of the switching legs of the DC-DC and DC-AC stages.

Fault-Tolerant Converter
The fault-tolerant principle of operation of the converter consists of connecting the redundant leg in place of the faulty leg by enabling the TRIAC that is connected to the faulty leg. The described fault-tolerant mechanism is only valid when the switching devices fail in open circuit. Nonetheless, a switch short-circuit failure can be readily converted into an open-circuit failure by isolating the faulty device from the rest of the circuit. This can be done with fuses, TRIACs, or other appropriate devices (such as the iFuse device described in [17]) placed in series with each switch or at the switching-leg output. Therefore, the presented fault-tolerant mechanism can sustain switch open-circuit and short-circuit failures if proper measures are taken. The switch open-circuit fault must be detected and localized in order to obtain fault-tolerant operation. Any fault detection and localization method can be used with the proposed converter, provided that the detection, localization, and reconfiguration process is fast enough not to damage other parts of the system. For the DC-AC stage, possible fault detection and localization algorithms can be found in [18–23]; for the DC-DC stage, the methods proposed in [14,15,24] can be used, for example.
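The reconfiguration principle described above can be sketched as a small state machine. This is only an illustrative sketch: the class name, the leg labels, and the state names are hypothetical and not part of the paper; only the TRIAC names follow Figure 1. It assumes that fault detection and localization have already identified the faulty leg.

```python
from enum import Enum


class ConverterState(Enum):
    HEALTHY = "healthy"            # all legs working, redundant leg idle
    RECONFIGURED = "reconfigured"  # redundant leg replaces one faulty leg
    SHUTDOWN = "shutdown"          # a further fault cannot be tolerated


# Hypothetical leg labels mapped to the TRIAC that connects the
# redundant-leg output to that leg's output (TRIAC names as in Figure 1).
LEG_TO_TRIAC = {
    "pv1": "T_pv1", "pv2": "T_pv2", "pv3": "T_pv3",  # DC-DC stage legs
    "A": "T_A", "B": "T_B", "C": "T_C",              # DC-AC stage legs
}


class ReconfigurationLogic:
    """Tracks which leg, if any, has been replaced by the redundant leg."""

    def __init__(self):
        self.state = ConverterState.HEALTHY
        self.replaced_leg = None
        self.enabled_triac = None

    def on_open_circuit_fault(self, leg):
        """Handle a localized open-circuit fault; return the new state."""
        if leg not in LEG_TO_TRIAC:
            raise ValueError(f"unknown leg: {leg!r}")
        if self.state is ConverterState.HEALTHY:
            # First fault: enable the TRIAC tied to the faulty leg so the
            # redundant leg takes over that leg's switching signals.
            self.replaced_leg = leg
            self.enabled_triac = LEG_TO_TRIAC[leg]
            self.state = ConverterState.RECONFIGURED
        else:
            # Only one redundant leg exists: any later fault stops the system.
            self.enabled_triac = None
            self.state = ConverterState.SHUTDOWN
        return self.state
```

For example, a fault reported on leg "A" enables T_A, and any later fault leads to shutdown, which mirrors the single-redundancy limit of the design.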
Nonetheless, in order to demonstrate the feasibility of the proposed fault-tolerant converter structure, specific fault-diagnosis methods are considered. They are implemented in the converter model described in Section 3 to obtain the simulation results shown in Section 4. The switch open-circuit fault detection and localization functionalities in the DC-AC stage are implemented with the method proposed in [25]. This method is based on the current Park vector, obtained by sensing the phase currents, and allows individual switch open-circuit faults to be detected and localized in less than one fundamental period of the grid currents. It has been compared to other Park-vector-based approaches in [26], showing that one of its main advantages is its robustness. The switch open-circuit fault diagnosis in the DC-DC stage consists of sensing the inductor current and comparing it to a threshold value, empirically set to 2% of the nominal inductor current. An open-circuit fault is detected if the current stays below this threshold for a number of samples higher than a second threshold (empirically set to 50). Fault detection is independent for each boost converter. Therefore, no localization algorithm is required. Both fault-diagnosis methods employ the existing current sensors that are required by the DC-DC and DC-AC stage controls, therefore offering a simple and economical way to perform the fault diagnosis. Table 1 shows the number of elements and the references of the selected components for the proposed fault-tolerant converter. Table 2 shows the relevant converter parameters.
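The DC-DC stage diagnosis rule described above (inductor current below 2% of its nominal value for more than 50 samples) can be sketched as follows. The function name and its arguments are hypothetical; the sketch assumes that "stays below" means consecutive samples and that the sampled current is available as a plain sequence.

```python
def detect_open_circuit(inductor_current_samples, nominal_current,
                        current_fraction=0.02, sample_limit=50):
    """Return the index of the sample at which the fault is flagged, or None.

    current_fraction and sample_limit default to the empirical values quoted
    in the text (2% of the nominal current, 50 samples).
    """
    threshold = current_fraction * nominal_current
    consecutive_low = 0
    for index, current in enumerate(inductor_current_samples):
        if current < threshold:
            consecutive_low += 1
            # Flag the fault once the count exceeds the sample limit.
            if consecutive_low > sample_limit:
                return index
        else:
            # Current recovered: the run of low samples is broken.
            consecutive_low = 0
    return None
```

Because each boost converter runs its own detector on its own inductor current, the detector that fires also identifies the faulty leg, so no separate localization step is needed.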

Component Failure Rates
In this paper, MIL-HDBK-217F [27] is used to obtain the component failure rates. Reference [27] is a standardization handbook that is commonly employed for this purpose. It is often criticized for giving pessimistic results, and it has not been updated recently [28]. However, the aim of this paper is to provide a framework for reliability comparison and for assessing the improvements achieved with the described fault-tolerant mechanism; any other available data source can be adopted in the outlined procedure. Table 3 presents the variables defining the component failure rates, whose units are failures in time (FIT); i.e., failures per $10^6$ h. These variables are defined in terms of a base failure rate ($\lambda_b$) and a number of adjustment factors ($\pi$). The values of these parameters depend on the type of component, its operating conditions, and its constructive parameters. For instance, $\lambda_{IGBT1}$ may differ from $\lambda_{IGBT2}$, since the active IGBTs in the DC-DC stage may operate at different junction temperatures than those of the DC-AC stage. Appendix A provides the values of these parameters. It should be noted that the equations and parameter values defining the IGBT failure rates ($\lambda_{IGBT1}$ and $\lambda_{IGBT2}$) do not appear in [27]. To overcome this, the failure-rate model defined for the power metal-oxide-semiconductor field-effect transistor (MOSFET) is used, but with a $\lambda_b$ half of that of the MOSFET, as done in [28].

Table 3. Variables defining the failure rate per component.
Component | Variable

The failure rate of the redundant-leg devices is considered to be the same as the failure rate of the devices that they are substituting. For instance, when one of the DC-DC stage legs fails, the redundant-leg IGBT and diode failure rates will be equal to $\lambda_{IGBT1}$ and $\lambda_{D1}$, respectively. Figure 2 shows the relative contribution of each component to the multistring-inverter failure rate, calculated according to Table 1 and Appendix A. The IGBTs and diodes are the most critical elements in terms of reliability. Therefore, the introduction of a redundant switching leg would be beneficial.
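The failure-rate structure used in this section (a base rate scaled by adjustment factors, with the IGBT base rate taken as half of the MOSFET one) can be sketched as below. The function names are hypothetical, and the factor names and numeric values in the usage example are placeholders for illustration only; they are not the values of Appendix A or of [27]. The relative-contribution helper assumes that the inverter failure rate is the sum of the individual component failure rates.

```python
from math import prod


def component_failure_rate(base_rate, adjustment_factors):
    """Failure rate = base rate times the product of all adjustment factors."""
    return base_rate * prod(adjustment_factors.values())


def igbt_base_rate(mosfet_base_rate):
    """IGBT base rate assumed to be half of the MOSFET base rate, as in [28]."""
    return 0.5 * mosfet_base_rate


def relative_contributions(counts, rates):
    """Share of each component type in the total (summed) failure rate."""
    totals = {name: counts[name] * rates[name] for name in counts}
    overall = sum(totals.values())
    return {name: value / overall for name, value in totals.items()}


# Placeholder numbers only (failures per 10^6 h), chosen for illustration.
example_factors = {"pi_T": 2.0, "pi_A": 10.0, "pi_Q": 5.5, "pi_E": 6.0}
example_igbt_rate = component_failure_rate(igbt_base_rate(0.012), example_factors)
```

With real data, the same helpers would be evaluated once per component type, using the temperatures obtained from the loss model to select the temperature-dependent factors.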

Reliability Model
Different approaches, such as Monte Carlo or Markov chain reliability models, are typically applied to analyze the reliability of systems [29,30]. For the present reliability assessment, a Markov chain reliability model is adopted, since it is an effective approach that can cover many features of redundant systems, such as the sequence of failures, failure coverage, and state-dependent failure rates [29].
Two subsystems are considered in order to assess the reliability of the fault-tolerant PV multistring inverter. The first subsystem considers the critical faults that cannot be handled by the fault-tolerant system, i.e., capacitor and inductor failures. The second subsystem considers the fault-tolerant part of the converter (IGBTs, diodes, and TRIACs).
The first subsystem is a trivial two-state Markov chain, drawn with a dashed line in Figure 3. State 0 indicates that all the capacitors and inductors are working, while state 3 indicates that one or more of these devices has failed and the system is shut down. The reliability of the subsystem is

$$R_1(t) = P(t), \tag{1}$$

where $P(t)$ is the probability that all of the inductors and capacitors are operational. $P(t)$ evolves over time according to

$$\frac{dP(t)}{dt} = -\lambda_{LC} P(t), \tag{2}$$

where $\lambda_{LC}$ defines the compound failure rate of capacitors and inductors. In other words, as time progresses, the probability of all inductors and capacitors being operational decreases at a rate equal to $\lambda_{LC} P(t)$, tending to zero when $t \to \infty$.

The second subsystem is analyzed with a four-state Markov chain, drawn with solid lines in Figure 3. Its analysis does not consider repair or a degraded mode of operation of the converter; i.e., if one of the switching legs fails after the reconfiguration, the converter will cease its operation.
The states of the Markov chain are:
• State 0: All of the semiconductor devices are operational; redundant and connecting devices (TRIACs) are inactive.
• State 1: One IGBT or diode in the DC-DC stage has failed; the redundant leg and corresponding TRIAC are activated.
• State 2: One IGBT or diode in the DC-AC stage has failed; the redundant leg and corresponding TRIAC are activated.
• State 3: A second IGBT or diode fails or the TRIAC activated in the previous states fails; the system shuts down.
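The probability that the system is in the $m$th state at time $t$ is denoted as $P_m(t)$. This probability evolves over time according to

$$\frac{dP_m(t)}{dt} = \sum_{k} \lambda_{km} P_k(t) - \sum_{n} \lambda_{mn} P_m(t), \tag{4}$$

where $k$ runs over the states preceding state $m$ and $n$ over the states succeeding it; that is, the probability of being in the $m$th state increases with the probability of its preceding states multiplied by the failure rates connecting them to it, and it diminishes with its own probability multiplied by the failure rates connecting it to its succeeding states. The state equation of the considered subsystem is given by

$$\frac{d}{dt}\begin{bmatrix} P_0(t) \\ P_1(t) \\ P_2(t) \\ P_3(t) \end{bmatrix} = \begin{bmatrix} -(\lambda_{01}+\lambda_{02}) & 0 & 0 & 0 \\ \lambda_{01} & -\lambda_{13} & 0 & 0 \\ \lambda_{02} & 0 & -\lambda_{23} & 0 \\ 0 & \lambda_{13} & \lambda_{23} & 0 \end{bmatrix} \begin{bmatrix} P_0(t) \\ P_1(t) \\ P_2(t) \\ P_3(t) \end{bmatrix}, \tag{5}$$

where $\lambda_{01}$, $\lambda_{02}$, $\lambda_{13}$, and $\lambda_{23}$ are the failure rates that define the transitions between the Markov chain diagram states. In the definition of (8) and (9) it is considered that all of the converter switches (including the redundant-leg switches) are installed on the same heatsink. Therefore, when the redundant leg is enabled, the junction temperature of its devices will be equal to the prefault temperature of the devices that they are replacing. For this reason, $\lambda_{01}$ and $\lambda_{02}$ appear in (8) and (9). The reliability $R_2(t)$ of this subsystem is the probability of not being in the shutdown state,

$$R_2(t) = P_0(t) + P_1(t) + P_2(t) = 1 - P_3(t). \tag{10}$$

Finally, the reliability of the complete system is the intersection of the two independent subsystems, which is

$$R(t) = R_1(t)\, R_2(t), \tag{11}$$

where $R_1(t)$ and $R_2(t)$ are the reliabilities of the first and second subsystems, respectively. The reliability is the probability of a device performing its purpose adequately for the period of time intended under the operating conditions encountered. The mean time to failure (MTTF) is the average time during which the system successfully operates before it fails, and it serves as a simple indicator of the system reliability. The MTTF is given by

$$\mathrm{MTTF} = \int_0^{\infty} R(t)\, dt. \tag{12}$$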
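As an illustration of the reliability model described above, the following sketch evaluates the state probabilities, the system reliability, and the MTTF in closed form for constant failure rates, and cross-checks the MTTF by numerical integration. All function and parameter names are hypothetical, the transition rates are passed in as plain numbers (their expressions in terms of component failure rates are not reproduced here), and the conventional converter is assumed to fail at the first fault of any component with the same rates.

```python
import math


def subsystem2_probabilities(t, lam01, lam02, lam13, lam23):
    """Closed-form P0(t), P1(t), P2(t) of the four-state chain, P0(0) = 1."""
    exit0 = lam01 + lam02  # total rate of leaving state 0

    def reconfigured(lam_in, lam_out):
        # Solution of dP/dt = lam_in * P0(t) - lam_out * P(t) with P(0) = 0.
        if math.isclose(exit0, lam_out):
            return lam_in * t * math.exp(-exit0 * t)
        return (lam_in * (math.exp(-lam_out * t) - math.exp(-exit0 * t))
                / (exit0 - lam_out))

    return (math.exp(-exit0 * t),
            reconfigured(lam01, lam13),
            reconfigured(lam02, lam23))


def reliability(t, lam_lc, lam01, lam02, lam13, lam23):
    """System reliability: product of the two independent subsystems."""
    r1 = math.exp(-lam_lc * t)  # capacitors and inductors all working
    r2 = sum(subsystem2_probabilities(t, lam01, lam02, lam13, lam23))
    return r1 * r2


def mttf_fault_tolerant(lam_lc, lam01, lam02, lam13, lam23):
    """Integral of reliability() over [0, inf), evaluated analytically."""
    exit0 = lam_lc + lam01 + lam02
    return (1.0 + lam01 / (lam_lc + lam13) + lam02 / (lam_lc + lam23)) / exit0


def mttf_conventional(lam_lc, lam01, lam02):
    """Assumed non-redundant case: the first fault of any kind is fatal."""
    return 1.0 / (lam_lc + lam01 + lam02)


def mttf_numeric(horizon, steps, **rates):
    """Trapezoidal integration of reliability() as a cross-check."""
    h = horizon / steps
    total = 0.5 * (reliability(0.0, **rates) + reliability(horizon, **rates))
    total += sum(reliability(i * h, **rates) for i in range(1, steps))
    return total * h


# Placeholder rates in failures per hour, for illustration only.
example_rates = dict(lam_lc=5e-6, lam01=10e-6, lam02=20e-6,
                     lam13=35e-6, lam23=35e-6)
```

The closed-form MTTF follows from integrating each term of the reliability separately, which gives one contribution for state 0 and one for each reconfigured state weighted by the probability of reaching it.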

Methods and Materials
The fault-tolerant PV multistring inverter has been simulated under various scenarios in order to demonstrate the feasibility of the proposed fault-tolerant converter structure and reconfiguration strategy. Matlab-Simulink software has been used for performing the simulations. Detailed modeling of the component losses has been implemented in order to obtain the converter component temperatures, considering an ambient temperature of 40 °C.
The converter energy source is implemented with a model of a PV panel with a voltage and current of 660 V and 10 A, respectively, at the panel maximum power. A maximum-power-point tracker for the PV panels is implemented for each boost converter in the DC-DC stage control, while the DC-AC stage control loop regulates the grid currents in order to maintain the DC-link voltage ($V_{DC}$) at the value that is specified in Table 2.

Results and Discussion
In a first simulation, an open-circuit fault is applied to switch $S_{AT}$ at time t = 0.02 s. After the detection and localization of the faulty switch, the system is reconfigured by enabling TRIAC $T_A$ and redirecting the phase A switching signals to the redundant switching leg, as can be seen in the grid currents shown in Figure 4a. Normal operation is recovered after a transient of a few milliseconds. Figure 4b verifies that the DC bus voltage deviation stays within reasonable limits and that its control is maintained during the fault detection and reconfiguration interval.

According to the results, the proposed fault-tolerant converter structure is feasible, since, after a fault occurs, the converter reconfiguration can be performed without incurring harmful current and voltage values that could damage the healthy devices.

Figure 6 shows the reliability as a function of time, computed with (11), of the proposed fault-tolerant PV multistring inverter. The reliability of the conventional structure without fault-tolerant capability is also given in this figure. It can be seen from Figure 6 that the reliability of the system is significantly improved by introducing the fault-tolerant capability. Table 4 provides the MTTF, computed with (12), for the fault-tolerant PV multistring inverter and the conventional structure. Through the introduction of a single redundant leg, the MTTF of the system is improved by 96%. The fact that the MTTF of the system is improved by a factor of almost two makes the proposed fault-tolerant solution very attractive. As a rough estimate, assuming 8 hours of operation per day, the results in Table 4 translate to a total lifetime of about 10 years for the conventional structure and almost 19 years for the proposed fault-tolerant converter.

Table 4. Mean time to failure (MTTF) of the considered topologies.

Topology | MTTF
Conventional converter | 3.20 years
Fault-tolerant converter | 6.26 years
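The figures derived from Table 4 can be checked with a few lines of arithmetic. The helper names below are hypothetical; the sketch assumes that the MTTF values in Table 4 are expressed in years of continuous (24 h per day) operation, so that operating 8 hours per day stretches the calendar lifetime by a factor of three.

```python
def improvement_percent(mttf_fault_tolerant, mttf_conventional):
    """Relative MTTF gain of the fault-tolerant converter, in percent."""
    return 100.0 * (mttf_fault_tolerant / mttf_conventional - 1.0)


def calendar_lifetime_years(mttf_years, operating_hours_per_day):
    """Calendar lifetime when the converter runs only part of each day."""
    return mttf_years * 24.0 / operating_hours_per_day


# Values taken from Table 4 (years of continuous operation, assumed).
MTTF_CONVENTIONAL = 3.20
MTTF_FAULT_TOLERANT = 6.26

gain = improvement_percent(MTTF_FAULT_TOLERANT, MTTF_CONVENTIONAL)
life_conventional = calendar_lifetime_years(MTTF_CONVENTIONAL, 8)
life_fault_tolerant = calendar_lifetime_years(MTTF_FAULT_TOLERANT, 8)
```

Under these assumptions the gain rounds to 96%, and the two calendar lifetimes are three times the corresponding Table 4 entries.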

Economic Impact Evaluation
This section provides a short evaluation of the potential impact of the proposed fault-tolerant solution from an economic point of view. First, an evaluation of the investment cost is given. The proposed fault-tolerant converter has more switching components, which implies a higher cost than the conventional solution that consists of a two-stage two-level three-phase inverter. According to [31], the semiconductor-component and ancillary-circuit cost of the conventional solution amounts to about 12% of the total converter cost. Thus, because the proposed solution requires 20% more switches (12 IGBTs as compared to the 8 to 10 IGBTs present in the conventional configuration), the cost of such equipment can be estimated to be 2% to 3% higher.
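As a quick check of this estimate, and assuming that the semiconductor-related cost scales linearly with the number of switches (an assumption made here for illustration, not a statement from [31]), the relative increase of the total converter cost $C$ is

$$\frac{\Delta C}{C} \approx 0.12 \times 0.20 = 0.024 \approx 2.4\%,$$

which lies within the quoted range of 2% to 3%.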
Additionally, inverter failure leads to an average power loss in PV applications of more than 35% [3,5]. Since the proposed redundant system doubles the lifetime of the inverter, this power loss is expected to be reduced by 50%. Failures of inverters in PV applications usually lead to a reduction of the performance ratio of 5% to 6% [32]. With the proposed fault-tolerant inverter, this reduction is expected to be lower, around 3%.

Conclusions
This paper presents a reliability assessment of a fault-tolerant PV multistring inverter. The proposed fault-tolerant converter consists of a conventional PV multistring converter with the addition of a single redundant leg and the reconfiguration hardware. The reliability estimation performed in this paper shows that, with the introduction of the fault-tolerant capability, the MTTF of the converter is improved by 96%, nearly doubling the power converter lifetime when compared with the conventional non-fault-tolerant structure. Moreover, it reduces by half the power loss due to system-failure shutdowns in PV applications. Considering also that the increase in the converter cost is only between 2% and 3%, the proposed fault-tolerant PV multistring inverter is very attractive for applications where maintenance is difficult, such as PV plants in remote areas, and for applications where reliability is critical, for example, when the PV source is the main or single source of energy in a micro-grid. Moreover, the simulation results show that the fault-tolerant converter structure is feasible when proper fault-diagnosis algorithms are employed.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A. Adjustment Factors
Adjustment factors for component failure rate calculation [27] are given in Table A1. Component temperatures are obtained through simulations, considering an ambient temperature of 40 °C.

Table A1. Adjustment factors for component failure rate calculation [27].