Fault Prediction Model of High-Power Switching Device in Urban Railway Traction Converter with Bi-Directional Fatigue Data and Weighted LSM

The switching device is relatively weakest in the traction converter, and this paper aims at the fault prediction of it. Firstly, the mathematical distribution is analyzed based on the results that were obtained in electro thermal simulation and a single-directional accelerated fatigue test. Then, the accelerated fatigue test with bi-directional fatigue current is proposed, the data from which reflects the accelerating effect from FWD on the device aging process. The analytical model of fatigue process is fitted with the data that were obtained in the test. In order to shorten the test time consumption, we propose a weighted least squares method (LSM) to fit the failure data. Finally, the prediction model is presented with the consideration of fatigue signature and Arrhenius temperature factor.


Introduction
The urban rail transit system is a reliable and environmentally friendly way of transport; it generates low pollution, has high passenger capacity and is not prone to delays.The traction converter (TC) in the EMU provides motive and electric braking power through traction motors.TC's switching IGBTs are high-power devices that are prone to failure compared to other components [1].Switching device fault accounts for more than 10% of the total TC failures.In most cases, fatigue causes the failure of the switching device [2].If the fatigue state of the switching device can be predicted and manually maintained or replaced in the device [3], the train failure caused by the failure of the switching device can be reduced.
The switching device in the TC is usually a package module composed of an IGBT and an FWD.The fatigue of the switching device is mainly manifested by package failure, including two main forms: Wire liftoffs and solder layer cracking [4,5].Wire liftoffs and solder layer cracking are caused by shear thermal stresses that are produced by temperature drift inside the device.In the intermittent operation of the device when it is turned on, heat is generated by current; when it is turned off, i.e., the current is cut off, the heat is dissipated.In such a way, the device is heated and cooled periodically.The internal temperature of the device (not only the IGBT junction temperature T j , however also the temperature of each single layer) would increase and decrease accordingly.One switching device consists of multiple layers (as shown in Figure 1) with different coefficients of thermal expansion (CTE) of each layer.The relationship between thermal stress and the change of CTE is shown in Equation (1).
( ) S is the thermal stress, ΔCTE is the difference of CTE and ΔT is junction temperature variation.Equation (1) shows that the thermal stress is caused by the difference of CTE and ΔT.
Such temperature change is caused by internal thermal resistance (such as Rth-c_1 and Rth-c_2 in Figure 1) and heat flow, as is shown in Equation ( 2) In Equation (2), P is the heating power.Rth is the thermal resistance.Temperature changes causes shear thermal stress, while shear thermal stress causes liftoffs and cracks [6].Due to the encapsulation, internal liftoffs and cracks cannot be observed directly; several sensitive parameters have been considered to represent the fatigue characteristics instead, such as the IGBT saturation voltage drop (UCES) [7], SOA [8], thermal resistance between IGBT junction and device substrate (Rth [9], threshold voltage of the IGBT gate [10], etc.These methods are based on the one-to-one relationship between fatigue signatures and fatigue levels.However, these parameters are difficult to obtain reliably in the situation of high power and high EMI (Electromagnetic Interference), such as converters.
In other cases, liftoffs and cracks can be estimated by experimental data where a certain number of thermal cycles and Temperature drift (obtained from the mission profile) are applied to the IGBT under test; we call it AFT (accelerated fatigue test).Then, the model is built based on fatigue observations from the AFT.The model that is established by the AFT is the fitting relationship between the fixed working conditions and the fatigue life of switching devices.
An AFT tests a DUT with a predetermined time-effect force and obtains data showing the fatigue characteristics of it.However, the existing fatigue model requires stable and fixed operating conditions because these models were established under the AFT results of fixed experimental conditions [4,11,12].Therefore, the existing model cannot be fully applied to the changing conditions of the actual operating conditions, and this is the actual application of the TC.For example, in practical TC operation, the switching device is affected by the changing ambient temperature due to the movement of the EMU, while the ambient temperature remains constant in AFT [13]; in addition, the switching frequency of the switching device (i.e., The number of currents turn-ons and turn-offs in one second) and currents are also constantly changing as the speed of the EMU or the passenger load changes, however the conventional AFT only considers the constant switching frequency and the constant test current [14].
In this paper, it shows that Weibull distribution is more suitable to describe the cumulative failure rate of high-power switching device fatigue based on the data from conventional AFT.Then, The relationship between thermal stress and the change of CTE is shown in Equation (1).
S is the thermal stress, ∆CTE is the difference of CTE and ∆T is junction temperature variation.Equation (1) shows that the thermal stress is caused by the difference of CTE and ∆T.
Such temperature change is caused by internal thermal resistance (such as R th-c_1 and R th-c_2 in Figure 1) and heat flow, as is shown in Equation ( 2) In Equation (2), P is the heating power.R th is the thermal resistance.Temperature changes causes shear thermal stress, while shear thermal stress causes liftoffs and cracks [6].Due to the encapsulation, internal liftoffs and cracks cannot be observed directly; several sensitive parameters have been considered to represent the fatigue characteristics instead, such as the IGBT saturation voltage drop (U CES ) [7], SOA [8], thermal resistance between IGBT junction and device substrate (R th [9], threshold voltage of the IGBT gate [10], etc.These methods are based on the one-to-one relationship between fatigue signatures and fatigue levels.However, these parameters are difficult to obtain reliably in the situation of high power and high EMI (Electromagnetic Interference), such as converters.
In other cases, liftoffs and cracks can be estimated by experimental data where a certain number of thermal cycles and Temperature drift (obtained from the mission profile) are applied to the IGBT under test; we call it AFT (accelerated fatigue test).Then, the model is built based on fatigue observations from the AFT.The model that is established by the AFT is the fitting relationship between the fixed working conditions and the fatigue life of switching devices.
An AFT tests a DUT with a predetermined time-effect force and obtains data showing the fatigue characteristics of it.However, the existing fatigue model requires stable and fixed operating conditions because these models were established under the AFT results of fixed experimental conditions [4,11,12].Therefore, the existing model cannot be fully applied to the changing conditions of the actual operating conditions, and this is the actual application of the TC.For example, in practical TC operation, the switching device is affected by the changing ambient temperature due to the movement of the EMU, while the ambient temperature remains constant in AFT [13]; in addition, the switching frequency of the switching device (i.e., The number of currents turn-ons and turn-offs in one second) and currents are also constantly changing as the speed of the EMU or the passenger load changes, however the conventional AFT only considers the constant switching frequency and the constant test current [14].
In this paper, it shows that Weibull distribution is more suitable to describe the cumulative failure rate of high-power switching device fatigue based on the data from conventional AFT.Then, a novel bidirectional aging approach is proposed to show the aging effect from FWD inside the switching device.In the process of AFT, the experiment time is often too long under low current.Therefore, a weighted LSM (least squares method) is proposed to complete the piecewise fitting of BAFT (bidirectional AFT) experimental data, which improves the accuracy of modeling.Through the aging model that is proposed in this paper, the service life of switching devices under actual working conditions can be predicted and the fault trend can be evaluated online when the load current of switching devices is known.

Analysis of the Results of Conventional Single-Directional Aging Experiments
The analytic model of the IGBT is the analytical relationship between the sensitive factor in the IGBT and the number of failures (i.e., the lifetime).The aging data is fitted correspondingly, and the expression is intuitive.In order to describe the relationship between the cumulative failure rate F and the number of experiments (i.e., the number of cycles), it is necessary to select an appropriate life distribution.If the distribution model is properly selected, the F under various working conditions could be obtained with more accuracy [15].Common life distribution models include normal distribution, lognormal distribution, Weibull distribution and log-Weibull distribution.The choosing of distribution model is the key to switching device fault reconfiguration.
The normal distribution, also known as the Gauss distribution, is the most common and widely used distribution of the probability distribution of all random phenomena.It can be used to describe many natural phenomena and various physical properties.In practical applications, there are many experimental data that can be fitted with a normal distribution as its approximate distribution [16].The lognormal distribution is a typical model that is derived from a normal distribution.When the logarithm of the expiration time "ln (t)" obeys the normal distribution, t obeys logarithmic normal distribution.Logarithmic transformation can reduce the larger number to a smaller number.Lognormal distribution is widely used in fatigue life research [17].The Weibull distribution is a continuous distribution that is widely used in reliability, and its advantage lies in its strong adaptability to various types of test data.It is widely used in reliability engineering, especially for the distribution of wear and tear failure of electromechanical products.Since it can easily infer its distribution parameters using probability values, it is widely used in data processing of various life tests [15].The log-Weibull distribution is a special correlation distribution of the Weibull distribution which is obtained by taking the logarithm of x and obeying the Weibull distribution.The effect is similar to the lognormal distribution, and the larger number can be reduced by logarithmic transformation [18].
Table 1 is the equation for the distribution function and density function of the four distributions.
Table 1.Distribution functions and density functions of the four distributions.

Name Distribution Function Density Function
Normal distribution Because the design life of IGBTs is generally more than 20 years, aging data is generally obtained by means of accelerated fatigue test (AFT) [19].
In Figure 2, G 1 is auxiliary GTO, I 1 is a fatigue current and I 2 is used to calculate T j .R and C form an absorption branch.If G 1 is turned on, I 1 flows through the IGBT part.If G 1 is turned off, G 2 provides the freewheeling route of I 1 .C 1 and C 2 produce I 1 and I 2 , respectively.During the experiment, aging current I 1 only flows through IGBT.So, it can be called SAFT.

Weibull distribution
Because the design life of IGBTs is generally more than 20 years, aging data is generally obtained by means of accelerated fatigue test (AFT) [19].In Figure 2, G1 is auxiliary GTO, I1 is a fatigue current and I2 is used to calculate Tj.R and C form an absorption branch.If G1 is turned on, I1 flows through the IGBT part.If G1 is turned off, G2 provides the freewheeling route of I1.C1 and C2 produce I1 and I2, respectively.During the experiment, aging current I1 only flows through IGBT.So, it can be called SAFT.
During AFT, the ambient temperature is approximately 25 °C.I2 is 1.5 A, which is no more than 0.1% of the DUT rated current (1500 A) as a rule of thumb.The cycle time is 12 s with a duty cycle of 48%.During the AFT, DUT is placed in the actual TC power module and is then connected to the test platform.In AFT, F is calculated from the observed UCES [7], as shown in Equation (3).
In Equation (3), UCES0 is the initial saturation voltage of a new DUT, and UCES is the current value.When UCES rises to 1.2 times that of UCES0, F is defined to be 1.
With the data from SAFT, the corresponding F is calculated.Four different distribution models are fitted by the LSM method.As for the normal distribution, it gives: As for the lognormal distribution, it gives:  During AFT, the ambient temperature is approximately 25 • C. I 2 is 1.5 A, which is no more than 0.1% of the DUT rated current (1500 A) as a rule of thumb.The cycle time is 12 s with a duty cycle of 48%.During the AFT, DUT is placed in the actual TC power module and is then connected to the test platform.In AFT, F is calculated from the observed U CES [7], as shown in Equation (3).
In Equation ( 3), U CES0 is the initial saturation voltage of a new DUT, and U CES is the current value.When U CES rises to 1.2 times that of U CES0 , F is defined to be 1.
With the data from SAFT, the corresponding F is calculated.Four different distribution models are fitted by the LSM method.As for the normal distribution, it gives: As for the lognormal distribution, it gives: As for the Weibull distribution, it gives: As for the Logarithmic Weibull distribution, it gives: In order to select the most suitable one of the four distributions, we plot the curve of the experimental data and four distributions, and the density functions are also plotted.The results are shown in Figures 3 and 4. It should be noted that in Figures 3 and 4, the curve of the Weibull distribution coincides with the curve of the Logarithmic Weibull distribution.In Figures 3 and 4, the horizontal axis is the number of cycle times, and the vertical axis is cumulative failure rate, which is calculated according to DUT saturation voltage drop [13,17].
In order to select the most suitable one of the four distributions, we plot the curve of the experimental data and four distributions, and the density functions are also plotted.The results are shown in Figures 3 and 4. It should be noted that in Figures 3 and 4, the curve of the Weibull distribution coincides with the curve of the Logarithmic Weibull distribution.In Figures 3 and 4, the horizontal axis is the number of cycle times, and the vertical axis is cumulative failure rate, which is calculated according to DUT saturation voltage drop [13,17].From Figure 4, it seems that the Weibull distribution is closer to the practical fatigue process (hence the fault transition process) compared with the normal distribution, because the latter one shows a larger error.What is more, the lognormal distribution is basically consistent with normal distribution; hence the same fitting accuracy is obtained.However, the transient component of failure density in the case of the lognormal distribution function is too high; therefore, it is not suitable to describe the failure density of high-power IGBTs.The logarithmic Weibull distribution has the same characterization accuracy as the Weibull distribution, however its calculation process is much more complicated.Finally, the Weibull distribution is the most suitable one.From Figure 4, it seems that the Weibull distribution is closer to the practical fatigue process (hence the fault transition process) compared with the normal distribution, because the latter one shows a larger error.What is more, the lognormal distribution is basically consistent with normal distribution; hence the same fitting accuracy is obtained.However, the transient component of failure density in the case of the lognormal distribution function is too high; therefore, it is not suitable to describe the failure density of high-power IGBTs.The logarithmic Weibull distribution has the same Appl.Sci.2019, 9, 444 6 of 19 characterization accuracy as the Weibull distribution, however its calculation process is much more complicated.Finally, the Weibull distribution is the most suitable one.

Effect of FWD Actions on Device Aging
It is not enough to consider IGBT only in the fatigue model.In the device, current can flow in two directions, as shown in Figure 5, where C and E are the external terminals of the device.In Figure 5a, current flows through IGBT.In Figure 5b, current flows through FWD.In AFT, the DUT is heated by the test current flowing through it.However, conventional AFT only offers the test current in one direction, i.e., the direction shown in Figure 5a [13][14][15], considering that IGBT is more fragile.In conventional AFT (hence SAFT, single-directional AFT), the aging of IGBT is equal to the aging of DUT.However, the current through FWD also generates heat, hence FWD actions also cause DUT to age.SAFT should be improved to bidirectional AFT (BAFT) where the test current flows through both the IGBT and FWD.The data that is obtained by BAFT accurately shows the aging effects caused by IGBTs and FWD.Such an effect is shown more clearly in junction temperature variation, as shown in the simulation of the TC operation below.Figure 6 shows TC topology, where Q11~Q32 are IGBTs, D11~D32 are FWDs, QB, RB and DB constitute the brake energy consumption branch.Q11~Q32 and D11~D32 are usually packaged in six modules onto the heat sink.QB and DB is not used frequency, therefore only the fatigue of Q11~Q32 and D11~D32 are considered here.
The internal temperature of the IGBTs and FWDs under the given operating conditions was obtained in an electro-thermal simulation of the TC model.In the simulation, the EMU went through two operating cycles.The simulation results are shown in Figure 7.The horizontal axis in Figure 7 is time, and the vertical axis is junction temperature.The switching device used is 1500 A/3300V, the switching frequency is 57 Hz~1000 Hz and the ambient temperature is 25 °C.Figure 6 shows TC topology, where Q11~Q32 are IGBTs, D11~D32 are FWDs, QB, RB and DB constitute the brake energy consumption branch.Q11~Q32 and D11~D32 are usually packaged in six modules onto the heat sink.QB and DB is not used frequency, therefore only the fatigue of Q11~Q32 and D11~D32 are considered here.
The internal temperature of the IGBTs and FWDs under the given operating conditions was The internal temperature of the IGBTs and FWDs under the given operating conditions was obtained in an electro-thermal simulation of the TC model.In the simulation, the EMU went through two operating cycles.The simulation results are shown in Figure 7.The horizontal axis in Figure 7 is time, and the vertical axis is junction temperature.The switching device used is 1500 A/3300V, the switching frequency is 57 Hz~1000 Hz and the ambient temperature is 25 • C. In Figure 7, the maximum variation of FWD junction temperature (ΔTj2) is 56.3 °C, and that of IGBT (ΔTj1) is 39.7 °C.Junction temperature variation severely affects device lifetime [20,21].According to the experimental data of LESIT [22] and CIPS08 [23], the junction temperature change and the average junction temperature are the key factors affecting the aging process.Table 2 shows the maximum values of ΔTj2 andΔTj1 under different load conditions [24].ΔTj2 and ΔTj1 are affected by various factors, such as power loss of IGBT and FWD, heat dissipation condition and thermal resistance.In Table 2, ΔTj2 and ΔTj1 increase as the vehicle load increases (which appears in the form of load current, i.e., the fatigue current), which means that there is certain one-to-one correspondence between ΔTj1, ΔTj2 and the load current.In Figure 7, it should be noted that ΔTj1 is higher in the traction phase than ΔTj2 and lower in the braking phase.What is more, ΔTj2 grows much faster than ΔTj1 in the high-speed braking phase.This stems from the fact that more TC power output is required in the high-speed braking phase than in the traction phase, and such braking power mainly flows through the FWD instead of the IGBT [6].Therefore, the role of FWD should not be ignored during the fatigue of the switching device.
As stated above, the IGBT ages faster with the FWD actions, i.e., the fatigue process is accelerated.If the aging state of the DUT is still represented by the aging state of the IGBT (as is the case in the existing references [2][3][4]), this acceleration effect of the FWD action must be considered.

BAFT and the Analysis of BAFT Results
The fatigue model reveals the relationship between the main factors affecting fatigue accumulation and the degree of fatigue.The load condition of the vehicle directly affects the fatigue of the switching device.The TC current output reflects the load level and this current flow completely through the switching device.Therefore, there is a one-to-one correspondence between the TC current output and the fatigue level of the switching device, and there is a one-to-one correspondence between the device current and the fatigue level and between the device current and the device lifetime.The correspondence is shown in Equation ( 8).In Figure 7, the maximum variation of FWD junction temperature (∆T j2 ) is 56.3 • C, and that of IGBT (∆T j1 ) is 39.7 • C. Junction temperature variation severely affects device lifetime [20,21].According to the experimental data of LESIT [22] and CIPS08 [23], the junction temperature change and the average junction temperature are the key factors affecting the aging process.Table 2 shows the maximum values of ∆T j2 and∆T j1 under different load conditions [24].∆T j2 and ∆T j1 are affected by various factors, such as power loss of IGBT and FWD, heat dissipation condition and thermal resistance.In Table 2, ∆T j2 and ∆T j1 increase as the vehicle load increases (which appears in the form of load current, i.e., the fatigue current), which means that there is certain one-to-one correspondence between ∆T j1 , ∆T j2 and the load current.In Figure 7, it should be noted that ∆T j1 is higher in the traction phase than ∆T j2 and lower in the braking phase.What is more, ∆T j2 grows much faster than ∆T j1 in the high-speed braking phase.This stems from the fact that more TC power output is required in the high-speed braking phase than in the traction phase, and such braking power mainly flows through the FWD instead of the IGBT [6].Therefore, the role of FWD should not be ignored during the fatigue of the switching device.
As stated above, the IGBT ages faster with the FWD actions, i.e., the fatigue process is accelerated.If the aging state of the DUT is still represented by the aging state of the IGBT (as is the case in the existing references [2][3][4]), this acceleration effect of the FWD action must be considered.

BAFT and the Analysis of BAFT Results
The fatigue model reveals the relationship between the main factors affecting fatigue accumulation and the degree of fatigue.The load condition of the vehicle directly affects the fatigue of the switching device.The TC current output reflects the load level and this current flow completely through the switching device.Therefore, there is a one-to-one correspondence between the TC current output and the fatigue level of the switching device, and there is a one-to-one correspondence between the device current and the fatigue level and between the device current and the device lifetime.The correspondence is shown in Equation (8).
N f is the number of cycles the device can withstand before failure.We denote N f as the lifetime of the device; I eq is the equivalent current amplitude of the device.
The acceleration effect of the FWD action means that the life of the device is shortened, as shown in Equation (9).
In Equation ( 7), N f2 is the practical lifetime with FWD action, while N f1 is the lifetime of only considering IGBT action; n is a coefficient indicating an acceleration effect.
However, n is difficult to obtain through experimental testing, so other ways need to be considered to express the aging acceleration effect of FWD on power modules.For better applicability, we present the Equation ( 8) in its current form, as shown in Equation (10).
In Equation ( 10), i G is the current of the IGBT, i D is the current of FWD and µ is the acceleration factor.The fatigue of semiconductor devices is caused by the accumulation of thermal stress effects, so the current is in the form of RMS (Root-Mean-Square), and there is a square operation.
In order to obtain µ, we propose a new AFT (BAFT) with bi-directional fatigue current.The BAFT test platform is shown in Figure 8. Nf is the number of cycles the device can withstand before failure.We denote Nf as the lifetime of the device; Ieq is the equivalent current amplitude of the device.
The acceleration effect of the FWD action means that the life of the device is shortened, as shown in Equation (9).
In Equation ( 7), Nf2 is the practical lifetime with FWD action, while Nf1 is the lifetime of only considering IGBT action; n is a coefficient indicating an acceleration effect.
However, n is difficult to obtain through experimental testing, so other ways need to be considered to express the aging acceleration effect of FWD on power modules.For better applicability, we present the Equation ( 8) in its current form, as shown in Equation (10).
In Equation (10), iG is the current of the IGBT, iD is the current of FWD and μ is the acceleration factor.The fatigue of semiconductor devices is caused by the accumulation of thermal stress effects, so the current is in the form of RMS (Root-Mean-Square), and there is a square operation.
In order to obtain μ, we propose a new AFT (BAFT) with bi-directional fatigue current.The BAFT test platform is shown in Figure 8.In Figure 8, the device functions are similar to those in the single-directional accelerated fatigue test platform of Figure 2, except that when G2 and G4 are turned on, I1 flows through the FWD portion inside the DUT.In BAFT, the number of cycles of IGBT and FWD is equal, so only half of the total number of actions is considered.
For comparison, a bi-directional experiment was performed at I1 of 1500 A. Based on the data obtained from the bi-directional experiment, the cumulative failure rate curve of BAFT is shown together with the cumulative failure rate curve of SAFT at 1500 A in Figure 9.In Figure 9, the horizontal axis is the number of cycle times and the vertical axis is cumulative failure rate.It can be seen that the bi-directional experiment significantly accelerates the aging of the entire switching device.In Figure 8, the device functions are similar to those in the single-directional accelerated fatigue test platform of Figure 2, except that when G 2 and G 4 are turned on, I 1 flows through the FWD portion inside the DUT.In BAFT, the number of cycles of IGBT and FWD is equal, so only half of the total number of actions is considered.
For comparison, a bi-directional experiment was performed at I 1 of 1500 A. Based on the data obtained from the bi-directional experiment, the cumulative failure rate curve of BAFT is shown together with the cumulative failure rate curve of SAFT at 1500 A in Figure 9.In Figure 9, the horizontal axis is the number of cycle times and the vertical axis is cumulative failure rate.It can be seen that the bi-directional experiment significantly accelerates the aging of the entire switching device.Since the fatigue process of the switching device follows the Weibull distribution, the Weibull distribution is fitted by the least squares method according to the bi-directional aging experimental data under 1500 A, and the mathematical analysis relationship between the BAFT and the number of cycles x can be fitted to: The mean of the Weibull distribution: In Equation ( 13), Γ is a gamma function, and λ and k are parameters of the Weibull distribution.
To derive μ, we define an intermediate variable m which is derived from the quotient between the mean of FB (x) and FS (x), as shown in Equation (14).
From the difference between BAFT and SAFT, it is given that: Since the fatigue process of the switching device follows the Weibull distribution, the Weibull distribution is fitted by the least squares method according to the bi-directional aging experimental data under 1500 A, and the mathematical analysis relationship between the BAFT and the number of cycles x can be fitted to: 15.05 (11) The result curve of SAFT is fitted as: 12.98 (12) The mean of the Weibull distribution: In Equation ( 13), Γ is a gamma function, and λ and k are parameters of the Weibull distribution.
To derive µ, we define an intermediate variable m which is derived from the quotient between the mean of F B (x) and F S (x), as shown in Equation (14).
From the difference between BAFT and SAFT, it is given that: Consider i G = i D = 1500A and m = 1.1125.Equation (10) shows that FWD accelerates the aging speed by 23.77%.

Fault Model Fitted by Weighted LSM
However, it is not enough to analyze the experimental data of bi-directional aging under a single current.In order to obtain a more consistent distribution function with the practical trend of failure rate, this paper will analyze the bi-directional aging under different current values.Take the bi-directional 900A experiment as an example.The data collected from the experiment are plotted in Figure 10.In Figure 10, the horizontal axis is the number of cycle times and the vertical axis is cumulative failure rate.

Fault model fitted by Weighted LSM
However, it is not enough to analyze the experimental data of bi-directional aging under a single current.In order to obtain a more consistent distribution function with the practical trend of failure rate, this paper will analyze the bi-directional aging under different current values.Take the bi-directional 900A experiment as an example.The data collected from the experiment are plotted in Figure 10.In Figure 10, the horizontal axis is the number of cycle times and the vertical axis is cumulative failure rate.
It should be noted that when the test current decreases, the aging speed decreases sharply [25].It is too difficult to obtain complete aging data through experiments.Therefore, only the part with a failure rate of 0.4 is used for analysis.Through the previous analysis, Weibull distribution is selected to fit the failure rate distribution.The least squares method can be used to estimate the parameters of non-linear regression analysis.The traditional methods of least squares are direct search method, lattice search method, Gauss-Newton method, Newton-Raphson method, etc. [26].Direct search method and lattice search method are inefficient and seldom used in practical engineering [27].Gauss-Newton method is also called linearization method.Newton-Raphson method is an improvement of Gauss-Newton method.However, in some cases, these two methods may require much iteration to converge, and in a few cases, they may not converge at all [26].
To a certain extent, the problem of the above method can be avoided by using a genetic algorithm for least squares.GA (genetic algorithm) is an adaptive stochastic search method that is based on natural selection and biogenetic theory [28].Firstly, the solution of the problem to be solved is regarded as a number of the individuals in the population.The individual in the population is coded and expressed as a string.Then, according to the fitness of the individuals, the selection, crossing and variation operations between the individuals are simulated to generate new It should be noted that when the test current decreases, the aging speed decreases sharply [25].It is too difficult to obtain complete aging data through experiments.Therefore, only the part with a failure rate of 0.4 is used for analysis.
Through the previous analysis, Weibull distribution is selected to fit the failure rate distribution.The least squares method can be used to estimate the parameters of non-linear regression analysis.The traditional methods of least squares are direct search method, lattice search method, Gauss-Newton method, Newton-Raphson method, etc. [26].Direct search method and lattice search method are inefficient and seldom used in practical engineering [27].Gauss-Newton method is also called linearization method.Newton-Raphson method is an improvement of Gauss-Newton method.However, in some cases, these two methods may require much iteration to converge, and in a few cases, they may not converge at all [26].
To a certain extent, the problem of the above method can be avoided by using a genetic algorithm for least squares.GA (genetic algorithm) is an adaptive stochastic search method that is based on natural selection and biogenetic theory [28].Firstly, the solution of the problem to be solved is regarded as a number of the individuals in the population.The individual in the population is coded and expressed as a string.Then, according to the fitness of the individuals, the selection, crossing and variation operations between the individuals are simulated to generate new solutions.In the process of population cyclic evolution, individuals gradually evolve to the state of the approximate optimal solution; therefore, the optimal solution of the problem can be obtained.The flow chart is shown in Figure 11.
solutions.In the process of population cyclic evolution, individuals gradually evolve to the state of the approximate optimal solution; therefore, the optimal solution of the problem can be obtained.The flow chart is shown in Figure 11.The most important part of GA is fitness function.When GA simulates the selection of nature, the adaptation of each generation of population to conditions corresponds to a necessary parameter, which determines how much probability the individuals in this population participate in the heredity.The function of evaluating this probability is called fitness function.The suitability of fitness function selection will affect the whole GA.Selecting a different fitness function will result in different convergence and accuracy of the whole algorithm [29].
Aiming at the fitting of Weibull distribution in this paper, the purpose of our fitting is to find the appropriate values of λ and k, so we take λ and k as variables in the initial population.
The fitness function selected is: (1 ) In Equation ( 16), xi is the number of experimental cycles and Yi is the failure rate corresponding to the number i experimental cycle in the actual experimental data.
It should be noted that GA is to solve the fitness function to a minimum value, and the corresponding variables of the minimum value as the output of the optimal solution.The concept of LSM is also used in the construction of fitness function.The purpose of the fitness function is to minimize the absolute value of the difference between the output value of the function to be fitted and the actual data.After summing the squares of the difference between each output value and the practical data, we can calculate a function that is closest to the curve that was formed by the practical data.
Because of the existence of crossing and variation operations, GA has randomness in the process of finding the optimal solution, which can avoid the algorithm falling into local optimum to a certain extent.The search for the optimal solution by GA is general.The fitting degree of λ and k The most important part of GA is fitness function.When GA simulates the selection of nature, the adaptation of each generation of population to conditions corresponds to a necessary parameter, which determines how much probability the individuals in this population participate in the heredity.The function of evaluating this probability is called fitness function.The suitability of fitness function selection will affect the whole GA.Selecting a different fitness function will result in different convergence and accuracy of the whole algorithm [29].
Aiming at the fitting of Weibull distribution in this paper, the purpose of our fitting is to find the appropriate values of λ and k, so we take λ and k as variables in the initial population.
The fitness function selected is: In Equation ( 16), x i is the number of experimental cycles and Y i is the failure rate corresponding to the number i experimental cycle in the actual experimental data.
It should be noted that GA is to solve the fitness function to a minimum value, and the corresponding variables of the minimum value as the output of the optimal solution.The concept of LSM is also used in the construction of fitness function.The purpose of the fitness function is to minimize the absolute value of the difference between the output value of the function to be fitted and the actual data.After summing the squares of the difference between each output value and the practical data, we can calculate a function that is closest to the curve that was formed by the practical data.
Because of the existence of crossing and variation operations, GA has randomness in the process of finding the optimal solution, which can avoid the algorithm falling into local optimum to a certain extent.The search for the optimal solution by GA is general.The fitting degree of λ and k fitted by GA should be the closest to all data points, and the fitted function should be the closest to the actual curve.
Taking λ and k as input variables and Equation ( 16) as fitness function, the algorithm is simulated by the GA toolbox of MATLAB.After 200 generations of the calculation, the optimal value is λ = 925830, k = 5.7629.The relationship curve between the optimal solution of each generation and the number of iterations of the algorithm is shown in Figure 12.In Figure 12, the horizontal axis is the number of generation and the vertical axis is fitness value.It can be seen that the algorithm converges basically after about 10 generations.
Appl.Sci.2019, 9 FOR PEER REVIEW 13 fitted by GA should be the closest to all data points, and the fitted function should be the closest to the actual curve.Taking λ and k as input variables and Equation ( 16) as fitness function, the algorithm is simulated by the GA toolbox of MATLAB.After 200 generations of the calculation, the optimal value is λ = 925830, k = 5.7629.The relationship curve between the optimal solution of each generation and the number of iterations of the algorithm is shown in Figure 12.In Figure 12, the horizontal axis is the number of generation and the vertical axis is fitness value.It can be seen that the algorithm converges basically after about 10 generations.Using GA to calculate this can get better fitting accuracy and convergence, however when facing a large number of data in IGBT acceleration test, GA takes too much time, which is the biggest problem of genetic algorithm.Therefore, we introduce a weighted least squares method.
The weighted algorithm is helpful to select the most important region of a set of data.Because the aging process is too slow and the data that are obtained are less when the current is small, in order to describe the average level of bi-directional aging of IGBT under low current (900 A), the LSM method of weighted residual is introduced in this paper.The LSM method of weighted residuals can conveniently determine which interval in the whole definition domain is the most concerned.
The standard LSM is to form a hyper plane in two dimensions (single independent variable, single dependent variable or SISO) by some functions.The sum of the distances between the hyper plane and the points in the sample space is the smallest.This hyper plane can be defined as { ( ), ( )..., ( )} span x x x ; the equation at any point on the extended hyper plane can be defined as: In fact, ( ),( , ) y f x x y C (18) In Equation (17), C is the set of sample spaces.In weighted LSM，we define residual as the sum of the distances between points and the hyper planes.
In the improved LSM, the definition Using GA to calculate this can get better fitting accuracy and convergence, however when facing a large number of data in IGBT acceleration test, GA takes too much time, which is the biggest problem of genetic algorithm.Therefore, we introduce a weighted least squares method.
The weighted algorithm is helpful to select the most important region of a set of data.Because the aging process is too slow and the data that are obtained are less when the current is small, in order to describe the average level of bi-directional aging of IGBT under low current (900 A), the LSM method of weighted residual is introduced in this paper.The LSM method of weighted residuals can conveniently determine which interval in the whole definition domain is the most concerned.
The standard LSM is to form a hyper plane in two dimensions (single independent variable, single dependent variable or SISO) by some functions.The sum of the distances between the hyper plane and the points in the sample space is the smallest.This hyper plane can be defined as span{ϕ 1 (x), ϕ 2 (x)..., ϕ n (x)}; the equation at any point on the extended hyper plane can be defined as: In fact, In Equation ( 17), C is the set of sample spaces.
In weighted LSM, we define residual as the sum of the distances between points and the hyper planes.
In the improved LSM, the definition where α i is defined as the quotient between the number of abscissa x i ∈ (X i , X i + ∆X) of sample point (x i , y i ) and the total number of sample points.
∆X is usually determined by experiment or experience.For IGBT aging failure rate fitting, it is actually the resolution of junction temperature fluctuation (the relative relationship between failure rate and junction temperature fluctuation) or the time resolution (the relationship between failure rate and time).
This is actually the definition of product in vector space.
Note: Here A can be given according to the normal distribution.According to the weighted LSM theory, we select different weights and obtain three curves, as shown in Figure 13.In Figure 13, the horizontal axis is the number of cycle times and the vertical axis is cumulative failure rate.From the point of view of the relationship between failure rate and time, in fact, what we need to focus on is not only the accuracy of characterization of failure rate in the whole life cycle of products, however also the accuracy of the most product failure processes.Accurately grasping the change of the latter is more conducive to accurately predicting the state of aging damage of components.According to this idea, observing the experimental data curve under 900 A, the failure   From the point of view of the relationship between failure rate and time, in fact, what we need to focus on is not only the accuracy of characterization of failure rate in the whole life cycle of products, however also the accuracy of the most product failure processes.Accurately grasping the change of the latter is more conducive to accurately predicting the state of aging damage of components.According to this idea, observing the experimental data curve under 900 A, the failure rate rises rapidly in the interval where the number of tests is [4.5 × 10 5 , 7.5 × 10 5 ], so we think that curve 2 fits the aging situation under 900 A most accurately.By fitting the data with this weighted least squares method, the function corresponding to the above curve 2 is the closest to the actual distribution function.
In order to validate the distribution function fitted by weighted LSM, we compare the fitting results with those of GA.The function curve obtained is compared with the curve that was fitted by weighted LSM as shown in Figure 14, where the horizontal axis is the number of cycle times and the vertical axis is cumulative failure rate.According to Figure 14, the curve of the function that was obtained by GA and curve 2 almost coincides in the interval of cycle number [4 × 10 5 , 8 × 10 5 ], which proves that the weighted LSM can more accurately fit the distribution function.Compared with the GA, the computational complexity of weighted LSM is greatly reduced and the computation time is shortened by 27% compared with the GA.So, the weighted LSM is more suitable for fitting the distribution.

Analytical Fatigue Model
Different BAFTs are based on different I1, as shown in Figure 15.The horizontal axis in Figure 15 is the number of cycle times, and the vertical axis is cumulative failure rate.Under I1 at 1200 A and 900 A, the mathematical model between cumulative failure rate F and cycle number x is as follows: According to Figure 14, the curve of the function that was obtained by GA and curve 2 almost coincides in the interval of cycle number [4 × 10 5 , 8 × 10 5 ], which proves that the weighted LSM can more accurately fit the distribution function.Compared with the GA, the computational complexity of weighted LSM is greatly reduced and the computation time is shortened by 27% compared with the GA.So, the weighted LSM is more suitable for fitting the distribution.

Analytical Fatigue Model
Different BAFTs are based on different I 1 , as shown in Figure 15.The horizontal axis in Figure 15 is the number of cycle times, and the vertical axis is cumulative failure rate.Under I 1 at 1200 A and 900 A, the mathematical model between cumulative failure rate F and cycle number x is as follows:  In Equations ( 21)-( 23), F B1500 (x), F B1200 (x) and F B900 (x) are the cumulative failure efficiencies of I 1 at 1500A, 1200A and 900A, respectively, and their mean values are 304890, 479170 and 851580.The corresponding equipment service life N f can be expressed in terms of the mean value.

Analytical Fatigue Model
Different BAFTs are based on different I1, as shown in Figure 15.The horizontal axis in Figure 15 is the number of cycle times, and the vertical axis is cumulative failure rate.Under I1 at 1200 A and 900 A, the mathematical model between cumulative failure rate F and cycle number x is as follows:  In addition, the ambient temperature T a should also be considered [30,31].According to Equation ( 9), the N f expression considering T a is obtained.
In Equation (24), g is specific functions.Finally, after the fitting calculation, Equation ( 9) is expressed as: In Equation (25), k is an Arrhenius coefficient.According to Arrhenius's law (a temperature increase of 10 • C means that the service life is shortened by half), k is calculated to be 60.65.The experimental and fitting curves are shown in Figure 16.
Considering the RMS calculation scheme, the amplitude is √ 2 times of the effective value, so: From Equation ( 26): The data obtained in the experiment and the curve that was obtained by mathematical fitting is shown in Figure 16.The vertical axis in Figure 16 is the cycle counts that a DUT could survive before failure.The horizontal axis is the amplitude value of the test current.As shown in Figure 16, the obtained curve conforms to the obtained data very well.
By Equation (27), when I eq and T a are known, the service life or failure rate of the IGBT can be predicted.Equation (27) has been subjected to actual field data.The example we consider here is a metro line application in Shenzhen, a southern-China city, with the annual temperature average of 23 • C. The I eq has been recorded to be 489 A in a single year; therefore, N f is calculated to be 2,197,460.Considering that a train normally operates 204 cycles per day and normally 300 days per year, the train is capable of operating around 36 years under the same working conditions.According to statistics offered by metro operating agency, the initial failure rate of a new module is 157 FIT.Given that they hold that the failure rate rise of 200% means that the switching devices in a batch have failed, the statistical service life of the applied power module in the specific metro line is around 47 years (offered by the agency).This means that the approach that was proposed has a prediction error of around 23%, which is within project tolerance (normally 20%~30%).It should be noted that the lifetime predicted by our model in Equation ( 27) is more likely to be true and more convincing, because a metro train is designed to operate for no more than 30 years, and the 47-year lifetime prediction does not take into consideration the acceleration of the aging process, as is shown by a bathtub curve.(1.17In Equation (25), k is an Arrhenius coefficient.According to Arrhenius's law (a temperature increase of 10 °C means that the service life is shortened by half), k is calculated to be 60.65.The experimental and fitting curves are shown in Figure 16.
Considering the RMS calculation scheme, the amplitude is 2 times of the effective value, so: The data obtained in the experiment and the curve that was obtained by mathematical fitting is shown in Figure 16.The vertical axis in Figure 16 is the cycle counts that a DUT could survive before failure.The horizontal axis is the amplitude value of the test current.As shown in Figure 16, the obtained curve conforms to the obtained data very well.By Equation (27), when Ieq and Ta are known, the service life or failure rate of the IGBT can be predicted.Equation (27) has been subjected to actual field data.The example we consider here is a metro line application in Shenzhen, a southern-China city, with the annual temperature average of

Conclusions
The object of the fault model in this paper is the fault model of the high-power switching device that was applied to rail transit.Based on a new test method using bidirectional current aging, we propose a method for extracting the acceleration factor of the freewheeling diode in the switching device for the entire module.Then, the weighting factor is introduced in the traditional least squares method, which improves the processing efficiency of the test results and the fitting accuracy of the model.Finally, an IGBT fault prediction model was presented by using the aging current and ambient temperature as sensitive factor.

Figure 1 .
Figure 1.Internal structure of the switchgear.

Figure 1 .
Figure 1.Internal structure of the switchgear.
Weibull distribution, it gives:

Figure 3 .
Figure 3.Comparison between data curves of single-direction 1500 A fatigue test and four distribution fitting results.

Figure 3 .
Figure 3.Comparison between data curves of single-direction 1500 A fatigue test and four distribution fitting results.Appl.Sci.2019, 9 FOR PEER REVIEW 6

Figure 4 .
Figure 4. Comparison of the density function of Weibull distribution, Logarithmic Weibull distribution and normal distribution, lognormal distribution under the single-direction 1500 A fatigue test.

Figure 4 .
Figure 4. Comparison of the density function of Weibull distribution, Logarithmic Weibull distribution and normal distribution, lognormal distribution under the single-direction 1500 A fatigue test.

Figure 6 Figure 5 .Figure 6 .
Figure 6 shows TC topology, where Q 11 ~Q32 are IGBTs, D 11 ~D32 are FWDs, QB, RB and DB constitute the brake energy consumption branch.Q 11 ~Q32 and D 11 ~D32 are usually packaged in six modules onto the heat sink.QB and DB is not used frequency, therefore only the fatigue of Q 11 ~Q32 and D 11 ~D32 are considered here.

Figure 7 .
Figure 7. Junction Temperature Simulation of Q 32 and D 32 under AW0.

Figure 8 .
Figure 8. Schematic diagram of a bi-directional accelerated fatigue test.

Figure 8 .
Figure 8. Schematic diagram of a bi-directional accelerated fatigue test.

Figure 11 .
Figure 11.Flow chart of genetic algorithm.

Figure 11 .
Figure 11.Flow chart of genetic algorithm.

Figure 12 .
Figure 12.Optimal Solution Curve of Each Generation.

Figure 13 .
Figure 13.Weibull distribution function curves fitted under different weights.

Figure 13 .
Figure 13.Weibull distribution function curves fitted under different weights.

Figure 14 .
Figure 14.Comparison of Weibull distribution fitted by GA and curves fitted by weighted LSM (least squares method).

Figure 14 .
Figure 14.Comparison of Weibull distribution fitted by GA and curves fitted by weighted LSM (least squares method). 01

5
Fatigue current in amperesThe curve obtained by mathematical fittingThe number of cycle timesThe data obtained in experiment

Figure 16 .
Figure 16.Experimental results of BAFT and fitting curves.

Figure 16 .
Figure 16.Experimental results of BAFT and fitting curves.

Table 2 .
ΔTj of Q32 and D32 under different load conditions.

Table 2 .
∆T j of Q 32 and D 32 under different load conditions.