Arc Fault Detection Algorithm Based on Variational Mode Decomposition and Improved Multi-Scale Fuzzy Entropy

Arc fault diagnosis is necessary for the safe and efficient operation of PV stations. This study proposes an arc fault diagnosis algorithm that combines variational mode decomposition (VMD), improved multi-scale fuzzy entropy (IMFE), and the support vector machine (SVM). The method first uses VMD to decompose the current into intrinsic mode functions (IMFs) in the time-frequency domain, then calculates the IMFE of the IMFs associated with the arc fault, and finally uses SVM to detect arc faults according to the IMFEs. Arc fault data gathered from a PV arc generation experiment platform are used to validate the proposed method. The results indicate that the proposed method can classify arc fault data and normal data effectively.


Introduction
With the shortage of fossil fuels and increasing concern for the environment, demand for clean energies is increasing. Among clean energies, solar energy has been found to have the advantages of sustainability and cleanliness, attracting more and more attention. According to the reports of the International Energy Agency (IEA) and the International Renewable Energy Agency (IRENA) [1,2], the total installed capacity of the global photovoltaic industry has maintained a stable upward trend from 2011 to 2019.
At present, photovoltaic systems usually operate at a high voltage to save on the investment in cables [3-8]. For example, the DC voltage of a photovoltaic system in some places has reached 1000 V [9]. High voltage makes air more easily ionized and greatly increases the possibility of a photovoltaic DC arc [10]. In addition, a PV system is often placed in harsh conditions, where small creatures, high air humidity, and so on can also trigger photovoltaic DC arcs. If an arc is not detected in time, a continuous arc can generate high-temperature plasma, which can severely damage the components of the photovoltaic system or even cause a fire. This kind of damage can severely affect the efficiency and profitability of a PV system. Therefore, arc fault diagnosis plays an important role in keeping photovoltaic systems operating with reasonable efficiency, profitability, and reliability.
DC arcs can be divided into parallel arcs and series arcs. Parallel arcs cause over-current and can be easily detected and prevented in time by the over-current protection device [11]. However, series arcs do not cause over-current. It is therefore necessary to conduct deeper research on the diagnosis of DC-side series arcs [12-15].
At present, series arcs can be detected by using sensors to collect several types of signals or by analyzing electrical data in the time domain or the frequency domain [16-22]. Sensors can effectively detect arcs and complete the task of circuit protection. However, sensors are expensive and hard to use widely.
Analyzing data in the time domain has critical requirements for the input data. However, the output of the DC side of a photovoltaic power generation system will be affected by the external environment, which results in nonlinearity and instability in the data. It is usually difficult to make arc detections by using this method alone.
Analyzing electrical data in the frequency domain avoids these disadvantages. Frequency-domain signal analysis methods such as the fast Fourier transform (FFT) [23,24], the short-time Fourier transform (STFT) [25,26], the wavelet transform (WT) [27,28], and empirical mode decomposition (EMD) [29,30] have been applied to the detection of DC series arcs with relatively good results. However, the FFT is not suitable for nonstationary signals and cannot determine when the arc occurs. The STFT relies heavily on the choice of window function: narrow windows lead to low frequency resolution, while wide windows lead to low temporal resolution. The WT depends heavily on the choice of wavelet basis, and an improper wavelet basis degrades the detection. The EMD suffers from mode mixing of the intrinsic mode functions (IMFs): when mode mixing occurs, the decomposed signals overlap in the frequency domain, which makes the decomposition meaningless. In contrast, variational mode decomposition (VMD) [31-36] avoids almost all of the disadvantages mentioned above. It is a completely non-recursive signal decomposition method based on the frequency domain, and it has been widely used in bearing fault detection, EEG signal detection, and so on. Due to these advantages, VMD is used in this paper to decompose the current data for further analysis.
After decomposition is completed, nonlinear parameter estimation methods should be used to further analyze the decomposition results. Approximate entropy [37], sample entropy [38], and fuzzy entropy [39,40] have been used to extract the characteristic parameters of signals obtained by analysis in the frequency domain. However, approximate entropy often produces invalid values, sample entropy lacks stability, and fuzzy entropy only describes the characteristics of the signal at a single scale. The multi-scale entropy proposed by Costa makes fuzzy entropy applicable to multi-scale situations. In this paper, an improved version of multi-scale fuzzy entropy (IMFE) is used to further analyze the decomposed data.
In this paper, an improved diagnosis algorithm consisting of VMD, IMFE, and the support vector machine (SVM) is proposed to diagnose PV DC series arc faults based on the current data. The block diagram of this algorithm is shown in Figure 1. The advantages of the proposed algorithm are as follows:
• VMD avoids mode mixing, which ensures that the decomposition is meaningful;
• IMFE is stable and can describe signals at multiple scales;
• SVM performs well on nonlinear classification problems.

Arc Fault Data Feature Analysis
The proposed method of our research focused on its application in power stations located in Hubei Enshi Badong. One of them was the Songjialiangzi power plant. Photos of it are shown in Figure 2a.
In order to analyze the features of arc fault data, a photovoltaic DC arc generation experiment platform was built. The system chart is shown in Figure 3. This system consisted of a PV array, a breaker, a load resistance, a current and voltage sampling device, and an arc generator. The sampling device consisted of a Hall current sensor and an oscilloscope. The sampling frequency was 500 kHz. The arc generator, whose structure is shown in Figure 4, was built according to the UL1699B standard. The photos of the sampling devices, the arc generator, and the arc are shown in Figures 5-7.
In the experiment, the knob of the arc generator was slowly rotated to shorten the distance between the electrodes. As the distance became smaller, an arc was generated between the electrodes. Then the oscilloscope collected the current data. The current data of an arc fault under different load conditions were obtained by changing the value of the load resistance. By removing the arc generator, normal current data were also obtained.
The unprocessed signal showed that low-frequency AC noise existed in both normal and arc fault situations. This type of noise does not help to detect the fault and complicates the problem. Therefore, a high-pass filter with a cut-off frequency of 30 kHz was used to remove the low-frequency noise. The comparison of the processed current data is shown in Figures 8 and 9.
Figure 8 shows that in the time domain, the current of the arc fault oscillated wildly compared to normal situations. Figure 9 indicates that the fault current was significantly different from the normal one in the 30-100 kHz band [10]. VMD was used to process the signal with the aim of accurately analyzing the frequency components related to the arc.
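The effect of the 30 kHz high-pass stage on a signal sampled at 500 kHz can be illustrated with a minimal sketch. The filter design is not stated in the text, so the first-order RC-style filter below, together with the names `highpass_first_order` and `sine`, is an assumption chosen only for illustration.

```python
import math

FS_HZ = 500_000      # sampling frequency stated in the text
CUTOFF_HZ = 30_000   # cut-off frequency stated in the text


def highpass_first_order(samples, cutoff_hz, fs_hz):
    """Discrete first-order (RC-style) high-pass filter.

    Assumption: the order and design are illustrative only; the text gives
    just the cut-off frequency and the sampling frequency.
    """
    rc = 1.0 / (2.0 * math.pi * cutoff_hz)  # equivalent RC time constant
    dt = 1.0 / fs_hz                        # sampling interval
    alpha = rc / (rc + dt)
    filtered = [0.0] * len(samples)         # output starts at rest (zero)
    for i in range(1, len(samples)):
        filtered[i] = alpha * (filtered[i - 1] + samples[i] - samples[i - 1])
    return filtered


def sine(freq_hz, n_samples, fs_hz=FS_HZ):
    """Unit-amplitude test tone (hypothetical helper for the demonstration)."""
    return [math.sin(2.0 * math.pi * freq_hz * i / fs_hz) for i in range(n_samples)]


# A 50 Hz tone stands in for low-frequency AC noise, a 100 kHz tone for arc content.
low_out = highpass_first_order(sine(50.0, 20_000), CUTOFF_HZ, FS_HZ)
high_out = highpass_first_order(sine(100_000.0, 1_000), CUTOFF_HZ, FS_HZ)
print(max(abs(v) for v in low_out), max(abs(v) for v in high_out[100:]))
```

A higher-order design would give a sharper transition around 30 kHz; the first-order form is used here only because it needs no external library.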

Variational Mode Decomposition
VMD decomposes a time-series signal $f(t)$ into K IMFs, each of which has a central frequency. VMD is an optimization problem based on minimizing the bandwidth of the IMFs. It can be described by Formula (1). In Formula (1), K is the number of IMFs, the k-th IMF is denoted by $u_k(t)$, the central frequency of $u_k(t)$ is denoted by $\omega_k$, and $(a * b)$ denotes the convolution of a with b. $\|a\|_2$ denotes the L2-norm of a, which can be calculated by Formula (2), and $\partial_t X$ denotes the partial derivative of X with respect to t.
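Formulas (1) and (2) are not reproduced in this text. For reference, the standard VMD formulation, written with the symbols defined above, is commonly stated as follows. The Dirac distribution $\delta(t)$ and the imaginary unit $j$ are notation assumed here, and the norm is written for a continuous-time signal; the published formulas may differ in presentation.

```latex
\min_{\{u_k\},\{\omega_k\}} \left\{ \sum_{k=1}^{K}
  \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right]
  e^{-j\omega_k t} \right\|_2^2 \right\}
\quad \text{s.t.} \quad \sum_{k=1}^{K} u_k(t) = f(t)
\tag{1}

\|a\|_2 = \left( \int |a(t)|^2 \, dt \right)^{1/2}
\tag{2}
```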
A quadratic penalty term α and a Lagrangian multiplier $\lambda(t)$ are used to transform this problem into an unconstrained problem, as shown in Formula (3). $\langle x, y \rangle$ denotes the inner product of x and y.
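Formula (3) is likewise not reproduced in this text. In the standard VMD formulation, the augmented Lagrangian takes the form below, where α denotes the quadratic penalty term; this is offered as a reference form under that assumption, not as a transcription of the published formula.

```latex
L\left(\{u_k\},\{\omega_k\},\lambda\right) =
  \alpha \sum_{k=1}^{K} \left\| \partial_t \left[ \left( \delta(t) + \frac{j}{\pi t} \right) * u_k(t) \right] e^{-j\omega_k t} \right\|_2^2
  + \left\| f(t) - \sum_{k=1}^{K} u_k(t) \right\|_2^2
  + \left\langle \lambda(t),\, f(t) - \sum_{k=1}^{K} u_k(t) \right\rangle
\tag{3}
```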
This problem can be solved by using the alternating direction method of multipliers (ADMM) until a saddle point of the augmented Lagrangian is found. $u_k$, $\omega_k$, and λ can be updated by using Formulas (4)-(6). For the convenience of subsequent analysis, Formulas (4)-(6) are given in the frequency domain. In Formulas (4)-(6), $\hat{u}_k(\omega)$ is the Fourier transform of the k-th IMF, $\hat{f}(\omega)$ is the Fourier transform of the original signal $f(t)$, and θ is the noise tolerance parameter.
Energies 2021, 14, 4137
The iteration of $u_k$, $\omega_k$, and λ can be stopped when a saddle point of the augmented Lagrangian is found, which can be judged by Formula (7). When Formula (7) is satisfied, the original signal $f(t)$ has been decomposed into K IMFs $\{u_k\}$, and the central frequency of $u_k$ is $\omega_k$.
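The update and stopping rules referred to as Formulas (4)-(7) are not reproduced in this text. In the standard VMD algorithm they are usually written as below, with θ as the noise tolerance parameter and α as the quadratic penalty term. The iteration index $n$ and the convergence tolerance $\varepsilon$ are symbols assumed here. In Formula (4), the sum over $i \neq k$ uses the most recently updated modes.

```latex
\hat{u}_k^{n+1}(\omega) =
  \frac{\hat{f}(\omega) - \sum_{i \neq k} \hat{u}_i(\omega) + \hat{\lambda}^{n}(\omega)/2}
       {1 + 2\alpha\,(\omega - \omega_k^{n})^2}
\tag{4}

\omega_k^{n+1} =
  \frac{\int_0^{\infty} \omega \, |\hat{u}_k^{n+1}(\omega)|^2 \, d\omega}
       {\int_0^{\infty} |\hat{u}_k^{n+1}(\omega)|^2 \, d\omega}
\tag{5}

\hat{\lambda}^{n+1}(\omega) = \hat{\lambda}^{n}(\omega)
  + \theta \left( \hat{f}(\omega) - \sum_{k=1}^{K} \hat{u}_k^{n+1}(\omega) \right)
\tag{6}

\sum_{k=1}^{K} \frac{\left\| \hat{u}_k^{n+1} - \hat{u}_k^{n} \right\|_2^2}
                    {\left\| \hat{u}_k^{n} \right\|_2^2} < \varepsilon
\tag{7}
```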

Multi-Scale Fuzzy Entropy
After K IMFs are obtained by using VMD, a parameter calculated from the IMFs must be chosen as the feature of the arc. This parameter should maintain essentially the same value when an arc fault does not occur. Once an arc fault occurs, this parameter should change noticeably. In this research, IMFE was chosen as the parameter. IMFE measures the complexity of a time-series signal at different scales. It can be calculated through the following steps.
Firstly, some sampling points in the original time-series signal were selected by using the sliding window method. Then the selected sequence of sampling points $\{u(i): 1 \le i \le N\}$ needed to be coarsened. The coarsened sequence was established by using Formula (8). In Formula (8), τ represents the scale factor, that is, the number of consecutive sampling points combined into one element of the coarsened sequence. This process is illustrated in Figure 10.
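Formula (8) is not reproduced in this text. The index range $1 \le j \le N - \tau + 1$ quoted later for the coarsened sequence is consistent with a moving-average coarsening, in which each element is the mean of τ consecutive points and the window advances by one point. The sketch below assumes that form; the function name `coarse_grain_moving` is hypothetical.

```python
def coarse_grain_moving(u, tau):
    """Moving-average coarsening: y[j] = mean(u[j : j + tau]).

    Assumption: this overlapping form is inferred from the stated length
    N - tau + 1 of the coarsened sequence, not copied from Formula (8).
    """
    if tau < 1 or tau > len(u):
        raise ValueError("tau must satisfy 1 <= tau <= len(u)")
    return [sum(u[j:j + tau]) / tau for j in range(len(u) - tau + 1)]


# With a window of N = 20 points and tau = 5, the coarsened sequence has 16 elements.
window = [float(i) for i in range(20)]
print(len(coarse_grain_moving(window, 5)))
```

Because the window advances one point at a time, a short window still yields enough coarsened points for the entropy estimate, which is the usual motivation for this variant over non-overlapping averaging.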
Then the coarsened sequence was converted into vectors by using Formula (9):
$$X_j^m = \{y_\tau(j), y_\tau(j+1), \cdots, y_\tau(j+m-1)\} - \bar{y}_\tau(j), \quad j = 1, 2, \cdots, N - m - \tau + 2 \tag{9}$$
In Formula (9), $\bar{y}_\tau(j)$ can be calculated by using Formula (10). The distance $d[X_i^m, X_j^m]$ between $X_i^m$ and $X_j^m$ is defined as the maximum absolute difference between their corresponding elements, as given by Formula (11). The similarity degree $D_{ij}^m$ of the vectors $X_i^m$ and $X_j^m$ is defined by a fuzzy function $\mu(d_{ij}^m, \beta, r)$, as given by Formula (12). In Formula (12), r is calculated according to the original data and β is an adjustable constant. ρ is an adjustment factor used to enhance noise tolerance.
Once the similarity degree is defined, the IMFE can be calculated according to Formulas (13) and (14).
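Since Formulas (10)-(14) are not reproduced in this text, the sketch below follows the widely used definition of fuzzy entropy rather than the paper's exact formulas. The assumptions are: the baseline removed from each vector is its own mean, the fuzzy function is the exponential membership $\exp(-d^{\beta}/r)$, and the entropy is the difference of the logarithms of the average similarity at dimensions m and m + 1. The adjustment factor ρ is not modelled because its role in Formula (12) cannot be recovered. The input is an already coarsened sequence.

```python
import math


def _embed(y, m):
    """Build the vectors {y(j), ..., y(j+m-1)} minus their own mean (assumed baseline)."""
    vectors = []
    for j in range(len(y) - m + 1):
        segment = y[j:j + m]
        baseline = sum(segment) / m
        vectors.append([value - baseline for value in segment])
    return vectors


def _average_similarity(y, m, r, beta):
    """Mean fuzzy similarity over all ordered pairs of distinct vectors."""
    vectors = _embed(y, m)
    count = len(vectors)
    total = 0.0
    for i in range(count):
        for j in range(count):
            if i == j:
                continue
            # Chebyshev distance: largest absolute element-wise difference.
            distance = max(abs(a - b) for a, b in zip(vectors[i], vectors[j]))
            total += math.exp(-(distance ** beta) / r)  # assumed membership function
    return total / (count * (count - 1))


def fuzzy_entropy(y, m=3, r=0.2, beta=2.0):
    """Fuzzy entropy of one coarsened sequence (standard form, not the paper's IMFE)."""
    if len(y) < m + 2:
        raise ValueError("sequence too short for the chosen dimension m")
    return (math.log(_average_similarity(y, m, r, beta))
            - math.log(_average_similarity(y, m + 1, r, beta)))


flat = [1.0] * 16                                  # no fluctuation
print(fuzzy_entropy(flat))                         # 0.0: every pair of vectors is identical
```

The default r = 0.2 is a placeholder; in the text r is derived from the standard deviation of the original data through Formula (15). For very small r and large distances the exponential can underflow to zero, which a production implementation would need to guard against.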

The Steps of Arc Fault Diagnosis Algorithm
Given the nonlinear and nonstationary characteristics of the PV DC current, combining VMD and IMFE is particularly suitable for arc diagnosis. The specific steps of the arc fault diagnosis algorithm are described below.
For the first step, the DC current data were decomposed into several IMFs by VMD. Firstly, we initialized the number of IMFs, K, and the IMF signal set. The IMFs, their central frequencies, and the multiplier were then updated by using Formulas (4)-(6) until Formula (7) was satisfied.
For the second step, the IMFs related to the arc fault were selected as the effective IMFs.
For the third step, we used IMFE to analyze the effective IMFs. Firstly, we initialized the length of the sliding window N, the scale factor τ of the coarsened sequence, the dimension m of the converted vectors, the values of the adjustable constants ρ and β, and the formula for calculating r according to the original data. Then N sampling points from an IMF signal $u_k$ were selected by using a sliding window. The selected sampling points $\{u(i): 1 \le i \le N\}$ were coarsened by using Formula (8). Then the coarsened sequence $\{y_\tau(j): 1 \le j \le N - \tau + 1\}$ was converted into vectors by using Formulas (9) and (10). Finally, by defining the distance and the similarity degree function of these vectors $\{X_l^m: 1 \le l \le N - \tau - m + 2\}$, the IMFE could be calculated by using Formulas (13) and (14). By sliding the window forward, the IMFE of the whole IMF was obtained.
In the final step, we took the IMFEs obtained in step 3 and their corresponding labels (normal data were marked as 1, and fault data were marked as 0) as the training data. These data were used to train the support vector machine. The remaining data, which were not used for training, were used to verify the training effect of the support vector machine.

VMD Parameter Selection
Before performing VMD, the number of decomposed IMFs must be selected. When the K value is small, the decomposed IMFs cannot contain all the information of the original data. When the K value is large, mode mixing will occur between the IMFs, which makes these IMFs distorted signals. Therefore, the selection of the K value is very important.
In this research, the K value was chosen by decomposing the signal several times with different K values. The iteration counter $n$ was initialized to 0, and $\hat{u}_k^1$, $\omega_k^1$, and $\hat{\lambda}^1$ were initialized as zero matrices of suitable size. The noise tolerance θ was set to 0.5 and the quadratic penalty term α was set to 2000. Table 1 shows the results of decomposition when K varies from 1 to 5. As shown in Table 1, IMF1 and IMF2 have relatively close central frequencies when K = 5. Mode mixing may occur in this situation. In order to decide whether 4 or 5 is the appropriate K value, the IMFs when K is 4 and 5 are shown in Figures 11-14.
Figure 11. The waveform of IMFs when K = 4.
Figure 14 indicates that IMF1, IMF2, and IMF3 had an obvious overlapping part in the frequency domain. This indicates that, from the perspective of avoiding mode mixing, K = 4 was a more appropriate choice than K = 5.
Besides mode mixing, the stability of the results can also serve as a criterion for choosing the K value. Tables 2 and 3 show the central frequencies of the IMFs decomposed from several sets of data when K was set as either 4 or 5.
Tables 2 and 3 indicate that the stability when K = 4 was better than the stability when K = 5. Therefore, from the perspectives of stability and mode mixing, K should be set as 4 in this research.
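The two criteria used above, closeness of central frequencies and stability across data sets, can be checked numerically. The sketch below is only illustrative: the text does not state how either criterion was quantified, so using the smallest gap between neighbouring central frequencies as a mode-mixing indicator and the relative standard deviation as a stability measure are assumptions, as are the function names. The example frequencies are made-up placeholders, not values from Tables 1-3.

```python
import statistics


def smallest_gap(center_freqs_khz):
    """Smallest spacing between neighbouring central frequencies.

    A small gap means two IMFs sit close together, which hints at mode mixing.
    """
    ordered = sorted(center_freqs_khz)
    return min(high - low for low, high in zip(ordered, ordered[1:]))


def relative_spread(freqs_per_dataset):
    """Relative standard deviation of each IMF's central frequency across data sets."""
    per_imf = zip(*freqs_per_dataset)  # regroup rows (data sets) into columns (IMFs)
    return [statistics.pstdev(column) / statistics.mean(column) for column in per_imf]


# Placeholder central frequencies in kHz (hypothetical, for demonstration only).
runs_k4 = [[40.0, 80.0, 150.0, 220.0], [41.0, 79.0, 151.0, 221.0]]
print(smallest_gap(runs_k4[0]))
print(relative_spread(runs_k4))
```

Comparing these two numbers for each candidate K gives a simple, repeatable way to back up a choice that is otherwise made by inspecting the tables and spectra.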

Selection of IMF and Calculation of IMFE
Figure 9 indicates that the frequency band related to arc faults ranges from 30 kHz to 100 kHz. Figure 12 illustrates that when K = 4, IMF1 and IMF2 lie in this band. In addition, Figure 15 shows that once the current fluctuates wildly because an arc fault occurs, IMF1 and IMF2 fluctuate synchronously. Therefore, IMF1 and IMF2 should be chosen as the IMFs related to the arc fault.
As proposed in Section 2.3, the IMFEs of these two IMFs were calculated. The length of the sliding window, which is represented by N, was chosen as 20. The length of a factor in the coarsened sequence (the scale factor), which is represented by τ, was less than or equal to 5. The dimension of the vectors converted from the coarsened sequence, which is represented by m, was 3. As for the parameters in the similarity degree function, ρ = 1, β = 2, and r was calculated by Formula (15), in which S is the standard deviation of all the original data. The IMFEs calculated from IMF1 and IMF2 are shown in Figures 16 and 17.
Taken together, Figures 15-17 show that the IMFE met the requirements for an arc diagnosis feature. The value of the IMFE was close to 0 when an arc fault did not happen. Once an arc fault occurred, the value of the IMFE increased noticeably, and the magnitude was proportional to the severity of the arc fault.
Before conducting validation, the choices of τ and N need to be discussed. The IMFE calculation above was conducted with τ no greater than 5. When the length of the sliding window and the scale factor are chosen differently, the result could be different. Figure 18 shows the average IMFE-scale curve when the length of the sliding window is chosen as 100, 75, or 50. The IMFE in Figure 18 is calculated from IMF1. Figure 18 indicates that the larger the chosen N is, the larger the average IMFE is. This means that the fluctuation in the IMFE caused by an arc is stronger when the chosen N is larger. However, a longer sliding window makes the calculation more time-consuming. In Figures 16 and 17, N is set as 20, and the fluctuation is still strong enough for arc detection. Therefore, N should be chosen based on the available computing power and the required speed.
Regardless of N, the average IMFE always decays to a small value when τ is equal to 10. A possible explanation is the following: in the IMF1 from which the IMFE is calculated, the arc portion accounts for the same number of data points regardless of the value of N. When the chosen τ is larger than the number of arc fault data points, the IMFEs that contain arc data points are reduced by the presence of normal data points, which can cause the arc to go undetected based on the IMFE. Therefore, calculating the IMFE at a small τ value ensures that an arc spanning only a few data points can be detected. Conversely, the larger the τ at which fluctuations in the IMFE still exist, the wider the arc is.

Fault Diagnosis and Validation
In order to validate the algorithm, 5100 current data samples were obtained from the experimental platform described in Section 2.1. In this process, 24 experiments, comprising 12 normal experiments and 12 arc fault experiments with different load resistances, were carried out. SVM was used to detect the arc fault based on the IMFEs of IMF1 and IMF2. For this purpose, the data were divided into a training set and a validation set. The training set contains 70% of the data, while the validation set contains the rest.
The SVM program used in this paper was the LibSVM program provided by Chang Chih-Chung and Lin Chih-Jen [41]. The radial basis function was selected as the kernel function of the SVM, and the particle swarm optimization (PSO) algorithm was used to optimize the penalty parameter c and the kernel function parameter g of the SVM.
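The way a particle swarm can search for the penalty parameter c and the kernel parameter g may be sketched as follows. This is not the implementation used in the paper: the swarm settings, the search in log10 space, and the function names are assumptions, and `stand_in_error` is a hypothetical smooth objective that merely stands in for the quantity a real run would minimize (for example, a cross-validated error of the RBF-kernel SVM on the training set).

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100, seed=0,
                 inertia=0.72, c_self=1.49, c_swarm=1.49):
    """Minimal particle swarm optimizer over a box-bounded search space."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]                 # best position seen by each particle
    best_val = [objective(p) for p in pos]
    leader = min(range(n_particles), key=lambda i: best_val[i])
    swarm_pos, swarm_val = best_pos[leader][:], best_val[leader]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (inertia * vel[i][d]
                             + c_self * r1 * (best_pos[i][d] - pos[i][d])
                             + c_swarm * r2 * (swarm_pos[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)  # stay inside the box
            value = objective(pos[i])
            if value < best_val[i]:
                best_val[i], best_pos[i] = value, pos[i][:]
                if value < swarm_val:
                    swarm_val, swarm_pos = value, pos[i][:]
    return swarm_pos, swarm_val


def stand_in_error(log_params):
    """Hypothetical objective with its minimum at log10(c) = 1, log10(g) = -2."""
    log_c, log_g = log_params
    return (log_c - 1.0) ** 2 + (log_g + 2.0) ** 2


best, err = pso_minimize(stand_in_error, [(-3.0, 3.0), (-3.0, 3.0)])
c, g = 10.0 ** best[0], 10.0 ** best[1]
print(c, g, err)
```

Searching in log10 space is a common choice because useful values of c and g typically span several orders of magnitude.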
The diagnosis result type can be divided into TP (true positive, an arc fault sample was diagnosed as an arc fault sample), FP (false positive, a normal sample was diagnosed as an arc fault sample), TN (true negative, a normal sample was diagnosed as a normal sample) and FN (false negative, an arc fault sample was diagnosed as a normal sample). From the definitions of these result types, FP and FN were the wrong diagnosis results and the others were the right diagnosis results.
The 1530 samples in the validation set were diagnosed by SVM, and the results are shown in Table 4. Table 4 illustrates that the proposed arc fault diagnosis algorithm detected arc faults effectively. Of the 1530 samples, 1514 were diagnosed correctly, so the accuracy of this algorithm was 99.0% on this validation data set.
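The reported figures can be reproduced with a few lines of arithmetic. The sketch below recomputes the size of the validation set and the accuracy from the numbers stated in the text, and shows how TP, FP, TN, and FN would be counted with the labelling used here (arc fault = 0, normal = 1). The function names and the short label lists are hypothetical; the per-class counts of Table 4 are not reproduced.

```python
def confusion_counts(true_labels, predicted_labels, arc_label=0):
    """Count TP, FP, TN, FN with the arc fault class treated as positive."""
    tp = fp = tn = fn = 0
    for truth, predicted in zip(true_labels, predicted_labels):
        if truth == arc_label and predicted == arc_label:
            tp += 1   # arc fault diagnosed as arc fault
        elif truth != arc_label and predicted == arc_label:
            fp += 1   # normal sample diagnosed as arc fault
        elif truth != arc_label and predicted != arc_label:
            tn += 1   # normal sample diagnosed as normal
        else:
            fn += 1   # arc fault diagnosed as normal
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


# Figures stated in the text: 5100 samples, 70% for training, 1514 correct diagnoses.
TOTAL_SAMPLES = 5100
n_train = TOTAL_SAMPLES * 70 // 100
n_valid = TOTAL_SAMPLES - n_train
reported_accuracy = 1514 / n_valid
print(n_train, n_valid, round(reported_accuracy * 100, 1))

# Toy labels (hypothetical) to show the counting convention.
toy_truth = [0, 0, 1, 1, 0, 1]
toy_predicted = [0, 1, 1, 1, 0, 0]
print(confusion_counts(toy_truth, toy_predicted))
```

For a protection application, the FN count deserves separate attention from overall accuracy, since a missed arc is the more dangerous error.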

Conclusions
This research proposed a fault diagnosis method based on VMD, IMFE, and SVM for PV arc diagnosis. An experimental platform was established to obtain the training and validation current data. During the arc diagnosis, the current data were first decomposed into several IMFs by using VMD. Then the IMFs related to an arc were picked for IMFE calculation. After IMFEs of these IMFs were obtained, they were used as the input of SVM to detect the arc fault. According to the results of validation, the proposed arc fault diagnosis method could be used to detect an arc fault based on current data and was relatively accurate.
The advantages of this method can be stated as follows. Using VMD ensures that each decomposed signal has a central frequency and a limited frequency bandwidth. By choosing an appropriate K value, mode mixing can be avoided, so the IMFs remain meaningful. Compared to other decomposition methods, the result of VMD contains relatively rich information, which is beneficial to the subsequent analysis. Using IMFE converts the fluctuation in the IMFs into an amplitude change in the IMFEs, which makes using SVM possible.
Compared to existing methods, the advantages of the proposed method can be stated as follows. Compared to using sensors to detect the arc [42], the proposed method only requires users to add sampling devices on site, so devices with a high sampling rate can be chosen for the same investment. Because fewer devices need to be installed, the proposed method is also easy to apply in different situations. Compared to using thermal imaging [43], the proposed method significantly reduces the cost of equipment installation and is more universal. Compared to other frequency-domain methods [44], this method avoids having to choose functions such as a wavelet basis or a window function. By decreasing the length of the sliding window, the calculation speed of the proposed method can be improved, though the significance of the results may be weakened at the same time. Users can balance the two based on the requirements of the actual application. Furthermore, this algorithm has potential applications in bearing fault diagnosis, mechanical fault analysis of circuit breakers, multi-level inverter analysis, and so on.
However, compared to using sensors and thermal imaging, accuracy may be a disadvantage of the proposed method. In addition, the speed of the proposed algorithm was not validated in this research. An integrated online arc fault diagnostic device could possibly be built on the basis of this algorithm.