Series Arc Fault Detection Algorithm Based on Autoregressive Bispectrum Analysis

Arc faults are one of the most critical causes of electrical fires. Due to the diversity, randomness and concealment of arc faults in low-voltage circuits, it is difficult for general methods to protect all loads from series arc faults. An analysis of many series arc faults shows that a large number of high frequency signals are generated in the circuits. These signals are easily affected by Gaussian noise, which is difficult to eliminate because of frequency aliasing. Thus, a novel algorithm is developed in this paper to detect series arc faults accurately. Initially, an autoregressive model of the mixed high frequency signals is built. Then, autoregressive bispectrum analysis is introduced to analyze common series arc fault features. This method preserves the phase information of the arc fault signal and effectively restrains the influence of Gaussian noise. Afterwards, several features, including the characteristic frequency, the fluctuation of phase angles, the diffused distribution and the increased number of bispectrum peaks, are extracted for recognizing arc faults. Finally, a least squares support vector machine is used to identify series arc faults from the load states based on these bispectrum frequency features. The validity of the algorithm is verified experimentally, with an arc fault detection rate above 97%.


Introduction
Statistical data from fire services show that over 90% of electrical fires are caused by arc faults, overcurrents, short circuits and leakages [1][2][3][4]. At present, leakage protectors and overcurrent breakers are used to protect electrical circuits from leakages and overcurrents, respectively, but they cannot protect circuits from arc faults. There are three types of arc faults: ground arc faults, parallel arc faults and series arc faults. Parallel and ground arc faults share similar features with ground faults and overcurrents and are thus easy to detect [5]. However, the fault features of series arc faults are usually submerged in load currents and background noise, which makes accurate recognition difficult.
To describe an arc fault mathematically, several arc fault models have been proposed. Gammon and Matthews developed an instantaneous arc model for a resistive-inductive system [6]. This model could predict arc fault currents because the arc voltage was a current-dependent function. Another similar arc fault model for alternating current (AC) systems was derived by the probability method [7]. Parise provided a simplified model which could be used to analyze and determine basic parameters such as arc resistance and ignition voltage [8]. To fit the arc fault scenarios described in UL1699, a standard for arc fault circuit interrupters (AFCIs), a meta-model based on the Mayr and Ayrton models was built [9]. Arc fault models usually provide the theoretical basis for arc fault detection.
When arc faults occur in circuits, abnormal behaviors are found in arc light, arc sounds, arc radiation signals, arc voltages and arc currents. Arc light and arc sounds usually change noticeably and can be used for arc fault detection [10,11]. The study in [12] estimated the amplitude of the arc voltage by the least error squares method; the voltage was then compared with a given threshold value to identify arc faults. A similar approach for arc fault detection by amplitude comparison of the arc voltage was proposed in [13], and it was successfully simulated using a square wave arc model. Charles obtained the radiated electromagnetic energy from an arc source using a stick and a loop antenna [14] and concluded that the radiation signals could be used for arc fault detection. Although the arc fault behaviors of arc light, arc sounds, arc radiation signals and arc voltages are obvious, they can only be observed at fixed sites. Hence, detection methods based on these behaviors are limited, since the sites, times and intensities of series arc faults are random.
The study in [15] briefly summarized the frequency features of currents and reported that high frequency components (greater than 20 kHz) exist in arc fault currents. In the frequency domain, the spectrum energy variations of line currents can be used to detect series arc faults [16]. In many studies, a large quantity of load currents in arc fault and normal states were first acquired. Arc fault features were then found by advanced signal processing approaches, and arc faults were finally discriminated from the load states based on these features. These approaches include using the absolute sum of the coefficients in multiresolution signal decomposition [17], combining the discrete wavelet transform and wavelet networks [18], using the discrete wavelet transform and an artificial neural network [19,20], using the Gabor transform and an artificial neural network [21], and reconstructing the information entropy of signals in different frequency bands [22].
In this paper, in order to improve the accuracy of arc fault detection, a large number of arc fault high frequency signals are collected by a transducer to find common features among arc faults. High frequency signals are usually mixed with Gaussian noise, such as circuit noise and quantization noise from the acquisition process [23]. In order to improve the signal-to-noise ratio (SNR) and the accuracy of arc fault detection, the higher-order spectrum is introduced. Higher-order spectrum analysis is a useful emerging technique in signal processing; it can describe nonlinear coupling, restrain Gaussian noise and preserve phase information among different frequency components of a random process. The study in [24] introduced the higher-order spectrum in detail. In the third-order case of a random process, the higher-order spectrum reduces to the bispectrum. The two main methods used to estimate the bispectrum are the conventional and the parametric methods [24][25][26][27][28][29]. Raghuveer and Nikias presented two parametric approaches, the third-order recursion method and the constrained third-order mean method, to estimate the bispectrum [30,31]. The bispectrum has been widely used in many fields, such as electrical and mechanical equipment condition monitoring and diagnostics [32][33][34], astronomy [35], communication [36,37] and biomedical engineering [38,39].
The bispectrum is used to extract common features for identifying series arc faults. In order to further improve the identification rate of arc faults, a good classification algorithm is needed. Compared with traditional classification algorithms, such as neural networks, the support vector machine (SVM) can better handle difficult problems such as the curse of dimensionality, overfitting and local minima [40,41]. Its merits include convergence to the global optimum, good generalization ability and suitability for small samples and high-dimensional data [42,43]. In this research, the least squares support vector machine (LSSVM) is introduced to identify series arc faults.
As mentioned above, many previous studies on arc fault detection have discussed mathematical models, abnormal behaviors in the time domain, the frequency spectrum, the power spectrum, etc., but none of these studies apply an autoregressive (AR) model and AR bispectrum analysis to arc fault detection. This research applies the AR bispectrum to the study of arc faults in low-voltage AC circuits for the first time.
The remainder of this paper is organized as follows. Section 2 constructs an arc fault experimental platform and collects a large amount of experimental data. Section 3 analyzes different types of load signals in arc fault and normal states using two main methods: conventional time-frequency analysis and AR bispectrum analysis. The conventional time-frequency analysis includes time domain and power spectrum analysis. In the bispectrum analysis, an AR model is first built, and the AR bispectrum is then introduced. Common series arc fault features are extracted as input vectors to recognize arc faults. Section 4 introduces the LSSVM and applies it to the identification of series arc faults from the load states. Finally, Section 5 summarizes the conclusions of this study.

Experimental Platform
An arc fault experimental platform based on low-voltage electrical standards is constructed as shown in Figure 1. The reference standards used to construct the platform are UL1699, IEC 62606:2013 and GB/T 31143-2014. The typical experimental loads are six 50 W halogen lamps, a 0.75 kW electrical hand drill, two 40 W fluorescent lamps, a 0.35 kW computer, a 1.2 kW electric stove, a 1 kW dimming lamp, a 1.2 kW vacuum cleaner, a 1.1 kW air conditioning unit and a 2.2 kW air compressor. The arc generator used to simulate arc faults in circuits consists of a stationary electrode and a moving electrode. A PXI (PCI eXtensions for Instrumentation) data acquisition system and a high frequency transducer are used to acquire a large number of high frequency signals in the circuits. The bandwidth of the PXI system is 100 MHz. Different types of load signals in arc fault and normal states are acquired through this experimental platform, and they are used to analyze the common features of arc faults in the following section.

Conventional Time-Frequency Analysis
High frequency signals were observed during numerous arc fault experiments. They reflect the dynamic arc discharge process and can be acquired using high frequency transducers [14]. At the initiation of arc discharge, air molecules in the wire gap are ionized, and the motion of the plasma then intensifies further. According to electromagnetic theory, large quantities of high frequency signals are initially released, and the signals then gradually decay during the remainder of the current cycle [14]. As a result, high frequency signals are produced periodically when arc faults exist in circuits. However, these signals are uncertain because of unknown external conditions, including the effects of electrode materials, surface states, oxide layers, adsorbed gases, arc currents, arc gaps of dielectric materials, etc. In order to find the features of arc faults, some typical load signals are selected for analysis. The high frequency signals of different loads in arc fault and normal states are shown in Figure 2. Many high frequency signals and short pulses are generated during the arcing process when the loads work in arc fault states, as shown in Figure 2b,d,f. The high frequency signals with large amplitudes are densely distributed. However, high frequency signals can also be acquired during the normal states, as shown in Figure 2a,c,e.
Because the switching transformers alternate between on and off states, many high frequency signals are generated in each current cycle when the halogen lamps work in the normal state, as shown in Figure 2a. Similarly, when the vacuum cleaner and the electrical drill work in normal states, some random interference pulses appear in the circuits, as shown in Figure 2c,e. Because the vacuum cleaner and the electrical drill contain brushed motors, electrical discharge occurs and produces many high frequency signals when the brushes commutate the current. Electric sparks can be seen through the windows of the electrical drill even when it works in the normal state. These sparks are caused by load arcs, which are not dangerous but are similar to fault arcs. It is usually very difficult to discriminate load arcs directly from the measured high frequency signals.
The analysis above indicates that it is difficult to find a reasonable numerical threshold that separates arc faults from normal states for all loads by simply inspecting the amplitude of the high frequency signals. This is because arc fault signals are usually mixed with considerable interference consisting of circuit noise, quantization noise from the acquisition process, random noise from electrical parts, etc. These interference sources are all close to Gaussian in distribution. Because of them, the SNR decreases considerably and the frequency aliasing is hard to overcome.
In the frequency domain, the power spectra of the vacuum cleaner are shown in Figure 3. The frequency distributions of the arc fault and normal states in the power spectra are similar. The power spectrum is based on second-order statistics: signals are analyzed only by amplitude (e.g., Figures 2c,e and 3a,b), and phase coupling information is lost. The power spectrum may therefore fail to provide a satisfactory description of a non-Gaussian or nonlinear process. In addition, many high frequency signals are generated in each current cycle when the vacuum cleaner works in the normal state. The power spectrum applies the Fourier transform to all signals and is sensitive to Gaussian noise. Hence, it is difficult to distinguish the effective components from the mixed signals of arc faults through the power spectrum; that is, it is difficult to identify series arc faults from the load states. Generally speaking, arc fault high frequency signals are random, nonlinear and non-Gaussian. Since the bispectrum is a powerful technique for processing random signals, it is introduced to analyze arc fault signals in the following section.

Bispectrum Analysis
Higher-order spectra and higher-order cumulants are useful tools for describing dynamic features and processing signals [44]. The bispectrum is the two-dimensional Fourier transform of the third-order cumulants and is the simplest higher-order spectrum. It is widely used for describing nonlinear coupling, restraining Gaussian noise and preserving the phase information of random signals. At present, there are many methods to estimate the bispectrum. A parametric method based on the AR model is introduced to study arc faults in this section.
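The noise-suppression property mentioned above can be illustrated numerically. The following is a minimal sketch, not taken from the paper: it estimates the zero-lag third-order cumulant of a Gaussian sequence and of a skewed (centered exponential) sequence. The function name `third_order_cumulant`, the sample size and the choice of exponential noise are assumptions made only for illustration.

```python
import numpy as np

def third_order_cumulant(x, tau1, tau2):
    """Sample estimate of c3(tau1, tau2) = E[x(t) x(t+tau1) x(t+tau2)]
    for non-negative lags, after removing the sample mean."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x) - max(tau1, tau2)
    return np.mean(x[:n] * x[tau1:tau1 + n] * x[tau2:tau2 + n])

rng = np.random.default_rng(0)
gauss = rng.normal(0.0, 1.0, 200_000)          # symmetric, Gaussian
skewed = rng.exponential(1.0, 200_000) - 1.0   # zero mean, non-Gaussian

c3_gauss = third_order_cumulant(gauss, 0, 0)    # theoretical value: 0
c3_skewed = third_order_cumulant(skewed, 0, 0)  # theoretical value: 2
print(c3_gauss, c3_skewed)
```

Because the third-order cumulants of a Gaussian process are theoretically zero, its contribution to the bispectrum vanishes as the record length grows, while a skewed non-Gaussian component survives.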

AR Model and Third-Order Cumulants
According to the analysis in Section 3.1, an arc fault is an evidently random process, so it is very difficult to build a mathematical function that describes how it varies with time. However, time series analysis can be used to reveal the correlation among the data acquired from arc faults. Here, the AR model is introduced. As a result of environmental impacts, circuit noise, quantization noise and load disturbances, arc fault signals are mixed with considerable interference. The random signals which exist in arc faults are assumed to be disturbed by the non-Gaussian white noise a(t), whose mean value is zero. The zero-mean non-Gaussian colored noise u(t), which contains abundant useful dynamic information about arc faults, is assumed to be the output of an AR process driven by a(t) [30,44]. Therefore, the time series AR model of the arc fault can be described as

$$u(t) + \sum_{k=1}^{q} \varphi_k\, u(t-k) = a(t) \tag{1}$$

where q is the order of the model AR(q), $\varphi_k$ ($k = 1, 2, \ldots, q$) is the autoregressive coefficient and a(t) is the input signal.
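According to the relationship between cumulants and moments in statistical theory, the third-order cumulants of u(t) can be expressed as

$$c_{3u}(\tau_1,\tau_2) = E[u(t)u(t+\tau_1)u(t+\tau_2)] - \mu_u\left\{E[u(t)u(t+\tau_1)] + E[u(t)u(t+\tau_2)] + E[u(t+\tau_1)u(t+\tau_2)]\right\} + 2\mu_u^3 \tag{3}$$

where $\mu_u$ is the mean value of u(t). After the preprocessing, the mean value of u(t) is zero, namely, $\mu_u = 0$. Therefore, Equation (3) can be simplified as

$$c_{3u}(\tau_1,\tau_2) = E[u(t)u(t+\tau_1)u(t+\tau_2)] \tag{4}$$

Here the third-order cumulants of u(t) equal its third-order moments. The input signal a(t) has the following properties:

$$E[a(t)] = 0, \quad E[a(t)a(t+\tau)] = \sigma_a^2\,\delta(\tau), \quad E[a(t)a(t+\tau_1)a(t+\tau_2)] = \gamma_{3a}\,\delta(\tau_1,\tau_2) \tag{5}$$

where $\sigma_a^2$ is the variance of a(t), $\delta(\tau_1,\tau_2)$ is the two-dimensional unit impulse function and the coefficient $\gamma_{3a} \neq 0$. For a stationary physical process, the AR model of Equation (1), $u(t) + \sum_{k=1}^{q}\varphi_k u(t-k) = a(t)$, can be expressed by using the third-order cumulants of u(t) [30,31]:

$$c_{3u}(-m,-n) + \sum_{k=1}^{q} \varphi_k\, c_{3u}(k-m,\,k-n) = \gamma_{3a}\,\delta(m,n), \quad m, n \ge 0 \tag{6}$$

Letting $m = n = 0, 1, \ldots, q$, Equation (6) can be written in matrix form:

$$\mathbf{C}\boldsymbol{\Phi} = \boldsymbol{\Gamma} \tag{7}$$

where

$$\mathbf{C} = \begin{bmatrix} c_{3u}(0,0) & c_{3u}(1,1) & \cdots & c_{3u}(q,q) \\ c_{3u}(-1,-1) & c_{3u}(0,0) & \cdots & c_{3u}(q-1,q-1) \\ \vdots & \vdots & \ddots & \vdots \\ c_{3u}(-q,-q) & c_{3u}(1-q,1-q) & \cdots & c_{3u}(0,0) \end{bmatrix}, \quad \boldsymbol{\Phi} = \begin{bmatrix} 1 \\ \varphi_1 \\ \vdots \\ \varphi_q \end{bmatrix}, \quad \boldsymbol{\Gamma} = \begin{bmatrix} \gamma_{3a} \\ 0 \\ \vdots \\ 0 \end{bmatrix}$$

and $\varphi_k$ ($k = 1, 2, \ldots, q$) is the AR model parameter to be estimated. To solve Equation (7), the order of the AR model must first be determined. This can be done by applying the singular value decomposition technique to the cumulant matrix $\mathbf{C}_e$ [45], where $\mathbf{C}_e$ is the third-order cumulant matrix of u(t). It can be decomposed as

$$\mathbf{C}_e = \mathbf{U}\boldsymbol{\Sigma}\mathbf{V}^{H} \tag{8}$$

where $\mathbf{U}$ and $\mathbf{V}$ are unitary matrices, $\boldsymbol{\Sigma}$ is a matrix whose elements are zero except possibly along its main diagonal and $\mathbf{V}^{H}$ is the complex conjugate transpose of $\mathbf{V}$. The diagonal elements of $\boldsymbol{\Sigma}$ are ordered as

$$\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_n \ge 0 \tag{9}$$

The diagonal element $\lambda_n$ is called a singular value of $\mathbf{C}_e$. The normalized ratio of the Frobenius norm can be expressed as

$$R(q) = \sqrt{\frac{\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_q^2}{\lambda_1^2 + \lambda_2^2 + \cdots + \lambda_n^2}} \tag{10}$$

The number q can be regarded as the order of the AR model when R(q) ≥ 0.995. Finally, Equation (7) can be solved by the least squares method to obtain the AR model parameters $\varphi_k$ ($k = 1, 2, \ldots, q$). The AR model of the mixed high frequency signals can then be used to quantitatively describe the nonlinear coupling relationships of arc fault signals. The next task is to extract series arc fault features through the bispectrum based on the AR model.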
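The estimation procedure of this subsection can be sketched in a few lines. The code below is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes the sign convention u(t) + Σ φ_k u(t−k) = a(t), uses only the diagonal cumulant slice c3u(τ, τ), takes the smallest q with R(q) ≥ 0.995, and builds the matrix used for order selection from the same diagonal cumulants (the paper does not spell out the layout of C_e). The simulated AR(2) series, the exponential driving noise and all function names are hypothetical.

```python
import numpy as np
from scipy.signal import lfilter

def diag_cumulant(u, tau):
    """Sample estimate of c3u(tau, tau) = E[u(t) u(t+tau)^2], zero-mean u."""
    n = len(u) - abs(tau)
    if tau >= 0:
        return np.mean(u[:n] * u[tau:tau + n] ** 2)
    return np.mean(u[-tau:-tau + n] * u[:n] ** 2)

def cumulant_matrix(u, order):
    """Matrix with entries c3u(k - m, k - m) for m, k = 0..order."""
    c = {tau: diag_cumulant(u, tau) for tau in range(-order, order + 1)}
    return np.array([[c[k - m] for k in range(order + 1)]
                     for m in range(order + 1)])

def select_order(singular_values, threshold=0.995):
    """Smallest q whose normalized Frobenius-norm ratio R(q) >= threshold."""
    lam2 = np.asarray(singular_values, dtype=float) ** 2
    ratio = np.sqrt(np.cumsum(lam2) / lam2.sum())
    return int(np.argmax(ratio >= threshold)) + 1

def fit_ar(u, q):
    """Least squares solution of the third-order recursion for phi_1..phi_q."""
    C = cumulant_matrix(u, q)
    # Rows m = 1..q: sum_k phi_k c(k-m, k-m) = -c(-m, -m)
    phi, *_ = np.linalg.lstsq(C[1:, 1:], -C[1:, 0], rcond=None)
    gamma = C[0, 0] + C[0, 1:] @ phi   # row m = 0 gives gamma_3a
    return phi, gamma

# Simulated AR(2) series driven by skewed white noise (illustrative only).
rng = np.random.default_rng(1)
a = rng.exponential(1.0, 1_000_000) - 1.0
u = lfilter([1.0], [1.0, -0.5, 0.2], a)   # u(t) - 0.5u(t-1) + 0.2u(t-2) = a(t)
u = u - u.mean()

phi_hat, gamma_hat = fit_ar(u, 2)
sv = np.linalg.svd(cumulant_matrix(u, 6), compute_uv=False)
q_hat = select_order(sv)
print(phi_hat, gamma_hat, q_hat)
```

With a long record, `phi_hat` should approach the simulated coefficients (−0.5, 0.2). The order returned by the singular-value rule depends on how noisy the cumulant estimates are, so it is best read as a guide rather than an exact rank.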

AR Bispectrum
The AR bispectrum is a bispectrum estimated through an AR model parametric method. It is briefly introduced in this section.
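In Equation (1), if the system impulse response is h(t), the output u(t) can be described as

$$u(t) = \sum_{i=0}^{\infty} h(i)\, a(t-i) \tag{11}$$

Substituting Equation (11) into Equation (4), we have

$$c_{3u}(\tau_1,\tau_2) = \sum_{i=0}^{\infty}\sum_{j=0}^{\infty}\sum_{k=0}^{\infty} h(i)h(j)h(k)\, E[a(t-i)\,a(t+\tau_1-j)\,a(t+\tau_2-k)] \tag{12}$$

As a result of Equation (5), Equation (12) can be simplified as

$$c_{3u}(\tau_1,\tau_2) = \gamma_{3a}\sum_{i=0}^{\infty} h(i)\,h(i+\tau_1)\,h(i+\tau_2) \tag{13}$$

where $\gamma_{3a} = E[a^3(t)]$ is the third-order cumulant of a(t) at zero lag. According to the definition of the higher-order spectrum, the third-order cumulant spectrum (usually called the bispectrum) is defined as the two-dimensional Fourier transform of the third-order cumulants:

$$B_u(\omega_1,\omega_2) = \sum_{\tau_1=-\infty}^{\infty}\sum_{\tau_2=-\infty}^{\infty} c_{3u}(\tau_1,\tau_2)\, e^{-j(\omega_1\tau_1+\omega_2\tau_2)} \tag{14}$$

Here the imaginary unit is $j = \sqrt{-1}$. Substituting Equation (13) into Equation (14), we have

$$B_u(\omega_1,\omega_2) = \gamma_{3a}\, H(\omega_1)\, H(\omega_2)\, H^{*}(\omega_1+\omega_2) \tag{15}$$

where $H(\omega)$ is the frequency transfer function and $H^{*}(\omega_1+\omega_2)$ is the complex conjugate of $H(\omega_1+\omega_2)$ [31,46]. The frequency transfer function of the q-th order AR process in Equation (1) can be further expressed as

$$H(\omega) = \frac{1}{1 + \sum_{k=1}^{q}\varphi_k\, e^{-j\omega k}} \tag{16}$$

To reduce the computational complexity of the bispectrum, the bispectrum slice is introduced. The one-dimensional slices of the bispectrum are the reconstructed spectra. For $\omega = \omega_1 = \omega_2$ and according to Equation (15), the 1½-dimensional slice of the bispectrum is given by

$$B_u(\omega) = B_u(\omega,\omega) = \gamma_{3a}\, H^{2}(\omega)\, H^{*}(2\omega) \tag{17}$$

The 1½-dimensional slices usually achieve high resolution and are quick to calculate [47]. They also reflect the nonlinearity and the quadratic phase coupling of a random process while restraining Gaussian noise, so they can be used to analyze arc fault features quickly.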
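Once the AR coefficients are known, the parametric bispectrum and its diagonal slice follow directly from the transfer function. The sketch below assumes the forms H(ω) = 1 / (1 + Σ φ_k e^{−jωk}), B(ω1, ω2) = γ H(ω1) H(ω2) H*(ω1 + ω2) and B(ω, ω) = γ H²(ω) H*(2ω); the coefficient values, the frequency grid and the function names are illustrative assumptions, not values from the paper.

```python
import numpy as np

def ar_transfer(phi, omega):
    """H(w) = 1 / (1 + sum_k phi_k exp(-j w k)) of an AR(q) model."""
    phi = np.asarray(phi, dtype=float)
    k = np.arange(1, len(phi) + 1)
    omega = np.asarray(omega, dtype=float)
    return 1.0 / (1.0 + np.exp(-1j * omega[..., None] * k) @ phi)

def ar_bispectrum(phi, gamma, omega1, omega2):
    """B(w1, w2) = gamma * H(w1) * H(w2) * conj(H(w1 + w2)) on a grid."""
    w1, w2 = np.meshgrid(omega1, omega2, indexing="ij")
    return (gamma * ar_transfer(phi, w1) * ar_transfer(phi, w2)
            * np.conj(ar_transfer(phi, w1 + w2)))

def ar_bispectrum_slice(phi, gamma, omega):
    """Diagonal slice B(w, w) = gamma * H(w)^2 * conj(H(2w))."""
    omega = np.asarray(omega, dtype=float)
    return (gamma * ar_transfer(phi, omega) ** 2
            * np.conj(ar_transfer(phi, 2.0 * omega)))

phi = [-0.5, 0.2]          # hypothetical AR(2) coefficients
gamma = 2.0                # hypothetical skewness of the driving noise
omega = np.linspace(0.0, np.pi / 2.0, 65)

B = ar_bispectrum(phi, gamma, omega, omega)
B_slice = ar_bispectrum_slice(phi, gamma, omega)
print(np.abs(B).max(), np.abs(B_slice).max())
```

Evaluating the slice costs one pass over the frequency grid, whereas the full bispectrum needs the whole two-dimensional grid, which is the computational saving the text refers to.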

Bispectrum Features of Arc Faults
The contours, amplitudes, phase angles and 1½-dimensional slices of the bispectra of different loads are plotted in Figures 4-6. The differences between the normal state and arc fault bispectra demonstrate that the bispectra give more information about the nonlinear dynamic features of the load signals. When series arc faults occur in circuits, the energy distribution, which can be seen from the contours, becomes diffused.
Since the power spectrum is phase blind, it is unable to detect the presence of phase coupling. However, the bispectrum gives an indication of the correlation between the phases of different frequency components $\omega_1$ and $\omega_2$. In addition, the power spectrum applies the Fourier transform to all signals and is sensitive to Gaussian noise. In contrast, the bispectrum can eliminate additive Gaussian noise and provide phase coupling information. The phases of load signals in arc faults differ from those in normal states: the former increase suddenly and randomly in some frequency ranges, while the latter change smoothly over most of the frequency range. Since the capacitive and inductive reactances of the fault arc are unstable, the phases may change randomly.
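The phase blindness of the power spectrum can be demonstrated with a small synthetic experiment. The following sketch is not from the paper: it uses a direct (FFT-based, segment-averaged) bispectrum estimate at a single frequency pair rather than the AR method, and compares three cosines whose third phase is either locked to the sum of the other two or drawn independently. The bin numbers, segment count and function names are assumptions chosen for illustration.

```python
import numpy as np

def segment_spectra(n_seg, n, bins, coupled, rng):
    """FFT of n_seg noise-free segments holding three cosines at `bins`."""
    t = np.arange(n)
    spectra = np.empty((n_seg, n), dtype=complex)
    for s in range(n_seg):
        p1, p2, p3 = rng.uniform(0.0, 2.0 * np.pi, 3)
        if coupled:
            p3 = p1 + p2   # quadratic phase coupling: phase locked to the sum
        x = sum(np.cos(2.0 * np.pi * b * t / n + p)
                for b, p in zip(bins, (p1, p2, p3)))
        spectra[s] = np.fft.fft(x)
    return spectra

def power_and_bispectrum(spectra, k1, k2):
    """Averaged power spectrum and |B(k1, k2)| over all segments."""
    power = np.mean(np.abs(spectra) ** 2, axis=0)
    bis = np.mean(spectra[:, k1] * spectra[:, k2]
                  * np.conj(spectra[:, k1 + k2]))
    return power, abs(bis)

rng = np.random.default_rng(0)
bins = (10, 16, 26)        # third bin is the sum of the first two
pw_c, bis_c = power_and_bispectrum(
    segment_spectra(200, 128, bins, True, rng), 10, 16)
pw_u, bis_u = power_and_bispectrum(
    segment_spectra(200, 128, bins, False, rng), 10, 16)
print(bis_c, bis_u)
```

Both signals have the same averaged power spectrum, yet the bispectrum magnitude at the pair (10, 16) stays large only when the phases are coupled; for independent phases the average shrinks roughly with the square root of the number of segments.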
As shown by the analysis of the arc fault contours (Figures 4-6) and the symmetry properties of the bispectrum, all the valuable information in the arc fault signal is contained in the bispectrum within the inner triangular domain $\omega_2 \ge 0$, $\omega_1 \ge \omega_2$ and $\omega_1 + \omega_2 \le \pi$. Therefore, the bispectrum in this domain is sufficient to describe all bispectrum features, and the symmetry properties are given by:

$$\begin{aligned} B(\omega_1,\omega_2) &= B(\omega_2,\omega_1) = B^{*}(-\omega_2,-\omega_1) = B^{*}(-\omega_1,-\omega_2) \\ &= B(-\omega_1-\omega_2,\,\omega_2) = B(\omega_1,\,-\omega_1-\omega_2) \\ &= B(-\omega_1-\omega_2,\,\omega_1) = B(\omega_2,\,-\omega_1-\omega_2) \end{aligned} \tag{18}$$

In the inner triangular domain, the highest peaks can be found in each plot. The peaks of the bispectra indicate that the process is non-Gaussian and that nonlinear coupling exists in the arc fault signals. One possible reason is the randomness of the fault arc resistance. The peak information is similar across the plots, which shows that the peaks effectively reflect the dominant frequency components of the non-Gaussian nonlinear arc fault signals.
When the loads work in normal states, the background noise and working processes are treated as stationary random processes, and there are few peaks in the bispectrum graphs. When series arc faults occur in circuits, the coupling among the different high frequency components generated by arc faults causes the bispectrum peaks to change gradually from a concentrated to a dispersed distribution and to become mixed and disordered. The number of bispectrum peaks increases markedly, and more high peaks appear in the arc fault bispectrum as a result of strong correlation at some frequencies. As the arc fault becomes more serious, the nonlinearity becomes stronger.
The above dynamic features of the load states also appear in the slices of the bispectra. Furthermore, the 1½-dimensional slices are easier to calculate than the full bispectra, and the phase-coupled frequency components can still be extracted from a Gaussian noise environment. The number of bispectrum peaks increases markedly in arc faults, and the positions of the highest bispectrum peaks move towards $\omega_1 = 0.5\pi$. Therefore, the phase coupling components and the components they generate in the 1½-dimensional slices can be used for arc fault detection. However, it is hard to find constant thresholds that separate normal states from arc faults for all loads, so a good classification algorithm is needed. As noted in the Introduction, the SVM handles the curse of dimensionality, overfitting and local minima better than traditional classifiers such as neural networks [40,41], and it offers convergence to the global optimum, good generalization ability and suitability for small samples and high-dimensional data [42,43]. To better discriminate arc fault states from load states, the frequencies $\omega_1$ and $\omega_2$ are selected as the input vector. Based on features including the distribution regularities of the bispectrum peaks, the SVM is introduced in the following section to classify series arc faults and normal states.
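Two of the slice features described above, the number of peaks and the position of the highest peak, can be read off a slice with a standard peak finder. This is a hedged sketch only: the relative height threshold `rel_height`, the function name `slice_peak_features` and the synthetic two-peak slice are assumptions for illustration, and the paper does not state how its peaks were counted.

```python
import numpy as np
from scipy.signal import find_peaks

def slice_peak_features(omega, slice_values, rel_height=0.1):
    """Return (number of peaks, frequency of the highest peak) of |slice|.

    Peaks lower than rel_height times the maximum magnitude are ignored;
    this threshold is an illustrative assumption.
    """
    mag = np.abs(np.asarray(slice_values))
    peaks, _ = find_peaks(mag, height=rel_height * mag.max())
    if len(peaks) == 0:
        return 0, float("nan")
    top = peaks[np.argmax(mag[peaks])]
    return len(peaks), float(omega[top])

# Synthetic slice magnitude with two peaks (not measured data).
omega = np.linspace(0.0, np.pi / 2.0, 501)
demo_slice = (np.exp(-((omega - 0.4) / 0.02) ** 2)
              + 0.5 * np.exp(-((omega - 1.0) / 0.02) ** 2))

n_peaks, top_freq = slice_peak_features(omega, demo_slice)
print(n_peaks, top_freq)
```

Features of this kind could then be assembled into the input vector for the classifier; how the two frequencies of the paper's feature vector are chosen from the peaks is not specified here.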

Arc Fault Identification
The SVM algorithm, which is based on statistical learning theory, has been successfully used to solve linear regression, pattern recognition and classification problems [40]. The LSSVM algorithm developed by Suykens and Vandewalle [42] is a variant of the traditional SVM. It reduces computational complexity while maintaining classification accuracy. The frequencies $\omega_1$ and $\omega_2$ are selected as the input vector x.
The feature vector x is mapped from the original space to a high dimensional feature space by the nonlinear transformation f(x). Then, the optimal classification plane is found in the high dimensional feature space. According to the structural risk minimization principle, the constrained optimization problem corresponding to the original classification problem can be expressed as follows:

$$\min_{V,\,b,\,\xi}\; J(V,\xi) = \frac{1}{2}V^{T}V + \frac{C}{2}\sum_{i=1}^{m}\xi_i^{2}, \qquad \text{s.t. } y_i\left[V^{T}f(x_i) + b\right] = 1 - \xi_i,\quad i = 1, 2, \ldots, m \tag{19}$$

where V is a weight vector, C is the penalty parameter, m is the number of samples, $\xi_i$ is the relaxation coefficient and b is a bias term. The output result of classification is $y_i \in \{-1, 1\}$ [40,42,48].
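In order to solve the constrained optimization problem, the Lagrange multipliers $\alpha_i$ are introduced. Equation (19) is transformed into the unconstrained objective function

$$L(V,b,\xi,\alpha) = \frac{1}{2}V^{T}V + \frac{C}{2}\sum_{i=1}^{m}\xi_i^{2} - \sum_{i=1}^{m}\alpha_i\left\{y_i\left[V^{T}f(x_i) + b\right] - 1 + \xi_i\right\} \tag{20}$$

According to the Karush-Kuhn-Tucker conditions, setting the derivatives of Equation (20) with respect to V, b, $\alpha_i$ and $\xi_i$ equal to 0 gives

$$\begin{cases} \dfrac{\partial L}{\partial V} = 0 \;\Rightarrow\; V = \sum_{i=1}^{m}\alpha_i y_i f(x_i) \\[2mm] \dfrac{\partial L}{\partial b} = 0 \;\Rightarrow\; \sum_{i=1}^{m}\alpha_i y_i = 0 \\[2mm] \dfrac{\partial L}{\partial \xi_i} = 0 \;\Rightarrow\; \alpha_i = C\xi_i \\[2mm] \dfrac{\partial L}{\partial \alpha_i} = 0 \;\Rightarrow\; y_i\left[V^{T}f(x_i) + b\right] - 1 + \xi_i = 0 \end{cases} \tag{21}$$

Furthermore, after eliminating V and $\xi_i$, Equation (21) can be organized into the matrix form

$$\begin{bmatrix} 0 & Y^{T} \\ Y & \boldsymbol{\Omega} \end{bmatrix}\begin{bmatrix} b \\ \boldsymbol{\alpha} \end{bmatrix} = \begin{bmatrix} 0 \\ \mathbf{1} \end{bmatrix} \tag{22}$$

where $Y = [y_1, \ldots, y_m]^{T}$, $\boldsymbol{\alpha} = [\alpha_1, \ldots, \alpha_m]^{T}$, $\mathbf{1} = [1, \ldots, 1]^{T}$, and the positive definite matrix $\boldsymbol{\Omega}$ has the elements

$$\Omega_{ij} = y_i y_j f(x_i)^{T} f(x_j) + \frac{\delta(i,j)}{C}, \qquad \delta(i,j) = \begin{cases} 1, & i = j \\ 0, & \text{others} \end{cases}$$

On the basis of the Mercer condition, $K(x_i, x_j) = f(x_i)^{T} f(x_j)$ is defined as a kernel function. According to the features of arc fault signals, the radial basis function (RBF), which is widely used in classification, is selected as the kernel function.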
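The matrix form of the LSSVM reduces training to a single linear solve. The sketch below is illustrative rather than the authors' code: it assumes the system [[0, Yᵀ], [Y, Ω]] [b; α] = [0; 1] with Ω_ij = y_i y_j K(x_i, x_j) + δ(i, j)/C, a Gaussian RBF kernel of the form exp(−‖x − x_i‖² / (2σ²)) (the exact scaling of σ² is an assumption), and the decision rule sgn(Σ α_i y_i K(x, x_i) + b). The two synthetic clusters stand in for real feature vectors, and C = 7.1 and σ² = 0.54 are the values reported later in the paper.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    """Gaussian RBF kernel matrix between the rows of A and the rows of B."""
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma2))

def lssvm_train(X, y, C, sigma2):
    """Solve the LSSVM linear system and return (alpha, b)."""
    m = len(y)
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * rbf_kernel(X, X, sigma2) + np.eye(m) / C
    rhs = np.concatenate(([0.0], np.ones(m)))
    sol = np.linalg.solve(A, rhs)
    return sol[1:], sol[0]

def lssvm_predict(X_train, y_train, alpha, b, sigma2, X_new):
    """Class labels sgn(sum_i alpha_i y_i K(x, x_i) + b) for each new row."""
    k = rbf_kernel(X_new, X_train, sigma2)
    return np.sign(k @ (alpha * y_train) + b)

# Synthetic two-class data: -1 = normal state, +1 = arc fault (illustrative).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.3, 0.1, (20, 2)), rng.normal(1.2, 0.1, (20, 2))])
y = np.concatenate([-np.ones(20), np.ones(20)])

alpha, b = lssvm_train(X, y, C=7.1, sigma2=0.54)
pred = lssvm_predict(X, y, alpha, b, 0.54, X)
print(float(np.mean(pred == y)))
```

Unlike the standard SVM, every training sample receives a nonzero multiplier here, which trades sparsity for the simplicity of solving one (m + 1) × (m + 1) system.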
Hence, the optimization problem of Equation (19) can be solved by using Equation (22), where $\boldsymbol{\alpha}$ and b are obtained by the least squares method [42]. The identification result of the load states by the LSSVM can be described as

$$y^{*}(x) = \operatorname{sgn}\left[\sum_{i=1}^{m}\alpha_i y_i K(x, x_i) + b\right] \tag{23}$$

The main procedures for arc fault identification are as follows:
(1) Selection of the sample set for the LSSVM. The input was the feature vector x, comprising the frequencies $x_1 = \omega_1$ and $x_2 = \omega_2$. The output was the classification result y, which took the values −1 and 1: "−1" represented the normal state and "1" represented the arc fault state.
(2) Construction of the training set (x, y). From the experimental data of nine types of typical loads in different working states, 360 samples were chosen for further processing. Of these, 280 were used as training samples and the remainder as testing samples. The training set of the recognizer is listed in Table 1.
(3) Training of the arc fault recognizer. The RBF kernel function was

$$K(x, x_i) = \exp\left(-\frac{\|x - x_i\|^{2}}{2\sigma^{2}}\right) \tag{24}$$

where $\sigma^2$ is the kernel parameter. At first, the initial kernel parameter $\sigma^2$ and the penalty parameter C were selected. Then, the ten-fold cross-validation method was applied to optimize the parameters on the training set. The training set was divided equally into ten mutually disjoint subsets, and the procedure was repeated ten times. Each time, one of the ten subsets was selected as the test subset and the other nine subsets were combined to form the training subset. After the ten repetitions, the average error of the ten trials was computed. Finally, the parameters giving the minimum error were taken as the optimal parameters [49].
(4) Arc fault identification. The testing samples were input into the arc fault recognizer, and the identification results were compared with the real results. Finally, the generalization ability of the recognizer was evaluated based on the error rate, which could be calculated as

$$e = \frac{1}{2p}\sum_{i=1}^{p}\left|y_i - y_i^{*}\right| \tag{25}$$

where p was the number of testing samples, $y_i$ was the real result and $y_i^{*}$ was the classification result.
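Steps (3) and (4) above can be sketched as a small grid search with ten-fold cross-validation. This is a minimal sketch under assumptions, not the procedure actually run by the authors: the candidate grids for C and σ², the synthetic two-cluster data, the Gaussian kernel scaling exp(−‖·‖² / (2σ²)) and all function names are hypothetical, and the error rate is computed as the fraction of misclassified ±1 labels.

```python
import numpy as np

def rbf_kernel(A, B, sigma2):
    d2 = np.sum((A[:, None, :] - B[None, :, :]) ** 2, axis=-1)
    return np.exp(-d2 / (2.0 * sigma2))

def lssvm_train(X, y, C, sigma2):
    """Solve the LSSVM linear system and return (alpha, b)."""
    m = len(y)
    A = np.zeros((m + 1, m + 1))
    A[0, 1:] = y
    A[1:, 0] = y
    A[1:, 1:] = np.outer(y, y) * rbf_kernel(X, X, sigma2) + np.eye(m) / C
    sol = np.linalg.solve(A, np.concatenate(([0.0], np.ones(m))))
    return sol[1:], sol[0]

def lssvm_predict(X_train, y_train, alpha, b, sigma2, X_new):
    return np.sign(rbf_kernel(X_new, X_train, sigma2) @ (alpha * y_train) + b)

def error_rate(y_true, y_pred):
    """Fraction of misclassified samples for labels in {-1, +1}."""
    return float(np.mean(np.abs(y_true - y_pred)) / 2.0)

def cross_val_error(X, y, C, sigma2, n_folds=10, seed=0):
    """Average error over n_folds disjoint, nearly equal-sized subsets."""
    idx = np.random.default_rng(seed).permutation(len(y))
    folds = np.array_split(idx, n_folds)
    errors = []
    for i in range(n_folds):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(n_folds) if j != i])
        alpha, b = lssvm_train(X[train], y[train], C, sigma2)
        pred = lssvm_predict(X[train], y[train], alpha, b, sigma2, X[test])
        errors.append(error_rate(y[test], pred))
    return float(np.mean(errors))

# Synthetic stand-in for the training set (-1 = normal, +1 = arc fault).
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.3, 0.15, (50, 2)), rng.normal(1.2, 0.15, (50, 2))])
y = np.concatenate([-np.ones(50), np.ones(50)])

grid = [(C, s2) for C in (1.0, 7.1, 50.0) for s2 in (0.1, 0.54, 2.0)]
best_err, best_C, best_sigma2 = min(
    (cross_val_error(X, y, C, s2), C, s2) for C, s2 in grid)
print(best_err, best_C, best_sigma2)
```

In practice the grid would be refined around the best pair, and the final error rate would be reported on the held-out testing samples rather than on the cross-validation folds.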
From the above procedures, the arc fault recognizer was obtained. The kernel parameter $\sigma^2 = 0.54$ and the penalty parameter C = 7.1 were obtained by the ten-fold cross-validation method. The identification results are shown in Figure 7, where "*" represents the normal states and "□" represents the arc faults of the loads. The testing samples can be identified in the feature vector space: if a testing sample falls in the arc fault region of the feature space in Figure 7, an arc fault has occurred in the circuit; otherwise, there is no arc fault. The identification results are the globally optimal solution for these RBF kernel function parameters. The optimal classification plane between arc faults and normal states can be found in the high dimensional feature space. After completing the sample experiments, the actual results and the classified results are listed in Table 2. The results show that the average error rate of the LSSVM is 2.6%; hence, the arc fault identification rate is 97.4%. The misclassifications arise because some weak arc fault signals were included in the testing samples. Their frequency features are very close to those of the normal state, so the LSSVM recognizer fails to separate these arc faults from the normal states. This limitation can be addressed in future work.

Conclusions
AR bispectrum analysis has been applied in many signal processing cases, but not previously in the analysis of arc faults in low-voltage AC circuits; this research represents its first application to that problem. Arc fault features are extracted by the AR bispectrum to detect series arc faults. The main conclusions are as follows:
(1) High frequency signals in circuits increase when series arc faults occur, but they are usually mixed with considerable interference.
(2) An AR model of the arc fault is established to describe the coupling relationship of the mixed high frequency signals and reflect the dynamic characteristics of arc faults.
(3) According to the AR bispectrum analysis of nine types of typical experimental loads mentioned in the electrical standards, the phase information of the arc fault signal is preserved and the influence of noise such as Gaussian noise is effectively restrained. AR bispectrum analysis is more effective than power spectrum and time-domain analysis. When series arc faults occur, the number of bispectrum peaks increases markedly, the distribution of the peaks tends to diffuse and the bispectrum slices are also dispersed. To better describe series arc faults, bispectrum frequency features, including the distribution regularities of the bispectrum peaks, are extracted as input vectors for the classifier.
(4) Based on the above bispectrum features, the LSSVM is successfully used to discriminate arc faults from working states in different loads. The whole algorithm has been run on a computer and verified on the arc fault experimental platform. The arc fault detection rate is over 97%. The result shows that the developed algorithm has good generalization ability in arc fault detection across different loads.
For future research, the arc fault detection rate will be further improved through refinement of the algorithm. Furthermore, this algorithm may be applied to direct current (DC) arc fault detection, such as arc faults in photovoltaic systems and automotive power supply systems.

Figure 2. Detected high frequency signals of different load states. (a) Halogen lamp normal; (b) Halogen lamp arc fault; (c) Vacuum cleaner normal; (d) Vacuum cleaner arc fault; (e) Electrical drill normal; (f) Electrical drill arc fault.

Table 1. The training set of recognizer.