Comparison of Artificial Intelligence Methods for Fault Classification of the 115-kV Hybrid Transmission System

This research presents a comparative study of different artificial intelligence (AI) methods for classifying faults in hybrid transmission line systems. The 115-kV hybrid transmission line in the Provincial Electricity Authority (PEA-Thailand) system, which is a single circuit single conductor transmission line, is studied. Fault signals in the transmission line were generated by the EMTP/ATPDraw software. Various factors such as fault location, type, and angle were considered. The fault signals were then analyzed using the first-scale detail coefficients of the discrete wavelet transform. A Daubechies mother wavelet in MATLAB software was used to decompose the fault signals. The coefficient values varied with the fault position, fault inception angle, and fault type. AI methods including probabilistic neural networks (PNNs), back-propagation neural networks (BPNNs), and support vector machine (SVM) were used to identify faults. The inputs to the AI methods were the maximum first peak coefficients of phases A, B, and C and of the zero sequence. The results were satisfactory, with all AI methodologies having an average accuracy of more than 98% in the case study. However, the SVM technique provided more accurate results than the PNN and BPNN techniques with less computation burden. Thus, it is suitable for application to actual protection systems.


Introduction
Many large cities have undergone economic and infrastructure growth, with population density rising due to migration from rural to urban areas. This trend has increased the demand for electricity, which requires more transmission lines to connect substations to end users. To improve the landscape in large cities and reduce the clutter of overhead lines, underground cables have been employed to distribute power in high-density or tourist areas. Thus, the topology of the distribution network is shifting from mainly overhead lines to a combination of overhead lines and underground cables. Thailand is also currently installing hybrid transmission lines (a combination of overhead lines and underground cables) in various cities throughout the country. The parameters of overhead lines and underground cables differ significantly, which can cause errors in conventional protection systems and put the reliability of the electrical network at risk. Thus, an algorithm that can detect and classify fault types in hybrid transmission lines needs to be developed.
A significant proportion of faults in electrical power systems occur in transmission lines. Fault signal analysis in transmission systems is an important field because it gives the operator a method to deal with disturbances correctly and to limit damage to the overall system. Many previous studies have performed this analysis using various methodologies [1,2]. The wavelet transform is a signal processing method that has gained attention from researchers and has been applied to fault analysis in electrical systems because its properties make it suitable for analyzing the transient state of a signal.
A wavelet transform was applied to detect and classify faults in transmission lines [3][4][5][6][7][8][9]. In [3], the maximum wavelet singular value (MWSV) was applied for the detection and classification of faults, with current signals as the input to the wavelet transform. This type of fault classification depends on the Euclidean norm of the MWSV, and faults are identified by comparing coefficients with preset values [3]. In [4], detection and classification of faults in transmission lines were performed by introducing a novel method based on power spectral density (PSD) in time and frequency. The PSD index (in time) was applied for fault detection, and the PSD index (in frequency) was used for classification [4]. Thus, faults could be detected in a short time, and classification was completed using the Hellinger distance [4]. The choice of mother wavelet is important because effective results can be obtained only if an appropriate mother wavelet is chosen. Previous studies have highlighted the importance of the mother wavelet [8,16]. In [8], the choice of mother wavelet affected the classification of faults in hybrid transmission lines, and the Daubechies (db) mother wavelet exhibited better accuracy than the other mother wavelets tested. Likewise, in [16], the Daubechies (db) mother wavelet was found to be fully satisfactory for fault classification.
By combining the wavelet transform method with artificial intelligence (AI) methods, the accuracy of fault classification in electrical systems has been improved [10][11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27][28]. The discrete wavelet transform (DWT) has been applied previously to decompose fault signals [11,13,14,17]. The DWT results were input into artificial neural networks (ANNs) to detect and classify faults in transmission lines. A technique for classifying internal and external faults of protection zones in transmission lines has also been proposed, in which fault signals were decomposed by the wavelet transform, and the high frequency components and spectral energy were input into a support vector machine (SVM) to detect and classify faults, respectively. AI methods are used in a multitude of applications, and the most suitable method needs to be identified to obtain the best results. AI methods such as ANN, SVM, k-nearest neighbors (k-NN), extreme learning machine (ELM), and support vector regression (SVR) were compared in some previous studies [25][26][27][28][29]. The different methodologies that have been used and tested in previous research are summarized in Table 1. The literature review showed that most studies used the wavelet transform to analyze signals in electrical systems because of its suitability for analyzing the transient state of the power system. However, past research has focused mainly on fault analysis in either overhead transmission lines or underground cables, and fault analysis has rarely been performed in hybrid transmission lines. Thus, this paper aims to present the performance of different AI methodologies in classifying faults in hybrid transmission lines. In previous studies [7][8][9], the behavior of wavelet coefficients in hybrid transmission lines was studied [7], and an algorithm for classifying faults using only the DWT was designed [8].
In addition, different mother wavelets for classification were compared to identify the optimal mother wavelet using an algorithm [9]. In this paper, the algorithms developed in previous studies [7][8][9] are combined with AI methodologies to improve their accuracy. Three types of AI methods (probabilistic neural networks (PNN), back-propagation neural networks (BPNN), and SVM) were used and compared in terms of the accuracy of their classification results in order to select a suitable AI methodology for implementation in protection systems. The overall process of the study is shown in Figure 1. First, the fault signals were simulated by ATPDraw. Next, the fault signals were analyzed using the wavelet transform (WT). The maximum coefficient values were input into the AI methods. Finally, the AI methods were compared in terms of accuracy. The detailed process is explained in the next section.

Fault Signal and Wavelet Transform
The 115-kV transmission line of the Provincial Electricity Authority (PEA) system in Thailand was used for simulation. PEA is responsible for providing electricity to 74 provinces of Thailand. The single circuit single conductor model was used to model both the overhead lines and the underground cables of the transmission line. The layout of the structures is shown in Figure 2.

Fault Signal
The performance of the proposed classification algorithm was evaluated using simulation software, with parameters varied to ensure the accuracy of the algorithm. In the case studies, the fault in each phase of the transmission line was located at 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, and 90% of the length of the transmission line measured from the sending end to the receiving end.
The model for simulating and analyzing faults in the transmission line was divided into two parts. First, EMTP/ATPDraw software generated fault signals. The parameters in Figure 2 were input into the software. Then, fault signals were decomposed using WT.
The signals from the simulation with EMTP/ATPDraw software were obtained with a sampling rate of 100 kHz, and switches in the model determined the type of fault and the time at which it occurred. Figure 3 shows the current signals at both ends when a fault occurs in the transmission line at 0.04 s. Before the fault occurred, the current signals were sinusoidal with a magnitude equal to the load current. After the fault occurred, the current in the faulted phase increased significantly at both the sending and load sides. Then, the current signals from EMTP/ATPDraw were analyzed using the discrete wavelet transform (DWT) in MATLAB.

Wavelet Transform
Fault signals from EMTP/ATPDraw were decomposed by the WT in MATLAB. Abnormalities in the electrical system produce high frequency components, which the WT extracts for analysis. The three-phase current signals and the zero sequence were decomposed by the DWT with the Daubechies (db2) mother wavelet to obtain the high frequency component in scale 1. The coefficients from the DWT were squared to emphasize changes in the coefficients and to find the maximum coefficients, as shown in Figure 4. Figure 4 shows the characteristics of the DWT coefficients for a phase A to ground short circuit at 30% of the length of the transmission line and at a fault inception angle of 0°. Before the fault occurred, the coefficients of the three phases and the zero sequence were almost zero. When the fault occurred, the coefficients of phase A and the zero sequence became higher than those of phases B and C. The coefficients then decreased as the post-fault current settled to steady state. Among the three-phase currents, the phase in which the fault occurred had a higher coefficient than the phases in which no fault occurred. The zero sequence coefficient was higher for short circuits to ground than for ungrounded short circuits.
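The feature extraction described above (scale-1 db2 detail coefficients, squared, then the maximum taken) can be sketched in a few lines. This is only an illustrative sketch, not the MATLAB procedure used in the study: the function names are hypothetical, the filter is applied without any boundary extension, and the global maximum is used as a stand-in for the "maximum first peak" coefficient.

```python
import math

def db2_detail_level1(signal):
    """Scale-1 detail coefficients using the db2 high-pass filter.

    Assumption: a plain 'valid' filtering with downsampling by 2 and no
    boundary extension, so values near the edges differ from MATLAB's dwt.
    """
    s3 = math.sqrt(3.0)
    norm = 4.0 * math.sqrt(2.0)
    # db2 scaling (low-pass) filter and its quadrature-mirror high-pass filter
    h = [(1 + s3) / norm, (3 + s3) / norm, (3 - s3) / norm, (1 - s3) / norm]
    g = [h[3], -h[2], h[1], -h[0]]
    return [sum(g[k] * signal[i + k] for k in range(4))
            for i in range(0, len(signal) - 3, 2)]

def max_squared_coefficient(signal):
    """Return (peak value, index) of the squared detail coefficients."""
    squared = [d * d for d in db2_detail_level1(signal)]
    peak = max(squared)
    return peak, squared.index(peak)

# A sudden step, standing in for a fault transient, produces one large coefficient.
step = [0.0] * 100 + [1.0] * 100
print(max_squared_coefficient(step))
```

Because db2 has two vanishing moments, constant and linear segments give coefficients near zero, which is consistent with the near-zero pre-fault coefficients described for Figure 4.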
The behavior of the DWT coefficients varies depending on various factors. The behavior of the coefficient values was previously analyzed to create a fault classification algorithm [7]. Moreover, the behavior of the positive sequence coefficient values has also been studied, as confirmed in [9].

Fault Classification
The behavior of the coefficients can be used to design an algorithm for fault classification [7]. AI methods are important for data classification because they can increase accuracy and analyze data rapidly. AI methods were used herein for fault classification and compared. ANNs are used in data classification and data forecasting. Probabilistic neural networks (PNNs) can be applied for data classification. Back-propagation neural networks (BPNNs) are popular ANNs because they can be used to solve numerous problems. In addition, the SVM is a popular choice for data classification. In this study, these three methods were used to classify fault types, and their accuracies were compared. The data (1080 data points in total) were divided into three parts: 50% were used for training (540 data points), 25% were used for validation (270 data points), and the remaining 25% were used for the case study (270 data points). Each data point consists of the maximum first peak coefficients of phases A, B, and C and of the zero sequence (Z). The normalized values (0 to 1) of these coefficients were input into the algorithm for classification.
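The data preparation described above can be sketched as follows. The text does not state how the normalization was carried out or whether the split was random, so column-wise min-max scaling and a seeded shuffle are assumptions made here for illustration, and both function names are hypothetical.

```python
import random

def min_max_normalize(rows):
    """Scale each column (A, B, C, Z coefficients) to the range 0..1.

    Assumption: normalization is done per column; the paper only states
    that normalized values between 0 and 1 were used.
    """
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [[(v - lo) / (hi - lo) if hi > lo else 0.0
             for v, lo, hi in zip(row, lows, highs)]
            for row in rows]

def split_50_25_25(samples, seed=0):
    """Split into training (50%), validation (25%), and case study (25%).

    Assumption: samples are shuffled before splitting.
    """
    shuffled = samples[:]
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) // 2
    n_val = len(shuffled) // 4
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

train, val, case = split_50_25_25(list(range(1080)))
print(len(train), len(val), len(case))  # 540 270 270
```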

Probabilistic Neural Network: PNN
PNN is a feedforward neural network that is applied to classification and pattern recognition problems. PNN is derived from the Bayesian network and a statistical algorithm called kernel Fisher discriminant analysis. It includes three layers (input layer, radial basis layer, and competitive layer). The first layer computes the distances from the input vector to the training input vectors and produces a vector whose elements indicate how close the input is to each training input. The second layer sums these contributions for each class of input to produce a vector of probabilities as its net output, and then picks the maximum of these probabilities, producing a 1 for that class and a 0 for the other classes. The training process begins with a random initial weight, and the spread of the radial basis layer, which sets the bias value (b = 0.8326/Spread), is increased from 0.0001 to 0.1 in steps of 0.0001. The number of errors is calculated at each step to find the minimum. This loop is repeated until the maximum spread (0.1) is reached or until the number of errors is equal to zero, after which training is stopped. The training process is summarized in the flowchart in Figure 5. The output is a number from 1 to 10, corresponding to the 10 types of fault. The training time for the process is approximately 97 s.
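The PNN decision rule and the spread sweep described above can be illustrated with a small pure-Python sketch. It is not the MATLAB implementation used in the study: the function names are hypothetical, and counting errors on a separate validation set is an assumption, since the text does not say which data the errors are counted on. The bias b = 0.8326/spread makes the radial basis output fall to about 0.5 when the distance to a training vector equals the spread.

```python
import math

def pnn_classify(x, train_x, train_y, spread):
    """Classify x by summing radial basis responses per class and taking the maximum."""
    b = 0.8326 / spread  # bias of the radial basis layer
    scores = {}
    for xi, yi in zip(train_x, train_y):
        n = math.dist(x, xi) * b
        scores[yi] = scores.get(yi, 0.0) + math.exp(-n * n)
    # Competitive step: the class with the largest summed response wins.
    # Note: for very small spreads all responses may underflow to 0.0,
    # in which case the first class seen is returned.
    return max(scores, key=scores.get)

def sweep_spread(train_x, train_y, val_x, val_y,
                 start=0.0001, stop=0.1, step=0.0001):
    """Increase the spread step by step and keep the value with the fewest errors."""
    best_spread, best_errors = None, None
    n_steps = int(round((stop - start) / step)) + 1
    for k in range(n_steps):
        spread = start + k * step
        errors = sum(pnn_classify(x, train_x, train_y, spread) != y
                     for x, y in zip(val_x, val_y))
        if best_errors is None or errors < best_errors:
            best_spread, best_errors = spread, errors
        if errors == 0:  # stop early once no errors remain
            break
    return best_spread, best_errors

# Toy data with two classes (illustrative values only, not from the study).
train_x, train_y = [[0.0, 0.0], [1.0, 1.0]], [1, 2]
print(pnn_classify([0.1, 0.1], train_x, train_y, spread=0.1))
```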

Back-Propagation Neural Network: BPNN
The back-propagation algorithm calculates the gradient of the loss function with respect to each weight using the chain rule. The gradient is calculated one layer at a time, working backward from the last layer, to avoid redundant calculations of intermediate terms in the chain rule. The BPNN consists of four layers (an input layer, two hidden layers, and an output layer). Hyperbolic tangent sigmoid functions are used in all hidden layers, while linear functions are used in the output layer. The training process starts with random weights and biases. One round of training is divided into three steps. First, the weights and biases are used to calculate the output of the neural network. Then, the error between the output of the neural network and the target output is back-propagated. Next, the weights and biases are adjusted. A total of 20,000 iterations are performed to compute the best value of the mean absolute percentage error (MAPE). The number of neurons in hidden layers 1 and 2 is increased one at a time until the MAPE value is below 0.5%. The MAPE index indicates the efficiency of BPNNs. MAPE can be computed using Equation (1). The training process of BPNN is shown in Figure 6.
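Equation (1) is not reproduced in this text. For reference, the commonly used definition of MAPE is given below; the symbols are notation assumed here rather than taken from the paper, with t_i the target output, y_i the network output, and n the number of samples.

```latex
\mathrm{MAPE} = \frac{100\%}{n} \sum_{i=1}^{n} \left| \frac{t_i - y_i}{t_i} \right|
```

This form is well defined when every target is nonzero, for example when fault types are encoded as numbers from 1 to 10.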

Support Vector Machine: SVM
SVMs are supervised learning models with associated learning algorithms that analyze data for classification and regression analysis. Categories in SVMs are separated by a clear gap that is as wide as possible. An SVM is a linear model, but it can be used to solve complex and nonlinear problems by mapping the data to another space that has more dimensions. This mapping is defined by the kernel function. Popular kernel functions include the Gaussian kernel and the polynomial kernel. Both kernel functions were applied in this study. The SVMs are divided into four models, which are used to classify different types of faults. The same data were input for all four SVM models, and the outputs showed the different types of faults. The training process starts with random SVM parameters, and the parameters are adjusted over 1000 rounds to find the maximum accuracy. If the results show more than 99% accuracy, the SVM parameters are recorded. Then, the kernel function is changed in order to select the best parameters. The training time is approximately 60 s. The process is shown in Figure 7. The best kernel is the polynomial kernel at kernel option level 19.
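The two kernel functions mentioned above can be written out as a short sketch. The exact parameterization used in the study is not given, so the forms below are the common textbook ones, and the parameter names (`degree`, `coef0`, `sigma`) are assumptions. Reading "kernel option level 19" as a polynomial degree of 19 is likewise only one possible interpretation.

```python
import math

def polynomial_kernel(x, y, degree, coef0=1.0):
    """Polynomial kernel (x . y + coef0) ** degree (common form, assumed here)."""
    dot = sum(a * b for a, b in zip(x, y))
    return (dot + coef0) ** degree

def gaussian_kernel(x, y, sigma):
    """Gaussian (RBF) kernel exp(-||x - y||^2 / (2 * sigma^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

# Example with normalized (A, B, C, Z) feature vectors; the values are illustrative only.
u = [0.9, 0.1, 0.1, 0.8]
v = [0.8, 0.2, 0.1, 0.7]
print(polynomial_kernel(u, v, degree=19))
print(gaussian_kernel(u, v, sigma=1.0))
```

A kernel returns a similarity between two feature vectors, which lets the linear SVM separate classes in the higher-dimensional space without computing the mapping explicitly.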

Result
Each AI method used 540 datapoints for training and 270 for validation. Then, the best parameters were used for classification. The remaining 270 datapoints were used for the case study to compare the accuracies of the different AI methods. Table 2 compares the accuracy of fault classification by the different AI methods at the sending end and receiving end. PNN and SVM give the most accurate results, with 100% accuracy at both the sending and receiving ends. BPNN is the least accurate, with an accuracy of 98.88% at the sending end. At the sending end, BPNN shows errors for the double line to ground fault (DLG) at 30-40% of the length of the transmission line and for the single line to ground fault (SLG) at 40% of the length of the transmission line: the DLG is detected as a line-to-line fault (LL), and the SLG is identified as a DLG. We also analyzed additional cases by varying the ground resistivity (1-100 ohm) in six cases of SLG and varying the load (70-120%) in ten cases of SLG and LL, making a total of sixteen cases. These additional cases were analyzed using the three AI models to test accuracy. Table 3 shows the percentage of accurate results in the additional cases. PNN was the least accurate, with an accuracy of 68.75% at both the sending and receiving ends. PNN showed errors in the high-load LL cases because it identified the LL as SLG at both ends. BPNN showed an accuracy of 68.75% at the sending end and 100% at the receiving end. BPNN showed errors in the high-load LL cases, which were identified as DLG. The SVM showed the most accurate results for the additional cases, with an accuracy of 100% at both ends. Therefore, the SVM is suitable for fault identification.
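The accuracy figures above follow from simple counting, sketched below under the assumption that accuracy is the number of correct classifications divided by the number of cases. The function names and label lists are hypothetical placeholders, not data from the study. With sixteen additional cases, 68.75% corresponds to 11 correct classifications, and 98.88% of 270 cases corresponds to 267 correct classifications (98.89% when rounded rather than truncated).

```python
def accuracy_percent(predicted, actual):
    """Percentage of cases in which the predicted fault type matches the actual one."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return 100.0 * correct / len(actual)

def misclassified(predicted, actual):
    """List (index, actual, predicted) for every wrong classification."""
    return [(i, a, p) for i, (p, a) in enumerate(zip(predicted, actual)) if p != a]

# Placeholder labels: 16 cases with 5 LL faults wrongly reported as SLG.
actual = ["SLG"] * 11 + ["LL"] * 5
predicted = ["SLG"] * 16
print(accuracy_percent(predicted, actual))  # 68.75
print(misclassified(predicted, actual)[:2])
```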

Conclusions
In this study, the behavior of DWT coefficients was analyzed for fault classification using various AI methods. Fault signals in a 115-kV hybrid transmission line were simulated using EMTP/ATPDraw software. Then, the signals comprising the three-phase currents and the zero sequence were decomposed by the WT in MATLAB software. The coefficients from the DWT had different values depending on the position and angle at which the fault occurred. The coefficients of a phase were high when a fault occurred in that phase, and the zero sequence coefficients were high when a short circuit to ground occurred.
Using the behavior of DWT coefficients, previous studies designed algorithms for fault classification, but these algorithms had lower accuracy. Therefore, this study used AI methods to increase the performance and accuracy of fault classification. ANN and SVM were chosen for fault classification because they are well known for data classification. Two types of ANN were applied: PNN and BPNN. The 1080 datapoints were divided into three parts. Half of the data were used for training the AI models, 25% of the data were used for validation, and 25% were used for the case study. Normalized values (0-1) of the coefficients of phases A, B, and C and the zero sequence were input into the algorithms for classification. The case study results show that PNN and SVM were the most accurate. In the additional cases, the SVM was more accurate than PNN and BPNN.
Because the fault information is complex, AI methods are required for fault classification: they are highly flexible, and not all fault data need to be input for training. PNN and BPNN showed errors in the cases with different loads, and controlling the load in real-time cases is difficult. However, the SVM provided 100% accuracy for both types of data (case study and additional cases). In terms of computation time, the SVM also required less training time (approximately 60 s) than the PNN (approximately 97 s). The SVM is therefore suitable for fault classification because it is flexible and classified the additional cases correctly without retraining.

Conflicts of Interest:
The authors declare no conflict of interest.