Accuracy Improvement of Transformer Faults Diagnostic Based on DGA Data Using SVM-BA Classifier

The main objective of the current work was to enhance the transformer fault diagnostic accuracy based on dissolved gas analysis (DGA) data with a proposed coupled system of support vector machine (SVM)-bat algorithm (BA) and Gaussian classifiers. Six electrical and thermal fault classes were categorized based on the IEC and IEEE standard rules. The concentration of five main combustible gases (hydrogen, methane, ethane, ethylene, and acetylene) was utilized as an input vector of the two classifiers. Two types of input vectors have been tested; the first input type considered the five gases in ppm, and the second input type considered the gases as percentages of the sum of the five gases. An extensive database of 481 samples was used for the training and testing phases (321 data samples for training and 160 data samples for testing). The SVM model conditioning parameter "λ" and penalty margin parameter "C" were adjusted through the bat algorithm to maximize the accuracy rate. The accuracy of the SVM-BA and Gaussian classifiers was evaluated and compared with several DGA techniques in the literature.


Introduction
The insulation system state of the power transformers is responsible for determining the transformers' lifetime. It is generally exposed to a couple of defects arising from overheating, paper carbonization, arcing, and discharges of low or high energy [1][2][3]. These faults might accelerate the insulation degradation, affecting the transformer reliability and lifetime [4]. Early detection of these faults can avoid the undesired abnormal operating conditions or transformer outages [5,6].
Several DGA techniques have been proposed in the literature to detect faults in transformers, but in some cases their diagnostic accuracy is inadequate. Dissolved gas analysis (DGA) is considered one of the fastest and most economical techniques widely used to diagnose fault types in the transformer insulation system [7]. The insulating oil decomposes into hydrocarbon products, which are categorized as combustible and incombustible gases. The five main combustible gases are Hydrogen (H2), Methane (CH4), Acetylene (C2H2), Ethylene (C2H4), and Ethane (C2H6), which might be generated within the oil during a faulty mode [1]. The concentrations of these gases were used as an input vector to interpret the DGA results in transformer oil, associated with six basic electrical and thermal faults [4,8]. Different DGA techniques have been developed to diagnose transformer faults, including graphical DGA methods (e.g., [1,9,10,11]) and artificial intelligence techniques (e.g., [12,13]). Improved coupled techniques have also been developed to diagnose multiple transformer faults and quantitatively indicate each fault's likelihood (e.g., [14]).
Artificial intelligence techniques such as artificial neural networks (ANN) can be combined with the traditional DGA techniques to enhance the diagnostic accuracy of transformer faults, as in the California State University Sacramento artificial neural network method (CSUS-ANN) [13]. The CSUS-ANN DGA technique used the percentage concentrations of the five main combustible gases as inputs to a backpropagation neural network, which determines the transformer faults after being trained on DGA samples with known fault types. Ghoneim and Taha [15] proposed a new approach (clustering) to enhance transformer fault diagnosis by combining new gas ratios with the IEC ratios and defining their limits to improve diagnostic accuracy. The traditional IEC code 60599 and Rogers' four ratios gave a poor diagnostic accuracy of the transformer faults. An enhancement of the diagnostic accuracy, obtained by modifying the ratio limits of these two DGA methods using particle swarm optimization with fuzzy logic, is presented in [6]. The conditional probability method in [16] introduced a new concept based on the likelihood of a fault's occurrence and of its non-occurrence, estimated from the mean and standard deviation of the DGA samples of the two events. The conditional probability of the fault occurrence is identified using the multivariate normal probability density function. Three scenarios were developed depending on how the different faults are separated. All these techniques are merged into one software package (DGALab), presented in [17], to facilitate the comparison between them and any newly proposed DGA technique, with the advantage of using an extensive database of DGA samples [17,18].
In this paper, SVM-BA and Gaussian classifiers have been used to detect faults within an oil-immersed power transformer. The concentrations of the five main combustible gases, in ppm and as percentages of their sum, have been used as an input vector for the Gaussian and SVM classifiers. The kernel parameter λ and penalty margin C of the SVM model have been optimized by a bat algorithm (SVM-BA) to tune the model and obtain a high diagnostic accuracy. The output of each classifier is one of the electrical and thermal transformer faults: partial discharge (PD), low energy discharges (D1), high energy discharges (D2), thermal faults < 300 °C (T1), thermal faults of 300 °C to 700 °C (T2), and thermal faults > 700 °C (T3) [1]. The performance of each classifier has been investigated in terms of accuracy rate. A total of 481 samples have been considered, where two-thirds were used for the training process (321 samples) while the rest were used for the testing process (160 samples). A comparative study was carried out with the other DGA techniques in the literature to identify the diagnostic improvement of the proposed DGA technique.
The current work presents a classification technique (SVM-BA and Gaussian classifiers) to enhance the diagnostic accuracy of transformer faults, which is considered one of the new trends in condition monitoring and diagnostics of power system assets.

Problem Formulation
Highly reliable transformers are mainly made of iron core and windings; both are placed in the oil tank filled with insulating oil, as shown in Figure 1.
Mineral insulating oil is the most common type of oil used in outdoor transformers [19]. This insulating oil has significant dielectric strength, so it can withstand a fairly high voltage. It also removes the heat generated by the transformer windings by means of the cooler (radiators, air fans, …). The heat generated in the transformer results in a temperature rise in the internal transformer structures. Under electrical and thermal stresses, different hydrocarbon gases are liberated due to the decomposition of the insulating oil. Particular gases characterize each type of fault. For instance, the hydrogen concentration, produced by ionic bombardment, increases with partial discharges within the transformer oil. In this context, a general review of the gases produced during the deterioration of mineral oil and their interpretation is given in [10].
Early-stage detection of these faults should be carried out to avoid the undesired abnormal operating conditions or transformer outages. For this purpose, periodic monitoring of the oil should be conducted during transformer service, whether in-situ or at the laboratory, using a multi-stage gas-extractor (a device for sampling transformer oil) [10]. In general, the most important gases are Hydrogen (H2), Methane (CH4), Acetylene (C2H2), Ethylene (C2H4), and Ethane (C2H6). The distribution of these gases is related to the type of transformer fault, and the rate of gas generation can indicate the severity of the fault [5,20].
In [6], the authors collected 481 samples associated with the six faults listed in the Introduction (i.e., PD, D1, D2, T1, T2, and T3). The number of samples associated with each fault is given in Table 1. This database has been exploited in the present investigation to detect and identify faults. As shown in this table, only separate faults (no combined faults) have been considered. Fault detection has been examined using the concentration of each dissolved gas. Since expressing the gases mentioned earlier in weight percent would result in inconveniently small numbers, the concentration in parts per million (ppm) has been considered for each gas. Furthermore, the concentration as a percentage of the total sum was also used, where each sample X = [x1, x2, …, x5] is scaled as follows:

$$x_i \leftarrow \frac{x_i}{\sum_{j=1}^{5} x_j} \times 100, \qquad i = 1, \ldots, 5 \tag{1}$$

The fault diagnostic method has been carried out using two different classifiers, namely Gaussian and SVM-BA. The flowchart given in Figure 2 summarizes the stages of the diagnostic approach.
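The percentage scaling described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function name `to_percent_of_total`, the gas ordering, and the sample values are assumptions made for the example.

```python
def to_percent_of_total(sample_ppm):
    """Scale one DGA sample so each gas is a percentage of the five-gas sum."""
    total = sum(sample_ppm)
    if total <= 0:
        raise ValueError("the gas concentrations must sum to a positive value")
    return [100.0 * x / total for x in sample_ppm]


# Hypothetical sample in ppm, ordered here as H2, CH4, C2H2, C2H4, C2H6.
sample = [100.0, 50.0, 0.0, 30.0, 20.0]
scaled = to_percent_of_total(sample)
print(scaled)  # [50.0, 25.0, 0.0, 15.0, 10.0]
```

After scaling, the five entries of a sample always sum to 100, so two samples with the same gas proportions but different absolute ppm levels map to the same input vector.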

Classification Approach
For both classifiers, Gaussian and SVM-BA, the concentrations of the five dissolved gases, in percentages and in ppm, have been used as an input vector, denoted by X = [x1, x2, …, x5], associated with a particular class of fault (denoted by y) representing the classifier decision (classifier output).

Gaussian Classifier
In this part, Gaussian classification is used as a probabilistic learning method that constructs a classifier by applying Bayes' theorem, which relates the conditional and marginal probabilities of two random events. The classifier is based on the comparison of the posterior probabilities $P(w_i \mid x)$:

$$P(w_i \mid x) = \frac{P(x \mid w_i)\, P(w_i)}{P(x)} \tag{2}$$

where $P(x \mid w_i)$ is the conditional probability (likelihood), modelled by a Gaussian density with mean $\mu_i$ and covariance $\sigma_i$:

$$P(x \mid w_i) = \mathcal{N}(x;\, \mu_i, \sigma_i) \tag{3}$$

and $P(x)$ is the unconditional density that normalizes the posteriors, computed as follows:

$$P(x) = \sum_{i=1}^{6} P(x \mid w_i)\, P(w_i) \tag{4}$$

in which $P(w_i)$ is the prior probability of each class. First, the training phase has been carried out to construct the parameters of the Gaussian model. In this phase, 321 samples of the data set have been reserved to determine the Gaussian distributions, each consisting of the mean value ($\mu$) and the covariance matrix ($\sigma$) of the gas concentrations for one defect class. Since the number of samples differs from one fault to another, every distribution is multiplied by a weight equal to its number of samples divided by the total size of the database.
In the next step, the Gaussian model has been employed to compute the conditional probability $P(x \mid w_i)$ as indicated in Equation (3). For a single variable, the probability density function of a univariate normal distribution with mean $\mu$ and variance $\sigma^2$ is:

$$\mathcal{N}(x;\, \mu, \sigma^2) = \frac{1}{\sqrt{2\pi\sigma^2}} \exp\!\left(-\frac{(x-\mu)^2}{2\sigma^2}\right) \tag{5}$$

Since it is required to know the likelihood of observing the k-th sample while considering all the different distributions, one can sum the likelihood of observing the given sample from each possible Gaussian, using:

$$P(x_k) = \sum_{i=1}^{6} P(w_i)\, \frac{1}{(2\pi)^{n/2}\, |\sigma_i|^{1/2}} \exp\!\left(-\tfrac{1}{2}\,(x_k-\mu_i)^{T} \sigma_i^{-1} (x_k-\mu_i)\right) \tag{6}$$

in which $n = 5$ is the number of gases, and $|\sigma_i|$ and $\sigma_i^{-1}$ denote the determinant and inverse of the covariance matrix $\sigma_i$. The parameters of each Gaussian model (i.e., variance, mean, and weight) have been estimated to cluster the data and group the samples sharing the same parameters. Moreover, a maximum likelihood estimate (MLE) was used to find the optimal mean and variance, maximizing the likelihood of the data. After training the model, the classifier output ideally ends up with six distributions on the same axis. Depending on its location on this axis, each testing sample (160 in total) is placed in one of the defect classes. Figure 3 illustrates the different steps of the Gaussian classifier.
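The training and decision steps of a Gaussian Bayes classifier can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses a diagonal covariance (one variance per gas) instead of the full covariance matrix described above, so that no matrix inverse or determinant is needed, and all function names, the variance floor, and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_classes(samples, labels, var_floor=1e-6):
    """Maximum likelihood estimate of prior, mean and per-feature variance."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(x)
    model = {}
    for y, rows in grouped.items():
        n, dim = len(rows), len(rows[0])
        mean = [sum(r[j] for r in rows) / n for j in range(dim)]
        var = [max(sum((r[j] - mean[j]) ** 2 for r in rows) / n, var_floor)
               for j in range(dim)]
        # The prior is the class weight: class size over database size.
        model[y] = {"prior": n / len(samples), "mean": mean, "var": var}
    return model


def log_joint(x, params):
    """log P(w_i) + log P(x | w_i) for a diagonal Gaussian (assumption)."""
    value = math.log(params["prior"])
    for xj, mu, var in zip(x, params["mean"], params["var"]):
        value += -0.5 * math.log(2.0 * math.pi * var) - (xj - mu) ** 2 / (2.0 * var)
    return value


def posteriors(x, model):
    """Posterior P(w_i | x) for every class, normalised by P(x)."""
    logs = {y: log_joint(x, p) for y, p in model.items()}
    peak = max(logs.values())  # subtract the peak to avoid underflow
    unnorm = {y: math.exp(v - peak) for y, v in logs.items()}
    evidence = sum(unnorm.values())
    return {y: u / evidence for y, u in unnorm.items()}


def classify(x, model):
    """Return the class with the largest posterior probability."""
    post = posteriors(x, model)
    return max(post, key=post.get)


# Toy two-feature data with two fault labels (values are made up).
train_x = [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2], [5.0, 6.0], [5.2, 5.8], [4.8, 6.2]]
train_y = ["T1", "T1", "T1", "D2", "D2", "D2"]
model = fit_gaussian_classes(train_x, train_y)
print(classify([1.1, 2.1], model))
print(classify([5.1, 5.9], model))
```

Working in the log domain keeps the comparison of posteriors numerically stable when a sample lies far from a class mean, which is likely with raw ppm values spanning several orders of magnitude.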

SVM Classifier Coupled with BA
SVM techniques are used for classification, regression, and prediction problems [21]. For classification problems, hyperplanes are required in a multidimensional space to separate the data points of two fault classes. These hyperplanes are used to distinguish between every two classes ($y_i$ and $y_j$) of faults associated with two different input vectors ($X_i$ and $X_j$) [22][23][24]. Among these hyperplanes, the one with the maximum margin (denoted by M) is sought. In this light, classification becomes an optimization problem in which the hyperplanes represent the decision boundaries that help classify the data points. Usually, a vector orthogonal to the hyperplane (denoted by $\omega$) is defined by:

$$\omega = [\omega_1, \omega_2, \ldots, \omega_n]^{T} \tag{7}$$

which is used in combination with an input vector ($X_i$) to define the hyperplane function, h, as follows [22]:

$$h(X_i) = \omega^{T} X_i + \omega_0 \tag{8}$$

where $\omega_0$ is the bias term required to determine the position of the separating hyperplane (i.e., $h(X) = 0$).
A one-to-one learning strategy is selected. It is assumed that $X_i$ is of class "1" if $h(X_i) \geq 0$ and of class "−1" otherwise. Assuming that $X_i$ and $X_j$ are the two closest points on each side of the hyperplane (different classes), the equations for the hyperplanes $h(X_i)$ and $h(X_j)$ become:

$$\omega^{T} X_i + \omega_0 = 1 \tag{9}$$

and

$$\omega^{T} X_j + \omega_0 = -1 \tag{10}$$

Subtracting these equations and dividing both sides by the magnitude of $\omega$, we obtain:

$$\frac{\omega^{T} (X_i - X_j)}{\lVert \omega \rVert} = \frac{2}{\lVert \omega \rVert} \tag{11}$$

The left-hand side is the projection of $X_i - X_j$ onto the unit normal of the hyperplane, that is, the distance between the two hyperplanes. From expression (11), it is clear that maximizing the margin implies minimizing the norm of the weight vector $\omega$ used to define the hyperplane. A soft-margin SVM is utilized for non-separable classes, giving the model the freedom to misclassify some data points while minimizing the number of such samples [23]. For this purpose, non-negative slack variables $\zeta_i$ are introduced in the hyperplane equation. Consequently, the optimization problem becomes:

$$\min_{\omega,\, \omega_0,\, \zeta}\ \frac{1}{2} \lVert \omega \rVert^{2} + C \sum_{i} \zeta_i \quad \text{subject to} \quad y_i \left(\omega^{T} X_i + \omega_0\right) \geq 1 - \zeta_i, \quad \zeta_i \geq 0 \tag{12}$$

C represents the margin parameter, which can be seen as a regularization parameter. The corresponding Lagrangian dual problem is formulated with the Lagrange coefficients (multipliers) $\alpha_i$, subject to the Karush-Kuhn-Tucker conditions [14]. Setting the derivatives of the Lagrangian with respect to $\omega$ and $\omega_0$ individually to 0, it follows that the Lagrangian expression should be maximized under the constraint:

$$\sum_{i} \alpha_i y_i = 0 \tag{15}$$

and it also yields

$$\omega = \sum_{i} \alpha_i y_i X_i \tag{16}$$

Since the data are assumed to be non-separable, the feature space has been enlarged by a characteristic function Φ associated with the Kernel function. Every data point has been mapped into a high-dimensional space through a particular transformation Φ: X ↦ φ(X).
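The decision rule and the margin derived above can be illustrated numerically. The sketch below is only an illustration: the weight vector `w`, the bias `w0`, and the two test points are hypothetical values chosen so that the arithmetic is easy to follow.

```python
import math


def hyperplane(x, w, w0):
    """Hyperplane function h(X) = w^T X + w0."""
    return sum(wi * xi for wi, xi in zip(w, x)) + w0


def predict(x, w, w0):
    """Class 1 if h(X) >= 0, otherwise class -1."""
    return 1 if hyperplane(x, w, w0) >= 0 else -1


def margin(w):
    """Distance 2 / ||w|| between the hyperplanes h(X) = 1 and h(X) = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))


# Hypothetical two-dimensional weight vector and bias.
w = [3.0, 4.0]
w0 = -5.0
print(predict([3.0, 1.0], w, w0))  # h = 9 + 4 - 5 = 8, so class 1
print(predict([0.0, 0.0], w, w0))  # h = -5, so class -1
print(margin(w))                   # ||w|| = 5, so the margin is 0.4
```

Halving every component of `w` would double the margin, which is why maximizing the margin is expressed as minimizing the norm of the weight vector.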
A polynomial Kernel function $K$ of degree d, parameterized by λ, has been used in this investigation [23]. It verifies the following condition:

$$K(X_i, X_j) = \varphi(X_i)^{T} \varphi(X_j) \tag{18}$$

In this case, the optimization problem after rearrangement becomes:

$$\max_{\alpha}\ \sum_{i} \alpha_i - \frac{1}{2} \sum_{i} \sum_{j} \alpha_i \alpha_j y_i y_j K(X_i, X_j) \quad \text{subject to} \quad \sum_{i} \alpha_i y_i = 0, \quad 0 \leq \alpha_i \leq C \tag{19}$$

The SVM parameters, consisting of:
• Kernel parameter λ (conditioning parameter, equivalent to σ in the RBF kernel [24]);
• penalty parameter C (margin parameter); and
• degree d of the Kernel polynomial,
significantly affect the accuracy of the predictive model. To further improve the accuracy rate, the bat algorithm (BA) has been employed in this investigation to optimize the SVM parameters. BA is a meta-heuristic algorithm for global optimization, introduced by Xin-She Yang in 2010, that simulates the echolocation behavior microbats use to sense the distance to prey and avoid obstacles [23]. In BA, the aim is reached by determining the optimum parameters C and λ that give the best accuracy rate of the SVM classifier. The degree of the polynomial kernel has been fixed in turn at d = 1, 2, and 3. Figure 4 illustrates the different steps of the coupled SVM-BA classifier.
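To make the roles of λ and d concrete, the sketch below evaluates a polynomial kernel on two small vectors. The exact kernel expression used by the authors is not given here, so the common form K(x, z) = (λ⟨x, z⟩ + c)^d with c = 1 is an assumption, as are the function names and the sample values.

```python
def polynomial_kernel(x, z, lam, degree, coef0=1.0):
    """Assumed form: K(x, z) = (lam * <x, z> + coef0) ** degree."""
    dot = sum(a * b for a, b in zip(x, z))
    return (lam * dot + coef0) ** degree


def gram_matrix(samples, lam, degree):
    """Kernel value for every pair of samples."""
    return [[polynomial_kernel(a, b, lam, degree) for b in samples]
            for a in samples]


# Hypothetical two-dimensional samples.
samples = [[1.0, 2.0], [3.0, 0.0]]
K = gram_matrix(samples, lam=0.5, degree=2)
print(K)  # [[12.25, 6.25], [6.25, 30.25]]
```

With d = 1 the kernel is an affine function of the inner product, so the decision boundary stays linear in the input space; d = 2 and d = 3 add products of gas concentrations as implicit features.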
In the beginning, the BA parameters have been initialized. Table 2 lists the detailed settings of the BA values used to optimize the SVM model. BA then generates a population of SVM parameter couples (C, λ). For each couple, the initial position p and velocity v have been randomly selected. Each couple's fitness has been evaluated to extract the best global position (denoted by $p^{*}$). This means that the training dataset is used to train the SVM classifier for each position, while the testing dataset is used to calculate the accuracy rate. The latter is the ratio of the number of correctly classified samples ($N_c$) to the total number of test samples ($N$). After this step, the position and velocity of each individual are updated using the following expressions:

$$v_i^{t+1} = v_i^{t} + \left(p_i^{t} - p^{*}\right) f_i, \qquad p_i^{t+1} = p_i^{t} + v_i^{t+1} \tag{20}$$

where $v_i^{t}$ and $v_i^{t+1}$ are the current and next velocities, corresponding to the current position $p_i^{t}$ and the next position $p_i^{t+1}$, respectively. The parameter $f_i$ denotes the frequency, which is computed as follows:

$$f_i = f_{\min} + \left(f_{\max} - f_{\min}\right) \beta \tag{21}$$

in which β is a random number ranging between 0 and 1, and $f_{\min}$ and $f_{\max}$ bound the frequency range. It is worth noting that the search space has been bounded by $C_{\min} = 10^{-6}$ and $C_{\max} = 0.1$ for the parameter C, and by $\lambda_{\min} = 10^{-7}$ and $\lambda_{\max} = 0.7$ for the parameter λ. Moreover, the pulse rate increases with the iteration number, while the loudness decreases.
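The search loop of the bat algorithm can be sketched as below. This is a hedged illustration, not the authors' implementation: the fitness function `toy_accuracy` is a hypothetical stand-in for training the SVM and measuring its test accuracy, the pulse-rate and loudness rules follow the commonly published forms of the algorithm, and the values of `f_min`, `f_max`, `alpha`, `gamma`, `pulse0`, `loudness0`, and the local-walk step size are assumptions (the paper's own settings are in Table 2). Only the bounds on C and λ are taken from the text.

```python
import math
import random


def toy_accuracy(params):
    """Hypothetical stand-in for the SVM test accuracy at (C, lam)."""
    c, lam = params
    return 1.0 - ((c - 0.05) / 0.1) ** 2 - ((lam - 0.3) / 0.7) ** 2


def bat_search(fitness, bounds, n_bats=10, n_iter=50, f_min=0.0, f_max=2.0,
               loudness0=1.0, pulse0=0.5, alpha=0.9, gamma=0.9, seed=0):
    """Maximise `fitness` inside `bounds` with a basic bat algorithm."""
    rng = random.Random(seed)

    def clip(point):
        return [min(max(v, lo), hi) for v, (lo, hi) in zip(point, bounds)]

    # Random initial positions and velocities for every bat.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_bats)]
    vel = [[0.1 * (hi - lo) * rng.uniform(-1.0, 1.0) for lo, hi in bounds]
           for _ in range(n_bats)]
    fit = [fitness(p) for p in pos]
    loud = [loudness0] * n_bats
    pulse = [0.0] * n_bats
    best = max(range(n_bats), key=fit.__getitem__)
    best_pos, best_fit = list(pos[best]), fit[best]
    history = [best_fit]

    for t in range(1, n_iter + 1):
        for i in range(n_bats):
            # Frequency, velocity and position updates.
            freq = f_min + (f_max - f_min) * rng.random()
            vel[i] = [v + (p - b) * freq
                      for v, p, b in zip(vel[i], pos[i], best_pos)]
            cand = clip([p + v for p, v in zip(pos[i], vel[i])])
            if rng.random() > pulse[i]:
                # Local random walk around the best position found so far.
                mean_loud = sum(loud) / n_bats
                cand = clip([b + 0.01 * (hi - lo) * mean_loud * rng.uniform(-1.0, 1.0)
                             for b, (lo, hi) in zip(best_pos, bounds)])
            cand_fit = fitness(cand)
            if cand_fit >= fit[i] and rng.random() < loud[i]:
                # Accept the move: loudness decreases, pulse rate increases.
                pos[i], fit[i] = cand, cand_fit
                loud[i] *= alpha
                pulse[i] = pulse0 * (1.0 - math.exp(-gamma * t))
            if cand_fit > best_fit:
                best_pos, best_fit = list(cand), cand_fit
        history.append(best_fit)
    return best_pos, best_fit, history


# Search bounds for C and lam as stated in the text.
bounds = [(1e-6, 0.1), (1e-7, 0.7)]
best_pos, best_fit, history = bat_search(toy_accuracy, bounds)
print(best_pos, best_fit)
```

In an actual SVM-BA run, `fitness` would train the SVM on the 321 training samples with the candidate (C, λ) and return the accuracy on the 160 test samples, so each fitness evaluation is far more expensive than in this toy example.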

Experimental Work
During transformer operation, the insulation of the transformer coils is subjected to high electrical and thermal stresses, which cause corrosion of some insulating material particles and decomposition of some insulating oil particles, producing different types of gases. These gases dissolve in the transformer oil. At the onset of a slight fault, the gas release is not large enough to operate the gas protection device; such a fault does not cause instantaneous breakdown, but the transformer efficiency is reduced.
The gases used to diagnose the transformer's state include Hydrogen (H2), Methane (CH4), Acetylene (C2H2), Ethylene (C2H4), Ethane (C2H6), carbon monoxide (CO), and carbon dioxide (CO2). Chromatographic analysis (CA) of dissolved gases in transformer oils is an analysis method that reveals small percentages of dissolved gases in the oil. The CA of the gases indicates the transformer's condition at an early stage of fault occurrence. Thus, transformers can be preserved and failures reduced before a complete breakdown occurs.
The accuracy of the CA results depends on how the oil sample is drawn from the transformer, how the dissolved gases are extracted from the oil samples, and how the analyzer is adjusted. The CA must be carried out at the start of the transformer's operation, and its results serve as a reference for later analyses of the same transformer.
American Society for Testing and Materials (ASTM) D3612-2 [24] specifies the procedures for extracting dissolved gases from transformer oil samples using gas chromatography (GC). The GC consists of the mobile phase (comprising three types of gases: the carrier gas, the fuel gas, and zero air), the sample injector, the column, the column oven, the detector, and the data system.
Oil samples were drawn by a sampling device and filled into glass vials, which were then placed into the autosampler unit. The samples were analyzed one by one after being inserted into the oven at 80 °C. The dissolved gases are extracted by raising the temperature and agitating the oil sample. The extracted gases are then injected into the GC for analysis [25]. Figure 5 illustrates the process of drawing the oil samples from the transformer and the GC device (8890 Gas Chromatograph (GC) System and 7697A Headspace Sampler, Agilent, USA). The GC analysis results are shown in Figure 6, illustrating the time required to extract each gas and its concentration in ppm. The chromatograph provides a signal over time, which produces the familiar chromatogram. The chromatogram signal can be converted into a list of peak times and sizes by either manual or electronic means [26].

Results and Discussions
A database of 481 samples has been exploited to evaluate each classifier's accuracy rate. As stated in Section 2, 321 samples of the data set were used in the training phase, while the rest were used for testing (160 samples). The data distribution was based on the holdout method, in which more than 60% of the database must be reserved for the training phase (2/3 for the training set and the remaining 1/3 as the test set) [27]. Both data parts were randomly selected, and the same parts were used in all simulations. DGA results in percentages (i.e., percentages of the total sum) and in ppm have been considered as input vectors for both classifiers. As mentioned previously for SVM, the polynomial Kernel and a one-to-one learning strategy were selected. The classifiers' results were compared against inspection (the actual fault in the transformer), as in Table 3, which illustrates some cases comparing the SVM-BA and Gaussian classifiers for gases in ppm and in percentages.

Table 3. Sample cases comparing the SVM-BA and Gaussian classifier diagnoses with the inspection result, for gases in ppm and in percentages.

…                                T1   T1   T3*  T1   T1
960    4000   6     1560   1290   T2   T3*  T3*  T2   T3*
1374   2648   298   5376   628    T3   T3   T3   T3   T3

(*) denotes that the diagnosis is wrong based on the inspection.
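The holdout split and the accuracy rate used throughout this section can be sketched as follows. The function names, the random seed, and the made-up label lists are assumptions for illustration; only the sizes 481, 321, and 160 come from the text.

```python
import random


def holdout_split(n_samples, n_train, seed=0):
    """Randomly split sample indices into a training and a test part."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return idx[:n_train], idx[n_train:]


def accuracy_rate(predicted, actual):
    """Accuracy in percent: 100 * N_c / N."""
    correct = sum(1 for p, a in zip(predicted, actual) if p == a)
    return 100.0 * correct / len(actual)


train_idx, test_idx = holdout_split(481, 321)
print(len(train_idx), len(test_idx))  # 321 160

# Made-up labels: 150 of 160 test samples classified correctly.
actual = ["T1"] * 160
predicted = ["T1"] * 150 + ["T3"] * 10
print(accuracy_rate(predicted, actual))  # 93.75
```

With 160 test samples, each sample is worth 100/160 = 0.625 percentage points, so an accuracy of 93.75% corresponds to 150 correct diagnoses and 10 misclassifications.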
Not only does the type of input vector influence the accuracy rate of the SVM algorithm, but previous investigations have shown that the degree of the polynomial kernel can also affect the diagnostic accuracy [28]. Figure 7 shows the impact of the input vector type on the evolution of the SVM-BA classifier's performance during the optimization process, for input vectors given in percentages (Figure 7a) and in ppm (Figure 7b) and for different polynomial kernel degrees. For a given degree of the Kernel polynomial, 300 iterations are largely sufficient for the convergence of the SVM-BA algorithm. The results show that the SVM-BA classifier's accuracy rate is quite sensitive to the degree of the Kernel polynomial. For an input vector in percentages, as shown in Figure 7a, the maximal accuracy rate is 93.13% with d = 2 and 3 against 91.88% with d = 1. On the other hand, notably lower results are found in Figure 7b for an input vector in ppm, where the highest accuracy is 87.5%, obtained for d = 1, against 82.5% and 78.13% for d = 2 and 3, respectively. However, the convergence for d = 3 is very fast compared to that for d = 2 and d = 1.
The previous simulation, related to the SVM-BA classifier and shown in Figure 7, was repeated 50 times to find the best accuracy rate and lend more credibility to the obtained results. Figure 8 illustrates an example of the accuracy rate versus the number of runs (i.e., executions) when using the dissolved gases in percentages as an input vector for the SVM-BA classifier. These results have been computed for Kernel polynomials of degree d = 1, 2, and 3. In Figure 8, the best accuracy rate obtained over the different executions lies between 91% and 94%. The global best accuracy rates obtained over the runs are presented in Table 4, with the DGA data expressed both in ppm and in percentages for the input vector. For SVM-BA, the results are given for three degrees of the Kernel polynomial. After 50 executions, it was found that the maximal accuracy rate was 93.75% with d = 2 and 3 against 93.13% with d = 1, obtained when employing an input vector in percentages. When using the dissolved gases in ppm as the input vector for the SVM-BA, the computed results decreased to 87.50% for d = 1 against 89.75% for d = 2 and 3. This implies that the SVM-BA classifier gives a better accuracy rate for an input vector given in percentages. On the other hand, the Gaussian classifier gives the lowest accuracy rate of 32.75% when the input vector is in ppm, compared to an accuracy rate of 66.25% when the input vector is in percentages. This finding demonstrates that gas concentrations in percentages better differentiate a particular defect from the others.
For the Gaussian classifier, it should be noted that the results improved noticeably when the real part of the posterior probability given by expression (6) was employed. In this case, the accuracy rate increased to 70% for an input vector in percentages, while it remained the same (i.e., 32.75%) for an input vector in ppm. Such findings suggest using the real part of the posterior probability in the Gaussian classifier with an input vector in percentages. Compared to other results in the literature, the SVM-BA classifier has good accuracy and a high ability to diagnose the transformer fault classes with simple code. The overall accuracy obtained in [6] for the same database in ppm is a good example of this.

Validation and Overall Accuracy of the Proposed SVM-BA Classifier
The SVM-BA and Gaussian classifiers are compared with various classification algorithms available in the DGALab interface to evaluate the accuracy of the proposed method [17]. The free DGALab software package is available in [18]. DGALab includes the Duval triangle method, IEC code 60599, Rogers' four ratios, the modified IEC code and modified Rogers' four ratios, the clustering method, conditional probability, and the California State University Sacramento artificial neural network method (CSUS-ANN). Details of all these algorithms are given in [15][16][17]. A comparison among all these methods was carried out based on the individual fault accuracy and the overall accuracy rate (Table 5).
The last row in Table 5 gives the total number of samples used for testing and the overall accuracy of each DGA method for comparison purposes. From the evaluation in Table 5, the SVM-BA provides the best overall accuracy rate (93.75%). This superiority comes from the ability of SVM to classify a complex and extensive data set. Moreover, coupling the SVM with the bat algorithm enabled the right choice of parameters, which gave the highest possible accuracy rate. The method closest to the proposed one in overall accuracy is the modified IEC code, which showed an 88.75% overall accuracy rate. The worst overall accuracy rate is that of the Rogers' four ratios DGA method, at 53.75%. The results in Table 5 are recapitulated in Figure 9 in histogram form.

Conclusions
This paper proposed a new DGA technique using an SVM-BA classifier to enhance the diagnostic accuracy of transformer faults. The concentrations of the five main combustible dissolved gases (H2, CH4, C2H6, C2H4, and C2H2) were used as an input vector to the SVM-BA classifier to identify the transformer fault type. The concentrations of the five dissolved gases were used in ppm and in percentages. A total of 481 samples were collected from the chemical laboratory and the literature and divided into 321 data samples for training and 160 data samples for testing. The SVM-BA classifier results indicated the following:

• An accuracy rate of 93.75% was achieved with the input vector in percentages and d = 2 and 3.
• The test results of the coupled SVM-BA classifier revealed its ability to enhance the diagnostic accuracy of transformer faults compared with the other DGA techniques in the literature.
• The overall accuracy of SVM-BA was 93.75%, which is higher than that of the modified IEC code (88.75%).