The Degree of Polymerization in a Prediction Model of Insulating Paper and the Remaining Life of Power Transformers

Abstract: The aging of power transformers causes several defects and damage in the insulating system, especially in the insulating paper. The degradation of the insulating paper generates dissolved gases in the insulating oil, which are measured by gas chromatography and used as an indicator of the insulation status. The state of the insulating paper can be identified from the degree of polymerization (DP) measurement. When measuring DP is difficult, it can be estimated from other test parameters, such as the dissolved gases (DGA), breakdown voltage (BDV), oil interfacial tension (IF), oil acidity (ACI), moisture content (MC), oil color (OC), dielectric loss (Tan δ), and furan concentration, specifically 2-furfuraldehyde (FA). Statistical tools (correlation and multiple linear regression), applied to 131 transformer samples, can be used to build a relation linking DP to one or more of these parameters in order to identify the insulating paper status and the percentage of remaining life of the transformer. The results indicated that it is difficult to build a mathematical model relating DP to the test variables, except for FA, where the trend of DP with FA is more obvious than with the other variables. An empirical formula to compute DP from the FA concentration was developed and gave promising results for computing DP and the remaining life of power transformers.


Introduction
A power transformer is a crucial piece of equipment for the continuous operation of power systems. Keeping a power transformer in continuous operation is a major concern for electrical utilities because any malfunction of a transformer results in a revenue loss. Therefore, monitoring, inspection, and periodic maintenance help to avoid undesired outages of a transformer from service. Most transformer faults can be attributed to failure of or damage to the insulation system, which consists of insulating oil and paper [1][2][3][4].
The degradation of the transformer insulation (oil and paper) accelerates at higher temperatures and in the presence of oxygen and moisture. A transformer's "insulation age" can be assessed through diagnostic techniques such as dissolved gas analysis (DGA) [5], the degree of polymerization (DP) inferred from the furan compound concentration, which indicates the insulating paper condition, or gel permeation chromatography, which gives the change in the molecular weight distribution of the insulating paper [6]. The insulating paper is a sheet of cellulose, which is a linear polymer. The DP is defined as the number of monomer units in the polymer; it can be measured using a viscosity method and is then expressed as DP_V. The DP_V of new kraft paper ranges between 1000 and 1500, but kraft paper aged under high temperature and water content becomes brittle and its DP_V falls to 200 or 250. The mechanical strength of the kraft paper decreases by 20% of its initial strength, which marks the end of life of the insulating paper. The literature indicates that the end-of-life criterion of the insulating paper is a DP_V of 200-250. The remaining paper insulation life can be predicted through the thermal endurance curve [7,8].
Measuring the furan compounds dissolved in the insulating oil is a prominent way to interpret the degradation of the insulating paper in terms of the degree of polymerization (DP) [9]. The cellulose that constitutes the insulating paper degrades under electrical and thermal stresses into five furan compounds, i.e., 5-hydroxymethyl-2-furfuraldehyde (HOM), furfuryl alcohol (FOL), 2-furfuraldehyde (FA), 2-acetylfuran (AF), and 5-methyl-2-furfural (MF). FA is the most important insulation aging indicator of the five furan products, owing to its stability at high temperature and its relative affinity (partition coefficient), which is the ratio of the furan product in the oil to that in the insulating paper. Experimental results show that the relative affinity of FA is temperature-dependent [7][8][9][10]. Experimental work carried out on insulating paper indicated that thermal aging of the paper increases the concentration of FA in the insulating oil [11].
In [12], a statistical analysis was presented for the furan analysis of an above-500 kV transformer, together with the potential causes of the individual furans. The correlation study indicated a significant effect of acid number, moisture, and carbon monoxide on the furan products. Several equations have been developed to describe the relation between the furan content and DP and its role in computing the remaining life of the transformer [13]. The statistical analysis indicated that the appearance of FA in the oil is the result of paper degradation by moisture and acids. 5-hydroxymethyl-2-furfuraldehyde (HOM) may be a product of thermal overheating of the insulating paper because it correlates well with carbon monoxide (CO) and carbon dioxide (CO₂). HOM is related to oxidation, but the correlation coefficient between HOM and oxygen (O₂) is very weak in that study (−0.08), which indicates no linear relation between the two parameters. 2-acetylfuran (AF), which is reasonably rated in the oil samples, correlates well with the acid number, CO, and CO₂, which suggests that it is produced by overheating of the paper or by acids. The statistical analysis also showed that oil acidity (ACI), moisture content (MC), CO, and CO₂ have the highest correlation coefficients with the total furans. Therefore, they were used as the predictor parameters in a multiple linear regression model to predict the total furans. The t-statistic and p-value results of this model showed that CO₂ is not one of the best predictors. The multiple correlation coefficient of the linear regression model with the three remaining predictor parameters is 0.51, while the adjusted correlation coefficient is only 0.26, which shows that these three parameters are not the only ones that influence the total furans. According to the relation of CO and CO₂ with the total furans, the temperature of the transformer may be a variable that affects the total furans and must be considered in the linear regression model.
In the current work, 131 transformer test cases were used to identify the variables that influence the DP computation and the lifespan of the transformer. The test data for the 131 transformer oil samples include: (1) the dissolved gases, i.e., the concentrations of hydrogen (H₂), methane (CH₄), ethane (C₂H₆), ethylene (C₂H₄), acetylene (C₂H₂), carbon monoxide (CO), and carbon dioxide (CO₂); (2) the oil characteristics, i.e., the breakdown voltage of the oil (BDV), acidity (ACI), interfacial tension (IF), moisture content (MC), oil color (OC), and dielectric loss (Tan δ); and (3) the 2-furfuraldehyde (FA) concentration, which is the main product of paper degradation. The oils of these transformers were tested in the period between 1 March 2019 and 17 June 2019. Correlation and multiple linear regression are the statistical tools used to establish an empirical formula linking these variables with the DP and the percentage of the remaining life of the transformer.

Experimental Work
The measurement procedures for all investigated parameters are not presented here because most of them are described in the literature; however, given the importance of 2-furfuraldehyde (FA), the most significant parameter in this study, the procedure for measuring FA according to ASTM D5837 is explained.
Collecting insulating paper samples from a transformer in service is impractical, so furan compound analysis based on ASTM D5837 [14] is an alternative way to explore the paper condition and then assess the remaining life of the transformer. High performance liquid chromatography (HPLC) with a suitable analytical column and UV detector can be used to measure the concentration of the furans dissolved in the insulating oil due to the degradation of the insulating paper. The experiments to measure the furan compounds were conducted in the Central Laboratories Section, Field Operation & Support Service, Asset Maintenance Department, Jedda, Saudi Arabia, according to ASTM D5837 by experts in the chemical central laboratory. The furan compounds obtained from the test are 5-hydroxymethyl-2-furfuraldehyde (HOM), furfuryl alcohol (FOL), 2-furfuraldehyde (FA), 2-acetylfuran (AF), and 5-methyl-2-furfural (MF). The samples were collected from the transformers during periodic inspection, or when faults occurred or were expected, and the aging process of the transformer oil was accelerated. The extraction was performed according to ASTM D5837 as follows: (i) the extraction solvent (1-2 mL) was added to 10 mL of the test oil in a test tube, which was securely capped; (ii) the test sample was mixed for 3 min using a vortex mixer; and (iii) the test sample was left in a closed cabinet for 1 h to let the two phases separate, where the top phase was the extract and the bottom phase was the nonpolar portion of the sample. The extracted sample was then injected into the HPLC.

Correlation and Multilinear Regression
This section explains correlation and multiple linear regression, the two statistical tools on which this work relies.

Correlation
Many practical applications involve two or more variables, and it is necessary to know whether there is a relationship between these variables, what form this relationship takes, and how to predict one of the variables when another is known [12,15,16]. Correlation indicates the nature and strength of the relationship between the variables, or the lack of one. The correlation coefficient is an indicator of this relationship, and the first step in determining the relationship is to draw the scatter plot of the variables. One of the variables is considered the dependent variable, and the other variable or variables are the independent variables. The correlation coefficient takes a positive, negative, or zero value: a positive coefficient means that the dependent variable increases with the independent variable, a negative coefficient means that it decreases as the independent variable increases, and a zero coefficient means that there is no linear relation between the two variables. A strong correlation has a coefficient close to +1 or −1. Figures 1 and 2 illustrate the correlation between the parameter x on the x-axis and the parameter y on the y-axis, showing the positive and negative correlations and their forms. Figure 1a indicates a directly proportional (positive) correlation between x and y, while Figure 2a indicates an inversely proportional (negative) correlation. Figure 1b refers to a strong positive correlation and Figure 2b to a strong negative correlation. In addition, Figures 1c and 2c illustrate the perfect positive and negative correlations, respectively. Table 1 depicts the correlation types, the direction of the relationship, and its form [12].
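As a minimal sketch of the correlation coefficient described above, the following Python function computes Pearson's coefficient from first principles. The function name `pearson_r` and the sample values are illustrative assumptions, not data from this study; the two test series are constructed to rise and fall exactly linearly, so they correspond to the perfect positive and negative cases.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-deviations and squared deviations; the 1/n factors cancel.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Made-up illustrative values, not measurements from the study.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y_up = [2.0, 4.0, 6.0, 8.0, 10.0]     # rises exactly linearly with x
y_down = [10.0, 8.0, 6.0, 4.0, 2.0]   # falls exactly linearly with x

print(pearson_r(x, y_up))
print(pearson_r(x, y_down))
```

Real measurement data would give values strictly between −1 and +1, and the sign alone gives the direction of the trend.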

Multilinear Regression
Regression is a method by which one can estimate the value of one variable from the value of another [17,18]. Regression types are categorized into simple linear regression, multiple linear regression, and nonlinear regression. In simple linear regression, the dependent variable y relies on one independent variable x and the relationship between them is linear. If the dependent variable y depends on more than one variable, the method is called multiple linear regression. In addition, the relationship between the dependent variable y and the independent variables may be, in some cases, nonlinear.
Multiple linear regression is a statistical method for quantifying the relationships between phenomena. It yields a mathematical equation that expresses the relationship between a dependent variable and several independent variables and is used to estimate past values and predict future values. Thus, multiple linear regression is used to explain the relationship between a continuous dependent variable and two or more independent variables [19]. The independent variables can be continuous or discontinuous. After the multiple regression equation is obtained, its coefficients must be statistically significant, and the significance is assessed for each variable. The t-test, the corresponding significance level, and the p-value are used to judge the significance of the regression coefficients. Other statistical parameters indicate the overall significance of the model, including the correlation coefficient (R) and the determination coefficient (R²). The first, R, is the simple correlation coefficient, which measures the strength of the relationship between two or more variables, while the second, R², is the determination coefficient, which gives the explanatory power of the estimated model (the estimated equation) in the case of simple linear regression (one independent variable with one dependent variable). In this study, multiple linear regression is used to investigate the relation between the DP and other physical and chemical characteristics of the insulating oil and paper, such as the dissolved gases, BDV, IF, ACI, MC, OC, FA, and Tan δ.
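The fitting step of multiple linear regression can be sketched in plain Python by solving the least-squares normal equations. This is only an illustration under stated assumptions: the helper names `solve_linear_system` and `fit_multiple_linear` are hypothetical, the data are synthetic and noise-free, and the study itself does not say which software was used for the regression.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations act on both.
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: predictors may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x

def fit_multiple_linear(rows, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bp*xp via the normal equations."""
    design = [[1.0] + list(r) for r in rows]  # leading 1.0 is the intercept column
    p = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)

# Synthetic, noise-free data generated from y = 5 + 2*x1 - 3*x2.
rows = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
y = [5 + 2 * x1 - 3 * x2 for x1, x2 in rows]
coeffs = fit_multiple_linear(rows, y)
print([round(c, 6) for c in coeffs])  # intercept first, then one slope per predictor
```

Because the synthetic data contain no noise, the fit should recover the generating coefficients up to rounding. With measured data the coefficients would additionally need the significance checks described in the text.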

Statistical Parameters
R-squared (R²) is the coefficient of determination, a statistical measure of how much of the change in one variable is explained by another [12]. It indicates the strength of the relationship between two or more variables and can be computed as follows [20]:
$$R^2 = 1 - \frac{\sum_i (X_i - \hat{X}_i)^2}{\sum_i (X_i - \bar{X})^2} \quad (1)$$
where $\hat{X}$ is the estimated X and $\bar{X}$ refers to the mean value of X. The correlation coefficient R is a statistical parameter that measures the strength of the relationship between two variables; its most common form is Pearson's coefficient. R can be computed as follows [15]:
$$R = \frac{\mathrm{Cov}(X, \hat{X})}{\sqrt{\mathrm{Var}(X)\,\mathrm{Var}(\hat{X})}} \quad (2)$$
where $\mathrm{Cov}(X, \hat{X})$ refers to the covariance between X and estimated X, $\mathrm{Var}(X)$ is the variance of X, and $\mathrm{Var}(\hat{X})$ is the variance of estimated X. A p-value is a statistical measure used to decide whether the null hypothesis is rejected, based on a significance level (α). The significance level is commonly taken as 0.05: when the p-value of a term is less than 0.05, the null hypothesis is rejected and the alternative hypothesis is accepted; otherwise, the null hypothesis is not rejected [21].
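A short sketch of these two checks follows. It assumes the usual residual-based definition of R² (one minus the ratio of the residual sum of squares to the total sum of squares) and the 0.05 significance level named in the text; the function names and the actual/estimated pairs are illustrative, while the four p-values are the first-round values quoted in the Statistical Results section for IF, ACI, OC, and FA.

```python
def r_squared(actual, estimated):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when the actual values are constant")
    return 1.0 - ss_res / ss_tot

def is_significant(p_value, alpha=0.05):
    """Reject the null hypothesis when the p-value is below alpha."""
    return p_value < alpha

# Made-up actual/estimated pairs, for illustration only.
actual = [900.0, 700.0, 500.0, 300.0]
estimated = [880.0, 720.0, 510.0, 290.0]
print(round(r_squared(actual, estimated), 4))

# First-round p-values quoted in the text for IF, ACI, OC, and FA.
p_values = {"IF": 0.001, "ACI": 0.029, "OC": 0.009, "FA": 0.001}
print({name: is_significant(p) for name, p in p_values.items()})
```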
The t-statistic (t stat) is used to confirm the significance of the equation terms by comparison with the t-critical value. It can be computed based on Equation (3):
$$t_{\mathrm{stat}} = \frac{\alpha_i}{SE_i} \quad (3)$$
where $\alpha_i$ refers to the coefficient of the i-th equation term and $SE_i$ is its standard error, Equation (4), which is computed from the residuals $\varepsilon_i$ (the differences between the actual X and the estimated X), the number of observations N, and the number of equation terms P, with (N − P) degrees of freedom. The t-critical value is determined by the degrees of freedom and the risk interval limit (β): $t_{\mathrm{critical}} = t(\beta, N - P)$.

Statistical Results and Discussions
Correlation and multiple linear regression are applied to the test results of 131 transformers, collected from the chemical laboratory of the Saudi Electricity Company in Jedda between 1 March 2019 and 17 June 2019, to investigate the relation between H₂, CH₄, C₂H₆, C₂H₄, C₂H₂, CO, CO₂, CO₂/CO, BDV, ACI, IF, MC, OC, FA, Tan δ, and DP. The remaining life of the transformer is also determined based on the value of DP. Table 2 gives the correlation between these variables; the results indicate good correlation between DP and IF, ACI, MC, OC, FA, and Tan δ, with correlation coefficients of 0.789, −0.637, −0.544, −0.76, −0.679, and −0.459, respectively. IF has a positive correlation coefficient with DP, whereas ACI, MC, OC, FA, and Tan δ have negative correlation coefficients. Multiple linear regression is then applied to identify the significant variables that influence the DP magnitude. Table 3 shows the statistical parameters used to test the significance of H₂, CH₄, C₂H₆, C₂H₄, C₂H₂, CO, CO₂, CO₂/CO, BDV, ACI, IF, MC, OC, FA, and Tan δ for DP based on the p-value. If the p-value of a variable is less than 0.05, the null hypothesis is rejected and the alternative hypothesis is accepted, i.e., there is a true relationship between that variable and the dependent variable (DP in this case). Based on the p-values in Table 3, four factors have a significant effect on DP, namely IF, ACI, OC, and FA. The highlighted cells show that the p-values of IF, ACI, OC, and FA are 0.001, 0.029, 0.009, and 0.001, respectively, each of which is less than the significance limit (0.05).
Therefore, these four variables are retained for a second round of correlation and multiple linear regression, after removing the other variables, to determine which variables actually affect DP. Table 4 shows the correlation coefficients between IF, ACI, OC, and FA and DP. As in Table 2, the correlation coefficients are 0.789, −0.637, −0.76, and −0.679, respectively.
The p-value is then tested again for these four variables to investigate their significance for DP. Table 5 gives the p-value of each variable with DP; all p-values remain below the significance limit (0.05). The p-values of IF and FA decrease markedly to 1.05 × 10⁻⁶ and 8.84 × 10⁻⁷, respectively, while those of ACI and OC are 0.034 and 0.009. Therefore, the results confirm that these variables influence the DP magnitude and hence the lifespan of power transformers.
Based on the coefficient values of each term in Table 5, the DP prediction model, Equation (5), is constructed. Some of the recorded data may be wrong, and there is no obvious trend of DP with some variables, which causes the errors of Equation (5) to be large in some cases. Therefore, the next part discusses the relation between each of the terms in Equation (5) and DP to decide which term is appropriate for predicting DP.
The relation between each variable and DP is investigated based on the test results of each sample to identify the variable that is suitable for constructing the empirical formula for DP. Trend lines are fitted to the data of each variable (IF, ACI, OC, and FA) against DP, and their correlation coefficients are used to study the variation of DP with each variable. The relation between the IF results and the corresponding DP values is illustrated in Figure 3. The polynomial fitting gave the best correlation coefficient (R²) of 0.6817 for the relation between DP on the y axis and IF on the x axis, which is expressed by Equation (6). The positive correlation indicates that an increase in IF leads to an increase in the DP value. Although the R² is moderate, the error between the fitting line and the measured DP is large; moreover, there is no obvious relationship between DP and IF, so Equation (6) cannot be used to develop a good simulation model between the two variables.
The worst correlation coefficient is for the acidity (ACI), where the R² is 0.5695 with power fitting. The correlation is negative, i.e., DP is inversely related to ACI. The equation of the fitting line is as follows:
$$DP = 148.79 \times (ACI)^{-0.412} \quad (7)$$
The fitted DP line deviates widely from the actual DP, as shown in Figure 4. Hence, the variation of DP with the ACI data is not suitable for building a specific relation between them. Furthermore, the variation of DP with the oil color (OC) cannot be used to establish a relation between them. The correlation coefficient (R²) of DP with OC is 0.6341, and the exponential fitting line expressing this relation, Equation (8), is illustrated in Figure 5. The relation between DP and OC is not a good one because there is no obvious trend of DP with the change of OC; e.g., a measured oil color of 4.9 corresponds to two DP values (127 and 624). Therefore, the empirical formula between DP and OC cannot express the correct value of DP.
Furfuraldehyde (FA) is the fourth variable that shows a considerable correlation based on the statistical analysis in Table 2. The relation between DP and FA is shown in Figure 6. The correlation coefficient (R²) is as high as 0.9959. Therefore, the empirical formula to compute DP can be constructed based on the variation of FA. The formula is developed as a logarithmic fitting as follows:
$$DP = -122.6 \ln(FA) + 1294.4 \quad (9)$$
FA is a suitable variable because the relation between DP and FA has a specific trend in which an increase of FA results in a decrease in DP.
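Equation (9) can be evaluated directly, as in the following sketch. The function name `dp_from_fa` and the sample FA values are illustrative assumptions; this section does not state the concentration unit of FA, so the input must be expressed in the same unit as the data used for the fit.

```python
import math

def dp_from_fa(fa):
    """Equation (9): DP = -122.6 * ln(FA) + 1294.4."""
    if fa <= 0:
        raise ValueError("FA concentration must be positive to take its logarithm")
    return -122.6 * math.log(fa) + 1294.4

# Hypothetical FA concentrations (unit as used in the original fit), for illustration.
for fa in (10.0, 100.0, 1000.0):
    print(fa, round(dp_from_fa(fa), 1))
```

Because the fit is logarithmic, each tenfold increase in FA lowers the estimated DP by the same amount, 122.6 × ln(10), which is about 282.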
The percentage of remaining life of the insulating paper is expressed based on the McNutt formula in [22]. It computes the remaining life as a function of the DP value as follows:
$$\%\ \text{remaining life} = 166.1 \times \log_{10}(DP) - 382.2 \quad (10)$$
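A minimal sketch of the remaining-life computation in Equation (10) is given below; the function name `remaining_life_percent` is an illustrative assumption. The formula is returned unclamped, so it can exceed 100% for high DP values or go negative for DP below about 200, and any limiting to the 0-100% range is left to the caller.

```python
import math

def remaining_life_percent(dp):
    """Equation (10): % remaining life = 166.1 * log10(DP) - 382.2."""
    if dp <= 0:
        raise ValueError("DP must be positive to take its logarithm")
    return 166.1 * math.log10(dp) - 382.2

# DP = 200 is the end-of-life criterion discussed in the Introduction.
for dp in (200.0, 400.0, 800.0):
    print(dp, round(remaining_life_percent(dp), 1))
```

With these coefficients, a DP of 200 maps to roughly 0% remaining life and a DP of 800 to roughly 100%, which is consistent with the end-of-life criterion of DP 200 stated earlier.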
The formula in Equation (9) for estimating the DP value can be compared with other expressions in the literature. Table 6 lists the DP formulas used in the comparison to validate the proposed empirical formula of Equation (9). The comparison is based on the percentage error between the actual measured DP of the sample and the DP estimated through the different empirical expressions. Table 7 shows the error percentages for 21 cases using Equation (9) and the other DP expressions in the literature. Only one case (case 3) of the DP estimated by Equation (9) gave a high percentage error, 15.41%; all other cases from ref. [23] were below 10%, and the cases from ref. [24] were all below 2% error. The errors of the Chendong empirical formula for DP computation [3,25] are less than 10%, except for four cases with errors higher than 10% (cases 3, 4, 5, and 6). The other empirical DP formulas gave higher error percentages for most of the cases under test.
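The percentage-error comparison can be sketched as follows. The error is assumed here to be the absolute difference relative to the measured DP, since the text does not spell out the exact definition; the function names and the measured/FA pairs are hypothetical and are not rows of Table 7.

```python
import math

def dp_from_fa(fa):
    """Equation (9): DP = -122.6 * ln(FA) + 1294.4."""
    return -122.6 * math.log(fa) + 1294.4

def percent_error(measured, estimated):
    """Absolute error of the estimate as a percentage of the measured value."""
    if measured == 0:
        raise ValueError("measured value must be non-zero")
    return abs(measured - estimated) / abs(measured) * 100.0

# Hypothetical (measured DP, FA concentration) pairs, for illustration only.
cases = [(1000.0, 10.0), (750.0, 100.0), (450.0, 1000.0)]
for measured_dp, fa in cases:
    estimated_dp = dp_from_fa(fa)
    print(measured_dp, round(estimated_dp, 1),
          round(percent_error(measured_dp, estimated_dp), 2))
```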
After checking the accuracy of Equation (9) in computing the DP, the remaining life of the insulating paper, and hence of the power transformer, was determined based on Equation (10). Table 8 gives the percentage of remaining life of the 21 cases in Table 7. The results of Table 8 indicate that the percentage remaining life of the transformer decreases as the FA concentration increases and increases with DP. An increase of the FA concentration leads to a reduction of DP and hence of the transformer's remaining life. The remaining life of the power transformer is very high with new insulating paper, whose DP is higher than 900; for aged paper with a DP of 200, the remaining life decreases dramatically, which indicates a great risk to the transformer condition.

Conclusions
Inspection of the insulating paper in a power transformer is very difficult when the transformer is in service; therefore, electrical and chemical tests were carried out to check the insulating paper condition. The DP was used as a key indicator of the insulating paper condition and was related to the measurement of several variables, such as the dissolved gas concentrations, BDV, IF, ACI, MC, OC, Tan δ, and FA concentration. Statistical analysis via correlation and multiple linear regression was carried out to construct a prediction equation relating DP to these variables. The statistical analysis indicated a high correlation between DP and IF, ACI, OC, and FA. The relations of IF, ACI, OC, and FA with DP were investigated individually to study how DP varies with each variable. The results confirmed that the suitable variable, which provided a good indicator of DP, was FA; an empirical formula based on the measurement data was then developed. The comparison between the constructed empirical formula and other DP formulas in the literature showed that Equation (9) is a good indicator of the DP magnitude and hence of the percentage remaining life of power transformers. The error percentage of Equation (9) for computing DP was about 15% in only one investigated test case (case 3), as in Table 7; three cases, i.e., cases 4, 5, and 6, showed error percentages of 8.11, 9.26, and 8.81, respectively, while in all other cases the error percentages were less than 2%. The percentage errors of Equation (9) were lower than those of all the other empirical formulas presented in the literature, as shown in Table 7. The results of Equation (9), which computes DP, are close to the results of the Chendong equation, as shown in Table 7.
Future work will investigate the use of unsupervised learning and clustering techniques to classify the insulating paper condition and then determine the remaining life of a power transformer.