1. Introduction
Power and distribution transformers are among the most significant and expensive assets in any power system grid. Internal faults in the transformer, such as partial discharge (PD) or overloading, may lead to insulation deterioration and eventually to complete failure of the transformer. Such failures cause catastrophic transformer outages, which incur both direct and indirect costs. Hence, assessing the transformer’s health condition and continuously monitoring the insulation system help ensure satisfactory performance, maintain efficiency, and prolong the transformer’s lifetime.
Together, the oil and insulation paper constitute the transformer’s insulation system, which has two important functions [1]: to insulate the high-voltage parts from ground and to act as a coolant that dissipates the generated heat efficiently. The overall health condition of a transformer depends largely on the state of its oil and paper insulation system [2]. Ageing of the transformer oil, which is a natural process in any insulation system, results in the formation of sludge particles, which in turn damage the properties of other insulation components such as the cellulose paper in the transformer winding. Therefore, it is critical to monitor the transformer oil quality by regularly inspecting samples using different electrical, physical and chemical methods.
Several quantities can be measured to quantify the ageing condition of transformer oil. They fall into three categories: dissolved gas analysis (DGA), furan content and oil tests. DGA is conducted mainly to detect the emergence of different faults inside the transformer winding, such as arcing or PD activity. Furan content, on the other hand, is measured to estimate the health condition of the transformer paper insulation. Finally, oil tests reveal information about the electrical, physical and chemical condition of the transformer oil; they include water content, breakdown voltage (BDV), interfacial tension (IFT), dissipation factor (DF), color and acidity [3]. Conducting such tests routinely adds to the overall maintenance cost of the transformer. The cost of testing an oil sample varies from one country to another; for example, testing one oil sample (BDV, acidity, water content and IFT) in Dubai costs around USD 1500 [4]. Thus, instead of measuring these parameters, it is more economical to predict their values. This is particularly so given recent advances in machine learning (ML) algorithms, which have proven effective in many applications. Among all oil tests, the IFT test, conducted as per the ASTM D971 standard, has the highest cost and requires specific expertise and specialized instruments [2].
The IFT of mineral oil is related to the ageing of the oil sample. Mineral oil is essentially a non-polar saturated hydrocarbon fluid; when it undergoes oxidative degradation, oxygenated species such as carboxylic acids are formed, which are hydrophilic in nature. The presence of these hydrophilic components in the transformer oil can influence the chemical (acidity), electrical (BDV), and physical (IFT) properties of the oil sample. The IFT is measured as the tension at the interface between the oil sample and water, which is highly polar. The more similar the two liquids (oil and water) are in polarity, the lower the tension between them. Thus, the higher the concentration of hydrophilic materials in the oil sample, the lower the interfacial tension of the oil measured against water. The magnitude of the IFT is therefore inversely related to the concentration of the hydrophilic degradation products that result from the ageing of the oil. Since hydrophilic materials are usually highly polar and thus not very soluble in non-polar oil, the presence of these species can result in sludge formation, which in turn contributes to further degradation of the transformer insulation system [5].
Recently, the application of machine learning to transformer assessment has become more widespread. Most of the reported studies have concentrated on predicting the transformer health index (HI), a calculated number that estimates the health condition of oil-filled transformers [6]. In [7], a fuzzy logic-based approach was used to predict the HI value using the oil quality, dissolved gas and furan content parameters as inputs. The reported classification success rate was 97% for a three-class classification system. Moreover, in [8], an artificial neural network (ANN) approach was proposed to classify the condition of the transformer based on the predicted HI value. The input features used in this model were oil test parameters, DGA and furan content. In testing, 97% of the testing samples were correctly classified in a three-class condition problem. To further enhance the HI calculation, a reduced model was implemented in [9], which found that an HI with relatively high accuracy can be achieved with few tests.
Few studies have been conducted to estimate transformer oil characteristics such as water content and breakdown voltage [10,11,12]. A cascaded ANN was used to predict transformer oil parameters using the Megger test [10]. An ANN with stepwise regression was also implemented to predict the transformer furan content [11]. These studies were conducted on only a moderate number of transformers, which makes it hard to generalize their conclusions. A polynomial regression model has been developed to predict the breakdown voltage as a function of the transformer service period and other oil testing parameters such as total acidity and water content. Except for a few cases, the percentage error between the actual and predicted values of transformer breakdown voltage was less than 10% [12]. However, the model needs the water content and total acidity as inputs to predict the breakdown voltage. Hence, while this model saves the cost of conducting the breakdown voltage test, two other oil tests must still be conducted. Moreover, the values of the water content and total acidity need to be collected at different time intervals to formulate the mathematical model and predict the value of the transformer oil breakdown voltage, which adds to the overall transformer oil maintenance cost.
In this paper, the authors investigate the ability of ensemble methods to predict the class of IFT. An ensemble method is a learning technique that combines several base models in order to produce one optimal predictive model [13]. The usual aim in a learning problem is to find a single model that best predicts the output. Instead of depending on only one model and hoping that it is the most accurate one available, ensemble methods take many models into account and combine them to produce one final model. In our problem, we use two layers of these ensemble models with soft voting. The concept behind a voting classifier is to combine different machine learning classifiers and use a voting criterion to predict the class label [13]. Such a classifier can balance out the individual weaknesses of the classifiers involved. There are two types of voting classifiers: (i) majority/hard voting and (ii) soft voting. The former uses the mode of the class labels predicted by the individual classifiers, while the latter returns the class label as the argmax of the sum of predicted probabilities. In other words, in soft voting each classifier can be assigned a weight, and the class label with the maximum weighted average probability is selected as the output class label.
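The difference between the two voting rules can be illustrated with a minimal NumPy sketch. This is not the implementation used in this work: the function names `hard_vote` and `soft_vote`, the three-class setup and the probability values are hypothetical and chosen only to show a case in which the two rules disagree.

```python
import numpy as np

def hard_vote(probas):
    """Majority voting: each classifier votes for its most probable class."""
    probas = np.asarray(probas, dtype=float)   # (n_classifiers, n_samples, n_classes)
    votes = probas.argmax(axis=2)              # class label chosen by each classifier
    n_classes = probas.shape[2]
    # One-hot encode the votes and count them per class for every sample.
    counts = np.eye(n_classes)[votes].sum(axis=0)   # (n_samples, n_classes)
    return counts.argmax(axis=1)               # ties resolve to the lowest class index

def soft_vote(probas, weights=None):
    """Soft voting: argmax of the (optionally weighted) average probability."""
    probas = np.asarray(probas, dtype=float)
    avg = np.average(probas, axis=0, weights=weights)  # (n_samples, n_classes)
    return avg.argmax(axis=1)

# Made-up class probabilities from three classifiers for a single oil sample.
# Two classifiers weakly prefer class 0; one strongly prefers class 1.
probas = [
    [[0.55, 0.45, 0.00]],
    [[0.51, 0.49, 0.00]],
    [[0.05, 0.90, 0.05]],
]

print("hard voting:", hard_vote(probas))   # class 0 wins with two of three votes
print("soft voting:", soft_vote(probas))   # class 1 wins on average probability
```

In this example, hard voting follows the two weakly confident classifiers, whereas soft voting lets the single highly confident classifier dominate. Passing `weights` to `soft_vote` shifts that balance toward the classifiers that are trusted more.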