Automatic Detection of Inter-Turn Short-Circuit in Dry-Type Transformers Through the Analysis of Leakage Flux Components

Cruz-Ramírez, Daniel; Zamudio-Ramírez, Israel; Dunai, Larisa; Antonino-Daviu, Jose Alfonso

doi:10.3390/app16073505


by Daniel Cruz-Ramírez ¹, Israel Zamudio-Ramírez ¹, Larisa Dunai ² and Jose Alfonso Antonino-Daviu ³,*

¹ Engineering Faculty, San Juan del Río Campus, Universidad Autónoma de Querétaro, Av. Río Moctezuma 249, San Juan del Río 76807, Querétaro, Mexico

² Department of Graphic Engineering, Universitat Politècnica de València (UPV), Camino de Vera s/n, 46022 Valencia, Spain

³ Instituto Tecnológico de la Energía, Universitat Politècnica de València (UPV), Camino de Vera s/n, 46022 Valencia, Spain

* Author to whom correspondence should be addressed.

Appl. Sci. 2026, 16(7), 3505; https://doi.org/10.3390/app16073505

Submission received: 9 March 2026 / Revised: 28 March 2026 / Accepted: 31 March 2026 / Published: 3 April 2026

(This article belongs to the Special Issue Reliability and Fault Tolerant Control of Electric Machines)


Abstract

Dry-type electrical transformers are essential components in commercial, industrial, and residential power distribution systems, as they supply the voltage levels required by a broad range of load types. Although they are robustly constructed, they are exposed to adverse operational and environmental conditions, such as dust, humidity, and electrical disturbances, that may cause premature winding damage, including inter-turn short circuits. This study focuses on the detection of inter-turn short-circuit faults in a 15 kVA commercial dry-type transformer, where a fault equivalent to 11.54% of short-circuited turns was induced in the tap changers. Axial, radial, and rotational leakage magnetic flux signals were captured using a low-cost, non-invasive triaxial Hall-effect magnetic flux sensor. During data processing, Fisher Score feature selection was applied to identify the most relevant indicators. Subsequently, feature extraction techniques, including Linear Discriminant Analysis, Principal Component Analysis (PCA), Uniform Manifold Approximation and Projection, and Isometric Mapping, were evaluated. The technique that best preserved global and local data structures was selected using Trustworthiness, Spearman’s correlation, and Kruskal’s stress metrics. PCA was selected as the optimal technique based on these quality metrics, and it also achieved the highest classification performance. The resulting subspace data were classified using support vector machines with K-fold cross-validation. The proposed system achieved classification accuracies above 95%, with high recall and F1-score values, confirming its effectiveness for reliable inter-turn fault detection in each transformer winding.

Keywords:

leakage magnetic flux; inter-turn short-circuit; feature selection; feature extraction; support vector machine

1. Introduction

Electrical transformers are vital equipment for the supply and distribution of electrical energy, as they provide the voltage levels required by numerous applications [1]. They are robust, reliable, and cost-effective machines, but unexpected failures still occur due to operating and environmental conditions. Studies show that 48% of recorded failures correspond to winding damage [2]. Between 60 and 70% of these cases originate from incipient faults such as turn-to-turn short circuits, which gradually damage the insulation through the increased local current in the short-circuited turns and can evolve into phase-to-phase short circuits or ground faults [3].

There are approved methods for detecting faults in transformers, some of which are described in IEEE Std C57.152. However, they cannot detect incipient faults such as a short circuit between turns [4]. For this reason, several studies have sought to detect this type of fault at an early stage by measuring different physical variables, including current analysis [5], oil analysis [6], infrared thermography [7], and vibration analysis [8,9,10]. The analysis of stray magnetic flux, in particular, has generated great interest, since the presence of an inter-turn short circuit is directly linked to variations in the magnetic flux of a transformer, making this technique a promising alternative for the detection of incipient faults.

Among the studies focused on the diagnosis of inter-turn short circuits in transformers through leakage magnetic flux analysis, the pioneering studies of [11,12] analyzed this fault using leakage magnetic flux signals in a three-phase transformer. Coil-type sensors were arranged along the axial axis of the core, demonstrating that the leakage magnetic flux maintains symmetry along the mid-height of the core under normal operating conditions. However, this symmetry is immediately lost when an inter-turn short circuit occurs.

Similarly, refs. [3,13,14] used flat coil-type sensors placed between the ferromagnetic core and the windings to measure mutual flux and leakage flux between the windings and the core. The asymmetry caused by an inter-turn short circuit in a three-phase transformer enabled fault localization in each winding, even when the number of shorted turns was less than 1% of the total. In a more recent work, ref. [15] simulated the effect of electromagnetic forces on the windings and their mechanical response using a finite element model of a real transformer to reproduce the evolution of an inter-turn short-circuit fault. Their results demonstrated that insulation aging progressively increases due to the fault current, leading to core saturation and distortion of the leakage magnetic flux field.

On the other hand, ref. [16] proposed an active protection system based on leakage magnetic flux measurement using both invasive and non-invasive fiber optic sensors. Symmetry criteria of the leakage magnetic flux were established, enabling the use of amplitude- and phase-differential protection schemes to detect incipient winding faults. The system was experimentally validated and implemented on-site. In the same year, ref. [17] simulated a single-phase transformer as an experimental method to reproduce inter-turn short circuits, demonstrating electrical and magnetic equivalence under different fault positions and numbers of shorted turns, with an average error below 5%. Finally, ref. [18] employed an optimized array of 20 Hall-effect sensors for non-invasive inter-turn short-circuit detection, reinforcing the previously established criterion of symmetry loss in the leakage magnetic flux as a fault indicator. Automatic classification using support vector machines was performed, achieving an accuracy of 92.3% under no-load conditions and 97.4% under rated load.

These studies demonstrate that leakage magnetic flux is a reliable tool for diagnosing inter-turn short circuits in transformers. They have evidenced significant changes in the leakage magnetic flux signals measured in the time domain, especially in parameters such as amplitude, symmetry, and waveform shape. In the diagnosis of faults in electric motors [19] and transformers [9], comparable variations in signals such as current and vibration have been quantified with statistical indicators, which characterize the changes, establish trends associated with the presence of faults, and provide relevant inputs for fault classification. Nevertheless, further research is still required to consolidate a more robust and standardized methodology, capable of systematizing the analysis of these signals and facilitating the implementation of diagnostic techniques based on leakage magnetic flux.

In this work, a methodology for detecting inter-turn short circuits is proposed. It is based on the analysis of the axial, radial, and rotational components of the leakage magnetic flux using time-domain statistical indicators and feature reduction techniques that preserve the original data structure while remaining consistent with the physical behavior of the system under healthy and inter-turn short-circuit fault conditions. Finally, a support vector machine classification system with parameters optimized through genetic algorithms is implemented.

2. Theoretical Background

2.1. Leakage Magnetic Flux in Transformers

In power transformers, not all the magnetic flux produced by the primary winding links the secondary winding and vice versa. A portion of these magnetic flux lines leaves the core through the air without contributing to energy transfer; these losses are referred to as leakage magnetic flux [12]. This phenomenon depends on factors such as the reluctance of the magnetic circuit and the constructive configuration of the transformer. An inadequate design may increase this type of loss and affect the accuracy of the transformation ratio [20].

The leakage magnetic flux in a power transformer is a direct consequence of imperfect magnetic coupling between the primary and secondary windings. This leakage magnetic flux $\Phi_{L}$ is proportional to both the current flowing through the winding and the leakage inductance associated with the physical design of the transformer.

From electromagnetic theory, the inductance relates the current to the flux linkage $\lambda_{L}$, which can be expressed as follows:

$$\lambda_{L} = L_{L}\, i(t) \tag{1}$$

where $\lambda_{L}$ represents the leakage flux linkage, $L_{L}$ is the leakage inductance, and $i(t)$ is the instantaneous current flowing in the winding. The flux linkage is related to the magnetic flux through the number of turns of the winding according to $\lambda_{L} = N \Phi_{L}$. Therefore, the leakage magnetic flux can be expressed as in Equation (2):

$$\Phi_{L} = \frac{L_{L}\, i(t)}{N} \tag{2}$$

The leakage inductance, which represents the system’s ability to store magnetic energy along these leakage paths, directly depends on the geometric and constructive characteristics of the transformer. It can be calculated using Equation (3):

$$L_{L} = \mu_{0} \frac{N^{2} A_{L}}{l} \tag{3}$$

where $\mu_{0}$ represents the magnetic permeability of free space, $N$ is the number of turns in the winding, $A_{L}$ is the effective cross-sectional area through which the leakage flux circulates, and $l$ is the effective length of the leakage flux path [21].
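As a rough numerical illustration of Equations (2) and (3), the following sketch evaluates the leakage inductance and the resulting leakage flux. The number of turns, area, path length, and current are hypothetical values chosen only for illustration; they are not parameters of the transformer studied here.

```python
import math

MU_0 = 4e-7 * math.pi  # magnetic permeability of free space, H/m


def leakage_inductance(n_turns, area_m2, path_len_m):
    """Eq. (3): L_L = mu_0 * N^2 * A_L / l."""
    return MU_0 * n_turns**2 * area_m2 / path_len_m


def leakage_flux(l_leak, current_a, n_turns):
    """Eq. (2): Phi_L = L_L * i(t) / N."""
    return l_leak * current_a / n_turns


# Hypothetical values, for illustration only (not taken from the paper)
N, A_L, l = 200, 2.0e-3, 0.25        # turns, m^2, m
L_L = leakage_inductance(N, A_L, l)  # henries
phi = leakage_flux(L_L, 5.0, N)      # webers, for i(t) = 5 A
print(f"L_L = {L_L:.3e} H, Phi_L = {phi:.3e} Wb")
```

Because $L_{L}$ grows with $N^{2}$ while $\Phi_{L}$ divides by $N$, the leakage flux in this simplified model scales linearly with the number of turns for a fixed current.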

Figure 1 shows the equivalent circuit of a single-phase transformer under no-load conditions with an inter-turn short circuit. In this model, $R_{w}$ and $X_{w}$ represent the resistance and leakage reactance of the main winding, while $R_{c}$ and $X_{m}$ represent the core losses and magnetizing reactance. $R_{f}$ and $X_{f}$ represent the resistance and reactance associated with the short-circuited turns, which allow the fault current circulating within the winding itself to be modeled, and $V_{in}$ is the line voltage.

The inter-turn short-circuit fault in a power transformer is a type of internal fault that occurs when the insulation between two or more turns of the same winding is compromised, allowing direct electrical contact between them. This condition leads to the circulation of a localized short-circuit current (fault current) through the shorted turns, resulting in excessive heating, a local increase in leakage magnetic flux, and consequently a significant alteration in the magnetic field distribution inside the transformer [22].

This localized current does not necessarily immediately affect externally measured electrical parameters, such as voltage or line current, which makes early detection difficult using conventional techniques or standard protection devices. However, the cumulative effect of this fault may lead to progressive damage, such as deterioration of nearby insulation, winding deformation due to thermal and mechanical stresses, and even evolution toward more severe faults such as phase-to-ground or phase-to-phase short circuits, with the risk of total transformer failure.

The electrical behavior of a system with an inter-turn short circuit can be modeled using Equations (4) and (5), incorporating the variables associated with the fault. The following equations represent a coupled model of the primary winding and the shorted turns:

$$V = R\, i(t) + \frac{d}{dt}\left(L_{L}\, i(t)\right) + \frac{d}{dt}\left(M\, i_{f}\right) \tag{4}$$

$$0 = R_{f}\, i_{f} + \frac{d}{dt}\left(L_{f}\, i_{f}\right) + \frac{d}{dt}\left(M\, i(t)\right) \tag{5}$$

where $i(t)$ represents the main current in the winding, $i_{f}$ is the inter-turn short-circuit fault current, $L_{L}$ is the leakage inductance of the winding, $L_{f}$ is the inductance of the short-circuited turns, $M$ is the mutual inductance between the winding and the short-circuited turns, $R$ is the winding resistance, and $R_{f}$ is the resistance of the short-circuited turns.

Equation (5) shows that the inter-turn short circuit does not necessarily introduce an immediate change in the output voltage or total system current, which explains why these faults often go unnoticed in their early stages. However, the fault current $i_{f}$ generates a localized magnetic flux that may oppose the main magnetic flux, distorting its symmetry and introducing harmonic components or anomalous variations in the leakage flux, which can be detected through appropriate sensors or advanced diagnostic techniques [3].
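To make the coupled model of Equations (4) and (5) more concrete, the sketch below solves it in sinusoidal steady state, where each time derivative becomes a multiplication by $j\omega$. This assumes constant inductances and a purely sinusoidal supply, which the text does not state; the 60 Hz frequency and all circuit values are hypothetical and serve only as an illustration.

```python
import math


def solve_fault_phasors(v, r, l_leak, r_f, l_f, m, freq_hz=60.0):
    """Steady-state phasor solution of Eqs. (4)-(5).

    V = (R + jwL_L) I + jwM I_f
    0 = (R_f + jwL_f) I_f + jwM I
    """
    w = 2 * math.pi * freq_hz
    z1 = complex(r, w * l_leak)   # winding impedance
    zf = complex(r_f, w * l_f)    # impedance of the short-circuited turns
    zm = complex(0.0, w * m)      # mutual coupling term jwM
    i_main = v / (z1 - zm * zm / zf)
    i_fault = -zm * i_main / zf
    return i_main, i_fault


# Hypothetical circuit values (assumptions, not measured data)
V, R, L_LEAK, R_F, L_F, M = 440.0, 0.5, 10e-3, 1.0, 0.2e-3, 1e-3
i_main, i_fault = solve_fault_phasors(V, R, L_LEAK, R_F, L_F, M)
print(f"|I| = {abs(i_main):.2f} A, |I_f| = {abs(i_fault):.2f} A")
```

In this form the shorted turns appear in the main winding only through the reflected term $(\omega M)^{2} / (R_{f} + j\omega L_{f})$, which is one way to see why a weakly coupled fault may barely change the line current.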

According to standards such as IEEE C57.12.91-2020 on the field testing of transformers, this type of fault represents one of the main challenges in predictive monitoring and the preventive diagnostics of power equipment due to its progressive evolution and difficult early-stage detection [23].

2.2. Statistical Time Features Processing

The signals captured by the magnetic sensor in the time domain are stored to evaluate changes in the power transformer operating under healthy and faulty conditions. For this purpose, the statistical indicators presented in Table 1 are used, as they allow the extraction of relevant quantitative information from the signals and the generation of trends. To implement the statistical analysis, a windowing process is applied, which segments the signals into defined intervals from which features and information regarding the dynamic behavior of the signals are extracted. In these indicators, $X_{i}$ is the signal in the time domain, for $i = 1, 2, 3, \dots, N$, with $N$ being the number of data samples in the signal.
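A minimal sketch of the windowing and indicator computation is given below. The window length and the particular indicators (mean, standard deviation, RMS, crest factor, kurtosis) are assumptions made for illustration; the actual set of indicators is the one listed in Table 1.

```python
import math


def rectangular_windows(signal, window_len):
    """Split a signal into consecutive non-overlapping windows (remainder dropped)."""
    n_win = len(signal) // window_len
    return [signal[k * window_len:(k + 1) * window_len] for k in range(n_win)]


def time_features(window):
    """A few common time-domain statistical indicators of one window."""
    n = len(window)
    mean = sum(window) / n
    var = sum((x - mean) ** 2 for x in window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    peak = max(abs(x) for x in window)
    fourth = sum((x - mean) ** 4 for x in window) / n
    return {
        "mean": mean,
        "std": math.sqrt(var),
        "rms": rms,
        "crest_factor": peak / rms if rms else 0.0,
        "kurtosis": fourth / var**2 if var else 0.0,
    }


# Synthetic 60 Hz sine sampled at 1 kHz (stand-in for one flux component)
signal = [math.sin(2 * math.pi * 60 * n / 1000) for n in range(2500)]
windows = rectangular_windows(signal, window_len=1000)  # assumed window length
features = [time_features(w) for w in windows]
```

Each window yields one feature vector, so a recording of several seconds produces several samples per operating condition for the later reduction and classification stages.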

The processing of large amounts of data for machine learning may generate a complex model, which can lead to overfitting or poor performance in evaluation metrics. Therefore, feature reduction algorithms are implemented to improve data quality and reduce their complexity, which are described below.

2.2.1. Feature Selection

The objective of this stage is to select the subset of input variables that carries the most information and best describes the input data, maximizing relevance and minimizing redundancy. For this case study, the Fisher Score is used. It is an automatic method that seeks a subset of features for which, in the data space spanned by the selected features, the distances between data from different classes are as large as possible, while the distances between data from the same class are as small as possible [24], according to Equation (20).

$$FS^{j} = \frac{\sum_{i=1}^{c} \eta_{i} \left(\mu_{i}^{j} - \mu^{j}\right)^{2}}{\sum_{i=1}^{c} \eta_{i} \left(\sigma_{i}^{j}\right)^{2}} \tag{20}$$

where $c$ is the number of classes; $\eta_{i}$ is the number of data samples, $\mu_{i}^{j}$ is the mean, and $\sigma_{i}^{j}$ is the standard deviation of the $i$-th class for the $j$-th feature, respectively; and $\mu^{j}$ is the mean of the entire dataset for the $j$-th feature. After calculating the Fisher Score of each feature, the features with the highest scores are selected, since a high score indicates that the feature has a good capability to discriminate between classes [25].
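Equation (20) can be sketched for a single feature as follows. The use of the population variance (division by the class size) and the toy data are assumptions for illustration only.

```python
def fisher_score(values, labels):
    """Eq. (20): Fisher Score of one feature given its class labels."""
    overall_mean = sum(values) / len(values)
    between = 0.0  # numerator: class-size-weighted spread of class means
    within = 0.0   # denominator: class-size-weighted class variances
    for cls in set(labels):
        xs = [v for v, y in zip(values, labels) if y == cls]
        n_c = len(xs)
        mu_c = sum(xs) / n_c
        var_c = sum((x - mu_c) ** 2 for x in xs) / n_c
        between += n_c * (mu_c - overall_mean) ** 2
        within += n_c * var_c
    return between / within if within > 0 else float("inf")


# Toy example: two classes (0 = healthy, 1 = faulty), two candidate features
labels = [0, 0, 0, 1, 1, 1]
separating = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9]   # class means far apart
overlapping = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]   # identical class means
scores = {"separating": fisher_score(separating, labels),
          "overlapping": fisher_score(overlapping, labels)}
```

Ranking the features by this score and keeping the top ones corresponds to the selection step described above.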

2.2.2. Feature Extraction

Feature extraction aims to project a high-dimensional dataset into a lower-dimensional subspace while preserving the most relevant information, facilitating the analysis process and improving the efficiency of learning models. There are different techniques, including linear, nonlinear, supervised, and unsupervised methods [26]. In this study, both supervised and unsupervised techniques are employed. The supervised method considered is Linear Discriminant Analysis (LDA), and unsupervised techniques include Principal Component Analysis (PCA), Uniform Manifold Approximation and Projection (UMAP), and Isometric Mapping (ISOMAP).

LDA is a supervised dimensionality reduction technique whose objective is to project the data into a lower-dimensional subspace where the classes are as well separated as possible. This method identifies a new feature space by maximizing the between-class dispersion while minimizing the within-class dispersion. However, since this technique uses class label information to optimize separability, it may modify the original dynamic structure of the data by increasing the global distances between classes and reducing the local dispersion within each class [27]. For this reason, in the present work, LDA is mainly used for representational and comparative purposes with respect to unsupervised dimensionality reduction techniques, which better preserve the intrinsic structure of the data.

PCA is a standard technique in modern data analysis, widely used in various scientific fields. Its objective is to identify a new basis that captures as much variance as possible from the original dataset, revealing hidden structures and reducing noise. This unsupervised technique is useful for tasks such as dimensionality reduction, compression, feature extraction, and data visualization.

PCA performs a linear transformation of the data by generating new variables, called principal components (PCs), which represent the directions of maximum variance in the data. These new dimensions allow the data to be projected into a more compact subspace while preserving the most relevant information. The procedure for applying PCA includes the following steps:

Calculate the covariance matrix of the mean-centered data: $S = X X^{T}$.
Perform the eigenvalue and eigenvector decomposition of $S$.
Sort the eigenvalues and their corresponding eigenvectors in descending order.
Select a reduced number of dimensions $m$, smaller than the original.
Construct the transformation matrix $W$ with the selected $m$ eigenvectors.
Project each vector $x$ from the original space of dimension $d$ into the new space of dimension $m$ using $Y = W^{T} x$ [28].
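The steps above can be sketched with NumPy as shown below. The sketch assumes samples are stored as rows (so the projection is written as `Xc @ W`, the row-wise form of $Y = W^{T} x$) and normalizes the covariance by $n - 1$; both are conventions chosen here, and the random data are purely illustrative.

```python
import numpy as np


def pca_project(X, m):
    """Project X (n_samples x d) onto its first m principal components."""
    Xc = X - X.mean(axis=0)                 # center each feature
    S = Xc.T @ Xc / (Xc.shape[0] - 1)       # d x d covariance matrix
    eigvals, eigvecs = np.linalg.eigh(S)    # S is symmetric, so eigh applies
    order = np.argsort(eigvals)[::-1]       # sort eigenvalues in descending order
    W = eigvecs[:, order[:m]]               # d x m transformation matrix
    Y = Xc @ W                              # projected data, n_samples x m
    return Y, W, eigvals[order]


# Illustrative data: 200 samples, 5 features with different variances
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 5)) * np.array([5.0, 2.0, 1.0, 0.5, 0.1])
Y_demo, W_demo, ev_demo = pca_project(X_demo, m=2)
explained = ev_demo[:2].sum() / ev_demo.sum()  # fraction of variance retained
```

The ratio `explained` gives a quick check of how much of the total variance the two retained components carry.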

UMAP is an unsupervised, nonlinear, manifold-based dimensionality reduction technique. Unlike methods such as PCA that prioritize the preservation of global distances, UMAP seeks to preserve both the local and global structure of the dataset. It is useful for both visualization and preprocessing in machine learning tasks. UMAP is based on three key principles:

Local approximation of the space using neighbor graphs.
Modeling connectivity through fuzzy simplicial sets.
Optimization of the reduced space to preserve the graph structure.

The local weighted distance is calculated using Equation (21):

$$\mu_{ij} = \exp\left(-\frac{\|x_{i} - x_{j}\| - \rho_{i}}{\sigma_{i}}\right) \tag{21}$$

where $\rho_{i}$ is the distance to the nearest neighbor of $x_{i}$, and $\sigma_{i}$ is the adaptive parameter that controls the local scale.

On the other hand, symmetric connectivity is obtained from Equation (22):

$$\omega_{ij} = \mu_{ij} + \mu_{ji} - \mu_{ij}\, \mu_{ji} \tag{22}$$

where $\omega_{ij}$ represents the symmetric edge weight between points $i$ and $j$ in the fuzzy simplicial graph, combining the mutual membership strengths $\mu_{ij}$ and $\mu_{ji}$.

Finally, the loss function (based on the cross-entropy divergence between high- and low-dimensional graphs) is represented by Equation (23).

$$L = \sum_{(i,j) \in E} \left[ \omega_{ij} \log\left(\frac{\omega_{ij}}{q_{ij}}\right) + (1 - \omega_{ij}) \log\left(\frac{1 - \omega_{ij}}{1 - q_{ij}}\right) \right] \tag{23}$$

where the term $q_{ij}$ represents the similarity between points $i$ and $j$ in the reduced space, commonly modeled as $q_{ij} = \left(1 + a \|y_{i} - y_{j}\|^{2b}\right)^{-1}$; $y_{i}$ and $y_{j}$ are the low-dimensional representations of the original data points $x_{i}$ and $x_{j}$, respectively; $\|y_{i} - y_{j}\|$ is the Euclidean distance in the embedded space; and $a$ and $b$ are empirical parameters that define the shape of the distribution used to model similarities in the low-dimensional space [29].
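The membership strengths of Equation (21) and the symmetric weights of Equation (22) can be illustrated with the simplified sketch below. It departs from a full UMAP implementation in two ways that should be read as assumptions: a single fixed $\sigma$ replaces the adaptive per-point $\sigma_{i}$, and all pairs of points are connected rather than only the nearest neighbors.

```python
import math


def fuzzy_graph(points, sigma=1.0):
    """Directed memberships mu (Eq. 21) and symmetric weights omega (Eq. 22)."""
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    mu = [[0.0] * n for _ in range(n)]
    for i in range(n):
        # rho_i: distance from point i to its nearest neighbor
        rho_i = min(dist[i][j] for j in range(n) if j != i)
        for j in range(n):
            if j != i:
                mu[i][j] = math.exp(-(dist[i][j] - rho_i) / sigma)
    omega = [[mu[i][j] + mu[j][i] - mu[i][j] * mu[j][i] for j in range(n)]
             for i in range(n)]
    return mu, omega


# Three illustrative points on a line
pts = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0)]
mu, omega = fuzzy_graph(pts, sigma=1.0)
```

Subtracting $\rho_{i}$ ensures that every point has membership one with its nearest neighbor, so isolated points still stay connected to the graph.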

ISOMAP is an unsupervised, nonlinear, manifold-based method for dimensionality reduction. Unlike linear techniques such as PCA, which preserve only Euclidean distances, ISOMAP maintains geodesic distances between points, allowing the proper representation of nonlinear structures in the data.

The method constructs a neighborhood graph by connecting each point $x_{a}$ with its nearest neighbors. Then, it computes the approximate geodesic distances between all pairs of data points using algorithms such as Floyd or Dijkstra, generating the geodesic distance matrix $d_{G}$ [30]. This matrix is transformed using Equation (24):

$$K = H\, d_{G}\, H \tag{24}$$

where $H$ is the centering matrix. Finally, Multidimensional Scaling (MDS) is applied to $K$, obtaining the eigenvectors and eigenvalues to project the data into a lower-dimensional space. The algorithm consists of the following steps:

Construct the neighborhood graph of $X$.
Compute geodesic distances and construct $d_{G}$.
Apply MDS to $d_{G}$ to obtain the new space $Y$ [28].

2.2.3. Quality Metrics for Dimensionality Reduction Techniques

Dimensionality reduction techniques, when projecting data into a lower-dimensional subspace, inevitably involve a loss of information. Therefore, it is necessary to evaluate how faithfully the original relationships of the dataset are preserved [31]. For this purpose, local and global dimensionality reduction quality metrics are used, as described below.

Trustworthiness is a metric that quantifies how well a dimensionality reduction technique preserves the local structure of the original space; that is, whether points that are close neighbors in the reduced space were also close in the high-dimensional space. This metric focuses on penalizing spurious neighbors, namely points that appear as neighbors in the projection but were not close in the original structure. A visualization is considered trustworthy if it introduces the smallest possible number of such false neighbors.

Formally, let $X$ be the dataset in the original space and $Y$ its projection into the reduced space. For point $i$, its $k$ nearest neighbors in both spaces are defined as follows: $N_{k}^{X}(i)$ is the set of the $k$ nearest neighbors of $i$ in the original space, and $N_{k}^{Y}(i)$ is the set of the $k$ nearest neighbors of $i$ in the projected space.

These two sets define the neighbors that appear in the projection but not in the original space, obtained as $U_{k}(i) = N_{k}^{Y}(i) \setminus N_{k}^{X}(i)$. The global formula to measure the local distortion using Trustworthiness is given by Equation (25):

$$\mathrm{Trustworthiness}(k) = 1 - \frac{2}{n k (2n - 3k - 1)} \sum_{i=1}^{n} \sum_{j \in U_{k}(i)} \left(r_{ij} - k\right) \tag{25}$$

where $n$ is the total number of points in the dataset, $r_{ij}$ is the rank of point $j$ as a neighbor of $i$ in the original space, and $k$ is the number of neighbors considered, typically between 5 and 15.

The key parameters to determine the quality of dimensionality reduction using the global formula are interpreted as follows:

Trustworthiness > 0.9: excellent preservation, adequately maintains local relationships.
0.8 < Trustworthiness ≤ 0.9: good representation.
Trustworthiness ≤ 0.8: possible significant distortion [32].
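Equation (25) can be sketched directly from its definition, as shown below. The brute-force neighbor search, the tie-breaking by index, and the toy data are assumptions made for illustration; the normalization also requires $k < n/2$.

```python
import math


def ranked_neighbors(points):
    """For each point, the indices of all other points sorted by distance."""
    n = len(points)
    result = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        others.sort(key=lambda j: math.dist(points[i], points[j]))
        result.append(others)
    return result


def trustworthiness(X, Y, k):
    """Eq. (25): penalize neighbors in Y that are not neighbors in X."""
    n = len(X)
    order_x, order_y = ranked_neighbors(X), ranked_neighbors(Y)
    penalty = 0
    for i in range(n):
        rank_x = {j: r for r, j in enumerate(order_x[i], start=1)}  # r_ij
        knn_x = set(order_x[i][:k])
        for j in order_y[i][:k]:
            if j not in knn_x:            # j belongs to U_k(i)
                penalty += rank_x[j] - k
    return 1.0 - 2.0 * penalty / (n * k * (2 * n - 3 * k - 1))


# Toy data: points on a line, a faithful projection and a scrambled one
X_line = [(float(i), 0.0) for i in range(10)]
Y_good = [(2.0 * i,) for i in range(10)]
Y_bad = [(float((7 * i) % 10),) for i in range(10)]
t_good = trustworthiness(X_line, Y_good, k=3)
t_bad = trustworthiness(X_line, Y_bad, k=3)
```

A projection that keeps every neighborhood intact scores exactly one, while the scrambled projection is penalized in proportion to how far its false neighbors were in the original space.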

Spearman’s correlation is computed on the ranks of the pairwise distances in the original and reduced spaces (rather than on their absolute values), making it less sensitive to scale and more focused on preserving the relative order among distances. It is obtained from Equation (26):

$$L_{spearman} = -\frac{\mathrm{Cov}(r^{x}, r^{y})}{\sqrt{\mathrm{Var}(r^{x})\, \mathrm{Var}(r^{y})}} \tag{26}$$

where $r^{x}$ and $r^{y}$ are the ranks of the pairwise distances in the original and reduced spaces, respectively. The negative of the correlation is used as a loss function: a correlation closer to one (that is, $L_{spearman}$ closer to $-1$) indicates greater preservation of the global structure, since the distance ranks between pairs remain consistent in both spaces.

Since ordinary ranks are not differentiable (which prevents their direct use in gradient-optimized methods), a technique called soft ranking is employed. This allows a smooth approximation of traditional ranks and makes it possible to compute a differentiable version of $L_{spearman}$. The quality of the dimensionality reduction is interpreted as follows:

Correlation values (that is, $-L_{spearman}$) close to one indicate that the dimensionality reduction has preserved the global structure of the data well.

This metric is particularly useful when preserving the relative order between points is more important than maintaining exact distances [33].
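The rank correlation of Equation (26) can be sketched as follows. This version uses ordinary (hard) ranks with average ranks for ties rather than the soft ranking mentioned above, and it returns the correlation itself, so the loss corresponds to its negative; the toy data are illustrative.

```python
import math
from itertools import combinations


def pairwise_distances(points):
    """Euclidean distances for all unordered pairs of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for t in range(i, j + 1):
            ranks[order[t]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman_distance_correlation(X, Y):
    """Spearman correlation between distance ranks in X and in Y."""
    rx = average_ranks(pairwise_distances(X))
    ry = average_ranks(pairwise_distances(Y))
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(vx * vy)


# A uniformly scaled copy preserves every distance rank
X_pts = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
Y_pts = [(2 * a, 2 * b) for a, b in X_pts]
rho = spearman_distance_correlation(X_pts, Y_pts)
```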

Kruskal Stress, also known as Stress-1, is one of the most widely used metrics for this purpose. Its function is to quantify the discrepancy between the original dissimilarities and the distances represented in the reduced space, serving as a direct indicator of the quality of fit by evaluating global distances. Kruskal’s stress formula is defined by Equation (27):

$$\sigma_{1}(X) = \sqrt{\frac{\sum_{i < j} w_{ij} \left(D_{ij} - d_{ij}(X)\right)^{2}}{\sum_{i < j} w_{ij}\, d_{ij}^{2}(X)}} \tag{27}$$

where $D_{ij}$ represents the transformed dissimilarities or disparities, $d_{ij}(X)$ are the distances between points $i$ and $j$ in the reduced space of dimension $p$, and $w_{ij}$ are optional weights, equal to one if the pair is present and zero if data are missing [34].

To evaluate data quality, Kruskal (1964) [35] proposed the following qualitative scale to interpret Stress-1 values:

Stress-1 > 0.20: Very poor
Stress-1 ≈ 0.10: Acceptable
Stress-1 ≈ 0.05: Good
Stress-1 ≈ 0.025: Excellent
Stress-1 = 0: Perfect

However, these rules should be used with caution. The stress value may be influenced by several factors, such as the following:

Number of objects $n$: As the number of data points increases, stress is expected to increase.
Number of dimensions $p$: As the number of dimensions increases, stress tends to decrease.
Level of noise or error in the original dissimilarities.
Ties in the data (if ordinal transformations are used).
Presence of missing values, which often artificially reduce stress [36].
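Equation (27) can be sketched as shown below. The sketch assumes unit weights for all pairs and takes the disparities $D_{ij}$ to be the raw Euclidean distances in the original space, with no monotone transformation; the toy data are illustrative.

```python
import math
from itertools import combinations


def kruskal_stress1(X_high, Y_low):
    """Eq. (27) with w_ij = 1 and D_ij = Euclidean distance in the original space."""
    num = 0.0
    den = 0.0
    for i, j in combinations(range(len(X_high)), 2):
        D = math.dist(X_high[i], X_high[j])  # original dissimilarity
        d = math.dist(Y_low[i], Y_low[j])    # distance in the reduced space
        num += (D - d) ** 2
        den += d ** 2
    return math.sqrt(num / den)


# Dropping the third coordinate collapses one point onto the origin
X3 = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
Y2 = [p[:2] for p in X3]
stress_projected = kruskal_stress1(X3, Y2)
stress_identity = kruskal_stress1(X3, X3)
```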

2.3. Machine Learning Processing

Support vector machines (SVMs) have been established as an effective tool in classification problems, especially in contexts involving small, high-dimensional, and nonlinear datasets. Their popularity in fault diagnosis is due to their ability to find optimal decision boundaries, even in scenarios where the data are not linearly separable [9].

In essence, an SVM seeks the optimal hyperplane that maximizes the margin between two classes. This hyperplane is defined by Equation (28):

$$g(x) = w^{T} x + w_{0} = 0 \tag{28}$$

where $w$ is the weight vector, $x$ is the input vector, and $w_{0}$ is the bias. The objective is that the support vectors satisfy the conditions of Equations (29) and (30):

$$w^{T} x + w_{0} \geq 1, \quad \forall x \in \omega_{1} \tag{29}$$

$$w^{T} x + w_{0} \leq -1, \quad \forall x \in \omega_{2} \tag{30}$$

where $\omega_{1}$ and $\omega_{2}$ represent the two linearly separable classes in the feature space.

The training of an SVM consists of solving the following optimization problem:

$$\min_{w} J(w) = \frac{1}{2} \|w\|^{2} \tag{31}$$

subject to the following:

$$y_{i} \left(w^{T} x_{i} + w_{0}\right) \geq 1, \quad i = 1, 2, \dots, N \tag{32}$$

where $\|w\|$ denotes the Euclidean norm of the weight vector, $x_{i} \in \mathbb{R}^{n}$ is the $i$-th training sample, $y_{i} \in \{-1, 1\}$ is the corresponding class label of each sample $x_{i}$, and $N$ is the total number of training samples.

The solution to the problem leads to a linear combination of the samples closest to the hyperplane, known as support vectors, defined by Equation (33):

$$w = \sum_{i=1}^{N} y_{i}\, d_{i}\, x_{i} \tag{33}$$

where $d_{i} \geq 0$ are the Lagrange multipliers obtained from the dual optimization problem. Only samples with $d_{i} > 0$ correspond to support vectors and contribute to the final classifier.

For nonlinear problems, a kernel function is used to transform the data into a higher-dimensional space where they become separable. The resulting model is expressed according to Equation (34):

$$g(x) = \sum_{i=1}^{N} y_{i}\, d_{i}\, K(x_{i}, x) + w_{0} \tag{34}$$

One of the most common kernel functions is the radial basis function (RBF), defined by Equation (35):

$$K(x_{i}, x) = \exp\left(-\frac{\|x - x_{i}\|^{2}}{\sigma^{2}}\right) \tag{35}$$

In this expression, $\sigma$ is referred to as the kernel scale and determines how far an individual sample can influence the classification. Small values of $\sigma$ generate more complex models that may overfit the data; large values may lead to underfitting. Therefore, the choice of $\sigma$ has a direct impact on the model’s ability to generalize. In addition, a penalty parameter $C$ is introduced, which controls the trade-off between maximizing the margin and allowing classification errors through slack variables. A high value of $C$ strongly penalizes errors, which may lead to overfitting; a low value allows more training errors but improves generalization [37].
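The decision function of Equation (34) with the RBF kernel of Equation (35) can be evaluated as in the sketch below. The support vectors, multipliers, and bias are hypothetical hand-picked values, not the result of training, and serve only to show how $\sigma$ enters the computation.

```python
import math


def rbf_kernel(x_i, x, sigma):
    """Eq. (35): K(x_i, x) = exp(-||x - x_i||^2 / sigma^2)."""
    return math.exp(-math.dist(x, x_i) ** 2 / sigma ** 2)


def decision_function(x, support_vectors, labels, multipliers, w0, sigma):
    """Eq. (34): g(x) = sum_i y_i * d_i * K(x_i, x) + w_0."""
    total = sum(y * d * rbf_kernel(sv, x, sigma)
                for sv, y, d in zip(support_vectors, labels, multipliers))
    return total + w0


# Hypothetical model: one support vector per class, equal multipliers
svs = [(-1.0, 0.0), (1.0, 0.0)]
ys = [-1, 1]
ds = [1.0, 1.0]
g_right = decision_function((1.0, 0.0), svs, ys, ds, w0=0.0, sigma=1.0)
g_left = decision_function((-1.0, 0.0), svs, ys, ds, w0=0.0, sigma=1.0)
predicted_class = 1 if g_right >= 0 else -1
```

With a smaller `sigma`, each kernel term decays faster with distance, so the decision function becomes dominated by the closest support vector, which is the overfitting tendency described above.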

The proper selection of parameters $C$ and $\sigma$ is essential to achieve high classification performance. Both parameters must be carefully tuned, typically through cross-validation or optimization algorithms such as genetic algorithms (GAs), which are used in this case study [38]. Figure 2 presents a schematic representation of the general architecture of an SVM, where the transformation of input data into a higher-dimensional feature space through the kernel, the construction of the optimal hyperplane, and the resulting classification based on the decision function can be observed.

3. Methodology

The methodology is based on the non-invasive acquisition of the axial, radial, and rotational components of leakage magnetic flux in a three-phase power transformer, considering healthy and inter-turn short-circuit fault conditions performed between tap changers 4 and 5 in the three windings, under no-load, as well as 15% and 40% load conditions. The signals are acquired and stored on a computer for processing through the stages described in Figure 3.

In the first stage, a signal normalization process is performed, followed by rectangular windowing without overlap, to calculate the statistical indicators listed in Table 1. Subsequently, the statistical features of each case study proceed to the feature reduction stage, which consists of feature selection using Fisher Score and feature extraction using LDA, PCA, UMAP, and ISOMAP techniques. The selected technique is the one that best preserves the original data distribution, evaluated through dimensionality reduction quality metrics.

The new two-dimensional subspace is used in the classification stage, where K-fold cross-validation is applied to split the data into training and validation sets. For the selection of the optimal parameters of the support vector machine-based classifier, a genetic algorithm is employed. Finally, system performance is evaluated using accuracy, recall and F1-score metrics.
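The data splitting and the evaluation metrics of the classification stage can be sketched as follows. The number of folds, the shuffling seed, and the example labels are assumptions for illustration; the classifier itself and the genetic-algorithm search are omitted.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, validation) index lists for K-fold cross-validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]  # k disjoint folds
    for f in range(k):
        train = [i for g in range(k) if g != f for i in folds[g]]
        yield train, folds[f]


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, recall, and F1-score for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return accuracy, recall, f1


splits = list(k_fold_indices(n_samples=10, k=5))  # assumed k = 5
acc, rec, f1 = binary_metrics([1, 1, 1, 0, 0], [1, 1, 0, 0, 1])
```

In a full pipeline, the metrics would be computed on each validation fold and averaged, so that every sample is used for validation exactly once.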

4. Experiments and Results

This section presents the experimental setup of the test bench and the results of each signal processing stage up to the detection of the inter-turn short-circuit fault.

4.1. Experimental Test Bench

The experimental setup is shown in Figure 4. It consists of a three-phase dry-type transformer manufactured by Square D Mexico, with a Delta–Star configuration, rated at 15 kVA, 440 V–220/127 V, 4.7% impedance, and five 5% tap changers. To produce the inter-turn short circuit, a 1 Ω, 300 W resistor is connected in parallel with the tap changers to limit the fault current. Additionally, a 16 A, single-pole, 230 V thermal-magnetic circuit breaker is connected in series to interrupt the current flow at the end of each test or in the event of a fault. A connection diagram is shown in Figure 5. To emulate operating conditions at 15% and 40% load, resistive loads of 1.3 kW (220 V, 5.9 A) and 3.5 kW (220 V, 15.9 A), respectively, are connected between phases. They dissipate the energy as heat and reproduce controlled load conditions during the experimental tests.

The signals are acquired using a triaxial Hall-effect magnetic sensor (ROHM Semiconductor, Kyoto, Japan), model BM1422AGMV, with a supply voltage of 3.6 V, operating current of 0.15 mA, sensitivity of 0.042 µT/LSB, and I2C serial communication protocol. The sensor is installed on the external casing of the transformer at the mid-height of the windings to measure the axial (X), radial (Y), and rotational (Z) magnetic flux signals.

The sensor was connected to a data acquisition board based on the STMicroelectronics STM32F401xC microcontroller, shown in Figure 4, with 32-bit Arm Cortex-M4 core operating at frequencies up to 84 MHz. It features standard and advanced communication interfaces, including the I2C protocol used for communication with the BM1422AGMV development board, and UART for communication with the USB-UART interface board for data transmission to the computer. The data are stored using an application developed in the open-source software Qt Creator version 15.0.1 (based on Qt 6.8.1) at a sampling frequency of 1 kHz.

Fifteen tests, each lasting 50 s, were conducted on each winding, considering the healthy state and the inter-turn short-circuit fault condition operating under no-load, 15% and 40% load conditions. This was carried out in accordance with Section 12 (Short-Circuit Tests for Transformers) of IEEE Std C57.12.91-2020: Test Code for Dry-Type Distribution and Power Transformers, which indicates that at least six tests must be performed on each winding for a minimum duration of 0.25 s, according to the equipment category [23].

The defined number of inter-turn short circuits is determined by the tap changers. The primary windings have a total of 52 turns per phase and six turns per tap. Since the short circuit is performed between taps 4 and 5, this corresponds to a total of 11.54% of turns in short circuit. According to IEEE Std C37.91-2021: Guide for Protecting Power Transformers, the minimum number of short-circuited turns detectable by a current transformer is greater than 10% of the indicated value [39].
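As a quick check of the stated severity, the percentage follows directly from the turn counts given above, assuming the fault spans exactly one tap section. The symbols N_{sc} (short-circuited turns) and N_{total} (turns per phase) are introduced here only for illustration:

```latex
\frac{N_{sc}}{N_{total}} = \frac{6}{52} \approx 0.1154 = 11.54\%
```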

Although the fault is induced through tap changers under controlled conditions, this approach enables a safe and repeatable emulation of inter-turn short-circuit faults. While it does not fully replicate the stochastic nature and progressive insulation degradation of real faults, it reproduces the main electromagnetic effects, such as localized current increase and the resulting modification of the leakage magnetic flux distribution. Therefore, the experimental setup provides a representative approximation of actual fault behavior for diagnostic purposes.

4.2. Statistical Time-Domain Processing

The obtained time-domain signals are segmented into intervals of 0.15 s, corresponding to winding 1. Figure 6 shows the time-domain signals and the changes in amplitude according to the fault and load level. The axial (X), radial (Y), and rotational (Z) components of the leakage magnetic flux are presented for the four operating conditions. By comparing Figure 6a and Figure 6b, it is observed that the load increase causes a slight increase in the amplitude of the axial (X) component, a slight decrease in the radial (Y) component, and a greater increase in the rotational (Z) component. In contrast, when comparing the fault conditions shown in Figure 6c,d, a decrease in amplitude is observed in the axial and rotational components. Additionally, waveform distortion is present in all three components.

The analysis of the axial, radial, and rotational leakage magnetic flux components of the three windings indicates that the presence of an inter-turn short circuit generates a local increase in leakage magnetic flux in the affected region, as a consequence of the increased electric current in the closed loop formed by the short-circuited turns.

As a result, the leakage magnetic flux signal associated with the fault produces alterations in the global leakage magnetic flux components, opposing this signal and causing changes in amplitude and waveform shape. These changes reflect a decrease in voltage and an increase in the local fault current, intensifying Joule losses, raising the winding temperature, and accelerating insulation deterioration.

It should be noted that the healthy no-load condition was not used for the diagnostic stage, since the proposed methodology aims to detect faults while the transformer is operating under load conditions. In practical applications, power transformers normally operate supplying load; therefore, the diagnostic strategy is focused on distinguishing between healthy and faulty conditions during normal operating conditions.

4.2.1. Feature Normalization Processing

The leakage magnetic flux signals corresponding to the axial (X), radial (Y), and rotational (Z) components were obtained over a period of 50 s under four operating conditions: healthy and fault inter-turn short-circuit without load, with 15% and 40% load. As part of the signal preprocessing stage, the healthy no-load condition was taken as the reference signal to perform the normalization process using Equation (36):

X_{N} = \frac{X_{1} - c}{\max \left( \left| X_{1} - c \right| \right)},

(36)

The normalization was performed within the interval [-1, 1], where “X_{1}” is the signal to be normalized, “c = (\max (X_{2}) + \min (X_{2})) / 2” is the center or mid-range value, “X_{2}” is the reference signal from which the maximum and minimum values are obtained (in this case, the healthy no-load condition for the three components), and “X_{N}” is the normalized signal [40].
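A minimal sketch of Equation (36) in Python with NumPy is given below. The function name, the synthetic 60 Hz test signals, and their amplitudes are assumptions made for illustration; they are not taken from the measured data.

```python
import numpy as np

def normalize_to_reference(x1, x2):
    """Equation (36): center x1 on the mid-range of the reference x2,
    then scale by the largest absolute deviation of x1 from that center."""
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    c = (x2.max() + x2.min()) / 2.0        # mid-range value of the reference
    centered = x1 - c
    peak = np.abs(centered).max()
    if peak == 0.0:                        # constant signal equal to c
        return np.zeros_like(centered)
    return centered / peak

# Synthetic example (hypothetical values, 1 kHz sampling as in Section 4.1).
fs = 1000
t = np.arange(0, 1.0, 1 / fs)
reference = 40.0 + 10.0 * np.sin(2 * np.pi * 60 * t)   # stands in for healthy no-load
signal = 42.0 + 14.0 * np.sin(2 * np.pi * 60 * t)      # stands in for another condition
normalized = normalize_to_reference(signal, reference)
```

Because the divisor is the peak deviation of the signal itself, the output always lies within [-1, 1], as stated in the text.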

Figure 7 shows the normalized signals of the three components under the four operating conditions. The scaling preserves the relative relationship of the signals in order to improve the feature reduction process and the sensitivity of the subsequent machine learning classification stage. In Figure 7, the amplitude changes in the signals under load variations and inter-turn short-circuit fault conditions can be more clearly observed, mainly in the axial (X) and rotational (Z) axes.

4.2.2. Signal Windowing Processing

As part of the signal preprocessing stage, the 15 tests performed for each case study were processed using a rectangular windowing technique without overlap, as illustrated in Figure 8 (as red squares), considering a time interval of 1 s. Each test had a duration of 50 s; therefore, the windowing process generated 50 segments per test, illustrated in Figure 8 as a red dotted line. This procedure assumes a quasi-stationary behavior of the signal within each window and allows the signals to be segmented into intervals of equal length, increasing the number of samples available for statistical analysis.

Considering the 15 tests conducted for each operating condition, the segmentation process produced a total of 750 samples per class (15 tests × 50 windows). In this study, three operating conditions were considered for the dataset: healthy without load, healthy with load, and inter-turn short-circuit fault with load. Consequently, the dataset used for analysis consists of three classes with 750 samples each.

The statistical features described in Table 1 were calculated for each window to quantify the dynamic changes in the leakage magnetic flux signals under the different operating conditions and across the three measured components: axial (X), radial (Y), and rotational (Z). The increased number of samples improves the representativeness of the dataset, facilitates the dimensionality reduction stage, and contributes to improving the performance and robustness of the classification system.
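The windowing and feature computation described above can be sketched as follows. The code covers only the first eight indicators of Table 1 (mean, maximum, RMS, SRM, standard deviation, variance, and the two shape factors). The function names and the synthetic test signal are assumptions for illustration.

```python
import numpy as np

FS = 1000       # sampling frequency in Hz (Section 4.1)
WINDOW_S = 1    # window length in seconds (Section 4.2.2)

def split_windows(signal, fs=FS, window_s=WINDOW_S):
    """Split a 1-D signal into non-overlapping rectangular windows."""
    n = int(fs * window_s)
    n_windows = len(signal) // n
    return np.asarray(signal[: n_windows * n], dtype=float).reshape(n_windows, n)

def window_features(w):
    """First eight statistical indicators of Table 1 for one window."""
    abs_mean = np.mean(np.abs(w))
    f1 = np.mean(w)                          # mean
    f2 = np.max(w)                           # maximum value
    f3 = np.sqrt(np.mean(w ** 2))            # root mean square (RMS)
    f4 = np.mean(np.sqrt(np.abs(w))) ** 2    # square root mean (SRM)
    f5 = np.sqrt(np.mean((w - f1) ** 2))     # standard deviation
    f6 = np.mean((w - f1) ** 2)              # variance
    f7 = f3 / abs_mean                       # RMS shape factor
    f8 = f4 / abs_mean                       # SRM shape factor
    return np.array([f1, f2, f3, f4, f5, f6, f7, f8])

# Synthetic 50 s record standing in for one test (hypothetical signal).
rng = np.random.default_rng(0)
t = np.arange(50 * FS) / FS
test_signal = np.sin(2 * np.pi * 60 * t) + 0.01 * rng.standard_normal(t.size)

windows = split_windows(test_signal)
features = np.vstack([window_features(w) for w in windows])
samples_per_class = 15 * windows.shape[0]    # 15 tests x 50 windows
```

With 50 windows per test and 15 tests per operating condition, the count reproduces the 750 samples per class reported in the text.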

4.3. Feature Reduction

This section presents the feature reduction stage based on feature selection techniques using the Fisher Score method, as well as feature extraction by employing the technique that best preserves the original dataset after the dimensionality reduction process. These procedures are described in the following subsections.

4.3.1. Feature Selection Results

After obtaining the 14 statistical indicators for each axis, they were grouped into a statistical data matrix comprising the three components of stray magnetic flux, namely axial (X), radial (Y), and rotational (Z), under operating conditions: healthy without load, healthy with 15% and 40% load, and inter-turn short-circuit fault without load, fault with 15% and 40% load. As a result, a matrix of 42 features per winding (14 indicators × 3 axes) was obtained. Under this framework, the objective is to perform transformer diagnosis in a non-invasive manner while the equipment remains in operation, thereby reducing inspection and testing times during scheduled maintenance.

For the feature selection stage, the matrices were separated by winding, since the measurement point was acquired independently for each winding. This approach allows the diagnosis to be carried out regardless of the winding on which the magnetic sensor is installed. Using the Fisher Score technique, 12 indicators were selected for each winding, representing dynamic changes among the proposed operating states for fault diagnosis. This process reduces the dataset dimensionality while minimizing information redundancy.

The statistical indicators selected by Fisher Score are presented in Table 2 for the axial (X) component, Table 3 for the radial (Y) component, and Table 4 for the rotational (Z) component, which are subsequently used in the following stages of the proposed methodology. According to the theoretical behavior of leakage magnetic flux in transformers and the results obtained from the Fisher Score ranking, the axial component (X) exhibits the highest contribution to fault discrimination, presenting the largest Fisher Score values. This observation is consistent with previous studies reported in the literature, where the axial leakage flux is identified as the component most affected by internal electromagnetic disturbances such as inter-turn short circuits. Nevertheless, the radial (Y) and rotational (Z) components also provide complementary and relevant information regarding the dynamic changes in the magnetic field distribution, contributing to a more robust representation of the fault condition.
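To illustrate the ranking step, the sketch below computes a Fisher Score per feature using a commonly used definition (between-class scatter of the class means divided by the pooled within-class variance). This is an assumption and may differ in detail from the expression used by the authors. The synthetic data, the class labels, and the function name are hypothetical.

```python
import numpy as np

def fisher_score(X, y):
    """Per-feature Fisher Score: sum_k n_k (mu_k - mu)^2 / sum_k n_k var_k."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    overall_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for label in np.unique(y):
        Xk = X[y == label]
        between += len(Xk) * (Xk.mean(axis=0) - overall_mean) ** 2
        within += len(Xk) * Xk.var(axis=0)
    return between / within

# Synthetic stand-in for one winding: 3 classes x 750 windows, 42 features.
rng = np.random.default_rng(1)
n_per_class, n_features = 750, 42
y = np.repeat([0, 1, 2], n_per_class)
X = rng.standard_normal((3 * n_per_class, n_features))
X[:, 0] += 5.0 * y          # make feature 0 strongly class-dependent (synthetic)

scores = fisher_score(X, y)
top12 = np.argsort(scores)[::-1][:12]   # indices of the 12 highest-scoring features
```

Features whose class means are far apart relative to their within-class spread receive the largest scores, which is the sense in which the selected indicators separate the operating states.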

4.3.2. Feature Extraction Results

In this stage, the dimensionality of the dataset selected by Fisher Score (12 indicators) is reduced to a two-dimensional set for each winding using the LDA, PCA, ISOMAP, and UMAP techniques. To determine the most appropriate technique, the quality of each projection provided by these methods was evaluated in terms of how well they preserve the structure of the original data. In particular, global dispersion indicators (Spearman’s correlation and Kruskal’s stress) and a local dispersion indicator (Trustworthiness) were assessed. These metrics measure how faithfully the reduced subspace preserves the global and local distances of the original feature space.

Figure 9, Figure 10 and Figure 11 present the two-dimensional subspaces obtained by applying the dimensionality reduction techniques to windings 1, 2, and 3, respectively. The LDA technique shows better class separation, since it projects the data into a subspace by maximizing global separation and reducing local dispersion within each class. However, according to Table 5, Table 6 and Table 7, when evaluating the quality of dimensionality reduction in terms of preserving the original dataset, it is observed that PCA provides the best projection of the new feature subspace, ensuring that the physical interpretation of the phenomena is not altered for transformer fault diagnosis.
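As an illustration of how projection quality can be quantified, the sketch below projects synthetic 12-dimensional data to two dimensions with PCA and compares pairwise distances before and after projection. It covers two global measures only: Spearman's correlation and a raw-distance variant of stress. The stress normalization, the synthetic data, and all function names are assumptions, and the authors' exact definitions may differ.

```python
import numpy as np

def pca_2d(X):
    """Project X onto its first two principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ vt[:2].T

def pairwise_distances(X):
    """Vector of Euclidean distances between all distinct pairs of rows."""
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    return d[np.triu_indices(len(X), k=1)]

def spearman(a, b):
    """Spearman correlation as the Pearson correlation of ranks (ties ignored)."""
    ra = np.argsort(np.argsort(a))
    rb = np.argsort(np.argsort(b))
    return np.corrcoef(ra, rb)[0, 1]

def raw_stress(d_high, d_low):
    """Stress on raw distances: sqrt(sum (d_high - d_low)^2 / sum d_high^2)."""
    return np.sqrt(((d_high - d_low) ** 2).sum() / (d_high ** 2).sum())

# Synthetic data: 12 features driven mostly by 2 latent factors (hypothetical).
rng = np.random.default_rng(2)
latent = rng.standard_normal((300, 2))
mixing = rng.standard_normal((2, 12))
X = latent @ mixing + 0.05 * rng.standard_normal((300, 12))

Y = pca_2d(X)
d_high = pairwise_distances(X)
d_low = pairwise_distances(Y)
rho = spearman(d_high, d_low)       # close to 1 when distance ordering is preserved
stress = raw_stress(d_high, d_low)  # close to 0 when distances are preserved
```

Since PCA is an orthogonal projection, it can only shorten pairwise distances, so a low stress value indicates that little of the original spread lies outside the retained two-dimensional subspace.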

4.4. Classification Results

In the classification stage, the K-fold validation technique was used, with “K = 5”, to evaluate the performance and generalization capability of the SVM classifier. A total of 750 samples from the two-dimensional subspace generated by PCA was considered for Windings 1, 2, and 3, including the three operating conditions: healthy and fault without load, with 15% and 40% load. Accordingly, 600 samples (80%) were used for training and 150 samples (20%) for validation.
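The 600/150 split follows from partitioning 750 samples into five folds. A minimal sketch is shown below. Shuffling, the random seed, and the function name are assumptions, and the text does not state whether the folds were stratified by class.

```python
import numpy as np

def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, validation) index arrays for shuffled K-fold splitting."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n_samples), k)
    for i in range(k):
        val = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train, val

splits = list(k_fold_indices(750, k=5))
fold_sizes = [(len(train), len(val)) for train, val in splits]
```

Each sample appears in exactly one validation fold, so every sample is used once for validation and four times for training.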

For the automatic selection and optimization of the support vector machine hyperparameters, specifically the penalty parameter “C” and the parameter associated with the radial basis function (RBF) kernel, a genetic algorithm (GA) was employed. The GA configuration consisted of 30 evolutionary generations, 10 solutions per generation, and two genes corresponding to “C” and the kernel parameter. A crossover probability of 30% and a mutation probability of 10% were defined. The fitness function was established as the average accuracy obtained through five-fold cross-validation, as illustrated in Figure 12a, Figure 13a and Figure 14a, showing the convergence evolution of the optimization algorithm.
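The GA configuration above can be sketched as follows. Only the number of generations, the population size, the two genes, and the crossover and mutation probabilities come from the text. The selection scheme, elitism, the crossover and mutation operators, and the search ranges are assumptions. To keep the sketch free of machine learning dependencies, a hypothetical smooth surrogate stands in for the fitness, which in the study is the mean five-fold cross-validation accuracy of the SVM.

```python
import numpy as np

BOUNDS = np.array([[-2.0, 3.0],    # gene 0: log10(C)             (hypothetical range)
                   [-3.0, 2.0]])   # gene 1: log10(kernel param.) (hypothetical range)

def surrogate_fitness(genes):
    """Stand-in for mean 5-fold CV accuracy; peaks at genes = (1, -1)."""
    return 1.0 - 0.02 * ((genes[0] - 1.0) ** 2 + (genes[1] + 1.0) ** 2)

def run_ga(fitness, generations=30, pop_size=10, p_cross=0.3, p_mut=0.1, seed=0):
    rng = np.random.default_rng(seed)
    low, high = BOUNDS[:, 0], BOUNDS[:, 1]
    pop = rng.uniform(low, high, size=(pop_size, 2))
    history = []
    for _ in range(generations):
        scores = np.array([fitness(ind) for ind in pop])
        order = np.argsort(scores)[::-1]
        pop, scores = pop[order], scores[order]
        history.append(scores[0])
        children = [pop[0].copy()]                 # elitism: keep the best solution
        while len(children) < pop_size:
            # Parents are drawn from the better half of the population.
            a, b = pop[rng.integers(0, pop_size // 2, size=2)]
            child = a.copy()
            if rng.random() < p_cross:
                child[1] = b[1]                    # single-point crossover (2 genes)
            mutate = rng.random(2) < p_mut
            child[mutate] = rng.uniform(low, high)[mutate]   # random-reset mutation
            children.append(child)
        pop = np.array(children)
    scores = np.array([fitness(ind) for ind in pop])
    history.append(scores.max())
    return pop[np.argmax(scores)], np.array(history)

best_genes, history = run_ga(surrogate_fitness)
best_C, best_kernel_param = 10.0 ** best_genes
```

Searching in log10 space is a common choice for SVM hyperparameters because useful values span several orders of magnitude. Because the best solution is carried over unchanged, the best fitness per generation never decreases.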

To verify the performance of the SVM and the computational efficiency of the optimization process, Table 8, Table 9 and Table 10 present the optimal hyperparameters, accuracy, and execution time for Windings 1, 2, and 3 obtained with three optimizers: the genetic algorithm (GA), Grid Search, and Particle Swarm Optimization (PSO). Grid Search used 20 logarithmically spaced values for both “C” and the RBF kernel parameter; a five-fold cross-validation strategy was employed to evaluate each parameter combination, and the accuracy metric was used as the optimization criterion. PSO used 20 particles evolved over 30 iterations. The inertia weight was set to “w = 0.7”, while the cognitive and social coefficients were set to “c_{1} = 1.5” and “c_{2} = 1.5”, respectively. The results show that GA provides a better response in terms of execution time and accuracy for the considered case study and the configuration of the implemented optimizers.
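For comparison, a minimal PSO sketch with the stated settings (20 particles, 30 iterations, w = 0.7, c_1 = c_2 = 1.5) is shown below. The standard velocity update is assumed. The search bounds, the clipping of positions to those bounds, and the surrogate fitness are hypothetical stand-ins and do not reproduce the cross-validated SVM accuracy used in the study.

```python
import numpy as np

def run_pso(fitness, low, high, n_particles=20, iterations=30,
            w=0.7, c1=1.5, c2=1.5, seed=0):
    """Maximize `fitness` inside the box [low, high] with a basic PSO."""
    rng = np.random.default_rng(seed)
    low, high = np.asarray(low, dtype=float), np.asarray(high, dtype=float)
    dim = low.size
    x = rng.uniform(low, high, size=(n_particles, dim))
    v = np.zeros_like(x)
    pbest = x.copy()
    pbest_val = np.array([fitness(p) for p in x])
    gbest = pbest[np.argmax(pbest_val)].copy()
    gbest_val = pbest_val.max()
    for _ in range(iterations):
        r1 = rng.random((n_particles, dim))
        r2 = rng.random((n_particles, dim))
        # Inertia term + cognitive pull (own best) + social pull (swarm best).
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        x = np.clip(x + v, low, high)
        vals = np.array([fitness(p) for p in x])
        improved = vals > pbest_val
        pbest[improved] = x[improved]
        pbest_val[improved] = vals[improved]
        if pbest_val.max() > gbest_val:
            gbest = pbest[np.argmax(pbest_val)].copy()
            gbest_val = pbest_val.max()
    return gbest, gbest_val

def surrogate_fitness(p):
    """Stand-in for mean 5-fold CV accuracy; peaks at p = (1, -1)."""
    return 1.0 - 0.02 * ((p[0] - 1.0) ** 2 + (p[1] + 1.0) ** 2)

best_pos, best_val = run_pso(surrogate_fitness, low=[-2.0, -3.0], high=[3.0, 2.0])
```

Under the stated settings, the number of candidate evaluations is 20 × 20 = 400 for Grid Search, at most 30 × 10 = 300 for the GA, and about 20 × 30 = 600 for PSO plus initialization. In each case every candidate requires five model fits when five-fold cross-validation is used.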

Once the optimal hyperparameters were obtained through GA, the SVM was trained using the corresponding training sets of each partition, generating the decision regions for each operating condition, as shown in Figure 12b, Figure 13b and Figure 14b. Subsequently, the validation data were used to evaluate the model performance in each of the five iterations, ensuring the statistical robustness of the results. Finally, the classification results are presented through the confusion matrices in Figure 12c, Figure 13c and Figure 14c.

Additionally, Table 11 presents the performance metrics of the classifier, including accuracy, recall and F1-score with values ranging from 97.84% to 100%, for each winding. These results indicate that the subspace obtained through PCA effectively preserves the relevant information associated with the dynamic variations in the operating states, enabling a reliable diagnosis of inter-turn short-circuit faults according to the proposed case study. Furthermore, the low variability observed among the data partitions confirms the stability and generalization capability of the classifier, reducing the likelihood of overfitting despite the high accuracy values achieved.
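The reported metrics can be derived from a confusion matrix as sketched below. The example matrix is invented for illustration and is not taken from Figure 12c, Figure 13c or Figure 14c. Rows are assumed to be true classes and columns predicted classes.

```python
import numpy as np

def classification_metrics(cm):
    """Accuracy plus per-class recall and F1-score from a confusion matrix
    whose rows are true classes and whose columns are predicted classes."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    recall = tp / cm.sum(axis=1)         # TP / (TP + FN)
    precision = tp / cm.sum(axis=0)      # TP / (TP + FP)
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    return accuracy, recall, f1

# Hypothetical 3-class result for one 150-sample validation fold.
cm_example = [[50, 0, 0],
              [0, 48, 2],
              [0, 1, 49]]
accuracy, recall, f1 = classification_metrics(cm_example)
```

Averaging these values over the five validation folds gives one figure per metric and per winding, which is presumably how the values in Table 11 are aggregated.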

The changes observed in the global leakage magnetic flux signal comparing the healthy without load, healthy with load, and inter-turn short-circuit fault conditions demonstrate that the axial (X) component exhibits the highest sensitivity to dynamic variations associated with internal faults. This behavior is confirmed through the Fisher Score feature selection technique, which shows a greater discriminative capability in the statistical indicators corresponding to this component. Nevertheless, the radial (Y) and rotational (Z) components also contain relevant information related to the physical changes in the phenomenon, contributing to the characterization of the leakage magnetic flux behavior.

The appropriate selection of statistical indicators using the Fisher Score enables a significant separation among operating conditions while employing a reduced number of features. This improves model efficiency without the need to apply additional dimensionality reduction techniques that could alter global distances and local dispersion among classes. Preserving these relationships is essential to maintain the physical interpretation of the changes associated with the inter-turn short-circuit phenomenon.

Furthermore, by working with an optimized and compact feature set, the computational cost during the training and classifier optimization stages is reduced, particularly when evolutionary techniques such as genetic algorithms are employed for hyperparameter selection. This strategy not only enhances process efficiency but also contributes to preventing overfitting, ensuring a robust and generalizable model for non-invasive fault diagnosis in electrical transformers.

5. Conclusions

In the present work, a methodology based on the use of statistical indicators was implemented for the automatic classification of inter-turn short-circuit faults in a three-phase dry-type power transformer constructed under the IEEE Std C57.12.01 standard. Signal acquisition was carried out in accordance with IEEE Std C57.12.91-2020, through non-invasive measurement of leakage magnetic flux in the three windings using a triaxial Hall-effect magnetic sensor. This strategy allows the internal behavior of the transformer to be evaluated without physical intervention in the equipment, representing an advantage in industrial applications.

Time-domain analysis showed that, when operating the transformer under healthy no-load and subsequently under load conditions, variations occur in the amplitude of the global leakage magnetic flux. When an inter-turn short circuit is induced, changes associated with the increase in local fault current are generated, producing a local leakage magnetic flux that modifies the behavior of the global magnetic flux. This results in variations in amplitude, symmetry, and waveform shape in the three analyzed components: axial (X), radial (Y), and rotational (Z).

The use of statistical indicators made it possible to quantify these changes in terms of signal amplitude, shape, and energy, providing discriminative features to differentiate between healthy and faulty conditions. The application of the Fisher Score feature selection technique enabled the identification of the most relevant indicators, while dimensionality reduction using PCA adequately preserved the structure of the original dataset when projecting it into a two-dimensional subspace. This preservation was verified through dimensionality reduction quality metrics, ensuring that the physical interpretation of the associated phenomena was not altered.

The combination of feature selection, dimensionality reduction, and classification using SVM allowed outstanding performance metrics to be obtained, reaching an accuracy of 98.60% for Winding 1, 97.84% for Winding 2 and 100% for Winding 3. These results demonstrate that it is possible to diagnose inter-turn short-circuit faults from any winding where the measurement is performed, according to the severity level presented in this case study. Due to the electromagnetic proximity between the windings, the changes in the leakage magnetic flux induced by an internal fault are reflected in the other phases. This confirms the feasibility of the proposed methodology for the non-invasive diagnosis of internal faults in three-phase transformers, with potential application in online monitoring systems that enable early fault detection before catastrophic failures that could lead to significant economic losses.

For future studies, we propose developing a more robust methodology focused on the diagnosis of internal faults in transformers constructed under international standards with different power ratings, considering a smaller number of short-circuited turns (two, four, six, eight, ten turns), different loading conditions (20%, 30%, 50%, 75%, 100%, and current imbalance), and frequency-domain analysis techniques that allow the identification of spectral components characteristic of inter-turn short-circuit faults. Finally, the development of an FPGA-based continuous monitoring device is proposed, enabling the implementation of an intelligent relay aligned with IEEE Std C37.91-2021: Guide for Protecting Power Transformers.

Author Contributions

Conceptualization, D.C.-R., I.Z.-R. and J.A.A.-D.; methodology, D.C.-R. and I.Z.-R.; software, D.C.-R. and I.Z.-R.; validation, I.Z.-R., L.D. and J.A.A.-D.; formal analysis, D.C.-R. and I.Z.-R.; investigation, D.C.-R., L.D. and J.A.A.-D.; resources, L.D. and J.A.A.-D.; data curation, D.C.-R.; writing—original draft preparation, D.C.-R. and I.Z.-R.; writing—review and editing, I.Z.-R., L.D. and J.A.A.-D.; visualization, D.C.-R. and L.D.; supervision, I.Z.-R. and J.A.A.-D.; project administration, J.A.A.-D.; funding acquisition, J.A.A.-D. and L.D. All authors have read and agreed to the published version of the manuscript.

Funding

This work was partially supported by the grant PID2024-155729OB-I00 funded by MICIU/AEI/10.13039/501100011033 and the European Union ERDF.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The data presented in this study are available on request from the corresponding author. The data are not publicly available due to privacy issues.

Acknowledgments

The author D.C.-R. gratefully acknowledges the support provided by the Secretaría de Ciencia, Humanidades, Tecnología e Innovación (SECIHTI), Mexico, through scholarship 1316038.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

H–NL	Healthy–No Load
H–WL15%	Healthy–With Load 15%
H–WL40%	Healthy–With Load 40%
F–NL	Fault–No Load
F–WL15%	Fault–With Load 15%
F–WL40%	Fault–With Load 40%

References

1. Granados-Lieberman, D.; Huerta-Rosales, J.R.; Gonzalez-Cordoba, J.L.; Amezquita-Sanchez, J.P.; Valtierra-Rodriguez, M.; Camarena-Martinez, D. Time-Frequency Analysis and Neural Networks for Detecting Short-Circuited Turns in Transformers in Both Transient and Steady-State Regimes Using Vibration Signals. Appl. Sci. 2023, 13, 12218.
2. Tenbohlen, S.; Jagers, J.; Vahidi, F. Standardized survey of transformer reliability: On behalf of CIGRE WG A2.37. Proc. Int. Symp. Electr. Insul. Mater. 2017, 2, 593–596.
3. Haghjoo, F.; Mostafaei, M.; Mohammadi, H. A New Leakage Flux-Based Technique for Turn-to-Turn Fault Protection and Faulty Region Identification in Transformers. IEEE Trans. Power Deliv. 2018, 33, 671–679.
4. C57.152-2013; IEEE Guide for Diagnostic Field Testing of Fluid-Filled Power Transformers, Regulators, and Reactors. IEEE: New York, NY, USA, 2013; pp. 1–121.
5. Medeiros, R.P.; Costa, F.B. A Wavelet-Based Transformer Differential Protection: Internal Fault Detection during Inrush Conditions. IEEE Trans. Power Deliv. 2018, 33, 2965–2977.
6. Liu, J.; Fan, X.; Zhang, C.; Lai, C.S.; Zhang, Y.; Zheng, H.; Lai, L.L.; Zhang, E. Moisture Diagnosis of Transformer Oil-Immersed Insulation with Intelligent Technique and Frequency-Domain Spectroscopy. IEEE Trans. Ind. Inform. 2021, 17, 4624–4634.
7. Fanchiang, K.H.; Huang, Y.C.; Kuo, C.C. Power electric transformer fault diagnosis based on infrared thermal images using wasserstein generative adversarial networks and deep learning classifier. Electronics 2021, 10, 1161.
8. Huerta-Rosales, J.R.; Granados-Lieberman, D.; Amezquita-Sanchez, J.P.; Camarena-Martinez, D.; Valtierra-Rodriguez, M. Vibration signal processing-based detection of short-circuited turns in transformers: A nonlinear mode decomposition approach. Mathematics 2020, 8, 575.
9. Huerta-Rosales, J.R.; Granados-Lieberman, D.; Garcia-Perez, A.; Camarena-Martinez, D.; Amezquita-Sanchez, J.P.; Valtierra-Rodriguez, M. Short-circuited turn fault diagnosis in transformers by using vibration signals, statistical time features, and support vector machines on fpga. Sensors 2021, 21, 3598.
10. Huerta-Rosales, J.R.; Granados-Lieberman, D.; Amezquita-Sanchez, J.P.; Garcia-Perez, A.; Bueno-Lopez, M.; Valtierra-Rodriguez, M. Contrast Estimation in Vibroacoustic Signals for Diagnosing Early Faults of Short-Circuited Turns in Transformers under Different Load Conditions. Energies 2022, 15, 8508.
11. Cabanas, M.; Melero, M.; Rojas, C.; Orcajo, G.; Cano, J.; González, F.; Norniella, J.G.; Rozada, S. Detection of Insulation Faults on Disk-Type Winding Transformers by means of Leakage Flux Analysis. In Proceedings of the 2009 IEEE International Symposium on Diagnostics for Electric Machines, Power Electronics and Drives, Cargese, France, 31 August–3 September 2009.
12. Cabanas, M.F.; González, F.P.; Melero, M.G.; García, C.H.R.; Orcajo, G.A.; Rodríguez, J.M.C.; Norniella, J.G. Insulation Fault Diagnosis in High Voltage Power Transformers by Means of Leakage Flux analysis. Prog. Electromagn. Res. 2011, 114, 211–234.
13. Haghjoo, F.; Mostafaei, M. Flux-based method to diagnose and identify the location of turn-to-turn faults in transformers. IET Gener. Transm. Distrib. 2016, 10, 1083–1091.
14. Haghjoo, F.; Mohammadi, H. Planar Sensors for Online Detection and Region Identification of Turn-to-Turn Faults in Transformers. IEEE Sens. J. 2017, 17, 5450–5459.
15. Zhu, N.; Li, J.; Shao, L.; Liu, H.; Ren, L.; Zhu, L. Analysis of Interturn Faults on Transformer Based on Electromagnetic-Mechanical Coupling. Energies 2023, 16, 512.
16. Wang, J.; Liu, Y.; Mao, J.; Liu, S.; Tong, Z.; Deng, X.; Tan, W. Research on Active Defense System for Transformer Early Fault Based on Fiber Leakage Magnetic Field Measurement. Energies 2025, 18, 4497.
17. Li, X.; Yang, C.; Shuai, Y.; Wu, D.; Zhang, Z.; Yang, L. Experimental Methods and Equivalence Research on Inter-Turn Short Circuits in Power Transformers. Energies 2025, 18, 5453.
18. Wang, B.; Wang, L. A fault diagnosis method for inter-turn short circuit based on magnetic field distribution. Sci. Rep. 2025, 15, 17409.
19. Saucedo-Dorantes, J.J.; Jaen-Cuellar, A.Y.; Delgado-Prieto, M.; Romero-Troncoso, R.d.J.; Osornio-Rios, R.A. Condition monitoring strategy based on an optimized selection of high-dimensional set of hybrid features to diagnose and detect multiple and combined faults in an induction motor. Meas. J. Int. Meas. Confed. 2021, 178, 109404.
20. Fitzgerald, A.E.; Kingsley, C.; Kusko, A. Electric Machinery; McGraw-Hill Book Company: New York, NY, USA, 2003.
21. Chapman, S.J. Transformers. In Electric Machinery Fundamentals, 3rd ed.; McGraw-Hill: New York, NY, USA, 1999; pp. 61–117.
22. Athikessavan, S.C.; Jeyasankar, E.; Manohar, S.S.; Panda, S.K. Inter-Turn Fault Detection of Dry-Type Transformers Using Core-Leakage Fluxes. IEEE Trans. Power Deliv. 2019, 34, 1230–1241.
23. C57.12.91-2020; IEEE Standard Test Code for Dry-Type Distribution and Power Transformers; Revision of IEEE Std C57.12.91-2011. IEEE: New York, NY, USA, 2021; pp. 1–102.
24. Sun, L.; Wang, T.; Ding, W.; Xu, J.; Lin, Y. Feature selection using Fisher score and multilabel neighborhood rough sets for multilabel classification. Inf. Sci. 2021, 578, 887–912.
25. Chiang, L.H.; Kotanchek, M.E.; Kordon, A.K. Fault diagnosis based on Fisher discriminant analysis and support vector machines. Comput. Chem. Eng. 2004, 28, 1389–1401.
26. Ghosh, A.; Nashaat, M.; Miller, J. Context-Based Evaluation of Dimensionality Reduction Algorithms—Experiments and Statistical Significance Analysis. ACM Trans. Knowl. Discov. Data 2021, 15, 24.
27. Kaya, S.; Cicioǧlu Aridoǧan, B.; Demirci, M. Hepatit B ve C virus enfeksiyonu olan hastalarda hepatit G virus prevalansι. Mikrobiyol. Bul. 2004, 38, 421–427.
28. Anowar, F.; Sadaoui, S.; Selim, B. Conceptual and empirical comparison of dimensionality reduction algorithms (PCA, KPCA, LDA, MDS, SVD, LLE, ISOMAP, LE, ICA, t-SNE). Comput. Sci. Rev. 2021, 40, 100378.
29. McInnes, L.; Healy, J.; Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. arXiv 2020, arXiv:1802.03426.
30. Choi, H.; Choi, S. Robust kernel Isomap. Pattern Recognit. 2007, 40, 853–862.
31. Thrun, M.C.; Märte, J. Analyzing Quality Measurements for Dimensionality Reduction. Mach. Learn. Knowl. Extr. 2023, 5, 1076–1118.
32. Najim, S.A.; Soo, I. Trustworthy dimension reduction for visualization different data sets. Inf. Sci. 2014, 278, 206–220.
33. Ali, K.; Al-hameed, A. Spearman’s correlation coefficient in statistical analysis. Int. J. Nonlinear Anal. Appl. 2022, 13, 3249–3255.
34. Mair, P.; Borg, I.; Rusch, T. Goodness-of-Fit Assessment in Multidimensional Scaling and Unfolding. Multivar. Behav. Res. 2017, 51, 772–789.
35. Kruskal, J.B. Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis. Psychometrika 1964, 29, 1–27.
36. Dexter, E.; Rollwagen-Bollens, G.; Bollens, S.M. The trouble with stress: A flexible method for the evaluation of nonmetric multidimensional scaling. Limnol. Oceanogr. Methods 2018, 16, 401–473.
37. Bacha, K.; Souahlia, S.; Gossa, M. Power transformer fault diagnosis based on dissolved gas analysis by support vector machine. Electr. Power Syst. Res. 2012, 83, 73–79.
38. Romero-Ramirez, L.A.; Elvira-Ortiz, D.A.; Jaen-Cuellar, A.Y.; Morinigo-Sotelo, D.; Osornio-Rios, R.A.; Romero-Troncoso, R.D.J. Methodology based on higher-order statistics and genetic algorithms for the classification of power quality disturbances. IET Gener. Transm. Distrib. 2020, 14, 4580–4592.
39. C37.91-2021; IEEE Guide for Protecting Power Transformers; Revision of IEEE Std C37.91-2008. IEEE: New York, NY, USA, 2021; pp. 1–160.
40. Xu, G.; Zhang, M.; Chen, W. Transformer Fault Diagnosis Utilizing Feature Extraction and Ensemble Learning Model. Information 2024, 15, 561.

Figure 1. Equivalent circuit of a single-phase transformer under no-load conditions with an inter-turn short-circuit.

Figure 2. Support vector machine architecture.

Figure 3. Flowchart for proposed methodology.

Figure 4. Experimental setup.

Figure 5. Connection diagram for the inter-turn short-circuit test.

Figure 6. Three-dimensional leakage magnetic flux signals under healthy and inter-turn short-circuit fault conditions: (a) healthy without load; (b) healthy with load; (c) inter-turn short-circuit fault without load; (d) inter-turn short-circuit fault with load.

Figure 7. Normalized leakage magnetic flux signals in axial (X), radial (Y), and rotational (Z) components: (a) healthy without load; (b) healthy with load; (c) inter-turn short-circuit fault with load.

Figure 8. Signal segmentation process using non-overlapping rectangular windows (1 s interval).

Figure 9. Two-dimensional feature space projection for Winding 1 using (a) LDA; (b) PCA; (c) ISOMAP; (d) UMAP.

Figure 10. Two-dimensional feature space projection for Winding 2 using (a) LDA; (b) PCA; (c) ISOMAP; (d) UMAP.

Figure 11. Two-dimensional feature space projection for Winding 3 using (a) LDA; (b) PCA; (c) ISOMAP; (d) UMAP.

Figure 12. Classification results for Winding 1: (a) genetic algorithm convergence for SVM; (b) decision regions of the GA-optimized SVM; (c) confusion matrix.

Figure 13. Classification results for Winding 2: (a) genetic algorithm convergence for SVM; (b) decision regions of the GA-optimized SVM; (c) confusion matrix.

Figure 14. Classification results for Winding 3: (a) genetic algorithm convergence for SVM; (b) decision regions of the GA-optimized SVM; (c) confusion matrix.
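
The segmentation step summarized in the Figure 8 caption (non-overlapping rectangular windows of 1 s) can be sketched as follows. This is only an illustration: the function name `segment_signal` and the sampling rate used in the example are assumptions, not values taken from the experimental setup, and the sketch assumes that trailing samples that do not fill a whole window are discarded.

```python
from typing import List, Sequence


def segment_signal(signal: Sequence[float], fs: int,
                   window_s: float = 1.0) -> List[List[float]]:
    """Split a signal into non-overlapping rectangular windows.

    fs is the sampling rate in Hz (supplied by the caller; hypothetical here).
    Samples left over after the last complete window are dropped.
    """
    size = int(round(fs * window_s))
    if size <= 0:
        raise ValueError("window must contain at least one sample")
    n_windows = len(signal) // size
    return [list(signal[k * size:(k + 1) * size]) for k in range(n_windows)]


# Illustrative call: 10 synthetic samples at a made-up rate of 4 Hz
# give two complete 1 s windows; the last two samples are discarded.
windows = segment_signal([float(i) for i in range(10)], fs=4)
```

Each returned window would then be passed to the statistical feature computation, one feature vector per window and per flux component.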

Table 1. Statistical features.

Feature	Equation	No.
Mean	$F_{1} = \frac{1}{N} \sum_{i = 1}^{N} X_{i}$	(6)
Maximum Value	$F_{2} = \max (X_{i})$	(7)
Root Mean Square (RMS)	$F_{3} = \left( \frac{1}{N} \sum_{i = 1}^{N} X_{i}^{2} \right)^{1 / 2}$	(8)
Square Root Mean (SRM)	$F_{4} = \left( \frac{1}{N} \sum_{i = 1}^{N} \lvert X_{i} \rvert^{1 / 2} \right)^{2}$	(9)
Standard Deviation	$F_{5} = \sqrt{\frac{1}{N} \sum_{i = 1}^{N} (X_{i} - F_{1})^{2}}$	(10)
Variance	$F_{6} = \frac{1}{N} \sum_{i = 1}^{N} (X_{i} - F_{1})^{2}$	(11)
RMS Shape Factor	$F_{7} = \frac{F_{3}}{\frac{1}{N} \sum_{i = 1}^{N} \lvert X_{i} \rvert}$	(12)
SRM Shape Factor	$F_{8} = \frac{F_{4}}{\frac{1}{N} \sum_{i = 1}^{N} \lvert X_{i} \rvert}$	(13)
Crest Factor	$F_{9} = \frac{F_{2}}{F_{4}}$	(14)
Latitude Factor	$F_{10} = \frac{F_{2}}{F_{3}}$	(15)
Impulse Factor	$F_{11} = \frac{F_{2}}{\frac{1}{N} \sum_{i = 1}^{N} \lvert X_{i} \rvert}$	(16)
Skewness	$F_{12} = \frac{\sum_{i = 1}^{N} (X_{i} - F_{1})^{3}}{F_{5}^{3}}$	(17)
Kurtosis	$F_{13} = \frac{\sum_{i = 1}^{N} (X_{i} - F_{1})^{4}}{F_{5}^{4}}$	(18)
Shannon Entropy	$F_{14} = \sum_{i = 1}^{N} X_{i}^{2} \log (X_{i}^{2})$	(19)
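
As an illustration of how the expressions in Table 1 translate into code, the sketch below evaluates Equations (6) to (13) and (16) for a single window of samples. The function name `time_features` and the dictionary keys are hypothetical; the remaining features are omitted for brevity, and the sketch assumes a window with at least one non-zero sample so that the shape and impulse factors are defined.

```python
import math
from typing import Dict, Sequence


def time_features(x: Sequence[float]) -> Dict[str, float]:
    """Evaluate a subset of the Table 1 features for one window."""
    n = len(x)
    if n == 0:
        raise ValueError("empty window")
    mean_abs = sum(abs(v) for v in x) / n
    if mean_abs == 0.0:
        raise ValueError("all-zero window: ratio features are undefined")

    f1 = sum(x) / n                                      # Eq. (6), mean
    f2 = max(x)                                          # Eq. (7), maximum
    f3 = math.sqrt(sum(v * v for v in x) / n)            # Eq. (8), RMS
    f4 = (sum(math.sqrt(abs(v)) for v in x) / n) ** 2    # Eq. (9), SRM
    f6 = sum((v - f1) ** 2 for v in x) / n               # Eq. (11), variance
    f5 = math.sqrt(f6)                                   # Eq. (10), std. dev.
    return {
        "mean": f1,
        "maximum": f2,
        "rms": f3,
        "srm": f4,
        "std": f5,
        "variance": f6,
        "rms_shape_factor": f3 / mean_abs,               # Eq. (12)
        "srm_shape_factor": f4 / mean_abs,               # Eq. (13)
        "impulse_factor": f2 / mean_abs,                 # Eq. (16)
    }
```

For example, the two-sample window `[0.0, 4.0]` gives a mean of 2, an SRM of 1, and an impulse factor of 2, which can be checked by hand against the table.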

Table 2. Optimal feature subset selected using Fisher Score for axial (X) component.

Winding 1 Feature	Fisher Score	Winding 2 Feature	Fisher Score	Winding 3 Feature	Fisher Score
Mean	508.72	Latitude Factor	898.63	RMS Shape Factor	662.39
Crest Factor	125.29	Impulse Factor	891.37	Standard Deviation	294.91
Maximum Value	58.55	Standard Deviation	267.98	SRM	258.60
RMS	58.15	Variance	176.91	Variance	181.48
-	-	RMS Shape Factor	77.86	SRM Shape Factor	87.66
-	-	Mean	51.35	-	-

Table 3. Optimal feature subset selected using Fisher Score for radial (Y) component.

Winding 1 Feature	Fisher Score	Winding 2 Feature	Fisher Score	Winding 3 Feature	Fisher Score
Kurtosis	63.47	Shannon Entropy	172.68	Mean	253.78
-	-	Standard Deviation	156.00	SRM	180.36
-	-	Variance	120.56	SRM Shape Factor	165.22
-	-	-	-	Shannon Entropy	113.23
-	-	-	-	Standard Deviation	108.24
-	-	-	-	Variance	95.09

Table 4. Optimal feature subset selected using Fisher Score for rotational (Z) component.

Winding 1 Feature	Fisher Score	Winding 2 Feature	Fisher Score	Winding 3 Feature	Fisher Score
Impulse Factor	205.48	Standard Deviation	49.33	Standard Deviation	137.41
SRM	205.13	RMS Shape Factor	48.08	-	-
Latitude Factor	205.11	Variance	47.77	-	-
Standard Deviation	128.56	-	-	-	-
SRM Shape Factor	97.49	-	-	-	-
Variance	96.82	-	-	-	-
RMS Shape Factor	90.83	-	-	-	-
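
The feature rankings in Tables 2 to 4 rely on the Fisher Score. The sketch below uses one widely used form of that score, the ratio of between-class scatter to within-class scatter for a single feature; this form is an assumption here and may differ from the expression used in the study by a scaling factor. The function names `fisher_score` and `rank_features`, and the toy data in the comments, are hypothetical.

```python
import math
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def fisher_score(values: Sequence[float], labels: Sequence[Hashable]) -> float:
    """Between-class over within-class scatter for one feature.

    score = sum_k n_k (mu_k - mu)^2 / sum_k n_k var_k  (assumed form)
    """
    groups = defaultdict(list)
    for v, y in zip(values, labels):
        groups[y].append(v)
    overall = sum(values) / len(values)
    between = 0.0
    within = 0.0
    for g in groups.values():
        mu = sum(g) / len(g)
        var = sum((v - mu) ** 2 for v in g) / len(g)
        between += len(g) * (mu - overall) ** 2
        within += len(g) * var
    if within == 0.0:
        # Classes with zero spread: perfectly separable or constant feature.
        return math.inf if between > 0.0 else 0.0
    return between / within


def rank_features(table: Dict[str, Sequence[float]],
                  labels: Sequence[Hashable]) -> List[Tuple[str, float]]:
    """Return (feature name, score) pairs sorted from most to least relevant."""
    scored = [(name, fisher_score(col, labels)) for name, col in table.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

With labels `[0, 0, 1, 1]`, a feature with values `[0, 2, 10, 12]` scores 25 under this form, whereas `[0, 10, 2, 12]` scores 0.04, so the first feature would be ranked higher.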

Table 5. Performance evaluation metrics of dimensionality reduction methods for Winding 1.

Method	Trustworthiness	Spearman’s Correlation	Kruskal’s Stress
LDA	0.9435	0.8415	71.5422
PCA	0.9978	0.9995	0.0158
ISOMAP	0.9960	0.9954	0.1034
UMAP	0.9991	0.5493	19.0411

Table 6. Performance evaluation metrics of dimensionality reduction methods for Winding 2.

Method	Trustworthiness	Spearman’s Correlation	Kruskal’s Stress
LDA	0.8488	0.2046	4.3820
PCA	0.9807	0.9987	0.0022
ISOMAP	0.9789	0.9861	0.0082
UMAP	0.9981	0.2195	0.8894

Table 7. Performance evaluation metrics of dimensionality reduction methods for Winding 3.

Method	Trustworthiness	Spearman’s Correlation	Kruskal’s Stress
LDA	0.8619	0.4427	1.2315
PCA	0.9992	1.000	0.0006
ISOMAP	0.9953	0.9987	0.0158
UMAP	0.9994	0.2162	0.7899
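
Two of the quality metrics reported in Tables 5 to 7 can be illustrated with the sketch below, which compares pairwise Euclidean distances before and after projection. It assumes Spearman's correlation is computed between the two sets of pairwise distances and uses the raw (unscaled) stress-1 form of Kruskal's stress; both choices are assumptions, so the output is not expected to reproduce the tabulated values unless the same normalization is applied. Trustworthiness is omitted for brevity, and all function names are hypothetical.

```python
import math
from itertools import combinations
from typing import List, Sequence

Point = Sequence[float]


def pairwise_distances(points: Sequence[Point]) -> List[float]:
    """Euclidean distances for every unordered pair of points."""
    return [math.dist(p, q) for p, q in combinations(points, 2)]


def average_ranks(values: Sequence[float]) -> List[float]:
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    sa = math.sqrt(sum((x - ma) ** 2 for x in a))
    sb = math.sqrt(sum((y - mb) ** 2 for y in b))
    if sa == 0.0 or sb == 0.0:
        raise ValueError("correlation undefined for constant input")
    return cov / (sa * sb)


def spearman_distance_correlation(high: Sequence[Point],
                                  low: Sequence[Point]) -> float:
    """Spearman correlation = Pearson correlation of the distance ranks."""
    return pearson(average_ranks(pairwise_distances(high)),
                   average_ranks(pairwise_distances(low)))


def kruskal_stress(high: Sequence[Point], low: Sequence[Point]) -> float:
    """Raw stress-1: sqrt(sum (d_high - d_low)^2 / sum d_high^2)."""
    d_high = pairwise_distances(high)
    d_low = pairwise_distances(low)
    num = sum((a - b) ** 2 for a, b in zip(d_high, d_low))
    den = sum(a * a for a in d_high)
    return math.sqrt(num / den)
```

A projection that preserves every pairwise distance gives a correlation of 1 and a stress of 0. Doubling all projected coordinates leaves the rank correlation at 1 but raises this raw stress to 1, which shows that the unscaled form is sensitive to the scale of the embedding.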

Table 8. Comparative performance of SVM optimization algorithms for Winding 1.

Optimization Method	Penalty Parameter (C)	RBF Kernel Parameter	Accuracy (%)	Time (s)
Genetic Algorithm	99.24	7.95	98.60	57.45
Grid Search	100.00	10.00	98.42	89.23
Particle Swarm	91.81	8.12	98.60	63.32

Table 9. Comparative performance of SVM optimization algorithms for Winding 2.

Optimization Method	Penalty Parameter (C)	RBF Kernel Parameter	Accuracy (%)	Time (s)
Genetic Algorithm	74.00	1.91	97.84	68.28
Grid Search	100.00	1.62	97.20	96.57
Particle Swarm	85.89	1.88	97.84	77.32

Table 10. Comparative performance of SVM optimization algorithms for Winding 3.

Optimization Method	Penalty Parameter (C)	RBF Kernel Parameter	Accuracy (%)	Time (s)
Genetic Algorithm	33.98	0.36	100.00	50.52
Grid Search	100.00	0.0037	100.00	59.32
Particle Swarm	53.61	0.34	100.00	54.89

Table 11. Performance metrics obtained with SVM classifier for each winding.

Transformer Winding	Accuracy (%)	Recall (%)	F1-Score (%)
Winding 1	98.60	98.60	98.60
Winding 2	97.84	97.84	97.84
Winding 3	100	100	100
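
The metrics in Table 11 can be derived from a confusion matrix as sketched below. The function name `classification_metrics`, the dictionary keys, and the example counts are hypothetical and are not the study's data. The sketch reports both macro and support-weighted averages; support-weighted recall equals accuracy by construction, which is one possible reason the Accuracy and Recall columns coincide, although the averaging used in the study is not stated here.

```python
from typing import Dict, List


def classification_metrics(cm: List[List[int]]) -> Dict[str, float]:
    """cm[i][j] = number of samples of true class i predicted as class j."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("empty confusion matrix")
    correct = sum(cm[i][i] for i in range(k))

    recalls, f1s, supports = [], [], []
    for i in range(k):
        tp = cm[i][i]
        support = sum(cm[i])                       # true samples of class i
        predicted = sum(cm[r][i] for r in range(k))  # samples predicted as i
        recall = tp / support if support else 0.0
        precision = tp / predicted if predicted else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        recalls.append(recall)
        f1s.append(f1)
        supports.append(support)

    return {
        "accuracy": correct / total,
        "macro_recall": sum(recalls) / k,
        "macro_f1": sum(f1s) / k,
        "weighted_recall": sum(r * s for r, s in zip(recalls, supports)) / total,
        "weighted_f1": sum(f * s for f, s in zip(f1s, supports)) / total,
    }


# Hypothetical two-class counts (healthy vs. faulty), for illustration only.
example = classification_metrics([[48, 2], [1, 49]])
```

In this example the accuracy is 0.97 and the support-weighted recall takes the same value, while the F1-score differs slightly because it also depends on precision.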
