1. Introduction
Induction motors are widely used in industrial applications due to their robustness, efficiency, and cost-effectiveness. However, like all rotating machinery, they are subject to various electrical and mechanical faults that can compromise reliability and shorten service life. Among these, stator winding faults, particularly inter-turn short circuits (ITSCs), are considered especially critical because they can quickly evolve into severe failures if not detected at an early stage. Accurate and timely fault diagnosis is therefore of great importance, and condition-based monitoring and predictive maintenance strategies have become a cornerstone of Industry 4.0-oriented manufacturing systems [1, 2]. Modern industrial facilities increasingly rely on intelligent monitoring systems that integrate Internet of Things (IoT) sensors and advanced analytics to enable real-time fault detection and proactive maintenance scheduling [3, 4]. These data-driven approaches not only reduce unplanned downtime but also optimize maintenance costs and extend equipment service life, making reliable fault detection algorithms essential for competitive industrial operations. Early and reliable ITSC detection has long been a priority, with foundational studies analyzing turn-to-turn short circuits in operating motors and during switch-off transients directly in the frequency domain [5, 6].
Traditionally, motor current signature analysis (MCSA) has served as the cornerstone of fault diagnosis in induction machines, relying on the identification of characteristic frequency components in the stator current spectrum [
7,
8]. Over the years, various signal processing enhancements have been introduced to improve the sensitivity of MCSA, including empirical mode decomposition (EMD), wavelet transform, and principal component analysis (PCA) [
9]. These methods have proven effective for rotor bar faults, eccentricities, and bearing anomalies. For stator faults, spectral strategies are particularly effective: parametric spectral estimation sharpens fault-related lines under short records and varying loads [
10]. Advanced parametric spectral estimation methods have demonstrated superior frequency resolution compared to non-parametric approaches when signal duration is restricted [
10]. These high-resolution techniques address the fundamental trade-off between time and frequency resolution inherent in short-duration signal analysis. Rotor slot harmonics (RSH) have emerged as particularly sensitive fault indicators for stator winding defects, with the matrix pencil method specifically designed to extract RSH components with high precision [
11]. Further contributions underline frequency-domain diagnostics under short records and inverter-fed conditions, including harmonic/sideband modeling and high-resolution estimation [
12,
13].
Short-Time Fourier Transform (STFT) has been widely used to generate spectrograms that reveal transient fault signatures requiring time–frequency analyses [
14,
15]. When combined with deep learning, STFT-based approaches have achieved high accuracy in fault classification, including rotor bar fault detection [
16]. In [
17], authors show that time–frequency trajectory tracking methods can support interpretation under variable speeds. More advanced spectral estimation methods, such as minimum-norm time–frequency analysis, have demonstrated improved resolution for multiple-fault scenarios under variable operating conditions [
18]. Unlike STFT, these subspace-based approaches can achieve frequency resolution that is not strictly constrained by signal duration, making them particularly suitable for the analysis of brief measurement windows encountered in rapid industrial inspection scenarios. In parallel, statistics such as spectral kurtosis have proven effective in detecting impulsive or transient fault patterns [
16]. Related cyclostationary analysis can assist in the demodulation and reinforcement of sidebands in the spectrum [
19,
20]. Cyclostationary signal analysis offers a powerful framework for extracting periodic modulation patterns characteristic of rotating machinery faults. Vibration and current signals from faulty motors exhibit cyclostationary behavior due to the periodic interaction between stationary and rotating components [
21]. Spectral correlation and cyclic coherence maps can reveal fault-related modulation sidebands even when they are masked by noise or other cyclic components in conventional power spectra. Improved cyclostationary methods combining Teager–Kaiser energy operator (TKEO) demodulation with fast spectral correlation have successfully diagnosed broken rotor bar and bearing faults under various operating conditions [
22,
23]. Recent surveys synthesize early-detection strategies and data-driven pipelines for induction machines, emphasizing the practical role of spectral indicators and sidebands [
16,
24].
Recent years have witnessed remarkable progress in the integration of artificial intelligence (AI) with motor fault diagnosis. Deep learning models leveraging time–frequency images or raw signal data have outperformed traditional approaches in terms of accuracy and generalization. For instance, convolutional neural networks (CNNs) trained on STFT-based spectral representations, as well as lightweight models such as ShuffleNetV2, have achieved very high accuracy while remaining computationally efficient [16]. Ensemble methods, such as the weighted probability ensemble deep learning (WPEDL) framework, combine features from both current and vibration signals and achieve high classification rates across multiple fault types [
16]. Additionally, graph neural networks (GNNs) have been introduced for the direct analysis of raw current and vibration signals, showing promising results in capturing complex dependencies between sensor modalities [
16]. Unlike traditional CNNs that process signals as independent samples, GNNs construct graph structures that explicitly model dependencies between data points, potentially improving generalization to unseen operating conditions. In parallel, recent works report AR/Prony-based spectral MCSA and ITSC-oriented indicators under inverter-fed or variable-load conditions [
25]. However, deep learning approaches often require extensive labeled datasets and lack the interpretability necessary for safety-critical industrial applications. The black-box nature of neural networks makes it difficult to validate their decisions based on a physical understanding of fault mechanisms.
With respect to stator winding faults, several recent contributions have specifically addressed ITSC detection [
26]. Other works have developed indicators, such as the complex current unbalance coefficient (CCUC), to discriminate between ITSC and voltage unbalance, achieving high robustness under realistic operating conditions (see related negative-sequence compensation in [
27]). Inter-turn short circuits create an imbalance in the three-phase stator windings, generating negative-sequence components that are nominally absent in healthy motors under balanced supply conditions [
28,
29]. Negative-sequence current analysis can distinguish ITSC-induced asymmetry from supply voltage unbalance, achieving high robustness under realistic operating conditions [
29]. Space-vector trajectory analysis during start-up transients can reveal ITSC signatures that are obscured during steady-state operation [
30]. Envelope energy methods applied to start-up current signals have achieved high accuracy in early ITSC fault detection, with reported detection rates of 96.9% using machine learning classification [
31]. The amplification of fault signatures during transient operation, when currents reach several times their rated values, makes start-up analysis particularly attractive for detecting incipient faults. The fusion of multi-sensor data and advanced deep learning architectures has further improved early-stage fault detection capabilities [
32]. While these methods represent important advances, they often rely on advanced feature engineering, extensive training datasets, or access to multiple sensing modalities, which may restrict their adoption in industrial environments. By contrast, spectral techniques that target physics-informed sidebands—such as stator-fault components and rotor-slot harmonics—offer explainability and low sensor overhead [
5,
6,
11].
Despite these technological advances, important limitations persist: application to short-duration signals is often hindered by the fundamental trade-off between time and frequency resolution, and most of the aforementioned methods lack a rigorous statistical framework for decision-making. Specifically, the significance of characteristic fault frequencies is rarely tested using statistical inference, such as confidence intervals, hypothesis testing, or test power analysis. This gap becomes particularly evident when analyzing short-duration signals (e.g., 0.2 s, 10,000 samples), where poor spectral resolution introduces high uncertainty. Although some research has proposed statistic-based spectral indicators for bearing faults [
33,
34], systematic statistical methodologies for short-time current signals remain scarce. Recent contributions focused on start-up features, space-vector indicators, and sequence compensation highlight the domain need but typically do not attach formal confidence measures [
27,
30,
31]. Furthermore, deep learning approaches typically do not provide probabilistic outputs or uncertainty quantification, limiting their trustworthiness in safety-critical applications [
35]. This fundamental limitation undermines the development of objective detection thresholds that balance sensitivity against false alarm rates, which is essential in industrial contexts where maintenance decisions carry significant economic consequences and regulatory compliance may require statistically justified diagnostic criteria.
The present work introduces a novel statistical approach for the spectral analysis of short-duration signals and applies it to stator fault diagnosis. The proposed approach employs confidence intervals and hypothesis testing [36, 37, 38] to statistically evaluate informative spectral components, enabling the construction of a health indicator (HI) that captures significant amplitude variations associated with inter-turn short circuits. Unlike purely data-driven techniques, the method provides interpretable results with quantified uncertainty, thus enhancing its trustworthiness. The contribution is validated through experimental studies on a dedicated test rig, encompassing five levels of fault severity and seven load conditions, and is tested on previously unseen data to demonstrate robustness. The novelty of the study is a statistically grounded framework for stator winding fault detection based on a spectral health indicator constructed from very short signal records (e.g., 0.2 s). The selection of a 0.2 s time window is motivated by the requirement to monitor machines operating under dynamic load conditions. In such applications, a longer acquisition time (e.g., 1 s) would violate the stationarity assumption due to supply frequency fluctuations, leading to spectral smearing. Furthermore, the short window aligns with the capabilities of low-cost embedded sensors designed for rapid condition assessment during short duty cycles. The proposed approach operates robustly across multiple fault severities (including a single shorted turn) and a wide range of load conditions. By explicitly modeling the sampling distribution of spectral amplitude estimates, the method forms confidence bounds and hypothesis tests at physics-informed sidebands, enabling reliable decisions even when frequency resolution is coarse and spectral estimates are biased. Experimental validation on previously unseen data indicates a high probability of detection at controlled false-alarm rates. In summary, the contribution advances spectral and statistical diagnostics of ITSC and strengthens decision-making by providing a statistically validated HI tailored to very short records.
The remainder of this paper is organized as follows. Section 2 presents the proposed methodology, including the diagnostic framework and data processing strategy. Key implementation details and statistical assumptions for the spectral analysis are provided to facilitate reproducibility and industrial transfer. Section 3 describes the experimental setup and the measurement procedure used to collect current signals at different fault stages under various operating conditions. Section 4 summarizes and discusses the aggregated diagnostic results obtained for all fault types and load levels. Finally, Section 5 provides the main conclusions and outlines directions for future research.
2. Methodology
The proposed methodology aims to diagnose inter-turn short circuits in stator windings based on the statistical analysis of amplitudes at characteristic fault frequencies (FFs) extracted from short-duration current signals. The approach combines classical spectral analysis with statistical hypothesis testing to obtain a scalar health indicator, i.e., HI value, which is subsequently used for decision-making.
The proposed diagnostic approach is organized into three main stages: dataset construction, feature-based health modeling, and validation.
1. Dataset construction. Raw three-phase stator current signals, denoted as IsA, IsB, and IsC, were acquired under both healthy and faulty operating conditions, comprising five fault severity levels and seven load conditions. Each signal was segmented into short, fixed-length intervals to standardize the input size and increase the number of samples available for analysis. The resulting segments were labeled and divided into separate datasets for subsequent processing, ensuring that no segment appeared in more than one dataset. The overall dataset construction procedure is described in detail in Section 2.3.
2. Feature-based health modeling. To construct the health indicator, the fault-related informative frequencies must first be identified, so statistical testing is employed to determine which frequencies are statistically significant. Subsequently, the health indicator and its confidence interval are computed, both of which are essential for decision-making regarding fault diagnosis. This step consists of two main stages:
   (a) Feature selection based on statistical testing. Spectral features are extracted from the current signals to characterize both healthy and faulty operating conditions. Statistical analysis is then applied to identify the most informative features for distinguishing between these states (see Section 2.4.1).
   (b) Health indicator and confidence interval construction. Using the healthy reference data and the selected informative features, the health indicator and its confidence interval are established, providing a quantitative reference for subsequent fault detection (see Section 2.4.2).
   The step-by-step procedure of the feature-based health modeling is described in detail in Section 2.4.
3. Validation procedure. During validation, new unseen three-phase stator current signals, representing both healthy and faulty conditions with five fault-severity levels at various load settings, are employed. For each tested signal, the computed health indicator is compared against the reference confidence range. If the indicator falls within this range, the signal is classified as healthy; otherwise, it is identified as faulty. Finally, the overall diagnostic performance of the proposed method is evaluated. The step-by-step statistical testing procedure is described in detail in Section 2.5.
2.1. Spectral Representation of the Signal
Let , , be a current signal of length T. The first stage of the method involves the computation of the signal spectrum. A one-sided periodogram (non-parametric power spectral density, PSD) is used with a Hanning window , and the number of FFT points is set to . Zero padding is applied to improve frequency resolution.
The one-sided periodogram of
is defined as [
39]:
with the frequency
for
, and the
and
. The one-sided amplitude spectral density is defined as
where
is the Euclidean norm of vector
w.
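As a rough illustration of the spectral representation described in this subsection, the following Python sketch computes a Hann-windowed one-sided periodogram and takes its square root as an amplitude spectrum. The function name, the argument `fs_hz` (sampling frequency), and the normalisation by the sampling frequency times the squared window norm are assumptions based on the common textbook definition; they are not necessarily the exact form used in this paper.

```python
import numpy as np


def one_sided_periodogram(x, fs_hz, nfft=None):
    """Hann-windowed one-sided periodogram (PSD) and its square root.

    Assumed normalisation: |FFT(w * x)|^2 / (fs_hz * ||w||^2), with every
    bin except DC (and Nyquist, for even nfft) doubled.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    nfft = n if nfft is None else int(nfft)
    if nfft < n:
        # np.fft.rfft would silently truncate the record in this case.
        raise ValueError("nfft must be at least the record length")
    w = np.hanning(n)                          # symmetric Hann window
    spectrum = np.fft.rfft(w * x, n=nfft)      # zero-pads up to nfft
    psd = np.abs(spectrum) ** 2 / (fs_hz * np.sum(w ** 2))
    if nfft % 2 == 0:
        psd[1:-1] *= 2.0                       # keep DC and Nyquist single
    else:
        psd[1:] *= 2.0                         # odd nfft has no Nyquist bin
    freqs_hz = np.fft.rfftfreq(nfft, d=1.0 / fs_hz)
    return freqs_hz, psd, np.sqrt(psd)


# Toy usage: a 0.2 s, 50 Hz unit-amplitude sine sampled at 50 kHz.
fs_hz = 50_000.0
t = np.arange(10_000) / fs_hz
x = np.sin(2.0 * np.pi * 50.0 * t)
freqs_hz, psd, amp = one_sided_periodogram(x, fs_hz, nfft=16_384)
peak_hz = freqs_hz[np.argmax(psd)]             # expected within one bin of 50 Hz
total_power = np.sum(psd) * fs_hz / 16_384     # integrates to roughly 0.5
```

The sketch refuses an FFT length shorter than the record because NumPy would otherwise discard samples; other toolboxes may wrap the data instead, so the behaviour for short FFT lengths should be checked against the tool actually used.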
2.2. Theoretical Foundations of Stator Winding Fault Detection Based on Current Spectrum Analysis
The primary cause of characteristic harmonics in the stator current during an inter-turn short circuit (ITSC) is the disturbance of the magnetic circuit’s symmetry. In a healthy state, the stator windings produce a balanced magnetomotive force (MMF), resulting in an approximately sinusoidal flux distribution in the air gap. The occurrence of an ITSC reduces the effective number of turns in the faulty phase, leading to a significant short-circuit current that generates a locally opposing magnetic field. This negative MMF acts against the main MMF of the stator winding, causing its asymmetry, which weakens and distorts the total MMF. This disturbance creates an uneven flux distribution in the air gap, and this disturbed magnetic field subsequently interacts with the rotor. This interaction modulates the stator current with frequencies intrinsically linked to the rotor’s construction, particularly the rotor slot harmonics (RSH). The severity of the fault is directly correlated with the amplitude of these induced harmonic components; a larger fault, characterized by a higher short-circuit current and greater MMF distortion, results in a more pronounced increase in their magnitudes in the current spectrum.
In the FFT spectrum of the stator current, an ITSC fault manifests as an increase in the amplitudes of several characteristic groups of frequency components. The first group comprises the basic fault harmonics in the low-frequency range, which are a direct consequence of the MMF disturbance and depend on the rotor speed or slip. These components are calculated using the following formula:
where
—supply voltage frequency,
—number of pairs of poles,
s—slip,
,
, where
m belongs to the odd integer values.
The second, and particularly sensitive, group consists of the rotor slot harmonics (RSH) in the medium-frequency range, which are related to the physical construction of the rotor. The amplification of RSH during ITSC occurs because the stator asymmetry created by the shorted turns distorts the air-gap magnetomotive force (MMF), causing uneven flux distribution. This disturbed field interacts with the rotor’s physical structure (rotor slot spacing), thereby modulating the stator current with frequencies intrinsically linked to rotor geometry. Unlike low-frequency fault harmonics, RSH are less affected by slip variations at varying loads, making them more robust indicators. While these harmonics exist under normal conditions with small amplitudes, a stator fault significantly amplifies them. The general formula for these components is
where
—number of rotor bars.
The most diagnostically valuable is typically the Lower Rotor Slot Harmonic (LRSH) given by
The third group includes supply harmonics, calculated as
which are multiples of the fundamental frequency; however, these are less specific indicators compared to the first two groups.
It is essential to consider practical aspects, such as load dependence, as the frequency position of the slip-dependent fault components varies with slip. Under very light loads, they may approach supply-related harmonics, complicating detection. Furthermore, low-amplitude fault harmonics in the early stages can be masked by noise in conventional FFT analysis. For a measured rotor speed of $n$, the slip is determined as
$$s = \frac{n_s - n}{n_s}. \tag{7}$$
The synchronous speed $n_s$ (in rpm) can be calculated using the following formula:
$$n_s = \frac{60 f_s}{p}, \tag{8}$$
where $f_s$ is the supply frequency and $p$ denotes the number of pole pairs.
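To make the slip and rotor-slot-harmonic relations concrete, the sketch below evaluates them numerically. It assumes the usual textbook form of the rotor slot harmonics, f = f_s (k N_r (1 - s)/p ± m) with odd m, and the function names are illustrative. The motor data in the usage lines (50 Hz supply, 2 pole pairs, 26 rotor bars, 1407 rpm) are hypothetical values chosen only because they yield components near the 559.7 Hz and 659.7 Hz quoted in Section 4.2; the actual machine parameters cannot be recovered from the text above.

```python
def synchronous_speed_rpm(f_supply_hz, pole_pairs):
    """Synchronous speed in rpm for a given supply frequency and pole pairs."""
    return 60.0 * f_supply_hz / pole_pairs


def slip(n_rpm, n_sync_rpm):
    """Per-unit slip from measured and synchronous speed."""
    return (n_sync_rpm - n_rpm) / n_sync_rpm


def rsh_frequencies_hz(f_supply_hz, n_rpm, pole_pairs, rotor_bars,
                       m_values=(1, 3, 5), k=1):
    """Rotor slot harmonics, assumed form f_s * (k*N_r*(1-s)/p +/- m)."""
    n_sync = synchronous_speed_rpm(f_supply_hz, pole_pairs)
    s = slip(n_rpm, n_sync)
    centre = f_supply_hz * k * rotor_bars * (1.0 - s) / pole_pairs
    freqs = []
    for m in m_values:
        freqs.append(centre - m * f_supply_hz)   # lower sideband
        freqs.append(centre + m * f_supply_hz)   # upper sideband
    return sorted(freqs)


# Hypothetical motor data (assumptions, not taken from the paper).
n_sync = synchronous_speed_rpm(50.0, 2)
s = slip(1407.0, n_sync)
rsh = rsh_frequencies_hz(50.0, 1407.0, 2, 26)
```

With these assumed inputs, the six returned values are spaced by multiples of the supply frequency around a centre of N_r n/60, which is why a speed error shifts all of them by the same amount.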
In the presented approach, seven fault-related frequencies, denoted as
, are investigated. They consist of the third supply harmonic, i.e.,
, and the rotor slot harmonics for
and
. The resulting vector of potential fault frequencies
considered in this study is defined as follows:
Each empirical frequency estimate
is obtained as
where
denotes the estimated spectral density of the signal.
The selection of these components was determined experimentally by comparing healthy and faulty operating conditions. The chosen spectral components exhibit the largest differences in spectral amplitudes, making them the most informative for fault characterization.
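The local-maximum search behind each empirical frequency estimate can be sketched as follows. The helper name and the choice to express the search half-width in hertz are assumptions made for illustration; the study may define the window in samples or bins instead.

```python
import numpy as np


def refine_fault_frequency(freqs_hz, amplitude, f_theory_hz, half_width_hz):
    """Return (frequency, amplitude) of the largest spectral bin that lies
    within f_theory_hz +/- half_width_hz."""
    freqs_hz = np.asarray(freqs_hz, dtype=float)
    amplitude = np.asarray(amplitude, dtype=float)
    in_window = np.abs(freqs_hz - f_theory_hz) <= half_width_hz
    if not np.any(in_window):
        raise ValueError("search window contains no frequency bins")
    idx = np.flatnonzero(in_window)
    best = idx[np.argmax(amplitude[idx])]
    return freqs_hz[best], amplitude[best]


# Toy spectrum on a 5 Hz grid: a peak at 565 Hz inside the window and a
# larger peak at 200 Hz that must be ignored because it lies outside.
grid_hz = np.arange(0.0, 1000.0, 5.0)
spec = np.full(grid_hz.size, 0.1)
spec[113] = 1.0      # 565 Hz
spec[40] = 5.0       # 200 Hz
f_hat, a_hat = refine_fault_frequency(grid_hz, spec, 559.7, 12.0)
```

Restricting the search to a window around the theoretical value keeps a strong unrelated component, such as a supply harmonic, from being mistaken for the fault frequency.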
The graphical illustration of
detection for different fault stages, i.e., healthy signal (upper panel), fault 1 (middle panel), and fault 5 (bottom panel), is presented in
Figure 1. The
is marked with a black dashed line, with the region of local maximum exploration, i.e.,
marked with a shaded gray area.
The exemplary theoretical values of
used in this study (for load 6) for the analyzed motor, with the number of rotor slots equal to
, a measured rotor speed of
, and a current supply of
Hz are listed in
Table 1.
Using the formula for RSH, see Equation (
4), and for the formula for the slip and synchronous speed, see Equations (
7) and (
8); one can note that
Substituting into the fault frequency model from Equation (
4) gives
Using
(hence
), the expression simplifies to a linear form
This linear dependence yields a direct sensitivity of the predicted fault frequency to rotor-speed estimation errors. For the parameters
,
,
, and
, the expression becomes
A rotor-speed estimation error
produces a frequency prediction error
To ensure that the true spectral peak remains inside the search window of
, the condition
must hold, which yields the admissible speed error bound
Summarizing, for the
, small speed/load variations within
will not affect the effectiveness of the proposed method, as the spectral peak of empirical frequency estimate
in Equation (
10) will be properly assigned.
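The sensitivity argument of this subsection can be written out explicitly. The following is a sketch that assumes the usual textbook form of the rotor slot harmonics and uses assumed symbols: $f_s$ for the supply frequency, $p$ for the number of pole pairs, $N_r$ for the number of rotor bars, $n$ for the rotor speed in rpm, $n_s$ for the synchronous speed, and $\delta$ for the half-width of the search window in hertz. The numerical values used in the paper are not reproduced here.

```latex
\begin{align*}
1 - s &= \frac{n}{n_s} = \frac{n\,p}{60 f_s}, \\
f_{\mathrm{RSH}} &= f_s\left[\frac{k N_r (1-s)}{p} \pm m\right]
                 = \frac{k N_r\, n}{60} \pm m f_s, \\
\Delta f &= \frac{k N_r}{60}\,\Delta n, \\
|\Delta f| \le \delta \;&\Longrightarrow\; |\Delta n| \le \frac{60\,\delta}{k N_r}.
\end{align*}
```

As a purely hypothetical example, a 26-bar rotor with $k = 1$ and a search half-width of 6 Hz would tolerate speed errors of up to about $60 \cdot 6 / 26 \approx 13.8$ rpm.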
2.3. Dataset Construction
Both healthy and faulty three-phase stator current signals, recorded at different load levels, were used to build the database for analysis. Each raw signal was divided into fixed-length windows (0.2 s in the presented analysis) so that uniform data frames were generated from the input data. Based on this segmentation, the obtained signals were organized into datasets. Three datasets were prepared from healthy signals (a reference set, a feature-selection set, and a validation set) and two from faulty ones (a feature-selection set and a validation set), with all datasets containing an equal number of segments. The datasets are required to be independent: they come from the same machine but from different experiments run at different times. Spectra are computed using fixed parameters, and stability is verified by confirming that spectral estimates and their confidence intervals do not vary systematically across adjacent windows. The resulting datasets are summarized in Table 2.
This division ensures a strict separation between the equal-sized datasets used for different purposes, thereby facilitating a reliable evaluation of diagnostic performance. In the present analysis, each subset contains 100 segments of 0.2 s. The dataset construction procedure (for each current phase, fault type, and load) is presented in Algorithm 1.
Algorithm 1: Dataset construction (for each current phase, fault type, and load)
Data: Current-signal datasets, healthy and faulty.
Result: Disjoint datasets of healthy and faulty segments.
1. Collect raw current signals under healthy and faulty conditions.
2. Divide each raw signal into fixed-length windows to ensure uniform input size.
3. Label the obtained segments and organize them into disjoint datasets.
4. Select M segments for each subset:
   - dataset for confidence-interval construction (spectral amplitudes and health indicator): reference healthy;
   - datasets for feature selection: healthy and faulty;
   - datasets used for the validation procedure: healthy and faulty.
5. Save the prepared datasets with corresponding labels and segment indices for further diagnostic analysis.
In the proposed methodology, the construction of the health indicator involves an internal statistical testing procedure, namely a feature selection approach, which is independent of the final validation stage of the entire diagnostic framework. This separation is required because the algorithm for selecting informative fault frequencies relies on an internally defined statistical test, which is explicitly described in the next section.
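The segmentation and disjoint split of Algorithm 1 can be sketched in a few lines of Python. The function names, the subset labels, the seed, and the toy subset size of 8 are assumptions made for illustration; the study itself reports 100 segments per dataset.

```python
import random


def segment_signal(signal, window_len):
    """Split a sequence into non-overlapping windows, dropping any remainder."""
    n_windows = len(signal) // window_len
    return [signal[i * window_len:(i + 1) * window_len]
            for i in range(n_windows)]


def disjoint_subsets(n_segments, subset_names, subset_size, seed=0):
    """Assign segment indices to named subsets so that no index is shared."""
    needed = subset_size * len(subset_names)
    if needed > n_segments:
        raise ValueError("not enough segments for disjoint subsets")
    rng = random.Random(seed)                 # fixed seed: reproducible split
    shuffled = rng.sample(range(n_segments), needed)
    return {
        name: sorted(shuffled[i * subset_size:(i + 1) * subset_size])
        for i, name in enumerate(subset_names)
    }


# Toy usage: a 5 s record at 50 kHz cut into 0.2 s windows (10,000 samples).
record = list(range(250_000))                 # placeholder samples
segments = segment_signal(record, 10_000)
subsets = disjoint_subsets(len(segments),
                           ("reference", "selection", "validation"),
                           subset_size=8, seed=42)
```

Drawing all indices in a single call to `sample` guarantees that the subsets cannot overlap, and the fixed seed makes the assignment repeatable across runs.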
2.4. Feature-Based Health Modeling
The feature-based health modeling stage employs three datasets: two healthy datasets (the reference set and the feature-selection set) and one faulty dataset (the feature-selection set). The reference healthy signals are used to establish the confidence intervals for the spectral amplitudes at each characteristic frequency, as well as the confidence interval for the health indicator. The healthy and faulty feature-selection signals are then used to estimate the empirical size (Type I error) and empirical power of the statistical test defined below, enabling the selection of informative fault frequencies. The two main stages are:
(a) the selection of informative fault-frequency features, described in detail in Section 2.4.1;
(b) the construction of the confidence interval for the health indicator used in decision-making, discussed in Section 2.4.2.
2.4.1. Feature Selection Based on Statistical Testing
For each fault frequency
from healthy signals
, where
, and
, a confidence interval
is estimated as follows:
where
is the empirical quantile of order
with
. The confidence intervals
are used to assess which of the
are informative. For this purpose, a statistical test was defined that involved the verification of two criteria. For a signal
z, the statistic
and the test
are defined as follows:
with the following testing hypotheses:
For healthy signals
and faulty signals
, where
the empirical size and power of the test [
37,
38] denoted
at
respectively are defined as follows:
The selected informative set of fault frequency indexes is defined as follows:
This means that the informative frequencies are those for which, under healthy conditions, the corresponding amplitudes fall within the confidence interval
with a Type I error probability not exceeding
, and under faulty conditions, they fall outside this interval with a confidence level of
.
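The feature-selection step can be illustrated with a short Python sketch. It assumes a two-sided empirical quantile interval at level `alpha` and hypothetical acceptance thresholds `max_size` and `min_power`. The function names, the array layout (signals in rows, candidate frequencies in columns), and the threshold values are illustrative assumptions, not the settings used in this study.

```python
import numpy as np


def quantile_interval(healthy_amplitudes, alpha=0.05):
    """Two-sided empirical interval [q_(alpha/2), q_(1-alpha/2)] per column."""
    lower = np.quantile(healthy_amplitudes, alpha / 2.0, axis=0)
    upper = np.quantile(healthy_amplitudes, 1.0 - alpha / 2.0, axis=0)
    return lower, upper


def rejection_rate(amplitudes, lower, upper):
    """Fraction of signals whose amplitude falls outside [lower, upper]."""
    outside = (amplitudes < lower) | (amplitudes > upper)
    return outside.mean(axis=0)


def select_informative(ref_healthy, sel_healthy, sel_faulty,
                       alpha=0.05, max_size=0.05, min_power=0.95):
    """Keep frequencies with small empirical size and large empirical power."""
    lower, upper = quantile_interval(ref_healthy, alpha)
    size = rejection_rate(sel_healthy, lower, upper)    # healthy rejected
    power = rejection_rate(sel_faulty, lower, upper)    # faulty rejected
    keep = np.flatnonzero((size <= max_size) & (power >= min_power))
    return keep, size, power


# Deterministic toy data with three candidate frequencies (columns).
ref_healthy = np.tile(np.linspace(0.9, 1.1, 101)[:, None], (1, 3))
sel_healthy = np.tile(np.linspace(0.95, 1.05, 50)[:, None], (1, 3))
sel_faulty = np.column_stack([
    np.linspace(1.5, 1.7, 50),     # clearly shifted: informative
    np.linspace(0.95, 1.05, 50),   # unchanged: not informative
    np.linspace(1.2, 1.3, 50),     # shifted: informative
])
keep, size, power = select_informative(ref_healthy, sel_healthy, sel_faulty)
```

Estimating size and power on data that were not used to build the interval is what keeps the selection honest: an interval tuned and checked on the same segments would understate the Type I error.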
2.4.2. Health Indicator and Confidence Interval Construction
For a given
, the spectrum-based health indicator, HI, for signal
z is defined as the sum of spectrum amplitudes at the informative
, i.e.:
For healthy signals
,
, a confidence interval of the indicator
is estimated as follows:
The
serves as a quantitative reference for subsequent fault detection and decision-making. It reflects the deviation of the current signal from the healthy reference state, enabling the assessment of the machine’s operational health and facilitating the identification of emerging faults with statistical confidence.
The algorithm for feature-based health modeling is summarized in Algorithm 2.
Algorithm 2: Construction of the health indicator confidence interval (for each current phase, fault type, and load)
Data: Current-signal datasets, healthy and faulty.
Result: The confidence interval of the health indicator for the healthy dataset.
Feature selection:
1. Import signals from the reference healthy dataset and from the healthy and faulty feature-selection datasets.
2. For each signal in these datasets, compute the spectrum and estimate the fault frequencies using Equation (10).
3. Determine the confidence intervals of the spectral amplitudes from the reference healthy dataset, using Equation (18).
4. Compute the empirical size and power of the test, using Equation (21).
5. Select the informative fault frequencies according to Equation (22).
Confidence interval construction:
6. Compute the health indicator at the selected informative frequencies, using Equation (23).
7. Establish the indicator confidence interval from the reference healthy dataset, using Equation (24).
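A minimal sketch of the health indicator and its reference interval is given below. It assumes the indicator is the plain sum of spectral amplitudes at the selected frequencies, as stated in Section 2.4.2, and that the interval is a two-sided empirical quantile range. The level `alpha`, the function names, and the array layout are assumptions.

```python
import numpy as np


def health_indicator(amplitudes, informative_idx):
    """Sum of spectral amplitudes at the informative frequencies.

    amplitudes: array of shape (n_signals, n_frequencies).
    """
    return np.asarray(amplitudes, dtype=float)[:, informative_idx].sum(axis=1)


def hi_interval(healthy_hi, alpha=0.05):
    """Two-sided empirical quantile interval of the healthy indicator."""
    return (float(np.quantile(healthy_hi, alpha / 2.0)),
            float(np.quantile(healthy_hi, 1.0 - alpha / 2.0)))


# Toy usage: 101 healthy signals, three candidate frequencies, of which
# columns 0 and 2 are treated as informative (hypothetical selection).
healthy_amp = np.tile(np.linspace(0.9, 1.1, 101)[:, None], (1, 3))
hi_values = health_indicator(healthy_amp, [0, 2])
ci_lower, ci_upper = hi_interval(hi_values)
```

Collapsing several informative amplitudes into one scalar means a single interval suffices for the final decision, at the cost of hiding which frequency drove the change.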
2.5. Validation Procedure
For a signal
z with the HI-based statistic
, the test
is defined as follows:
with the following testing hypotheses:
For the signals
(healthy) and
(faulty), where
, the indicator values
and
are computed using Equation (
23), and each signal is classified according to the HI-based decision rule defined in Equation (
25). The count of correct decisions (CCD) is then defined as
where
is the decision function defined in Equation (
25), returning 1 for faulty signals and 0 for healthy ones.
The diagnostic performance is evaluated using the fault detection rates (FDRs):
A higher FDR for the faulty dataset (closer to 1) indicates superior fault detection performance; a value of 1 means that 100% of faulty signals are correctly identified as faulty. Conversely, a lower FDR for the healthy dataset (closer to 0) reflects better acceptance of healthy signals; a value of 0 implies that none of the healthy signals were incorrectly classified as faulty, i.e., 100% of healthy signals were correctly identified as healthy. The desired diagnostic property is therefore an FDR close to 0 for healthy signals and close to 1 for faulty signals. The validation procedure is summarized in Algorithm 3.
Algorithm 3: Validation procedure (for each current phase, fault type, and load)
Data: Current-signal validation datasets, healthy and faulty; the confidence interval of the health indicator for the healthy dataset.
Result: Classification of the tested signal as healthy or faulty.
1. Import signals from the healthy and faulty validation datasets.
2. For each signal, compute the spectrum and estimate the informative fault frequencies.
3. For each signal z, compute the health indicator using Equation (23).
4. Apply the HI-based decision rule of Equation (25): classify z as healthy if its health indicator lies within the healthy confidence interval, and as faulty otherwise.
5. Calculate the fault detection rates for the healthy and faulty validation datasets using Equation (28) and report the overall diagnostic performance.
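The decision rule and the fault detection rates of the validation stage can be sketched as follows. The rule follows the description in this section (healthy when the indicator lies inside the reference interval, faulty otherwise). The function names and the numerical values in the usage lines are hypothetical.

```python
def classify(hi_value, ci_lower, ci_upper):
    """Return 1 (faulty) when HI lies outside the healthy interval, else 0."""
    return 0 if ci_lower <= hi_value <= ci_upper else 1


def detection_rate(hi_values, ci_lower, ci_upper):
    """Fraction of signals in a dataset that are flagged as faulty."""
    decisions = [classify(v, ci_lower, ci_upper) for v in hi_values]
    return sum(decisions) / len(decisions)


# Hypothetical indicator values and reference interval.
ci_lower, ci_upper = 1.81, 2.19
healthy_hi = [1.9, 2.0, 2.1, 2.3]     # one value falls outside the interval
faulty_hi = [2.5, 2.6, 2.0, 3.0]      # one value falls inside the interval
fdr_healthy = detection_rate(healthy_hi, ci_lower, ci_upper)   # false alarms
fdr_faulty = detection_rate(faulty_hi, ci_lower, ci_upper)     # detections
```

Because the same rate is computed on both datasets, the healthy value plays the role of an empirical false-alarm rate and the faulty value that of an empirical detection probability.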
Figure 2 depicts the schematic block flowchart of the proposed methodology, covering dataset construction (Stage 1), feature extraction and selection (Stage 2.1), health-indicator and confidence intervals calculation (Stage 2.2), and the final decision and validation procedure (Stage 3). The figure provides a compact overview that is referenced throughout the section.
4. Results
In this section, the analyzed signals are presented. The signals correspond to different fault stages, including the healthy condition and the most severe fault (fault 5). The datasets
and
are independent (the datasets come from the same machine but from different experiments run at different times). Spectra are computed using a Hann-windowed periodogram with fixed parameters, and stability is verified by confirming that spectral estimates and their confidence intervals do not vary systematically across adjacent windows. The measurements were conducted at seven load levels (0 to 6), with the load gradually increasing over time. The recorded signals (with the three phase currents overlaid) are shown in Figure 6, with a zoomed-in view presented in Figure 7. There is a visible increase in the signal amplitude for faults 2 to 5, corresponding to the increasing load over time for load levels greater than 2. The load profiles estimated from the signals are illustrated in Figure 8, with a zoomed-in view presented in Figure 9. The load level was estimated from the RMS value of the stator current. The estimated load profiles exhibit slight variations between fault categories, resulting from fault-induced changes in current amplitude and waveform distortion. For higher fault severities (faults 2–5), phase divergence becomes more pronounced due to increased stator current asymmetry.
4.1. Dataset Construction for Analyzed Current Data
For each load level, 5 s current signals were extracted for further analysis (highlighted by the green shaded areas in Figure 6, Figure 7, and Figure 8). The sampling frequency of the measurements was 50 kHz. Considering the requirements of the intended industrial application, the maximum signal length was limited to 0.2 s (10,000 samples). In the spectral analysis, the FFT length was set to 8192 points, which resulted in a frequency resolution (bin spacing) of approximately 6.1 Hz (50 kHz/8192). Each 5 s signal (for every load and current phase) was further divided into 0.2 s segments to create equal-sized datasets:
,
, and
for healthy signals, and
and
for faulty signals. Each dataset contained 100 segments that were used for statistical testing. The indices of the selected signals in each dataset were determined using a pseudo-random number generator that produces a deterministic sequence of numbers that appear random. For example, set
contains indices starting with
, whereas set
starts with
. This segmentation procedure ensures a balanced and statistically representative dataset for evaluating the proposed method.
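The segmentation and reproducible index selection can be sketched as follows. The sampling frequency, segment length, and FFT length are taken from the text; the placeholder signal, the seed values, the number of drawn indices, and the function names are assumptions for illustration, and the way segments from different records are pooled into datasets is not reproduced here.

```python
import numpy as np

FS = 50_000        # sampling frequency (Hz), from the text
SEG_LEN = 10_000   # 0.2 s segment length in samples, from the text
NFFT = 8192        # FFT length, from the text

def split_into_segments(signal, seg_len=SEG_LEN):
    """Cut a 1-D record into consecutive, non-overlapping segments."""
    signal = np.asarray(signal)
    n_seg = len(signal) // seg_len
    return signal[: n_seg * seg_len].reshape(n_seg, seg_len)

def draw_indices(n_available, n_select, seed):
    """Seeded draw without replacement: same seed, same indices."""
    rng = np.random.default_rng(seed)
    return rng.choice(n_available, size=n_select, replace=False)

# Placeholder 5 s record (white noise stands in for a measured current).
record = np.random.default_rng(0).standard_normal(5 * FS)
segments = split_into_segments(record)       # 25 segments of 0.2 s each

bin_spacing = FS / NFFT                      # about 6.1 Hz per FFT bin

# Hypothetical seed and subset size; repeating the call reproduces the draw.
idx_a = draw_indices(len(segments), 10, seed=1)
idx_b = draw_indices(len(segments), 10, seed=1)
```

Fixing the seed is what makes the "random" selection repeatable between runs, which is the property the text relies on.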
4.2. Partial Results—Feature Selection Based on Statistical Testing
In this section, the results of the proposed methodology are presented for the data corresponding to the maximum load applied during the experiment (load 6). As discussed in
Section 2.2, fault detection under very light loads can be challenging. Therefore, load 6 was selected as the most representative operating condition for demonstrating the analysis results. The following subsections present the obtained health indicators, confidence intervals, and diagnostic performance measures.
Example current signals for healthy and faulty conditions are presented in Figure 10. Each subplot contains six signals, although some of them overlap and are therefore not fully visible. A slight phase shift between the signals can be observed when comparing the healthy and faulty cases. The maximum amplitudes of the phase currents (IsA, IsB, and IsC) also differ slightly, but this difference is difficult to distinguish visually.
The signal spectra are presented in Figure 11. With the 0.2 s signal duration and an 8192-point FFT at a 50 kHz sampling frequency, the frequency resolution is approximately 6 Hz, which constrains the separation of closely spaced harmonics. Despite this limitation, the proposed statistical framework successfully identifies informative fault frequencies.
The fault frequencies, denoted as
, are identified as local maxima within the theoretical fault frequency range and a window of
samples (according to Equation (10)). This range is indicated by the gray shaded area.
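A minimal sketch of such a local-maximum search is given below. It is not a transcription of Equation (10): the truncation of each segment to its first 8192 samples, the amplitude scaling, the half-width of three bins, the synthetic test signal, and all function names are assumptions for illustration.

```python
import numpy as np

FS = 50_000   # sampling frequency (Hz), from the text
NFFT = 8192   # FFT length, from the text

def hann_amplitude_spectrum(x, fs=FS, nfft=NFFT):
    """One-sided, Hann-windowed amplitude spectrum of the first nfft samples."""
    x = np.asarray(x, dtype=float)[:nfft]      # assumption: truncate to nfft samples
    w = np.hanning(len(x))
    spec = 2.0 * np.abs(np.fft.rfft(x * w, n=nfft)) / np.sum(w)
    freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
    return freqs, spec

def local_peak(freqs, spec, f_theory, half_width_bins):
    """Largest spectral line within +/- half_width_bins of the theoretical frequency."""
    k0 = int(np.argmin(np.abs(freqs - f_theory)))
    lo = max(k0 - half_width_bins, 0)
    hi = min(k0 + half_width_bins + 1, len(spec))
    k = lo + int(np.argmax(spec[lo:hi]))
    return float(freqs[k]), float(spec[k])

# Synthetic 0.2 s segment: 50 Hz fundamental plus a weak component at 559.7 Hz.
t = np.arange(10_000) / FS
x = np.sin(2 * np.pi * 50.0 * t) + 0.05 * np.sin(2 * np.pi * 559.7 * t)

freqs, spec = hann_amplitude_spectrum(x)
# Hypothetical theoretical value (555 Hz) that is slightly off the true line.
f_peak, a_peak = local_peak(freqs, spec, f_theory=555.0, half_width_bins=3)
```

Because the bin spacing is about 6.1 Hz, the located peak can only be resolved to the nearest bin, which is why the search window is expressed in samples rather than in hertz.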
As shown in
Figure 11, certain fault frequencies, such as
(559.7 Hz) and
(659.7 Hz), exhibit noticeable variations in spectral amplitude as the damage level increases. To verify this observation, the amplitudes of the spectrum at
are analyzed. The corresponding
values are summarized in
Table 6.
Figures 12–25 present the results of testing the
amplitudes for each fault stage and healthy data.
Spectrum amplitudes at the fault frequencies
, obtained from the healthy dataset
(shown as blue dots), are used to construct the
confidence intervals
(blue shaded areas) and to perform the corresponding hypothesis testing.
Figures 12–18 present the testing results for the healthy data
(red dots). Each figure includes results for the three current phases—IsA, IsB, and IsC—displayed in separate subplots, with the titles indicating the empirical size
of the test. The desired outcome is
. As observed, four frequencies
do not meet this criterion, with the following results:
(IsB, IsC),
(IsA),
(IsA), and
(IsB). Consequently,
for IsB and IsC,
for IsA,
for IsA, and
for IsB are excluded from the set of informative fault frequencies, as they do not satisfy the expected result of Criterion 1.
Figures 19–25 present the results of testing the faulty data
(red dots). Each figure shows the results for individual current phases—IsA, IsB, and IsC—in separate subplots, with titles indicating the test power
, for which the desired outcome is
.
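The two tests described above can be sketched for a single fault frequency as follows. The sketch assumes an empirical percentile band built from healthy reference amplitudes; the Gaussian synthetic amplitudes, the value of alpha, and the function names are illustrative assumptions and do not reproduce the measured data.

```python
import numpy as np

def empirical_band(healthy_ref, alpha):
    """Two-sided (1 - alpha) band from the empirical quantiles of healthy data."""
    lo = np.quantile(healthy_ref, alpha / 2)
    hi = np.quantile(healthy_ref, 1 - alpha / 2)
    return float(lo), float(hi)

def rejection_rate(values, band):
    """Fraction of values falling outside the acceptance band."""
    lo, hi = band
    values = np.asarray(values, dtype=float)
    return float(np.mean((values < lo) | (values > hi)))

rng = np.random.default_rng(3)
# Synthetic spectral amplitudes at one fault frequency (assumption).
healthy_ref = rng.normal(1.0, 0.05, 100)    # used to build the band
healthy_test = rng.normal(1.0, 0.05, 100)   # independent healthy data
faulty_test = rng.normal(2.0, 0.05, 100)    # clearly shifted faulty data

band = empirical_band(healthy_ref, alpha=0.01)
empirical_size = rejection_rate(healthy_test, band)   # Criterion 1: should be small
test_power = rejection_rate(faulty_test, band)        # Criterion 2: should be near 1
```

A frequency would be retained as informative only if both quantities meet their targets for the given current phase, which mirrors the exclusion logic described in the text.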
For , the value of is achieved for all fault stages and current phases. In contrast, for , none of the analyzed cases reaches the expected value. Consequently, is excluded from the set of informative fault frequencies, as it does not satisfy the expected result of Criterion 2.
Among all analyzed frequencies, and most consistently meet the testing criteria, indicating a strong sensitivity of their spectral components to fault-related variations in the current signal.
The summarized testing results are presented in Figure 26 (test on healthy data) and Figure 27 (test on faulty data), each aggregated across loads and fault frequency indexes for all currents.
Table 7 summarizes the results of feature selection based on statistical testing for all load levels and current phases.
For some analyzed cases, the statistical test criteria could not be fully satisfied and no informative set of fault frequency indexes could be extracted, because even a minor deviation in the test power or the empirical size (exceeding, e.g., 0.005) leads to rejection. To ensure consistency and robustness, the informative set of fault frequency indexes identified at maximum load (load 6) for each current separately is therefore adopted across all load levels. This choice is justified by the superior statistical separability between healthy and faulty conditions observed at higher loads.
Therefore, without a significant loss of generality, the set of fault frequency indexes identified as informative for load 6 is adopted for all load levels, namely, for IsA—, for IsB—, and for IsC—.
4.3. Partial Results—Health Indicator and Confidence Interval Construction
The informative fault frequencies (IsA—, IsB—, , and IsC—) are used to calculate the values. The 99% confidence intervals , computed based on the healthy dataset for each current phase, are represented by the blue shaded areas and are used in the decision-making process.
Figure 28 presents the results of testing the healthy dataset
(red dots), with the corresponding fault detection rate
values indicated in the subplot titles. The expected outcome is
, meaning that all healthy data samples are correctly classified as healthy. Each figure shows the results for the individual current phases—IsA, IsB, and IsC—in separate subplots.
As observed, the values for all current phases remain close to zero, confirming the high reliability of the proposed approach in correctly identifying healthy operating conditions.
Figure 29 presents the results of testing the faulty dataset
(red dots), with the corresponding fault detection rate
values indicated in the subplot titles. The expected outcome is
, which means that all
values corresponding to faulty data are correctly classified as faulty. Each figure shows the results for the individual current phases—IsA, IsB, and IsC—in separate subplots.
As observed, the values for all current phases are equal to 1, confirming the full detection of faulty conditions by the proposed method.
4.4. Validation Procedure for Analyzed Current Data
Figure 30 presents the summarized results of the validation procedure for all load conditions, current phases, and fault categories, using the testing datasets
and
. As observed, the fault detection effectiveness for the faulty datasets reaches 100% across all analyzed cases. For the healthy datasets, the effectiveness remains nearly 100%, with
values close to zero, confirming the high reliability of the proposed diagnostic methodology.
For an independent validation, data from the same test rig, acquired during a separate experimental session at load 0 (the most challenging case for diagnosis), are used for testing (10 signals of 0.2 s each). Example signals of currents A, B, and C for healthy and faulty data are presented in
Figure 31. The corresponding spectra are presented in
Figure 32. These separate experiments ensure that the thermal conditions, background noise, and mechanical transients are uncorrelated with the training data. The proposed statistical method was applied to this new dataset without re-training the baseline parameters. The results of health indicator testing are presented in
Figures 33–35.
The summarized results obtained for this independent validation scenario are presented in
Figure 36 (tested healthy data) and
Figure 37 (tested faulty data). This additional validation indicates that the reported high accuracy is not a result of data leakage and that the method generalizes across different operating cycles.
4.5. Influence of the Significance Level on the HI Outcome and Detection Performance
In the proposed methodology, the significance level α governs the strictness of the decision rule through the width of the two-sided acceptance band defined by the confidence interval (CI). Decreasing α widens the CI (i.e., produces a more conservative acceptance band), which reduces the likelihood of spurious detections on healthy signals. Conversely, increasing α narrows the CI and tightens the acceptance band, enhancing sensitivity to deviations while potentially increasing the false-alarm rate in healthy data.
This behavior is consistent with the nominal size of a two-sided test: for a (1 − α) confidence band, approximately a fraction α of healthy observations is expected to fall outside the band, with about α/2 in each tail. As a result, a larger α naturally tends to increase the fault detection rate computed on healthy data (i.e., the false-alarm rate), denoted as .
Figure 38 empirically confirms this relationship by showing an almost linear trend:
increases with
over the considered range. To evaluate whether this effect is consistent across operating conditions,
Figure 39 summarizes
for all tested loads and currents. The results indicate that the increase in
with
is systematic across the full set of conditions, which further supports using a small
when false-alarm control on healthy signals is prioritized.
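The near-linear dependence of the healthy-data rejection rate on the significance level can be reproduced on synthetic data. The sketch below assumes an empirical percentile band and standard normal "healthy" values; the sample sizes, the grid of significance levels, and the function name are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)
healthy_ref = rng.normal(0.0, 1.0, 20_000)    # builds the acceptance band
healthy_test = rng.normal(0.0, 1.0, 20_000)   # independent healthy observations

def false_alarm_rate(ref, test, alpha):
    """Share of healthy test values outside the two-sided (1 - alpha) band."""
    lo, hi = np.quantile(ref, [alpha / 2, 1 - alpha / 2])
    return float(np.mean((test < lo) | (test > hi)))

alphas = [0.001, 0.01, 0.05, 0.1]             # hypothetical grid of significance levels
rates = [false_alarm_rate(healthy_ref, healthy_test, a) for a in alphas]

for a, r in zip(alphas, rates):
    print(f"alpha = {a:<5}  empirical false-alarm rate = {r:.4f}")
```

Since the bands are nested (a larger significance level always gives a narrower band on the same reference data), the empirical rate cannot decrease as the level grows, and for well-behaved data it stays close to the nominal level itself.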
At the same time, selecting too small an α may reduce statistical power (increase Type II errors), potentially delaying fault detection. Importantly, this trade-off does not manifest as a loss of fault detection effectiveness in our experiments.
Figure 40 aggregates
(computed on faulty data) across all loads and currents and shows no degradation of performance for the tested values of
.
Therefore, while α primarily affects the false-alarm rate on healthy data through CI tightening/loosening, the fault-related detection rate remains stable, indicating no loss in efficiency.
Based on these observations, we adopt α = 0.01 (consistent with the 99% confidence intervals used above) as a deliberate compromise that emphasizes robustness against false alarms while preserving fault detection performance across the full range of operating conditions.
The average processing time per 0.2 s window for the entire online pipeline (three phases and all fault/healthy states) was 13.43 ms, i.e., about 6.7% of the 200 ms window duration, which leaves a substantial timing margin. This supports the real-time feasibility of the online stage.
5. Conclusions
A feature-based methodology for fault detection in induction motor systems has been presented, utilizing three-phase stator current signals. In the proposed approach, frequency-domain analysis was combined with statistical testing to identify informative fault-related spectral components, construct confidence intervals, and develop a quantitative health indicator HI along with its confidence range HICI. Most existing MCSA/envelope-based approaches rely on very similar frequency-domain indicators (e.g., amplitudes at fault-related sidebands). The difference lies in how the final decision is made from these features. In standard MCSA/envelope practice, fault detection typically depends on selecting a fixed threshold, which is often chosen heuristically, tuned separately for different operating conditions, or requires expert interpretation. As a result, the decision boundary may be difficult to reproduce, and the achieved false-alarm rate is not explicitly controlled, especially when load or current conditions vary.
The proposed methodology replaces heuristic thresholding with an automated, data-driven statistical test. Empirical confidence bands are estimated from healthy data offline, and online detection reduces to verifying whether the indicator falls outside the acceptance band. This yields explicit and interpretable control of the nominal false-alarm rate via α and provides a consistent, automatically constructed decision boundary without manual threshold selection.
It was demonstrated that the informative fault frequencies— for IsA, and for IsB, and for IsC—exhibit the highest sensitivity to fault-induced variations in the current spectrum. Through validation under multiple load conditions, it was confirmed that using data from load 6 as a reference provides robust generalization and high diagnostic accuracy across all operating conditions. For the faulty datasets, the fault detection rate was found to reach in all analyzed cases, whereas for the healthy datasets, values remained close to zero, indicating the absence of false alarms. These results demonstrate that the proposed health indicator and statistical testing framework offer high reliability, well-defined decision boundaries, and robustness to load variations.
Although the experimental validation was conducted on an Sh90L-4 motor, the proposed method is adaptable to induction motors with different structural parameters. Since the fault-related frequencies (such as RSH) are determined analytically from the number of rotor bars and the number of pole pairs, the algorithm can be reconfigured for any motor by updating these parameters. The statistical core of the method operates independently of the specific frequency location, making it applicable to a wide range of AC machinery, provided that the characteristic frequencies do not overlap significantly with the fundamental supply frequency or its low-order harmonics.
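To illustrate how such a reconfiguration might look, the sketch below evaluates the commonly used textbook expression for rotor slot harmonics, f = f_s [k N_r (1 − s)/p ± ν]. This expression is an assumption here and may differ in detail from the formula used earlier in the paper. The rotor bar count, pole pairs, and slip in the example are hypothetical values, picked only so that the output lands near the 559.7 Hz and 659.7 Hz components mentioned in Section 4.2; they are not documented parameters of the tested motor.

```python
def rsh_frequencies(f_supply, rotor_bars, pole_pairs, slip, k=1, orders=(1,)):
    """Rotor slot harmonic frequencies (Hz) from the textbook expression
    f = f_supply * (k * rotor_bars * (1 - slip) / pole_pairs +/- nu)."""
    base = k * rotor_bars * (1.0 - slip) / pole_pairs
    freqs = []
    for nu in orders:                      # nu: order of the stator time harmonic
        freqs.append(f_supply * (base - nu))
        freqs.append(f_supply * (base + nu))
    return sorted(freqs)

# Hypothetical motor data (assumptions, not taken from the paper).
example = rsh_frequencies(f_supply=50.0, rotor_bars=26, pole_pairs=2, slip=0.062)
print([round(f, 1) for f in example])

# Reconfiguring for a different (equally hypothetical) machine only changes the inputs.
other = rsh_frequencies(f_supply=60.0, rotor_bars=28, pole_pairs=1, slip=0.03)
```

The statistical stage downstream would remain unchanged; only the list of candidate frequencies passed to it depends on the machine.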
Obtaining a guaranteed “perfectly healthy” baseline is a common challenge in industrial condition monitoring. The standard procedure is to acquire the healthy-state dataset immediately after motor installation or maintenance, so that the baseline reflects the optimal machine state. In the less favorable scenario, when the system is installed on a motor that is already in operation (with unknown health status), the current state is adopted as the baseline. The proposed method then functions as an anomaly detector: it detects statistically significant deviations from the reference state rather than measuring absolute health against a theoretical ideal. A key advantage of this statistical approach is its adaptability. If the baseline motor already exhibits slight wear or noise, the variance of the spectral components in the healthy dataset will naturally be higher, and the calculated confidence interval widens accordingly. This mechanism desensitizes the health indicator to the pre-existing condition, so that the system responds to significant future degradation (trend changes) rather than triggering false positives based on the initial state.
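The widening of the acceptance band for a noisier baseline can be illustrated with a small numerical sketch. It assumes an empirical percentile band at a hypothetical significance level of 0.01 and Gaussian synthetic amplitudes; the standard deviations and the function name are illustrative assumptions.

```python
import numpy as np

def band_width(baseline, alpha=0.01):
    """Width of the two-sided empirical (1 - alpha) band of a baseline sample."""
    lo, hi = np.quantile(baseline, [alpha / 2, 1 - alpha / 2])
    return float(hi - lo)

rng = np.random.default_rng(7)
# Same mean amplitude, different spread: a "clean" and a slightly "worn" baseline.
clean_baseline = rng.normal(1.0, 0.02, 1000)
worn_baseline = rng.normal(1.0, 0.06, 1000)

width_clean = band_width(clean_baseline)
width_worn = band_width(worn_baseline)
print(f"clean: {width_clean:.3f}, worn: {width_worn:.3f}")
```

The wider band for the noisier baseline means that a later observation must deviate further before it is flagged, which is the desensitizing effect described above.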
The proposed methodology is particularly tailored for two specific industrial scenarios where traditional MCSA is often inapplicable due to signal duration requirements. The first scenario involves automated manufacturing and robotics (e.g., pick-and-place operations), where motors operate under short duty cycles with rapid speed changes. In such regimes, acquiring a continuous 1 s steady-state signal is unfeasible, whereas the proposed method effectively utilizes brief 0.2 s stable windows for reliable diagnosis. The second scenario concerns Edge AI and IoT monitoring systems implemented on low-cost microcontrollers. The capability to achieve high diagnostic accuracy using short data buffers minimizes memory usage and computational latency, enabling cost-effective, real-time condition monitoring on embedded platforms.
In conclusion, the developed methodology can be regarded as an interpretable and statistically grounded framework for fault detection in electric machines. Future work will focus on extending the method to enable fault classification and validating the approach in real industrial environments with time-varying load profiles.