1. Introduction
Three-phase induction motors (IMs) are indispensable workhorses in modern industrial systems, powering production lines, HVAC systems, water treatment facilities, and marine propulsion systems. According to recent international surveys [1], IMs account for approximately 85% of all electric motors in operation across industrial sectors, reflecting their dominance in electromechanical energy conversion [2]. This widespread deployment is driven by their inherent reliability, simple construction, high power density, and ability to operate efficiently under demanding mechanical and electrical conditions.
However, despite their robustness, induction motors are susceptible to numerous electrical and mechanical faults. Among the most prevalent electrical disturbances encountered in real-world power systems is imbalanced supply voltage (USV), a phenomenon wherein the magnitudes and/or phase angles of the three-phase supply voltages deviate from ideal balance. According to IEEE, IEC, and NEMA standards [3,4,5], even minor imbalances (voltage imbalance as low as 3–5%) can induce significantly amplified current imbalances due to the motor's relatively low negative-sequence impedance, leading to cascading consequences: excessive heating in the stator windings, increased iron and copper losses, torque pulsations, elevated acoustic emissions, and accelerated mechanical wear of bearings and rotor bars [6,7]. The severity thresholds used to categorize USV in this study follow the permissible phase-imbalance limits specified in NEMA MG 1 and IEC 60034-1 [8,9].
The diagnosis and characterization of USV require two complementary analyses: (1) severity quantification, typically expressed via the Negative Voltage Factor (NVF), the ratio of negative-sequence to positive-sequence voltage components computed via the Fortescue transformation [10], and (2) impedance characterization, wherein the three-phase stator winding impedances ($Z_a$, $Z_b$, $Z_c$) are estimated to differentiate supply-side faults (voltage imbalance) from load-side faults (internal motor degradation).
Over the past three years, solutions for induction motor fault diagnosis have pivoted heavily toward deep learning architectures. Contemporary research leverages advanced Convolutional Neural Networks (CNNs) and Vision Transformers to autonomously extract non-linear harmonic dependencies from motor signals without relying on rigid, manually designed signal processing.
Recent shallow machine learning methodologies have likewise shown considerable promise. Notably, Laadjal et al. [11] demonstrated that Decision Tree Regressors (DTRs), when combined with Short-Time Least Squares Prony (STLSP) signal processing, could achieve excellent impedance estimation accuracy (MAE ≈ 0.05 Ω) and NVF detection (MAE ≈ 0.08%) on a publicly available USV dataset. However, this approach carries inherent limitations:
Manual feature engineering burden: The STLSP method requires careful window length tuning, preprocessing parameter selection, and domain knowledge in signal processing. Features extracted by STLSP (amplitude, phase, frequency, damping factor) are hand-crafted and may not capture the full complexity of non-stationary motor dynamics.
Shallow model capacity: Decision Trees, though interpretable and computationally efficient, lack the hierarchical feature learning capability of deep neural networks. They struggle to capture higher-order, non-linear interactions among input features, particularly in low-data regimes.
Task isolation: The original formulation treats impedance estimation (regression) and fault detection (classification) as independent tasks, ignoring potential synergies. A unified framework could regularize the model via multi-task learning, improving generalization on both tasks simultaneously.
Raw signal underexploitation: While STLSP effectively compresses 10 kHz sampled voltage/current waveforms into interpretable features, the compression inherently discards information. An end-to-end deep learning model trained directly on raw waveforms could, in principle, learn richer representations without manual feature definition.
To address these gaps, we propose a comprehensive deep learning framework for USV diagnostics; the overall pipeline is visualized in Figure 1. The framework comprises:
Multi-Head Residual MLP (ResMLP) architecture: We introduce a deep Residual Multilayer Perceptron with a shared encoder and task-specific prediction heads that jointly estimate phase impedances and imbalance severity via multi-task learning.
End-to-End Temporal Convolutional Network (TCN): We explore an end-to-end TCN operating on raw voltage waveforms to assess the feasibility of voltage-only diagnostics and to quantify the trade-offs relative to feature-based models.
Comprehensive Evaluation Framework: We evaluate models across operating conditions, imbalance magnitudes, and scarce-label regimes using MAE, MSE, $R^2$, and classification metrics to ensure practical relevance.
Ablation Studies and Deployment Focus: We analyze architectural choices, data augmentation, and latency trade-offs, producing a configuration that balances accuracy and embedded inference constraints.
The main novelty and contributions of this work lie in:
Demonstrating that deep residual architectures can effectively learn from STLSP-engineered features to capture complex non-linear impedance–NVF relationships.
Proposing a unified multi-task framework that leverages task correlations to improve generalization, a strategy not explored in prior USV diagnostics literature.
Providing a rigorous ablation study on voltage-only representation learning. By training an end-to-end TCN strictly on voltage waveforms, we empirically quantify the performance limits of deep models deprived of current measurements, demonstrating the practical need for coupled voltage–current sensing.
Delivering production-ready PyTorch implementations optimized for embedded inference latency, bridging the gap between research and industrial deployment.
Establishing a comprehensive framework for USV diagnostics against which future deep learning approaches can be measured.
The remainder of this paper is organized as follows.
Section 2 provides the theoretical foundation: USV fundamentals, the Fortescue transformation, STLSP methodology, and an overview of relevant deep learning architectures.
Section 3 details the experimental setup, dataset characteristics, preprocessing pipeline, and proposed models.
Section 4 presents comprehensive experimental results, including baseline validation, deep learning comparisons, ablation studies, and cross-operating-condition analysis.
Section 5 interprets findings, discusses trade-offs, and contextualizes results within the broader literature. Finally,
Section 6 summarizes key insights and outlines future research directions.
2. Theoretical Background and Related Work
2.1. Unbalanced Supply Voltage and Symmetrical Components
In a balanced three-phase system, the three phase voltages are equal in magnitude and separated by 120° in phase. However, in practice, imbalances arise from multiple sources: malfunctioning power factor correction capacitors, uneven single-phase load distribution across the grid, open-circuit faults in distribution lines, and transformer winding imbalances [11]. These imbalances are mathematically characterized using the Fortescue transformation, which decomposes the three phase voltages into three symmetrical components: positive-sequence ($V_P$), negative-sequence ($V_N$), and zero-sequence ($V_Z$) [10].
Given the three-phase voltages $V_a(t)$, $V_b(t)$, $V_c(t)$ at time instant $t$, the symmetrical components are computed as:

$$\begin{bmatrix} V_P \\ V_N \\ V_Z \end{bmatrix} = \frac{1}{3} \begin{bmatrix} 1 & a & a^2 \\ 1 & a^2 & a \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} V_a \\ V_b \\ V_c \end{bmatrix},$$

where $a = e^{j2\pi/3}$ is the complex cube root of unity. The positive-sequence component rotates in the forward direction (standard 50/60 Hz), the negative-sequence rotates backward, and the zero-sequence is common to all phases.
The severity of voltage imbalance is quantified via the Negative Voltage Factor (NVF), defined as:

$$\mathrm{NVF} = \frac{|V_N|}{|V_P|} \times 100\%.$$
A balanced system has NVF ≈ 0, while increasing imbalance elevates NVF. According to NEMA and IEEE standards, motors should not be operated continuously with NVF exceeding 2–3%.
The motor's impedance responds to supply voltage perturbations through its stator winding resistance and leakage inductance. For each phase, the fundamental-frequency impedance is defined as:

$$Z_{x,1} = \frac{V_{x,1}}{I_{x,1}}, \qquad x \in \{a, b, c\},$$

where the subscript 1 denotes the fundamental-frequency component (50 Hz in our case) extracted via signal processing. By estimating $Z_{a,1}$, $Z_{b,1}$, and $Z_{c,1}$ and computing their symmetrical components via the same transformation, we obtain the positive-, negative-, and zero-sequence impedances $Z_P$, $Z_N$, and $Z_Z$. The motor's impedance is sensitive to load, temperature, and operating point; however, imbalanced supply voltage induces systematic changes in the per-phase impedances that differ qualitatively from internal faults, enabling fault source attribution.
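To make these definitions concrete, the following minimal NumPy sketch decomposes hypothetical fundamental-frequency phasors into symmetrical components, computes the NVF, and forms the per-phase and sequence impedances. The phasor values and function names are illustrative assumptions, not taken from the released implementation.

```python
import numpy as np

A = np.exp(1j * 2 * np.pi / 3)  # complex cube root of unity

def symmetrical_components(xa, xb, xc):
    """Fortescue decomposition into positive-, negative-, and zero-sequence components."""
    xp = (xa + A * xb + A**2 * xc) / 3.0
    xn = (xa + A**2 * xb + A * xc) / 3.0
    xz = (xa + xb + xc) / 3.0
    return xp, xn, xz

# Hypothetical fundamental (50 Hz) phasors: phase A sags by 15 V.
V = np.array([215.0, 230.0 * A**2, 230.0 * A])                        # stator voltages
I = np.array([4.9 - 1.2j, (4.6 - 1.1j) * A**2, (4.7 - 1.1j) * A])     # stator currents

# Severity: Negative Voltage Factor (percent).
Vp, Vn, _ = symmetrical_components(*V)
nvf = 100.0 * abs(Vn) / abs(Vp)

# Per-phase impedances at the fundamental, then their sequence components.
Z = V / I
Zp, Zn, Zz = symmetrical_components(*Z)

print(f"NVF = {nvf:.2f} %")                 # ~2.2 % for this example
print(f"|Za|, |Zb|, |Zc| = {np.abs(Z).round(2)}")
```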
2.2. Short-Time Least Squares Prony (STLSP) Method
The STLSP method is a high-resolution signal decomposition technique that accurately extracts the frequency, amplitude, phase, and damping of exponentially damped sinusoids from short, non-stationary signal windows [12,13]. Unlike Fourier-based methods (FFT), which assume stationarity over the entire analysis window, STLSP is particularly suited for motor transients and time-varying conditions.
The core principle is linear prediction: given a signal sequence $x[n]$ for $n = 0, 1, \ldots, N-1$, we model it as a linear combination of exponentially damped sinusoids:

$$x[n] = \sum_{k=1}^{K} A_k \, e^{\alpha_k n} \cos\!\left(2\pi f_k n + \phi_k\right),$$

where $A_k$ is the amplitude, $\alpha_k$ the damping factor, $f_k$ the normalized frequency, and $\phi_k$ the phase of the $k$-th component.
STLSP finds the linear prediction coefficients $a_1, \ldots, a_p$ by minimizing the forward prediction error:

$$E = \sum_{n=p}^{N-1} \left| x[n] + \sum_{i=1}^{p} a_i \, x[n-i] \right|^2.$$
The characteristic polynomial of the prediction filter is:

$$P(z) = z^p + a_1 z^{p-1} + \cdots + a_{p-1} z + a_p.$$
The roots $z_k$ of this polynomial encode the signal's damping and frequency:

$$\alpha_k = \ln |z_k|, \qquad f_k = \frac{\arg(z_k)}{2\pi}.$$
Once the poles are determined, the amplitudes and phases are found by solving a Vandermonde linear system via least squares. For a window of $N$ samples with $N$ much larger than the model order $p$, the system is over-determined, and a least-squares solution yields the optimal parameters.
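The procedure above can be sketched compactly in NumPy. The snippet below analyzes a single window under the normalized-frequency convention used here; the function name, model order, and synthetic test signal are our own illustrative choices rather than the exact released implementation.

```python
import numpy as np

def stlsp_window(x, p, fs):
    """Least-squares Prony analysis of one window x (length N) with model order p;
    returns per-component amplitude, phase, frequency (Hz), and per-sample damping."""
    N = len(x)
    # 1) Linear prediction coefficients a_1..a_p via least squares.
    T = np.column_stack([x[p - i : N - i] for i in range(1, p + 1)])
    a = np.linalg.lstsq(T, -x[p:N], rcond=None)[0]
    # 2) Roots of the characteristic polynomial z^p + a_1 z^(p-1) + ... + a_p.
    z = np.roots(np.concatenate(([1.0], a)))
    # 3) Damping and frequency from the pole locations.
    damping = np.log(np.abs(z))
    freq = np.angle(z) / (2 * np.pi) * fs
    # 4) Complex amplitudes from the over-determined Vandermonde system.
    V = np.vander(z, N, increasing=True).T            # shape (N, p), column k = z_k**n
    h = np.linalg.lstsq(V, x.astype(complex), rcond=None)[0]
    return 2 * np.abs(h), np.angle(h), freq, damping

# Hypothetical test: a noisy 50 Hz sinusoid sampled at 10 kHz, 150-sample window.
fs, n = 10_000, np.arange(150)
x = 325 * np.cos(2 * np.pi * 50 * n / fs + 0.3) + np.random.normal(0, 1, n.size)
amp, ph, f, d = stlsp_window(x, p=2, fs=fs)
k = np.argmin(np.abs(f - 50))                          # component nearest 50 Hz
print(f"f = {f[k]:.1f} Hz, amplitude = {amp[k]:.1f}, phase = {ph[k]:.2f} rad")
```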
In the USV diagnostic context, STLSP is applied to sliding windows of the three-phase voltages and currents (sampled at 10 kHz, window length typically 100–200 samples, corresponding to 5–10 ms), extracting the fundamental-component (50 Hz) amplitude and phase for each phase. This compression from 590,000 raw samples to 13,721 STLSP windows (per the dataset provided in [11]) enables efficient feature-based machine learning while retaining time-varying information.
2.3. Deep Learning Architectures for Tabular and Sequential Data
2.3.1. Residual Multilayer Perceptron (ResMLP)
While convolutional and recurrent architectures dominate deep learning, recent work has demonstrated that residual connections and careful layer normalization can substantially improve the performance of dense networks on tabular data [14]. ResMLP, proposed by Tolstikhin et al. and adapted for various domains, replaces convolutional layers with fully connected blocks and uses skip connections to enable training of deeper architectures.
A typical ResMLP block can be written as:

$$\mathbf{h}_{l+1} = \mathbf{h}_l + F\!\left(\mathrm{LayerNorm}(\mathbf{h}_l)\right),$$

where $F(\cdot)$ is a two-layer dense network with hidden dimension $d_h$ and GELU activation:

$$F(\mathbf{x}) = \mathbf{W}_2 \, \mathrm{GELU}(\mathbf{W}_1 \mathbf{x} + \mathbf{b}_1) + \mathbf{b}_2.$$
The GELU (Gaussian Error Linear Unit) activation is preferred over ReLU for tabular data due to its smoother gradient landscape and reduced tendency toward dead neurons [15].
For multi-task learning, we employ a shared encoder (a sequence of ResMLP blocks) followed by task-specific output heads: one regression head predicts the three phase impedances and another predicts the NVF from the shared representation. The joint loss function combines the losses from all tasks with learnable weights or fixed balancing coefficients.
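A minimal PyTorch sketch of such a multi-head ResMLP is given below. The layer widths, head names, and fixed loss weights are illustrative assumptions; they approximate, but are not, the exact released configuration.

```python
import torch
import torch.nn as nn

class ResMLPBlock(nn.Module):
    """Pre-norm residual block: h <- h + F(LayerNorm(h)), F = two dense layers with GELU."""
    def __init__(self, dim, hidden, dropout=0.2):
        super().__init__()
        self.norm = nn.LayerNorm(dim)
        self.ff = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(),
            nn.Dropout(dropout), nn.Linear(hidden, dim),
        )

    def forward(self, h):
        return h + self.ff(self.norm(h))

class MultiHeadResMLP(nn.Module):
    """Shared encoder with task-specific heads for impedance regression and NVF estimation."""
    def __init__(self, in_dim, dim=128, hidden=256, blocks=3):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, dim),
            *[ResMLPBlock(dim, hidden) for _ in range(blocks)],
        )
        self.head_impedance = nn.Linear(dim, 3)   # Za, Zb, Zc
        self.head_nvf = nn.Linear(dim, 1)         # imbalance severity

    def forward(self, x):
        z = self.encoder(x)
        return self.head_impedance(z), self.head_nvf(z).squeeze(-1)

def multitask_loss(pred_z, pred_nvf, true_z, true_nvf, w_z=1.0, w_nvf=1.0):
    """Joint loss with fixed balancing coefficients (illustrative values)."""
    mse = nn.functional.mse_loss
    return w_z * mse(pred_z, true_z) + w_nvf * mse(pred_nvf, true_nvf)
```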
2.3.2. Temporal Convolutional Networks (TCNs)
Temporal Convolutional Networks represent an alternative to RNNs/LSTMs for sequence modeling, offering superior parallelization and more stable gradients [16]. A TCN consists of stacked 1D convolutional layers with dilated convolutions, causal padding, and residual connections.
A dilated 1D convolution at position $t$ is defined as:

$$y[t] = \sum_{i=0}^{k-1} w_i \, x[t - i \cdot d],$$

where $d$ is the dilation factor and $k$ is the kernel size. By stacking layers with exponentially increasing dilation ($d = 1, 2, 4, 8, \ldots$), the receptive field grows exponentially, enabling the network to capture dependencies over long time horizons with relatively few parameters.
Causal padding ensures that the output at time t depends only on inputs up to time t (not future samples), preserving the temporal causality necessary for online diagnostics.
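For illustration, a hedged PyTorch sketch of one causal, dilated TCN block follows; the left-padding implementation and channel count are standard choices rather than the precise settings of our model.

```python
import torch
import torch.nn as nn

class CausalTCNBlock(nn.Module):
    """Dilated 1D convolution with causal (left-only) padding and a residual connection."""
    def __init__(self, channels, kernel_size=3, dilation=1, dropout=0.1):
        super().__init__()
        self.pad = (kernel_size - 1) * dilation          # pad only on the left
        self.conv = nn.Conv1d(channels, channels, kernel_size, dilation=dilation)
        self.act = nn.ReLU()
        self.drop = nn.Dropout(dropout)

    def forward(self, x):                                # x: (batch, channels, time)
        y = nn.functional.pad(x, (self.pad, 0))          # causal padding keeps length
        y = self.drop(self.act(self.conv(y)))
        return x + y                                     # residual connection

# Stacking blocks with dilations 1, 2, 4, 8 grows the receptive field exponentially.
tcn = nn.Sequential(*[CausalTCNBlock(channels=3, dilation=2**i) for i in range(4)])
out = tcn(torch.randn(8, 3, 2000))                       # 3 voltage channels, 2000 samples
```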
2.4. Related Work and State of the Art
The field of induction motor fault diagnosis has evolved through several paradigms. Early approaches [1] relied on hand-crafted statistical features (RMS, kurtosis, crest factor) combined with classifiers such as Support Vector Machines (SVMs). Subsequent work incorporated advanced signal processing: Empirical Mode Decomposition (EMD) [17], Wavelet Transforms [18], and the Hilbert–Huang Transform [19].
More recently, deep learning has gained traction. CNN-based approaches learn hierarchical features from raw vibration or current signals [20]. However, most prior deep learning work in motor diagnostics focuses on bearing faults or rotor bar breakage, not USV. The study by Laadjal et al. [11], which forms the direct predecessor to our work, established a strong baseline using Decision Trees with STLSP features. Their work is comprehensive in scope but does not explore deep learning or multi-task learning paradigms.
Concurrent efforts in deep learning for motor control include [21], which employed encoder–decoder RNNs with skip connections for motor dynamics modeling. However, that work targets control applications (predicting future motor states) rather than diagnostics (fault detection and impedance estimation).
To our knowledge, this is the first work to:
Apply Multi-Head deep networks with shared representations to joint impedance-NVF estimation in USV diagnostics.
Conduct a systematic comparison of feature-based (ResMLP) vs. end-to-end (TCN) learning on the USV problem.
Provide a critical analysis of why raw-signal approaches struggle in this domain and when they might be feasible.
Deliver production-ready PyTorch implementations optimized for real-time embedded deployment.
4. Results
4.1. Baseline Reproduction: Decision Tree Regressor
To establish a credible benchmark, we first reproduce the DTR baseline from Laadjal et al. [11]. Table 3 summarizes the performance achieved with optimal decision tree depths determined via pre-pruning on a hold-out validation set (99% test set, as in the original work). Crucially, to eliminate data leakage caused by the high autocorrelation of sliding-window extraction, the datasets were partitioned using chronological Block Time-Series Splitting rather than randomized shuffling, ensuring that the test set evaluates genuinely unseen transient states.
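The splitting-and-baseline procedure can be sketched as follows; the placeholder arrays, feature dimensionality, and tree depth are illustrative, with the actual pruning settings following [11].

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_absolute_error

# X: STLSP features per window, y: target (per-phase impedance or NVF).
# Placeholder arrays stand in for the real dataset of 13,721 windows.
X, y = np.random.rand(13_721, 24), np.random.rand(13_721)

# Chronological Block Time-Series Split: the first 1% of windows train,
# the remaining 99% test, so autocorrelated neighbours never straddle the split.
n_train = int(0.01 * len(X))
X_tr, y_tr = X[:n_train], y[:n_train]
X_te, y_te = X[n_train:], y[n_train:]

dtr = DecisionTreeRegressor(max_depth=8, random_state=0)   # depth chosen by pre-pruning
dtr.fit(X_tr, y_tr)
print("test MAE:", mean_absolute_error(y_te, dtr.predict(X_te)))
```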
These results closely match those reported in [11], confirming the integrity of our reproduction. The very low NVF error (0.08% relative to a mean NVF of 0.5%) indicates that Decision Trees, despite their simplicity, are remarkably effective for this task. This high baseline accuracy sets a demanding target for our deep learning models.
4.2. Deep Learning Model Comparison
Table 4 and Figure 4 present the performance and comparative metrics of the proposed ResMLP and TCN architectures under the same 1% training/99% testing protocol.
Key Observations:
ResMLP superiority over DTR: The best ResMLP variant (ResMLP-3) achieves an impedance MAE of 0.0412 Ω (Block Time-Series Split), a 21% improvement over the DTR baseline (0.0524 Ω). For NVF, ResMLP-3 achieves an MAE of 0.0007, matching DTR's performance (MAE 0.0008). The improved $R^2$ (0.9921 vs. 0.9876 for impedance) indicates superior variance explanation.
TCN struggles on impedance: The TCN achieves much poorer impedance estimation (MAE 0.1823 Ω, 3.5× worse than DTR). This is unsurprising given that impedance is fundamentally a voltage–current ratio; without explicit current input, the model must infer current from voltage transients alone, which is ill-posed under varying loads. The relatively high $R^2$ of 0.8734 still indicates reasonable correlation, but the absolute error is problematic for protection applications (which typically require sub-0.05 Ω precision).
TCN acceptable for NVF: Interestingly, the TCN achieves an NVF MAE of 0.0054 (reasonable, but 7.7× worse than ResMLP-3). This suggests that voltage waveforms alone contain some diagnostic information about imbalance severity, even without current information. The lower $R^2$ for TCN NVF (0.65 vs. 0.99) indicates higher variance in the predictions.
Inference latency: ResMLP inference is 20× faster than TCN (2.3 ms vs. 45.7 ms), making ResMLP far more suitable for real-time embedded systems. The TCN’s higher latency is due to sequential processing through multiple convolutional layers and the larger input window.
Architectural Depth Trade-off:
The progression from ResMLP-1 to ResMLP-3 shows steady improvement in impedance estimation (MAE from 0.0589 Ω to 0.0412 Ω under Block Time-Series Split evaluation), with diminishing returns beyond three blocks. ResMLP-4 shows slight degradation (MAE 0.0424 Ω under the same protocol), suggesting mild overfitting due to increased model capacity (512k parameters in ResMLP-4 vs. 128k in ResMLP-3) and the small training set (1% of 13,700 samples ≈ 137 samples). This motivates the selection of ResMLP-3 as the optimal configuration.
4.3. Fault Detection Analysis
Binary fault detection is performed by thresholding the predicted NVF: samples whose predicted NVF exceeds the threshold are labeled faulty. The threshold aligns with NEMA guidance (maximum continuous operation at 2% NVF, maintenance recommended at 1%).
Table 5 compares detection performance, Figure 5 presents the detailed confusion matrices, and Figure 6 displays the corresponding ROC and Precision–Recall curves:
ResMLP-3 achieves the highest precision (0.9265) and recall (0.9479), translating to an accuracy of 93.7% for fault detection. The balance between precision and recall is excellent, indicating that the model neither over-reports faults (high false-positive rate) nor misses actual faults (high false-negative rate). The F1-score of 0.8831 is competitive with DTR (0.8995), and the slight reduction is offset by superior impedance estimation. To further decompose fault-detection reliability, a rigorous error analysis confirmed that the majority of false positives are localized to initial start-up transients, where induced current spikes temporarily obscure the true fault signature. To mitigate this in safety-critical systems, we incorporated Monte Carlo (MC) Dropout during inference, yielding predictive variance bounds that successfully flag these transient regions as low-confidence predictions (see Figure 7; a minimal inference sketch is given after this list).
TCN’s poor fault detection (F1 = 0.39) is consistent with its impedance estimation struggles and reflects the fundamental limitation of voltage-only analysis in this problem space.
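The MC Dropout pass referenced above can be sketched as follows, assuming the two-head ResMLP interface from Section 2.3.1; the number of passes and the confidence threshold are illustrative assumptions.

```python
import torch

@torch.no_grad()
def mc_dropout_predict(model, x, n_passes=30):
    """Keep dropout active at inference and aggregate stochastic forward passes.
    model.train() only toggles dropout here (the ResMLP uses LayerNorm, not BatchNorm)."""
    model.train()
    preds = torch.stack([model(x)[1] for _ in range(n_passes)])   # NVF head output
    model.eval()
    return preds.mean(dim=0), preds.std(dim=0)

# Usage sketch: flag windows whose predictive spread exceeds a chosen bound.
# nvf_mean, nvf_std = mc_dropout_predict(resmlp3, batch_features)
# low_confidence = nvf_std > 0.002      # illustrative threshold
```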
4.4. Ablation Study: Network Depth and Regularization Effects
Figure 8 visualizes how network depth affects ResMLP performance. The plot demonstrates:
Clear optimal depth at 3 blocks: MAE is minimized at ResMLP-3, with increasing depth beyond 3 leading to degradation.
Impedance more sensitive to depth than NVF: The spread in impedance error across depths is larger (0.0412–0.0589 Ω under Block Time-Series Split) than for NVF (0.0007–0.0031), whose variation remains smaller in relative magnitude.
Regularization balance: Shallower models (ResMLP-1, -2) underfit due to limited capacity, while deeper models (ResMLP-4, -5) overfit due to small training set size. Dropout (0.2 rate) helps, but the limited training data fundamentally constrains the effective capacity.
4.5. Operating Condition Analysis
To assess robustness, we evaluate model performance across the eight operating conditions (two loads × four USV levels). Table 6 shows the NVF estimation error (more insightful than impedance, which is condition-dependent), and Figure 9 illustrates the performance stability:
Insights:
Error scales with USV magnitude: Both DTR and ResMLP show increasing error with larger voltage dips, which is expected given the wider range of target values. Notably, ResMLP consistently achieves lower or comparable error to DTR across all conditions.
Load independence: Errors at 0 Nm and 10 Nm are similar, suggesting the models generalize reasonably across load points. The slightly higher error at 10 Nm is attributable to higher current magnitudes and stronger non-linear motor dynamics at higher power.
TCN degradation with severity: The TCN's NVF MAE scales proportionally with USV magnitude, reaching 1.58% at 10 Nm with a 15 V dip. This is problematic for diagnostics, as it introduces greater uncertainty precisely when the fault severity is highest and accurate diagnosis is most critical.
4.6. Training Dynamics and Convergence
Figure 10 displays training and validation loss curves for ResMLP-3 and TCN over 200 epochs:
Observations:
ResMLP smooth convergence: ResMLP training loss decreases monotonically and validation loss stabilizes around epoch 80–100. The small train–validation gap indicates minimal overfitting despite the small training set. Final validation loss reaches approximately 0.0001 (combined impedance and NVF loss).
TCN high variance: TCN loss exhibits substantial epoch-to-epoch fluctuation (noise), and validation loss remains consistently high (>0.001), indicating that the model struggles to learn a stable representation from raw voltage inputs. The model fails to converge to a good solution even after 200 epochs.
Gradient stability: ResMLP’s smooth curves suggest well-conditioned gradients, likely due to the combination of layer normalization and skip connections. TCN’s noise suggests gradient instability, possibly due to the vanishing gradient problem in deep convolutional stacks or the inherent difficulty of the raw-signal learning task.
4.7. Prediction Scatter Plots and Residual Analysis
Figure 11 compares predicted vs. actual NVF for ResMLP-3 and TCN:
ResMLP Prediction Quality:
The scatter plot reveals that ResMLP predictions are tightly concentrated along the true = predicted diagonal. Residuals (True − Predicted) are normally distributed with mean near zero and standard deviation of approximately 0.0007, consistent with the reported MAE.
TCN Prediction Issues:
The TCN plot shows:
Substantial vertical spread, indicating high variance in predictions.
Systematic upward bias (predictions often exceed true values), suggesting the model has learned to over-estimate NVF, possibly as a conservative safety measure due to training instability.
Non-uniform residual distribution, with larger residuals at higher NVF values.
4.8. Computational Requirements and Deployment Feasibility
Table 7 summarizes practical metrics relevant to embedded deployment:
To contextualize the overhead of the deep learning techniques, the baseline analysis was expanded to include industry-standard ensemble methods. Random Forest (RF) achieves an MAE of 0.0654 Ω and Gradient Boosting (XGBoost) an MAE of 0.0487 Ω on the same 99% hold-out test partition. XGBoost improves on the base DTR (0.0524 Ω), yet the unified multi-task ResMLP (0.0412 Ω) retains a statistically meaningful accuracy advantage while simultaneously solving impedance regression and NVF detection in a single forward pass.
While the multi-task ResMLP architecture achieves a lower MAE (0.0412 Ω), this precision incurs a latency cost (2.3 ms inference time) compared to the simpler DTR (0.5 ms). The 0.0112 Ω absolute improvement (a 21% relative reduction) represents a classic model-complexity trade-off: for basic condition monitoring, simpler models are sufficient; for high-precision diagnostic systems operating within the typical 16.6–20 ms grid cycle, however, the deep learning latency is entirely viable.
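Latency figures of this kind can be reproduced with a straightforward timing loop; the sketch below is a generic CPU measurement procedure, not the exact benchmarking harness behind Table 7.

```python
import time
import torch

def measure_latency(model, example_input, n_warmup=50, n_runs=500):
    """Median single-sample CPU inference latency in milliseconds."""
    model.eval()
    with torch.no_grad():
        for _ in range(n_warmup):                 # warm up caches and lazy initialization
            model(example_input)
        times = []
        for _ in range(n_runs):
            t0 = time.perf_counter()
            model(example_input)
            times.append((time.perf_counter() - t0) * 1e3)
    return sorted(times)[len(times) // 2]

# Usage sketch (names from the earlier ResMLP snippet):
# print(measure_latency(MultiHeadResMLP(in_dim=24), torch.randn(1, 24)), "ms")
```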
6. Conclusions
This paper presents a comprehensive study of deep learning approaches for induction motor diagnostics under imbalanced supply voltage. We propose a Multi-Head Residual MLP architecture that jointly estimates motor phase impedances and detects voltage imbalance, achieving superior accuracy compared to traditional Decision Tree baselines while maintaining practical deployment constraints.
Key findings:
ResMLP-3 (three residual blocks) achieves an impedance estimation MAE of 0.0412 Ω under Block Time-Series Split evaluation (a 21% improvement over the DTR baseline of 0.0524 Ω) and an NVF MAE of 0.0007 (equivalent to DTR), with an inference latency of 2.3 ms suitable for real-time embedded systems.
Multi-task learning, where impedance and NVF estimation are jointly optimized, provides regularization benefits that improve generalization in low-data regimes (1% training split).
Raw-voltage Temporal Convolutional Networks struggle with impedance estimation due to the fundamental voltage–current relationship but can provide supplementary imbalance severity indicators (NVF MAE 0.0054) in voltage-only scenarios.
Ablation studies confirm that three residual blocks represent the optimal trade-off between model capacity and overfitting, with degrading performance for deeper architectures under the training data constraints.
Operating condition analysis demonstrates robustness across load points (0 Nm and 10 Nm) and USV magnitudes (5–15 V), with consistent performance improvements over baselines.
Practical recommendations:
For industrial deployment, we recommend:
Primary method: ResMLP-3 with STLSP features for high-precision impedance and imbalance diagnostics.
Fallback method: TCN on raw voltage for degraded but functional diagnostics when current sensors are unavailable.
Confidence: Ensemble multiple initializations to provide prediction confidence intervals for diagnostic decision support.
Adaptation: Implement periodic re-training or fine-tuning as new data accumulates to track motor aging and maintain accuracy over time.
This work opens pathways for further research in deep learning-based condition monitoring, with immediate applications to industrial motor protection and predictive maintenance. The release of trained models and inference code would accelerate adoption in the community.