Article

Investigation on the Use of 2D-DOST on Time–Frequency Representations of Stray Flux Signals for Induction Motor Fault Classification Using a Lightweight CNN Model

by Geovanni Díaz-Saldaña 1,2, Luis Morales-Velazquez 1, Vicente Biot-Monterde 2 and José Alfonso Antonino-Daviu 2,*

1 Cuerpo Académico Mecatrónica, Facultad de Ingeniería, Universidad Autónoma de Querétaro, Campus San Juan del Río, Av. Río Moctezuma 249, San Juan del Río 76807, Querétaro, Mexico
2 Instituto Tecnológico de la Energía (ITE), Universitat Politècnica de València (UPV), Camino de Vera S/N, 46022 Valencia, Spain
* Author to whom correspondence should be addressed.
Machines 2025, 13(11), 1001; https://doi.org/10.3390/machines13111001
Submission received: 24 September 2025 / Revised: 29 October 2025 / Accepted: 29 October 2025 / Published: 31 October 2025
(This article belongs to the Section Electrical Machines and Drives)

Abstract

Condition monitoring and fault detection in induction motors (IMs) are priorities in the industrial environment to secure safe conditions for processes and production. Convolutional Neural Networks (CNNs) are gaining interest in these tasks because they automatically extract features from their inputs, often Time–Frequency Distributions (TFDs) obtained with various transforms that are fed directly into large classification models. This work proposes the application of a texture analysis tool widely used in the medical field, the 2D Discrete Orthonormal Stockwell Transform (2D-DOST), to improve the accuracy of a lightweight CNN for different TFDs, comparing the results against the same TFDs used directly in RGB and grayscale. The results show that the 2D-DOST improves the classification accuracy by two to five percentage points for all motor conditions under study, with minimal variation in training time compared to RGB or grayscale images, opening the possibility of using image processing tools on TFDs to improve automatic feature extraction while keeping CNN models small.

1. Introduction

The extensive use of induction motors (IMs) in the industrial environment, due to their robustness and flexibility of application for a vast number of tasks and mechanisms, makes them a priority for maintenance and condition monitoring, as the presence of faults could lead to performance reduction, increased production costs, and, in the worst-case scenario, catastrophic failure of the machines [1,2]. Condition monitoring and fault detection are important tasks in industry due to their connection to predictive maintenance, minimizing production downtime, improving maintenance and production efficiency, protecting the equipment and increasing safety in the workplace. For many years, these areas have been of great interest to researchers and industry, producing a plethora of techniques and methodologies to detect and identify the appearance of irregularities in the physical magnitudes available for monitoring IMs (e.g., stator current, vibration signals, temperature, acoustic emissions, voltages, electrical power and, in more recent times, magnetic flux) and to relate these variations to the emergence of mechanical or electrical faults by means of temporal, spectral or statistical analysis, as well as automated processing techniques like Machine Learning (ML) and Deep Learning (DL) [1,2,3,4,5]. Convolutional Neural Networks (CNNs), one of the most popular DL tools, allow for the automatic extraction of features from 1-D or 2-D inputs and achieve excellent classification results without the need for expert knowledge; when paired with Time–Frequency Distribution (TFD) images, they have become a growing trend for fault diagnosis.
One of the components of an IM that attracts the most attention is the bearings, as they represent up to 40% of the reported faults [1], with vibration signals as the preferred source of information for diagnosis. Many approaches have been presented, like the one in [6], where the authors analyzed the HUST dataset applying Multitask Learning (MTL) to diagnose six faulty conditions with only one severity on different types of bearings, using Continuous Wavelet Coefficient Maps (CWCMs) as inputs to a Long Short-Term Memory (LSTM) + CNN model. In [7], a comparison between different TFDs on the Case Western Reserve University (CWRU) bearing dataset and experimental results showed that the use of the Fourier Synchrosqueezed Transform (FSST) on vibration signals achieved great performance when feeding 3-channel images of 28 × 28 pixels to a CNN, outperforming other popular TFDs, such as the Continuous Wavelet Transform (CWT) or the Short-Time Fourier Transform (STFT). However, in [8], a similar approach is presented for bearing fault diagnosis, using Transfer Learning (TL) and RGB images of 224 × 224 pixels, with the authors concluding that the CWT and FSST performed better than other TFDs. The superiority of the FSST for bearing diagnosis is confirmed in [9], where the authors compared its performance against the Wavelet Synchrosqueezed Transform (WSST) on the CWRU dataset, and in [10], where current signals went through the FSST and Hilbert Envelope Spectrum for diagnosing an outer race (OR) defect on bearings, identifying the related frequency component with ease. With a different approach, the authors of [11] rearranged 1-D signals from different sensors and of different natures into 2-D matrices, then assembled them to generate 120 × 120 inputs for a CNN model, obtaining good classification results for five faulty bearing conditions and highlighting that the reduction in model parameters leads to a reduction in classification accuracy. In a similar manner, the work reported in [12] presents a comparison of sound and vibration signals from the CWRU and National Aeronautics and Space Administration (NASA) datasets, analyzed by the Wavelet Scattering Transform (WST) and the Adaptive Superlet Transform (ASLT) to generate RGB and grayscale images of five sizes ranging from 16 × 16 to 224 × 224 pixels and classified by a hybrid CNN–Support Vector Machine (SVM) model, achieving better performance with the biggest images in grayscale for the diagnosis of three bearing conditions. The authors of [13] developed a system based on Internet of Things (IoT) technology for bearing fault detection using the STFT to process vibration signals, diagnosing the conditions by means of a CNN model with three convolutional and pooling layers with different kernel sizes, achieving a methodology that could be implemented online and offline. In [14], the authors created a Graph Convolutional Network (GCN) to tackle bearing fault detection and Remaining Useful Life (RUL) estimation using vibration signals from a motor under different operating conditions, obtaining good results in both tasks with a novel DL method. A different approach is found in [15], where the authors filtered the fundamental component from current signals to identify the harmonics related to different bearing conditions using both the CWT and STFT and compared the diagnostic performance of five CNN models extensively used in the literature, finding an improvement over non-filtered signals.
Other conditions studied with this kind of approach include broken rotor bars (BRBs), as reported in [16], where the authors analyzed current signals using the STFT on the transient state, achieving good results for the diagnosis of four conditions with 25 × 25 grayscale images and a compact CNN model. Similarly, the authors of [17] developed a method to diagnose five BRB conditions from STFT images of vibration signals, using TL with four well-known CNN models and showing an improvement in detection accuracy over a CNN model that was not previously trained. Faults related to the kinematic chain connected to the IM have also been diagnosed, as reported by the authors of [18], where gearbox wear was identified by the analysis of stray flux signals using a variant of the Multiple Signal Classification (MUSIC) algorithm, the Short-Time MUSIC (ST-MUSIC). Other investigated faults, such as eccentricities and misalignment, were studied by the authors of [19] by applying the Persistence Spectrum to stray flux signals, generating 224 × 224 RGB images that fed a CNN model and allowing the detection of five different conditions related to these faults.
Similarly, TFDs and CNNs have been used to identify faults of different origins in IMs. One example is the research presented in [20], where the authors applied the CWT to vibration and current signals to diagnose six different conditions in an IM: healthy, shorted turns, unbalance, bearing inner race (IR) defect, BRB and bent rotor; the results suggested that the use of 128 × 128 grayscale images and the fusion of signals and features improved the diagnosis. The authors of [21] presented a comparison of four TFDs for the diagnosis of a BRB, imbalance, and bearing defects using vibration signals, offering an analysis of the way these TFDs display the information and concluding that the ST-MUSIC gives more detailed information about the components related to the faults under study. In [22], the authors used current signals from multiple sensors to diagnose BRB and bearing faults in a motor, transforming the 1-D signals into 224 × 224 RGB images fed to a TL-deep CNN (dCNN) model, obtaining a good performance using a larger CNN model when compared to other popular tools.
It can be seen that the use of TFDs as inputs for CNNs has become a trend for fault diagnosis; however, most of the time, the images are used directly as obtained, in RGB or grayscale, without a processing step that could improve the identification of the conditions under study, as is done in other fields. One technique that has shown improvements in the medical diagnosis area is the 2-D Discrete Orthonormal Stockwell Transform (2D-DOST); this transform provides local spatial frequency information from images that can be used to characterize the horizontal and vertical frequency patterns and obtain sets of texture features [23]. This technique has been used to diagnose multiple medical conditions, as in [24], where the authors applied it to Magnetic Resonance Images (MRIs) to diagnose eight conditions using CNNs, outperforming other established methodologies in the area. The authors of [25] used the DOST to analyze multiple sclerosis MRIs and compared the results to the Polar Stockwell Transform (PST), finding a slight advantage when using the PST; on the other hand, in [26], the authors proposed a fusion methodology considering the 2D-DOST, the Orthogonal Ripplet II Transform (ORTII) and the original MRIs for the classification of Alzheimer’s disease using a CNN model, concluding that fusing the extracted features improved the results. The 2D-DOST has also been used in cancer prognosis [27], glioblastoma prediction [28], and seizure identification from MRIs during pregnancy [29].
Considering the aforementioned, it is evident that the use of CNNs and TFDs for the diagnosis of a variety of fault conditions in IMs is a growing trend; however, most works focus on a particular component or type of fault, rely primarily on well-known databases with no empirical experimentation, employ vibration and current signals, whose sensors can be invasive, as the preferred information sources, and use TFD images of considerable size with no extra processing and large CNN models for classification. In a recent work [30], the authors explored the use of the 2D-DOST on STFT and CWT images from stray flux signals for the diagnosis of bearing faults in an IM with a lightweight CNN model, identifying an improvement in accuracy when applying the 2D-DOST to the grayscale CWT images, while the opposite was observed on the STFT dataset. In this paper, a continuation of the previous work is carried out, proposing a methodology where stray flux signals are analyzed using the STFT, CWT, FSST and ST-MUSIC to generate TFDs, which are then saved in RGB and grayscale, with a third set processed with the 2D-DOST, for the diagnosis of one healthy and eleven faulty conditions including bearing faults, BRBs of different severities, unbalances, misalignments and a loosened bolt. The results compare the performance of several configurations of TFDs and image types, including training times, accuracy, and the effect of processing tools on the inputs of a lightweight CNN model. With these results, the proposed method presents an alternative to improve the classification accuracy of a small CNN model for multiple fault diagnosis across various TFDs by employing a texture analysis tool as a preprocessing step in the analysis of stray flux signals acquired with non-invasive sensors.

2. Materials and Methods

In this section, the theoretical background of the tools and techniques used in this research, as well as the methodology employed for the analysis of the stray flux signals and the diagnosis of motor fault conditions are described.

2.1. Theoretical Background

In this subsection, the theoretical fundamentals and mathematical descriptions of the tools employed are presented. First, fault-related frequency components are summarized, as traditional methodologies focus on the detection of these components for the diagnosis of faults in IMs in the frequency or time–frequency domains. Secondly, the TFDs previously discussed as tools for fault detection with CNNs are described, as they form the basis of the proposed methodology, together with the 2D-DOST, which, as noted above, is used as a texture analysis tool in medical image processing and is intended here as a preprocessing step to assess its capability to improve fault diagnosis using a CNN model.

2.1.1. Fault Detection in Induction Motors

The presence of a faulty condition in an IM changes the way it behaves, and it can be reflected in the physical magnitudes that can be measured from the motor. Throughout the years, researchers have identified the frequency components related to the most common faults in vibration and current signals, which are the preferred sources for diagnosing IMs, and more recently, stray-flux analysis has been applied to this end. Table 1 summarizes the frequency components in the vibration, current, and magnetic flux signals related to the faults considered in this research, which are extensively used in the literature for the diagnosis of fault conditions in IMs through spectral analysis and frequency decomposition methods. Equations (1)–(10) consider fov as the frequency related to OR defects in vibration signals; foc as the frequency related to OR defects in electrical signals; fbrb as the components related to BRB; fecc as the component related to eccentricities; fem as a component related to both eccentricity and misalignment; fs as the supply frequency; fr as the rotational frequency of the rotor; s as the slip of the machine; k as an integer, k = 1, 2, 3, …; Nb as the number of balls in the bearing; Db as the diameter of the balls; Dc as the pitch or cage diameter; β as the contact angle between a ball and the raceway; Nr as the number of rotor slots; nd = 0 for static eccentricity; and nd = 1, 2, 3, … for dynamic eccentricity.
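To make Equations (1)–(3) concrete, the short Python sketch below evaluates the outer-race and broken-rotor-bar frequency components for a 50 Hz supply; the supply frequency, rotational frequency, slip and bearing geometry values used here are illustrative placeholders, not the parameters of the motor or bearing tested in this work.

```python
import numpy as np

def outer_race_vibration_freq(f_r, n_balls, d_ball, d_pitch, beta_rad):
    """Eq. (1): characteristic outer-race defect frequency in vibration signals."""
    return (n_balls / 2.0) * f_r * (1.0 - (d_ball / d_pitch) * np.cos(beta_rad))

def outer_race_flux_freqs(f_s, f_ov, k_max=3):
    """Eq. (2): |f_s +/- k*f_ov| sidebands seen in current and flux spectra."""
    k = np.arange(1, k_max + 1)
    return np.sort(np.unique(np.abs(np.concatenate([f_s + k * f_ov, f_s - k * f_ov]))))

def brb_sidebands(f_s, slip, k_max=3):
    """Eq. (3): f_s*(1 +/- 2*k*s) sidebands around the supply frequency."""
    k = np.arange(1, k_max + 1)
    return np.sort(np.concatenate([f_s * (1 - 2 * k * slip), f_s * (1 + 2 * k * slip)]))

if __name__ == "__main__":
    f_s, f_r, slip = 50.0, 16.0, 0.04            # illustrative supply/rotor frequencies and slip
    f_ov = outer_race_vibration_freq(f_r, n_balls=8, d_ball=6.7e-3,
                                     d_pitch=28.5e-3, beta_rad=0.0)  # illustrative geometry
    print("f_ov  =", round(f_ov, 2), "Hz")
    print("f_oc  =", np.round(outer_race_flux_freqs(f_s, f_ov), 2))
    print("f_brb =", np.round(brb_sidebands(f_s, slip), 2))
```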

2.1.2. Short-Time Fourier Transform (STFT)

The STFT is a widely used TFD, which performs the Fourier Transform of a signal using a fixed size sliding window, granting the capability to obtain both time and frequency information, allowing the detection of transient phenomena. Equation (11) shows the mathematical definition of the STFT, where x(t) is the time-domain signal, h(τ − t) is the window function centered at time t, and ω represents the angular frequency [32,33].
STFT(\tau, \omega) = \int_{-\infty}^{+\infty} x(t)\, h(t - \tau)\, e^{-j \omega t}\, dt   (11)
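As an illustration only (the processing in this work was performed in MATLAB), the minimal Python/SciPy sketch below computes an STFT spectrogram for a surrogate 1 kHz signal; the 256-sample Hann window follows Section 3.1, while the 75% overlap is an assumption.

```python
import numpy as np
from scipy.signal import stft

fs = 1000                      # stray flux sampling rate used in this work (1 kHz)
t = np.arange(0, 5, 1 / fs)    # one 5 s analysis window
x = np.sin(2 * np.pi * 50 * t) + 0.1 * np.random.randn(t.size)  # surrogate signal

# 256-sample Hann window; |Zxx| is the spectrogram magnitude used to build the TFD image
f, tt, Zxx = stft(x, fs=fs, window='hann', nperseg=256, noverlap=192)
spectrogram_db = 20 * np.log10(np.abs(Zxx) + 1e-12)
```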

2.1.3. Continuous Wavelet Transform (CWT)

The CWT was developed to tackle problems related to the fixed resolution of STFT, presenting an alternative capable of multi-resolution analysis thanks to the use of the mother wavelet functions that adapt to the changes in frequency, making it adequate for the analysis of non-stationary signals. Equation (12) presents the definition of the CWT, where x(t) is the signal in the time domain, Cw(a, b) is the function of the scale (a) and translation (b), and ψ is the mother wavelet function [8,34].
C_w(a, b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} x(t)\, \psi\!\left(\frac{t - b}{a}\right) dt   (12)
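A minimal scalogram sketch with PyWavelets (assumed available) is shown below; the Morlet mother wavelet matches the choice reported in Section 3.1, while the logarithmic 1–500 Hz scale grid is an assumption made for illustration.

```python
import numpy as np
import pywt  # PyWavelets, assumed available

fs = 1000
t = np.arange(0, 5, 1 / fs)
x = np.sin(2 * np.pi * 50 * t) + 0.1 * np.random.randn(t.size)

# Morlet mother wavelet; scales chosen so the pseudo-frequencies span roughly 1-500 Hz
freqs_target = np.logspace(0, np.log10(500), 128)                 # logarithmic frequency axis
scales = pywt.central_frequency('morl') * fs / freqs_target
coeffs, freqs = pywt.cwt(x, scales, 'morl', sampling_period=1 / fs)
scalogram = np.abs(coeffs)                                        # |CWT| magnitude map
```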

2.1.4. Fourier Synchrosqueezed Transform (FSST)

The Synchrosqueezing Transform is a post-processing method that aims to provide a sharper, more concentrated representation of multicomponent signals in the TFDs obtained from linear transforms, granting better separation between the frequency components and demodulation of the different modes in the signal while being a reversible process [29], and producing a higher frequency and time resolution compared to other techniques [30]. The FSST applies the Synchrosqueezing process to the STFT, producing a sharper spectrogram with narrower components, with the mathematical definition in Equation (13), where Vgf(t,η) is the STFT of a function (or signal) f filtered at a frequency η using the spectral window g; t and ω correspond to the observed time and frequency, respectively; and Ωgf(t,η) represents the instantaneous frequency obtained using the phase transform [7,35,36].
FSST(t, \omega) = \int_{-\infty}^{+\infty} V_g f(t, \eta)\, \delta\!\left(\omega - \Omega_g f(t, \eta)\right) d\eta   (13)
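The FSST images in this work were generated in MATLAB; purely as an illustrative NumPy sketch of Equation (13), the function below computes a Gaussian-window STFT, estimates the instantaneous frequency Ω from the ratio between the derivative-window STFT and the standard one, and reassigns each coefficient to the bin of its estimated frequency. The window length, hop and Gaussian width are assumptions.

```python
import numpy as np

def fsst(x, fs, win_len=256, hop=64, nbins=None):
    """Minimal Fourier synchrosqueezing sketch (Eq. (13)): STFT energy is
    reassigned along the frequency axis to the instantaneous-frequency
    estimate Omega(t, eta) derived from the STFT phase."""
    n = win_len
    nbins = nbins or n // 2 + 1
    tau = np.arange(n) - n // 2
    sigma = n / 6.0
    g = np.exp(-0.5 * (tau / sigma) ** 2)            # Gaussian window g
    dg = -(tau / sigma ** 2) * g                     # its derivative g'
    freqs = np.fft.rfftfreq(n, d=1 / fs)             # analysis frequencies eta (Hz)

    frames = range(0, len(x) - n, hop)
    T = np.zeros((nbins, len(frames)), dtype=complex)
    for col, start in enumerate(frames):
        seg = x[start:start + n]
        V = np.fft.rfft(seg * g)                     # V_g f(t, eta)
        Vd = np.fft.rfft(seg * dg)                   # V_g' f(t, eta)
        good = np.abs(V) > 1e-8
        # instantaneous-frequency estimate: Omega = eta - Im(V_g'/V_g) * fs / (2*pi)
        omega = freqs.copy()
        omega[good] = freqs[good] - np.imag(Vd[good] / V[good]) * fs / (2 * np.pi)
        # squeeze: accumulate each STFT coefficient into the bin of its estimate
        bins = np.clip(np.round(omega / (fs / 2) * (nbins - 1)).astype(int), 0, nbins - 1)
        np.add.at(T, (bins[good], col), V[good])
    return np.abs(T), freqs

# Tx, f = fsst(x, fs=1000)   # sharper ridges than the plain |STFT| of the same signal
```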

2.1.5. Short-Time Multiple Signal Classification

The MUSIC algorithm is a high-resolution method capable of detecting frequency components in signals with important levels of noise [21,37]. This algorithm considers that the discrete-time signal x[n] can be represented as the sum of m complex sinusoids in noise e[n], as shown in Equations (14) and (15), where N is the number of samples in the signal; n = 0, 1, 2, …, N − 1; Bi is the complex amplitude of the i-th complex sinusoid with frequency fi; and e[n] is a sequence of white noise with zero mean and variance σ2.
x[n] = \sum_{i=1}^{m} \bar{B}_i\, e^{j 2\pi f_i n} + e[n]   (14)
\bar{B}_i = B_i\, e^{j \varphi_i}   (15)
Equation (16) presents the autocorrelation matrix, R, of the noisy signal x[n] as the sum of the autocorrelation matrices of the pure signal, Rs, and the noise, Rn, with P as the number of frequencies; the exponent H as the Hermitian transpose; I corresponding to the identity matrix; and e(fi) as the signal vector given by Equation (17).
R = R_s + R_n = \sum_{i=1}^{P} |B_i|^2\, e(f_i)\, e^H(f_i) + \sigma_n^2 I   (16)
e^H(f_i) = \left[\, 1 \;\; e^{j 2\pi f_i} \;\; \cdots \;\; e^{j 2\pi f_i (N-1)} \,\right]   (17)
From the orthogonality condition of the two subspaces, the MUSIC pseudospectrum Q is given by Equation (18), where Vm+1 is the noise eigenvector matrix; the pseudospectrum exhibits peaks corresponding to the frequencies of the principal sinusoidal components, at which e^H(f)Vm+1 = 0.
Q_{MUSIC}(f) = \frac{1}{\left| e^H(f)\, V_{m+1} \right|^{2}}   (18)
To capture the time information, the signal is analyzed using a sliding window, obtaining the MUSIC pseudo-spectrum of each window, and condensing the information to form a TFD that preserves the information and reduces the noise present in the signals while evidencing large components [37].
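The following NumPy/SciPy sketch illustrates how Equations (14)–(18) translate into a Short-Time MUSIC TFD: each sliding window yields an autocorrelation matrix, its noise subspace, and a pseudospectrum that is stacked column by column. The model order, autocorrelation size, window length and hop used here are illustrative assumptions, not the settings used for the results in this paper.

```python
import numpy as np
from scipy.linalg import eigh, toeplitz

def music_pseudospectrum(x, n_sinusoids, freqs, corr_order=64):
    """MUSIC pseudospectrum (Eq. (18)) of one signal window.
    freqs are normalized frequencies in cycles/sample."""
    # estimate the autocorrelation matrix R (Eq. (16)) from the window
    r = np.correlate(x, x, mode='full')[len(x) - 1:len(x) - 1 + corr_order]
    R = toeplitz(r / len(x))
    # eigendecomposition: the smallest eigenvalues span the noise subspace
    w, V = eigh(R)
    noise = V[:, :corr_order - n_sinusoids]            # noise eigenvectors (V_{m+1})
    k = np.arange(corr_order)
    Q = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        e = np.exp(2j * np.pi * f * k)                 # steering vector e(f), Eq. (17)
        proj = e.conj() @ noise                        # e^H(f) V_noise
        Q[i] = 1.0 / np.real(proj @ proj.conj())       # Eq. (18)
    return Q

def st_music(x, fs, n_sinusoids=6, win_len=256, hop=64, n_freqs=256):
    """Short-Time MUSIC: sliding-window pseudospectra stacked into a TFD."""
    freqs = np.linspace(0, 0.5, n_freqs)               # 0 .. fs/2 (normalized)
    cols = [music_pseudospectrum(x[s:s + win_len], n_sinusoids, freqs)
            for s in range(0, len(x) - win_len, hop)]
    return np.array(cols).T, freqs * fs                # shape: (frequency, time)
```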

2.1.6. Two-Dimensional Discrete Orthonormal Stockwell Transform

The Stockwell Transform (ST) was introduced in 1996 in [38], originating from the STFT and the CWT and providing an alternative with high frequency resolution at low frequencies and high time resolution at high frequencies by using a window of variable length [20]. The original definition of the ST presented a high redundancy, which affected its computational cost; this problem led to the development of mathematical formulations to reduce the computational cost, with the DOST being one of the most used, having the computational cost of the Fast Fourier Transform (FFT) [39]. Equation (19) presents the mathematical definition of the 2D-DOST, where, considering an N × N image f(x,y), F(m,n) is the 2-D Fourier Transform of the image f; u and v are the shift parameters that control the center of the window on the x- and y-axes; fu and fv are the frequencies related to the scale parameters; vx and vy are the horizontal and vertical frequencies, respectively; and px, py = 0, 1, …, log2(N) − 1.
S(u, v, f_u, f_v) = \frac{1}{\sqrt{2^{\,p_x + p_y - 2}}} \sum_{m = -2^{p_x - 2}}^{2^{p_x - 2} - 1} \; \sum_{n = -2^{p_y - 2}}^{2^{p_y - 2} - 1} F\!\left(m + v_x,\; n + v_y\right)\, e^{\,2\pi j \left( \frac{m u}{2^{p_x - 1}} + \frac{n v}{2^{p_y - 1}} \right)}   (19)
The resulting image, S, has the same size as f and contains information about the frequencies (fu, fv) within a bandwidth of 2^(px−1) × 2^(py−1) [24].
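As a simplified sketch of Equation (19) (positive-frequency bands only; the full 2D-DOST also covers the negative-frequency bands and produces an output the same size as the input), the function below partitions the 2-D FFT into dyadic bands, applies a local inverse FFT to each band, normalizes by the square root of the bandwidth, and tiles the magnitudes into a texture map.

```python
import numpy as np

def dost_bands(n):
    """Dyadic one-sided frequency bands for a length-n axis: widths 1, 1, 2, 4, ..."""
    bands = [(0, 1), (1, 2)]
    lo = 2
    while lo < n // 2:
        bands.append((lo, min(2 * lo, n // 2)))
        lo *= 2
    return bands

def dost2d_positive(img):
    """Simplified 2D-DOST sketch (Eq. (19)), positive-frequency bands only.
    Each pair of dyadic bands of the 2D FFT is inverse-transformed locally,
    normalized by sqrt(bandwidth), and tiled into an (N/2 x N/2) magnitude map."""
    img = img.astype(float)
    N = img.shape[0]                       # assumes a square N x N image, N a power of two
    F = np.fft.fft2(img)
    bands = dost_bands(N)
    out = np.zeros((N // 2, N // 2))
    row = 0
    for (ylo, yhi) in bands:
        col = 0
        for (xlo, xhi) in bands:
            block = F[ylo:yhi, xlo:xhi]                        # band-limited spectrum
            local = np.fft.ifft2(block) * np.sqrt(block.size)  # local DOST coefficients
            out[row:row + block.shape[0], col:col + block.shape[1]] = np.abs(local)
            col += xhi - xlo
        row += yhi - ylo
    return out

# texture_map = dost2d_positive(gray_image_64x64)
```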

2.2. Proposed Methodology

Figure 1 shows a block diagram illustrating the stages included in the methodology. The first step is normalizing the stray flux signals to a [−1, 1] amplitude range; then, the signals are windowed, considering 5 s segments and a 1 s window displacement. Each window is processed in the second stage to generate time–frequency representations for the analysis using the STFT, the CWT, the FSST, and the ST-MUSIC; the representations are then saved as RGB images of 64 × 64 pixels and converted to grayscale. Additionally, the grayscale images undergo a 2D-DOST postprocessing to generate new images. All three sets are then fed to a lightweight CNN model to classify the data into twelve different motor conditions; this process is repeated for each TFD and image set.
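A minimal sketch of the first stage of Figure 1 (normalization and windowing) is given below; min-max scaling to [−1, 1] is an assumption, since only the target amplitude range is specified.

```python
import numpy as np

def normalize(signal):
    """Map a stray flux signal to the [-1, 1] amplitude range (min-max scaling
    is assumed; only the target range is stated in the methodology)."""
    smin, smax = signal.min(), signal.max()
    return 2.0 * (signal - smin) / (smax - smin) - 1.0

def sliding_windows(signal, fs=1000, win_seconds=5, hop_seconds=1):
    """Split a signal into 5 s segments with a 1 s displacement (Figure 1)."""
    win, hop = int(win_seconds * fs), int(hop_seconds * fs)
    return [signal[s:s + win] for s in range(0, len(signal) - win + 1, hop)]

# Example: one 45 s steady-state record at 1 kHz yields (45 - 5) / 1 + 1 = 41 windows
flux = np.random.randn(45 * 1000)              # placeholder for one measured flux axis
windows = sliding_windows(normalize(flux))
print(len(windows))                            # -> 41
```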

Experimental Setup

The experiments were conducted on a 1 hp, six-pole, three-phase induction motor with a rated speed of 1140 rpm, line-fed with 460 V at 50 Hz. The motor was coupled to a generator, acting as a load, and all tests were performed at the nominal load level. For the acquisition of the stray flux signals, a three-axis magnetometer was employed, model DM1422AGMV with a range of ±1200 µT and a 0.042 µT/LSB sensitivity at a sampling rate of 1 kHz, allowing for the measurement of the axial and radial flux components, as well as the combination of both in an orthogonal axis. This sensor was placed centered on the side of the IM, as is common practice, since it can measure in the three orthogonal axes from a single spot and that position allowed for better measurements.
In this study, one “healthy” (HLT) condition and eleven faulty conditions were considered. The faulty states could be grouped according to the source of the fault: bearings, rotor, misalignment, unbalance of the load, and attachment to the test bench. For the bearing-related faults, bearings model NSK-6203 DU, with a 17 mm internal diameter, a 40 mm external diameter, and eight balls were used; the conditions included a 1 mm OR hole, no grease (NG), and induced corrosion by submerging the bearing in brine for one (COR0) and two days (COR1), respectively, followed by three days of drying. The rotor-related conditions considered were one and two broken rotor bars (1BRB and 2BRB, respectively), induced by drilling small holes in the rotor bars. For the misalignment conditions, three angular deviations were caused by placing shims under the motor’s feet: 0.8° (Mal0.8), 1.24° (Mal1.24), and 1.6° (Mal1.6). The unbalanced conditions were created by using an asymmetric coupling between the motor and the load (UNB0), applying a mass of 5.25 g with a center of mass at a 7.25 mm radius from the center of the shaft, and attaching weight to the asymmetric coupling (UNB1), applying a mass of 99.35 g with a center of mass at a 21.45 mm radius from the center of the shaft. Finally, for the last condition, a bolt fixing the motor to the test bench was loosened to create minor disturbances to the system (LB). Figure 2 shows the test bench and some conditions and elements used for the experiments. For each condition, six tests were performed, capturing three stray flux signals for 45 s of the steady state per test.
The data were processed on a personal computer with an Intel Core i9-13980HX CPU at 2.2 GHz, 32 GB of RAM, and an NVIDIA GeForce RTX 4060 Laptop GPU, using MATLAB 23.2.0.2859533 (R2023b) Update 10. The CNN model consisted of the 11-layer structure defined in Table 2, which was trained using Stochastic Gradient Descent with Momentum, considering a learning rate of 0.001, a mini-batch size of 30, a maximum of 5 epochs, and a validation frequency of 30. Because both RGB and grayscale images were considered in the experiments, the size of the Input Layer changed according to the number of channels of the image, and the datasets were divided randomly with a distribution of 60% for training, 20% for validation, and 20% for testing. The model has 99.1 k learnable parameters, significantly fewer than well-known models like GoogLeNet with 6.9 M, AlexNet with 60.9 M, ResNet18 with 11.6 M, ResNet50 with 25.5 M, or VGG-16 with 138.3 M; this difference situates the proposed model as a lightweight model due to its reduced size and number of parameters [40].
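For readers outside MATLAB, a PyTorch approximation of the Table 2 architecture is sketched below; 'same' padding and a momentum of 0.9 are assumptions (the padding choice is what reproduces the reported ~99.1 k learnable parameters for 64 × 64 RGB inputs).

```python
import torch
import torch.nn as nn

class LightweightCNN(nn.Module):
    """Approximation of the 11-layer model in Table 2 (Conv-BN-ReLU-MaxPool-
    Conv-BN-ReLU-FC-SoftMax). 'Same' padding is assumed."""
    def __init__(self, in_channels=3, n_classes=12):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels, 8, kernel_size=3, padding=1),
            nn.BatchNorm2d(8),
            nn.ReLU(inplace=True),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(8, 8, kernel_size=3, padding=1),
            nn.BatchNorm2d(8),
            nn.ReLU(inplace=True),
        )
        self.classifier = nn.Linear(8 * 32 * 32, n_classes)   # 64x64 input -> pooled 32x32

    def forward(self, x):
        x = self.features(x)
        return self.classifier(torch.flatten(x, 1))

model = LightweightCNN(in_channels=3, n_classes=12)
print(sum(p.numel() for p in model.parameters() if p.requires_grad))   # ~99.1 k

# Training settings reported in the paper: SGD with momentum, lr = 0.001,
# mini-batch size 30, 5 epochs (the momentum value 0.9 is an assumption).
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
criterion = nn.CrossEntropyLoss()   # plays the role of the SoftMax + Classification layers
```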

3. Results

In this section, the results obtained with the previously established methodology are described, starting from the image generation to the classification process and comparing the performance in the fault identification process for each dataset.

3.1. Image Generation

The stray flux signals obtained from the experiments conducted in a laboratory were processed following the methodology defined in Section 2.2. Each of the transforms considered similar parameters; for the CWT, a Morlet mother wavelet was considered; for the STFT, FSST and ST-MUSIC, windows of 256 samples were considered. Additionally, in all cases, the representations considered a frequency range from 0 to 500 Hz, although the CWT representation considers a logarithmic distribution in the frequency axis, differing from the other transforms in data distribution but conserving its distinctive characteristic, as appreciated in Figure 3, where TFDs for a whole signal are displayed.
The time–frequency diagrams were then adapted and saved for further processing as mentioned in the methodology, keeping just the area of importance and resizing the pictures to a 64 × 64 resolution. Considering the three stray flux signals and the window length and overlap, a total of 861 pictures were generated for each condition. Each image corresponds to a five-second window, equivalent to 95 rotations of the machine shaft. Figure 4 provides a comparison of the time–frequency representations generated for all conditions in RGB format; it is possible to observe the differences between the representations, apart from the CWT frequency distribution, such as the similarities between the STFT and the FSST, and the components being more compact and continuous in the FSST images than in the CWT and ST-MUSIC sets, where “spots” can be observed all over the images.
Once the images were generated and saved as RGB files, the pictures were converted to grayscale and saved as a separate dataset, with samples displayed in Figure 5. In this dataset, less distinction between the conditions is noticeable as the color palette is less diverse, especially for the FSST. The other sets conserve some particular characteristics in the time–frequency spectrum for each condition.
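As an illustrative sketch of how a TFD matrix becomes the RGB and grayscale CNN inputs (the 'jet' colormap and the dB scaling are assumptions, since the specific mapping used to render the MATLAB images is not stated), the helper below uses Matplotlib and Pillow:

```python
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

def tfd_to_images(tfd, size=(64, 64), cmap_name='jet'):
    """Turn a TFD magnitude matrix into 64 x 64 RGB and grayscale images."""
    mag = 20 * np.log10(np.abs(tfd) + 1e-12)                 # dB scale
    mag = (mag - mag.min()) / (mag.max() - mag.min())        # normalize to [0, 1]
    rgba = plt.get_cmap(cmap_name)(mag)                      # apply colormap
    rgb = Image.fromarray((rgba[..., :3] * 255).astype(np.uint8)).resize(size)
    gray = rgb.convert('L')                                  # grayscale version
    return rgb, gray

# rgb_img, gray_img = tfd_to_images(np.abs(Zxx))   # e.g., |Zxx| from the STFT sketch
# rgb_img.save('STFT_HLT_0001.png'); gray_img.save('STFT_HLT_0001_gray.png')
```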
The grayscale figures were then processed using the 2D-DOST to obtain new representations, as this version of the DOST is used as a texture analysis tool in the image processing field, creating a different representation of the data in the pictures. In Figure 6, a comparison of the datasets is provided, having some elements in common and particularities regarding the information for each condition, making it harder to differentiate them.
From Figure 4, Figure 5 and Figure 6, it is possible to identify some variations in the data for the fault conditions regardless of the set of pictures; however, the subtlety of the changes in the spectra makes it hard for the human eye to easily identify the conditions by looking for specific frequency components, as is done in traditional approaches.

3.2. CNN Classification

As mentioned in the methodology, a lightweight CNN was used to classify the twelve conditions of the IM. For each set of images, ten training-testing-validation processes were carried out, obtaining parameters such as accuracy and training time to evaluate the impact of the 2D-DOST over the fault detection task. In Table 3, the accuracy and training time for the STFT sets are presented, and the best accuracy with the respective training time is highlighted for each set, as well as the best global performance. Additionally, the average time for the classification of one RGB picture was 0.001366 s, for grayscale images it was 0.001029 s, and for 2D-DOST-treated images it was 0.000980 s, meaning the proposed method allows for faster diagnostic decisions.
Considering that the 2D-DOST subset performed better on average, Table 4 shows the average confusion matrix for the set in percentage, presenting some difficulties in differentiating between related classes but achieving good results in most cases.
In Table 5, the accuracy and training time for the CWT sets are presented, and the best accuracy with the respective training time is highlighted for each set, as well as the best global performance. Additionally, the average time for the classification of one RGB picture was 0.000971 s, for grayscale images it was 0.000911 s, and for 2D-DOST-treated images it was 0.001025 s, meaning the proposed method presented a small increase in time relative to the other options.
Considering that the 2D-DOST subset performed the best, Table 6 shows the average confusion matrix for the set in percentage, demonstrating some difficulties in differentiating between classes but achieving good results in most cases, with the greater decrease in accuracy for the UNB0 condition.
Similarly, Table 7 shows the results obtained when using FSST images as a basis for the diagnosis, with the best results highlighted. In this case, slightly lower accuracy is observed for all cases. Likewise, the average time for the classification of one RGB picture was 0.001067 s, for grayscale images it was 0.000973 s, and for 2D-DOST-treated images it was 0.001009 s, showing not much variation in the times captured for the previous datasets.
Table 8 presents the average confusion matrix obtained for the 2D-DOST group, as it performed better than the other two image representations for the FSST, but presented greater problems for detecting unbalance-related classes when compared to the previous cases.
Continuing with the results from the CNN, Table 9 displays the accuracies and times achieved using the ST-MUSIC TFDs, with the best results highlighted. Contrary to the FSST case, this set performs better throughout the different data presentations. In a similar manner to the previous datasets, the average time for the classification of one RGB picture was 0.001128 s, for grayscale images it was 0.001006 s, and for 2D-DOST-treated images it was 0.001169 s.
Likewise, in Table 10, the average confusion matrix for the 2D-DOST case is displayed, showing fewer mistakes in the classification of the conditions, as all conditions present over 92% correct detections, in contrast to the previous cases.

3.3. Comparison to Current Signals

To validate the proposal, the methodology was applied to current signals that were captured simultaneously using a Fluke i300 current clamp (Universitat Politècnica de València, Valencia, Spain) and a Yokogawa DL350 ScopeCorder (Universitat Politècnica de València, Valencia, Spain) at 10 ksps. The signals underwent the same normalization process, with digital subsampling to obtain the same number of data points and allow a more direct comparison. As mentioned in the Introduction, current signals combined with the STFT are one of the main approaches to fault diagnosis in IMs, so only this combination was chosen to validate the capability of the proposal to improve the accuracy of multiple fault diagnosis. Figure 7 presents a comparison of the STFT images generated from the current signals in RGB, grayscale and after the 2D-DOST; due to the resolution of the images, it is difficult to identify differences between the conditions in the same dataset. Another clarification is that only one current signal was captured per test, representing 1/3 of the total amount of stray flux data, meaning that for each condition there are 287 images, divided as 60% for CNN training, 20% for validation and 20% for testing.
In Table 11, the accuracy and training time for the STFT current sets are presented, and the best accuracy with the respective training time is highlighted for each set, as well as the best global performance. Additionally, the average time for the classification of one RGB picture was 0.000925 s, for grayscale images it was 0.000986 s, and for 2D-DOST-treated images it was 0.000915 s, which is consistent with the results in the previous subsection.
In this case, the 2D-DOST set allowed for a slight increase in accuracy when compared to the RGB set. Table 12 shows the average confusion matrix for the 2D-DOST set in percentage, demonstrating a behavior similar to the stray flux tests, with a good general performance while some conditions present a higher level of misclassification.

3.4. Comparison to Other Methods in the Literature

To further contextualize the methodology proposed in this paper, Table 13 provides a comparison of the results achieved against some reported in the literature for the detection of faults in IMs, with the average accuracies of the proposed method around the expected level. As mentioned before, most works focus on faults located in one part or component of the machine or on one type of fault, with a limited number of conditions. Additionally, the focus of this article is to analyze the effect of using image processing tools as a preprocessing step to improve multiple fault classification while using non-invasive stray flux sensors.

4. Discussion

The graphs displayed in Figure 3 illustrate the differences between the TFDs chosen for this paper, as each one represents the same information in a different way, such as the logarithmic frequency distribution in the case of the CWT, the better definition of the frequency components in the FSST compared to the STFT, or the richer distribution in the ST-MUSIC. The comparison in Figure 4 showed that the spectrograms did not present enough information to easily distinguish all conditions from one another within each dataset; the same could be said about the grayscale images in Figure 5, which presented less detail due to the reduced color palette, making it more difficult to discriminate the conditions by eye. Similarly, the comparison in Figure 6 presented the same tendency, as most conditions share similar patterns with variations that are hard to identify.
The automatic feature extraction performed by the CNN was able to tackle this problem, achieving good results in most cases. As seen in Table 3, Table 5, Table 7 and Table 9, RGB images reached high classification results in most cases, while the use of grayscale pictures reduced the performance of the model, possibly because of the lower level of detail and information, as those images condense the three color layers (RGB) into a single one. In contrast, the results obtained with the 2D-DOST-processed datasets showed an improvement over the RGB sets, which could be caused by the redistribution of the information performed by the DOST, facilitating the feature extraction carried out by the CNN model.
Regarding the TFDs used, the STFT, CWT and FSST have been used to detect faults in IMs, mostly in bearings, and have proved effective when discriminating between a few conditions affecting different components of the motor. On the other hand, the MUSIC algorithm is considered more resource-consuming, and its short-time variant is less common for fault detection; however, it showed better performance on average and average training times similar to the other TFDs. All transforms showed adequate results for identifying the twelve conditions and presented similar difficulties when diagnosing the severities of misalignment and unbalance, as the literature indicates that both conditions produce similar frequency components. The implementation of the 2D-DOST on the TFDs helped the classification task with minimal additional computational load; this could be of great benefit when more established methods, like the ones compared in this work, do not reach optimal accuracy levels, improving the fault diagnosis even when other techniques struggle to identify as many and as diverse conditions as reported in this research.
The comparison with the current signal analysis employing the proposed methodology shows that the proposal is suitable for application to different signals, as the performance achieved when using current signals with STFT images is similar to that of the stray flux signals with the different TFDs, even with reduced data. The CNN model, significantly smaller than most of the models extensively reported in the literature, performed adequately in all cases, allowing for a good performance during the tests. The information displayed in Table 13 shows that the results obtained fall within the range presented in the literature, taking into account the number of conditions under detection and the difference in size and parameters of the CNN models; it also supports the validity of the approach as a way to improve the classification performance by using a tool that facilitates the automatic feature extraction performed by a CNN model considerably smaller than some of the preferred networks, as mentioned in Section Experimental Setup.

5. Conclusions

This work focused on analyzing the effect of the 2D-DOST, an image processing tool used in the medical field as a texture analysis technique capable of improving condition diagnosis, on fault detection in IMs. To contrast the influence of this technique, RGB and grayscale TFDs of the STFT, CWT, FSST, and ST-MUSIC were used as baselines, and their classification performance was compared to that of the TFDs after a 2D-DOST treatment. The results showed that using RGB images achieved good accuracy, while grayscale pictures reduced the performance of the CNN model; in contrast, the results from the 2D-DOST batch improved on the RGB results with minimal changes in computational load. The use of image processing tools on the inputs to CNN models could greatly benefit the condition monitoring and fault detection area, as they could facilitate the extraction of features, improving the overall accuracy and reducing the need for larger models that require more resources. This methodology showed good performance for identifying eleven faulty conditions of five different natures and various severities, proving that it could be applied to different scenarios. Additionally, the application of the method to TFDs of current signals generated using the STFT showed similar results to the stray flux analysis, demonstrating the applicability of the proposal to other signals, even with a reduced number of samples.
Further research is necessary to understand the mechanisms that influence the performance of the TFDs when the 2D-DOST is used, as well as to compare whether the results extend to other transforms or 2D representations, or if similar image processing tools have similar effects. Prospective work includes the automation of the process to generate an online system that could identify the presence of faults on working machines to allow for early detection and corrective maintenance measures, reducing the impact of the faults, as well as expanding the results to consider early fault detection and progressive severity increments.

Author Contributions

Conceptualization, G.D.-S., L.M.-V. and J.A.A.-D.; methodology, G.D.-S.; software, G.D.-S.; validation, G.D.-S., L.M.-V. and J.A.A.-D.; formal analysis, G.D.-S.; investigation, V.B.-M.; resources, L.M.-V. and J.A.A.-D.; data curation, G.D.-S. and V.B.-M.; writing—original draft preparation, G.D.-S.; writing—review and editing, L.M.-V. and J.A.A.-D.; visualization, G.D.-S.; supervision, L.M.-V. and J.A.A.-D.; project administration, L.M.-V. and J.A.A.-D.; funding acquisition, J.A.A.-D. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Spanish “Ministerio de Ciencia e Innovación”, Agencia Estatal de Investigación and FEDER program in the framework of the “Proyectos de Generación de Conocimiento 2021” of the “Programa Estatal para Impulsar la Investigación Científico-Técnica y su Transferencia”, belonging to the “Plan Estatal de Investigación Científica, Técnica y de Innovación 2021–2023” (ref: PID2021-122343OB-I00). The authors also thank Generalitat Valenciana for supporting part of this research in the context of the “Subvenciones del programa Santiago Grisolía” (Grant CIGRIS/2023/149).

Data Availability Statement

The data is not publicly available due to privacy.

Acknowledgments

The authors would like to thank the scholarship from the Secretaría de Ciencia, Humanidades, Tecnología e Innovación, SECIHTI, registered under CVU number 1083233.

Conflicts of Interest

The authors declare no conflicts of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript; or in the decision to publish the results.

Abbreviations

The following abbreviations are used in this manuscript:
2D-DOST: 2-D Discrete Orthonormal Stockwell Transform
ASLT: Adaptive Superlet Transform
BRB: Broken rotor bar
CNN: Convolutional Neural Network
COR: Corrosion
CWCMs: Continuous Wavelet Coefficient Maps
CWRU: Case Western Reserve University
CWT: Continuous Wavelet Transform
dCNN: Deep CNN
DL: Deep Learning
FFT: Fast Fourier Transform
FSST: Fourier Synchrosqueezed Transform
GCN: Graph Convolutional Network
HLT: Healthy
IM: Induction motor
IR: Inner race
LB: Loosened bolt
LSTM: Long Short-Term Memory
Mal: Misalignment
ML: Machine Learning
MRI: Magnetic Resonance Image
MTL: Multitask Learning
MUSIC: Multiple Signal Classification
NASA: National Aeronautics and Space Administration
NG: No grease
OR: Outer race
ORTII: Orthogonal Ripplet II Transform
PST: Polar Stockwell Transform
RUL: Remaining Useful Life
ST: Stockwell Transform
ST-MUSIC: Short-Time MUSIC
STFT: Short-Time Fourier Transform
SVM: Support Vector Machine
TFD: Time–Frequency Distribution
TL: Transfer Learning
UNB: Unbalance
WSST: Wavelet Synchrosqueezed Transform
WST: Wavelet Scattering Transform

References

  1. Gundewar, S.K.; Kane, P.V. Condition monitoring and fault diagnosis of induction motor. J. Vib. Eng. Technol. 2021, 9, 643–674. [Google Scholar] [CrossRef]
  2. Kumar, R.R.; Andriollo, M.; Cirrincione, G.; Cirrincione, M.; Tortella, A. A Comprehensive Review of Conventional and Intelligence-Based Approaches for the Fault Diagnosis and Condition Monitoring of Induction Motors. Energies 2022, 15, 8938. [Google Scholar] [CrossRef]
  3. Almounajjed, A.; Sahoo, A.K.; Kumar, M.K.; Assaf, T. Fault diagnosis and investigation techniques for induction motor. Int. J. Ambient Energy 2021, 43, 6341–6361. [Google Scholar] [CrossRef]
  4. De Jesus Rangel-Magdaleno, J. Induction Machines Fault Detection: An Overview. IEEE Instrum. Meas. Mag. 2021, 24, 63–71. [Google Scholar] [CrossRef]
  5. Zamudio-Ramirez, I.; Osornio-Rios, R.A.A.; Antonino-Daviu, J.A.; Razik, H.; De Jesus Romero-Troncoso, R. Magnetic Flux Analysis for the Condition Monitoring of Electric Machines: A Review. IEEE Trans. Ind. Inform. 2021, 18, 2895–2908. [Google Scholar] [CrossRef]
  6. Abbasi, M.A.; Huang, S.; Khan, A.S. Fault detection and classification of motor bearings under multiple operating conditions. ISA Trans. 2024, 156, 61–69. [Google Scholar] [CrossRef]
  7. Gundewar, S.K.; Kane, P.V. Bearing fault diagnosis using time segmented Fourier synchrosqueezed transform images and convolution neural network. Measurement 2022, 203, 111855. [Google Scholar] [CrossRef]
  8. Wang, J.; Mo, Z.; Zhang, H.; Miao, Q. A Deep Learning Method for Bearing Fault Diagnosis Based on Time-Frequency Image. IEEE Access 2019, 7, 42373–42383. [Google Scholar] [CrossRef]
  9. Łuczak, D. Data-Driven Machine Fault Diagnosis of Multisensor Vibration Data Using Synchrosqueezed Transform and Time-Frequency Image Recognition with Convolutional Neural Network. Electronics 2024, 13, 2411. [Google Scholar] [CrossRef]
  10. Fu, M.; Du, D.; Leng, F.; Tian, J.; Fang, Y.; Zhang, J. Motor Bearing Fault Diagnosis Based on Fourier-based Synchrosqueezing Transform and Hilbert Envelope Spectrum. In Proceedings of the 2023 IEEE 6th Student Conference on Electric Machines and Systems (SCEMS), Huzhou, China, 7–9 December 2023; pp. 1–6. [Google Scholar] [CrossRef]
  11. Rajabioun, R.; Afshar, M.; Mete, M.; Atan, Ö.; Akin, B. Distributed Bearing Fault Classification of Induction Motors Using 2-D Deep Learning Model. IEEE J. Emerg. Sel. Top. Ind. Electron. 2023, 5, 115–125. [Google Scholar] [CrossRef]
  12. Mitra, S.; Koley, C. Real-time robust bearing fault detection using scattergram-driven hybrid CNN-SVM. Electr. Eng. 2023, 106, 3615–3625. [Google Scholar] [CrossRef]
  13. Irgat, E.; Çinar, E.; Ünsal, A.; Yazıcı, A. An IoT-Based Monitoring System for Induction Motor Faults Utilizing Deep Learning Models. J. Vib. Eng. Technol. 2022, 11, 3579–3589. [Google Scholar] [CrossRef]
  14. Bouharrouti, N.E.; Saberi, A.N.; Khan, M.D.H.; Kudelina, K.; Naseer, M.U.; Belahcen, A. Deep Transfer Learning Approach Using Filtered Time-Frequency Representations of Current Signals for Bearing Fault Detection in Induction Machines. IET Electr. Power Appl. 2025, 19. [Google Scholar] [CrossRef]
  15. Qi, J.; Chen, Z.; Kong, Y.; Qin, W.; Qin, Y. Attention-guided graph isomorphism learning: A multi-task framework for fault diagnosis and remaining useful life prediction. Reliab. Eng. Syst. Saf. 2025, 263, 111209. [Google Scholar] [CrossRef]
  16. Valtierra-Rodriguez, M.; Rivera-Guillen, J.R.; Basurto-Hurtado, J.A.; De-Santiago-Perez, J.J.; Granados-Lieberman, D.; Amezquita-Sanchez, J.P. Convolutional Neural Network and Motor Current Signature Analysis during the Transient State for Detection of Broken Rotor Bars in Induction Motors. Sensors 2020, 20, 3721. [Google Scholar] [CrossRef] [PubMed]
  17. Misra, S.; Kumar, S.; Sayyad, S.; Bongale, A.; Jadhav, P.; Kotecha, K.; Abraham, A.; Gabralla, L.A. Fault Detection in Induction Motor Using Time Domain and Spectral Imaging-Based Transfer Learning Approach on Vibration Data. Sensors 2022, 22, 8210. [Google Scholar] [CrossRef] [PubMed]
  18. Zamudio-Ramirez, I.; Osornio-Rios, R.A.; Antonino-Daviu, J. Transient stray flux analysis via music methods for the detection of uniform gearbox teeth wear faults. In Proceedings of the 2021 IEEE Energy Conversion Congress and Exposition (ECCE), Vancouver, BC, Canada, 10–14 October 2021; pp. 4431–4435. [Google Scholar] [CrossRef]
  19. Biot-Monterde, V.; Navarro-Navarro, A.; Zamudio-Ramirez, I.; Antonino-Daviu, J.A.; Osornio-Rios, R.A. Automatic classification of eccentricities and misalignments in SCIM applying persistence spectrum and CNN to stray-flux signals. In Proceedings of the 2023 IEEE Energy Conversion Congress and Exposition (ECCE), Nashville, TN, USA, 29 October–2 November 2023; pp. 4057–4061. [Google Scholar] [CrossRef]
  20. Shao, S.; Yan, R.; Lu, Y.; Wang, P.; Gao, R.X. DCNN-Based Multi-Signal Induction Motor Fault Diagnosis. IEEE Trans. Instrum. Meas. 2019, 69, 2658–2669. [Google Scholar] [CrossRef]
  21. Delgado-Arredondo, P.A.; Garcia-Perez, A.; Morinigo-Sotelo, D.; Osornio-Rios, R.A.; Avina-Cervantes, J.G.; Rostro-Gonzalez, H.; Romero-Troncoso, R.D. Comparative Study of Time-Frequency Decomposition Techniques for Fault Detection in Induction Motors Using Vibration Analysis during Startup Transient. Shock. Vib. 2015, 2015, 708034. [Google Scholar] [CrossRef]
  22. Kumar, P.; Hati, A.S. Transfer learning-based deep CNN model for multiple faults detection in SCIM. Neural Comput. Appl. 2021, 33, 15851–15862. [Google Scholar] [CrossRef]
  23. Drabycz, S.; Stockwell, R.G.; Mitchell, J.R. Image texture characterization using the discrete orthonormal S-transform. J. Digit. Imaging 2009, 22, 696–708. [Google Scholar] [CrossRef]
  24. Soleimani, M.; Vahidi, A.; Vaseghi, B. Two-Dimensional Stockwell Transform and Deep Convolutional Neural Network for Multi-Class Diagnosis of Pathological Brain. IEEE Trans. Neural Syst. Rehabil. Eng. 2020, 29, 163–172. [Google Scholar] [CrossRef]
  25. Pridham, G.; Oladosu, O.; Zhang, Y. Evaluation of discrete orthogonal versus polar Stockwell Transform for local multi-resolution texture analysis using brain MRI of multiple sclerosis patients. Magn. Reson. Imaging 2020, 72, 150–158. [Google Scholar] [CrossRef] [PubMed]
  26. Asgharzadeh-Bonab, A.; Kalbkhani, H.; Azarfardian, S. An Alzheimer’s disease classification method using fusion of features from brain Magnetic Resonance Image transforms and deep convolutional networks. Healthc. Anal. 2023, 4, 100223. [Google Scholar] [CrossRef]
  27. Saghazadeh, A.; Rezaei, N. Artificial Intelligence: A Tool to Help Cancer Diagnosis, Prognosis, and Treatment. In Handbook of Cancer and Immunology; Springer International Publishing: Cham, Switzerland, 2023; pp. 1–29. [Google Scholar] [CrossRef]
  28. Alhasan, A.S. Clinical Applications of Artificial Intelligence, Machine Learning, and Deep Learning in the Imaging of Gliomas: A Systematic Review. Cureus 2021, 14, 13. [Google Scholar] [CrossRef] [PubMed]
  29. Nayak, G.; Padhy, N.; Mishra, T.K. 2D-DOST for seizure identification from brain MRI during pregnancy using KRVFL. Health Technol. 2022, 12, 757–764. [Google Scholar] [CrossRef]
  30. Díaz-Saldaña, G.; Morales-Velásquez, L.; Antonio-Daviu, J.A.; Biot-Monterde, V. Comparison of Bearing Fault Classification Using STFT, CWT and 2D-DOST with a Lightweight CNN Model. In Proceedings of the 2025 IEEE 34th International Symposium on Industrial Electronics (ISIE), Toronto, ON, Canada, 20–23 June 2025; pp. 1–6. [Google Scholar] [CrossRef]
  31. Guerrero, J.P.P.; Saucedo-Dorantes, J.J.; Osornio-Rios, R.A.; Antonino-Daviu, J.A.; Biot-Monterde, V. Unbalance and misalignment detection in induction motors under oscillating loads using current and Fast Fourier Transform (FFT). In Proceedings of the 2023 IEEE Energy Conversion Congress and Exposition (ECCE), Nashville, TN, USA, 29 October–2 November 2023; pp. 4202–4207. [Google Scholar] [CrossRef]
  32. Yu, F.T.S.; Lu, G. Short-time Fourier transform and wavelet transform with Fourier-domain processing. Appl. Opt. 1994, 33, 5262. [Google Scholar] [CrossRef]
  33. Özhan, O. Short-Time-Fourier Transform. In Basic Transforms for Electrical Engineering; Springer: Cham, Switzerland, 2022. [Google Scholar] [CrossRef]
  34. Sadowsky, J. The continuous wavelet transform: A tool for signal investigation and understanding. Johns Hopkins APL Tech. Dig. 1994, 15, 306. [Google Scholar]
  35. Oberlin, T.; Meignen, S.; Perrier, V. The Fourier-based synchrosqueezing transform. In Proceedings of the 2014 IEEE international conference on acoustics, speech and signal processing (ICASSP), Florence, Italy, 4–9 May 2014; pp. 315–319. [Google Scholar] [CrossRef]
  36. Chen, Z.; Ji, Y.; Zhang, Y.; Tang, F.; Dong, Z. Ionosphere Phase Decontamination for OTHR Based on Fourier Synchrosqueezed Transform. In Proceedings of the 2024 IEEE International Conference on Signal, Information and Data Processing (ICSIDP), Zhuhai, China, 22–24 November 2024; pp. 1–5. [Google Scholar] [CrossRef]
  37. Chavez, O.; Amezquita-Sanchez, J.P.; Valtierra-Rodriguez, M.; Cruz-Abeyro, J.A.; Kotsarenko, A.; Millan-Almaraz, J.R.; Dominguez-Gonzalez, A.; Rojas, E. Novel ST-MUSIC-based spectral analysis for detection of ULF geomagnetic signals anomalies associated with seismic events in Mexico. Geomat. Nat. Hazards Risk 2015, 7, 1162–1174. [Google Scholar] [CrossRef]
  38. Stockwell, R.G.; Mansinha, L.; Lowe, R.P. Localization of the complex spectrum: The S transform. IEEE Trans. Signal Process. 2002, 44, 998–1001. [Google Scholar] [CrossRef]
  39. Battisti, U.; Riba, L. Window-dependent bases for efficient representations of the Stockwell transform. Appl. Comput. Harmon. Anal. 2015, 40, 292–320. [Google Scholar] [CrossRef]
  40. Li, B. Lightweight Neural Networks. In Embedded Artificial Intelligence; Springer: Singapore, 2024. [Google Scholar] [CrossRef]
Figure 1. Block diagram illustrating the methodology followed in this research.
Figure 2. Experimental setup: (a) IM, load and location of the stray flux sensor; (b) commercial tool for alignment measuring the misalignment; (c) rotor with one BRB; (d) bearing with corrosion condition, COR1; (e) coupling and weight used for the UNB1 condition.
Figure 3. Samples of the TFDs used for a HLT sample: (a) STFT; (b) CWT; (c) FSST; (d) ST-MUSIC.
Figure 4. Comparison of the RGB images for all conditions and TFDs.
Figure 5. Comparison of the grayscale datasets for all TFDs.
Figure 6. Comparison of the TFDs for all conditions after the 2D-DOST processing.
Figure 7. Comparison of the images generated from the current signals for all conditions and TFDs.
Table 1. Fault frequency component equations for different signals.
Fault | Equation | Signal | Source
Outer Race Bearing Fault | f_ov = (N_b/2) f_r [1 − (D_b/D_c) cos β]   (1) | Vibration | [1,3,4]
Outer Race Bearing Fault | f_oc = |f_s ± k f_ov|   (2) | Current and flux | [1,5]
Broken Rotor Bar | f_brb = f_s (1 ± 2ks)   (3) | Vibration and current | [1,3,4]
Broken Rotor Bar | f_brb = f_s ± k f_r   (4) | Airgap flux | [5]
Broken Rotor Bar | f_brb = f_s (1 ± 2ks), f_s + f_r ± 2ks f_s   (5) | Radial stray flux | [5]
Broken Rotor Bar | f_brb = s f_s, 3s f_s   (6) | Axial stray flux | [5]
Eccentricity and Misalignment | f_ecc = f_s ± f_r (k N_r ± n_d)   (7) | Magnetic flux | [5]
Eccentricity and Misalignment | f_ecc = 2 f_s ± f_r   (8) | Vibration | [1]
Eccentricity and Misalignment | f_em = f_s ± k f_r   (9) | Current and flux | [1,19]
Eccentricity and Misalignment | f_em = f_s (1 ± ks)   (10) | Current | [31]
Table 2. Structure of the implemented CNN model.
Layer | Layer Type | Size | Number of Filters/Strides
1 | Image Input | 64 × 64 | -
2 | Convolution | 3 × 3 | 8
3 | Batch Normalization | - | -
4 | ReLU | - | -
5 | Max Pooling | 2 × 2 | 2
6 | Convolution | 3 × 3 | 8
7 | Batch Normalization | - | -
8 | ReLU | - | -
9 | Fully Connected | - | -
10 | SoftMax | - | -
11 | Classification | 12 | -
Table 3. CNN results for the classification of the STFT datasets.
Trial | RGB Accuracy (%) | RGB Time (s) | Grayscale Accuracy (%) | Grayscale Time (s) | 2D-DOST Accuracy (%) | 2D-DOST Time (s)
1 | 93.7388 | 77.7101 | 92.3076 | 81.5373 | 97.2271 | 80.8875
2 | 95.3041 | 77.7785 | 94.7227 | 78.9035 | 98.1216 | 81.7360
3 | 96.1091 | 77.6440 | 86.3148 | 79.2949 | 97.8980 | 81.4753
4 | 91.1896 | 83.5242 | 94.0966 | 80.4095 | 97.9874 | 79.4998
5 | 97.7191 | 87.5557 | 79.4722 | 79.8887 | 98.3899 | 79.1721
6 | 94.4096 | 79.9789 | 94.8568 | 81.5676 | 98.1216 | 80.6396
7 | 94.5438 | 76.3623 | 90.6976 | 81.8278 | 97.7638 | 81.5238
8 | 93.5599 | 76.3259 | 91.2343 | 79.6716 | 98.3899 | 83.0410
9 | 93.7388 | 78.9607 | 93.4257 | 79.4539 | 97.9874 | 80.9707
10 | 91.4132 | 83.6466 | 88.7745 | 80.1647 | 97.0930 | 79.2202
Average | 94.1726 | 79.9487 | 90.5903 | 80.2719 | 97.8980 | 80.8166
Std. Deviation | 1.9738 | 3.7464 | 4.7630 | 1.0400 | 0.4366 | 1.2371
Table 4. Average confusion matrix for the STFT + 2D-DOST dataset results.
True Class \ Predicted Class (%) | 1BRB | 2BRB | NG | OR | COR0 | COR1 | HLT | LB | Mal0.8 | Mal1.6 | Mal1.24 | UNB0 | UNB1
1BRB | 100 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
2BRB | 0 | 99.94 | 0 | 0 | 0.05 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0
NG | 0 | 0 | 97.55 | 0 | 0.58 | 0 | 0.05 | 0.05 | 0 | 0 | 0 | 0 | 0
OR | 0 | 0 | 0 | 99.76 | 0.23 | 0 | 0 | 0 | 0 | 0.05 | 0 | 0 | 0
COR0 | 0 | 0.05 | 1.27 | 0.23 | 98.13 | 0.23 | 0.05 | 0 | 0.11 | 0.05 | 0 | 0.11 | 0.98
COR1 | 0 | 0 | 0 | 0 | 0.17 | 98.60 | 0.17 | 0.17 | 0.11 | 0 | 0.05 | 0.93 | 0.11
HLT | 0 | 0 | 0.58 | 0 | 0 | 0.23 | 97.26 | 4.36 | 0 | 0 | 0.05 | 0.17 | 0.05
LB | 0 | 0 | 0.11 | 0 | 0 | 0.17 | 1.62 | 94.70 | 0 | 0 | 0.17 | 0.23 | 0.05
Mal0.8 | 0 | 0 | 0.23 | 0 | 0.17 | 0.29 | 0 | 0 | 97.03 | 1.68 | 0.05 | 0.29 | 0
Mal1.6 | 0 | 0 | 0.05 | 0 | 0.11 | 0.05 | 0 | 0 | 2.38 | 97.79 | 0.46 | 0.05 | 0
Mal1.24 | 0 | 0 | 0 | 0 | 0 | 0 | 0.23 | 0 | 0.17 | 0.34 | 99.01 | 0.34 | 0.05
UNB0 | 0 | 0 | 0.11 | 0 | 0.11 | 0.40 | 0.46 | 0.63 | 0.17 | 0 | 0.11 | 95.98 | 1.86
UNB1 | 0 | 0 | 0.05 | 0 | 0.40 | 0 | 0.11 | 0.05 | 0 | 0.05 | 0.05 | 1.86 | 96.86
Table 5. CNN results for the classification of the CWT datasets.
Trial | RGB Accuracy (%) | RGB Time (s) | Grayscale Accuracy (%) | Grayscale Time (s) | 2D-DOST Accuracy (%) | 2D-DOST Time (s)
1 | 96.0196 | 79.4532 | 76.6547 | 72.3464 | 94.7227 | 79.1546
2 | 91.7262 | 78.8669 | 87.2987 | 72.6272 | 97.2271 | 83.1524
3 | 91.1896 | 78.6685 | 82.1556 | 72.9206 | 96.1538 | 79.6102
4 | 94.4991 | 92.3172 | 88.2826 | 81.1488 | 96.4222 | 72.4330
5 | 87.8801 | 92.8950 | 91.0107 | 82.9375 | 96.9141 | 72.4931
6 | 94.4096 | 84.2634 | 88.2379 | 84.1502 | 95.8855 | 73.7472
7 | 87.2540 | 79.1323 | 84.2576 | 74.5516 | 96.0196 | 84.9196
8 | 95.0805 | 78.5048 | 83.2737 | 72.4960 | 97.1377 | 86.5207
9 | 92.2629 | 86.1035 | 78.3989 | 72.8847 | 97.4955 | 85.8286
10 | 95.7960 | 85.8342 | 89.0429 | 72.7790 | 95.3935 | 86.4226
Average | 92.6118 | 83.6039 | 84.8613 | 75.8842 | 96.3372 | 80.4282
Std. Deviation | 3.1387 | 5.6166 | 4.7570 | 4.8257 | 0.8787 | 5.8082
Table 6. Average confusion matrix for the CWT + 2D-DOST dataset results. Rows are indexed by the true class and columns by the predicted class, both in the order: 1BRB, 2BRB, NG, OR, COR0, COR1, HLT, LB, Mal0.8, Mal1.6, Mal1.24, UNB0, UNB1.
True class1BRB100000000000000
2BRB099.8200000000000
NG0099.1201.4500.050000.230.050
OR00099.590.170000000.110.05
COR0000.520.2397.670000.1100.110.110
COR10000098.020.980.230.230.0500.050
HLT000.0500.051.2298.600.980.3400.340.170.05
LB0000.1100.400.2997.200.4000.291.450.46
Mal0.800.05000.29000.1192.611.040.930.230
Mal1.600.1100.050.290.34005.6398.8300.050
Mal1.2400000.0500.050.110.63096.620.630.63
UNB000000000.7500.050.4080.635.17
UNB1000.2900000.58001.0416.4593.60
Table 7. Classification results for the FSST datasets using the proposed CNN model.
Trial | RGB Accuracy (%) | RGB Time (s) | Grayscale Accuracy (%) | Grayscale Time (s) | 2D-DOST Accuracy (%) | 2D-DOST Time (s)
1 | 90.0268 | 77.2022 | 85.8228 | 79.0852 | 93.1574 | 81.5256
2 | 95.3041 | 77.0565 | 87.4329 | 79.1243 | 93.2021 | 80.8298
3 | 94.0071 | 76.8338 | 77.5491 | 79.6006 | 95.5724 | 78.9573
4 | 92.8890 | 83.9461 | 85.4651 | 80.8249 | 94.2754 | 79.9391
5 | 88.7745 | 84.0221 | 86.0912 | 81.2045 | 94.7674 | 79.9763
6 | 91.2343 | 82.7658 | 83.0500 | 80.7172 | 93.9177 | 81.1804
7 | 90.6529 | 86.1356 | 87.6118 | 79.3112 | 95.0357 | 81.7306
8 | 90.2951 | 88.7534 | 83.3631 | 79.4883 | 94.8121 | 79.9261
9 | 91.5026 | 88.0623 | 79.4275 | 79.6827 | 95.3488 | 79.2309
10 | 91.4132 | 89.7371 | 86.6726 | 79.4623 | 95.3935 | 79.6656
Average | 91.6100 | 83.4515 | 84.2486 | 79.8501 | 94.5483 | 80.2962
Std. Deviation | 1.9553 | 4.9538 | 3.4160 | 0.7681 | 0.8805 | 0.9619
Table 8. Average confusion matrix for the FSST + 2D-DOST dataset results. Rows are indexed by the true class and columns by the predicted class, both in the order: 1BRB, 2BRB, NG, OR, COR0, COR1, HLT, LB, Mal0.8, Mal1.6, Mal1.24, UNB0, UNB1.
True class1BRB99.700000.0500000000
2BRB0.0510000000000000
NG0098.3100.4600.050.1700000
OR000.0599.760.05000.1100.29000
COR0000.520.0597.670.2900.2900.05000
COR10.2300.2901.3391.622.151.510.810.1100.050.05
HLT000.29004.4787.206.160.5200.230.580.81
LB000.3400.111.627.3890.63000.050.340.81
Mal0.800000.171.040.29095.932.900.0500
Mal1.6000.0500.110.46002.4496.62000
Mal1.24000.11000.110.520.110.17097.791.391.56
UNB0000000.171.860.400.1100.0584.016.91
UNB10000.1700.170.520.58001.8013.6089.82
Table 9. Results obtained for the classification of the ST-MUSIC datasets.
Trial | RGB Accuracy (%) | RGB Time (s) | Grayscale Accuracy (%) | Grayscale Time (s) | 2D-DOST Accuracy (%) | 2D-DOST Time (s)
1 | 97.1824 | 76.3554 | 96.0644 | 79.5906 | 96.8694 | 78.6683
2 | 94.3649 | 76.5565 | 87.6118 | 79.7319 | 97.2271 | 79.2518
3 | 96.4669 | 77.6019 | 96.2880 | 81.8063 | 98.2558 | 79.2189
4 | 96.7352 | 82.7508 | 92.3076 | 83.4750 | 98.2558 | 78.7418
5 | 96.9141 | 84.3698 | 94.4096 | 79.1740 | 98.3899 | 81.1456
6 | 94.9016 | 86.2908 | 92.9785 | 79.2368 | 97.4955 | 81.5223
7 | 93.6940 | 88.1319 | 95.3041 | 79.8537 | 97.4508 | 78.7249
8 | 97.6296 | 90.5253 | 95.7960 | 81.4939 | 97.1377 | 78.9051
9 | 93.6493 | 88.5596 | 93.3363 | 81.6451 | 97.4955 | 79.9405
10 | 96.1091 | 87.3727 | 93.7835 | 81.4463 | 97.1824 | 79.2119
Average | 95.7647 | 83.8514 | 93.7880 | 80.7454 | 97.5760 | 79.5331
Std. Deviation | 1.4843 | 5.3053 | 2.5678 | 1.4274 | 0.5357 | 1.0233
Table 10. Average confusion matrix for the ST-MUSIC + 2D-DOST dataset. Rows are indexed by the true class and columns by the predicted class, both in the order: 1BRB, 2BRB, NG, OR, COR0, COR1, HLT, LB, Mal0.8, Mal1.6, Mal1.24, UNB0, UNB1.
True class1BRB99.8800.1100.050000000.050
2BRB010000000000000
NG0099.0100.0500000000
OR0.110099.94000000000
COR0000.870.0599.8800000000
COR10000099.530.2900000.050
HLT000000.0593.134.180.0500.051.040
LB000000.115.8195.750000.630
Mal0.80000000.05096.273.8300.110.17
Mal1.6000000.23003.4396.040.0500.05
Mal1.240000000.1100.170.0599.820.520.05
UNB00000000.520.050.050.050.0592.503.0
UNB1000000.050.0500005.0596.68
Table 11. CNN results for the classification of the STFT current datasets.
Trial | RGB Accuracy (%) | RGB Time (s) | Grayscale Accuracy (%) | Grayscale Time (s) | 2D-DOST Accuracy (%) | 2D-DOST Time (s)
1 | 89.3387 | 15.5791 | 48.5829 | 14.7528 | 86.2348 | 16.0464
2 | 84.8852 | 15.6223 | 60.3238 | 14.7659 | 87.8542 | 16.2337
3 | 89.6086 | 15.7197 | 49.6626 | 14.7095 | 84.2105 | 15.8785
4 | 87.3144 | 15.6820 | 45.6140 | 14.7609 | 90.9581 | 15.7902
5 | 75.5735 | 15.7733 | 48.9878 | 16.3323 | 84.3454 | 15.7465
6 | 87.9892 | 15.6644 | 69.5006 | 16.1231 | 87.4493 | 15.7199
7 | 85.4251 | 15.7251 | 46.1538 | 15.5825 | 89.7435 | 15.8214
8 | 86.7746 | 15.7734 | 73.9541 | 15.8983 | 86.5047 | 15.8224
9 | 88.9338 | 15.8678 | 32.9284 | 15.6482 | 87.1794 | 15.7987
10 | 87.3144 | 15.8764 | 27.5303 | 15.9653 | 87.9892 | 15.7368
Average | 86.3157 | 15.7284 | 50.3238 | 15.4539 | 87.2469 | 15.8594
Std. Deviation | 4.0835 | 0.0973 | 14.5032 | 0.6439 | 2.1130 | 0.1612
Table 12. Average confusion matrix for the STFT + 2D-DOST current dataset results. Rows are indexed by the true class and columns by the predicted class, both in the order: 1BRB, 2BRB, NG, OR, COR0, COR1, HLT, LB, Mal0.8, Mal1.6, Mal1.24, UNB0, UNB1.
True class1BRB92.980.520.35001.220000000
2BRB099.4700000000000
NG0.17049.647.017.190.177.892.8000000.35
OR0.5209.2978.245.431.053.684.560002.980.70
COR00.17017.016.6671.750.3510.702.8000000.87
COR15.6100.520.70097.0100.350.170000
HLT0.17020.704.9113.68077.012.9800000
LB0.1701.570.701.5700.1777.710.170002.10
Mal0.800000000.1798.240.70000
Mal1.6000000001.4099.29000
Mal1.24000000000010000
UNB00001.05000000096.840
UNB10.1700.870.700.350.170.528.590000.1795.96
Table 13. Comparison of different fault detection proposals.
Proposal | Technique | Classification Model | Number of Conditions | Fault Location | Signals | Accuracy
[9] | FSST | CNN | 13 | Bearing | Vibration | 100%
[13] | STFT | Lightweight CNN | 4 | Bearing | Vibration | 95.16%
[20] | CWT | dCNN | 6 | Bearing, rotor, stator | Vibration and current | 99.83%
[22] | 1D to 2D signal conversion | TL-dCNN | 5 | Bearing, rotor | Current | 99.4%
Proposed method | STFT + 2D-DOST | Lightweight CNN | 13 | Bearing, rotor, coupling, carcass | Stray flux | 97.89%
Proposed method | CWT + 2D-DOST | Lightweight CNN | 13 | Bearing, rotor, coupling, carcass | Stray flux | 96.33%
Proposed method | FSST + 2D-DOST | Lightweight CNN | 13 | Bearing, rotor, coupling, carcass | Stray flux | 94.54%
Proposed method | ST-MUSIC + 2D-DOST | Lightweight CNN | 13 | Bearing, rotor, coupling, carcass | Stray flux | 97.57%