Search Results (294)

Search Parameters:
Keywords = Short-Time Fourier Transform (STFT)

19 pages, 1517 KiB  
Article
Continuous Estimation of sEMG-Based Upper-Limb Joint Angles in the Time–Frequency Domain Using a Scale Temporal–Channel Cross-Encoder
by Xu Han, Haodong Chen, Xinyu Cheng and Ping Zhao
Actuators 2025, 14(8), 378; https://doi.org/10.3390/act14080378 (registering DOI) - 31 Jul 2025
Abstract
Surface electromyographic (sEMG) signal-driven joint-angle estimation plays a critical role in intelligent rehabilitation systems, as its accuracy directly affects both control performance and rehabilitation efficacy. This study proposes a continuous elbow joint angle estimation method based on time–frequency domain analysis. Raw sEMG signals were processed using the Short-Time Fourier Transform (STFT) to extract time–frequency features. A Scale Temporal–Channel Cross-Encoder (STCCE) network was developed, integrating temporal and channel attention mechanisms to enhance feature representation and establish the mapping from sEMG signals to elbow joint angles. The model was trained and evaluated on a dataset comprising approximately 103,000 samples collected from seven subjects. In the single-subject test set, the proposed STCCE model achieved an average Mean Absolute Error (MAE) of 2.96±0.24, Root Mean Square Error (RMSE) of 4.41±0.45, Coefficient of Determination (R²) of 0.9924±0.0020, and Correlation Coefficient (CC) of 0.9963±0.0010. It achieved an MAE of 3.30, RMSE of 4.75, R² of 0.9915, and CC of 0.9962 on the multi-subject test set, and an average MAE of 15.53±1.80, RMSE of 21.72±2.85, R² of 0.8141±0.0540, and CC of 0.9100±0.0306 on the inter-subject test set. These results demonstrated that the STCCE model enabled accurate joint-angle estimation in the time–frequency domain, contributing to a better motion intent perception for upper-limb rehabilitation. Full article
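The four scores reported in this abstract (MAE, RMSE, R², CC) can be computed from any pair of true and predicted angle sequences. The sketch below is only an illustration of the standard definitions, not the authors' evaluation code; the function name `regression_metrics` and the sample angle values are hypothetical.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, R^2 and Pearson CC for two 1-D sequences."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    err = y_pred - y_true
    mae = np.mean(np.abs(err))                       # mean absolute error
    rmse = np.sqrt(np.mean(err ** 2))                # root mean square error
    ss_res = np.sum(err ** 2)                        # residual sum of squares
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)   # total sum of squares
    r2 = 1.0 - ss_res / ss_tot                       # coefficient of determination
    cc = np.corrcoef(y_true, y_pred)[0, 1]           # Pearson correlation
    return {"mae": mae, "rmse": rmse, "r2": r2, "cc": cc}

# Hypothetical joint angles (illustrative values, not taken from the paper).
angles_true = np.array([10.0, 20.0, 30.0, 40.0])
angles_pred = np.array([12.0, 18.0, 33.0, 39.0])
metrics = regression_metrics(angles_true, angles_pred)
```

Note that R² and CC measure different things: CC ignores constant offsets and scale errors, whereas R² penalises them, which is one reason the two can diverge on an inter-subject test set.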

20 pages, 4093 KiB  
Article
CNN Input Data Configuration Method for Fault Diagnosis of Three-Phase Induction Motors Based on D-Axis Current in D-Q Synchronous Reference Frame
by Yeong-Jin Goh
Appl. Sci. 2025, 15(15), 8380; https://doi.org/10.3390/app15158380 - 28 Jul 2025
Viewed by 119
Abstract
This study proposes a novel approach to input data configuration for the fault diagnosis of three-phase induction motors. Convolutional neural network (CNN)-based diagnostic methods often employ three-phase current signals and apply various image transformation techniques, such as RGB mapping, wavelet transforms, and short-time Fourier transform (STFT), to construct multi-channel input data. While such approaches outperform 1D-CNNs or grayscale-based 2D-CNNs due to their rich informational content, they require multi-channel data and involve increased computational complexity. Accordingly, this study transforms the three-phase currents into the D-Q synchronous reference frame and utilizes the D-axis current (Id) for image transformation. The Id is used to generate input data using the same image processing techniques, allowing for a direct performance comparison under identical CNN architectures. Experiments were conducted under consistent conditions using both three-phase-based and Id-based methods, each applied to RGB mapping, discrete wavelet transform (DWT), and STFT. The classification accuracy was evaluated using a ResNet50-based CNN. Results showed that the Id-STFT achieved the highest performance, with a validation accuracy of 99.6% and a test accuracy of 99.0%. While the RGB representation of three-phase signals has traditionally been favored for its information richness and diagnostic performance, this study demonstrates that high-performance CNN-based fault diagnosis is achievable even with grayscale representations of a single current. Full article
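The step of mapping three-phase currents into the D-Q synchronous reference frame can be sketched with the Park transform. This is a minimal sketch that assumes the amplitude-invariant form of the transform and a known electrical angle; the abstract does not say which convention or angle source the author used, and the function name and signal parameters below are hypothetical.

```python
import numpy as np

def park_transform(ia, ib, ic, theta):
    """Amplitude-invariant Park transform (assumed convention): abc -> (Id, Iq)."""
    shift = 2.0 * np.pi / 3.0
    i_d = (2.0 / 3.0) * (ia * np.cos(theta)
                         + ib * np.cos(theta - shift)
                         + ic * np.cos(theta + shift))
    i_q = -(2.0 / 3.0) * (ia * np.sin(theta)
                          + ib * np.sin(theta - shift)
                          + ic * np.sin(theta + shift))
    return i_d, i_q

# Hypothetical balanced 50 Hz currents with 5 A peak, sampled at 10 kHz.
fs, f0, amp = 10_000.0, 50.0, 5.0
t = np.arange(0.0, 0.1, 1.0 / fs)
theta = 2.0 * np.pi * f0 * t          # electrical angle, assumed known
shift = 2.0 * np.pi / 3.0
ia = amp * np.cos(theta)
ib = amp * np.cos(theta - shift)
ic = amp * np.cos(theta + shift)

i_d, i_q = park_transform(ia, ib, ic, theta)
```

For a balanced supply aligned with the reference angle, Id collapses to a constant equal to the peak current and Iq to zero, so deviations of Id from a constant carry the asymmetry information in a single channel.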

23 pages, 19710 KiB  
Article
Hybrid EEG Feature Learning Method for Cross-Session Human Mental Attention State Classification
by Xu Chen, Xingtong Bao, Kailun Jitian, Ruihan Li, Li Zhu and Wanzeng Kong
Brain Sci. 2025, 15(8), 805; https://doi.org/10.3390/brainsci15080805 - 28 Jul 2025
Viewed by 167
Abstract
Background: Decoding mental attention states from electroencephalogram (EEG) signals is crucial for numerous applications such as cognitive monitoring, adaptive human–computer interaction, and brain–computer interfaces (BCIs). However, conventional EEG-based approaches often focus on channel-wise processing and are limited to intra-session or subject-specific scenarios, lacking robustness in cross-session or inter-subject conditions. Methods: In this study, we propose a hybrid feature learning framework for robust classification of mental attention states, including focused, unfocused, and drowsy conditions, across both sessions and individuals. Our method integrates preprocessing, feature extraction, feature selection, and classification in a unified pipeline. We extract channel-wise spectral features using short-time Fourier transform (STFT) and further incorporate both functional and structural connectivity features to capture inter-regional interactions in the brain. A two-stage feature selection strategy, combining correlation-based filtering and random forest ranking, is adopted to enhance feature relevance and reduce dimensionality. Support vector machine (SVM) is employed for final classification due to its efficiency and generalization capability. Results: Experimental results on two cross-session and inter-subject EEG datasets demonstrate that our approach achieves classification accuracy of 86.27% and 94.01%, respectively, significantly outperforming traditional methods. Conclusions: These findings suggest that integrating connectivity-aware features with spectral analysis can enhance the generalizability of attention decoding models. The proposed framework provides a promising foundation for the development of practical EEG-based systems for continuous mental state monitoring and adaptive BCIs in real-world environments. Full article

14 pages, 2616 KiB  
Article
Novel Throat-Attached Piezoelectric Sensors Based on Adam-Optimized Deep Belief Networks
by Ben Wang, Hua Xia, Yang Feng, Bingkun Zhang, Haoda Yu, Xulehan Yu and Keyong Hu
Micromachines 2025, 16(8), 841; https://doi.org/10.3390/mi16080841 - 22 Jul 2025
Viewed by 257
Abstract
This paper proposes an Adam-optimized Deep Belief Networks (Adam-DBNs) denoising method for throat-attached piezoelectric signals. The method aims to process mechanical vibration signals captured through polyvinylidene fluoride (PVDF) sensors attached to the throat region, which are typically contaminated by environmental noise and physiological noise. First, the short-time Fourier transform (STFT) is utilized to convert the original signals into the time–frequency domain. Subsequently, the masked time–frequency representation is reconstructed into the time domain through a diagonal average-based inverse STFT. To address complex nonlinear noise structures, a Deep Belief Network is further adopted to extract features and reconstruct clean signals, where the Adam optimization algorithm ensures the efficient convergence and stability of the training process. Compared with traditional Convolutional Neural Networks (CNNs), Adam-DBNs significantly improve waveform similarity by 6.77% and reduce the local noise energy residue by 0.099696. These results demonstrate that the Adam-DBNs method exhibits substantial advantages in signal reconstruction fidelity and residual noise suppression, providing an efficient and robust solution for throat-attached piezoelectric sensor signal enhancement tasks. Full article
(This article belongs to the Section E:Engineering and Technology)

18 pages, 41412 KiB  
Article
TFSNet: A Time–Frequency Synergy Network Based on EEG Signals for Autism Spectrum Disorder Classification
by Lijuan Shi, Lintao Ma, Jian Zhao, Zhejun Kuang, Sifan Wang, Han Yang, Haiyan Wang, Qiulei Han and Lei Sun
Brain Sci. 2025, 15(7), 684; https://doi.org/10.3390/brainsci15070684 - 25 Jun 2025
Viewed by 386
Abstract
Autism Spectrum Disorder (ASD) seriously affects social, communication, and behavioral functions, and early accurate diagnosis is crucial to improve the prognosis of patients. Traditional diagnostic methods rely on clinicians making subjective diagnoses with rating scales, the feature extraction of existing machine learning methods is inefficient, and existing deep learning methods have limitations in capturing time-varying features and the joint expression of time–frequency features. To this end, this study proposes a time–frequency synergy network (TFSNet) to improve the accuracy of ASD EEG signal classification. The proposed Dynamic Residual Block (TDRB) was used to enhance time-domain feature extraction; Short-Time Fourier Transform (STFT), convolutional attention mechanism, and transformation technology were combined to capture frequency-domain information; and an adaptive cross-domain attention mechanism (ACDA) was designed to realize efficient fusion of time–frequency features. The experimental results show that the average accuracy of TFSNet on the University of Sheffield dataset (containing 28 ASD patients and 28 healthy controls) and the KAU dataset (containing 12 ASD patients and five healthy controls) reaches 98.68% and 97.14%, respectively, yielding significantly better results than the existing machine learning and deep learning methods. In addition, the analysis of model decisions through interpretability analysis techniques enhances its transparency and reliability. Full article
(This article belongs to the Section Neurotechnology and Neuroimaging)

8 pages, 1216 KiB  
Proceeding Paper
Enhanced Lung Disease Detection Using Double Denoising and 1D Convolutional Neural Networks on Respiratory Sound Analysis
by Reshma Sreejith, R. Kanesaraj Ramasamy, Wan-Noorshahida Mohd-Isa and Junaidi Abdullah
Comput. Sci. Math. Forum 2025, 10(1), 7; https://doi.org/10.3390/cmsf2025010007 - 24 Jun 2025
Viewed by 287
Abstract
The accurate and early detection of respiratory diseases is vital for effective diagnosis and treatment. This study presents a new approach for classifying lung sounds using a double denoising method combined with a 1D Convolutional Neural Network (CNN). The preprocessing uses Fast Fourier Transform to clean up sounds and High-Pass Filtering to improve the quality of breathing sounds by eliminating noise and low-frequency interruptions. The Short-Time Fourier Transform (STFT) extracts features that capture localised frequency variations, crucial for distinguishing normal and abnormal respiratory sounds. These features are input into the 1D CNN, which classifies diseases such as bronchiectasis, pneumonia, asthma, COPD, healthy, and URTI. The dual denoising method enhances signal clarity and classification performance. The model achieved 96% validation accuracy, highlighting its reliability in detecting respiratory conditions. The results emphasise the effectiveness of combining signal augmentation with deep learning for automated respiratory sound analysis, with future research focusing on dataset expansion and model refinement for clinical use. Full article
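To make "localised frequency variations" concrete, here is a rough NumPy-only sketch of an STFT magnitude computation. It is not the authors' pipeline: the Hann window, frame length, hop size, and the two-tone test signal are all assumptions chosen for illustration.

```python
import numpy as np

def stft_magnitude(x, fs, frame_len=256, hop=128):
    """Magnitude STFT: Hann-windowed frames followed by a real FFT per frame."""
    window = np.hanning(frame_len)
    n_frames = 1 + (len(x) - frame_len) // hop
    # Row i of idx holds the sample indices of frame i.
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = x[idx] * window
    spec = np.abs(np.fft.rfft(frames, axis=1))       # (n_frames, frame_len//2 + 1)
    freqs = np.fft.rfftfreq(frame_len, d=1.0 / fs)   # bin centre frequencies in Hz
    times = (hop * np.arange(n_frames) + frame_len / 2) / fs  # frame centres in s
    return freqs, times, spec

# Hypothetical test signal: a 200 Hz tone followed by an 800 Hz tone.
fs = 4000.0
t = np.arange(0.0, 0.5, 1.0 / fs)
x = np.concatenate([np.sin(2 * np.pi * 200.0 * t),
                    np.sin(2 * np.pi * 800.0 * t)])

freqs, times, spec = stft_magnitude(x, fs)
peak_start = freqs[spec[0].argmax()]    # dominant frequency in the first frame
peak_end = freqs[spec[-1].argmax()]     # dominant frequency in the last frame
```

A plain FFT of the whole signal would show both tones but not when each occurs; the per-frame peaks recover that ordering, at a frequency resolution of fs / frame_len.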

17 pages, 3508 KiB  
Article
Zero-Sequence Voltage Outperforms MCSA-STFT for a Robust Inter-Turn Short-Circuit Fault Diagnosis in Three-Phase Induction Motors: A Comparative Study
by Mouhamed Houili, Mohamed Sahraoui, Antonio J. Marques Cardoso and Abdeldjalil Alloui
Machines 2025, 13(6), 501; https://doi.org/10.3390/machines13060501 - 7 Jun 2025
Viewed by 1122
Abstract
Three-phase induction motors are widely adopted in industrial systems due to their robustness, ease of maintenance, and simple operation. However, they are prone to various types of faults, notably stator winding faults. Previous research indicates that 20–40% of three-phase induction motor failures are stator-related, with inter-turn short circuits as a leading cause. These faults can pose significant risks to both the motor and connected equipment. Therefore, the early detection of inter-turn short circuit (ITSC) faults is essential to prevent system breakdowns and improve the safety and reliability of industrial operations. This paper presents a comparative investigation of two distinct diagnostic methodologies for the detection of ITSC faults in induction motors. The first methodology is based on a Motor Current Signature Analysis (MCSA) utilizing the short-time Fourier transform (STFT) for the real-time monitoring of fault-related harmonics. The second methodology is centered around the monitoring of the zero-sequence voltage (ZSV). The findings from several experimental tests performed on a 1.1 kW three-phase induction motor across a range of operating conditions highlight the superior performance of the ZSV method with respect to the MCSA-based STFT method in terms of reliability, rapidity, and precision for the diagnosis of ITSC faults. Full article
(This article belongs to the Section Electrical Machines and Drives)
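As a rough illustration of why a zero-sequence quantity is sensitive to winding asymmetry, the sketch below computes the zero-sequence component as the mean of the three phase voltages. It assumes ideal sinusoidal phase voltages and uses a 5% amplitude reduction on one phase as a stand-in for asymmetry; this is not a model of an inter-turn short circuit and not the measurement setup used in the paper.

```python
import numpy as np

def zero_sequence(va, vb, vc):
    """Zero-sequence component: the instantaneous mean of the three phases."""
    return (va + vb + vc) / 3.0

# Hypothetical 50 Hz phase voltages with 230 V peak, sampled at 10 kHz.
t = np.arange(0.0, 0.04, 1e-4)
theta = 2.0 * np.pi * 50.0 * t
shift = 2.0 * np.pi / 3.0
v_peak = 230.0
va = v_peak * np.cos(theta)
vb = v_peak * np.cos(theta - shift)
vc = v_peak * np.cos(theta + shift)

v0_balanced = zero_sequence(va, vb, vc)            # cancels for a balanced set
v0_asymmetric = zero_sequence(0.95 * va, vb, vc)   # phase a reduced by 5% (assumed)
v0_peak = np.max(np.abs(v0_asymmetric))
```

In the balanced case the three phases cancel, so any nonzero fundamental in the zero-sequence signal points to asymmetry without needing a time–frequency map.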

25 pages, 38520 KiB  
Article
A Novel Audio-Perception-Based Algorithm for Physiological Monitoring
by Zixuan Zhang, Wenxuan Jin, Dejiao Huang and Zhongwei Sun
Sensors 2025, 25(12), 3582; https://doi.org/10.3390/s25123582 - 6 Jun 2025
Viewed by 470
Abstract
Exercise metrics are critical for assessing health, but real-time heart rate and respiration measurements remain challenging. We propose a physiological monitoring system that uses an in-ear microphone to extract heart rate and respiration from faint ear canal signals. An improved non-negative matrix factorization (NMF) algorithm is combined with a short-time Fourier transform (STFT) to separate physiological components, while an inverse Fourier transform (IFT) reconstructs the signal. The earplug effect enhances the low-frequency components, thereby improving the signal quality and noise immunity. Heart rate is derived from short-term energy and zero-crossing rate, while a BiLSTM-based model can refine the breathing phases and calculate indicators such as respiratory rate. Experiments have shown that the average accuracy can reach 91% under various conditions, exceeding 90% in different environments and under different weights, thus ensuring the system’s robustness. Full article
(This article belongs to the Section Physical Sensors)
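Short-term energy and zero-crossing rate, the two features this abstract names for heart-rate extraction, are simple per-frame statistics. The sketch below shows only how the two features are computed, under assumed frame settings and a synthetic signal; how the authors turn them into a heart-rate estimate is not described in the abstract and is not attempted here.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into frames of frame_len samples, hop samples apart."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def short_term_energy(frames):
    """Sum of squared samples in each frame."""
    return np.sum(frames ** 2, axis=1)

def zero_crossing_rate(frames):
    """Fraction of adjacent sample pairs in each frame whose signs differ."""
    positive = frames >= 0
    return np.mean(positive[:, 1:] != positive[:, :-1], axis=1)

# Hypothetical signal: 0.4 s of silence, then a 0.4 s burst of a 50 Hz tone.
fs = 1000
quiet = np.zeros(400)
burst = np.sin(2 * np.pi * 50.0 * np.arange(400) / fs + 0.3)
x = np.concatenate([quiet, burst])

frames = frame_signal(x, frame_len=200, hop=200)   # four non-overlapping frames
energy = short_term_energy(frames)
zcr = zero_crossing_rate(frames)
```

Energy marks where activity occurs, while the zero-crossing rate gives a coarse indication of its dominant frequency, which is why the two are often used together for event detection.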

48 pages, 1559 KiB  
Review
A Review of Key Signal Processing Techniques for Structural Health Monitoring: Highlighting Non-Parametric Time-Frequency Analysis, Adaptive Decomposition, and Deconvolution
by Yixin Zhou, Zepeng Ma and Lei Fu
Algorithms 2025, 18(6), 318; https://doi.org/10.3390/a18060318 - 27 May 2025
Cited by 1 | Viewed by 1469
Abstract
This paper reviews key signal processing techniques in structural health monitoring (SHM), focusing on non-parametric time–frequency analysis, adaptive decomposition, and deconvolution methods. It examines the short-time Fourier transform (STFT), wavelet transform (WT), and Wigner–Ville distribution (WVD), highlighting their applications, advantages, and limitations in SHM. The review also explores adaptive techniques like empirical mode decomposition (EMD) and its variants (EEMD, MEEMD), as well as variational mode decomposition (VMD) and its improved versions (SVMD, AVMD), emphasizing their effectiveness in handling nonlinear and non-stationary signals. Additionally, deconvolution methods such as minimum entropy deconvolution (MED) and maximum correlated kurtosis deconvolution (MCKD) are discussed for mechanical fault diagnosis. The paper aims to provide a comprehensive overview of these techniques, offering insights for future research into SHM signal processing. Full article

24 pages, 6980 KiB  
Article
Vibration Signal-Based Fault Diagnosis of Rotary Machinery Through Convolutional Neural Network and Transfer Learning Method
by Chirag Mongia and Shankar Sehgal
Vibration 2025, 8(2), 27; https://doi.org/10.3390/vibration8020027 - 25 May 2025
Viewed by 785
Abstract
Artificial Intelligence (AI) is revolutionizing proactive repair systems by enabling real-time identification of bearing faults in industrial machinery. However, traditional fault detection methods often struggle in dynamic environments due to their dependence on specific training conditions. To address this limitation, a transfer learning (TL)-based methodology has been developed for bearing fault detection, so that the model trained under some specific training conditions can perform accurately under significantly different real-time working conditions, thereby significantly improving diagnostic efficiency while reducing training time. Initially, a deep learning approach utilizing convolutional neural networks (CNNs) has been employed to diagnose faults based on vibration data. After achieving high classification performance at source domain conditions, the performance of the model is re-evaluated by applying it to the Case Western Reserve University (CWRU) dataset as the target domain through the TL method. The short-time Fourier transform (STFT) is employed for signal preprocessing, enhancing feature extraction and model performance. The proposed methodology has been validated across various CWRU dataset configurations under different operating conditions and environments. The proposed approach achieved a 99.7% classification accuracy in the target domain, demonstrating effective adaptability and robustness under domain shifts. The results demonstrate how TL-enhanced CNNs can be used as a scalable and efficient way to diagnose bearing faults in industrial environments. Full article

15 pages, 2134 KiB  
Article
Method for Extracting Impact Signals in Falling Weight Deflectometer Calibration Based on Frequency Filtering and Gradient Detection
by Jiacheng Cai, Yingchao Luo, Bing Zhang, Lei Chen and Lu Liu
Sensors 2025, 25(11), 3317; https://doi.org/10.3390/s25113317 - 24 May 2025
Viewed by 456
Abstract
The falling weight deflectometer (FWD) is an important non-destructive testing instrument in the field of highways. It evaluates the pavement bearing capacity by continuously hammering the ground. However, due to noise interference, the current identification and extraction of the impact signals generated by the hammering are not accurate enough, which affects the calibration accuracy of the FWD results. To address this issue, this work proposes a novel method for impact point identification. The method integrates frequency domain filtering with gradient detection. Firstly, by analyzing the frequency domain characteristics of FWD impact signals using fast Fourier transform (FFT) and short-time Fourier transform (STFT), the primary response frequency band of the impact was identified. Subsequently, the impact signal segment was reconstructed using inverse fast Fourier transform (IFFT) to effectively suppress noise interference. Furthermore, gradient detection was employed to precisely determine the initiation moment of the impact. To validate the proposed method, a simulated acceleration signal incorporating interference noise was constructed. Comparative experiments were also conducted between traditional identification methods and the proposed method under high-noise conditions. The results demonstrate that the proposed method can accurately identify the impact point even under strong noise, thereby providing reliable data support for FWD measurements. This method exhibits strong environmental adaptability and can be extended to other engineering tests involving impact events and impact point identification. Full article
(This article belongs to the Section Physical Sensors)
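The combination of frequency-domain filtering, IFFT reconstruction, and gradient detection described in this abstract can be sketched in a few lines. This is a simplified reading of the idea, not the authors' algorithm: the pass band, the relative gradient threshold, the half-sine impact pulse, and the 300 Hz interference are all assumptions.

```python
import numpy as np

def fft_band_filter(x, fs, f_lo, f_hi):
    """Zero all FFT bins outside [f_lo, f_hi] Hz, then reconstruct with the IFFT."""
    spectrum = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    spectrum[(freqs < f_lo) | (freqs > f_hi)] = 0.0
    return np.fft.irfft(spectrum, n=len(x))

def first_steep_sample(x, rel_threshold=0.5):
    """Index of the first sample whose |gradient| exceeds a fraction of the maximum."""
    grad = np.abs(np.gradient(x))
    return int(np.flatnonzero(grad > rel_threshold * grad.max())[0])

# Hypothetical record: a 20 ms half-sine impact starting at sample 500,
# buried in 300 Hz interference (values chosen for illustration only).
fs = 1000.0
n = np.arange(1000)
clean = np.zeros(1000)
clean[500:521] = np.sin(np.pi * np.arange(21) / 20.0)
noisy = clean + 0.2 * np.sin(2 * np.pi * 300.0 * n / fs)

filtered = fft_band_filter(noisy, fs, 0.0, 100.0)   # assumed response band
onset = first_steep_sample(filtered)
```

Filtering first matters because a gradient detector applied to the raw record would trigger on the interference, whose slope is comparable to that of the impact itself.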

29 pages, 2722 KiB  
Article
Filamentary Convolution for SLI: A Brain-Inspired Approach with High Efficiency
by Boyuan Zhang, Xibang Yang, Tong Xie, Shuyuan Zhu and Bing Zeng
Sensors 2025, 25(10), 3085; https://doi.org/10.3390/s25103085 - 13 May 2025
Cited by 1 | Viewed by 450
Abstract
Spoken language identification (SLI) relies on detecting key frequency characteristics like pitch, tone, and rhythm. While the short-time Fourier transform (STFT) generates time–frequency acoustic features (TFAF) for deep learning networks (DLNs), rectangular convolution kernels cause frequency mixing and aliasing, degrading feature extraction. We propose filamentary convolution to replace rectangular kernels, reducing the parameters while preserving inter-frame features by focusing solely on frequency patterns. Visualization confirms its enhanced sensitivity to critical frequency variations (e.g., intonation, rhythm) for language recognition. Evaluated via self-built datasets and cross-validated with public corpora, filamentary convolution improves the low-level feature extraction efficiency and synergizes with temporal models (LSTM/TDNN) to boost recognition. This method addresses aliasing limitations while maintaining computational efficiency in SLI systems. Full article
(This article belongs to the Section Sensor Networks)

24 pages, 10907 KiB  
Article
Time-Frequency Analysis of Motor Imagery During Plantar and Dorsal Flexion Movements Using a Low-Cost Ankle Exoskeleton
by Cristina Polo-Hortigüela, Mario Ortiz, Paula Soriano-Segura, Eduardo Iáñez and José M. Azorín
Sensors 2025, 25(10), 2987; https://doi.org/10.3390/s25102987 - 9 May 2025
Viewed by 692
Abstract
Sensor technology plays a fundamental role in neuro-motor rehabilitation by enabling precise movement analysis and control. This study explores the integration of brain–machine interfaces (BMIs) and wearable sensors to enhance motor recovery in individuals with neuro-motor impairments. Specifically, different time-frequency transforms are evaluated to analyze the correlation between electroencephalographic (EEG) activity and ankle position, measured by using inertial measurement units (IMUs). A low-cost ankle exoskeleton was designed to conduct the experimental trials. Six subjects performed plantar and dorsal flexion movements while the EEG and IMU signals were recorded. The correlation between brain activity and foot kinematics was analyzed using the Short-Time Fourier Transform (STFT), Stockwell (ST), Hilbert–Huang (HHT), and Chirplet (CT) methods. The 8–20 Hz frequency band exhibited the highest correlation values. For motor imagery classification, the STFT achieved the highest accuracy (92.9%) using an EEGNet-based classifier and a state-machine approach. This study presents a dual approach: the analysis of EEG-movement correlation in different cognitive states, and the systematic comparison of four time-frequency transforms for both correlation and classification performance. The results support the potential of combining EEG and IMU data for BMI applications and highlight the importance of cognitive state in motion analysis for accessible neurorehabilitation technologies. Full article

31 pages, 1691 KiB  
Article
TF-LIME: Interpretation Method for Time-Series Models Based on Time–Frequency Features
by Jiazhan Wang, Ruifeng Zhang and Qiang Li
Sensors 2025, 25(9), 2845; https://doi.org/10.3390/s25092845 - 30 Apr 2025
Viewed by 489
Abstract
With the widespread application of machine learning techniques in time series analysis, the interpretability of models trained on time series data has attracted increasing attention. Most existing explanation methods are based on time-domain features, making it difficult to reveal how complex models focus on time–frequency information. To address this, this paper proposes a time–frequency domain-based time series interpretation method aimed at enhancing the interpretability of models at the time–frequency domain. This method extends the traditional LIME algorithm by combining the ideas of short-time Fourier transform (STFT), inverse STFT, and local interpretable model-agnostic explanations (LIME), and introduces a self-designed TFHS (time–frequency homogeneous segmentation) algorithm. The TFHS algorithm achieves precise homogeneous segmentation of the time–frequency matrix through peak detection and clustering analysis, incorporating the distribution characteristics of signals in both frequency and time dimensions. The experiment verified the effectiveness of the TFHS algorithm on Synthetic Dataset 1 and the effectiveness of the TF-LIME algorithm on Synthetic Dataset 2, and then further evaluated the interpretability performance on the MIT-BIH dataset. The results demonstrate that the proposed method significantly improves the interpretability of time-series models in the time–frequency domain, exhibiting strong generalization capabilities and promising application prospects. Full article
(This article belongs to the Section Intelligent Sensors)

19 pages, 4561 KiB  
Article
Enhanced Rolling Bearing Fault Diagnosis Using Multimodal Deep Learning and Singular Spectrum Analysis
by Yunhang Wang, Hongwei Wang, Ruoyang Bai, Yuxin Shi, Xicong Chen and Qingang Xu
Appl. Sci. 2025, 15(9), 4828; https://doi.org/10.3390/app15094828 - 27 Apr 2025
Cited by 1 | Viewed by 1140
Abstract
A decision-level multimodal fusion deep learning strategy is proposed for the effective fault detection of rolling bearings based on long-term fault signals collected from multiple sensors. First, key features are extracted from the multimodal signal set using singular spectrum analysis (SSA), and these features are transformed into a composite dataset that combines short-time Fourier transform (STFT) images and time series data. Based on this, a recursive gated convolutional neural network (RGCNN) is designed to process the STFT image data, while a 1D convolutional neural network (1DCNN) is specifically optimized for training with time series data. Furthermore, decision-level multimodal feature fusion is achieved by applying a weighted average method to integrate the features from different deep learning models, aiming to obtain more comprehensive fault prediction results. The proposed method, multimodal fusion fault detection (MFFD), is validated on the Paderborn and Ottawa rolling bearing datasets, which include various typical faults. Experimental results demonstrate the effectiveness of the proposed approach. Compared to traditional single-modality deep learning models, the proposed method shows significant improvements in fault diagnosis accuracy and generalization capability. Full article
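Decision-level fusion by weighted averaging, as mentioned in this abstract, combines the class probabilities produced by separate models rather than their internal features. The sketch below assumes each model outputs a probability matrix of shape (samples, classes); the weights and probability values are hypothetical, and the abstract does not state how the authors chose their weights.

```python
import numpy as np

def fuse_decisions(prob_list, weights):
    """Weighted average of per-model class probabilities, plus fused labels."""
    probs = np.stack(prob_list)               # (n_models, n_samples, n_classes)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()                           # normalise so the weights sum to 1
    fused = np.tensordot(w, probs, axes=1)    # (n_samples, n_classes)
    return fused, fused.argmax(axis=1)

# Hypothetical outputs for two samples and three fault classes.
p_image_model = np.array([[0.6, 0.3, 0.1],    # e.g. a model fed STFT images
                          [0.2, 0.5, 0.3]])
p_series_model = np.array([[0.2, 0.7, 0.1],   # e.g. a model fed raw time series
                           [0.1, 0.1, 0.8]])

fused, labels = fuse_decisions([p_image_model, p_series_model], weights=[0.7, 0.3])
```

In this example the two models disagree on both samples, and the weights decide which opinion prevails: the first sample follows the image model, while the second follows the more confident time-series model.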