Article

A Lightweight Audio Spectrogram Transformer for Robust Pump Anomaly Detection

Hangyu Zhang and Yi-Horng Lai *
1 School of Aerospace Engineering, Xiamen University, Xiamen 361005, China
2 School of Mechanical and Electrical Engineering & Automation, Xiamen University Tan Kah Kee College, Zhangzhou 363105, China
* Author to whom correspondence should be addressed.
Machines 2026, 14(1), 114; https://doi.org/10.3390/machines14010114
Submission received: 5 December 2025 / Revised: 31 December 2025 / Accepted: 14 January 2026 / Published: 19 January 2026

Abstract

Industrial pumps are critical components in manufacturing and process plants, where early acoustic anomaly detection is essential for preventing unplanned downtime and reducing maintenance costs. In practice, however, strong background noise, severe class imbalance between rare faults and abundant normal data, and the limited computing resources of edge devices make reliable deployment challenging. In this work, a lightweight Audio Spectrogram Transformer (Tiny-AST) is proposed for robust pump anomaly detection under imbalanced supervision. Building on the Audio Spectrogram Transformer, the internal Transformer encoder is redesigned by jointly reducing the embedding dimension, depth, and number of attention heads, and the model is trained with a class frequency-based balanced sampling strategy and time–frequency masking augmentation. Experiments on the pump subset of the MIMII dataset across three SNR levels (−6 dB, 0 dB, 6 dB) demonstrate that Tiny-AST achieves an effective trade-off between computational efficiency and noise robustness. With 1.01 M parameters and 1.68 GFLOPs, it outperforms ultra-lightweight CNNs (MobileNetV3) under heavy noise (−6 dB) and incurs a significantly lower computational cost than standard compact baselines (ResNet18, EfficientNet-B0). Furthermore, by effectively leveraging scarce fault samples, this lightweight supervised approach outperforms traditional unsupervised benchmarks (e.g., autoencoders, GANs). These results indicate that a carefully designed lightweight Transformer, together with appropriate sampling and augmentation, can provide competitive acoustic anomaly detection performance while remaining suitable for deployment on resource-constrained industrial edge devices.
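
To make the method description concrete, the following is a minimal PyTorch sketch of the three ingredients named in the abstract: a slimmed-down spectrogram Transformer encoder, class frequency-based balanced sampling, and time–frequency masking augmentation. The hyperparameters (patch size, embedding dimension, depth, number of attention heads) and helper names are illustrative assumptions, not the Tiny-AST configuration reported in the paper.

```python
# Minimal sketch (assumed hyperparameters, not the paper's reported Tiny-AST
# configuration) of a slimmed-down spectrogram Transformer, class-balanced
# sampling, and time-frequency masking augmentation.
import torch
import torch.nn as nn
import torchaudio.transforms as T
from torch.utils.data import WeightedRandomSampler


class TinyASTSketch(nn.Module):
    """ViT-style encoder over log-mel patches with reduced width, depth, and heads."""

    def __init__(self, n_mels=128, n_frames=512, patch=16,
                 embed_dim=96, depth=4, heads=3, num_classes=2):
        super().__init__()
        # Non-overlapping patch embedding of the (mels x frames) spectrogram.
        self.patch_embed = nn.Conv2d(1, embed_dim, kernel_size=patch, stride=patch)
        n_patches = (n_mels // patch) * (n_frames // patch)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches + 1, embed_dim))
        layer = nn.TransformerEncoderLayer(
            d_model=embed_dim, nhead=heads, dim_feedforward=4 * embed_dim,
            batch_first=True, norm_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, spec):                 # spec: (B, 1, n_mels, n_frames)
        x = self.patch_embed(spec)           # (B, D, H', W')
        x = x.flatten(2).transpose(1, 2)     # (B, N, D)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        x = torch.cat([cls, x], dim=1) + self.pos_embed
        x = self.encoder(x)
        return self.head(x[:, 0])            # classify from the [CLS] token


# Time-frequency masking augmentation (SpecAugment-style) for log-mel inputs.
augment = nn.Sequential(T.FrequencyMasking(freq_mask_param=16),
                        T.TimeMasking(time_mask_param=32))


def balanced_sampler(labels):
    """Class frequency-based sampling: weight each clip by the inverse
    frequency of its class so rare fault clips are drawn as often as normal ones."""
    labels_t = torch.as_tensor(labels)
    counts = torch.bincount(labels_t)
    weights = 1.0 / counts[labels_t].float()
    return WeightedRandomSampler(weights, num_samples=len(labels), replacement=True)
```

In such a setup, `augment` would be applied to log-mel spectrogram batches during training only, and the sampler returned by `balanced_sampler` would be passed to the training `DataLoader` so that minority fault recordings appear roughly as often as normal recordings in each epoch.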
Keywords: acoustic anomaly detection; industrial pumps; Audio Spectrogram Transformer; lightweight Transformer; imbalanced learning; edge computing; MIMII dataset

Share and Cite

MDPI and ACS Style

Zhang, H.; Lai, Y.-H. A Lightweight Audio Spectrogram Transformer for Robust Pump Anomaly Detection. Machines 2026, 14, 114. https://doi.org/10.3390/machines14010114
