LW-MS-LFTFNet: A Lightweight Multi-Scale Network Integrating Low-Frequency Temporal Features for Ship-Radiated Noise Recognition
Abstract
1. Introduction
1. We propose a lightweight multi-scale (LW-MS) backbone built on depthwise separable convolutions and tailored to the time-frequency characteristics of underwater acoustic signals. In the comparison experiments, the backbone alone reaches 72.40% accuracy with 0.27 M parameters, a higher accuracy and a lower parameter count than any of the mainstream lightweight architectures evaluated.
2. We introduce two LSTM-based temporal modules that incorporate low-frequency temporal features (LFTF) into the model. By capturing temporal dependencies in the 1–100 Hz and 100–1000 Hz spectrogram bands, they raise accuracy from 72.40% to 75.04% while adding 0.58 M parameters and 0.04 GMACs.
3. LW-MS-LFTFNet achieves state-of-the-art performance among the lightweight models compared, with a recognition accuracy of 75.04% using only 0.85 M parameters, 0.38 GMACs, and 3.27 MB of storage. This balance between accuracy and model complexity makes the model well suited to deployment on resource-constrained underwater edge platforms.
2. Materials and Methods
2.1. Time-Frequency Pattern Analysis
2.2. The Design of the LW-MS-LFTFNet Model
2.2.1. Multi-Scale Convolution Backbone with Depthwise Separable Convolutions
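As a rough illustration of why depthwise separable convolutions keep a multi-scale backbone small, the sketch below counts the weights of a standard convolution and of its depthwise separable counterpart. The channel counts and the kernel sizes 3, 5, and 7 are hypothetical values chosen for illustration; they are not taken from the paper.

```python
def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a standard k x k convolution (bias omitted)."""
    return k * k * c_in * c_out


def depthwise_separable_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = k * k * c_in   # one k x k filter per input channel
    pointwise = c_in * c_out   # 1 x 1 convolution that mixes channels
    return depthwise + pointwise


# Hypothetical layer sizes, used only to show the trend across kernel scales.
c_in, c_out = 64, 128
for k in (3, 5, 7):
    std = standard_conv_params(c_in, c_out, k)
    dsc = depthwise_separable_params(c_in, c_out, k)
    print(f"k={k}: standard={std}, separable={dsc}, ratio={dsc / std:.3f}")
```

The ratio equals 1/c_out + 1/k², so the saving grows with kernel size, which is what makes larger kernels affordable in parallel multi-scale branches.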

2.2.2. CBAM Lightweight Attention Mechanism
2.2.3. LSTM Network for Low-Frequency Temporal Feature Extraction
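To give a sense of how an LSTM branch contributes to the parameter budget, the sketch below counts the trainable parameters of a single unidirectional LSTM layer. It assumes that each of the 94 frames of a low-frequency feature map is fed as one time step, so the input size equals the number of filter banks (20 for the 1–100 Hz band). The hidden size of 128 is a hypothetical value; the layer configuration actually used in the paper is not given in this text.

```python
def lstm_layer_params(input_size: int, hidden_size: int, biases_per_gate: int = 1) -> int:
    """Trainable parameters in one unidirectional LSTM layer.

    Each of the 4 gates (input, forget, cell, output) has an input weight
    matrix, a recurrent weight matrix, and `biases_per_gate` bias vectors.
    """
    per_gate = (
        hidden_size * input_size        # input-to-hidden weights
        + hidden_size * hidden_size     # hidden-to-hidden weights
        + biases_per_gate * hidden_size # bias vector(s)
    )
    return 4 * per_gate


# Assumed input size: 20 filter banks per frame (1-100 Hz band).
# Hypothetical hidden size: 128.
textbook = lstm_layer_params(input_size=20, hidden_size=128)
# Frameworks such as PyTorch store two bias vectors per gate (b_ih and b_hh).
two_bias = lstm_layer_params(input_size=20, hidden_size=128, biases_per_gate=2)
print(textbook, two_bias)
```

Because the recurrent term grows with the square of the hidden size, the hidden size rather than the narrow 20- or 80-dimensional input dominates the cost of such a branch.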
3. Experimental Setup
3.1. Dataset
3.2. Parameter Setup
3.3. Evaluation Metric
4. Experimental Results and Analysis
4.1. Results of LW-MS-LFTFNet
4.2. Ablation Experiments
4.3. Comparison Experiments
4.4. Saliency Visualization and Interpretation
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| SRN | Ship-Radiated Noise |
| CNN | Convolutional Neural Network |
| RNN | Recurrent Neural Network |
| CBAM | Convolutional Block Attention Module |
| DSC | Depthwise Separable Convolution |
| MLP | Multi-Layer Perceptron |
| LSTM | Long Short-Term Memory |
| LW-MS | Lightweight Multi-Scale |
| LFTF | Low-Frequency Temporal Features |
| t-SNE | t-distributed Stochastic Neighbor Embedding |

| Ship Type | No. of Ships | Total Time | Total Recordings | Train Size | Validation Size | Test Size |
|---|---|---|---|---|---|---|
| Cargo | 69 | 10 h 40 min | 110 | 9185 | 1789 | 1782 |
| Passenger Ship | 46 | 12 h 22 min | 193 | 10,555 | 2401 | 2393 |
| Tanker | 133 | 12 h 45 min | 240 | 10,827 | 1932 | 1925 |
| Tug | 17 | 11 h 17 min | 70 | 8804 | 2335 | 2327 |

| Frequency Range (Hz) | Pre-Emphasis Coefficient | Number of Filter Banks | Hop Length (samples) | FFT Size (samples) | Feature Dimension (filter banks × frames) |
|---|---|---|---|---|---|
| 1–8000 | 0.97 | 513 | 512 | 4096 | 513 × 94 |
| 100–1000 | 0.00 | 80 | 512 | 2048 | 80 × 94 |
| 1–100 | 0.00 | 20 | 512 | 2048 | 20 × 94 |
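The second dimension of every feature map is 94 frames, which follows from the hop length and the segment length. The sketch below shows one combination that reproduces it: a 3 s segment sampled at 16 kHz, framed with centered padding so that the frame count is 1 + floor(N / hop). The segment duration and sample rate are assumptions made for illustration, since neither is stated in this text.

```python
def num_frames(num_samples: int, hop_length: int, n_fft: int, centered: bool = True) -> int:
    """Number of STFT frames for a signal of `num_samples` samples.

    With centered framing the signal is padded by n_fft // 2 on both sides,
    so the count depends only on the hop length.
    """
    if centered:
        return 1 + num_samples // hop_length
    return 1 + (num_samples - n_fft) // hop_length


# Assumed values (not given in the text): 16 kHz sample rate, 3 s segments.
sample_rate = 16_000
segment_seconds = 3
samples = sample_rate * segment_seconds

frames_centered = num_frames(samples, hop_length=512, n_fft=2048)
frames_uncentered = num_frames(samples, hop_length=512, n_fft=2048, centered=False)
print(frames_centered, frames_uncentered)
```

Under these assumptions only the centered variant yields 94 frames; without padding the same segment produces 90.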

| Class | Precision (%) | Recall (%) | F1-Score (%) | Support |
|---|---|---|---|---|
| Cargo | 67.57 | 64.31 | 65.90 | 1782 |
| Passenger ship | 76.53 | 77.68 | 77.10 | 2393 |
| Tanker | 71.03 | 79.74 | 75.13 | 1925 |
| Tug | 83.33 | 76.67 | 79.86 | 2327 |
| Macro average | 74.62 | 74.60 | 74.50 | 8427 |
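The per-class figures can be cross-checked with a few lines of arithmetic. The sketch below recomputes each F1-score as the harmonic mean of precision and recall, takes unweighted (macro) averages, and derives overall accuracy as the support-weighted mean of recall. It uses only the rounded values printed in the table, so small rounding differences are expected.

```python
# (precision %, recall %, support) per class, copied from the table above.
rows = {
    "Cargo": (67.57, 64.31, 1782),
    "Passenger ship": (76.53, 77.68, 2393),
    "Tanker": (71.03, 79.74, 1925),
    "Tug": (83.33, 76.67, 2327),
}


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1_scores = {name: f1_score(p, r) for name, (p, r, _) in rows.items()}

n_classes = len(rows)
macro_precision = sum(p for p, _, _ in rows.values()) / n_classes
macro_recall = sum(r for _, r, _ in rows.values()) / n_classes
macro_f1 = sum(f1_scores.values()) / n_classes

# Accuracy is the fraction of all test samples classified correctly,
# i.e. recall weighted by the number of samples in each class.
total_support = sum(s for _, _, s in rows.values())
accuracy = sum(r * s for _, r, s in rows.values()) / total_support

print(f"macro P={macro_precision:.2f} R={macro_recall:.2f} F1={macro_f1:.2f}")
print(f"support={total_support} accuracy={accuracy:.2f}")
```

The support-weighted recall comes out at 75.04%, matching the overall recognition accuracy reported for the model.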

| Model Configuration | Accuracy (%) | No. Params (M) | MACs (G) | Model Size (MB) |
|---|---|---|---|---|
| Lightweight Backbone (baseline) | 72.40 | 0.27 | 0.34 | 1.06 |
| Backbone + A (1–100 Hz) | 73.38 (+0.98) | 0.54 (+0.27) | 0.36 (+0.02) | 2.10 (+1.04) |
| Backbone + B (100–1000 Hz) | 73.01 (+0.61) | 0.58 (+0.31) | 0.36 (+0.02) | 2.22 (+1.16) |
| Backbone + A + B (proposed) | 75.04 (+2.64) | 0.85 (+0.58) | 0.38 (+0.04) | 3.27 (+2.21) |
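The bracketed deltas in the ablation table can be reproduced directly from the absolute values, which also makes it easy to compare the combined gain with the sum of the individual gains. The sketch below is only a reading aid for the table; the dictionary keys are illustrative names.

```python
# Absolute values copied from the ablation table.
baseline = {"acc": 72.40, "params": 0.27, "macs": 0.34, "size": 1.06}
variants = {
    "Backbone + A": {"acc": 73.38, "params": 0.54, "macs": 0.36, "size": 2.10},
    "Backbone + B": {"acc": 73.01, "params": 0.58, "macs": 0.36, "size": 2.22},
    "Backbone + A + B": {"acc": 75.04, "params": 0.85, "macs": 0.38, "size": 3.27},
}

# Difference of every metric with respect to the backbone-only baseline.
deltas = {
    name: {key: round(values[key] - baseline[key], 2) for key in baseline}
    for name, values in variants.items()
}

for name, delta in deltas.items():
    print(name, delta)

# Compare the joint gain with the sum of the two single-module gains.
sum_of_single_gains = deltas["Backbone + A"]["acc"] + deltas["Backbone + B"]["acc"]
joint_gain = deltas["Backbone + A + B"]["acc"]
print(f"A alone + B alone = {sum_of_single_gains:.2f}, A and B together = {joint_gain:.2f}")
```

The joint gain of 2.64 percentage points exceeds the 1.59 points obtained by adding the two single-module gains, which suggests the two frequency bands carry complementary information.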

| Model | Accuracy (%) | No. Params (M) | MACs (G) | Model Size (MB) |
|---|---|---|---|---|
| LW-MS-LFTFNet (Proposed) | 75.04 | 0.85 | 0.38 | 3.27 |
| Backbone | 72.40 | 0.27 | 0.34 | 1.06 |
| LW-SEResNet10 | 67.95 | 4.91 | 0.90 | 18.70 |
| CFTANet | 66.32 | 0.54 | 0.25 | 2.08 |
| MobileNetV1 (0.5) | 66.82 | 0.82 | 0.16 | 3.23 |
| MobileNetV1 (0.75) | 63.94 | 1.82 | 0.34 | 7.05 |
| MobileNetV1 (1.0) | 64.72 | 3.21 | 0.59 | 12.30 |
| ShuffleNetV2 (0.5) | 64.06 | 0.35 | 0.04 | 1.46 |
| ShuffleNetV2 (1.0) | 66.73 | 1.26 | 0.15 | 4.97 |
| MobileNetV2 (0.5) | 65.50 | 0.69 | 0.11 | 2.82 |
| MobileNetV2 (0.75) | 67.30 | 1.36 | 0.23 | 5.40 |
| MobileNetV2 (1.0) | 69.79 | 2.23 | 0.33 | 8.74 |
| MA-CNN-A | 60.23 | 0.93 | 0.63 | 3.61 |
| ResNet18 | 67.63 | 11.18 | 1.83 | 42.72 |
| MACRN | 65.31 | 3.16 | 1.44 | 12.00 |
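The model-size column tracks the parameter count closely. As a sanity check, the sketch below estimates storage from the parameter count under two assumptions that are not stated in the text: weights are stored as 32-bit floats, and sizes are expressed in units of 1024² bytes. Only a subset of the rows is included.

```python
BYTES_PER_FLOAT32 = 4
BYTES_PER_MIB = 1024 ** 2


def estimated_size_mib(params_millions: float) -> float:
    """Storage for float32 weights only, ignoring any file-format overhead."""
    return params_millions * 1e6 * BYTES_PER_FLOAT32 / BYTES_PER_MIB


# (parameters in millions, reported model size) copied from the table above.
reported = {
    "LW-MS-LFTFNet": (0.85, 3.27),
    "Backbone": (0.27, 1.06),
    "MobileNetV1 (1.0)": (3.21, 12.30),
    "ResNet18": (11.18, 42.72),
}

gaps = {}
for name, (params_m, size_reported) in reported.items():
    estimate = estimated_size_mib(params_m)
    gaps[name] = size_reported - estimate
    print(f"{name}: estimated {estimate:.2f}, reported {size_reported:.2f}")
```

For these rows the estimate falls within 0.1 of the reported size, which is consistent with float32 storage plus a small overhead, although the rounded parameter counts leave some slack.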
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Feng, Y.; Chen, Z.; Chen, Y.; Xie, Z.; He, J.; Li, J.; Ding, H.; Guo, T.; Chen, K. LW-MS-LFTFNet: A Lightweight Multi-Scale Network Integrating Low-Frequency Temporal Features for Ship-Radiated Noise Recognition. J. Mar. Sci. Eng. 2025, 13, 2073. https://doi.org/10.3390/jmse13112073

