Optimising Convolutional Neural Network Architectures for Fin Whale Pulse Detection in Spectrograms
Marta Román-Ruiz and Claudio Rossi
Deep neural networks are widely used for image classification across many fields, yet selecting an appropriate architecture often remains a trial-and-error process. This work investigates a convolutional neural network architecture used to detect whale pulses in spectrograms, with the aim of better understanding the causes of its underperformance. By examining the behaviour of its internal layers, we show that the early convolutional blocks capture the most informative acoustic features, while deeper layers provide limited additional benefit and, under the considered training conditions, may even degrade classification accuracy. Based on these observations, we derive a simplified architecture consisting of only the first two convolutional layers followed by a lightweight classifier. This network achieves near-optimal performance, improving accuracy from 87% to 98%, and exhibits substantially lower variability between repetitions than the original model.
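To make the shape of the simplified network concrete, the following is a minimal NumPy sketch of a forward pass through two convolutional blocks followed by a lightweight classifier. It is an illustration under stated assumptions, not the authors' implementation: the filter counts (8 and 16), the 3×3 kernels, the ReLU activations, the 2×2 max pooling, the global average pooling, the sigmoid output, and the 64×64 input size are all hypothetical choices that the abstract does not specify, and the weights are random rather than trained.

```python
import numpy as np

def conv2d(x, w, b):
    """Valid 2-D cross-correlation. x: (C_in, H, W); w: (C_out, C_in, kH, kW)."""
    kh, kw = w.shape[2], w.shape[3]
    # windows has shape (C_in, H-kH+1, W-kW+1, kH, kW)
    windows = np.lib.stride_tricks.sliding_window_view(x, (kh, kw), axis=(1, 2))
    return np.einsum("chwij,ocij->ohw", windows, w) + b[:, None, None]

def relu(x):
    return np.maximum(x, 0.0)

def max_pool2(x):
    """Non-overlapping 2x2 max pooling; odd trailing rows/columns are dropped."""
    c, h, w = x.shape
    h2, w2 = h // 2, w // 2
    return x[:, : h2 * 2, : w2 * 2].reshape(c, h2, 2, w2, 2).max(axis=(2, 4))

def init_params(seed=0):
    """Random, untrained weights. All layer sizes are assumptions."""
    rng = np.random.default_rng(seed)
    return {
        "conv": [
            (rng.normal(0, 0.1, (8, 1, 3, 3)), np.zeros(8)),    # block 1
            (rng.normal(0, 0.1, (16, 8, 3, 3)), np.zeros(16)),  # block 2
        ],
        "w_out": rng.normal(0, 0.1, 16),  # lightweight linear classifier
        "b_out": 0.0,
    }

def forward(spectrogram, params):
    """Return an (untrained) pulse probability for one 2-D spectrogram patch."""
    x = spectrogram[None, :, :]               # add a single input channel
    for w, b in params["conv"]:
        x = max_pool2(relu(conv2d(x, w, b)))  # conv -> ReLU -> pool
    features = x.mean(axis=(1, 2))            # global average pooling
    logit = features @ params["w_out"] + params["b_out"]
    return 1.0 / (1.0 + np.exp(-logit))       # sigmoid

params = init_params()
patch = np.random.default_rng(1).random((64, 64))  # stand-in for a spectrogram
probability = forward(patch, params)
```

With a 64×64 input, the two blocks reduce the feature maps to 16 channels of 14×14 before pooling, so the classifier itself has only 17 parameters; that is the sense in which the classifier stage stays "lightweight" in this sketch.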
28 February 2026



![Spectrogram detections of fin whale 20 Hz and VFP notes (X-BAT (BRP, Cornell)) 2.5 min window, FFT = 512 points, Hanning window, 75% overlap [9].](https://mdpi-res.com/cdn-cgi/image/w=470,h=317/https://mdpi-res.com/applsci/applsci-16-02345/article_deploy/html/images/applsci-16-02345-g001-550.jpg)
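The spectrogram settings quoted in the caption (512-point FFT, Hanning window, 75% overlap, 2.5 min window) can be sketched in NumPy as follows. This is an illustrative reconstruction, not the X-BAT pipeline: the sampling rate of 256 Hz and the synthetic 20 Hz tone are assumptions chosen only to make the example runnable, and the function name `spectrogram` is hypothetical.

```python
import numpy as np

def spectrogram(signal, fs, n_fft=512, overlap=0.75):
    """Power spectrogram in dB using a Hann window.

    Returns (freqs, times, power_db) with power_db shaped (n_freqs, n_frames).
    """
    hop = int(n_fft * (1 - overlap))          # 75% overlap -> hop of 128 samples
    window = np.hanning(n_fft)
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] for i in range(n_frames)]
    )
    spectrum = np.fft.rfft(frames * window, axis=1)
    power_db = 10 * np.log10(np.abs(spectrum) ** 2 + 1e-12)  # floor avoids log(0)
    freqs = np.fft.rfftfreq(n_fft, d=1 / fs)
    times = (np.arange(n_frames) * hop + n_fft / 2) / fs      # frame centres
    return freqs, times, power_db.T

fs = 256                                   # assumed sampling rate in Hz
t = np.arange(int(2.5 * 60 * fs)) / fs     # 2.5 min of samples
tone = np.sin(2 * np.pi * 20 * t)          # synthetic 20 Hz tone, not real data
freqs, times, power_db = spectrogram(tone, fs)
peak_hz = freqs[power_db.mean(axis=1).argmax()]
```

At the assumed 256 Hz sampling rate, a 512-point FFT gives a frequency resolution of 0.5 Hz and each frame spans 2 s, so `peak_hz` lands on the 20 Hz bin. A different sampling rate changes both figures proportionally.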



