Robust Dolphin Whistle Detection Based on Dually-Regularized Non-Negative Matrix Factorization in Passive Acoustic Monitoring
Abstract
1. Introduction
2. Methods
2.1. STFT-Based Spectral Model of Noisy Whistles
2.2. Spectrogram Enhancement Model Based on DR-NMF and Iterative Solution Method
2.2.1. Derivation of the Objective Function
2.2.2. Optimization of the Objective Function
Algorithm 1. Dually-Regularized NMF Algorithm
2.3. Endpoint Detector Based on Time–Frequency Spectrogram
2.4. Experimental Setup
2.4.1. Evaluation Metrics
2.4.2. Baseline Methods
3. Results
3.1. Optimal Parameter Selection
3.2. Enhanced and Detected Results of Typical Whistle Segments
3.3. Comparison with Other Methods
3.4. Experimental Results on Real Dolphin Whistle Recordings
3.5. Complexity Analysis
4. Discussion
4.1. Analysis of the Superiority of the Proposed Method
4.2. Limitations and Future Perspectives
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest






| SNR (dB) | −10 | −9 | −8 | −7 | −6 | −5 |
|---|---|---|---|---|---|---|
| Noisy wav1 | | | | | | |
| Precision | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Recall | 0.4391 | 0.4589 | 0.4731 | 0.4986 | 0.5172 | 0.5241 |
| F1-Score | 0.6102 | 0.6291 | 0.6423 | 0.6654 | 0.6779 | 0.6877 |
| Enhanced wav1 | | | | | | |
| Precision | 0.9967 | 0.9970 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Recall | 0.6686 | 0.7190 | 0.7819 | 0.8499 | 0.8952 | 0.9356 |
| F1-Score | 0.7999 | 0.8540 | 0.8776 | 0.9188 | 0.9267 | 0.9678 |
| Noisy wav2 | | | | | | |
| Precision | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Recall | 0.4230 | 0.4360 | 0.4527 | 0.4694 | 0.4917 | 0.5083 |
| F1-Score | 0.5945 | 0.6072 | 0.6232 | 0.6389 | 0.6592 | 0.6740 |
| Enhanced wav2 | | | | | | |
| Precision | 0.9917 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Recall | 0.6660 | 0.6827 | 0.7365 | 0.7495 | 0.7959 | 0.8590 |
| F1-Score | 0.7969 | 0.8115 | 0.8483 | 0.8568 | 0.8864 | 0.9242 |
| Noisy wav3 | | | | | | |
| Precision | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Recall | 0.4218 | 0.4341 | 0.4482 | 0.4499 | 0.4605 | 0.4745 |
| F1-Score | 0.5933 | 0.6054 | 0.6189 | 0.6206 | 0.6306 | 0.6436 |
| Enhanced wav3 | | | | | | |
| Precision | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 |
| Recall | 0.5677 | 0.6257 | 0.6942 | 0.7504 | 0.7645 | 0.7750 |
| F1-Score | 0.7242 | 0.7697 | 0.8195 | 0.8574 | 0.8665 | 0.8733 |

| Method | Precision | Recall | F1-Score |
|---|---|---|---|
| wav4 Segment | | | |
| Noisy | 1.0000 | 0.3137 | 0.4775 |
| Blurring | 1.0000 | 0.4649 | 0.6348 |
| OGS | 1.0000 | 0.4594 | 0.6296 |
| GL | 1.0000 | 0.3413 | 0.5089 |
| NMF-OGS | 1.0000 | 0.4668 | 0.6365 |
| NMF-GL | 1.0000 | 0.4631 | 0.6330 |
| Proposed | 1.0000 | 0.7491 | 0.8565 |
| wav5 Segment | | | |
| Noisy | 0.0000 | 0.0000 | 0.0000 |
| Blurring | 0.0000 | 0.0000 | 0.0000 |
| OGS | 0.0000 | 0.0000 | 0.0000 |
| GL | 0.0000 | 0.0000 | 0.0000 |
| NMF-OGS | 0.0000 | 0.0000 | 0.0000 |
| NMF-GL | 0.0000 | 0.0000 | 0.0000 |
| Proposed | 0.9438 | 0.3631 | 0.5245 |
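
The last column of the two tables above is consistent with the F1-score, the harmonic mean of precision and recall. As a minimal sketch, the following recomputes it for a few rows copied from the tables. The function name `f1_score`, the selection of rows, and the convention of returning 0 when precision and recall are both 0 are illustrative assumptions, not details taken from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall.

    Assumption: the score is defined as 0 when both inputs are 0,
    which matches the all-zero wav5 rows in the table.
    """
    if precision + recall == 0.0:
        return 0.0
    return 2.0 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from a few table rows
rows = {
    "Noisy wav1, -10 dB": (1.0000, 0.4391, 0.6102),
    "wav4 Proposed": (1.0000, 0.7491, 0.8565),
    "wav5 Proposed": (0.9438, 0.3631, 0.5245),
    "wav5 Noisy": (0.0000, 0.0000, 0.0000),
}

for name, (p, r, reported) in rows.items():
    computed = f1_score(p, r)
    print(f"{name}: computed {computed:.4f}, reported {reported:.4f}")
```

For the rows checked here, the recomputed values agree with the reported ones to within about 0.001, which is the size of difference expected if the reported precision and recall were rounded to four decimals before tabulation.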
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Li, L.; Shao, X.; Huang, S.; Cui, X.; Zhu, J.; Liu, S. Robust Dolphin Whistle Detection Based on Dually-Regularized Non-Negative Matrix Factorization in Passive Acoustic Monitoring. J. Mar. Sci. Eng. 2025, 13, 2164. https://doi.org/10.3390/jmse13112164

