A Parallel Classification Model for Marine Mammal Sounds Based on Multi-Dimensional Feature Extraction and Data Augmentation
Round 1
Reviewer 1 Report
Please see the attachment.
Comments for author File: Comments.pdf
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Reviewer 2 Report
The paper presents results of applying machine learning to marine mammal sound classification. There are many publications on this topic, and I would like to see a clear explanation of the advantages of the proposed method over prior work. For example, I can point to several similar publications:
Shiu, Y., Palmer, K.J., Roch, M.A., Fleishman, E., Liu, X., Nosal, E.M., Helble, T., Cholewiak, D., Gillespie, D. and Klinck, H., 2020. Deep neural networks for automated detection of marine mammal species. Scientific Reports, 10(1), pp. 1-12.
Dugan, P.J., Clark, C.W., LeCun, Y.A. and Van Parijs, S.M., 2015. DCL System Using Deep Learning Approaches for Land-Based or Ship-Based Real Time Recognition and Localization of Marine Mammals. Bioacoustics Research Program, Cornell University, Ithaca, United States.
Zhong, M., Castellote, M., Dodhia, R., Lavista Ferres, J., Keogh, M. and Brewer, A., 2020. Beluga whale acoustic signal classification using deep learning neural network models. The Journal of the Acoustical Society of America, 147(3), pp. 1834-1841.
Kirsebom, O.S., Frazao, F., Simard, Y., Roy, N., Matwin, S. and Giard, S., 2020. Performance of a deep neural network at detecting North Atlantic right whale upcalls. The Journal of the Acoustical Society of America, 147(4), pp. 2636-2646.
Usman, A.M., Ogundile, O.O. and Versfeld, D.J., 2020. Review of automatic detection and classification techniques for cetacean vocalization. IEEE Access, 8, pp. 105181-105206.
Allen, A.N., Harvey, M., Harrell, L., Jansen, A., Merkens, K.P., Wall, C.C., Cattiau, J. and Oleson, E.M., 2021. A convolutional neural network for automated detection of humpback whale song in a diverse, long-term passive acoustic dataset. Frontiers in Marine Science, 8, p. 607321.
I also recommend citing the review paper: Bianco, M.J., Gerstoft, P., Traer, J., Ozanich, E., Roch, M.A., Gannot, S. and Deledalle, C.A., 2019. Machine learning in acoustics: Theory and applications. The Journal of the Acoustical Society of America, 146(5), pp. 3590-3628.
Several more comments:
1. The authors call their method MDF-PNet, but the abbreviation is never explained. I suggest not using it in the title and explaining why the method is called MDF-PNet.
2. Show scales for the axes in Figure 3.
3. Please explain how the features were extracted from samples with different sampling rates. How were the samples transformed to a standard form?
4. I am not sure that data augmentation by flipping the waveform can be used to increase the number of samples. I think this method does not change frequency-domain features. Could you please calculate the features for a few original and flipped samples? (A minimal numerical check of this point is sketched after this list.)
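A minimal sketch of such a check in Python, using a hypothetical chirp signal and a 16 kHz rate (both are assumptions, not data from the paper). It compares the magnitude spectrum of a waveform against its time-reversed and polarity-inverted versions, the two usual readings of "flipping":

```python
import numpy as np

# Hypothetical stand-in for a marine mammal call: a short linear chirp
# sampled at 16 kHz (signal and rate are assumptions, not the paper's data).
fs = 16000
t = np.arange(0, 0.5, 1.0 / fs)
x = np.sin(2 * np.pi * (500.0 + 2000.0 * t) * t)

# Two common readings of "flipping the waveform".
x_rev = x[::-1]   # time reversal
x_inv = -x        # polarity (amplitude) inversion

# Magnitude spectra, the basis of most frequency-domain features.
mag = np.abs(np.fft.rfft(x))
mag_rev = np.abs(np.fft.rfft(x_rev))
mag_inv = np.abs(np.fft.rfft(x_inv))

# Both flips leave the magnitude spectrum unchanged (up to float error).
print(np.allclose(mag, mag_rev))  # True
print(np.allclose(mag, mag_inv))  # True
```

Both comparisons print True: the whole-clip magnitude spectrum is invariant under either flip, so any feature computed from it alone is unchanged; frame-wise features such as a spectrogram are merely mirrored along the time axis by reversal.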
Author Response
Please see the attachment.
Author Response File: Author Response.docx
Round 2
Reviewer 1 Report
The manuscript can be accepted.
Author Response
Thank you!
Reviewer 2 Report
I am more or less satisfied with the majority of the corrections, but the authors did not prove that data augmentation by flipping the waveform can be used to increase the number of samples. The paper states that "the features used in this paper mainly come from the frequency domain". Waveform flipping does not change the signal's spectrum, which means that the spectral features are unchanged after flipping, and the data augmentation presented in the paper therefore makes no sense.
I recommend that the authors either remove the data augmentation from the paper or explain clearly why waveform flipping can be used there. (The spectral identity behind this objection is sketched below.)
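For reference, a short derivation of the spectral identity behind this objection. This is a sketch assuming "flipping" means time reversal of a real-valued signal x(t) with Fourier transform X(f); polarity inversion behaves the same way:

```latex
% Time reversal of a real-valued signal conjugates its spectrum
% (Hermitian symmetry: X(-f) = \overline{X(f)} for real x):
y(t) = x(-t) \;\Longrightarrow\; Y(f) = X(-f) = \overline{X(f)}
\;\Longrightarrow\; |Y(f)| = |X(f)|.
% Polarity inversion only changes the sign of the spectrum:
y(t) = -x(t) \;\Longrightarrow\; Y(f) = -X(f)
\;\Longrightarrow\; |Y(f)| = |X(f)|.
```

In either case any feature computed from the magnitude spectrum alone is identical for the original and the flipped waveform.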
Author Response
Thank you!
Author Response File: Author Response.pdf