Article
Peer-Review Record

Enhancing Soundscape Characterization and Pattern Analysis Using Low-Dimensional Deep Embeddings on a Large-Scale Dataset

Mach. Learn. Knowl. Extr. 2025, 7(4), 109; https://doi.org/10.3390/make7040109
by Daniel Alexis Nieto Mora 1,*, Leonardo Duque-Muñoz 1 and Juan David Martínez Vargas 2
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 22 July 2025 / Revised: 18 September 2025 / Accepted: 19 September 2025 / Published: 24 September 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This manuscript proposes a fully unsupervised framework for soundscape monitoring. Although the study has some practical significance, several issues require consideration and clarification.
1. The proposed framework includes several methods, such as the VGGish neural network, a convolutional autoencoder, UMAP, PaCMAP, and DBSCAN. Integrating these methods is a considerable amount of work. However, the methods appear to be simply combined without significant improvements, so the novelty of the manuscript is limited.
2. VGGish is pre-trained on the large-scale AudioSet dataset. However, the dataset used in this study was obtained from passive acoustic recordings collected within the Rey Zamuro and Matarredonda Private Nature Reserve. It should be noted that if the AudioSet data differ from the data of this study, the extracted features are likely to differ significantly. For instance, networks trained on large-scale natural image datasets such as ImageNet have limited effectiveness in specialized medical image processing.
3. Although some analyses were conducted in the experiments, dedicated ablation experiments are lacking, which makes it difficult to assess the key steps of the entire framework. Therefore, the conclusion that “One of the key aspects of our approach was the careful and systematic optimization of parameters for both projection and clustering methods.” raises some concerns.
4. The proposed framework was evaluated only on the recorded dataset, which may impose certain limitations. On the one hand, the manuscript is presented as if it were a study of the Rey Zamuro and Matarredonda Private Nature Reserve; on the other hand, it does not demonstrate generalization to other datasets.
5. Many networks with strong feature extraction capabilities are currently available, such as transformers, but they were not adopted in this study. Furthermore, no comparison was conducted, so the performance of the proposed framework and methods is difficult to evaluate. Likewise, there was no comparison with the latest methods, making it difficult to demonstrate innovation.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors
  1. The abstract summarizes the research background, the utilized methods, and the main achievements of this research. However, there is too much qualitative description and a lack of more quantitative data as support. It is suggested that representative data be selected from the results and presented in the abstract, which is helpful to enhance credibility.
  2. In the introduction, the authors present the previous research work of their research group in this direction and analyze the improvement directions of this study compared with the previous work by citing references [12] and [13]. It is suggested that in the "Results and Discussion" section, the results of this research and those of the literature be compared to demonstrate the necessity of conducting this research.
  3. Regarding the selection of data analysis methods, it is recommended to provide the reasons for choosing the adopted method in this study and explain why this method is more effective compared to others.
  4. The “Results and discussion” section is very long, exceeding 17 pages, and there are no subheadings in between. It is suggested to set subheadings based on the content to facilitate reading and understanding.
  5. In the conclusion section, it is suggested to supplement some quantitative data results to enhance the accuracy of the given conclusions.
  6. For a figure that includes multiple subfigures, such as Figures 1, 2, 5, 8, 9, 10, 11, and 14, it is recommended to label the subfigures (a), (b), (c), and so on, and to name each one, to facilitate reading and understanding.
  7. Whether in the abstract or the main text, the first appearance of an abbreviation should provide an explanation of its full meaning to facilitate understanding and reading. The author needs to check the abbreviations in the entire text.
  8. Some references are not cited in the main text but only in the abbreviated appendices. It is recommended that they be marked in the main text as well.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

Summary:      

In this study, the authors developed an unsupervised deep learning approach to analyze the soundscape of the selected region for ecological monitoring and related applications. The paper was well written with key information clearly communicated. It is a good application of machine learning methods in a selected area. The only concern of this reviewer is why such work is important.

Major comments:      

  1. With the development of machine learning methods, nearly every area of scientific inquiry has started adopting them. Although this work is well presented, it is not very clear to this reviewer how the application of machine learning methods helps the targeted field. In other words, yes, we can introduce machine learning to this field, but why?

Minor comments:      

  1. Why was the first letter of ‘scale’ in the title capitalized?
  2. Page 3, Lines 111-112: a formatting issue.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

Although the manuscript has been revised carefully, the decision to accept is difficult to make because sufficient evidence is lacking.
1. Regarding novelty, although the statement "we have clarified in more detail the rationale behind the selection of the methods used in our framework." describes the main work of the manuscript, it is hard to consider the rationale or principles behind the combination of methods as innovation. Improvements to these methods might be worthy of being regarded as novelty.
2. Without a comparison with the latest relevant methods, it is difficult to judge the innovation.
3. Regarding the ablation experiments, the statement "Although we did not design classical ablation experiments (removing one component at a time), the comparisons we conducted across feature extraction methods (indices, VGGish, autoencoder), dimensionality reduction techniques, and clustering algorithms can be regarded as a form of ablation, helping to isolate the contribution of each step." is not a satisfactory response. Combining the best individual methods does not necessarily produce the best results. Although analyses and principles are important, verification through experiments is equally crucial.
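
For reference, the "classical ablation" mentioned in the quoted response (removing one component at a time) can be sketched as follows. This is only an illustrative sketch, not the authors' code: the stages `center` and `scale`, the score `spread`, and the helper names `run_pipeline` and `ablate` are hypothetical toy stand-ins, whereas a real study would plug in feature extraction, projection, and clustering steps together with a clustering quality metric.

```python
from typing import Callable, Dict, List, Tuple

Vector = List[float]
Stage = Tuple[str, Callable[[Vector], Vector]]


def center(x: Vector) -> Vector:
    """Toy stage (hypothetical): subtract the mean."""
    mean = sum(x) / len(x)
    return [v - mean for v in x]


def scale(x: Vector) -> Vector:
    """Toy stage (hypothetical): divide by the largest absolute value."""
    peak = max(abs(v) for v in x)
    return [v / peak for v in x] if peak else list(x)


def spread(x: Vector) -> float:
    """Toy score (hypothetical): range of the values."""
    return max(x) - min(x)


def run_pipeline(stages: List[Stage], data: Vector) -> Vector:
    """Apply each stage in order to the data."""
    for _, fn in stages:
        data = fn(data)
    return data


def ablate(
    stages: List[Stage], data: Vector, score: Callable[[Vector], float]
) -> Dict[str, float]:
    """Score the full pipeline, then each variant with exactly one stage removed."""
    results = {"full": score(run_pipeline(stages, data))}
    for i, (name, _) in enumerate(stages):
        reduced = stages[:i] + stages[i + 1:]
        results[f"without {name}"] = score(run_pipeline(reduced, data))
    return results


STAGES: List[Stage] = [("center", center), ("scale", scale)]
results = ablate(STAGES, [1.0, 2.0, 3.0, 10.0], spread)
for variant, value in results.items():
    print(f"{variant}: {value:.3f}")
```

The sketch assumes every stage maps a vector to a vector, so any stage can be dropped without breaking the next one. In a real pipeline the remaining stages must still accept the input they receive once a component is removed, for example clustering directly on embeddings when the projection step is omitted.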

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
