Review Reports - Research on Underwater Acoustic Source Localization Based on Typical Machine Learning Algorithms

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This paper investigates the application of four machine learning algorithms (Decision Tree, Random Forest, Support Vector Machine, and Feedforward Neural Network) to underwater acoustic source localization using both simulated and real datasets. It evaluates these algorithms in classification and regression tasks under different signal-to-noise ratios, demonstrating their effectiveness and limitations in 1D and 2D localization scenarios, including validation with sea trial data.

Regarding the use of English: perform a full grammar and spell-check pass, please. Also, replace informal phrases with more formal alternatives. Please, use technical terms consistently (for example, use “signal-to-noise ratio” instead of switching between “SNR” and written forms without definition).

Regarding Affiliation: if all authors come from the same institution I believe there is no need to repeat affiliation.

Introduction

- This section cites many studies but offers little critical evaluation.

- Some references are old (from 1976, 1982, ..., 1998) without clear linkage to the current context. Perhaps the historical background needs to be condensed in more focused paragraphs.

- Please, include your specific research question at the end of this section in a clear manner.

- Why were these four ML models chosen and not others like CNN or LSTM?

Based Theory of Machine Learning for Underwater Acoustic Source Localization

- This section seems to be overly detailed and didactic for a scientific article.

- The choice of algorithms is not justified in terms of acoustic localization suitability.

- Please, explain further why covariance matrices are used instead of direct pressure signals.

- Please, include a table summarizing key parameters of each model. What kernel was used in SVM? What architecture and hyperparameters were used in FNN?

- Please, revise Figure 1. The flow diagrams are clearly different but adding an explanatory label to each one of them will clarify when and how each one is used.

Result Analysis of Simulation Data

- Would it be possible to add confidence intervals to the results? At the moment they do not show much statistical rigor.

- There is no formal discussion of model complexity vs performance.

- Figures 3 to 10 should be deeply discussed.

- Were models cross-validated? How many times was each experiment repeated?

- How was Gaussian noise added? Uniformly or per-sample?

Experimental Results and Analysis

- Experimental results are briefly discussed and lack depth. Model performance on real data is not compared to simulated data. Regression is ignored in real data.

- Why was the 40–75 minute segment chosen from SWellEx-96?

Conclusions

- The conclusions summarize results without deeper interpretation. Please, reframe the section to focus on scientific contributions and impact.

Comments on the Quality of English Language

Author Response

Comments1：This section cites many studies but offers little critical evaluation.

Response1：Thank you for your reply. We have added relevant critical descriptions in the first section.

Comments2： Some references are old (from 1976, 1982, ..., 1998) without clear linkage to the current context. Perhaps the historical background needs to be condensed in more focused paragraphs.

Response2：Thank you for your reply. We have made corresponding improvements in the references.

Comments3：Please, include your specific research question at the end of this section in a clear manner.

Response3：Thank you for your reply. Due to the complexity and time-varying nature of the marine environment, localization performance in marine acoustics is often unsatisfactory, especially at low SNR. Small-sample datasets are also poorly suited to training deep learning models such as CNNs. This paper therefore studies machine learning algorithms for underwater passive localization of simulated and experimental acoustic sources, using classification and regression tasks on sample sets of limited size. Traditional machine learning methods, namely DT, RF, SVM, and FNN, are investigated. (1) The performance of machine learning algorithms in acoustic source localization applications is compared systematically. (2) Under a unified framework, the four classical machine learning models (DT, RF, SVM, FNN) are comprehensively evaluated for underwater acoustic localization. One-dimensional (distance) and two-dimensional (distance + depth) simulated datasets are localized using classification and regression tasks at different signal-to-noise ratios (SNR = 2, 5, 10), in a simulation environment similar to that of the SWellEx-96 experiment. (3) In addition, the research provides standardized data processing models and, through simulated and field data (SWellEx-96 experiment), validates the feasibility of using machine learning to replace traditional physics-based models (e.g., Matched Field Processing, MFP) in complex ocean environments, thus providing practical algorithm selection guidelines and performance boundaries for engineering applications.

Comments4：Why were these four ML models chosen and not others like CNN or LSTM?

Response4：Thank you for your reply. Machine learning methods such as decision trees and support vector machines can be tuned through manually designed parameters. Deep learning, by contrast, often relies on large datasets and processor clusters to accelerate computation, which leads to high training energy consumption. Classical machine learning methods reduce the computational load through manual feature extraction and are less dependent on large datasets. We therefore use these ML models in this paper.

Comments5：This section seems to be overly detailed and didactic for a scientific article.

Response5：Section 2 presents localization based on machine learning. It includes Data Preprocessing and Label Selection and the Algorithm Flow of a Typical Machine Learning Model. Part of the theoretical explanation has been simplified in the revised manuscript.

Comments6：The choice of algorithms is not justified in terms of acoustic localization suitability.

Response6：Thank you for your reply. Acoustic source localization in ocean waveguides is often solved with matched-field processing and beamforming algorithms. Machine learning is an alternative approach to the source localization problem that finds features directly from data. The relevant models include decision trees, support vector machines, random forests, and feedforward neural networks. Within a machine learning framework, source localization can be solved as a classification or a regression problem.

In our research, we compared the characteristics of different machine learning algorithms for acoustic source localization at different SNRs. In classification problems, 1D localization performs slightly better than 2D localization. Meanwhile, SVM and FNN perform significantly better than the DT and RF models at low SNR. However, the convergence and accuracy of the regression task are slightly lower than those of the classification task. Only the SVM regression model at high SNR achieves satisfactory localization. These conclusions provide a reference for choosing among machine learning models for acoustic localization.

In the revised manuscript, we also compare the different acoustic localization algorithms and analyze their performance.
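
The classification and regression formulations mentioned in this response can be sketched as follows. This is a minimal illustration only: the range limits, the number of classes, and the function names are hypothetical and are not taken from the manuscript.

```python
# Hypothetical range grid; the manuscript's actual label design is not given here.
R_MIN_KM = 1.0   # assumed minimum source range
R_MAX_KM = 5.0   # assumed maximum source range
N_CLASSES = 40   # assumed number of range bins


def range_to_class(r_km: float) -> int:
    """Classification target: index of the range bin that contains r_km."""
    if not R_MIN_KM <= r_km <= R_MAX_KM:
        raise ValueError("range outside the assumed grid")
    width = (R_MAX_KM - R_MIN_KM) / N_CLASSES
    # Clamp so that r_km == R_MAX_KM falls in the last bin.
    return min(int((r_km - R_MIN_KM) / width), N_CLASSES - 1)


def class_to_range(k: int) -> float:
    """Map a predicted class index back to the centre of its range bin."""
    width = (R_MAX_KM - R_MIN_KM) / N_CLASSES
    return R_MIN_KM + (k + 0.5) * width


def range_to_regression_target(r_km: float) -> float:
    """Regression target: the range itself, scaled to [0, 1]."""
    return (r_km - R_MIN_KM) / (R_MAX_KM - R_MIN_KM)
```

In the classification case the resolution is limited by the bin width, whereas the regression target is continuous; a 2D (distance + depth) variant would need a second grid or a second output.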

Comments7：Please, explain further why covariance matrices are used instead of direct pressure signals.

Response7：Thank you for your reply. To reduce the influence of the acoustic source phase, a normalized sample covariance matrix (SCM) of the complex acoustic pressure is adopted, which is a conjugate symmetric matrix averaged over N snapshots: (Equation 3 in the manuscript)

Here, the superscript H denotes the conjugate transpose operator, and the pressure vector in each term is the complex acoustic pressure data over the i-th snapshot. The signal term of this matrix contains the product of the sound source terms, which is dominant at large SNR, thereby reducing the influence of the source phase.

Preprocessing the data according to the related equation ensures that the Green’s function is used for localization. To save memory and improve calculation speed, only the real and imaginary parts of the complex-valued entries on the diagonal and in the upper triangle of C(f) are used as input. These entries are vectorized to form the real-valued input x of size L×(L+1) for the FNN, SVM, and RF.
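
A minimal NumPy sketch of the preprocessing described in this response is given below. It assumes each snapshot is normalized to unit norm before averaging (Equation 3 is not reproduced in this response, so the exact normalization may differ); the function name and array layout are illustrative assumptions.

```python
import numpy as np


def scm_feature_vector(snapshots: np.ndarray) -> np.ndarray:
    """Build a real-valued feature vector from complex pressure snapshots.

    snapshots: complex array of shape (n_snapshots, L), one row per snapshot
    of the pressure at a single frequency across L sensors.
    """
    # Normalize each snapshot to unit norm (assumed normalization step).
    norms = np.linalg.norm(snapshots, axis=1, keepdims=True)
    p = snapshots / norms

    # Sample covariance matrix: average of the outer products p_i p_i^H.
    n_snapshots, n_sensors = p.shape
    scm = p.T @ p.conj() / n_snapshots  # shape (L, L), conjugate symmetric

    # Keep the diagonal and upper triangle: L(L+1)/2 complex entries.
    rows, cols = np.triu_indices(n_sensors)
    upper = scm[rows, cols]

    # Concatenate real and imaginary parts: L(L+1) real values.
    return np.concatenate([upper.real, upper.imag])
```

Multiplying the snapshots by a unit-magnitude phase factor leaves this feature vector unchanged, which is the sense in which the covariance representation reduces the influence of the source phase.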

Comments8：Please, include a table summarizing key parameters of each model. What kernel was used in SVM? What architecture and hyperparameters were used in FNN?

Response8：Thank you for your reply. A new table has been added to summarize the key parameters of each model.

Comments9： Please, revise Figure 1. The flow diagrams are clearly different but adding an explanatory label to each one of them will clarify when and how each one is used.

Response9：Thank you for your reply. Figure 1 has been revised.

Comments10：Would it be possible to add confidence intervals to the results? At the moment they do not show much statistical rigor.

Response10：Thank you for your reply. In the revised manuscript, the 95% reliability was set at 96%, and the confidence intervals were set relative to 92%-100% of the true range, above and below the real data, to reduce prediction error. In this case, the predicted range and the ground-truth range are obtained over a total of 100 calculations. The performance of these machine learning algorithms is comparable when solving range estimation as a classification problem. The related results are shown in Table 2 and Table 3.

Comments11： There is no formal discussion of model complexity vs performance.

Response11：To quantify the prediction performance, the mean absolute percentage error (P_MAPE) is calculated based on Equation 12. The results are summarized in Table I for the datasets with different SNRs. The performance of these four machine learning algorithms is comparable when solving range estimation as a classification problem, in both the 1D and 2D cases. The related results are shown in the revised manuscript.
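
For reference, the conventional definition of the mean absolute percentage error can be sketched as below. Equation 12 of the manuscript is not reproduced in this response, so this is the standard form and may differ in detail from the authors' definition; the function name is illustrative.

```python
def mean_absolute_percentage_error(predicted, truth):
    """Standard MAPE in percent; assumes all ground-truth values are nonzero."""
    if len(predicted) != len(truth) or len(truth) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Sum of relative absolute errors |prediction - truth| / |truth|.
    total = sum(abs(p - t) / abs(t) for p, t in zip(predicted, truth))
    return 100.0 * total / len(truth)
```

For example, predictions of 1.1 km and 1.8 km against true ranges of 1.0 km and 2.0 km each give a 10% relative error, so the mean is about 10%.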

Comments12： Figures 3 to 10 should be deeply discussed.

Response12：Thank you for your reply. In the revised manuscript, the results for the 1D and 2D simulation data are analyzed and discussed in depth with respect to model performance, prediction accuracy, and SNR. The comparison of the simulation and experimental results is also discussed in the revised version.

Comments13： Were models cross-validated? How many times was each experiment repeated?

Response13：Thank you for your reply. In the revised manuscript, cross-validation was performed with 100 repeated calculations for the different machine learning models. The performance of the models and the experimental results are analyzed in Section 4.

Comments14：How was Gaussian noise added? Uniformly or per-sample?

Response14：Thank you for your reply. For each normalized frequency-domain sound pressure sample, we add Gaussian white noise at a specific signal-to-noise ratio three separate times (this can be implemented using the awgn function in MATLAB). The result of each noise addition is taken as a snapshot, and the average over these three snapshots is taken as the covariance matrix of the sound pressure sample. When processing simulation data, the benefits of this operation include not only eliminating the influence of the sound source phase but also making it easier for the machine learning model to extract features.
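
The procedure described in this response might look like the following NumPy sketch. It assumes complex Gaussian noise whose power is set from the measured power of the sample, and an SNR given in dB; the response does not state these details, and the function names are illustrative rather than the authors' code.

```python
import numpy as np


def add_complex_awgn(p: np.ndarray, snr_db: float, rng: np.random.Generator) -> np.ndarray:
    """Add complex white Gaussian noise to p at the requested SNR (dB).

    Signal power is measured from p itself, similar in spirit to MATLAB's
    awgn(x, snr, 'measured'); whether the authors used that option is not stated.
    """
    signal_power = np.mean(np.abs(p) ** 2)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    # Split the noise power equally between real and imaginary parts.
    scale = np.sqrt(noise_power / 2.0)
    noise = scale * (rng.standard_normal(p.shape) + 1j * rng.standard_normal(p.shape))
    return p + noise


def noisy_scm(p: np.ndarray, snr_db: float, n_snapshots: int = 3, seed: int = 0) -> np.ndarray:
    """Average the outer products of independently noised copies of p."""
    rng = np.random.default_rng(seed)
    p = p / np.linalg.norm(p)  # normalized pressure sample
    scm = np.zeros((p.size, p.size), dtype=complex)
    for _ in range(n_snapshots):
        snapshot = add_complex_awgn(p, snr_db, rng)
        scm += np.outer(snapshot, snapshot.conj())
    return scm / n_snapshots
```

Each call to `add_complex_awgn` draws an independent noise realization, so the three snapshots differ only in their noise, in line with the per-sample noise addition described above.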

Comments15： Experimental results are briefly discussed and lack depth. Model performance on real data is not compared to simulated data. Regression is ignored in real data.

Response15：In the experimental section, the processing of the experimental data by the machine learning models is analyzed in detail in Section 4.1, and the related results are discussed in Section 4.2. The results are also compared with the simulation results. Only classification was performed on the experimental data, because the analysis of the simulation dataset showed that classification methods perform better than regression; the regression model is therefore omitted for the experimental data.

Comments16： Why was the 40–75 minute segment chosen from SWellEx-96?

Response16：Thank you for your reply. The horizontal distance between the acoustic source and the vertical array first decreases and then increases, reaching its minimum at around 60 minutes. We therefore use the data from 40 minutes to 75 minutes: the data from 2400 s to 3580 s serve as the training set, and the data from 3580 s to 4500 s serve as the test set.
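
The split described in this response can be sketched as below. Assigning the boundary sample at 3580 s to the test set is an assumption, since the response lists 3580 s in both intervals; the function name and argument layout are hypothetical.

```python
def split_by_time(times_s, samples, start_s=2400.0, split_s=3580.0, end_s=4500.0):
    """Split samples into training and test sets by timestamp (seconds)."""
    train, test = [], []
    for t, x in zip(times_s, samples):
        if start_s <= t < split_s:
            train.append(x)   # 2400 s up to (not including) 3580 s
        elif split_s <= t <= end_s:
            test.append(x)    # 3580 s through 4500 s (boundary assumed here)
        # Samples outside the 40-75 minute segment are discarded.
    return train, test
```

Because the split is chronological rather than random, the test set covers a later part of the source track than the training set.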

Comments17： The conclusions summarize results without deeper interpretation. Please, reframe the section to focus on scientific contributions and impact.

Response17：Thank you for your reply. In the revised version, the conclusion has been expanded. An approach to acoustic source localization in ocean waveguides within a machine learning framework is presented, which is useful for real underwater source localization. The prediction performance is analyzed, and the performance of the different algorithms is compared at different SNRs for both 1D and 2D localization, which provides a reference for selecting a model with good prediction performance. In addition, the results show that classification methods perform better than regression.

Reviewer 2 Report

Comments and Suggestions for Authors

Dear Authors,

After a careful reading of your manuscript, I would like to congratulate you on the clarity and coherence of your work. The paper has been prepared with due attention to editorial and structural details. The English language used is generally clear and comprehensible, though some issues noted below require revision.

The submitted manuscript addresses an important and timely topic – the automatic detection and interpretation of acoustic emission sources. Echolocation plays a vital role in contemporary remote sensing applications and constitutes a significant data source in hydroacoustic and environmental monitoring. Although this subject has been explored in numerous previous studies, the ongoing advancement of sensing technologies and algorithmic approaches clearly justifies further investigation.

Despite its strengths, the manuscript contains several formal issues that must be addressed before it can be considered for publication. Please consider the following detailed comments:

Lines 5–9: Affiliations are presented too generically. Please avoid placeholders like “Affiliation 1,2,3” – this section should provide the full institutional information.
Line 37 and following: Improper citation format – multiple references should be included within a single square bracket, placed before the sentence-ending punctuation. Similar issues appear in lines 48, 56, 67, etc.
Citing authors: Do not include initials or first names when referencing authors in-text – use last names only. This applies to lines 69, 74, 79, 83, and throughout the rest of the article.
Lines 121 and 123: Incorrect bullet formatting (unnecessary period); also, check the entire manuscript for missing spaces (e.g., lines 124, 125).
Section 2: While the background information is relevant, it largely represents established knowledge. Please clearly state that this section serves as an introductory context and support it with appropriate references.
From line 149: All equations should be center-aligned and numbered using standard parentheses formatting.
Line 187: Provide a proper citation for the stated information.
Lines 248–250: Clarify whether this is an original method or a modified version of an existing algorithm.
Line 253: Expand the acronym KRAKEN and provide a literature reference or a valid URL.
Line 263: Explain the rationale behind the selection of the specific dataset.
Figure 2 description: Too general; the figure itself also lacks clarity.
Line 311: A sub-section title appears at the bottom of the page, with content starting on the next – this needs correction.
Figures 3 c,d; 4 b,c,d; 5 a,b,c,d: The legend includes the term “real”, which is not visually represented. Titles of these figures are also too vague.
Line 346: Figure caption is located on a different page than the figure itself.
Figures 8–10: Difficult to interpret due to generic descriptions and low visual clarity.
Line 392: Another instance of a figure and its caption being separated across pages.
Figure 12: Elements are too small to be legible.
Language note: The verb “to adopt” is consistently used where “to adapt” is contextually appropriate, leading to semantic inaccuracies. Please revise all such occurrences.

Once these revisions have been incorporated, the manuscript can be resubmitted for further review. I wish you success in your continued work on this important topic.

Author Response

Comments1：Lines 5–9: Affiliations are presented too generically. Please avoid placeholders like “Affiliation 1,2,3” – this section should provide the full institutional information.

Response1：Thank you for your comments. We have made corresponding revisions.

Comments2：Line 37 and following: Improper citation format – multiple references should be included within a single square bracket, placed before the sentence-ending punctuation. Similar issues appear in lines 48, 56, 67, etc.

Response2：Thank you for your comments. We have made corresponding revisions.

Comments3：Citing authors: Do not include initials or first names when referencing authors in-text – use last names only. This applies to lines 69, 74, 79, 83, and throughout the rest of the article.

Response3：Thank you for your comments. We have made corresponding revisions.

Comments4：Lines 121 and 123: Incorrect bullet formatting (unnecessary period); also, check the entire manuscript for missing spaces (e.g., lines 124, 125).

Response4：Thank you for your comments. We have made corresponding revisions.

Comments5：While the background information is relevant, it largely represents established knowledge. Please clearly state that this section serves as an introductory context and support it with appropriate references.

Response5：Section 2 presents localization based on machine learning. It includes Data Preprocessing and Label Selection, Selection of Sample Labels, and the Algorithm Flow of a Typical Machine Learning Model. Part of the theoretical explanation has been simplified in the revised manuscript.

Comments6：From line 149: All equations should be center-aligned and numbered using standard parentheses formatting.

Response6：Thank you for your comments. We have made corresponding revisions.

Comments7：Line 187: Provide a proper citation for the stated information.

Response7：Thank you for your comments. We have made corresponding revisions.

Comments8：Lines 248–250: Clarify whether this is an original method or a modified version of an existing algorithm.

Response8：Thank you for your comments. This work focuses on classification and regression approaches to acoustic source localization, which have not yet been studied systematically. Previous research focused on only one approach with different machine learning models. In our research, we use less data with multiple algorithms to compare their performance. In addition, in data preprocessing, label selection, and machine learning model parameter selection, we also made some new settings to adapt to specific underwater acoustic signal processing with different machine learning models.

Comments9：Line 253: Expand the acronym KRAKEN and provide a literature reference or a valid URL.

Response9：Thank you for your comments. The literature reference is added: PORTER M B. The KRAKEN normal mode program[R]. Naval Research Lab Washington DC, 1992.

Comments10：Line 263: Explain the rationale behind the selection of the specific dataset.

Response10：Thank you for your comments. The horizontal distance between the acoustic source and the vertical array first decreases and then increases, reaching its minimum at around 60 minutes. We therefore use the data from 40 minutes to 75 minutes: the data from 2400 s to 3580 s serve as the training set, and the data from 3580 s to 4500 s serve as the test set. In this way, we obtain acoustic source signals that contain the corresponding features.

Comments11：Figure 2 description: Too general; the figure itself also lacks clarity.

Response11：Thank you for your comments. We have made corresponding revisions.

Comments12：Line 311: A sub-section title appears at the bottom of the page, with content starting on the next – this needs correction.

Response12：Thank you for your comments. We have made corresponding revisions.

Comments13：Figures 3 c,d; 4 b,c,d; 5 a,b,c,d: The legend includes the term “real”, which is not visually represented. Titles of these figures are also too vague.

Response13：Thank you for your comments. We have made corresponding revisions.

Comments14：Line 346: Figure caption is located on a different page than the figure itself.

Response14：Thank you for your comments. We have made corresponding revisions.

Comments15：Figures 8–10: Difficult to interpret due to generic descriptions and low visual clarity.

Response15：Thank you for your comments. We have made corresponding revisions.

Comments16：Line 392: Another instance of a figure and its caption being separated across pages.

Response16：Thank you for your comments. We have made corresponding revisions.

Comments17：Figure 12: Elements are too small to be legible.

Response17：Thank you for your comments. We have made corresponding revisions.

Comments18：Language note: The verb “to adopt” is consistently used where “to adapt” is contextually appropriate, leading to semantic inaccuracies. Please revise all such occurrences.

Response18：Thank you for your comments. All such occurrences have been corrected in the revised version.

Reviewer 3 Report

Comments and Suggestions for Authors

The manuscript includes a framework using classical methods; the authors must highlight the main contributions of this numerical study. The real challenge lies in using real data in the Underwater Acoustic Source Localization problem.
A section for "Related Works" must be included to compare the most closely related methods in a table.
Figure 1 presents a framework where the outputs are not included.
After presenting the equations, the indentation is regularly omitted to proceed with describing all formula parameters.
Figure 2 should be substantially improved.
Images should be enhanced to improve the quality of information. Additionally, most captions are superficial, and sub-images should also be noted.
The experimental results lack a proper discussion; please include a subsection focused exclusively on discussion.
The bibliographical references must be updated, selecting references from the last five years and high-impact journals.

Comments on the Quality of English Language

The manuscript requires deep proofreading.

Author Response

Comments1：A section for "Related Works" must be included to compare the most closely related methods in a table.

Response1：Thank you for your reply.

(1) We systematically compared the performance of machine learning algorithms in acoustic source localization applications.

(2) Within a unified framework, we comprehensively evaluated the performance of four classical machine learning models (DT, RF, SVM, FNN) in underwater acoustic localization. Based on a simulation environment similar to the SWellEx-96 experimental environment, we performed acoustic source localization on one-dimensional (distance) and two-dimensional (distance + depth) simulated datasets through classification and regression tasks under different signal-to-noise ratios (SNR = 2, 5, 10).

(3) Furthermore, this study provides standardized data processing models and, through simulated and field data (SWellEx-96 experiment), validates the feasibility of using machine learning to replace traditional physics-based models (e.g., Matched Field Processing, MFP) in complex marine environments, thereby offering practical guidelines for algorithm selection and performance boundaries for engineering applications.

Comments2：Figure 1 presents a framework where the outputs are not included.

Response2：Thank you for your reply. In the revised manuscript, we modified the framework of the machine learning algorithm.

Comments3：After presenting the equations, the indentation is regularly omitted to proceed with describing all formula parameters.

Response3：Thank you for your reply. We have made corresponding revisions.

Comments4：Figure 2 should be substantially improved.

Response4：Thank you for your reply. We have made corresponding revisions.

Comments5：Images should be enhanced to improve the quality of information. Additionally, most captions are superficial, and sub-images should also be noted.

Response5：Thank you for your reply. We have made corresponding revisions.

Comments6：The experimental results lack a proper discussion; please include a subsection focused exclusively on discussion.

Response6：Thank you for your reply. In the revised manuscript, the processing of the experimental data by the machine learning models is analyzed in detail in Section 4.1. The related results are discussed in Section 4.2.

Comments7：The bibliographical references must be updated, selecting references from the last five years and high-impact journals.

Response7：Thank you for your reply. We have made corresponding revisions.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The authors made the requested changes satisfactorily.

Reviewer 2 Report

Comments and Suggestions for Authors

Dear Authors,

Thank you very much for your revised manuscript version. I have read it carefully and checked all the places that have been discussed. I also appreciate your responses to my comments. I can see that you made a big effort to improve your text, and now, in my opinion, it looks much more transparent. I only have one concern: please check the fonts, especially in the formulas. I am not sure whether they are entirely in line with the journal requirements. Please clarify that with the editor.
As I cannot identify any more important mistakes, I recommend your text for publication after the necessary editorial checks. I wish you good luck with your further studies!

Reviewer 3 Report

Comments and Suggestions for Authors

The authors have addressed all my previous remarks satisfactorily. I have no further comments; therefore, the manuscript is recommended for publishing.