Article
Peer-Review Record

From Camera Image to Active Target Tracking: Modelling, Encoding and Metrical Analysis for Unmanned Underwater Vehicles†

by Samuel Appleby *, Giacomo Bergami and Gary Ushaw
Reviewer 1: Anonymous
Reviewer 2:
Reviewer 3: Anonymous
Submission received: 24 February 2025 / Revised: 22 March 2025 / Accepted: 31 March 2025 / Published: 7 April 2025

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

To enhance the manuscript’s academic rigor and impact, the authors should comprehensively revise the Introduction, Related Work, and Conclusion sections by explicitly articulating research objectives, contextual motivations, and theoretical/practical significance at the outset, while concisely outlining methodological frameworks and original contributions. The Introduction must systematically address the study’s rationale, current research gaps (particularly unresolved challenges from 2023-2024 literature, such as doi.org/10.62762/TIS.2024.136895, doi.org/10.62762/TSCC.2024.212751, and doi.org/10.62762/TSCC.2024.989358), and how findings advance beyond existing knowledge. The literature review requires expanded citations to contemporary works and clearer linkages to how this research addresses identified deficiencies. Methodological transparency should permeate all sections, emphasizing SWiMM2.0 architecture’s technical novelty through systematic contrasts with conventional approaches, supported by rigorous benchmarking against state-of-the-art ML/DL models in accuracy, F1-score, and computational efficiency, complemented by ablation studies. Data provenance must clarify whether datasets are original or benchmark-derived, with ethical protocols specified where applicable. Finally, the manuscript should streamline redundant methodological descriptions, consolidate overlapping discussions in Results/Analysis, and prioritize critical insights over peripheral details to amplify scholarly focus and readability.

Comments on the Quality of English Language

The English still needs to be improved.

Author Response

All authors thank the reviewer for the suggestions, constructive comments and improvements, which we hope have strengthened the paper. We have tried to address all of the reviewer’s comments to the best of our ability.

Our responses are attached.

Author Response File: Author Response.pdf

Reviewer 2 Report

Comments and Suggestions for Authors

By integrating a Unity simulation model of the BlueROV underwater vehicle with a deep reinforcement learning (DRL) backend, this paper proposes SWiMM2.0, a framework for autonomous underwater vehicle navigation and active target tracking. It improves the generalization of control algorithms across different underwater environments by using a cross-modal variational autoencoder (CMVAE) for dimensionality reduction while minimizing information loss. These methods make SWiMM2.0 innovative and give it better evaluation and optimization capabilities. However, the paper has the following deficiencies:

  1. Relying on simulation fidelity to ensure a transition from simulation to real life can cause generalization problems and hinder its performance in real-world applications. Although the CMVAE method enhances generalization, its ability to bridge the complexity of the mapping between simulation and reality requires effective validation.
  2. Focusing solely on visual data may overlook other sensory factors, such as sonar or environmental sensors, which should be addressed in the summary section.
  3. An evaluation of system performance should be provided for unpredictable underwater scenarios, such as dynamic or complex operating environments.
  4. The abstract needs further revision; it is suggested that it be restructured around purpose, method, and conclusion. In addition, grammatical and lexical errors should be carefully checked and corrected.
  5. A detailed explanation of Figure 1 is recommended.
  6. In section 1.3, what are the improvements in SWiMM2.0 compared to SWiMM1.0? What issues did you fix in the old version?
  7. How can the rapidity of the training algorithm proposed in this paper be verified? Specific comparative data should be provided.
  8. The parameter symbol of Camera Width should be given in Eq. (3).

Comments on the Quality of English Language

Some grammatical and lexical errors should be carefully checked and corrected.

Author Response

All authors thank the reviewer for the suggestions, constructive comments and improvements, which we hope have strengthened the paper. We have tried to address all of the reviewer’s comments to the best of our ability.

Our responses are attached.

Author Response File: Author Response.pdf

Reviewer 3 Report

Comments and Suggestions for Authors

The paper deals with the development of target tracking algorithms based on data provided by imaging equipment, such as video cameras. The current state of the art in the field is presented accurately and is well documented by the authors. The elements that can currently be improved are also presented. In this regard, the authors propose their own approach, combining analytical modeling elements with the use of optimization methods and algorithms. Numerical and graphical simulation results are presented. The main criticism that can be made is that the presented research is done "inside the box": an optimization is attempted using the same elements (video camera, optimization algorithms, more or less analytical methods, etc.). In this situation, the results obtained may differ from previous ones, but only within a relatively small margin. To conclude, I consider that a greater use of artificial intelligence elements, such as using video camera data to train machine learning algorithms, would bring more benefit to the final result than the methods already used. I therefore recommend that the authors review the paper and make at least a few comments regarding the use of ML algorithms in solving the problem.

Author Response

All authors thank the reviewer for the suggestions, constructive comments and improvements, which we hope have strengthened the paper. We have tried to address all of the reviewer’s comments to the best of our ability.

Our responses are attached.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

I hereby confirm that no further substantive revisions are warranted from my perspective. The current iteration of the work satisfies all academic criteria for rigor and originality within this discipline. Please consider this review cycle concluded on my end.

Comments on the Quality of English Language

The English could be improved to more clearly express the research.
