Article
Peer-Review Record

Sea Surface Object Detection Algorithm Based on YOLO v4 Fused with Reverse Depthwise Separable Convolution (RDSC) for USV

J. Mar. Sci. Eng. 2021, 9(7), 753; https://doi.org/10.3390/jmse9070753
by Tao Liu 1, Bo Pang 1, Lei Zhang 2,*, Wei Yang 3 and Xiaoqiang Sun 1
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3: Anonymous
Submission received: 1 June 2021 / Revised: 4 July 2021 / Accepted: 5 July 2021 / Published: 7 July 2021
(This article belongs to the Section Ocean Engineering)

Round 1

Reviewer 1 Report

The authors propose a small change in the architecture of the Yolov4 object detection model, achieving a reasonable improvement in both precision and inference time for the SeaShips and SeaBuoys datasets. In this regard, the work is relevant in the field of autonomous sea surface vehicles and interesting for the readers of the journal.

 

However, it is difficult for me to assess what the authors' original contribution to the architecture improvement is. I understand that the authors' main claim is to propose a reverse DSC in the backbone and FFN parts of the architecture. However, the phrase “RDSC was proposed…” leads to confusion. Mish activations are already used in the original YOLO v4 paper [20]. As far as I know, “standard” DSC is a 3x3 kernel followed by the pointwise 1x1 kernel [41]. In Fig. 4, the proposed reverse DSC has this order inverted (thus the term “reverse”). Then, in Fig. 5, the “standard ResUnit” has the reverse order and the “improved ResUnit” has the order of standard DSC. Is the “improved ResUnit” something from the state of the art or the contribution of this work?

The same applies to the FFN. In Fig. 6, the architecture BEFORE the improvement is explicitly drawn as 1x1 then 3x3 convolutions, and the architecture AFTER the improvement is just a green rectangle. According to Fig. 4, the structure of the RDSC (green rectangle) is also 1x1 then 3x3 convolutions. This is very confusing to me.

A revised version of this manuscript should clearly indicate which architecture options come from the literature and which specific architectural change is proposed in this work.
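To make the ordering question concrete, the two variants discussed above can be written out as layer sequences. This is only a minimal sketch of the reviewer's reading of Fig. 4 and of standard DSC, not the authors' implementation; the function name `describe_block` and the channel counts are hypothetical and chosen for illustration.

```python
def describe_block(order, c_in, c_out, k=3):
    """Return the layer sequence of a depthwise separable block as strings.

    order="standard": depthwise k x k first, then pointwise 1x1.
    order="reverse":  pointwise 1x1 first, then depthwise k x k
                      (the order the reviewer reads from Fig. 4).
    """
    if order == "standard":
        # The depthwise step filters each channel separately,
        # so the channel count stays at c_in until the pointwise step.
        return [
            f"depthwise {k}x{k}: {c_in} -> {c_in}",
            f"pointwise 1x1: {c_in} -> {c_out}",
        ]
    if order == "reverse":
        # The pointwise step changes the channel count first,
        # so the depthwise step then runs on c_out channels.
        return [
            f"pointwise 1x1: {c_in} -> {c_out}",
            f"depthwise {k}x{k}: {c_out} -> {c_out}",
        ]
    raise ValueError("order must be 'standard' or 'reverse'")


for order in ("standard", "reverse"):
    print(order, describe_block(order, 64, 128))
```

Written this way, the only difference between the two variants is which step sees the input channel count, which is why the figures need to state the order unambiguously.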

Additionally, if the gain in FPS is due to a reduced number of weights, the number of weights with the proposed changes should be compared with the original number.
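The kind of comparison requested here can be sketched with a simple weight count per layer. The example below is an assumption-laden illustration, not a count for the authors' network: the helper names and the channel sizes (256 in, 512 out) are hypothetical, and biases and batch-normalization parameters are ignored.

```python
def conv_weights(c_in, c_out, k=3):
    """Weights of a standard k x k convolution (no bias)."""
    return k * k * c_in * c_out


def dsc_weights(c_in, c_out, k=3):
    """Depthwise k x k on c_in channels, then pointwise 1x1 to c_out."""
    return k * k * c_in + c_in * c_out


def reverse_dsc_weights(c_in, c_out, k=3):
    """Pointwise 1x1 to c_out first, then depthwise k x k on c_out channels."""
    return c_in * c_out + k * k * c_out


c_in, c_out = 256, 512  # illustrative channel sizes only
print("standard conv:", conv_weights(c_in, c_out))         # 1179648
print("DSC:          ", dsc_weights(c_in, c_out))          # 133376
print("reverse DSC:  ", reverse_dsc_weights(c_in, c_out))  # 135680
```

For these illustrative sizes, both separable variants need roughly a ninth of the weights of the standard convolution, and the two orderings differ only in the depthwise term. A table of this kind for the actual layers would show whether the FPS gain follows from the reduced weight count.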

I don’t understand the sentence “From the perspective of the overall structure of the ResUnit, RDSC is kept as consistent as possible with the original structure during application…”. What does “application” mean here?

Although the term RDSC is clear from the text, the acronym should be defined the first time it appears.

The paper is based on YOLO v4, and the term “yolo v4” appears 54 times, yet the original paper proposing this network [20] is not explicitly cited. I think that reference [20] should be cited somewhere, for example at the beginning of Section 2.1.

Finally, there is no information about the implementation of the network. The original implementation is based on Darknet in C++, but the authors mention TensorFlow 2.1. I am not asking the authors to publish the code, but more information about the implementation should be included. For example, if the modifications have been applied to an already published TensorFlow implementation available on GitHub, the authors should give credit to it.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

-The paper should be interesting.
-Please add a block diagram of the proposed research.
-Please add photos of the measurements and sensors.
-Add arrows, axes, and labels indicating what is what.
-Please add 2-3 photos of the application of the proposed research.
-Formulas and fonts should be formatted.
-Figures should be of high quality.
-Figure labels should be added and should be bigger.
-Please add labels and SI units, if any.
-References from 2019-2021 indexed in Web of Science should make up about 50% or more, 30-40 at least; show new knowledge.
-The authors should present new knowledge, for example thermal imaging vs. the proposed method, with advantages/disadvantages.

1)
Ventilation Diagnosis of Angle Grinder Using Thermal Imaging. Sensors 2021, 21, 
https://doi.org/10.3390/s21082853

-Is it possible to use the proposed methods for other problems?
-Conclusion: point out what you have done.
-Please add information about future analysis.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

Thank you for this interesting paper on sea surface object detection.

I have just a few comments:

The first two paragraphs of the introduction lack supporting references. It would be good to include appropriate literature.

Your method is described conceptually in section 3. However, I think it would be of great interest to the informed reader to see what is happening mathematically.

Unfortunately, the conclusions are quite brief and do not include an outlook. What are the next steps you are planning in light of the results achieved?

Finally, I would suggest that the paper include a link to a repository with your code so that other scientists can make use of it. Or you could state that it is available from the authors upon request.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 2 Report

-It is a good idea to add some results to the Conclusion section, for example the accuracy/percentages of the proposed method.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
