Peer-Review Record

NMS-Free Oriented Object Detection Based on Channel Expansion and Dynamic Label Assignment in UAV Aerial Images

Remote Sens. 2023, 15(21), 5079; https://doi.org/10.3390/rs15215079
by Yunpeng Dong 1, Xiaozhu Xie 2,*, Zhe An 3, Zhiyu Qu 1, Lingjuan Miao 1 and Zhiqiang Zhou 1
Submission received: 10 June 2023 / Revised: 1 September 2023 / Accepted: 21 September 2023 / Published: 24 October 2023
(This article belongs to the Special Issue Pattern Recognition in Remote Sensing II)

Round 1

Reviewer 1 Report

The paper is novel, and it could be accepted.

If possible, the English could be improved.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The manuscript is a well-structured and professionally presented research paper in general. The research questions, investigation methods, procedures, main contributions, and results are clearly articulated.

Regarding acronyms, it is recommended to spell each one out at its first appearance in the manuscript and to use only the acronym thereafter. Additionally, the name DynamicOTA does not seem to align with the “dynamic label assignment strategy.”

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 3 Report

This manuscript proposes an improved oriented object detection method that improves both speed and accuracy by introducing RE-Net, DynamicOTA and SSM on top of the YOLOX algorithm. The study experimentally demonstrates the effectiveness of these techniques and verifies the superiority of the proposed method by comparing it with other methods. However, some questions remain.

1. In Figure 16, the second and third rows demonstrate the superiority of the DynamicOTA method, while the first row shows the method missing detections. This is the opposite of the description in the text and needs to be corrected.

2. In Table 7, the detection results of different detectors on the DOTA dataset are compared. Images of the results produced by the different algorithms for different categories could be added and analyzed to illustrate the superiority of the proposed algorithm.

No comment!

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 4 Report

The paper presents a fast object detection method for aerial images. It proposes an anchor-free, single-stage oriented bounding box detector that can optionally be NMS-free for devices with limited resources. It additionally improves training by changing how positive samples are selected during the training process. The paper is well-written and well-structured and provides a detailed description of the steps, which is beneficial even to researchers new to this field.

Here are a few suggestions on how to enhance the quality of the paper further:

- There are a few hardcoded parameters introduced in the method (e.g., "2.5" in D_sample, "50", "10/ln(x+1)"). It would be beneficial to see how these parameters were chosen. Were they set purely empirically, or were there specific computations behind the assigned values?

- Figure captions could be more descriptive. Currently, the images can be understood only by carefully reading the text; it would be better if the captions were self-descriptive and reduced the dependence on the text. Furthermore, the details of some images are really hard to see in print, and a good caption can make the figure more useful to the reader.

- Given the claims of faster convergence and better training, I would be interested to know how long the proposed network takes to train compared to the other methods. I understand that the focus is generally on inference speed, but information on training speed is also important to many researchers.

- The paper uses two different metric terms. "mAP50:95" is a well-known metric introduced by COCO. However, the "mAP" metric used in almost all results is a bit unclear: "mAP" is usually measured at a specific IoU threshold, but no threshold is mentioned here. Is it mAP@50 (i.e., at an IoU threshold of 0.5), or is some other threshold used?

- Please explain the abbreviations used for the categories in Figure 7.

- I'm curious why the sets of methods compared on the DOTA and HRSC2016 datasets differ. For example, Oriented R-CNN achieves the best mAP on HRSC2016 but is not reported on DOTA. It would be better if at least the top performers were reported for both datasets.
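
The distinction raised in the metric comment above can be made concrete with a small sketch. It is an illustration only, not the evaluation code used in the paper: the function names `average_precision` and `ap50_and_ap50_95` are hypothetical, it handles a single class (mAP would average the result over classes), and it assumes each detection already carries its IoU with a distinct matched ground-truth box, which for oriented boxes would be a rotated-box IoU computed elsewhere.

```python
def average_precision(scores, ious, num_gt, iou_threshold):
    """AP for one class at a single IoU threshold.

    Assumption (simplification): ious[i] is the IoU between detection i and
    its own matched ground-truth box, so duplicate matches are not handled.
    """
    # Rank detections by descending confidence.
    order = sorted(range(len(scores)), key=lambda i: -scores[i])

    true_positives = 0
    points = []  # (recall, precision) after each ranked detection
    for rank, i in enumerate(order, start=1):
        if ious[i] >= iou_threshold:
            true_positives += 1
        points.append((true_positives / num_gt, true_positives / rank))

    # Area under the precision envelope (all-point interpolation):
    # at each recall step, use the best precision reached at that recall or later.
    ap = 0.0
    prev_recall = 0.0
    for k, (recall, _) in enumerate(points):
        if recall > prev_recall:
            best_precision = max(p for _, p in points[k:])
            ap += (recall - prev_recall) * best_precision
            prev_recall = recall
    return ap


def ap50_and_ap50_95(scores, ious, num_gt):
    """Return (AP at IoU 0.5, mean AP over IoU 0.50, 0.55, ..., 0.95)."""
    # Integer arithmetic avoids floating-point drift in the threshold list.
    thresholds = [(50 + 5 * k) / 100 for k in range(10)]
    aps = [average_precision(scores, ious, num_gt, t) for t in thresholds]
    return aps[0], sum(aps) / len(aps)


# Hypothetical toy data: three detections, three ground-truth boxes.
ap50, ap50_95 = ap50_and_ap50_95(
    scores=[0.9, 0.8, 0.7], ious=[0.92, 0.55, 0.30], num_gt=3
)
print(ap50, ap50_95)
```

Under these assumptions, a detector whose boxes overlap the ground truth only loosely can score well at the 0.5 threshold yet much lower on the 0.50:0.95 average, which is why stating the threshold matters when results are compared.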

 

The overall writing quality is good. There are just a few minor grammatical issues that do not break the reading flow in any way but can be fixed with a round of proofreading.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
