Object Detection Using Improved Bi-Directional Feature Pyramid Network
Round 1
Reviewer 1 Report
In this manuscript, Quang et al. propose an improved Bi-directional Feature Pyramid Network for object detection. The authors provide quality work with significant novelty that is clearly presented, at least for experts. However, I still have some specific questions. I strongly believe that the manuscript will benefit from addressing these issues and revisiting certain figures and paragraphs, and that it will become easier to follow for both general and expert readers.
- Fig. 1 is a good way of presenting the architecture and giving examples of certain steps. However, the examples could and should be improved. The authors demonstrate what MaxPooling layers intuitively do, but they do not take this effort further: none of the exciting novel improvements are exemplified in this way. Is there any way to do so for a limited number of layers?
- In Fig. 1, why is more than one MaxPooling layer applied in succession? Would a single carefully designed layer give the same result?
- Very little information is provided about the MS COCO and PASCAL VOC datasets. For the general readership, this is impossible to follow. It would be great to add a panel showing what the proposed model identified, and how, on a picture from these datasets.
Author Response
I enclose the reply letter.
Author Response File: Author Response.pdf
Reviewer 2 Report
In my opinion, the authors performed interesting and useful research in a promising field. The article is quite well organized, but I noticed some weaknesses:
1. I think the authors should more clearly define the research problem and highlight the main tasks and aims of their work.
2. The authors tried their model on only two datasets. I missed a discussion of the problems, limitations, and difficulties that could appear in real-life applications.
3. In my opinion, the conclusions should be expanded and more strongly supported by experimental results; some qualitative parameters would be useful. I also missed a discussion of possible implementations, plans, or recommendations for future research.
Minor drawbacks:
There are some citation issues: articles are not cited in the order in which they appear in the text.
The picture quality must be improved: some fonts are too small and the annotations are hardly readable, especially in Fig. 2.
Author Response
I enclose the reply letter.
Author Response File: Author Response.pdf
Reviewer 3 Report
In this paper, the authors propose an improved bi-directional feature pyramid network for object detection. They describe the architecture in detail and verify the method on two datasets. They provide quantitative comparisons with previous methods and also test the proposed method in an ablation study. I have some questions about the method and results, and addressing them may help to improve the manuscript.
The improvement seems marginal compared to previous methods (e.g., EFIPNet). In Table 1, EFIPNet actually performs better than the proposed method. Can the authors comment on why the proposed method performs worse in this scenario?
It would also be good if the authors could show some object detection examples from the proposed and previous methods, so that readers can directly see the improvement in detection accuracy.
How long does training take?
Author Response
I enclose the reply letter.
Author Response File: Author Response.pdf
Round 2
Reviewer 1 Report
Based on the reply letter and the revised version, the authors have addressed my concerns and applied my suggestions. Therefore, I recommend accepting the manuscript for publication in Electronics.
Author Response
Thank you very much for your kind consideration and final acceptance.