Peer-Review Record

A Cross-View Image Matching Method with Feature Enhancement

Remote Sens. 2023, 15(8), 2083; https://doi.org/10.3390/rs15082083
by Ziyu Rao 1, Jun Lu 1,*, Chuan Li 2 and Haitao Guo 1
Reviewer 1: Anonymous
Reviewer 2:
Submission received: 6 March 2023 / Revised: 7 April 2023 / Accepted: 13 April 2023 / Published: 14 April 2023

Round 1

Reviewer 1 Report

To address the cross-view image matching problem, this paper proposes an efficient method that combines a polar coordinate transformation with deep neural networks equipped with cross-convolution and feature fusion modules. The manuscript is well written, apart from some details that need clarification and some minor mistakes to be corrected.

(1) In Sec 3.1, the authors state that “experiments were conducted on four widely used publicly available datasets”; however, one of the datasets, namely CVUSA-, is nowhere to be found in the subsequent experiments.

(2) In Sec 4.1, only the CVUSA dataset is used to train the model. What are the reasons? Is it to test the generality of the model in cross-dataset applications? Why not train different models on the four individual datasets and evaluate the models correspondingly?

(3) As to the comparison methods, Lines 319-321 state “we used open source codes to retrain their models under the same experimental conditions”. However, what are the specific models (they seem to be CVM and DSM)? Why were these models selected? Please give further explanations.

(4) The references in Table 2 are not correct. Besides, the metric r@1% is not introduced; it differs from r@K. (A sketch of how these metrics are typically computed is given after this comment list, for reference.)

(5) In Sec 4.2, the authors explain that adding the cross-convolution or feature fusion module alone has a “more general effect” on improving the model. This description is not clear enough: how should a “general effect on improving the model” be understood? From Table 3, adding cross conv alone (i.e., Polar+Res+Cross conv) did not improve the performance of Polar+Res; the values fall rather than rise. Please give an explanation.

(6) Also in Table 3, does Polar+Res+Cross conv+FFM correspond to “Ours”? If so, please remark on it. Since the results of Polar+VGG+Cross conv+FFM are given, it would be better to also give the results of Polar+VGG+Cross conv and Polar+VGG+FFM, so that the significance of Cross conv and FFM can be seen more fully.

(7) Moreover, please give a description of the abbreviation “Res” in Table 3.

(8) In the Conclusions, the statement “However, it is more common that the panoramic and ground-aerial images with non-corresponding ground-aerial centers cannot be acquired in real scenarios.” seems to contradict the description in the surrounding context.

(9) Some typos, such as the one in Line 185 (“.....feature availability,.”), should be corrected.
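
For reference regarding comment (4): in the cross-view geo-localization literature, r@K usually denotes top-K recall, i.e., the fraction of query images whose ground-truth match is ranked within the top K retrieved gallery candidates, while r@1% denotes recall within the top 1% of the gallery size. Below is a minimal Python sketch of this common computation, assuming a precomputed query-gallery similarity matrix with matched pairs on the diagonal; the function and variable names are illustrative, not taken from the paper.

    import numpy as np

    def recall_at_k(sim: np.ndarray, k: int) -> float:
        """Top-K recall for a similarity matrix where sim[i, j] scores query i
        against gallery item j and the true match of query i is item i."""
        n = sim.shape[0]
        true_scores = sim[np.arange(n), np.arange(n)]
        # Rank of the true match = number of gallery items scored strictly higher.
        ranks = (sim > true_scores[:, None]).sum(axis=1)
        return float((ranks < k).mean())

    def recall_at_1_percent(sim: np.ndarray) -> float:
        # r@1%: recall within the top 1% of the gallery size (at least 1 item).
        k = max(1, int(np.ceil(sim.shape[0] * 0.01)))
        return recall_at_k(sim, k)

    # Toy usage: 1000 queries against a 1000-item gallery.
    rng = np.random.default_rng(0)
    sim = rng.standard_normal((1000, 1000))
    sim[np.arange(1000), np.arange(1000)] += 3.0   # boost the true matches
    print(recall_at_k(sim, 1), recall_at_1_percent(sim))

Under this convention, r@K is monotonically non-decreasing in K, and r@1% simply instantiates r@K with K tied to the gallery size, which is why it needs a separate introduction.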

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Reviewer 2 Report

The manuscript "A Cross-View Image Matching Method with Feature Enhancement" proposes a cross-view image matching method based on deep learning. It uses feature enhancement and feature fusion strategies to overcome the domain gap between satellite images and ground images. The organization and language expression of the manuscript are both poor. However, considering that the application scenarios are forward-looking, the method is innovative, and the experimental design is relatively adequate, I suggest the authors make major improvements. Here are some comments.

1. It is recommended to add a related work section, which could be elaborated from the following aspects: single-view image matching and cross-view image matching.

2. Expand the Introduction section, focusing on the motivation of the paper. Expand the main contributions section, add details, and emphasize the core contribution of the paper.

3. In Section 2.1, is the polar coordinate transformation an original method of this article? If not, just briefly introduce its usage. (A sketch of the usual formulation from prior work is given after this comment list, for reference.)

4. In Section 2.2, the article states that having many aerial targets affects scene extraction. However, in Section 2.2.1, only an edge enhancer is used to eliminate this information. It is unclear why these blocks can eliminate the effect; please explain in detail. The authors could add some intermediate results to prove this point.

5. In Section 2.2.2, feature fusion and residual structures have been widely used in deep-learning-related fields and cannot serve as the main contribution of the article. It is suggested that the authors re-examine the innovation points and rewrite this part.

6. In Section 2.2.3, it is recommended to add details, such as pictures and descriptions.

7. Section 2.2 is the core part of the article. It is recommended that the authors re-examine the motivation and innovation points and rewrite this part; it is currently written like an engineering report rather than an academic paper.

8. In addition, there are still serious problems in the experimental and evaluation metrics section. The present work is insufficient to demonstrate the superiority of this method over existing methods. It is recommended to choose more advanced methods and more evaluation metrics for comparison.

9. In Section 4.2, the ablation experiment is too brief; it is suggested to add some details.

10. Grammar and typography also need to be reconsidered.
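
For reference regarding comment 3: the polar coordinate transformation used in this line of work (e.g., in DSM-style cross-view methods) resamples the square aerial image so that rows index the radius from the image center and columns index the azimuth angle, roughly aligning the aerial view with a ground panorama. Below is a minimal Python sketch under common conventions; the output size, orientation convention, and nearest-neighbor sampling are illustrative assumptions, not taken from the paper.

    import numpy as np

    def polar_transform(aerial: np.ndarray, out_h: int = 128, out_w: int = 512) -> np.ndarray:
        """Resample a square aerial image (S, S, C) into an (out_h, out_w, C)
        pseudo-panorama: rows index radius from the image center, columns
        index azimuth. Nearest-neighbor sampling keeps the sketch short."""
        s = aerial.shape[0]
        i = np.arange(out_h)[:, None]               # target rows  -> radius
        j = np.arange(out_w)[None, :]               # target cols  -> azimuth
        theta = 2.0 * np.pi * j / out_w             # full 360-degree sweep
        radius = (s / 2.0) * (out_h - i) / out_h    # top row = outer edge
        y = s / 2.0 - radius * np.cos(theta)        # assumes north is "up"
        x = s / 2.0 + radius * np.sin(theta)
        ys = np.clip(np.rint(y).astype(int), 0, s - 1)
        xs = np.clip(np.rint(x).astype(int), 0, s - 1)
        return aerial[ys, xs]

    # Toy usage: a 256x256 RGB aerial patch becomes a 128x512 strip.
    pano_like = polar_transform(np.zeros((256, 256, 3), dtype=np.uint8))
    print(pano_like.shape)   # (128, 512, 3)

In practice, prior work typically uses bilinear rather than nearest-neighbor sampling and fixes a specific north-alignment convention; the exact formulation should be cited from the original source.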

Author Response

Please see the attachment.

Author Response File: Author Response.docx

Round 2

Reviewer 2 Report

The authors clarified all my concerns, and the manuscript is now well organized for publication.
