Vehicle Localization in a Completed City-Scale 3D Scene Using Aerial Images and an On-Board Stereo Camera
Round 1
Reviewer 1 Report
This study proposes a wall complementarity algorithm based on the geometric structure of buildings to refine the city-scale 3D scene. A 3D-to-3D feature registration algorithm is developed to determine the vehicle location by integrating the optimized city-scale 3D scene reconstructed from UAV imagery with the local scene generated by an on-board stereo camera. The effectiveness of the proposed algorithm is validated through simulation experiments in a CG simulator.
Comment 1: The manuscript lacks a comparison with the latest methods and a clear description of its innovation.
Comment 2: There is no analysis of the influence of parameter choices on the results, such as the impact of the number of segmentation layers on reconstruction accuracy.
Comment 3: There are some mistakes in the format of the references, such as references [17] and [26].
Author Response
Thank you for your kind review. Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 2 Report
The argument presented here is very interesting and the manuscript appears well structured. I also appreciated the future perspectives on improving this analysis. However, I suggest you make some additions and adjustments:
- Which SfM software was used?
- It could be interesting to include a general timeline of the entire process.
- A discussion chapter could be included, with a critical opinion on the analysis performed.
- Can this process be adapted to the detection of other objects?
Thank you.
Author Response
Thank you for your kind review. Please see the attachment.
Author Response File: Author Response.pdf
Reviewer 3 Report
This article uses low-cost aerial images to build 3D scene maps and then uses an on-board stereo camera to perceive the environment and localize the vehicle. I think the innovation proposed by the authors needs to be discussed. Some specific issues are as follows:
Title: It is ambiguous. The aerial images are used to construct the top views with the widely applied SfM, and a completion method is used to repair the facades. The point clouds are captured by the on-board stereo camera and matched with the local 3D scene frame by frame. Is the completed city-scale 3D scene therefore constructed using aerial images only? How are the maps updated?
Line 1: The method in this paper has little to do with SLAM? True SLAM hardly relies on high-precision maps?
Line 5: The preceding text mentions that high-precision maps are used but are expensive to generate and difficult to commercialize. Is the reason for using low-precision maps to optimize the visual positioning of vehicles? How is the accuracy level quantified?
Lines 11-14: There are no quantified results?
Line 16: The emphasis of the Introduction should be on the accuracy of the study, but the article mainly points out the cost difference compared with high-precision equipment?
Line 191: Figure 2 is not cited?
Section 4.2: The methods are simple but described at excessive length. Can it only restore vertical facade structures?
Section 6.4: NDT is widely used, so there is no need to describe it at such length. Why was NDT used rather than other methods?
Sections 4, 5, 6: The description of how the accuracy and stability of the algorithm are ensured is insufficient. Is this just a simple assembly of existing methods?
Results: Too little analysis.
Reference 17: The conference date and location are missing.
Author Response
Thank you for your kind review. Please see the attachment.
Author Response File: Author Response.pdf
Round 2
Reviewer 3 Report
I accept this version.