Technical Note
Peer-Review Record

A Benchmark for Multi-Modal LiDAR SLAM with Ground Truth in GNSS-Denied Environments

Remote Sens. 2023, 15(13), 3314; https://doi.org/10.3390/rs15133314
by Ha Sier 1,2,†, Qingqing Li 2,†, Xianjia Yu 2,*, Jorge Peña Queralta 2, Zhuo Zou 1,† and Tomi Westerlund 2
Reviewer 1:
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 3 May 2023 / Revised: 16 June 2023 / Accepted: 21 June 2023 / Published: 28 June 2023

Round 1

Reviewer 1 Report

The contribution presented is consistent with the themes envisioned by the journal. It summarizes, in a rigorous and analytical manner, the different phases of a large research project analyzing the algorithms underlying SLAM technology. The contribution is accepted in its current form.

Author Response

Reviewer 1

We want to sincerely thank you for your thoughtful review and positive evaluation.

Reviewer 2 Report

1. The displayed figures and tables should be close to the cited position. 

2. "Figure 1. Multi-modal LiDAR data acquisition platform and samples from maps obtained in the different environments included in the dataset." Is only the dataset from the indoor environment displayed?

3. The figures are not scientific and aesthetically pleasing.

4. The characteristic description of the new datasets is not prominent enough.

Author Response

Reviewer 2:

Comment #1:  The displayed figures and tables should be close to the cited position.
Reply: 

Thank you for your suggestion. We have relocated the figures and tables closer to their respective citations as follows:

  • Moved the Algorithm 1 table from page 10 to page 8.
  • Moved Table 3 from page 11 to page 10.

 

Comment #2:  "Figure 1. Multi-modal LiDAR data acquisition platform and samples from maps obtained in the different environments included in the dataset." Only dataset from indoor environment is displayed?
Reply: After careful consideration, we have separated the two subfigures originally contained in Figure 1 into two separate figures: Fig. 1 and Fig. 2. The primary objective of Figure 1 is to exhibit the ground truth map, with an emphasis on the intricate, zoomed-in details. The maps of the remaining environments are shown in Fig. 4.

 

 

Comment #3: The figures are not scientific and aesthetically pleasing.
Reply:  In accordance with the provided suggestions, we have undertaken a thorough revision of certain figures and tables to enhance their visual appeal and comprehensibility. Specifically, Tables 1 and 4 have been restructured into a horizontal layout. We chose to split the two subplots originally included in Figure 1 into two separate figures: Figure 1 and Figure 2.

 

Comment #4: The characteristic description of the new datasets is not prominent enough.
Reply:

Our dataset offers a comprehensive representation of diverse environments and novel sensor technologies, making it a valuable resource for conducting research in our field. The dataset possesses the following unique features: 

  • The proposed dataset encompasses a wide range of environments, including various indoor and outdoor settings. These environments span from elongated corridors and spacious halls to open roads, forest landscapes, classrooms, and laboratories.
  • The dataset employs various types and resolutions of LiDAR systems, including both novel solid-state and spinning LiDAR. This diversity enhances the robustness and applicability of our research, enabling comprehensive analysis across different scenarios.
  • The dataset provides ground truth for each sequence, enabling researchers to compare sensor performance and benchmark their methods across diverse environments and various LiDAR sensors.

Reviewer 3 Report

In Fig. 1, why do these two images have to be in the same figure? Maybe it would be better to separate them into two figures.

 

Regarding the description of your main contribution, you provide a ground truth trajectory generation method. I did not understand what this novel method is, where it is described, and what your actual contribution to it is.

 

Are the LiDAR and stereo fisheye cameras used anywhere for evaluation in this paper, or did you just capture this information for the dataset?

 

Are you using five or ten SLAM methods for the comparisons? In the contribution paragraph, you refer to 10, and in the conclusion to 5.

Did you make any modifications to the existing algorithms or the referred contribution is that you run and test the performance of these algorithms to your dataset?

 

It is very difficult to read Tables 1, 3, and 4. Please change the orientation or find another way to present the information.

 

I did not find anywhere in the text an explanation and discussion of Figure 3. What is the added value of presenting this figure, and how are these map data produced?

 

I do not understand the purpose of presenting Fig. 5, which represents only the first 10 seconds of only one sequence (while the device is also stationary).

 

Even though you used some enlarged details in Figure 6, it is still very hard to compare and evaluate the accuracy of the presented methods.

Additionally, it would be useful for the references to the (a), (b), and (c) subfigures to also appear in the main caption of the figure, e.g., (a) indoor, (b) road, and (c) wild environments.

 

An extensive qualitative discussion and conclusions regarding the results of the SLAM methods are required, not only the presentation of some quantitative results.

 

Are 2 or 4 platforms used?

Author Response

Reviewer 3  

 

Comment #1: In Fig. 1, why do these two images have to be in the same figure? Maybe it would be better to separate them into two figures.

Reply: Thank you for your suggestion; subfigures (a) and (b) of Fig. 1 have been split into Fig. 1 and Fig. 2.

 

 

Comment #2: Regarding the description of your main contribution, you provide a ground truth trajectory generation method. I did not understand what this novel method is, where it is described, and what your actual contribution to it is.

Reply: In Section 3.3, we present the SLAM-assisted ground truth map generation method, with the detailed logic outlined in Algorithm 1. The novelty of the proposed ground truth generation lies in two key aspects:

  • We propose a method to generate ground truth trajectories for situations where common positioning systems (e.g., RTK-GNSS, the OptiTrack system) are unavailable, by taking advantage of the data acquisition platform's multi-modality and high-resolution LiDAR sensors.
  • The study introduces a unique framework using low-cost solid-state LiDAR and high-resolution spinning LiDAR to establish accurate ground truths for large-scale indoor and outdoor areas, resulting in more detailed environmental sampling and high-definition ground truth maps.
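As a rough illustration of the ICP-based alignment at the core of such a ground-truth pipeline, the sketch below implements a minimal point-to-point ICP in NumPy. This is a hypothetical, simplified example, not the authors' Algorithm 1: the function name, the brute-force nearest-neighbour search, and the fixed iteration count are assumptions made for illustration only.

```python
import numpy as np

def icp_point_to_point(source, target, iters=20):
    """Minimal point-to-point ICP aligning `source` (N,3) to `target` (M,3).
    Illustrative sketch only -- returns an accumulated 4x4 rigid transform."""
    src = source.copy()
    T = np.eye(4)
    for _ in range(iters):
        # 1. Nearest-neighbour correspondences (brute force, O(N*M)).
        d = np.linalg.norm(src[:, None, :] - target[None, :, :], axis=2)
        nn = target[np.argmin(d, axis=1)]
        # 2. Closed-form rigid transform between matched sets (Kabsch / SVD).
        mu_s, mu_t = src.mean(0), nn.mean(0)
        H = (src - mu_s).T @ (nn - mu_t)
        U, _, Vt = np.linalg.svd(H)
        R = Vt.T @ U.T
        if np.linalg.det(R) < 0:      # guard against a reflection solution
            Vt[-1] *= -1
            R = Vt.T @ U.T
        t = mu_t - R @ mu_s
        # 3. Apply the increment and accumulate it into T.
        src = src @ R.T + t
        step = np.eye(4)
        step[:3, :3], step[:3, 3] = R, t
        T = step @ T
    return T
```

A real pipeline would use a k-d tree for correspondences, outlier rejection, and point-to-plane residuals, but the overall structure (correspond, solve the closed-form alignment, accumulate the transform) is the same.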

 

 

Comment #3: Are the LiDAR and stereo fisheye cameras used anywhere for evaluation in this paper, or did you just capture this information for the dataset?

Reply: The dataset introduced in this paper comprises data from five LiDAR sensors, a LiDAR-integrated camera, and a fisheye camera. The data from the LiDAR camera and the fisheye camera are included expressly to facilitate future benchmarking of multi-modal SLAM algorithms. In this study, we benchmarked several LiDAR SLAM algorithms using the point cloud data gathered from the five LiDAR sensors in the dataset.

 

Comment #4: Are you using five or ten SLAM methods for the comparisons? In the contribution paragraph, you refer to 10, and in the conclusion to 5.

Reply: We analyzed the positioning accuracy of ten unique configurations, generated by pairing five distinct LiDAR sensors with five SLAM algorithms, to critically compare and assess their respective performance. We have corrected the inconsistency in the abstract and conclusion, with the changes highlighted.
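For context, positioning accuracy in such comparisons is commonly reported as the absolute trajectory error (ATE) after rigid alignment of the estimated trajectory to the ground truth. The sketch below is a hypothetical illustration of that metric, not the paper's actual evaluation code; the function name and array shapes are assumptions.

```python
import numpy as np

def ate_rmse(est, gt):
    """ATE RMSE between an estimated trajectory `est` (N,3) and ground
    truth `gt` (N,3), after closed-form rigid (rotation + translation)
    alignment of the estimate to the ground truth."""
    mu_e, mu_g = est.mean(0), gt.mean(0)
    # Kabsch alignment via SVD of the cross-covariance.
    H = (est - mu_e).T @ (gt - mu_g)
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:          # avoid a reflection solution
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = mu_g - R @ mu_e
    aligned = est @ R.T + t
    # Root-mean-square of per-pose position errors.
    return np.sqrt(np.mean(np.sum((aligned - gt) ** 2, axis=1)))
```

Each LiDAR/SLAM configuration would yield one such number per sequence, which is what a results table like Table 3 summarizes.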

 

 

Comment #5: Did you make any modifications to the existing algorithms or the referred contribution is that you run and test the performance of these algorithms to your dataset?

Reply: In this research paper, we have intentionally refrained from making any modifications to the existing  SLAM algorithms. Our focus lies in conducting a fair and unbiased evaluation of these pre-existing SLAM algorithms. To facilitate this, we have utilized our own carefully curated dataset. This dataset provides an extensive range of scenarios that adequately challenge the algorithms and test their limits in a variety of contexts. Through this approach, we aim to thoroughly assess the performance and reliability of each SLAM algorithm in different environments.

 

 

Comment #6: It is very difficult to read Tables 1, 3, and 4. Please change the orientation or find another way to present the information.

Reply: Thank you for your suggestion. We reformatted Tables 1 and 4 into a horizontal layout. Table 3, however, is difficult to present horizontally due to the large number of comparison items (10 combinations of LiDAR sensors and SLAM algorithms) and compared sequences (9 sequences).

 

 

Comment #7: I did not find anywhere in the text an explanation and discussion of Figure 3. What is the added value of presenting this figure, and how are these map data produced?

Reply: Figure 4 (after the modification, Fig. 3 becomes Fig. 4) illustrates the ground truth map obtained via the SLAM-assisted, ICP-based prior map generation method introduced in Section 3.3. The images are presented in a specific order: moving from left to right and then top to bottom, the maps generated from sequences 'indoors09', 'indoors11', 'indoors06', and 'indoors10' are exhibited. In response to your suggestion, we added the reference in lines 181-182 of the text.

 

 

Comment #8: I do not understand the purpose of presenting Fig. 5, which represents only the first 10 seconds of only one sequence (while the device is also stationary).

Reply: We assess the generated ground truth following the evaluation methodology outlined in the referenced paper (reference [25] in the paper). Subfigures (a), (b), and (c) in Figure 6 (after the modification, Fig. 5 becomes Fig. 6) evaluate the static fluctuations in the X, Y, and Z coordinates within the 'indoors10' sequence while the platform is stationary. Given the particular difficulty of acquiring ground truth via motion-capture systems or GNSS/RTK in long corridor environments, the evaluation is limited to the methods used in this paper. Subfigure (d) contrasts the Z value of the trajectory obtained by mocap with that obtained via the generation method in a laboratory setting. This comparison of Z values is considered representative due to the LiDAR sensor's characteristically low vertical resolution.
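As a hypothetical sketch of this kind of static evaluation (the arrays, the 10 s window, and the function name are assumptions, not the paper's code), the per-axis fluctuation over a stationary window can be computed as:

```python
import numpy as np

def static_fluctuation(traj, t, window=10.0):
    """Per-axis standard deviation of positions over the first `window`
    seconds of a stationary segment.
    traj: (N, 3) X/Y/Z positions in metres; t: (N,) timestamps in seconds."""
    mask = (t - t[0]) <= window   # keep only the initial stationary window
    return traj[mask].std(axis=0)
```

If the platform is truly stationary, these standard deviations bound the noise floor of the generated ground truth on each axis, which is what subfigures (a)-(c) visualize.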

Comment #9: Even though you used some enlarged details in Figure 6, it is still very hard to compare and evaluate the accuracy of the presented methods. Additionally, it would be useful for the references to the (a), (b), and (c) subfigures to also appear in the main caption of the figure, e.g., (a) indoor, (b) road, and (c) wild environments.

Reply: Thank you for your suggestion; we added references to subfigures (a), (b), and (c) in lines 242 to 245. We acknowledge that identifying errors in (b) and (c) is challenging due to the similar performance of the selected SLAM methods in the outdoor environments, specifically the forest and road scenarios. The primary purpose of the figure is to give readers a general understanding of the performance differences between the various algorithms. Regarding subfigure (a), in the indoor environment it is evident that the spinning-LiDAR-based methods outperform the solid-state-LiDAR-based methods.

 

Comment #10: An extensive qualitative discussion and conclusions regarding the results of the SLAM methods are required, not only the presentation of some quantitative results.

Reply: In an effort to provide a more exhaustive and qualitative comparative analysis, we have incorporated an additional section (Section 4.3). This section is specifically dedicated to evaluating mainstream SLAM methods, with an emphasis on the quality of mapping produced. This enhancement aims to provide a broader perspective on the performance and utility of these contemporary SLAM methods. 

 

Comment #11: Are 2 or 4 platforms used?

Reply: Thank you for pointing this out; we have fixed it in line 243. In this paper, we tested 10 combinations of 5 state-of-the-art SLAM algorithms and 5 LiDARs in terms of CPU utilization, memory utilization, and release frequency on 4 different platforms.

Round 2

Reviewer 3 Report

There is a spelling mistake in Figure 3 : pruple  -> purple
