Article
Peer-Review Record

Enhancing Digital Twins with Human Movement Data: A Comparative Study of Lidar-Based Tracking Methods

Remote Sens. 2024, 16(18), 3453; https://doi.org/10.3390/rs16183453
by Shashank Karki 1,*, Thomas J. Pingel 2, Timothy D. Baird 1, Addison Flack 1 and Todd Ogle 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Reviewer 3:
Submission received: 5 August 2024 / Revised: 10 September 2024 / Accepted: 12 September 2024 / Published: 18 September 2024
(This article belongs to the Special Issue Remote Sensing: 15th Anniversary)

Round 1

Reviewer 1 Report

Comments and Suggestions for Authors

This study investigates how lidar-based tracking algorithms, such as YOLOv5, OpenCV and Percept approaches, can capture the spatial dynamics of indoor spaces while maintaining the privacy of the people using them. It is interesting.

The manuscript only uses the above three methods to process indoor lidar data, which does not present the authors’ innovative or novel research.

After carefully reading the manuscript, it is suggested to revise the following items to improve the manuscript.

1.     It is suggested that the authors supplement their research methods and compare them with existing ones.

2.     The legend in Fig. 1(b) is ambiguous; it is recommended to redraw it.

3.     Abbreviations that appear for the first time, such as "LAZ" (line 323) and "DSM" (line 346), need to be spelled out in full.

4.     In line 550, "(see Figure 20)": should "Figure 20" be changed to "Figure 12"?

Author Response

This study investigates how lidar-based tracking algorithms, such as YOLOv5, OpenCV and Percept approaches, can capture the spatial dynamics of indoor spaces while maintaining the privacy of the people using them. It is interesting.

We greatly appreciate the time and effort the reviewer dedicated to offering insightful suggestions. We have attached a detailed, point-by-point response addressing each of the reviewers' comments. Significant revisions have been made throughout the manuscript based on this input, with "tracked changes" enabled to make it easier to see what has been changed. We hope these revisions meet with your approval and welcome any additional feedback you may wish to provide.

Comment: The manuscript only uses the above three methods to process indoor lidar data, which does not present the authors’ innovative or novel research.

Response: Several additional sections have been added in the introduction, discussion, and conclusion to more clearly state our contribution.  The title and abstract have been revised as well to emphasize this.

After carefully reading the manuscript, it is suggested to revise the following items to improve the manuscript.

Comment 1: It is suggested that the authors supplement their research methods and compare them with existing ones.

Response: Thank you for the suggestion. We have incorporated your feedback by adding additional references in the introduction to provide a clearer comparison of our research methods with relevant literature.

For instance, at lines 189-212, we highlight how lidar can be a better alternative to camera-based approaches, accommodating privacy concerns while effectively gathering data in indoor spaces. In lines 252-256, we highlight key studies that focus on the wide range of outdoor applications of these detection models. Finally, at lines 271-277, we discuss how our approach builds upon these existing methods by applying lidar data to indoor mapping.

 

Comment 2: The legend in Fig. 1(b) is ambiguous; it is recommended to redraw it.

Response: Thank you for bringing this to our attention. We have redrawn all the figures and clarified any text or details as per the recommendation, ensuring they are now more easily understandable.

 

Comment 3: Abbreviations that appear for the first time, such as "LAZ" (line 323) and "DSM" (line 346), need to be spelled out in full.

Response: We have revised the manuscript to ensure all abbreviations, including "LAZ" and "DSM," are spelled out in full upon their first mention, as recommended.

 

Comment 4: In line 550, "(see Figure 20)": should "Figure 20" be changed to "Figure 12"?

Response: Thank you for pointing this out. We have corrected the reference from "Figure 20" to the correct figure number (line 656).

 

Reviewer 2 Report

Comments and Suggestions for Authors

This manuscript makes significant contributions through its innovative approach to digitally constructing architecture and indoor spaces using remote sensing techniques. I recommend it for publication, subject to minor formatting revisions.

Please ensure that the text in lines 259–269, 361–385, and 651–692 is justified. Additionally, the resolution and quality of Figures 1(b), 3, 4, 5, 6, 7, and 8 should be enhanced.

One practical question that arises concerns the advantages and disadvantages of using radar versus lidar in SLAM. For instance, in the experiment shown in Figure 1, eleven lidar sensors were used to image the space. I am curious about the total processing time required for a single scan.

Also, integrating radar as a complementary sensor could potentially enhance the approach proposed in this manuscript.

Comments on the Quality of English Language

Minor editing of English language required.

Author Response

Comment 1: This manuscript makes significant contributions through its innovative approach to digitally constructing architecture and indoor spaces using remote sensing techniques. I recommend it for publication, subject to minor formatting revisions.

Response: We greatly appreciate the time and effort the reviewer dedicated to offering insightful suggestions. We have attached a detailed, point-by-point response addressing each of the reviewers' comments. Significant revisions have been made throughout the manuscript based on this input, with "tracked changes" enabled to make it easier to see what has been changed. We hope these revisions meet with your approval and welcome any additional feedback you may wish to provide.

Comment 2: Please ensure that the text in lines 259–269, 361–385, and 651–692 is justified. Additionally, the resolution and quality of Figures 1(b), 3, 4, 5, 6, 7, and 8 should be enhanced.

Response: We appreciate you bringing these points to our attention. We have made adjustments to ensure the text in lines 259–269, 361–385, and 651–692 is now justified. Additionally, the resolution and quality of Figures 1(b), 3, 4, 5, 6, 7, and 8 have been enhanced as requested.

Comment 3: One practical question that arises concerns the advantages and disadvantages of using radar versus lidar in SLAM. For instance, in the experiment shown in Figure 1, eleven lidar sensors were used to image the space. I am curious about the total processing time required for a single scan. 

Response: Thank you for your comment. In our preliminary research, we explored both radar and lidar technologies. Lidar provides high precision and detailed 3D mapping, ideal for indoor settings, while radar offers resilience in adverse weather but with lower resolution, making it more suitable for outdoor use. We have added details about these trade-offs in the introduction at lines 213-219 and further discussed the potential for integrating both technologies in the discussion at lines 756-759.

Additionally, our system uses eleven lidar sensors, with a processing time of approximately 1/4 of a second per scan, allowing for real-time data capture.

Reviewer 3 Report

Comments and Suggestions for Authors

 

Abstract

"Convolutional neural networks (CNNs)" should be written as "Convolutional Neural Networks (CNNs)"; please check all abbreviations in the paper.

Between the paper title and the abstract, the paper idea becomes vague. I think that the paper idea is to compare two algorithms with a software package for tracking human movement. This leads one to observe that the paper title and the abstract need to be rewritten to fit the paper topic. From another viewpoint, perhaps the paper's topic is to propose two algorithms for tracking human movement and then compare them with available software. If this hypothesis is valid, the paper will range too widely and fail to provide the requested details, because the topic would become too large for a single paper to present. Also, the homogeneity of this comparison remains in question. Can we compare two algorithms with one software package? In this context, the authors said "Each method had particular strengths and weaknesses"; is the software a method? Normally, software implements an algorithm.

Introduction

Please highlight the novelty and the contribution in the paper.

Materials and Methods

When you talk about data fusion, it is not acceptable to simply say that this operation was performed using given software tools. You should explain whether you georeference the data, only perform registration, or calibrate the system so that measurements are realized within the same coordinate system. Moreover, you should provide the parameters and procedures used to achieve your goal.

I do not trust the provided tool packages; as researchers, you should control them and be able to assess their functionality.

The Figure 1 map key should be readable. Please extend Figure 1 to show the sensor locations; the image quality (resolution) is very poor.

Line 321: orthographic images are then recorded; are the same sensors also used for this?

Line 322: what are the accuracy and point density of the recorded point clouds?

Line 340: Three methods of object detection were used. The sensors measure point clouds and orthographic images; please list all the outputs from the sensors. Also, what are the inputs to the three data-processing methods? Please provide flowcharts explaining data extraction and then processing.

There is considerable confusion because you merged the Materials and Methods sections together and put all the information in one place, which makes the paper structure hard to follow. Please separate them into two independent sections so that your own work can be distinguished from the description of the materials used. Furthermore, in the Methods section, add three subsections presenting each approach separately. Thereafter, divide the Results section into three subsections: results, comparison, and discussion.

Please don't put two section titles consecutively; separate them with a transition paragraph.

Please add a section discussing the threshold settings used.

From the Materials and Methods section, it seems that you did not contribute to the tracking algorithms themselves, which is why all the algorithms and materials are presented together; what you did is use already-developed tools. In short, you compare the results of several algorithms.

I think the overall paper structure is not correct; it appears to have been written quickly. I am sorry to say that the paper structure, title, and abstract should be fixed. The paper can then be resubmitted to the journal after your contribution is clarified.

Comments on the Quality of English Language

Minor editing of the English language is required.

Author Response

We greatly appreciate the time and effort the reviewer dedicated to offering insightful suggestions. We have attached a detailed, point-by-point response addressing each of the reviewers' comments. Significant revisions have been made throughout the manuscript based on this input, with "tracked changes" enabled to make it easier to see what has been changed. We hope these revisions meet with your approval and welcome any additional feedback you may wish to provide.

Abstract

Comment 1: "Convolutional neural networks (CNNs)" should be written as "Convolutional Neural Networks (CNNs)"; please check all abbreviations in the paper.

Response: Thank you for the suggestion. We have reviewed the manuscript and ensured that all abbreviations, including "Convolutional Neural Networks (CNNs)," are written in full upon first mention.

Comment 2: Between the paper title and the abstract, the paper idea becomes vague. I think that the paper idea is to compare two algorithms with a software package for tracking human movement. This leads one to observe that the paper title and the abstract need to be rewritten to fit the paper topic.

Response: We agree with your suggestions and have made the necessary changes to clarify the focus of the paper. The title has been updated to better reflect the study’s objectives: "Enhancing Digital Twins with Human Movement Data: A Comparative Study of Lidar-Based Tracking Methods." Additionally, the abstract has been significantly revised to align more closely with the updated focus and goals of the research. Thank you for your valuable input.

Comment 3: From another viewpoint, perhaps the paper's topic is to propose two algorithms for tracking human movement and then compare them with available software. If this hypothesis is valid, the paper will range too widely and fail to provide the requested details, because the topic would become too large for a single paper to present.

Also, the homogeneity of this comparison remains in question. Can we compare two algorithms with one software package? In this context, the authors said "Each method had particular strengths and weaknesses"; is the software a method? Normally, software implements an algorithm.

Response: Thank you for your insightful feedback. We acknowledge the point regarding the comparison between open-source algorithms and proprietary software. The paper aims to show how resource-efficient, open-source methods can perform comparably to more resource-intensive proprietary software, highlighting trade-offs between simplicity, cost, and performance.

We have updated the abstract and revised key sections (lines 418-427) to reflect this focus. 

 

Introduction

Comment 1: Please highlight the novelty and the contribution in the paper.

Response: We have revised the introduction to more clearly highlight the novelty and contribution of the paper, particularly in lines 142-151. Specifically, we emphasize that the innovation lies not in the algorithms themselves but in the application of lidar technology to track human movement in indoor environments, addressing unique challenges such as signal instability and occlusions.

Materials and Methods

Comment 1: When you talk about data fusion, it is not acceptable to simply say that this operation was performed using given software tools. You should explain whether you georeference the data, only perform registration, or calibrate the system so that measurements are realized within the same coordinate system. Moreover, you should provide the parameters and procedures used to achieve your goal.

Response: We appreciate the reviewer's insightful comment on data fusion and have revised the Materials and Methods section, specifically lines 366-371, to clarify the process. Additionally, we have included Figure 2 to better illustrate the process. Thank you for your valuable feedback.
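For illustration only, the registration step described above amounts to expressing each sensor's points in one shared coordinate system before merging them. The sketch below (Python/NumPy) shows that general idea under assumed inputs; the transform values, sensor IDs, and function names are hypothetical and are not the parameters used in the study.

    import numpy as np

    # Hypothetical per-sensor rigid transforms (4x4 homogeneous matrices),
    # e.g. obtained from a prior calibration/registration step. The values
    # here are placeholders, not the study's actual parameters.
    sensor_transforms = {
        "sensor_01": np.eye(4),
        "sensor_02": np.array([
            [0.0, -1.0, 0.0, 3.2],
            [1.0,  0.0, 0.0, 1.5],
            [0.0,  0.0, 1.0, 0.0],
            [0.0,  0.0, 0.0, 1.0],
        ]),
    }

    def to_common_frame(points, transform):
        # Apply a 4x4 rigid transform to an (N, 3) point array.
        homogeneous = np.hstack([points, np.ones((points.shape[0], 1))])
        return (homogeneous @ transform.T)[:, :3]

    def fuse(clouds):
        # clouds: dict mapping sensor IDs to raw (N, 3) point arrays.
        return np.vstack([
            to_common_frame(pts, sensor_transforms[sid])
            for sid, pts in clouds.items()
        ])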

Comment 2: I do not trust the provided tool packages; as researchers, you should control them and be able to assess their functionality.

Response: Thank you for your comment. We understand the concern about using the provided tools. To clarify, we selected OpenCV and YOLOv5, both of which are widely used in computer vision, particularly for RGB images. While they are not traditionally used for lidar data, we adapted and validated these methods for our specific application. 

Regarding Percept, we acknowledge that it functions more as a "black box." Our study aims to critically evaluate and validate its performance compared to the other, more established open-source methods.
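To make the adaptation concrete, the following sketch shows one way an off-the-shelf, pretrained YOLOv5 model can be applied to a single lidar-derived orthographic image. It is a minimal, hypothetical example: the model variant, confidence threshold, and file name are illustrative assumptions rather than the settings used in the study.

    import cv2
    import torch

    # Load a pretrained YOLOv5 model from the Ultralytics hub
    # (the "yolov5s" variant is an assumption for this sketch).
    model = torch.hub.load("ultralytics/yolov5", "yolov5s", pretrained=True)
    model.conf = 0.4  # illustrative confidence threshold

    # "ortho_frame.png" stands in for one lidar-derived orthographic image.
    frame = cv2.cvtColor(cv2.imread("ortho_frame.png"), cv2.COLOR_BGR2RGB)

    # Run detection and keep only the "person" class.
    results = model(frame)
    detections = results.pandas().xyxy[0]
    people = detections[detections["name"] == "person"]
    print(people[["xmin", "ymin", "xmax", "ymax", "confidence"]])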

Comment 3: The Figure 1 map key should be readable. Please extend Figure 1 to show the sensor locations; the image quality (resolution) is very poor.

Response: We apologize for the issues with Figure 1, particularly the readability of the map key and the poor image quality. We have remade all the figures with improved resolution and have extended Figure 1 to clearly show the sensor locations.

Comment 4: Line 321: orthographic images are then recorded; are the same sensors also used for this?

Response: Thank you for your comment. We have added this clarification to the manuscript, specifically in lines 380-382.

Comment 5: Line 322: what are the accuracy and point density of the recorded point clouds?

Response: We have included the accuracy of the sensors and the point density in lines 354-356. Additionally, we checked the point density of our point clouds and inserted the values in lines 356-358.

Comment 6: Three methods of object detection were used. The sensors measure point clouds and orthographic images; please list all the outputs from the sensors. Also, what are the inputs to the three data-processing methods? Please provide flowcharts explaining data extraction and then processing.

Response: We apologize for the confusion regarding the data inputs. We have clarified that the OpenCV-based and deep learning methods used orthographic images as input, while Percept processed real-time point cloud data directly (lines 408-411). We also clarified the outputs from the sensors, which are the raw point clouds from which the orthographic images are created. Additionally, we have provided a schematic flowchart (Figure 2) to illustrate the data extraction and processing steps, including the inputs for each detection method.
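As a rough illustration of how a top-down orthographic image can be derived from raw point clouds (a minimal sketch under assumed grid settings, not the authors' actual processing chain):

    import numpy as np

    def orthographic_raster(points, cell_size=0.05):
        # Rasterize an (N, 3) point cloud into a top-down height image.
        # cell_size and the max-height aggregation are illustrative choices.
        x, y, z = points[:, 0], points[:, 1], points[:, 2]
        cols = ((x - x.min()) / cell_size).astype(int)
        rows = ((y - y.min()) / cell_size).astype(int)
        raster = np.zeros((rows.max() + 1, cols.max() + 1), dtype=np.float32)
        # Keep the highest return in each cell (a simple max-height surface).
        np.maximum.at(raster, (rows, cols), z)
        return raster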

Comment 7: There is considerable confusion because you merged the Materials and Methods sections together and put all the information in one place, which makes the paper structure hard to follow. Please separate them into two independent sections so that your own work can be distinguished from the description of the materials used. Furthermore, in the Methods section, add three subsections presenting each approach separately. Thereafter, divide the Results section into three subsections: results, comparison, and discussion.

Response: We apologize for any misunderstandings in the Materials and Methods section and appreciate the detailed feedback. While we used existing algorithms and tools for tracking, our contribution lies in their application to lidar-based human movement detection and the evaluation of these methods in an indoor setting, which presents unique challenges. We have revised and added subsections (sections 3.1-3.4) to the Materials and Methods section to better outline our contributions, clarifying the distinct role of each algorithm in the context of our study. 

We hope that the revisions made will adequately address the concerns raised and clarify the research and methodology.

 

Comment 8: Please don't put two section titles consecutively; separate them with a transition paragraph.

Please add a section discussing the threshold settings used.

Response: Thank you for your feedback. We have revised the manuscript to ensure that section titles are now separated by appropriate transition paragraphs to improve readability and flow (lines 430-432; 483-485; 507-510; 526-527). Additionally, we have added a new section (Section 2.4) specifically discussing accuracy assessment and the threshold settings used in the study, including their selection process and the rationale behind the chosen values for each algorithm.

Comment 9: From the Materials and Methods section, it seems that you did not contribute to the tracking algorithms themselves, which is why all the algorithms and materials are presented together; what you did is use already-developed tools. In short, you compare the results of several algorithms.

I think the overall paper structure is not correct; it appears to have been written quickly. I am sorry to say that the paper structure, title, and abstract should be fixed. The paper can then be resubmitted to the journal after your contribution is clarified.

Response: Thank you for your feedback. We have made significant revisions to the paper, including restructuring the Materials and Methods section (Sections 3.1-3.5) to clearly distinguish between the algorithms and tools used, and emphasizing our contributions in adapting them for lidar-based tracking (lines 142-151; 190-211; 418-427). We also revised the title and abstract to better align with the research's focus and contributions. We hope these changes address your concerns and clarify the novelty of our work.

Thank you again for your valuable input.

Round 2

Reviewer 1 Report

Comments and Suggestions for Authors

The questions were answered, the relevant content was added, and the inadequacies were revised.

Reviewer 3 Report

Comments and Suggestions for Authors

The paper looks better.

Comments on the Quality of English Language

 Minor editing of the English language is required.
