Article
Peer-Review Record

LNSNet: Lightweight Navigable Space Segmentation for Autonomous Robots on Construction Sites

by Khashayar Asadi 1,*, Pengyu Chen 2, Kevin Han 1, Tianfu Wu 3 and Edgar Lobaton 3
Reviewer 1: Anonymous
Reviewer 2: Anonymous
Submission received: 11 February 2019 / Revised: 4 March 2019 / Accepted: 7 March 2019 / Published: 13 March 2019

Round 1

Reviewer 1 Report

makes use of Digital Image Processing with artificial intelligence techniques. The authors' goals are to reduce the computational effort and maintain the quality of the results. This study may raise interest of the civil construction industry.

The Introduction and Literature Review could be improved by making the scientific goal of the study clearer. It would be helpful to learn the positive and negative aspects of the related work discussed. A table containing a comparison would increase the quality of the review. The description of the method is clear and, together with the Experimental Setup section, should be enough for other groups trying to reproduce the proposed approach. The discussion of the results is concise and well written, but the conclusion section should be improved before publication. It would also be good if the authors could replace some of the Internet links in the References with indexed documents.

The text needs to be revised by a native English-speaking researcher to conform to current English usage.


Author Response

Response to Reviewer 1 Comments

 

Makes use of Digital Image Processing with artificial intelligence techniques. The authors’ goals are to reduce the computational effort and maintain the quality of the results. This study may raise interest of the civil construction industry.

 Point 1: The Introduction and Literature Review could be improved by making the scientific goal of the study clearer.

Response 1: Thank you for your constructive comment. To clarify the objective of the study, the authors reorganized the Introduction Section and presented the objectives and contributions of the study in a separate section named Research Objectives and Contributions (see lines 60-77).

 Lines 60-77: The major goal of this paper is to propose a deep convolutional neural network (CNN) with a reduced computational load (model size), so that multiple modules can run on the same Jetson and latency is reduced. Since Asadi et al. [13] have already validated the ENet implementation on the embedded platform in real time, this paper focuses on comparing the performance of ENet and the proposed method on a server with the following specifications: 128 GB RAM, an Intel Xeon E5 processor, and two GPUs (NVIDIA Tesla K40c and Tesla K20c).

 The ultimate goal is to develop a robotics platform that navigates on construction sites. The hardware development (the authors' previous work [13]) and algorithm development (this paper) were tailored to navigation on construction sites. This paper proposes a segmentation method that is efficient while still providing accurate semantic segmentation.

 The main contributions of this paper are 1) creating a new pixel-level annotated dataset for real-time and mobile semantic segmentation in construction environments, to address the limited amount of training data, and 2) proposing an efficient semantic segmentation method with a smaller model size and faster inference speed for future development of autonomous robots on construction sites. Although the focus of this study is on reducing the model size to enable running multiple modules on the same Jetson TX1, the inference time is also decreased, which increases the maximum input frame rate of the segmentation process for real-time performance.
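
 The link between inference time and maximum input frame rate stated above can be sketched with a few lines of arithmetic. This is only an illustration, assuming frames are processed one at a time with no batching or pipelining, so that the maximum frame rate is the reciprocal of the per-frame inference time. The baseline time of 50 ms is a hypothetical value chosen for the example and is not a measurement from the paper; the 18% reduction is the figure quoted in the revised Conclusion.

```python
def max_frame_rate(inference_time_s: float) -> float:
    """Maximum frames per second when frames are processed sequentially."""
    if inference_time_s <= 0:
        raise ValueError("inference time must be positive")
    return 1.0 / inference_time_s


def frame_rate_gain(time_reduction: float) -> float:
    """Factor by which the maximum frame rate grows when the inference
    time shrinks by the fraction `time_reduction` (0.18 means 18%)."""
    if not 0.0 <= time_reduction < 1.0:
        raise ValueError("time_reduction must be in [0, 1)")
    return 1.0 / (1.0 - time_reduction)


# Hypothetical baseline inference time (NOT a value from the paper).
baseline_time_s = 0.050
# 18% reduction in inference time, as quoted in the revised Conclusion.
faster_time_s = baseline_time_s * (1.0 - 0.18)

baseline_fps = max_frame_rate(baseline_time_s)
faster_fps = max_frame_rate(faster_time_s)
gain = frame_rate_gain(0.18)

print(f"baseline: {baseline_fps:.1f} fps, reduced: {faster_fps:.1f} fps, "
      f"gain: {gain:.3f}x")
```

 Under these assumptions the gain depends only on the relative reduction, not on the hypothetical baseline, so an 18% shorter inference time corresponds to a maximum frame rate that is 1/0.82 times the original.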

 Point 2: It would be helpful to learn the positive and negative aspect of the commented related work. A table containing a comparison would increase the quality of the review.

 Response 2: Thank you for your suggestion. In the Literature Review Section, at the end of each subsection, the authors have discussed the general limitations and constraints of the previous studies (see lines 117-122 for subsection 3.1 and lines 140-152 for subsection 3.2). To highlight these discussions, the authors have edited the following paragraph.

 Revision (changes in italic)

 Lines 140-152: The ability to perform pixel-wise semantic segmentation in real time is necessary for mobile applications. The above studies have the disadvantage of requiring a large number of floating-point operations and have long run times that hinder their usability. To address this issue, computationally lighter convolutional networks have been presented in [16, 50]. In the authors' previous work [13], ENet [16], which has a smaller architecture than recent deep neural networks, was implemented on the UGV so that the segmentation model could work on the Jetson board with its limited memory and processing power. However, the model size and high memory usage of this model forced the authors to allocate a separate Jetson to this module. The resulting latency in data transformation between different modules, caused by integrating multiple Jetson boards through the network, made the system inefficient during fast movements. This latency forced the authors to restrict the speed of the UGV to maintain real-time performance. Therefore, one of the two main contributions of this paper is a new segmentation model that reduces the model size and memory usage while maintaining accuracy.

 Point 3: The description of the method is clear and, together with the Experimental Setup section, should be enough for other groups trying to reproduce the proposed approach. The discussion of the results is concise and well written, but the conclusion section should be improved before publication.

 Response 3: Thank you for your feedback. The authors modified the Conclusion Section as follows:

 Revision (changes in italic)

Lines 335-355: One of the major challenges in a robotic system that integrates multiple modules in a single modular framework is the integration between modules that run on separate processing units. Latency in data transformation between different processing units decreases the performance of the system. Combining modules to run on the same processing unit can address this challenge and also reduces the cost, size, and weight of the system, which is a significant improvement for mobile robotic systems that have either limited computing resources or limited payload capacity (e.g., unmanned aerial vehicles). This paper presents an efficient semantic segmentation model that can be run in real time on multiple embedded platforms that are integrated as a system to determine navigable space in real time. The results show improvement in model size, memory usage, and computation time (50% and 18% reduction in model size and inference time, respectively) while keeping the accuracy almost the same. The 50% reduction in model size is a significant contribution, which enables multiple modules to be combined and run on the same processing unit. The core of the model architecture is a new block based on separable convolution, which compresses the parameters of the existing residual block while maintaining accuracy and performance.

 The proposed method is a step forward in making intelligent and contextually aware robots ubiquitous. In the future, the model's segmentation ability and inference speed could be further improved by applying new neural network architectures and new loss function designs. The proposed binary problem in the dataset is a good starting point, but more complex classification tasks would be required in more realistic scenarios. Therefore, a high-precision, multi-class annotated construction site dataset is needed. Moreover, existing 3D models (i.e., BIM) of construction sites can potentially guide and streamline the data collection of highly precise pixel-level annotations.
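
 The parameter saving that separable convolution can offer, as mentioned in the revised Conclusion, can be illustrated by counting weights. The sketch below compares a standard convolution with a generic depthwise separable convolution (a depthwise k x k filter per input channel followed by a 1 x 1 pointwise convolution). It is not the authors' LNSNet block, whose exact layout is not given in this record; the kernel size and channel counts are assumed values for illustration, and bias terms are ignored.

```python
def standard_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a standard k x k convolution (biases ignored)."""
    return k * k * c_in * c_out


def separable_conv_params(k: int, c_in: int, c_out: int) -> int:
    """Weights in a depthwise separable convolution (biases ignored):
    one k x k depthwise filter per input channel, followed by a
    1 x 1 pointwise convolution that mixes the channels."""
    depthwise = k * k * c_in
    pointwise = c_in * c_out
    return depthwise + pointwise


# Assumed layer shape for illustration only (not taken from the paper).
k, c_in, c_out = 3, 64, 64

std = standard_conv_params(k, c_in, c_out)
sep = separable_conv_params(k, c_in, c_out)
ratio = sep / std  # algebraically equal to 1/c_out + 1/k**2

print(f"standard: {std} weights, separable: {sep} weights, "
      f"ratio: {ratio:.3f}")
```

 For this assumed layer the separable version needs 4672 weights instead of 36864. The whole-model reduction reported by the authors (50%) is a different quantity, since it depends on which layers are replaced and on the rest of the architecture.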

 Point 4: It would also be good if the authors could replace some of the Internet links in the References with indexed documents.

 Response 4: Thank you for your comment. The authors replaced the internet links with indexed documents (see references #17 and #55). For Clearpath Robotics and NVIDIA Jetson TX1, there are no indexed documents, so the official websites were cited.

 Point 5: The text needs to be revised by a native English-speaking researcher to conform to current English usage.

 Response 5: Thank you for your input. The authors reviewed and edited the manuscript with the overall goal of improving the writing quality.

Reviewer 2 Report

This paper presents an efficient semantic segmentation method that, according to the authors, can run on an embedded platform to determine navigable space in real time. The authors build their approach on previously published research and provide a solid paper describing it. Overall the paper is of good quality. My only minor concern is that all the testing has been performed on a computer server and the approach has not been implemented and tested on real robots. I understand that the scope of the paper isn't to highlight the functionality of the system, but some experimental comparison using the real platforms would have been useful and might have added value for the robotics community.


Author Response

Response to Reviewer 2 Comments

 This paper presents an efficient semantic segmentation method that, according to the authors, can run on an embedded platform to determine navigable space in real time. The authors build their approach on previously published research and provide a solid paper describing it. Overall the paper is of good quality.

 

Point 1: My only minor concern is that all the testing has been performed on a computer server and the approach has not been implemented and tested on real robots. I understand that the scope of the paper isn't to highlight the functionality of the system, but some experimental comparison using the real platforms would have been useful and might have added value for the robotics community.

 Response 1: The authors appreciate the reviewer’s comment. The authors have already validated the ENet implementation on the embedded platform in real time (Asadi et al. 2018) (see line 256). In the future, the authors will implement the proposed segmentation model on an embedded platform, which will enable multiple modules (e.g., SLAM and segmentation) to be combined and run on the same processing unit. By doing this, the latency caused by integrating multiple Jetson TX1 boards through the wired network will be reduced drastically.

 References

Asadi, K., Ramshankar, H., Pullagurla, H., Bhandare, A., Shanbhag, S., Mehta, P., Kundu, S., Han, K., Lobaton, E., and Wu, T. (2018). “Vision-based integrated mobile robotic system for real-time applications in construction.” Automation in Construction, 96.

