Deep Learning-Based Train Obstacle Detection Technology: Application and Testing in Metros
Round 1
Reviewer 1 Report
Comments and Suggestions for Authors
This paper investigates train obstacle detection technology and its testing methods. First, it reviews the technical principles, applications, advantages, and limitations of existing detection systems. In the experimental section, the Intelligent Train Eyewitness (ITE) system is used as an example, where black-box testing is conducted in the Level High-Precision (LH) mode, and white-box testing is conducted in the Level Exploration (LE) mode. The results are analyzed in detail. By comparing the two testing modes, the paper effectively highlights the limitations of existing detection systems and proposes potential improvements. However, the author should address the following points to improve the study:
(1) In the experimental section (Section 4.2), the author introduces the LE and LH modes of the ITE system and explains how the system switches between them. However, the discussion lacks a clear connection to the overall context, making the paragraph appear somewhat disconnected. The author is advised to further explore the suitability of different testing methods for each mode based on their respective characteristics.
(2) Figure 14 presents the specific distribution data of the original and expanded test sets. However, the visual differences between the two datasets are not easily distinguishable. The author is encouraged to provide additional annotations to enhance the clarity of the images.
(3) Some related works are missing.
Comments on the Quality of English Language
N/A
Author Response
Comments 1: In the experimental section (Section 4.2), the author introduces the LE and LH modes of the ITE system and explains how the system switches between them. However, the discussion lacks a clear connection to the overall context, making the paragraph appear somewhat disconnected. The author is advised to further explore the suitability of different testing methods for each mode based on their respective characteristics.
Response 1: Thank you for your precise suggestion. We agree with your assessment. In the updated manuscript, we have added relevant background information on, and the characteristics of, black-box and white-box testing in Section 4.2. We have also briefly explained which testing methods apply to each mode, based on the mode conversion diagram, so that the section aligns with the rest of the paper.
Comments 2: Figure 14 presents the specific distribution data of the original and expanded test sets. However, the visual differences between the two datasets are not easily distinguishable. The author is encouraged to provide additional annotations to enhance the clarity of the images.
Response 2: Thank you for your very important suggestion. Figure 14 did have the drawback that the type of dataset could not be directly identified. To address this, we have added subtitles to the two sub-figures of Figure 14 so that the dataset type can be identified quickly.
Comments 3: Some related works are missing.
Response 3: Thank you for your sincere suggestions. After reading more of the relevant literature, we found that the article did lack coverage of the literature on obstacle detection technology. We have therefore added a new section reviewing infrared detection in obstacle detection technology to address this gap.
Reviewer 2 Report
Comments and Suggestions for Authors
This paper addresses the train obstacle detection problem using deep learning methods. The paper is overall well written and well structured. I find it interesting and relevant to the field. The main advantage of the paper is its application to real-life situations in metros. The current limitations of obstacle detection systems are also thoroughly discussed. The figures and tables are clear and legible. The paper's contributions are identifying those limitations and proposing suggestions to circumvent them. In addition to theoretical references, the authors also provide practical ones.
I would suggest accepting this article after minor revision, based on the following suggestions:
- Is there a need for number 1 next to all authors' names if all authors are from the same institution?
- Consider improving the Abstract by emphasizing whether you dealt with specific limitations of current obstacle detection systems. Also, strengthen the statement of your contribution and briefly state what you achieved and how it can affect future work.
- Revise the whole manuscript for possible grammar/spelling mistakes. I have found one missing "space".
- Consider adding a paper outline in the Introduction. Since you have conducted both theoretical and practical work, an outline will help readers navigate your paper.
- I would expand the Conclusions somewhat. Add your most important conclusions in a few sentences, based on the limitations.
- Explain in more detail the differences between the black-box and white-box approaches in your case.
- Are there any train directions not included in this study and why?
Author Response
Comments 1: Is there a need for number 1 next to all authors' names if all authors are from the same institution?
Response 1: Thank you for your detailed suggestions. We will delete the "1" next to the authors' names to better conform to the paper formatting standards.
Comments 2: Consider improving the Abstract by emphasizing whether you dealt with specific limitations of current obstacle detection systems. Also, strengthen the statement of your contribution and briefly state what you achieved and how it can affect future work.
Response 2: Thank you for your valuable suggestions. We agree that your feedback is essential. Based on your input, we have improved the abstract, explicitly stating that we designed different test cases for various operating scenarios. Additionally, we have refined the introduction, providing a detailed explanation of how we designed test cases for different system modes and conducted the corresponding tests.
Furthermore, we have enhanced the conclusion, reflecting on certain limitations of this experiment, such as the lack of consideration for all weather conditions. We have also outlined our future work in detail. Moving forward, we will explore the use of indoor simulations to address this limitation.
Comments 3: Revise the whole manuscript for possible grammar/spelling mistakes. I have found one missing "space".
Response 3: Thank you for your detailed suggestions. We have re-checked the entire paper in detail and corrected the formatting issues in this revision to make the paper more standardized.
Comments 4: Consider adding a paper outline in the Introduction. Since you have conducted both theoretical and practical work, an outline will help readers navigate your paper.
Response 4: Thank you for your important suggestions. We have added an outline of the paper at the end of the introduction, describing how the paper is constructed and its overall structure: it first discusses the field of obstacle detection, then tests the ITE system, presents and discusses the test results, and finally looks ahead to future work.
Comments 5: I would expand the Conclusions somewhat. Add your most important conclusions in a few sentences, based on the limitations.
Response 5: Thank you for your detailed suggestions. We have improved the conclusions of the paper, emphasized the key contribution of designing specific test cases for different scenarios in different modes of the ITE system, and laid out detailed plans for future work.
Comments 6: Explain in more detail the differences between the black-box and white-box approaches in your case.
Response 6: Thank you for your precise suggestions. We agree with your assessment. In the updated manuscript, we have added the relevant background and characteristics of black-box and white-box testing in Section 4.2, and briefly explained which test methods apply to each mode, based on the mode conversion diagram, so that the section fits the rest of the paper.
Comments 7: Are there any train directions not included in this study and why?
Response 7: Thank you for your critical suggestions. To make the testing more complete, we conducted experiments on Beijing Metro Line 11 and placed some of the experimental images at the end of the experimental section to strengthen the connection between the paper and the real world.
Reviewer 3 Report
Comments and Suggestions for Authors
- Please provide real example images for each obstacle category to clearly illustrate the tasks and improve reader comprehension.
- Clearly specify the exact repository URL or DOI for the dataset used to ensure reproducibility and transparency.
- The manuscript needs extensive English language revisions to enhance clarity, readability, and professionalism throughout.
- Provide a clear justification for selecting deep learning methodologies from both theoretical and empirical perspectives. Include a brief review of related empirical evidence, such as: "Unsupervised deep learning approach using a deep auto-encoder with a one-class support vector machine to detect damage" and "Automated Detection of Exterior Cladding Material in Urban Areas from Street View Images Using Deep Learning." This will strengthen the rationale behind your method selection.
- While precision, recall, F1-score, and LAMR are widely understood metrics, focus more on discussing how these metrics apply specifically to your research context rather than providing the mathematical equations.
- Since precision and recall curves correspond directly to Average Precision (AP), consider using AP as the primary metric. This can streamline the results section, allowing space for more insightful analysis and discussion.
- Include a comparative analysis of your method with other established object detection approaches, such as Faster R-CNN and YOLO, to highlight the strengths and limitations of your approach clearly.
- Provide qualitative analysis with illustrative examples of images where your detector performs particularly well or poorly. This will offer readers valuable insights into your model's practical performance.
- Strengthen your conclusions by clearly stating the research limitations, future research directions, and potential applications of your findings in academic and practical business contexts.
Author Response
Comments 1: Please provide real example images for each obstacle category to clearly illustrate the tasks and improve reader comprehension.
Response 1: Thank you very much for your sincere suggestions. We agree that this is necessary, so we have added real images of the obstacle detection system operating under different weather conditions at the end of the experimental section, together with the system's detection images for different types of obstacles. These images show that the system's detection ability reaches a very high level. In the future, we will give more attention to presenting real images of the system detecting different obstacles, to strengthen the connection between the paper and the real world.
Comments 2: Clearly specify the exact repository URL or DOI for the dataset used to ensure reproducibility and transparency.
Response 2: Thank you for your valuable suggestions. Since there is no publicly available test set for this part of the evaluation, all images in the test set have been manually annotated. The images were collected from real-world driving scenarios in front of trains. For representative sample images, please refer to Figure 14. Because certain confidential scenes are involved, the specific source of the test set cannot be disclosed. If access to the dataset source is required in the future, please feel free to contact us via email.
Comments 3: The manuscript needs extensive English language revisions to enhance clarity, readability, and professionalism throughout.
Response 3: Thank you for your sincere suggestions. We have adjusted and revised passages throughout the text and reorganized some of the language to make the paper clearer.
Comments 4: Provide a clear justification for selecting deep learning methodologies from both theoretical and empirical perspectives. Include a brief review of related empirical evidence, such as: "Unsupervised deep learning approach using a deep auto-encoder with a one-class support vector machine to detect damage" and "Automated Detection of Exterior Cladding Material in Urban Areas from Street View Images Using Deep Learning." This will strengthen the rationale behind your method selection.
Response 4: Thank you for your sincere suggestions. We agree that it is necessary to clarify, from both theoretical and empirical perspectives, why deep learning methods are chosen for obstacle detection. We have therefore added a dedicated Section 3.1 that summarizes in detail how current deep learning-based obstacle detection technology is implemented, and, following your suggestions, added a partial review of infrared obstacle detection to give a comprehensive description of deep learning-based obstacle detection methods. Because the workflow of the ITE system is itself based on deep learning, we focus on explaining and analyzing this type of detection technology.
Comments 5: While precision, recall, F1-score, and LAMR are widely understood metrics, focus more on discussing how these metrics apply specifically to your research context rather than providing the mathematical equations.
Response 5: Thank you for your accurate suggestions. In the updated manuscript, we have deleted the explanations of F1-score and LAMR, and reduced the definitions of precision and recall to the definition of AP. We will also give more attention to explaining the practical significance of the various metrics for this experiment and reduce the use of formulas.
Comments 6: Since precision and recall curves correspond directly to Average Precision (AP), consider using AP as the primary metric. This can streamline the results section, allowing space for more insightful analysis and discussion.
Response 6: Thank you for your important suggestion. In the updated manuscript, we report AP instead of precision and recall data to make the results clearer.
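The relationship between the precision-recall curve and AP raised in Comments 6 can be illustrated with a minimal sketch. It is not taken from the manuscript: the function name and inputs are hypothetical, detections are assumed to have already been matched to ground truth (for example with an IoU threshold), and all-point interpolation is assumed, which is only one of several common AP conventions.

```python
import numpy as np


def average_precision(scores, is_true_positive, num_ground_truth):
    """Area under the precision-recall curve using all-point interpolation.

    scores: confidence of each detection (hypothetical input).
    is_true_positive: 1 if the detection matched a ground-truth box, else 0.
    num_ground_truth: number of annotated objects of this class.
    """
    # Rank detections from most to least confident.
    order = np.argsort(-np.asarray(scores, dtype=float))
    tp = np.asarray(is_true_positive, dtype=float)[order]
    fp = 1.0 - tp

    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(fp)
    recall = cum_tp / max(num_ground_truth, 1)
    precision = cum_tp / np.maximum(cum_tp + cum_fp, 1e-12)

    # Add sentinel points and make precision monotonically non-increasing.
    mrec = np.concatenate(([0.0], recall, [1.0]))
    mpre = np.concatenate(([0.0], precision, [0.0]))
    mpre = np.maximum.accumulate(mpre[::-1])[::-1]

    # Sum the rectangles between consecutive distinct recall values.
    idx = np.where(mrec[1:] != mrec[:-1])[0]
    return float(np.sum((mrec[idx + 1] - mrec[idx]) * mpre[idx + 1]))


if __name__ == "__main__":
    # Three detections, two annotated obstacles; the second detection is a false alarm.
    print(average_precision([0.9, 0.8, 0.7], [1, 0, 1], 2))
```

Because AP summarizes the whole precision-recall curve in a single number per class, reporting it in place of separate precision and recall values shortens the results tables without discarding the ranking information.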
Comments 7: Include a comparative analysis of your method with other established object detection approaches, such as Faster R-CNN and YOLO, to highlight the strengths and limitations of your approach clearly.
Response 7: Thank you for your precise suggestions, which drew our attention to problems we had overlooked. We had omitted an introduction to the obstacle detection network in the white-box testing part of Section 4.4. In the updated manuscript, we have added a description of the obstacle detection network, YOLOv5s, and an introduction to its framework to help readers better understand the testing process.
Comments 8: Provide qualitative analysis with illustrative examples of images where your detector performs particularly well or poorly. This will offer readers valuable insights into your model's practical performance.
Response 8: Thank you for your valuable suggestions. We have included real-world images of the obstacle detection system in operation at the end of the experimental section. The results demonstrate that the system maintains a high level of obstacle detection capability. However, due to the confidentiality of certain scenes, some types of obstacle detection results can only be presented in tabular form. We appreciate your understanding.
Comments 9: Strengthen your conclusions by clearly stating the research limitations, future research directions, and potential applications of your findings in academic and practical business contexts.
Response 9: Thank you for your sincere suggestions. In the updated manuscript, we have described our contributions and the real-world significance of this paper more specifically. The test cases we designed for different scenarios will provide theoretical guidance for optimizing train obstacle detection technology and improving test methods, and will also serve as a reference for the design and evaluation of related systems in the future.
Round 2
Reviewer 3 Report
Comments and Suggestions for Authors
- The plagiarism rate is currently at 16%, which is excessively high. Please ensure originality by paraphrasing and adequately citing sources.
- Please revisit previous comments regarding your theoretical background. Clarify how Geoffrey Hinton's paper directly connects to your research context. Specifically, elaborate on:
  - Why traditional methods are insufficient for object detection in your study.
  - The advantages of employing deep learning, despite its requirements for extensive data and costly hardware.
  - The justification for selecting an empirical research approach. Clearly articulate both theoretical and empirical rationales.
- Figure 2 is currently unclear. Please redraw the figure to enhance clarity and ensure that readers can easily interpret the presented information.
- Regarding the following statement:
  "The experimental results demonstrate that the ITE system generally performs obstacle detection functions effectively. This study develops specific and detailed test cases for different modes of the ITE system, providing theoretical guidance for optimizing train obstacle detection technology and improving testing methodologies. It also serves as a reference for the design and evaluation of related systems in the future. However, this study does not cover many other safety-related functions of the ITE system, and the current testing scope remains limited. Additionally, due to weather constraints, certain black-box test cases in extreme scenarios cannot be executed. Future work considers integrating other functions into the testing process and supplementing obstacle detection tests under various weather conditions through indoor simulations."
  - Clearly specify what theoretical guidance is provided in your study.
  - Summarize the main findings explicitly.
  - Detail concrete applications of your findings in academic research or practical fields.
- Provide a clear explanation of why CSPDarkNet architecture is utilized within the YOLO framework. Additionally, justify your choice of the mosaic algorithm within this context.
- Figure 14 lacks visibility. Ensure labeling and object annotations within the figure are clearly visible, even if some objects within the image are difficult to discern.
- Justify clearly why image transformations (e.g., motion blur, rain, snow, fog, adverse weather conditions) were applied to the original dataset. Explain how incorporating these variations benefits the robustness and generalization of your models.
- Clarify your application of data augmentation on the test dataset. Typically, data augmentation should only be applied to training datasets. Explain or correct your approach, ensuring that test datasets remain unaltered to preserve the integrity and validity of evaluation results.
Comments on the Quality of English Language
The quality of the English should be further improved.
Author Response
Comments 1: The plagiarism rate is currently at 16%, which is excessively high. Please ensure originality by paraphrasing and adequately citing sources.
Response 1: Thank you for your important suggestions. We published a related paper in Sensor in early 2023. In this manuscript, we have tried to reduce the content shared with that article as much as possible. This paper contains many review sections, which may lead to a high duplication rate.
Comments 2: Please revisit previous comments regarding your theoretical background. Clarify how Geoffrey Hinton's paper directly connects to your research context. Specifically, elaborate on:
1. Why traditional methods are insufficient for object detection in your study.
2. The advantages of employing deep learning, despite its requirements for extensive data and costly hardware.
3. The justification for selecting an empirical research approach. Clearly articulate both theoretical and empirical rationales.
Response 2: We appreciate your thoughtful suggestions. We answer your questions from different angles below, and we will add paragraphs as appropriate at the beginning of Section 3.1 to explain the advantages of deep learning.
- Traditional methods that rely on manually designed features (such as Haar-like features, HOG, and SIFT) have limited feature representation capabilities, poor robustness to changes and deformations, difficulty adapting to complex scenes, and low efficiency, so they cannot meet the needs of train obstacle detection.
- In contrast, deep learning models can automatically learn high-level feature representations from images, enabling them to handle complex environments and changes in object appearance. Advances in high-performance hardware support the large computing needs of deep learning models. High-precision training technology further optimizes the feature extraction and classification process, minimizes manual intervention, and improves detection efficiency, which is why deep learning models are widely used in object detection.
- The integration of empirical research methods provides solid data support for this study. At present, there are few testing methods for neural networks in the railway industry, and it is crucial to ensure the validity and reliability of the final test results. Based on a deep understanding of the obstacle detection model, this study conducted comprehensive tests and analyzed the advantages and limitations of the system based on empirical data, providing a new perspective for neural network testing methods in the railway industry.
Comments 3: Figure 2 is currently unclear. Please redraw the figure to enhance clarity and ensure that readers can easily interpret the presented information.
Response 3: Thank you for your critical suggestion. We have redrawn Figure 2 to make the information clearer.
Comments 4: Regarding the following statement:
"The experimental results demonstrate that the ITE system generally performs obstacle detection functions effectively. This study develops specific and detailed test cases for different modes of the ITE system, providing theoretical guidance for optimizing train obstacle detection technology and improving testing methodologies. It also serves as a reference for the design and evaluation of related systems in the future. However, this study does not cover many other safety-related functions of the ITE system, and the current testing scope remains limited. Additionally, due to weather constraints, certain black-box test cases in extreme scenarios cannot be executed. Future work considers integrating other functions into the testing process and supplementing obstacle detection tests under various weather conditions through indoor simulations."
1. Clearly specify what theoretical guidance is provided in your study.
2. Summarize the main findings explicitly.
3. Detail concrete applications of your findings in academic research or practical fields.
Response 4: Thank you for your important suggestions. We address your questions by improving the conclusions.
1. In response to your first and second questions: we have observed that the system performs better on straight track sections but less effectively on curves. Therefore, in future work, manufacturers or researchers can prioritize technical breakthroughs in curve scenarios.
2. Regarding your third question: the testing of deep models is still in its infancy. Unlike the mature methods of traditional software testing, there are almost no precedents for testing neural networks in the railway industry. Ensuring that the final test results are effective and appropriate is crucial. After an in-depth analysis of the deep learning model, this study conducted detailed tests and combined the test data to analyze the strengths and weaknesses of the system. This provides new insights for neural network testing methods in the railway industry.
Comments 5: Provide a clear explanation of why CSPDarkNet architecture is utilized within the YOLO framework. Additionally, justify your choice of the mosaic algorithm within this context.
Response 5: Thank you for your sincere suggestions. We have provided a more detailed explanation of Figure 13 and introduced two additional structures under the CSP architecture. The primary objective of the CSP architecture is to enhance the learning capability of convolutional neural networks while reducing the computational cost of training. The Mosaic algorithm is employed because it significantly increases the diversity and richness of the dataset and simulates more complex scenes and backgrounds, which is particularly important for detecting the various obstacles that may appear during train operation. This data augmentation method not only improves the model's generalization ability but also enhances its robustness when handling different scenarios.
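The idea behind Mosaic augmentation described in Response 5 can be sketched as follows. This is a simplified illustration only, not the YOLOv5 implementation or the authors' code: the function name, the gray fill value, and the cropping strategy are assumptions, and bounding-box labels are not remapped here (the returned offsets indicate how they would need to be shifted and clipped).

```python
import numpy as np


def mosaic(images, out_size=640, rng=None):
    """Paste four HxWx3 uint8 images into the quadrants of one square canvas.

    Returns the canvas and, for each image, the (x0, y0, w, h) region it occupies,
    so that a caller could shift and clip the matching box labels.
    """
    if len(images) != 4:
        raise ValueError("mosaic expects exactly four images")
    rng = np.random.default_rng() if rng is None else rng

    # Random split point, kept away from the borders so no quadrant is tiny.
    cx = int(rng.integers(out_size // 4, 3 * out_size // 4))
    cy = int(rng.integers(out_size // 4, 3 * out_size // 4))

    # Gray background for areas that a small source image does not cover.
    canvas = np.full((out_size, out_size, 3), 114, dtype=np.uint8)

    # (x0, y0, x1, y1) for top-left, top-right, bottom-left, bottom-right.
    quadrants = [
        (0, 0, cx, cy),
        (cx, 0, out_size, cy),
        (0, cy, cx, out_size),
        (cx, cy, out_size, out_size),
    ]

    offsets = []
    for img, (x0, y0, x1, y1) in zip(images, quadrants):
        # Take the top-left crop of the source that fits into the quadrant.
        h = min(y1 - y0, img.shape[0])
        w = min(x1 - x0, img.shape[1])
        canvas[y0:y0 + h, x0:x0 + w] = img[:h, :w]
        offsets.append((x0, y0, w, h))
    return canvas, offsets


if __name__ == "__main__":
    frames = [np.full((640, 640, 3), v, dtype=np.uint8) for v in (10, 20, 30, 40)]
    combined, regions = mosaic(frames, rng=np.random.default_rng(0))
    print(combined.shape, regions)
```

Each training sample built this way mixes objects and backgrounds from four scenes at once, which is the source of the added diversity the response refers to.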
Comments 6: Figure 14 lacks visibility. Ensure labeling and object annotations within the figure are clearly visible, even if some objects within the image are difficult to discern.
Response 6: Thank you for your important suggestion. We have improved this figure so that it conveys the information clearly.
Comments 7: Justify clearly why image transformations (e.g., motion blur, rain, snow, fog, adverse weather conditions) were applied to the original dataset. Explain how incorporating these variations benefits the robustness and generalization of your models.
Response 7: Thank you for your valuable suggestions. The image transformations are applied to the original dataset to simulate the various scenarios that may arise in real-world obstacle detection, so that testing is more comprehensive. Additionally, we have added a comparison of results on the original and expanded datasets in Section 4.4. The results demonstrate that the expanded dataset evaluates more aspects of the model, contributing to a more comprehensive performance assessment and to improvement of the model.
Comments 8: Clarify your application of data augmentation on the test dataset. Typically, data augmentation should only be applied to training datasets. Explain or correct your approach, ensuring that test datasets remain unaltered to preserve the integrity and validity of evaluation results.
Response 8: Thank you for your necessary suggestions. We have added a comparison between the original dataset and the expanded dataset in Section 4.4. The data show that the expanded dataset is better able to test different aspects of the model, which helps to comprehensively evaluate the model's performance.
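One way to reconcile Comments 8 with the expanded test set is to leave the original test images untouched and keep each perturbed copy as a separate subset, so that results can be reported per condition. The sketch below illustrates this under stated assumptions: the function names, the fog model (a uniform blend toward white), and the horizontal box blur are hypothetical stand-ins and are not the transformations used in the manuscript.

```python
import numpy as np


def add_fog(image, density=0.5):
    """Blend a uint8 image toward white; density 0 leaves it unchanged."""
    if not 0.0 <= density <= 1.0:
        raise ValueError("density must be in [0, 1]")
    out = (1.0 - density) * image.astype(np.float32) + density * 255.0
    return np.clip(np.rint(out), 0, 255).astype(np.uint8)


def horizontal_motion_blur(image, length=9):
    """Average each pixel with its neighbors along the row (box kernel)."""
    if length < 1 or length % 2 == 0:
        raise ValueError("length must be a positive odd integer")
    pad = length // 2
    padded = np.pad(image.astype(np.float32), ((0, 0), (pad, pad), (0, 0)), mode="edge")
    # Sliding-window mean along the width via a cumulative sum.
    csum = np.cumsum(padded, axis=1)
    csum = np.concatenate([np.zeros_like(csum[:, :1]), csum], axis=1)
    blurred = (csum[:, length:] - csum[:, :-length]) / length
    return np.clip(np.rint(blurred), 0, 255).astype(np.uint8)


def expand_test_set(images):
    """Return the original images plus perturbed copies, keyed by condition.

    The originals are copied, not modified, so the baseline evaluation stays valid.
    """
    return {
        "original": [img.copy() for img in images],
        "fog": [add_fog(img) for img in images],
        "motion_blur": [horizontal_motion_blur(img) for img in images],
    }


if __name__ == "__main__":
    test_images = [np.random.default_rng(0).integers(0, 256, (48, 64, 3), dtype=np.uint8)]
    subsets = expand_test_set(test_images)
    print({name: len(items) for name, items in subsets.items()})
```

Neither transformation moves objects within the frame, so the manual annotations of the original images can be reused for the perturbed copies, and the gap between the "original" score and each perturbed score gives a direct robustness measure.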
Round 3
Reviewer 3 Report
Comments and Suggestions for Authors
During the review, I identified several critical issues that significantly affect your publication. Please carefully address the following comments:
- Lack of empirical studies:
  You still have not adequately incorporated empirical studies, despite this point being raised in the previous review. Please revisit the initial comments, especially those highlighting specific empirical papers that I mentioned. MDPI has published numerous relevant empirical studies, yet you primarily cited theoretical papers without empirical evidence. You should include and cite more empirical studies.
- Data manipulation issue:
  There appear to be discrepancies in your data handling. The annotations and the reported detection results differ significantly. You must clearly explain the reasons behind this discrepancy.
The English should be further improved, and plagiarism should be avoided.
Author Response
Comments 1: Lack of empirical studies:
You still have not adequately incorporated empirical studies, despite this point being raised in the previous review. Please revisit the initial comments, especially those highlighting specific empirical papers that I mentioned. MDPI has published numerous relevant empirical studies, yet you primarily cited theoretical papers without empirical evidence. You should include and cite more empirical studies.
Response 1: Thank you for your important suggestion. We indeed neglected to cite empirical research in the review section. This time, we introduced two relevant papers in Section 3.1 to make up for our lack of summary of empirical research.
Comments 2: Data manipulation issue:
There appear to be discrepancies in your data handling. The annotations and the reported detection results differ significantly. You must clearly explain the reasons behind this discrepancy.
Response 2: Thank you for your sincere suggestions. We noticed that the original Figure 15 appeared to contain deviations in the manual annotation. Since the information illustrated in Figure 15 can also be explained by other images in the paper, we chose to delete this figure to streamline the paper.
Round 4
Reviewer 3 Report
Comments and Suggestions for Authors
I recommend that this paper be published.
Comments on the Quality of English Language
The English should be improved, in particular the use of present and past tense.