Article

Deep Learning-Based Train Obstacle Detection Technology: Application and Testing in Metros

School of Automation and Intelligence, Beijing Jiaotong University, Beijing 100044, China
* Author to whom correspondence should be addressed.
Electronics 2025, 14(7), 1318; https://doi.org/10.3390/electronics14071318
Submission received: 28 February 2025 / Revised: 23 March 2025 / Accepted: 26 March 2025 / Published: 26 March 2025

Abstract

With the rapid development of urban rail transit, unmanned train driving technology is also advancing rapidly. Automatic obstacle detection is particularly crucial and plays a vital role in ensuring train operation safety. This paper focuses on train obstacle detection technology and testing methods. First, we review existing obstacle detection systems and their testing methods, analyzing their technical principles, application status, advantages, and limitations. In the experimental section, the Intelligent Train Eye (ITE) system is used as a case study. Black-box testing is conducted in the level high-precision (LH) mode, with corresponding test cases designed based on various scenarios that may arise during train operations. White-box testing is performed in the level exploration (LE) mode, where the test results are meticulously recorded and analyzed. The test cases in different modes comprehensively cover the testing requirements for train operations. The results indicate that the ITE system successfully passes most of the test cases and meets the primary functional requirements.

1. Introduction

According to statistics, at the end of 2023, urban rail transit systems operated in 563 cities across 79 countries and regions worldwide, with a total operating mileage exceeding 43,400 km. Urban rail transport features high-density operations, short train intervals, high speeds, and strong transport capacity, with a maximum one-way capacity of 100,000 passengers per hour [1]. Operating on dedicated tracks with minimal interference from other transportation modes, it maintains high punctuality. As a widely adopted mode of transportation, urban rail transit is rapidly expanding. Since the world’s first fully automated line entered service in 1983, more than 40 cities worldwide have introduced fully automated lines into their rail networks, and such systems are now widely implemented.
The primary characteristic of driverless metro trains is the use of advanced automation systems, enabling trains to autonomously perform operations such as automatic wake-up, self-inspection, station departure, precise stopping, and automated door operations without manual intervention. Track information is captured by cameras and transmitted in real time to the control center. However, this method places stringent demands on the integrity, safety, and reliability of the system [2].
IEC 62267 (Automatic Urban Guided Transport (AUGT)—Safety Requirements) [3] establishes safety requirements for fully automated urban rail transit trains, stipulating that trains operating under Driverless Train Operation (DTO) and Fully Automated Operation (FAO) must be equipped with collision prevention functions for tracks and pedestrians. However, with increasing mileage and the growing complexity of construction tasks and operational scenarios, these requirements are increasingly challenged and may even introduce new risks [4]. Furthermore, when the existing signal system degrades due to faults, the level of safety protection drops from Safety Integrity Level 4 (SIL4) to SIL0, forcing trains to reduce speeds to below 25 km/h or come to a complete stop. This increases train intervals, prolongs recovery time from faults, and sharply reduces service quality. As rail transit networks in large cities and megacities rapidly expand and take shape, the imbalance between vehicle capacity and transportation demand on existing lines becomes increasingly prominent, while network integration and system upgrades remain infeasible due to technical and financial constraints.
Active obstacle detection technology based on environmental perception effectively addresses the aforementioned issues. The automotive and rail transit industries are shifting from passive protection to proactive environmental sensing. Using vehicle–road coordination for real-time environmental perception, trains can enhance autonomous safety measures and enable fully driverless operation. Currently, various obstacle detection technologies are used in the transportation sector, mainly including image-based, radar-based, and image–radar fusion methods. In recent years, contactless obstacle detection technologies based on deep learning have been deployed in some domestic metro systems. Track obstacles pose significant threats to rail safety, and research on this topic has become a major focus in both academic research and industry applications. Conducting in-depth studies in this area is highly significant.
The Intelligent Train Eye (ITE) system was developed in response to this demand. It maintains a headway of approximately three minutes between stations even after a train fault occurs and system performance degrades. The system provides SIL2 protection and improves fault recovery efficiency by 30–50%, thereby reducing the burden on both drivers and passengers. The ITE system incorporates a sensor fusion system combining short- and long-focus cameras, Light Detection and Ranging (LiDAR), and millimeter-wave radar [5], leveraging artificial intelligence-based visual perception [6]. This enables the system to autonomously locate the train, assess track clearance, and identify signals. It ensures autonomous perception and safe operation when the primary system fails, using contactless methods to detect obstacles ahead of the train while minimizing reliance on traditional trackside infrastructure.
However, the testing of obstacle detection systems lags behind, especially for the deep learning modules within them. Owing to the black-box nature of deep learning models, their testing is still at an early stage, so the development of this contactless detection technology introduces certain risks. Recent studies indicate that intelligent algorithms based on deep neural networks, such as those used in facial recognition, autonomous driving, and small-object detection, are highly vulnerable to perturbations; the resulting errors are unpredictable, and misclassification may lead to severe consequences [7,8]. Therefore, evaluating the quality of machine learning models and ensuring the safety and reliability of artificial intelligence technology have become critical challenges. Since driverless train operation demands exceptionally high standards of safety and efficiency and any oversight could lead to irreversible consequences, the rigorous testing of such systems is particularly important. Whether the ITE system can reliably perform accurate obstacle detection and effectively replace human drivers still requires extensive validation.
This paper first briefly introduces the development of urban rail transit and then reviews and summarizes the current obstacle detection systems and technologies in this context. It also introduces the ITE system, which demonstrates strong performance in obstacle detection. Subsequently, black-box testing is conducted in the LH mode of the ITE system, while white-box testing is applied to the LE mode.
We aim to rigorously evaluate the ITE system through black-box (input–output-based) testing and comprehensive performance analysis. This includes not only quantitative performance evaluation but also the assessment and safety validation of system behavior across different scenarios, ensuring compliance with functional and performance requirements while enhancing robustness. This is essential to safety-critical systems in various application scenarios.
At the same time, we perform white-box (internal structure-based) testing on the obstacle detection function of the ITE system in LE mode to evaluate its functionality. The test assesses the system’s accuracy in detecting different obstacle types in various scenarios, which is crucial to ensuring train operation safety.
Finally, by integrating the results of black-box and white-box testing, previously unidentified functional gaps in the ITE system are revealed. These gaps represent unmet needs or scenarios that can lead to performance bottlenecks or risks. Identifying these gaps provides valuable information for system enhancement and refinement, improving the performance, robustness, and maintainability of the system to ensure that it fully meets operational requirements.

2. Obstacle Detection System

The research and development of key technologies in rail transit started relatively late but has grown rapidly. Driven by advances in image processing algorithms and sensor technologies, enterprises and academic institutions worldwide have established and refined obstacle detection methods, with extensive research focusing on ultrasonic sensing, LiDAR, and image processing.
Habib Ahmed et al. from Pakistan proposed a novel obstacle detection technology using a set of cost-effective active and passive sensors (i.e., projectors and a monocular vision system) to tackle obstacle detection challenges. This method utilizes ground grid projection and detects stationary and moving obstacles by analyzing grid line deformations [9].
Tang et al. addressed challenges such as false positives, false negatives, and poor environmental adaptability in railway obstacle detection by proposing a multi-sensor fusion active obstacle detection algorithm for rail transit. Experimental results show that this algorithm achieves higher detection accuracy than monocular or LiDAR-based detection methods and effectively identifies sudden intrusions within the train’s clearance envelope [10].
Zhao et al. proposed an active obstacle detection method based on millimeter-wave radar base stations. This method significantly improves the radar detection range, enabling obstacle detection beyond 200 m. By analyzing radar echo data generated from simulations and two-dimensional imaging, the positional information of small spherical models is clearly observed. Establishing communication among the base station, smart vehicles, and vehicle signal processing systems enables obstacle detection [11].
Sarath Manchala et al. from Germany proposed a novel 4D automotive radar, which estimates distance, azimuth, elevation, and velocity in real time. This radar operates at 79 GHz with a 1.6 GHz bandwidth and employs fast linear frequency modulation continuous wave (FMCW). It utilizes Multiple Input Multiple Output (MIMO) technology and BPSK-based coding in transmitted signals to achieve wide-angle field-of-view coverage without blind spots. By optimizing antenna arrangement and simplifying digital beamforming, the system enables real-time signal processing and 3D environmental mapping [12].
Researchers at the Palakkad Institute of Engineering and Technology developed a camera-based railway obstacle detection system. This system segments real-time data into frames, processes them by using the Canny edge detection algorithm, and employs a Convolutional Neural Network (CNN) trained on a pedestrian dataset to detect obstacles. The train driver uses the feedback to control the train’s speed [13].

3. Obstacle Detection System Testing Methods

3.1. Object Detection Technology Based on Deep Learning

The concept of deep learning was first proposed by Hinton et al. in 2006 in the context of artificial neural network research [14]. As an emerging machine learning technology, deep learning mimics the working mechanism of the human brain through artificial neural networks. It enables learning from data through representation learning and has significant applications in numerous fields, including computer vision, natural language processing, speech recognition, and bioinformatics, leading to significant research progress [15].
The widespread application of deep learning in object detection primarily stems from its powerful feature extraction and representation capabilities. Traditional methods rely on manually designed features, such as Haar-like features, Histogram of Oriented Gradients (HOG), and Scale-Invariant Feature Transform (SIFT). However, these methods have limited feature representation capabilities, poor robustness to variations and deformations, and difficulty adapting to complex scenes. In contrast, deep learning models can automatically learn high-level feature representations from images, enabling them to handle complex environments and variations in object appearance, including articulated structures. Additionally, advancements in technology have led to improved hardware performance and greater accessibility of computational resources, addressing the high computational demands of deep learning. Furthermore, high-precision training techniques allow deep learning models to automatically optimize feature extraction and classification processes, reducing the need for manual intervention and improving detection efficiency. Consequently, deep learning models are widely adopted in object detection technology [16].
Wang et al. proposed an unsupervised deep learning method for structural damage detection, integrating a Deep Auto-Encoder (DAE) and a One-Class Support Vector Machine (OC-SVM). This method is trained exclusively on acceleration response data from intact or baseline structures, allowing it to extract damage-sensitive features through a carefully designed DAE. The extracted features are then classified by using an OC-SVM to detect potential future structural damage. Both experimental and numerical studies demonstrate the high precision and stability of the method in identifying structural damage [17]. Wang et al. developed a deep learning model based on Convolutional Neural Networks (CNNs) to automatically classify exterior wall materials in urban areas by using street-view images. The study identifies six types of material: “brick”, “concrete”, “glass”, “stone”, “mixed”, and “other”. The analysis is carried out in two different regions, London and Scotland. The results indicate that the model, which uses MobileNetV3 and data augmentation techniques, achieves accuracy rates of 75.65% in London and 73.45% in Scotland, demonstrating its potential applicability [18].
In the transportation field, deep learning is widely employed in the object detection modules of autonomous driving systems. Object detection primarily focuses on identifying pedestrians, vehicles, traffic lights, and road signs. The core principle involves using advanced technologies such as millimeter-wave radar and LiDAR to develop deep learning-based object detection methods. These systems enhance driving safety and contribute to safer and more intelligent driving.
For example, Zhang et al. proposed an intelligent train obstacle detection system based on deep learning. This system integrates perception information from industrial cameras and LiDAR to achieve track area detection, obstacle detection, and vision–LiDAR data fusion. By using deep learning algorithms, specifically CNNs, the system can detect track areas and potential obstacles in real time while maintaining high accuracy and robustness in complex environments such as at night, in tunnels, and under adverse weather conditions. Experimental results show that the system demonstrates outstanding performance, achieving a perception accuracy of 99.994% and a recall rate of 100% [19].
Furthermore, high-precision mapping and multi-sensor fusion technologies are rapidly evolving. By integrating LiDAR, an Inertial Measurement Unit (IMU), and camera data through Simultaneous Localization and Mapping (SLAM) technology, obstacle detection reliability is significantly enhanced.
D. Li et al. explored a SLAM framework based on multi-sensor and multi-feature fusion, emphasizing its foundational role in high-precision localization, environmental perception, and autonomous decision making. The study highlights that fusing LiDAR, IMU, and camera data substantially enhances obstacle detection reliability and improves the safety of autonomous driving systems [20].
Zhang et al. proposed several effective methods for infrared obstacle detection. First, they introduced a backbone network called Deep-IRTarget and designed the Resource Allocation Model for Features (RAF) to enhance feature extraction in Convolutional Neural Networks (CNNs) [21]. To address the limited diversity of existing infrared obstacle detection datasets, they constructed the first dataset specifically tailored for Infrared Few-Shot Object Detection (IFSOD) [22]. Additionally, they developed the Robust Hybrid Loss function (RHL) to optimize recognition and retrieval by performing a finer-grained division of noisy datasets [23]. Finally, they introduced the Differential Feature Awareness Network (DFANet) based on adversarial learning for infrared and visible-light target detection and verified its effectiveness through extensive experiments [24].
The application and development of these technologies provide strong support for the intelligent development and automation of rail transit and are of significant research and application value.

3.2. Measurement Methods for Deep Learning Systems

The evaluation of deep learning systems primarily relies on measuring their accuracy on test inputs, which are randomly selected from manually labeled datasets and simulated environments [25]. However, this black-box testing approach may miss corner-case behaviors that lead to unexpected errors [26].
Wicker et al. [27] proposed a black-box testing method guided by the Scale-Invariant Feature Transform (SIFT) and demonstrated that it is competitive with the Carlini–Wagner (CW) attack and the Jacobian-based Saliency Map Attack (JSMA). Pei et al. [28] introduced a white-box differential testing algorithm designed to systematically identify inputs that may trigger inconsistencies among multiple deep neural networks (DNNs). They used neuron coverage as a metric to measure the extent of internal logic exercised by a test set.
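To make the neuron coverage idea concrete, the following minimal PyTorch sketch (our own simplification, not the DeepXplore implementation; the per-layer min–max scaling and the 0.5 threshold are illustrative choices) counts the fraction of convolutional and linear neurons whose scaled activation exceeds a threshold on at least one test input:

```python
import torch
import torch.nn as nn

def neuron_coverage(model: nn.Module, inputs: torch.Tensor, threshold: float = 0.5) -> float:
    """Fraction of neurons whose scaled activation exceeds `threshold`
    on at least one input. Simplified from the DeepXplore definition."""
    activated = {}

    def make_hook(name):
        def hook(_module, _inp, out):
            acts = out.detach().flatten(1)                          # (batch, neurons)
            acts = (acts - acts.min()) / (acts.max() - acts.min() + 1e-12)
            fired = (acts > threshold).any(dim=0)                   # fired on any input
            prev = activated.get(name, torch.zeros_like(fired))
            activated[name] = prev | fired
        return hook

    handles = [m.register_forward_hook(make_hook(n))
               for n, m in model.named_modules()
               if isinstance(m, (nn.Conv2d, nn.Linear))]
    with torch.no_grad():
        model(inputs)
    for h in handles:
        h.remove()

    total = sum(v.numel() for v in activated.values())
    fired = sum(int(v.sum()) for v in activated.values())
    return fired / max(total, 1)
```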
DeepTest [29] examines a set of basic image transformations in Open Source Computer Vision Library (OpenCV), such as scaling, shearing, and rotation and shows that these transformations are useful for detecting defects in DNN-driven autonomous vehicles. Following this direction, DeepRoad [30] utilizes input image scene transformations and demonstrates its potential for autonomous driving testing in two scenarios: snowy and rainy conditions. Scene transformations are performed by training a generative adversarial network (GAN) with collected training data that cover the statistical characteristics of the two target scenarios.
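The sketch below illustrates DeepTest-style basic transformations with OpenCV; the parameter values are our own illustrative choices, not those of the original paper. The metamorphic expectation is that the detected obstacle class should remain stable under such mild transformations:

```python
import cv2
import numpy as np

def basic_transforms(image: np.ndarray) -> dict[str, np.ndarray]:
    """DeepTest-style basic image transformations via OpenCV.
    Parameter values are illustrative, not those of the original paper."""
    h, w = image.shape[:2]
    scaled = cv2.resize(image, None, fx=1.2, fy=1.2, interpolation=cv2.INTER_LINEAR)
    shear_m = np.float32([[1, 0.2, 0], [0, 1, 0]])            # shear along the x axis
    sheared = cv2.warpAffine(image, shear_m, (w, h))
    rot_m = cv2.getRotationMatrix2D((w / 2, h / 2), 10, 1.0)  # 10-degree rotation
    rotated = cv2.warpAffine(image, rot_m, (w, h))
    return {"scaled": scaled, "sheared": sheared, "rotated": rotated}
```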
Deep neural networks have a significantly larger dimensionality and potential test space than traditional software. DeepCT [31] adapts the concept of combinatorial testing to DNNs and proposes a set of coverage metrics based on neuron input interactions for each DNN layer to guide test creation, achieving reasonable defect detection capability with a relatively small number of tests. Inspired by the Modified Condition/Decision Coverage (MC/DC) testing standard for traditional software [32], Sun et al. [33] proposed an MC/DC-style criterion for DNNs and demonstrated that tests generated under this criterion outperform random testing in defect detection for small-scale neural networks, which consist of dense layers with no more than 5 hidden layers and 400 neurons. However, whether the MC/DC criterion is applicable to real-world deep learning systems requires further research.
DeepMutation [34] proposes mutating DNNs (i.e., injecting errors at the source or model level) to assess the quality of test data, which may help prioritize test data when evaluating DNN robustness. DeepHunter [35] introduces a coverage-guided fuzz testing framework aimed at uncovering potential defects in DNNs. Its authors present a mutation strategy that generates new, semantics-preserving test cases and use multiple scalable coverage criteria as feedback to guide test case generation.

4. Experiment

4.1. ITE System

The schematic diagram of the ITE system is shown in Figure 1. The hardware components of the ITE system include an ITE host, a telephoto module (telephoto camera + LiDAR), a short-focus module (short-focus camera + LiDAR), and a millimeter-wave radar. The sensors connect to the ITE host via range extension boxes, and an indicator light panel serves as an interactive terminal connected to the ITE host.
The ITE host acts as the core processing unit of the system, responsible for receiving, processing, and analyzing data from various sensors. The telephoto module primarily monitors and collects data from long-distance scenes, while the short-focus module focuses on short-range scene monitoring. The range extension boxes serve as the physical interface between the sensors and the ITE host, facilitating signal transmission and data exchange. The indicator light panel functions as the system’s interactive terminal, providing system status updates and warning information to train operators or other relevant personnel.
These modules work together to form the complete hardware architecture of the ITE system, enabling comprehensive autonomous localization, perception, and obstacle detection capabilities for trains, thereby enhancing the safety and efficiency of rail operations.

4.2. Two Modes of ITE System

The ITE system has two operating modes: level exploration (LE) mode and level high-precision (LH) mode. The system switches between these modes automatically. Initially, the ITE operates in LE mode. Once the system completes high-precision positioning initialization, it transitions to LH mode. If the system receives Intelligent Train Positioning (ITP) information indicating an abnormal position window, direction deviation, or other irregularities, the ITE system switches back to LE mode. The mode transition relationships and conditions are illustrated in Figure 2.
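The switching logic described above amounts to a small state machine. The sketch below is a reconstruction from the text; the condition names are illustrative, not actual ITE interface signals:

```python
from enum import Enum

class Mode(Enum):
    LE = "level exploration"
    LH = "level high-precision"

def next_mode(current: Mode, init_complete: bool, itp_abnormal: bool) -> Mode:
    """Mode transition rule reconstructed from the description above:
    LE -> LH once high-precision positioning initialization completes;
    LH -> LE when ITP reports an abnormal position window, direction
    deviation, or other irregularity."""
    if current is Mode.LE and init_complete:
        return Mode.LH
    if current is Mode.LH and itp_abnormal:
        return Mode.LE
    return current
```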
Black-box testing does not consider the internal structure of a program but assesses whether the system functions as expected from the user’s perspective. Common methods include equivalence partitioning and boundary value testing. In contrast, white-box testing requires an understanding of the program’s internal structure, focusing primarily on code logic and quality. Common techniques include statement coverage and branch coverage.
By comparing the two modes, it is observed that the system in LH mode is more mature and functionally stable, whereas in LE mode, it is less developed and offers fewer functionalities. Taking into account the system testing requirements, black-box testing is more suitable for the LH mode, while white-box testing is more appropriate for the LE mode.

4.3. Black-Box Testing in LH Mode

In LH mode, the system provides positioning functions based on high-precision maps. It incorporates sensor fusion and multi-target tracking based on High-Precision Mapping Processing (HPMP), High-Precision Positioning Processing (HPPP), Intelligent Dynamic Localization Processing (IDLP), Predictive Target Detection Processing (PTDP), and Scene Context Decision Processing (SCDP), enabling enhanced signal detection and obstacle detection. From the perspective of ITE system functionality, black-box testing effectively meets user needs by systematically identifying issues based on test cases, improving accuracy, and facilitating test case generation.
The main steps of black-box testing are as follows: First, based on the requirements of the ITE system, the primary functions of the system are classified. According to these functional requirements, input data are classified into different equivalence classes by using equivalence partitioning and boundary value analysis. Test points and appropriate test standards are then designed accordingly. Finally, the system functions are tested according to the predefined test points and functional requirements.
In order to ensure the reliability and safety of autonomous train operation, the ITE system must have high-safety obstacle detection capabilities to ensure accurate obstacle identification and implement appropriate response measures in complex environments. This enables it to replace manual monitoring while ensuring reliable detection and early warning capabilities. Consequently, it meets the requirements of train control systems. Based on the requirements of the train control system for the ITE system, its specific functional requirements can be further refined, as shown in Figure 3.
By analyzing the requirements of the obstacle detection function in LH mode, we can determine the key conditions that must be considered when designing test cases for high-safety obstacle detection, as illustrated in Figure 4.
At the same time, the equivalence class table is obtained by analyzing obstacle detection performance across different categories by using the equivalence classification method. Since the test case design for pedestrian and small obstacle detection follows a similar approach to that of train obstacle detection, this paper focuses only on the test case design for train obstacle detection.
The equivalence class table for train obstacles is shown in Table 1. The ITE system’s detection range for train obstacles in straight and turnout scenarios is between 30 m and 300 m, while the detection range on curves is between 30 m and 70 m, and on ramps, it is between 30 m and 100 m.
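To illustrate how equivalence partitioning and boundary value analysis turn these ranges into concrete test inputs, the following sketch (our own illustration; the scenario names and the 0.1 m offset are assumptions) enumerates boundary values around each valid class:

```python
# Valid detection ranges from Table 1, in metres (scenario names assumed).
RANGES = {"straight": (30, 300), "turnout": (30, 300), "curve": (30, 70), "ramp": (30, 100)}

def boundary_values(lo: float, hi: float, eps: float = 0.1) -> list[float]:
    """Boundary-value inputs for a valid class [lo, hi]: just below, on,
    and just above each boundary, plus a mid-range representative."""
    return [lo - eps, lo, lo + eps, (lo + hi) / 2, hi - eps, hi, hi + eps]

def expected_report(scenario: str, distance: float) -> bool:
    """Expected ITE behaviour: report only inside the valid equivalence class."""
    lo, hi = RANGES[scenario]
    return lo <= distance <= hi

for scenario, (lo, hi) in RANGES.items():
    for d in boundary_values(lo, hi):
        print(f"{scenario}: obstacle at {d:.1f} m -> "
              f"{'report' if expected_report(scenario, d) else 'no report'}")
```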
In the straight-track test, the test case evaluates the train’s ability to detect obstacles ahead both while running and when stationary. The straight-track scenario is shown in Figure 5.
Based on the equivalence class division table, test cases are designed for different situations. There are two scenarios: the dynamic tracking of two vehicles and a static vehicle. The test cases are shown in Figure 6. TOSD in the figure stands for “Test Of System Data”, the online data playback system of the ITE system.
For Test Case 1, the results are as follows: the ITE system of the tested train remains in LH mode throughout; the obstacle type is reported correctly; and the visible distance is correctly calculated and output based on the line conditions.
For Test Case 2, the results are as follows: the ITE system reports the obstacle once the obstacle train has left the 30 m blind area, and it stops reporting the obstacle once the obstacle train exceeds the 300 m detection limit.
Test Cases 3 and 4 consider the scenario with multiple obstacles and test the ITE system’s ability to assess the rail boundaries. The train passes through the turnout in the fixed position, as shown in Figure 7a, and in the reverse position, as shown in Figure 7b.
The specific operational details for Test Cases 3 and 4 are shown in Figure 8.
For Test Case 3, the results are as follows:
(1) Before the tested train enters the turnout for positioning, the obstacle type reported by the ITE system is correct, and the obstacle positions correspond to Obstacle Train A and Obstacle Train B.
(2) After the tested train enters the turnout for positioning, the ITE system reports only the position and category of Obstacle Train B within the positioning area.
(3) The tested train remains in LH mode throughout the process and does not exit. The identification distance error is ≤±1 m.
The results of Test Case 4 are similar to those of Test Case 3. After the tested train enters the turnout’s reverse positioning area, the ITE system reports only the position and category of Obstacle Train A within the reverse positioning area.
Test Cases 5 and 6 evaluate the ITE system’s ability to detect an obstructing train in a curve. The top view of the scene where an obstructing train stops at a curve is shown in Figure 9. The obstructing train stops at a left curve in front in Figure 9a, and the obstructing train stops at a right curve in Figure 9b.
The specific operational details for Test Cases 5 and 6 are shown in Figure 10.
The results of Test Cases 5 and 6 are as follows:
(1) The ITE system of the tested train remains in LH mode, and the recognition distance error is ≤±1 m.
(2) After the obstacle train enters the curve detection range, the tested train can correctly report the information of the obstacle train ahead.
(3) Before the obstacle train enters the curve detection range and after it enters the blind zone, the ITE system of the tested train cannot recognize the preceding train.
Test Cases 7 and 8 evaluate the ability of the ITE system to detect train obstacles in ramp scenarios. Figure 11 shows the schematic diagram of train obstacle detection in ramp scenarios. The scene with a train obstacle on the uphill ramp ahead is shown in Figure 11a, and the scene with a train obstacle on the downhill ramp ahead is shown in Figure 11b.
The specific operational details for Test Cases 7 and 8 are shown in Figure 12.
The results of Test Case 7 are as follows:
(1) The ITE system of the tested train remains in LH mode, and the recognition distance error is ≤±1 m.
(2) When the tested train enters the 100 m range of the obstacle train, the ITE system of the tested train reports the correct obstacle type and distance.
(3) Before the obstacle train enters the ramp detection range and after it enters the blind zone, the ITE system of the tested train cannot recognize the preceding train.
The expected results of Test Case 8 are consistent with those of Test Case 7.

4.4. White-Box Testing in LE Mode

The ITE system utilizes the YOLO v5s network architecture for obstacle detection. This architecture comprises four main modules: input, backbone network, Feature Pyramid Networks (FPNs), and YOLO Head. Figure 13 presents the structural block diagram. In train obstacle detection, the Mosaic algorithm is used because it significantly enhances the diversity and richness of the dataset. By randomly combining multiple images into a single image, the Mosaic algorithm simulates more complex scenes and backgrounds, which is particularly important for detecting various obstacles encountered during train operation. This data augmentation method not only helps the model generalize better but also improves its robustness in different scenarios. Additionally, image adaptive scaling technology is applied to enhance image quality. The backbone network integrates the Focus module, the Cross Stage Partial (CSP) module, and the Spatial Pyramid Pooling (SPP) module.
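A minimal sketch of the Mosaic idea is given below; it assumes four source images at least as large as the output canvas and omits the bounding-box label remapping that the full YOLO v5 pipeline performs:

```python
import numpy as np

def mosaic(imgs: list[np.ndarray], out_size: int = 640, rng=np.random) -> np.ndarray:
    """Minimal Mosaic augmentation: paste four images around a random centre.
    Assumes each source image is at least out_size x out_size; the full
    YOLO v5 pipeline also remaps the bounding-box labels (omitted here)."""
    assert len(imgs) == 4
    canvas = np.full((out_size, out_size, 3), 114, dtype=np.uint8)  # grey padding
    cx = int(rng.uniform(0.25, 0.75) * out_size)                    # random centre
    cy = int(rng.uniform(0.25, 0.75) * out_size)
    regions = [(0, 0, cx, cy), (cx, 0, out_size, cy),
               (0, cy, cx, out_size), (cx, cy, out_size, out_size)]
    for img, (x1, y1, x2, y2) in zip(imgs, regions):
        canvas[y1:y2, x1:x2] = img[:y2 - y1, :x2 - x1]              # crop to fit
    return canvas
```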
YOLO v5 introduces two CSP architectures, as shown in Figure 14. The CSP1_X architecture is designed for the backbone network, while the CSP2_X architecture is applied to the pyramidal feature network. The algorithm employs two different approaches: one extracts features from the image by using branching mechanisms, while the other stacks features in depth through convolution operations. The primary objective is to enhance the learning capacity of convolutional neural networks while reducing the computational cost of training.
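The following PyTorch sketch illustrates the CSP1_X pattern described above: one branch passes through X residual bottlenecks, the other acts as a cross-stage shortcut, and the two are concatenated and fused by a 1 × 1 convolution. The module names and layer widths are illustrative assumptions, not the ITE implementation:

```python
import torch
import torch.nn as nn

class ConvBNSiLU(nn.Module):
    """Convolution + BatchNorm + SiLU, the basic YOLO v5 building block."""
    def __init__(self, c_in: int, c_out: int, k: int = 1):
        super().__init__()
        self.block = nn.Sequential(
            nn.Conv2d(c_in, c_out, k, 1, k // 2, bias=False),
            nn.BatchNorm2d(c_out),
            nn.SiLU())

    def forward(self, x):
        return self.block(x)

class Bottleneck(nn.Module):
    """Residual bottleneck used inside CSP1_X."""
    def __init__(self, c: int):
        super().__init__()
        self.cv1 = ConvBNSiLU(c, c, 1)
        self.cv2 = ConvBNSiLU(c, c, 3)

    def forward(self, x):
        return x + self.cv2(self.cv1(x))

class CSP1_X(nn.Module):
    """Cross Stage Partial block: bottleneck branch + shortcut branch,
    concatenated and fused by a 1x1 convolution."""
    def __init__(self, c_in: int, c_out: int, x: int = 1):
        super().__init__()
        c_half = c_out // 2
        self.branch1 = nn.Sequential(ConvBNSiLU(c_in, c_half),
                                     *[Bottleneck(c_half) for _ in range(x)])
        self.branch2 = ConvBNSiLU(c_in, c_half)   # cross-stage shortcut
        self.fuse = ConvBNSiLU(2 * c_half, c_out)

    def forward(self, x):
        return self.fuse(torch.cat([self.branch1(x), self.branch2(x)], dim=1))
```

For example, `CSP1_X(64, 64, x=3)(torch.randn(1, 64, 80, 80))` returns a tensor of shape (1, 64, 80, 80); CSP2_X differs mainly in replacing the residual bottlenecks with non-residual convolution stacks.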
In LE mode, the system lacks a positioning function and provides only the most basic obstacle detection function, which is based on radar point clouds and an image deep learning model. The focus should therefore be on testing the obstacle detection module in LE mode. Because this function is implemented with a deep learning model that is still being improved, the white-box testing method is more appropriate: first, the original test dataset is created, followed by dataset expansion through image transformation. The evaluation indicators and neuron coverage metrics are then obtained for both the original and expanded datasets. After the usability of the expanded dataset is verified, it is used to test the target detection deep learning model of the ITE system.
We created a dataset for railway obstacle detection through manual annotation and divided it into a training dataset and a test dataset. All images were captured from real driving scenes in front of a train, with a resolution of 1280 × 720 pixels.
We tested the model by using an expanded dataset to better account for various real-world situations. By applying image transformations to the original dataset, such as adding motion blur, rain, snow, fog, and other adverse weather effects, a much wider range of scenes was incorporated into the test data. The total number of objects in each category of the final original test dataset and the expanded test dataset is shown in Figure 15.
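Two of these expansion transforms can be approximated with standard OpenCV operations, as in the sketch below; these are generic stand-ins, since the exact rain/snow/fog synthesis pipeline used for the expanded dataset is not specified here:

```python
import cv2
import numpy as np

def motion_blur(img: np.ndarray, k: int = 9) -> np.ndarray:
    """Horizontal motion blur via a normalized one-row kernel."""
    kernel = np.zeros((k, k), dtype=np.float32)
    kernel[k // 2, :] = 1.0 / k
    return cv2.filter2D(img, -1, kernel)

def fog(img: np.ndarray, density: float = 0.5) -> np.ndarray:
    """Crude fog effect: blend the image towards a light-grey layer."""
    grey = np.full_like(img, 220)
    return cv2.addWeighted(img, 1.0 - density, grey, density, 0)
```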
We evaluated the key layer neuron coverage of the tested model on both the original annotated dataset and the expanded dataset. Four common neuron coverage metrics were used: K-Multisection Neuron Coverage (KMNC), Neuron Boundary Coverage (NBC), Strong Neuron Activation Coverage (SNAC), and Top-k Neuron Coverage (TKNC). These four metrics comprehensively assess the performance of both the primary functional regions and the corner functional regions of the model, enabling a more thorough evaluation of its overall performance. The neuron coverage performance of the model on the two datasets is presented in Table 2.
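As background for Table 2, the sketch below shows how KMNC and NBC can be computed from profiled per-neuron activation ranges; it assumes activations have already been collected into NumPy arrays of shape (samples, neurons) and is a generic illustration rather than the exact tooling used in this study:

```python
import numpy as np

def kmnc(train_acts: np.ndarray, test_acts: np.ndarray, k: int = 100) -> float:
    """K-Multisection Neuron Coverage: split each neuron's profiled range
    [low, high] into k sections and count the sections hit by the test set.
    Both arrays have shape (num_samples, num_neurons)."""
    low, high = train_acts.min(axis=0), train_acts.max(axis=0)
    width = (high - low) / k
    covered = 0
    for j in range(train_acts.shape[1]):
        if width[j] == 0:
            continue
        vals = test_acts[:, j]
        in_range = vals[(vals >= low[j]) & (vals <= high[j])]
        bins = np.clip(((in_range - low[j]) / width[j]).astype(int), 0, k - 1)
        covered += len(np.unique(bins))
    return covered / (k * train_acts.shape[1])

def nbc(train_acts: np.ndarray, test_acts: np.ndarray) -> float:
    """Neuron Boundary Coverage: fraction of corner cases (activations above
    the profiled upper bound or below the lower bound) hit by the test set."""
    low, high = train_acts.min(axis=0), train_acts.max(axis=0)
    upper = (test_acts > high).any(axis=0)
    lower = (test_acts < low).any(axis=0)
    return (int(upper.sum()) + int(lower.sum())) / (2 * train_acts.shape[1])
```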
In Table 2, LB and UB are the lower and upper boundaries, respectively, of the main functional region of a neuron. From the analysis of the results in Table 2, it is evident that the neuron coverage indicators of the model on the expanded dataset are higher than those on the original dataset. For the KMNC indicator, taking k = 100 as an example, the coverage rate increases from 37.5% on the original dataset to 47.8% on the expanded dataset, a 10.3 percentage point improvement. This result indicates that the expanded dataset exercises the model’s neuron values more comprehensively and covers a larger number of nodes. Since the expanded dataset adds only approximately 10% more images, the increases in NBC and SNAC are not significant; even so, they show that the expanded dataset can assess the performance of corner regions more comprehensively than the original dataset. Compared with KMNC, the improvement in TKNC is also less pronounced, but the expanded dataset still activates more neurons as Top-k activated neurons, thereby facilitating the detection of hidden defects. Overall, these indicators show that the expanded dataset enables the model to be evaluated from more perspectives, contributing to a more comprehensive performance assessment and optimization.
We evaluate the model’s performance by assessing the correctness of its functional implementation. There are two main secondary indicators: task metrics and response time. In object detection tasks, commonly used evaluation metrics include precision (P), recall (R), and F1 score.
Before presenting the evaluation metrics, we first define several data types. In the target detection task, four types of detection results are generated: true positive (TP), true negative (TN), false positive (FP), and false negative (FN). An example of a classification confusion matrix is shown in Table 3.
The predicted box represents the object’s position estimated by the deep learning model based on the input image, while the ground-truth box denotes the actual position of the object in the image. In object detection, the prediction boxes for a given category are sorted by confidence. If the Intersection over Union (IoU) between a prediction box and the ground-truth box exceeds a predefined threshold, the prediction box is marked as a TP. Each ground-truth box is counted as a TP only once, associated with the prediction box that has the highest confidence. Prediction boxes that do not match any ground-truth box are considered FPs, while ground-truth boxes that are not matched are considered FNs.
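In terms of these counts, precision, recall, and the F1 score are computed as follows:

$$
P = \frac{TP}{TP + FP}, \qquad R = \frac{TP}{TP + FN}, \qquad F_1 = \frac{2PR}{P + R}
$$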
Intersection over Union: The IoU is defined as the ratio of the intersection area of two boxes to their union area. A higher IoU indicates a greater overlap between the predicted box and the ground-truth box, reflecting better detection performance. Conversely, a lower IoU suggests a significant deviation between the predicted box and the ground-truth box, indicating poorer detection performance.
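For axis-aligned boxes in (x1, y1, x2, y2) form (the coordinate convention is assumed here), the IoU computation is straightforward:

```python
def iou(a: tuple[float, ...], b: tuple[float, ...]) -> float:
    """Intersection over Union of two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    if inter == 0.0:
        return 0.0
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)
```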
Average precision (AP): To comprehensively evaluate the classification and localization capabilities of object detection, the AP for each category and the mean average precision (mAP) across all categories are commonly used. At a given IoU threshold, the predicted boxes are sorted in descending order based on their confidence scores to compute precision and recall. Precision is the ratio of correctly predicted positive samples to all predicted positive samples, while recall is the ratio of correctly predicted positive samples to all actual positive samples. A precision–recall curve (PR) is plotted with recall on the horizontal axis and precision on the vertical axis. The AP for each category is determined by calculating the area under the PR curve, while the mAP at a specific IoU threshold is obtained by averaging the AP values across all categories.
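A generic sketch of this AP computation is shown below; it uses a simple trapezoidal integration of the precision–recall curve, whereas COCO-style evaluation instead interpolates precision at 101 fixed recall points:

```python
import numpy as np

def average_precision(confidences: np.ndarray, is_tp: np.ndarray, num_gt: int) -> float:
    """AP for one category as the area under the precision-recall curve.
    `is_tp[i]` is True if the i-th prediction matched a ground-truth box at
    the chosen IoU threshold; `num_gt` is the number of ground-truth boxes."""
    order = np.argsort(-confidences)            # sort by descending confidence
    hits = is_tp.astype(bool)[order]
    tp = np.cumsum(hits)
    fp = np.cumsum(~hits)
    recall = tp / max(num_gt, 1)
    precision = tp / np.maximum(tp + fp, 1)
    # Trapezoidal area under the P-R curve (COCO uses 101-point interpolation).
    return float(np.trapz(precision, recall))
```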
Based on the selected evaluation metrics, the extended test dataset is used to assess the target detection deep learning model of the ITE system. The AP test results for each category under different confidence thresholds are presented in Figure 16.
At the same time, in order to more accurately and comprehensively analyze the model’s mAP under different conditions, the COCO evaluation metric is used to obtain the mAP test results under different IoU thresholds, as shown in Table 4.
In Table 4, area represents the pixel area occupied by the target box in the image. Specifically, area = small means that the target occupies a pixel area smaller than 32 × 32, area = medium indicates a pixel area between 32 × 32 and 96 × 96, and area = large refers to a pixel area greater than 96 × 96.
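These size thresholds amount to a simple bucketing rule:

```python
def size_bucket(box_area_px: float) -> str:
    """COCO-style size buckets used in Table 4."""
    if box_area_px < 32 ** 2:
        return "small"
    if box_area_px <= 96 ** 2:
        return "medium"
    return "large"
```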
By analyzing the test results, it is observed that the overall mAP of the model reaches 0.764, indicating a certain level of accuracy and performance in obstacle detection. When the IoU threshold is set to 0.5, the model achieves an mAP of 0.991, suggesting a high level of accuracy in identifying obstacles. When the IoU threshold is increased to 0.75, the mAP decreases to 0.867, indicating that the model maintains strong detection performance even under stricter localization requirements.
Regarding obstacle sizes, the model performs well for medium and large targets, achieving mAP values of 0.695 and 0.861, respectively. This suggests that the model is particularly effective in detecting larger obstacles. However, for small obstacles, the mAP is only 0.421, indicating room for improvement in small-object detection.
Further analysis of the model’s AP performance across different categories reveals that the AP values for Train, Person, and Obstacle all reach 0.99, while the AP for Stop is 0.93. Since the Stop category primarily consists of small objects, this result further confirms the model’s relatively weaker performance in detecting small targets.
In order to strengthen the connection between the test and the actual scene, we conducted specific experiments on Beijing Metro Line 11. Figure 17 shows the results of the system’s detection of different types of obstacles. It can be seen that the system can accurately identify various types of obstacles.
Figure 18 presents a test image showing the recognition ability of the obstacle detection system. The results show that under various lighting conditions, the system performs well and effectively realizes the detection and recognition of obstacles.

5. Discussion

This study integrates the functional principles of various subsystems of the ITE system. After understanding the functional requirements of the ITE system, black-box testing and deep learning model evaluations are designed for different operational modes.
From the black-box testing results, it is evident that the high-safety obstacle detection system performs well in straight sections, with no significant false alarms. However, the main challenge lies in detecting large foreign objects outside curved sections. To enhance the ITE system’s obstacle detection performance in these areas, future efforts should focus on optimizing sensor configurations, training region-adaptive detection models, and augmenting curved-section data to improve the model’s ability to handle complex scenarios more effectively.
From the white-box testing results, it is observed that the obstacle detection module in LE mode of the ITE system performs well in the Obstacle and Train categories but has certain shortcomings in the Person and Stop categories. This issue may arise due to feature similarities between the Person category and the background or other categories (such as Stop signs), leading to misclassification. Additionally, an insufficient number of Stop sign samples in the training data may result in limited model generalization. In the future, more advanced feature extraction methods will be considered, such as using higher-resolution inputs or multi-scale feature fusion. Furthermore, data augmentation techniques (e.g., color adjustment, rotation, and noise addition) will be applied to enhance the model’s adaptability to the Stop category in various scenarios.

6. Conclusions

This paper first summarizes existing obstacle detection systems and their corresponding test methods; then, it focuses on the ITE system, conducts an in-depth analysis of its working principles, and designs both black-box and white-box tests for different operating modes. The experimental results indicate that the obstacle detection system performs well in straight sections, with no significant false obstacle alarms observed. The primary issue is the detection of foreign objects outside curves. In subsequent optimization efforts, priority will be given to technological improvements and breakthroughs for curve scenarios, which also serve as a reference for the future design and evaluation of related systems. However, this study does not encompass many other safety-related functions of the ITE system, and the current test scope remains limited. Additionally, due to weather constraints, black-box test cases for certain extreme scenarios could not be executed. Future work will consider integrating additional functions into the test process and supplementing obstacle detection tests under various weather conditions through indoor simulations.

Author Contributions

Conceptualization, Y.S. and F.Y.; methodology, F.Y.; software, Y.S.; validation, Y.S.; formal analysis, Y.S.; investigation, Y.S.; resources, Y.G.; data curation, Y.G.; writing—original draft preparation, Y.G.; writing—review and editing, F.Y.; visualization, F.Y.; supervision, F.Y.; project administration, F.Y.; funding acquisition, F.Y. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the National Natural Science Foundation of China Regional Innovation Joint Fund Key Project (U22A2053), the Central Universities Basic Research Business Fund Project (2022JBZY024), the Guangxi Key R&D Project (Guike AB22035008), the Guangxi Science and Technology Plan Project “Research and Application of Intelligent Operation and Maintenance System for Rail Transit Communication” (Guike AB23075209), and the Rail Transit Beijing Laboratory.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are not available for commercial reasons at this stage.

Acknowledgments

Technical support from Beijing AI for Rail Technology Co., Ltd. is gratefully acknowledged. The authors would like to thank the anonymous reviewers for their insightful and constructive comments.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Hu, J.; Yang, M.; Zhen, Y. A Review of Resilience Assessment and Recovery Strategies of Urban Rail Transit Networks. Sustainability 2024, 16, 6390. [Google Scholar] [CrossRef]
  2. Wen, X.; Si, B.; Wei, Y.; Cui, H. Resilience assessment of urban rail transit systems: A literature review. Public Transp. 2025, 17, 1–25. [Google Scholar]
  3. IEC 62267:2009; Railway Applications-Automated Urban Guided Transport (AUGT)-Safety Requirements. IEC: Geneva, Switzerland, 2009.
  4. Tiusanen, R.; Malm, T.; Ronkainen, A. An overview of current safety requirements for autonomous machines—Review of standards. Open Eng. 2020, 10, 665–673. [Google Scholar] [CrossRef]
  5. Sengupta, A.; Cheng, L.; Cao, S. Robust multiobject tracking using mmwave radar-camera sensor fusion. IEEE Sensors Lett. 2022, 6, 1–4. [Google Scholar]
  6. Li, X.; Zhang, S.; Chen, X.; Wang, Y.; Fan, Z.; Pang, X.; Hu, J.; Hou, K. Research on environmental adaptability of AI-based visual perception system under the perspective of vibration. Expert Syst. Appl. 2023, 231, 120636. [Google Scholar] [CrossRef]
  7. Heaven, D. Why deep-learning AIs are so easy to fool. Nature 2019, 574, 163–166. [Google Scholar] [PubMed]
  8. Szegedy, C. Intriguing properties of neural networks. arXiv 2013, arXiv:1312.6199. [Google Scholar]
  9. Ahmed, H.; Nizami, M.H.A.; Shah, S.I.A.; Ayaz, Y. Monocular Vision-Based Obstacle Detection Technique using Projected Grid Deformation. In Proceedings of the 2018 IEEE International Conference on Information and Automation (ICIA), Wuyishan, China, 11–13 August 2018; pp. 1599–1604. [Google Scholar]
  10. Tang, Z.; Yang, J. Research on Active Obstacle Detection Algorithm of Rail Train Based on Multi-sensor. In Proceedings of the 2023 8th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA), Chengdu, China, 21–23 April 2023; pp. 556–559. [Google Scholar]
  11. Zhao, J.; Lou, C.; Hao, H. Intelligent vehicle communication and obstacle detection based on millimetre wave radar base station. In Proceedings of the 2021 International Wireless Communications and Mobile Computing (IWCMC), Virtual, 28 June–2 July 2021; pp. 1818–1822. [Google Scholar]
  12. Li, G.; Sit, Y.L.; Manchala, S.; Kettner, T.; Ossowska, A.; Krupinski, K.; Sturm, C.; Lubbert, U. Novel 4D 79 GHz radar concept for object detection and active safety applications. In Proceedings of the 2019 12th German Microwave Conference (GeMiC), Stuttgart, Germany, 25–27 March 2019; pp. 87–90. [Google Scholar]
  13. Athira, S. Image processing based real time obstacle detection and alert system for trains. In Proceedings of the 2019 3rd International Conference on Electronics, Communication and Aerospace Technology (ICECA), Coimbatore, India, 12–14 June 2019; pp. 740–745. [Google Scholar]
  14. Hinton, G.E.; Salakhutdinov, R.R. Reducing the dimensionality of data with neural networks. Science 2006, 313, 504–507. [Google Scholar] [CrossRef] [PubMed]
  15. Pravallika, A.; Hashmi, M.F.; Gupta, A. Deep Learning Frontiers in 3D Object Detection: A Comprehensive Review for Autonomous Driving. IEEE Access 2024, 12, 12345–12367. [Google Scholar]
  16. Trigka, M.; Dritsas, E. A Comprehensive Survey of Machine Learning Techniques and Models for Object Detection. Sensors 2025, 25, 214. [Google Scholar] [CrossRef] [PubMed]
  17. Wang, Z.; Cha, Y.J. Unsupervised deep learning approach using a deep auto-encoder with a one-class support vector machine to detect damage. Struct. Health Monit. 2021, 20, 406–425. [Google Scholar] [CrossRef]
  18. Wang, S.; Han, J. Automated detection of exterior cladding material in urban area from street view images using deep learning. J. Build. Eng. 2024, 96, 110466. [Google Scholar]
  19. Zhang, Q.; Yan, F.; Song, W.; Wang, R.; Li, G. Automatic obstacle detection method for the train based on deep learning. Sustainability 2023, 15, 1184. [Google Scholar] [CrossRef]
  20. Li, D.; Zhao, Y.; Wang, W.; Guo, L. Localization and Mapping Based on Multi-feature and Multi-sensor Fusion. Int. J. Automot. Technol. 2024, 25, 123–456. [Google Scholar] [CrossRef]
  21. Zhang, R.; Xu, L.; Yu, Z.; Shi, Y.; Mu, C.; Xu, M. Deep-IRTarget: An automatic target detector in infrared imagery using dual-domain feature extraction and allocation. IEEE Trans. Multimed. 2021, 24, 1735–1749. [Google Scholar]
  22. Zhang, R.; Yang, B.; Xu, L.; Huang, Y.; Xu, X.; Zhang, Q.; Jiang, Z.; Liu, Y. A Benchmark and Frequency Compression Method for Infrared Few-Shot Object Detection. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5001711. [Google Scholar] [CrossRef]
  23. Zhang, R.; Cao, Z.; Huang, Y.; Yang, S.; Xu, L.; Xu, M. Visible-Infrared Person Re-identification with Real-world Label Noise. IEEE Trans. Circuits Syst. Video Technol. 2025, early access. [Google Scholar] [CrossRef]
  24. Zhang, R.; Li, L.; Zhang, Q.; Zhang, J.; Xu, L.; Zhang, B.; Wang, B. Differential Feature Awareness Network Within Antagonistic Learning for Infrared-Visible Object Detection. IEEE Trans. Circuits Syst. Video Technol. 2024, 34, 6735–6748. [Google Scholar] [CrossRef]
  25. Witten, I.H.; Frank, E.; Hall, M.A.; Pal, C.J. Data Mining Practical Machine Learning Tools and Techniques; Morgan Kaufmann: Burlington, MA, USA, 2017. [Google Scholar]
  26. Goodfellow, I.; Papernot, N. The Challenge of Verification and Testing of Machine Learning. Cleverhans-blog 2017. Available online: https://cleverhans.io/security/privacy/ml/2017/06/14/verification.html (accessed on 25 March 2025).
  27. Wicker, M.; Huang, X.; Kwiatkowska, M. Feature-guided black-box safety testing of deep neural networks. In Proceedings of the Tools and Algorithms for the Construction and Analysis of Systems: 24th International Conference, TACAS 2018, Held as Part of the European Joint Conferences on Theory and Practice of Software, ETAPS 2018, Thessaloniki, Greece, 14–20 April 2018; Springer: Berlin/Heidelberg, Germany, 2018; pp. 408–426. [Google Scholar]
  28. Pei, K.; Cao, Y.; Yang, J.; Jana, S. Deepxplore: Automated whitebox testing of deep learning systems. In Proceedings of the 26th Symposium on Operating Systems Principles, Shanghai, China, 28–31 October 2017; pp. 1–18. [Google Scholar]
  29. Tian, Y.; Pei, K.; Jana, S.; Ray, B. Deeptest: Automated testing of deep-neural-network-driven autonomous cars. In Proceedings of the 40th International Conference on Software Engineering, Gothenburg, Sweden, 27 May–3 June 2018; pp. 303–314. [Google Scholar]
  30. Zhang, M.; Zhang, Y.; Zhang, L.; Liu, C.; Khurshid, S. DeepRoad: GAN-based metamorphic autonomous driving system. In Proceedings of the 2018 26th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE’18), Lake Buena Vista, FL, USA, 4–9 November 2018; ACM: New York, NY, USA, 2018; pp. 132–142. [Google Scholar]
  31. Ma, L.; Zhang, F.; Xue, M.; Li, B.; Liu, Y.; Zhao, J.; Wang, Y. Combinatorial testing for deep learning systems. arXiv 2018, arXiv:1806.07723. [Google Scholar]
  32. Hayhurst, K.J. A Practical Tutorial on Modified Condition/Decision Coverage; DIANE Publishing: Collingdale, PA, USA, 2001. [Google Scholar]
  33. Sun, Y.; Huang, X.; Kroening, D.; Sharp, J.; Hill, M.; Ashmore, R. Testing deep neural networks. arXiv 2018, arXiv:1803.04792. [Google Scholar]
  34. Ma, L.; Zhang, F.; Sun, J.; Xue, M.; Li, B.; Juefei-Xu, F.; Xie, C.; Li, L.; Liu, Y.; Zhao, J.; et al. Deepmutation: Mutation testing of deep learning systems. In Proceedings of the 2018 IEEE 29th International Symposium on Software Reliability Engineering (ISSRE), Memphis, TN, USA, 15–18 October 2018; pp. 100–111. [Google Scholar]
  35. Xie, X.; Ma, L.; Juefei-Xu, F.; Xue, M.; Chen, H.; Liu, Y.; Zhao, J.; Li, B.; Yin, J.; See, S. Deephunter: A coverage-guided fuzz testing framework for deep neural networks. In Proceedings of the 28th ACM SIGSOFT International Symposium on Software Testing and Analysis, Beijing, China, 15–19 July 2019; pp. 146–157. [Google Scholar]
Figure 1. ITE system structure.
Figure 2. Conversion between two modes of ITE system.
Figure 3. Requirements for the obstacle detection function.
Figure 4. Design elements of high-safety obstacle detection function test case.
Figure 5. Schematic diagram of train straight-track scenario.
Figure 6. Test cases for a train obstacle on a straight track.
Figure 7. Schematic diagram of multiple-obstacle train detection in turnout scenarios.
Figure 8. Test cases with train obstacles in turnout scenarios.
Figure 9. Top view of obstacle detection in the curve scene.
Figure 10. Test cases with train obstacles in curve scenarios.
Figure 11. Schematic representation of obstacle detection in ramp scenario.
Figure 12. Test cases with train obstacles in ramp scenarios.
Figure 13. Structure of target detection network.
Figure 14. Two types of CSP structures.
Figure 15. Total number of targets for each category.
Figure 16. AP test results of the model under different categories.
Figure 17. ITE system identification results.
Figure 18. The recognition results of the ITE system under different simulated weather conditions.
Table 1. Equivalence classes of train obstacles.

| Input Conditions | Straight Track: Obstacle Distance (L1) | Turnout: Obstacle Distance (L2) | Curve: Obstacle Distance (L3) | Ramp: Obstacle Distance (L4) |
|---|---|---|---|---|
| Effective Equivalence Class | 30 m ≤ L1 ≤ 300 m (1) | 30 m ≤ L2 ≤ 300 m (2) | 30 m ≤ L3 ≤ 70 m (3) | 30 m ≤ L4 ≤ 100 m (4) |
| Ineffective Equivalence Class | L1 < 30 m (5); L1 > 300 m (6) | L2 < 30 m (7); L2 > 300 m (8) | L3 < 30 m (9); L3 > 70 m (10) | L4 < 30 m (11); L4 > 100 m (12) |
Table 2. Neuron coverage test of the model on different datasets.

| Neuron Coverage Metric | Parameter Selection | Original Dataset Test Results (%) | Expanded Dataset Test Results (%) |
|---|---|---|---|
| KMNC | k = 100 | 37.5 | 47.8 |
| KMNC | k = 500 | 24.8 | 26.5 |
| NBC | LB = 0.5, UB = 0.9 | 3.76 | 3.9 |
| SNAC | UB = 0.9 | 0.62 | 0.66 |
| TKNC | k = 10 | 1.23 | 1.36 |
| TKNC | k = 100 | 4.33 | 4.58 |
| TKNC | k = 1000 | 7.86 | 8.13 |
Table 3. Classification confusion matrix.

|  | Positive Sample | Negative Sample |
|---|---|---|
| Predicted positive | Positive sample predicted as positive (true positive, TP) | Negative sample predicted as positive (false positive, FP) |
| Predicted negative | Positive sample predicted as negative (false negative, FN) | Negative sample predicted as negative (true negative, TN) |
Table 4. mAP of the model at different IoU thresholds.

| Evaluation Object | Evaluation Metric | Result |
|---|---|---|
| mAP | IoU = 0.50:0.95, area = all, maxDets = 100 | 0.764 |
| mAP | IoU = 0.50, area = all, maxDets = 100 | 0.991 |
| mAP | IoU = 0.75, area = all, maxDets = 100 | 0.867 |
| mAP | IoU = 0.50:0.95, area = small, maxDets = 100 | 0.421 |
| mAP | IoU = 0.50:0.95, area = medium, maxDets = 100 | 0.695 |
| mAP | IoU = 0.50:0.95, area = large, maxDets = 100 | 0.861 |