1. Introduction
Rail travel is one of the safest means of transportation [1], and its use has been steadily increasing post-pandemic, already reaching levels similar to or higher than those of 2019. In 2023, EU rail passenger transport hit its highest level in years, reaching 429 billion passenger-kilometers and surpassing the pre-pandemic numbers of 2019 for the first time [2]. In the European Union alone, as of 2022, there were 201,922 km of rail tracks [3].
Considering the significant length of rail tracks in Europe, railway intrusion, i.e., when an unexpected object is within the railroad’s clearance gauge, is one of the major safety topics. In 2022 alone, in the European Union, collisions, including with obstacles, caused 43 serious injuries or fatalities, and derailments caused another 27, out of a recorded total of 126 rail-related serious injuries or fatalities [4]. If potential obstacles on the railway can be detected ahead of time, accidents can be avoided, preventing deaths, damage, and service disruptions. While advancements in safety technology have been made, effectively detecting and mitigating these hazards remains an ongoing challenge.
The search for improved railway safety has spurred extensive research into obstacle detection technologies over the last few decades [5,6,7,8,9]. The literature describes a wide array of technologies used to address this issue [10,11,12,13], ranging from camera-based systems to radar. Many such approaches employ data fusion [14,15,16], mitigating the limitations of some methods with the strengths of others.
Camera-based obstacle detection systems are particularly popular in the automotive sector, forming the basis of Tesla’s Autopilot obstacle detection algorithms [17]. Although such camera-based systems can be cheaper due to reduced hardware cost, they are significantly affected by lighting conditions and other environmental factors, such as dirt and rain.
Radar, on the other hand, is significantly more weather-independent and longer-range but offers lower spatial resolution and, thus, less information in complex environments.
LiDAR (Light Detection and Ranging) is another widely used approach for obstacle detection in autonomous vehicles and is a major part of Waymo’s algorithms [18]. It creates a 3D representation of the vehicle’s surroundings, which can be leveraged to detect obstacles. However, the amount and type of data make its processing more complex and computationally heavy. Nonetheless, it also has significant advantages over cameras [19,20], providing a 3D representation of the surroundings while being unaffected by lighting conditions.
As all methods have inherent strengths and weaknesses, they are usually combined with other technologies in data fusion approaches, with any pair of the three mentioned approaches being common [11,14,17,18]. Other sources of fusion data, while also used, are less common.
In automotive applications, the sensing devices are almost always mounted on the vehicle, as it would not be feasible otherwise. In railway applications, on the other hand, because trains follow a consistent, pre-determined path, it is feasible to monitor only critical regions using fixed sensing devices. Several studies have focused on using LiDAR in a fixed position along the railway, such as at level crossings. In these cases, a LiDAR sensor is usually mounted atop a post to monitor a limited surrounding area. For example, in [21], a 3D mechanical LiDAR system, fixed near a railway and able to scan an area of 50 m, is presented; the authors also describe the rail extraction methods and algorithms developed for obstacle detection: an SFRE (scanline feature-based rail extraction) algorithm that retains track characteristics, combined with Octree downsampling to reduce computational overhead. In [22], a multi-modal contrast learning strategy, DHT-CL (neighborhood attention-driven multi-modal contrast learning strategy), is proposed, using multiple sets of LiDAR sensors mounted on trackside signal poles. Ref. [23] presents a system for obstacle detection at railway level crossings, with the main focus being the detection of smaller objects, such as rocks, using 3D point clouds acquired with tilting 2D laser scanners.
However, such stationary approaches only monitor a small section of the track, for example, a level crossing where the flow of vehicles and people across the tracks is expected to be high, and disregard the rest of the train’s path, where obstacles such as landslides can also appear. Meanwhile, most automotive applications follow a machine-learning-based approach [24,25], where the availability of large amounts of good data is paramount. In road-based transportation, it is significantly easier to acquire such data, including situations with obstacles, than in railway-based transportation.
Therefore, this paper follows our previous work [26] and describes a data-fusion framework that combines on-vehicle data from LiDAR and GNSS (Global Navigation Satellite System) with pre-existing path knowledge to detect obstacles deterministically.
The proposed data fusion approach combines two commonly used information sources, LiDAR and GNSS, with an uncommon one, path maps. This is only feasible in a railway environment, as, unlike cars, trains will usually follow a predetermined path with little deviation, especially in single- or double-track railways.
This enables the use of simplified methods that require only moderate amounts of data for development, thus circumventing the large data requirements of a machine learning algorithm, which would not be feasible to obtain in the project’s context.
2. Method
Machine-learning-based approaches are currently the most common choice for obstacle detection in vehicles, particularly road-bound ones. In a railway environment, obstacle detection is, by itself, less common than in a road environment, and thus there are no open datasets that can be used. This, combined with the fact that obtaining good-quality data on the railway is significantly more challenging, made a data-heavy solution like machine learning unfeasible.
An alternative to training a model on large amounts of data is a map-based approach, where prior knowledge of the train’s route, which is predetermined and constrained by the rails, can be leveraged to determine the target volume relative to the train’s location.
Figure 1 shows an overview of the solution, which is based on two live data sources: the GNSS/GPS (Global Navigation Satellite System/Global Positioning System) coordinates and the LiDAR point cloud. The GPS coordinates are used to determine the train’s position along its scheduled path and to orient the point cloud data in the same coordinate system. This enables the definition of a target volume at a predefined distance in the LiDAR point cloud, within which obstacles on the track will be searched for.
A commonly used [27,28] clustering algorithm by Ester et al. [29], known as DBSCAN (Density-Based Spatial Clustering of Applications with Noise), was employed for the obstacle identification phase, as shown in Figure 2.
The relevant parameters for implementing DBSCAN are eps, the maximum distance between two neighboring points, and the minimum number of samples required to form a dense region. A final, post-processing parameter is the number of points that a dense region must contain to be considered an obstacle.
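For illustration, the sketch below shows how this clustering step could look, assuming scikit-learn’s DBSCAN implementation; the function name and the default parameter values are placeholders rather than the configuration reported in Table 1.

```python
# Minimal sketch of the obstacle identification step, assuming scikit-learn's
# DBSCAN. Parameter defaults are illustrative placeholders, not the values
# actually used in the described system.
import numpy as np
from sklearn.cluster import DBSCAN

def find_obstacles(points_xyz: np.ndarray,
                   eps: float = 0.5,            # max distance between neighboring points (m)
                   min_samples: int = 5,        # samples needed to form a dense region
                   min_cluster_points: int = 20) -> list[np.ndarray]:
    """Cluster the points inside the search area and keep only dense clusters."""
    if len(points_xyz) == 0:
        return []
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points_xyz)
    obstacles = []
    for label in set(labels) - {-1}:             # -1 marks DBSCAN noise points
        cluster = points_xyz[labels == label]
        if len(cluster) >= min_cluster_points:   # post-processing threshold
            obstacles.append(cluster)
    return obstacles
```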
As previously stated, the fact that the train follows a pre-determined path is one of the pillars of the proposed method. This path is therefore defined in a descriptor file, which is used as a reference for the real-time processing of LiDAR data. It contains a sequence of track coordinates, i.e., latitude, longitude, and altitude, sampled at periodic intervals along the path, which are used to calculate the location of the search region.
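As an illustration, such a descriptor could be stored as a simple table of latitude, longitude, and altitude samples; the CSV layout, column names, and loader below are hypothetical, since the actual file format is not specified here.

```python
# Hypothetical path descriptor loader: one row per sampled track point,
# with latitude/longitude in decimal degrees and altitude in meters.
# The real descriptor format used in the project may differ.
import csv

def load_path_descriptor(path: str) -> list[tuple[float, float, float]]:
    """Read the ordered sequence of (lat, lon, alt) track coordinates."""
    with open(path, newline="") as f:
        reader = csv.DictReader(f)  # assumes header columns: lat, lon, alt
        return [(float(r["lat"]), float(r["lon"]), float(r["alt"])) for r in reader]
```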
Figure 3 presents a high-level overview of the processing cycle of the proposed solution. After acquiring LiDAR and GPS data, the first step is determining the train’s GPS position on the predefined path. This is performed by selecting the point of the train’s path closest to the train’s actual location using the Haversine formula.
The Haversine formula, Equation (1), determines the great circle distance between two points on a sphere given their longitudes and latitudes (the shortest distance over the Earth’s surface). It is a special case of a more general formula in spherical trigonometry, the Law of Haversines, which relates the sides and angles of spherical triangles.
If we consider two points A and B given by their respective latitudes $\varphi$ and longitudes $\lambda$, as $A = (\varphi_A, \lambda_A)$ and $B = (\varphi_B, \lambda_B)$ (in radians), the great circle distance between A and B is $d$. This great circle has a radius of $r$ and a central angle of $\theta$ between the two points, which leads to $d = r\,\theta$. The haversine $\operatorname{hav}(\theta)$ corresponding to the central angle $\theta$ is then

$$\operatorname{hav}(\theta) = \operatorname{hav}(\varphi_B - \varphi_A) + \cos(\varphi_A)\cos(\varphi_B)\operatorname{hav}(\lambda_B - \lambda_A), \qquad \operatorname{hav}(\theta) = \sin^2\!\left(\frac{\theta}{2}\right). \tag{1}$$

With $\operatorname{hav}(\theta)$ computed for the pair of points (A, B), the central angle $\theta$ can be calculated using the inverse haversine function:

$$\theta = 2\arcsin\!\left(\sqrt{\operatorname{hav}(\theta)}\right).$$

Finally, $d$ can be calculated as

$$d = r\,\theta = 2r\arcsin\!\left(\sqrt{\operatorname{hav}(\varphi_B - \varphi_A) + \cos(\varphi_A)\cos(\varphi_B)\operatorname{hav}(\lambda_B - \lambda_A)}\right).$$
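A minimal sketch of this matching step is shown below, assuming the commonly used mean Earth radius of 6371 km; the function names and the path representation (a list of (lat, lon, alt) tuples) are illustrative.

```python
# Haversine great-circle distance and closest-path-point selection.
# The Earth radius value and function names are illustrative assumptions.
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius, in meters

def haversine_m(lat_a, lon_a, lat_b, lon_b):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi_a, phi_b = math.radians(lat_a), math.radians(lat_b)
    d_phi = math.radians(lat_b - lat_a)
    d_lambda = math.radians(lon_b - lon_a)
    hav = math.sin(d_phi / 2) ** 2 + \
          math.cos(phi_a) * math.cos(phi_b) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(hav))

def closest_path_index(lat, lon, path):
    """Index of the descriptor point (lat, lon, alt) closest to the train."""
    return min(range(len(path)),
               key=lambda i: haversine_m(lat, lon, path[i][0], path[i][1]))
```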
With the closest point determined, it is possible to follow the track to define the target volume. This is carried out by selecting a point along the tracks in front of the train at a predefined distance, X m, which becomes the central point of the volume within which obstacles will be searched for.
To define the target volume from the central search point in latitude–longitude coordinates, the GPS and LiDAR coordinate systems must first be aligned. To accomplish this, it is necessary to calculate the angle between three GPS points (A, B, C), where A and B are the GPS points before and after the current GPS train coordinates, and C is the GPS central search point. A and B are used to define the train’s bearing, and C is used to determine the orientation of the track line.
The first required step is calculating the angle between the lines AB and BC, defined by their bearings, $b_{AB}$ and $b_{BC}$. The bearing of A from B ($b_{AB}$) is the angle measured in the clockwise direction from the north axis at point B, calculated as

$$b_{AB} = \operatorname{atan2}\!\big(\sin(\lambda_A - \lambda_B)\cos(\varphi_A),\ \cos(\varphi_B)\sin(\varphi_A) - \sin(\varphi_B)\cos(\varphi_A)\cos(\lambda_A - \lambda_B)\big),$$

where $\varphi_A$, $\varphi_B$ and $\lambda_A$, $\lambda_B$ are the latitudes and longitudes of points A and B, in radians, and $b$ denotes the bearing of that direction relative to North. The bearing of B from C ($b_{BC}$) is calculated using the same process.
After calculating the bearings $b_{AB}$ and $b_{BC}$, the difference between them, $\Delta b$, is calculated, which is the angle between the three GPS points (A, B, C).
To transform the GPS coordinate points into the LiDAR coordinate system, we also need to obtain the distance ($d$) between the train’s point ($O$) and the central search point ($C$) in GPS coordinates, using once again the Haversine formula (1). With this distance and the angle $\Delta b$, the bearing difference calculated before, we can finally obtain the $(x, y)$ point in the LiDAR coordinate system that corresponds to the central search point; see Equation (6).
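For illustration, the bearing computation and the conversion of the search point to planar coordinates could be sketched as follows; the forward-azimuth formula matches the bearing expression above, while the LiDAR axis convention assumed here (y pointing forward, x to the right) may differ from the exact convention behind Equation (6).

```python
# Sketch of the GPS-to-LiDAR alignment: bearing of one point as seen from
# another, and a polar-to-Cartesian conversion of the central search point.
# The axis convention (y forward, x right) is an assumption for illustration.
import math

def bearing_rad(lat_from, lon_from, lat_to, lon_to):
    """Bearing of the 'to' point as seen from the 'from' point, clockwise from North."""
    phi_f, phi_t = math.radians(lat_from), math.radians(lat_to)
    d_lambda = math.radians(lon_to - lon_from)
    x = math.sin(d_lambda) * math.cos(phi_t)
    y = math.cos(phi_f) * math.sin(phi_t) - \
        math.sin(phi_f) * math.cos(phi_t) * math.cos(d_lambda)
    return math.atan2(x, y)   # in (-pi, pi]; normalization omitted for brevity

def search_point_xy(b_ab, b_bc, distance_m):
    """(x, y) of the central search point in a train-aligned frame (y forward)."""
    delta = b_bc - b_ab       # bearing difference between the track line and the heading
    return (distance_m * math.sin(delta), distance_m * math.cos(delta))
```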
Because the LiDAR is not centered inside the locomotive, it is necessary to calibrate its positioning. After the LiDAR is installed, an acquisition is performed, and a point located on the tracks’ central axis is determined to be used as the search’s origin. Its coordinates and associated rotations are then saved in the configuration file. This calibration is then applied to correct the central search point’s coordinates, followed by determining its altitude, i.e., its $z$ coordinate.
With the central search point expressed in the LiDAR coordinate system, the next step is to determine whether this point is inside the field of view (FOV) of the LiDAR (Figure 4 and Figure 5). With this, only relevant data are processed, meaning data directly on the train path and inside the FOV, which enables a more efficient search on curved train paths by avoiding redundant operations. Consequently, this also speeds up data processing.
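A simple angular test of this kind is sketched below; the axis convention and the default FOV angles are placeholders, and the values from the sensor’s datasheet should be used in practice.

```python
# Sketch of the field-of-view test used to skip out-of-view search points.
# The axis convention (y forward, x right, z up) and the default angles are
# placeholders, not the actual sensor specification.
import math

def inside_fov(x, y, z, h_fov_deg=15.0, v_fov_deg=15.0):
    """Return True if the point lies within the LiDAR's angular field of view."""
    if y <= 0:                                # point behind the sensor
        return False
    azimuth = math.degrees(math.atan2(x, y))  # horizontal angle from the forward axis
    elevation = math.degrees(math.atan2(z, math.hypot(x, y)))
    return abs(azimuth) <= h_fov_deg / 2 and abs(elevation) <= v_fov_deg / 2
```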
If the point is inside the FOV, a configurable search area centered on it is defined and used to detect obstacles. This search area is then scanned for obstacles using the DBSCAN clustering algorithm, which detects groups of points according to their density based on the parameters in Table 1. Each cluster containing more than a configurable number of points is considered an obstacle.
Depending on the target distance, this threshold number of points may change, as it is expected that the closer the obstacle is to the LiDAR, the more points will be detected. The threshold may be user-adjusted but remains constant for a given analysis distance.
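As an illustration of such a distance-dependent threshold, the number of LiDAR returns on a fixed-size object falls off roughly with the square of the range, so a reference threshold can be rescaled accordingly; the scaling law and the reference values below are assumptions, not the calibrated settings used in the experiments.

```python
# Hypothetical scaling of the minimum-points threshold with target distance,
# assuming the point count on an object falls off roughly as 1/distance^2.
# Reference values are placeholders, not the project's calibrated settings.
def min_points_threshold(target_distance_m: float,
                         reference_distance_m: float = 50.0,
                         reference_points: int = 200) -> int:
    scale = (reference_distance_m / target_distance_m) ** 2
    return max(3, round(reference_points * scale))
```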
3. Validation
In order to test and validate the proposed methodology, the Livox Tele-15 [30], which can sense distances of up to 500 m, was selected. It was chosen for its compact size, long range, and high precision, allowing vehicles to detect and avoid obstacles well in advance when moving at high speed. It is also an automotive LiDAR, which should provide good resilience to difficult environmental conditions: although heavy rain/snow and dense fog might reduce the LiDAR response, normal rain/snow and fog should not affect it.
Additionally, the u-blox ZED-F9P-02B GPS module [31] was employed. This module supports multi-band GNSS technology, making it well suited for demanding industrial applications requiring accurate and reliable geospatial data.
The process aimed to elucidate the benefits and constraints of LiDAR technology, particularly in the context of integrating it into an obstacle detection system. This was achieved through an initial series of acquisitions designed to gather foundational data. Figure 6 presents the positions and directions of these initial acquisitions.
The first phase consisted of several static acquisitions in different directions and with varying ground inclinations; examples are depicted in Figure 7 and Figure 8. The varying orientations and ground elevations were important for assessing the performance and precision of the obstacle detection algorithm when integrating LiDAR and GPS data.
The second phase involved a controlled experiment conducted at various distances to capture detailed information about the human form within a point cloud. An example can be visualized in Figure 9, where a person stands 50 m and 200 m in front of the LiDAR. These data served as the basis for defining what constitutes an obstacle within a specific range, using the DBSCAN algorithm. The static nature of these acquisitions meant that both the LiDAR sensor and the subject remained stationary throughout the process, ensuring consistent conditions for accurate data collection and comparison.
The acquisitions were systematically carried out at four distinct distances: 50 m, 100 m, 150 m, and 200 m. These measurements allowed for a comprehensive understanding of the LiDAR’s performance across varying environmental conditions and distances, providing insights crucial for optimizing its use in real-world obstacle detection systems. Examples of these acquisitions can be visualized in Table 2. From this analysis, it is possible to conclude that the detection range and accuracy of this system go hand in hand: longer ranges require a smaller threshold for the minimum number of points that constitute an obstacle, which gives potential outliers a larger effect and contributes to an increase in the false positive obstacle detection rate.
Table 3 presents the detection of a person at different distances during the static acquisitions.
The next phase involved conducting static tests in a railway environment, as depicted in Figure 10. Acquisitions with a person in several locations were performed, namely in the middle of the track, near the track, and outside the track. The acquisitions were processed using the proposed solution, and the person was successfully detected when inside the rail track.
With the static tests successfully completed, the next phase involved conducting tests in a dynamic environment, specifically within a moving car (Figure 11). Figure 12 illustrates sample routes used to test the proposed solution, encompassing various scenarios within the LiDAR’s FOV coverage.
As an example, Figure 13 presents the successful detection of an obstacle on the road by the proposed solution; in this case, a person crossed the road in front of the car. The dynamic tests described in Figure 12 were performed with a car that maintained an average speed of 10 to 20 km/h while searching for obstacles 100 m ahead.
The Contumil-Leixões line (Figure 14) was used for the final tests. These dynamic tests aimed to validate the proposed solution on board a locomotive, allowing an assessment of its effectiveness and feasibility under real-world railway operating conditions. A Comboios de Portugal (CP, Portugal) 2600-2620 series locomotive was used along an assigned section of the line. The environmental conditions were normal, with overcast spring weather and no noticeable fog or rain.
A custom physical support was designed and built in Aluminium 3.3315 (EN-AW 5005) to be installed inside the train locomotive, with the goal of securing the LiDAR and GPS during the experiments.
During the dynamic validation in the railway environment, the locomotive reached a maximum speed of 80 km/h. As in the earlier tests, obstacle detection was performed 100 m ahead of the LiDAR, which was positioned at the front of the locomotive. The processing results obtained with the proposed methodology are exemplified in Figure 15. Two scenarios are presented: a search for obstacles inside the field of view (Figure 15a) and a search for obstacles outside the field of view (Figure 15b). In the first case, shown in Figure 15a, no obstacles were detected. In the second case, shown in Figure 15b, because the search area was outside the LiDAR’s field of view, the search was not executed, saving valuable processing time, a critical point for real-time system performance.
4. Discussion
In road vehicles, human detection of obstacles has been evaluated, and most individuals can detect obstacles at approximately 100 to 350 m, depending on the object’s size, shape, and color [33]. However, this is highly dependent on the particular individual, and other factors, such as vehicle velocity and environmental, physiological, and psychological conditions, also play major roles in the capabilities of human drivers.
The proposed LiDAR-based solution aims to provide a consistent, configurable detection distance of up to 500 m ahead of the train.
Considering, as an example, a velocity of 220 km/h, the maximum velocity of a CP 4000 series train, and a mean LiDAR processing cycle time (Figure 3) of 0.523036 s, if an obstacle is detected 500 m ahead, the collision velocity can be determined from the system of equations of uniformly decelerated motion,

$$\begin{cases} v = v_0 - a\,t \\ s = s_0 + v_0\,t - \dfrac{1}{2}\,a\,t^2, \end{cases} \tag{7}$$

where $v$ is the collision velocity in m/s; $v_0$ is the initial velocity in m/s; $a$ is the deceleration in m/s$^2$; $s$ is the collision position in meters; $s_0$ is the detection position in meters; and $t$ is the time in seconds.
The mean deceleration was defined as 0.95 m/s$^2$, as obtained from the CP 4000 series datasheet [34], while the LiDAR mean processing cycle time was accounted for in the system of Equation (7).
Alternatively, considering obstacle detection by a human at a conservative range of 150 m and applying the same equations, the resulting collision velocity would be higher.
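Under the kinematics of Equation (7) and the values given in the text (220 km/h initial speed, 0.95 m/s$^2$ mean deceleration, 0.523036 s processing cycle), the residual collision speed for both cases can be estimated with the sketch below; the assumption that the train travels at constant speed for one processing cycle before braking uniformly is ours, and the printed values are illustrative estimates rather than the paper’s reported figures.

```python
# Illustrative residual collision speed: the train is assumed to travel at
# constant speed during one processing cycle, then brake uniformly until
# impact. Inputs follow the text; outputs are estimates, not reported results.
import math

def collision_speed_kmh(detection_distance_m, v0_kmh=220.0,
                        deceleration=0.95, cycle_time_s=0.523036):
    v0 = v0_kmh / 3.6                                   # initial speed, m/s
    braking_distance = detection_distance_m - v0 * cycle_time_s
    v_sq = v0 ** 2 - 2 * deceleration * braking_distance
    return math.sqrt(max(v_sq, 0.0)) * 3.6              # collision speed, km/h

print(collision_speed_kmh(500.0))  # LiDAR-based detection at 500 m
print(collision_speed_kmh(150.0))  # human detection at a conservative 150 m
```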
Despite not avoiding the impact, the proposed solution can reduce the collision velocity for the selected use case. By decreasing the speed at which the collision occurs, the system mitigates the severity of the impact forces experienced during the crash, enhancing the safety of the vehicle’s occupants by lowering the risk of severe injuries and minimizing damage to the vehicle itself.
Additionally, the processing time also affects the periodicity of the analysis: at the same 220 km/h, it corresponds to one analysis every 32 m, which can be compensated for with the scan region size. However, considering that this processing time was obtained without any particular optimizations, using exclusively CPU-bound calculations on an Intel (United States of America) Core i7-9700TE at 1.8 GHz in a system with 16 GB of RAM, it would be possible to improve this time using parallel processing and GPU-bound calculations.
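For reference, the 32 m spacing between consecutive analyses follows directly from the stated speed and mean cycle time:

$$220\ \mathrm{km/h} \approx 61.1\ \mathrm{m/s}, \qquad 61.1\ \mathrm{m/s} \times 0.523036\ \mathrm{s} \approx 32\ \mathrm{m}.$$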
5. Conclusions
This paper presents a framework for obstacle detection in railway environments. The proposed methodology integrates a long-range LiDAR system with the train’s latitude, longitude, and altitude to determine whether there are any obstacles ahead on its path. This involves determining a search volume from prior knowledge of the train’s path and employing the DBSCAN clustering algorithm to identify obstacles within the defined area of interest.
The system was subjected to multiple tests, including trials on a moving locomotive along the Contumil-Leixões line, to evaluate the approach’s effectiveness under operational conditions.
Ultimately, it was shown that it is possible to augment obstacle detection in railway operations by combining LiDAR data with GNSS and advanced clustering algorithms. If such a system can help reduce the overall occurrence of collisions, the economic and societal benefits would be noticeable, as even a single disruption can affect the lives and livelihoods of many.
Future work will focus on extending the framework to more complex railway environments in order to assess its robustness under diverse operational conditions. Another avenue for improvement would be distinguishing among different obstacle types, making it possible to tell apart a moving obstacle that may not require braking from a stationary obstacle that will. Beyond this, a comparison with established approaches such as radar and deep learning would be interesting, although neither was available during this work.
Finally, integrating additional sensors into the proposed data-fusion method, such as cameras or thermal imaging, could further enhance system reliability and performance. These improvements will help refine the framework for broader real-world applications.