An Intelligent and Autonomous Sight Distance Evaluation Framework for Sustainable Transportation
Sustainability 2021, 13, 8885

Abstract: Railways are facing a serious problem of road vehicle–train collisions at unmanned railway level crossings. The purpose of this study is to develop a safe stopping sight distance model and a sight distance from road to rail track model, with appropriate computation and analysis, in order to help avoid such collisions. An intelligent and autonomous framework was developed using supervised machine learning regression algorithms. Further, a sight distance from road to rail track model was developed for road vehicles of 0.5 to 10 m length using the observed geometric characteristics of the route. For the stopping sight distance model, linear regression achieved better prediction accuracy than the other algorithms tested. The developed model suggested an increment of approximately 23% in the current safe stopping sight distance on all unmanned railway level crossings. Further, the feature analysis indicates the ‘approach road gradient’ to be the major contributing parameter for safe stopping sight distance determination. Finally, the accident prediction study indicates that increasing the safe stopping sight distance according to the developed model is predicted to decrease road vehicle–train collisions.


Introduction
Railways are the biggest platform for the transportation of goods and people to different places. With sensor and actuator applications under the European Telecommunications Standards Institute (ETSI) standards for the Intelligent Transportation System (ITS) hierarchical architecture [1,2], vehicle safety may become an easier task across the world. In the coming years, the railways may have the ability to predict maintenance and planning automatically [3]. A major component of the railway travel scenario is the approach to level crossings, where the railway line and the road intersect each other at the same level. According to [4], "The compensation given to the families of the persons who were killed at unmanned railway was Rs 10.88 lakh, Rs 2.22 lakh and Rs 17.41 lakhs in years 2012-13, 2011-12 and 2010-11, respectively". When a road user approaches an unmanned railway level crossing, the user must be able to recognize that there is an unmanned level crossing in the vicinity. At these crossings, there is no gatekeeper, which tends to increase the chances of accidents [5]. Road vehicle accidents occur mainly due to incorrect defensive driving action. The road driver must reduce the speed of the vehicle and be able to stop it within a particular threshold distance, known as the stopping sight distance [6]. Further, when no train is present within a particular distance, i.e., the sight distance from road to track, road users stopped near the crossing can easily traverse the unmanned railway level crossing. The Present Serviceability Rating (PSR), as per the American Association of State Highway and Transportation Officials (AASHTO), is a rating of pavement defects from visual inspection on a scale of 0-5, or Very Poor, Poor, Fair, Good, and Very Good. Road drivers approaching an unmanned railway level crossing face different situations, which lie in three different zones.
The first is the area in which a road driver approaching the unmanned railway level crossing observes that there is a crossing ahead. The road driver scans the crossing to determine whether or not there is a train ahead, and then decides whether to continue ahead or not. In the second situation, as the road vehicle approaches the unmanned railway level crossing, the driver must be able to recognize that there is a crossing in the vicinity. The road vehicle needs to be stopped at a particular safe distance, known as the stopping sight distance (SSD), to avoid hitting an unidentified road object [7]. The particular spot where the road driver decides to halt before the crossing is called the non-recovery zone. Finally, in the highest-risk zone, where the road driver is over the rail tracks, there is a risk of a train approaching the unmanned railway level crossing, i.e., uncertain risky crossings [8,9]. The road driver tries to cross this hazard zone very quickly to get into the right of way [10], avoiding road vehicle-train collisions at unmanned railway level crossings [11–14].
This study contributes a new road vehicle-train sight distance assessment framework for unmanned railway level crossings using machine learning algorithms. The communication has two major characteristics: low delay and high reliability [15]. Therefore, newly developed systems focus on better measures of efficiency [16]. The main contributions of this manuscript are summarized below:

• We collected data from unmanned railway level crossings, viz. unmanned level crossing location, pavement condition, traffic volume, road width, visibility, speed of road vehicles, perception and reaction time, approach road gradient, speed breaker distance from crossing, friction, number of lanes, environment, construction, width of rail gauge, length of the road vehicle, and accident statistics.

• We determined a safe stopping sight distance model for unmanned railway level crossings in the study area using supervised machine learning regression algorithms in the Python programming language.

• A sight distance from road to rail track equation was developed for unmanned railway level crossings, and prediction accuracy analysis of different machine learning algorithms was performed for the development of the SSD model.

• Analysis of relevant features for judging the major contributing parameter for safe stopping sight distance was carried out.

• We also performed a sensitivity analysis of the SSD model for parameters such as road width, visibility, speed of road vehicles, perception and reaction time, approach road gradient, and speed breaker distance from crossing, as well as an accident prediction analysis of the new SSD model.

The paper is organized into five sections. Section 2 reviews past studies related to this work. Section 3 describes the materials and methods and is organized into three subsections: data collection, modeling, and validation. Section 4 presents the results and discussion of the proposed work. The last section presents the conclusions and future work for the proposed study.

Related Work
This section discusses the existing studies of road vehicle-train sight distance assessment techniques for unmanned railway level crossings developed around the world. The study conducted by Schoppert and Hoyt [17] developed an associative connection between lateral sight distance and collision history using the risk homeostasis technique. The disadvantage of the relationship developed is that it does not correctly explain the logically agreed situation. Wigglesworth [18] studied unprotected railway level crossings. The approaching road had a lesser stopping sight distance and road visibility. The road vehicle speeds were observed to be the same as those of the trains on days with or without trains. Indian Roads Congress (IRC) manuals, i.e., IRC-73-1980 [19] and IRC-39-1986 [20], suggested the statistical computation of SSD for road vehicles approaching unmanned railway level crossings. The SSD at the unmanned railway level crossings was calculated using the speed of approaching road vehicles, the coefficient of longitudinal friction, and the perception and brake reaction time, as given in Equation (1),
where V = speed of approaching road vehicles (km/h), f = coefficient of longitudinal friction, and t = perception and brake reaction time (s).
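Equation (1) itself is not reproduced in this text. As a minimal sketch, the snippet below assumes the form commonly given in highway design references, SSD = 0.278 V t + V² / (254 f), where 0.278 converts km/h to m/s and 254 is approximately 2 g × 3.6². Whether this matches the paper's Equation (1) term for term is an assumption, and the function name and input values are illustrative only.

```python
def stopping_sight_distance(speed_kmh: float, reaction_time_s: float, friction: float) -> float:
    """Stopping sight distance in metres: lag distance plus braking distance.

    Assumed form (not taken from the paper): 0.278*V*t + V^2 / (254*f).
    """
    if friction <= 0:
        raise ValueError("friction must be positive")
    # Distance travelled during the perception and brake reaction time
    lag_distance = 0.278 * speed_kmh * reaction_time_s
    # Distance needed to brake to rest on a level road
    braking_distance = speed_kmh ** 2 / (254.0 * friction)
    return lag_distance + braking_distance


# Illustrative inputs only: V = 29 km/h, t = 2.5 s, f = 0.4
print(round(stopping_sight_distance(29.0, 2.5, 0.4), 2))
```

The two terms make the roles of the inputs visible: t only affects the lag distance, f only affects the braking distance, and V enters both.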
King [21] studied and compared road driver behavior with restricted and unrestricted lateral visibility at dissimilar railway level crossings. The study found greater safety with an unrestricted SSD for road drivers. Ward and Wilde [22] studied the effect of increasing lateral sight distance requirements at unprotected railway level crossings. The results indicate that increasing the lateral sight distance helped in reducing the risk of accidents. The disadvantage of the method was the increase in search time for the crossing. Johtaja [23] developed a sight distance from the road to track requirement model for American railway level crossings. The sight distance from the road to track was defined as the direct association between the centerline distances of the farthest tracks at level crossings. Gou et al. [24] suggested the required sight distance from the road to track for Canadian railway level crossings. The sight distance from the road to track was defined in terms of the approaching train speed and the time needed to cross the railway level crossing. Nizam and MacDonald [25] discussed the Washington State Department of Transportation (WSDOT) design manual with regard to the types of sight distances, viz. ahead to the crossing, down the tracks approaching the crossing, the sight triangle, and down the tracks at the crossing. The study defined the sight triangle for the road driver to stop 15" before the railway level crossings as the train approaches the crossings, and described the SSD for the road vehicles by providing crosses called "St. Andrew's Crosses". The crosses were provided at sufficient stopping sight distance, which enabled the road vehicle users to see the approaching train easily. Mornell [26] defined the required sight distance from the road to the track for Swedish railway level crossings. The proposed sight distance from track to road was observed to be three times the approaching train speed.
The New Zealand Transport Agency [27] proposed the required stopping sight distance from the road to track for New Zealand railway level crossings. The sight distance was based on the geometric design of the New Zealand railway level crossings. A study conducted by Kallberg [28] determined the sight distance requirements for Finnish railway level crossings [29]. The study proposed a safe stopping sight distance as the minimum distance at which a road vehicle user approaching the level crossing must be able to view the rail track at a height of 1.1 m (approx.). The results reveal that the required detection distance at railway level crossings was directly related to the road vehicle speed approaching the unmanned railway level crossing. The study was carried out only for non-skewed crossings. Sever [30] proposed an approach for determining the road visibility length on passively protected level railroad crossings. The study focused on roads with worse conditions and weather conditions near the crossings. The study also suggested that the visibility length at lower speeds of the road vehicles had a comfortable reaction time and shorter visibility. Kumar and Panday [31] studied the road driver sight distance requirements and problems in the detection of approaching trains at railway level crossings. Further, the study investigated the hindrances in the vision of road drivers, i.e., sight distance led to obstructions in visualizing the approaching train at railway level crossings.
Moayedfar [32] studied and calculated the sight triangle dimensions in an unobstructed area of Iranian railway level crossings. Three zones were considered in the study: approach zone, non-recovery zone, and hazard zone. The study suggests that the visibility of the road driver should be clear enough, i.e., it should be free from obstructions such as trees, buildings, fixed or mobile equipment, etc. The various factors used in the sight distance requirements for safety at the railway level crossings involved in the study were length of road vehicle (considered equal to 2 m), stop line distance to side rail (considered equal to 4.5 m), train speed in km/h, road vehicle speed in m/s, the maximum speed of road vehicle in km/h (considered equal to 2.7 m/s), the absolute value of the slope based on the percentage, the road friction coefficient of wet pavement, acceleration of road vehicle speed (considered equal to 0.45 m/s²), the sum of observation and reaction time (considered equal to 2 s), the distance at which the road vehicle has maximum speed (considered equal to 8.1 m), and the distance between the road driver's eyes and front bumper of road vehicle (considered equal to 2.4 m). The study finally suggested that, with a train speed of 120 km/h and a 784 m distance from the crossing, the train driver needs to decrease the train speed to 95 km/h. They determined the stopping sight distance requirements for Iranian railroad crossings by calculating the vehicle stop sight distance. When a road driver entered at a distance of 250 m from an unmanned railway level crossing, the train driver was recommended to lower the train speed to 95 km/h (approx.). The study suggested that both railway track and road should not have horizontal and vertical curves, in order to provide good stopping sight distance. Further, there should be no obstacle between the railway track and road, so that there is no hindrance to the visibility of the road vehicle and train drivers.
The proper sight distance helped in avoiding collisions between the road vehicle and train at railway level crossings. Rudin-Brown [33] extended the road approach zone distance to improve the stopping sight distance requirements at Canadian unmanned railway level crossings. The approach distance on unmanned railway level crossings, having cross bucks as the only protection system, was extended to allow the road vehicle driver easy scanning for the train approaching the railway level crossing. The human problems include road driver obstruction in viewing the train, road driver retinal image remaining unchanged, train horn being less audible, road driver tendency to react slowly, late detection of the hazard, distraction due to fatigue, and hazard perception effect in the train detection. Therefore, the road driver approach zone was extended to at least 10 m and hence, the train was easily detected. Thus, the road driver could stop at a comfortable sight distance on unmanned railway level crossings.

Materials and Methods
When a road user approaches an unmanned railway level crossing, the user must be able to recognize that there is a crossing in the vicinity. The road vehicle needs to be stopped at a particular safe distance from the crossing [5], known as the stopping sight distance. The methodology consists of data collection, modeling, and validation, as shown in Figure 1.

Data Collection
To achieve the objectives proposed in this study, it was first necessary to select a sufficient number of unmanned railway level crossings. Therefore, 19 unmanned railway level crossings on the Shahdra-Shamli-Tapri (DSA-SMQL-TPZ) railway route were selected for the study. The criterion for selecting unmanned railway level crossings was the total number of accidents from the year 2008 to 2013. The railway route has a chainage of 0 to 150 km, 15 block sections, and 68 unmanned railway level crossings. The latitude and longitude of each location were captured by GPS [34], traffic flow characteristics such as traffic volume [35] were captured by a video camera, and the speed of road vehicles was captured with a speed radar gun. Other manually collected parameters were approach road gradient, perception and reaction time,

Data Preprocessing
The objective of preprocessing is to create an input feature matrix and an observed responses vector. Feature rescaling was performed using normalization and standardization techniques. Normalization rescaled the real-valued data attributes to the range of 0 to 1. Further, data attributes were standardized by shifting the distribution of each attribute to zero mean and unit variance. The selection of input features is a major step in building a good model. Therefore, 9 input features were selected, as listed in Table 1. Further, to evaluate the strength of the relationship between two features, correlation analysis was performed. Strongly positively correlated attributes have a correlation value close to +1, strongly negatively correlated attributes have a value close to −1, and uncorrelated attributes have a value close to 0; it is evident from Table 1 that no pair of data parameters has a correlation very close to 1. Therefore, all attributes are important and need to be considered for building the SSD model.
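The rescaling and correlation steps described above can be sketched as follows. This is an illustrative, dependency-free sketch rather than the code used in the study; the function names and the sample values are hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # a constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical values, not data from the study
road_width_m = [2.3, 3.0, 3.5, 4.2, 5.0]
speed_kmh = [8.2, 15.0, 12.5, 22.0, 28.6]

print(min_max_normalize(road_width_m))
print(standardize(speed_kmh))
print(round(pearson_r(road_width_m, speed_kmh), 3))
```

A coefficient near +1 or −1 between two input features would suggest that one of them is largely redundant, while values near 0 indicate that the features carry independent information.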

Data Splitting
The data were split for training and testing purposes. In this study, approximately 75% of the data were used for training, and the remaining 25% (approx.) were used for validation of the model. The model accuracy may change depending on how the data are split.
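A minimal sketch of such a random 75/25 split is given below, using only the Python standard library. The helper name and the fixed seed are assumptions made for reproducibility of the illustration; the total of 2900 samples is the figure reported later in the paper.

```python
import random


def train_test_split(rows, train_fraction=0.75, seed=0):
    """Shuffle rows reproducibly and split them into (train, test) lists."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    shuffled = list(rows)                      # copy, so the input is not modified
    random.Random(seed).shuffle(shuffled)      # seeded shuffle for repeatability
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


samples = list(range(2900))  # stand-in row identifiers, not real observations
train, test = train_test_split(samples)
print(len(train), len(test))
```

Changing the seed changes which rows land in each subset, which is one reason the reported accuracy can vary from one split to another.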

Safe Stopping Sight Distance Model
The study route was observed to have both rural and undivided roads. At all 19 selected unmanned railway level crossings, the speed of road vehicles was observed to be less than approximately 29 km/h. Therefore, following IRC-73-1980, the friction coefficient was set to 0.4. The statistical computation was performed using linear regression analysis [36], with SSD as the dependent variable calculated using Equation (1). The prediction modeling was performed using linear regression analysis with the features traffic data, pavement condition, road width in meters, road vehicle speed in m/sec, reaction time in seconds, approach road gradient (%), number of lanes, friction, environment, and construction. The model was therefore developed using twelve independent variables, viz. pavement condition rating, average daily traffic in PCU/day, road width in meters, road visibility in meters, speed of road vehicles in km/h, t in seconds, approach road gradient (%), speed breaker distance from crossing in meters, friction, number of lanes, construction, and environment, with 'SSD' as the dependent variable, using a regression modeling algorithm in Python, as shown in Table 2. A t-test was conducted for all twelve parameters against 'SSD' at the unmanned railway crossings, checking for a p-value < 0.05 and a standard error (S.E.) close to 0. The statistical analysis shows that six independent variables, viz. road width in meters, road visibility in meters, V in km/h, approach road gradient (%), t in seconds, and PSR, are significant. Therefore, a new model was developed for 'SSD' using the six significant parameters, as shown in Equation (2), where PSR = present serviceability rating (0 to 5), RV = road visibility (meters), V = speed of road vehicles (km/h), and ARG = approach road gradient (%). The new SSD model indicates that SSD values are incremented by 0.35% relative to the standard IRC SSD in Equation (1).
Table 3 shows the unmanned railway level crossing categorization 'i' = (1, 2, 3, 4) based on ARG and the average observed SSD values at 19 unmanned crossings on the Shahdra-Shamli-Tapri railway route. The minimum observed ARG was 0.0% and the maximum observed ARG was 3.33%. The minimum observed t_i was 1.3 s and the maximum observed t_i was 4.2 s. Table 3 also shows the SSD calculated as per IRC and the SSD calculated with the new model. S.E. must be close to zero and R² must be close to 1; for Equation (2), S.E. = 1.509 and R² = 0.990.
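The fitting code is not shown in the paper. As a rough, dependency-free sketch of how a multiple linear regression of this kind can be fitted, the snippet below solves the least-squares normal equations directly. The data are synthetic, the three features merely stand in for quantities such as V, t, and ARG, and the generating coefficients are invented for the illustration; the t-tests and p-values of the study are not reproduced.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular (collinear features?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(features, target):
    """Least-squares coefficients [intercept, b1, b2, ...] via the normal equations."""
    design = [[1.0] + list(row) for row in features]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(design, target)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def predict(coefficients, row):
    """Evaluate the fitted linear model for one feature row."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], row))


# Synthetic rows (speed in km/h, reaction time in s, gradient in %), not study data
rows = [(10, 1.5, 0.0), (15, 2.0, 1.0), (20, 2.5, 0.5),
        (25, 1.3, 3.0), (28, 4.2, 2.0), (12, 3.0, 3.3)]
# Invented generating model, used only to check that the fit recovers it
target = [2.0 + 0.5 * v + 3.0 * t + 1.5 * g for v, t, g in rows]

coefficients = fit_ols(rows, target)
print([round(c, 6) for c in coefficients])
```

Because the synthetic target is an exact linear function of the inputs, the recovered coefficients should match the generating ones up to rounding; with real observations, the residuals would feed the S.E. and R² figures discussed above.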

Sight Distance from Road to Track Modeling
To develop the sight distance from road to track model, the parameters viz. speed of the train in km/h, angle of crossing in degrees, track width in meters, length of the vehicle in meters, and reaction time in seconds, were selected. In accordance with the unmanned railway level crossing category 'i' of Table 3, S_L based on the geometric characteristics of the crossings is shown in Equation (3), where S_L(i) = required sight distance from road to track at unmanned level crossing category 'i' in meters, V_t = speed of train in km/h, Z = angle of crossing in degrees, and W_T = track width in meters. S_L(i), as calculated from Equation (3) and presented in Figure 2 with Z = 45° to 90° for vehicle lengths l = 0.5 to 10 m, is shown in Table 4 for unmanned level crossings with Z = 45° and 90°.

Model Validation Evaluation Metrics
The model validation encompasses different metrics for checking the performance of the newly developed SSD model, which are defined as follows:

• Coefficient of Determination (R²)
The coefficient of determination, R² [37], is a validation parameter for determining the validity of statistical regression models, as shown in Equation (4).

• p-value Statistics
The p-value [37] is the probability of obtaining a result at least as extreme as the one observed if the null hypothesis is true, and it helps in testing the significance of variables. At the 95% confidence level, if the p-value of the parameters and model is equal to or less than 0.05, the parameters and model are considered significant.

• Mean Absolute Error (MAE)
MAE [37] is an estimator of the average misprediction of the model. It is the mean of the absolute differences between the observed values y_i and the predicted values over the n samples.

• Mean Squared Error (MSE)
MSE [37] is used to measure the error associated with a validation or external dataset. It is the mean of the squared differences between the observed values y_i and the predicted values over the n samples.

• Root Mean Squared Error (RMSE)
RMSE [37] is the square root of the mean squared difference between the predicted values y_1 and the actual values y_0 over the n data points.

• Mean Absolute Percentage Error (MAPE)
MAPE is the average absolute difference between the predicted and actual values, expressed relative to the actual values.

S.E. must be close to zero and R² must be close to 1. In Equations (5) and (8), S.E. = 1.509 and R² = 0.990. Figure 3 shows the validation of the developed SSD relationship based on PCR, road width, V, t, ARG, and road visibility. The validation statistics of the new SSD model are given in Table 5. As R² is 0.964 (close to 1) and the MAE, MSE, RMSE, and MAPE of the SSD model have low values, the new model is found to be suitable for SSD determination.
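The error metrics listed above can be computed with a few lines of Python. The sketch below uses the standard textbook definitions, which are assumed here to correspond to the paper's equations; MAPE is reported as a percentage, and the observed and predicted values are invented for the illustration.

```python
from math import sqrt


def regression_metrics(observed, predicted):
    """Return MAE, MSE, RMSE, MAPE (in percent) and R^2 for paired values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("observed and predicted must be non-empty and equal in length")
    errors = [o - p for o, p in zip(observed, predicted)]
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    rmse = sqrt(mse)
    # MAPE is undefined when an observed value is zero
    mape = 100.0 * sum(abs(e / o) for e, o in zip(errors, observed)) / n
    mean_obs = sum(observed) / n
    ss_res = sum(e * e for e in errors)                  # residual sum of squares
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Invented SSD values in metres, not results from the study
observed_ssd = [20.0, 25.0, 30.0, 40.0]
predicted_ssd = [22.0, 24.0, 33.0, 38.0]

for name, value in regression_metrics(observed_ssd, predicted_ssd).items():
    print(f"{name}: {value:.4f}")
```

Lower MAE, MSE, RMSE, and MAPE values and an R² closer to 1 all point to a better fit, which is the reading applied to Table 5.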

Model Prediction Accuracy
As explained in the data splitting section, the algorithm was run on four learners, i.e., Gradient Boost Regression, Random Forest Regression, Support Vector Regression, and Linear Regression, by randomly splitting off 75% of the dataset for the development of the model and using the remaining 25% to assess its accuracy. In the study, 2175 samples out of a total of 2900 were used for training. The accuracy was evaluated through its mean and standard deviation. Figure 4 shows that linear regression had better accuracy than the other three algorithms.
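How the mean and standard deviation of the accuracy were obtained is not detailed in the text. One plausible procedure, sketched below under that assumption, is to repeat the random 75/25 split several times and summarize the test-set R² scores. For brevity the sketch uses a one-feature least-squares line rather than the four learners of the study, and the data are synthetic.

```python
import random
from statistics import mean, stdev


def fit_line(xs, ys):
    """Closed-form least squares for y = a + b * x; returns (a, b)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope


def r_squared(ys, preds):
    """Coefficient of determination of predictions against observed values."""
    my = mean(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, preds))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return 1.0 - ss_res / ss_tot


def repeated_holdout_scores(xs, ys, repeats=10, train_fraction=0.75, seed=0):
    """Test-set R^2 for several random train/test splits of the same data."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    cut = int(round(len(indices) * train_fraction))
    scores = []
    for _ in range(repeats):
        rng.shuffle(indices)
        train, test = indices[:cut], indices[cut:]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        preds = [a + b * xs[i] for i in test]
        scores.append(r_squared([ys[i] for i in test], preds))
    return scores


# Synthetic data: a noisy linear trend, not observations from the study
data_rng = random.Random(42)
speeds = [data_rng.uniform(8.0, 29.0) for _ in range(200)]
ssd = [5.0 + 0.9 * v + data_rng.gauss(0.0, 1.0) for v in speeds]

scores = repeated_holdout_scores(speeds, ssd)
print(f"R^2 mean = {mean(scores):.3f}, std = {stdev(scores):.3f}")
```

A learner with a high mean and a small standard deviation across splits is both accurate and stable, which is the kind of comparison summarized in Figure 4.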

Feature Importance
The feature importance technique is used to evaluate the selected features to predict the output. The features reaching the highest score contributed the most to developing the SSD model. It is evident from Table 6 that approach road gradient seemed to be the most important parameter for determining the SSD. Among all learning algorithms, linear regression was found to be the most significant for determining the feature importance, as it gave accurate results for the tested datasets.

Sensitivity Analysis
The sensitivity level of a particular parameter is obtained by keeping the other parameters constant. In the sensitivity analysis (Figure 5), the parameters PSR, road width (meters), road visibility (meters), V (km/h), t (seconds), and ARG (%) were varied over input ranges of approximately 0 to 5, 2.3-5.0 m, 800 to 1000 m, 8.23-28.65 km/h, 1.3-4.2 s, and 0.00%-3.33%, respectively. If 'PSR' rises by up to 41.58% (approx.), SSD is predicted to increase by up to 41.61% (approx.). If 'road width' increases by 31.02% (approx.), SSD is predicted to rise by 31.18% (approx.). With 'road visibility' increasing by up to 15.07% (approx.), SSD is predicted to increase by up to 15.11% (approx.). If 'V' increases by 39.29% (approx.), SSD is predicted to increase by 39.3% (approx.). With an increase in 't' of 46.62%, SSD is predicted to increase by 46.64% (approx.). Finally, if 'ARG' increases by up to 44.32%, SSD is predicted to increase by up to 93.91%.
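The one-parameter-at-a-time procedure described above can be sketched as follows. Since Equation (2) is not reproduced in this text, the sketch uses an assumed stand-in model of the form SSD = 0.278 V t + V² / (254 f); the baseline values and function names are hypothetical, so the printed percentages are not comparable with those reported in the study.

```python
def ssd_model(params):
    """Stand-in model (an assumption): SSD = 0.278*V*t + V^2 / (254*f)."""
    v, t, f = params["V"], params["t"], params["f"]
    return 0.278 * v * t + v ** 2 / (254.0 * f)


def one_at_a_time_sensitivity(model, baseline, name, percent_change):
    """Percent change in model output when one input changes and the rest stay fixed."""
    perturbed = dict(baseline)  # copy, so the baseline itself is left unchanged
    perturbed[name] = baseline[name] * (1.0 + percent_change / 100.0)
    base_out = model(baseline)
    return 100.0 * (model(perturbed) - base_out) / base_out


baseline = {"V": 20.0, "t": 2.5, "f": 0.4}  # hypothetical baseline inputs
for name in ("V", "t"):
    change = one_at_a_time_sensitivity(ssd_model, baseline, name, 10.0)
    print(f"+10% in {name} -> {change:+.2f}% in SSD")
```

Passing any other callable as `model` applies the same procedure to a different fitted relationship, so the helper is independent of the particular SSD equation.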

SSD Analysis
The developed model predicts a significant rise in SSD at almost every crossing. The new SSD model suggested a rise of 23% (approx.) compared with the SSD calculated from the IRC-39-1986 (1990) equation, as shown in Figure 6.

Road Vehicle-Train Collision Prediction Analysis
The new SSD model suggested a rise in SSD in comparison to the IRC equation. Figure 7 indicates that the rise in SSD tends to decrease the road vehicle-train collisions at unmanned railway level crossings. At almost half of the crossings, the road vehicle-train collisions tended to decrease in the range of 2 to 7% (approx.).

Conclusions and Future Work
In this study, a safe stopping sight distance model for road vehicles approaching unmanned railway level crossings was developed using linear regression analysis, which was found to be the best algorithm for SSD model prediction in comparison to the Gradient Boosting, Random Forest, and Support Vector algorithms. Thereafter, a sight distance from road to rail track model was developed for road vehicles of length 0.5 to 10 m using the observed geometric characteristics of unmanned railway level crossings on the predefined route. The parameters friction, environment, number of lanes, and construction did not affect the safe stopping sight distance. The approach road gradient was found to be the major contributing factor for stopping sight distance determination. The new safe stopping sight distance model suggested a rise of 23% in SSD in comparison to the IRC-39-1986 SSD equation. The accident prediction study suggested that increasing the SSD to the value calculated from the new model is predicted to decrease road vehicle-train collisions. The sight distance evaluation of unmanned railway level crossings using machine learning algorithms is expected to help in developing a warning system for road users. Therefore, in the future, the work can be extended to other artificial-intelligence-enabled techniques, which may help in alerting road users in advance to avoid accidents with trains at unmanned railway level crossings when complete sight distance information is already known.