A Systematic Review of Artificial Intelligence Public Datasets for Railway Applications
Abstract
1. Introduction
2. Methodology
2.1. Search Criteria and Dataset Selection
2.2. Subdomains in the Railway Sector
2.3. Classification and Types of Data
3. Overview of Hypothetical Data
4. Review of Selected Datasets
4.1. Traffic Planning and Management
4.1.1. Passenger Transport
4.1.2. Rolling Stock and Freight Transport
4.1.3. Passenger Experience
4.2. Maintenance and Inspection
4.2.1. Rolling Stock
4.2.2. Railway Tracks
4.2.3. Railway Ballast
4.2.4. Catenary System and Electrical Equipment
4.2.5. Communication System
4.2.6. Construction Works
4.3. Safety and Security
4.3.1. Situational Awareness
4.3.2. Surveillance
4.3.3. Accident Prevention and Risk Assessment
4.4. Passenger Mobility
Passenger Flow Estimation and Trends
5. Supporting Datasets and Public APIs
6. Discussion
7. Challenges and Opportunities
8. Conclusions
Author Contributions
Funding
Conflicts of Interest
Appendix A. List of Datasets
Ref. | Name/Title | Description | Data Type(s) | Railway Application | Last Updated |
---|---|---|---|---|---|
[79] | Cityscapes | Video sequences of city scenes recorded at different times and in different weather conditions, with high-quality pixel-level annotations, including a class specific to railway vehicles | Image data and label data | Situational intelligence, safety and security | 2020 |
[50] | Image data of spark erosion-ProRail | Images of insulation joints from the Netherlands railway network | Image data | Asset detection, predictive maintenance, fault detection and system monitoring | 2020-11 |
[55] | Evaluating railway track support stiffness from trackside measurements in the absence of wheel load data | Deflection data, trackside measurements, geometrical data and speed data | Numerical data | Predictive maintenance and system monitoring | 2020-10 |
[95] | BART Ridership | Bay Area Rapid Transit (BART) hourly ridership broken up by year (starting 2011) | Numerical data | Passenger flow, traffic planning and infrastructure capacity | 2020-10 |
[31] | Trains Held Short | Weekly average number of trains held short per day | Numerical data and label data | Traffic planning and traffic management | 2020-10 |
[70] | Dataset of measured and commented pantograph electric arcs in DC railways | Digitized sampled data using a data acquisition system located onboard and connected to voltage and current sensors | Numerical data | Maintenance | 2020-08 |
[99] | Predict Train Occupancy Time Series | Monthly data on train occupancy from 1999 to 2011 | Numerical data | Passenger flow, traffic planning and infrastructure capacity | 2020-08 |
[46] | Finding railway fasteners in image data-ProRail | Images of fasteners from the Netherlands railway network | Image data | Asset detection, predictive maintenance, fault detection and system monitoring | 2020-07 |
[89] | Experiments on granular flow behavior and deposit characteristics: implications for rock avalanche kinematics | Experimental data on rock avalanche dynamics | Numerical data | Safety and accident prevention | 2020-07 |
[72] | Data sets of measured pantograph voltage and current of European AC railways | Digitized, sampled data using various data recorders located onboard and connected to voltage and current sensors | Numerical data | Maintenance and system monitoring | 2020-06 |
[44] | Bearing Database | Induced failure test data on rolling elements of a spherical roller bearing | Numerical data | Fault detection and predictive maintenance | 2020-06 |
[66] | Cyclic friction tests of ballast stones interfaces under varying vertical load | 3D scanner data of angular-tip stones before and after the friction tests | Other data | Maintenance | 2020-06 |
[27] | Four small cases for the fairness problem of train timetabling | Data on four small case scenarios for the timetable scheduling fairness problem | Other data | Timetable scheduling | 2020-06 |
[20] | NJ Transit + Amtrak (NEC) Rail Performance | Monthly train trip performance on the NJ Transit rail network | Numerical data and label data | Timetable scheduling, rescheduling and customer satisfaction | 2020-05 |
[76] | Monitoring data for railway bridge KW51 in Leuven, Belgium, before, during, and after retrofitting | Measurements of acceleration on the bridge structures, strain on the rails, displacement at the bearings and more | Numerical data | Maintenance | 2020-04 |
[85] | 3D Point Cloud of a railway slope-MOMIT (Multi-scale Observation and Monitoring of railway Infrastructure Threats) | 3D point cloud of a railway trench in Lavancia-Épercy in France | Other data | Maintenance and safety | 2020-04 |
[17] | Train delays in Italy Bologna-Milan | Train delays for the Bologna–Milan railway line | Numerical data and label data | Timetable scheduling and rescheduling | 2020-03 |
[57] | An Analysis of Railway Track Behaviour based on Distributed Optical Fiber Acoustic Sensing | Distributed acoustic sensing (DAS) data, plus statistical data from study | Numerical data | Predictive maintenance and system monitoring | 2020-02 |
[62] | 3D scans of two types of railway ballast including shape analysis information | 3D scanner data of two types of railway ballast, calcite and Kieselkalk | Other data | Maintenance | 2020-02 |
[75] | Performance of Congestion Control Algorithms on High-speed railway scenario | Performance of Hd-TCP and its comparison to simulated data on high-speed railways | Numerical data | Communication | 2020-02 |
[60] | The DR-Train dataset: dynamic responses, GPS positions and environmental conditions of two light rail vehicles in Pittsburgh | Acceleration data, GPS positions, environmental conditions and track maintenance schedules for a light rail network | Numerical data and other data | Predictive maintenance, fault detection and system monitoring | 2020-01 |
[19] | Real-world case based on Batong line in Beijing railway network | Real data for train skip-stopping pattern optimization | Numerical data | Traffic management | 2020-01 |
[81] | RailSem19 | Set of 8500 images from a rail vehicle perspective | Image data, numerical data and label data | System monitoring and situational intelligence | 2019 |
[92] | Ungulate-train collision database | Over 3500 animal–train collisions and over 10,000 locations provided by Polish State Railways (PKP) | Numerical data and label data | Safety and risk assessment | 2019-11 |
[100] | Train Crowd Density | Records of crowd density on several trains over a span of months | Numerical data | Passenger flow and infrastructure capacity | 2019-11 |
[16] | SEPTA-Regional Rail | Performance data on regional trains from Southeastern Pennsylvania Transportation Authority | Numerical data and label data | Timetable scheduling and rescheduling | 2019-11 |
[49] | Image data of insulation joints-ProRail | Images of insulation joints from the Netherlands railway network | Image data | Predictive maintenance, fault detection and system monitoring | 2019-10 |
[73] | 2 × 25 kV Railway Feeding System Simulation Database | Electricity data measurements from simulations | Numerical data | Maintenance | 2019-08 |
[28] | OpenTrack simulation model files and output dataset for a Copenhagen suburban railway | Simulation data for the evaluation of performance and delays of a suburban railway line | Numerical data | Timetable scheduling and traffic management | 2019-08 |
[38] | Effect of train speed and track geometry on the ride comfort of high-speed railways based on ISO 2631-1 | Effects of train speed and track geometry on ride comfort | Numerical data | Predictive maintenance and fault detection | 2019-07 |
[98] | Indian Metro Data | Prediction of future traffic | Numerical data and label data | Passenger flow and traffic management | 2019-07 |
[97] | DBAHN Travel Captures | Data captured from trains and travels in different stations in Germany | Numerical data | Timetable scheduling and passenger flow | 2019-06 |
[24] | IRCTC–Train Info | Data on details of trains of Indian Railways, including timetables and destinations | Numerical data and label data | Timetable scheduling | 2019-06 |
[32] | Integrated train blocking and shipment path optimization (TBSP) | Operational data for resolving the TBSP problem | Numerical data and other data | Route selection and fleet management | 2019-05 |
[23] | Railway Timetable | Timetable data and train details from Indian Railways | Numerical data and label data | Timetable scheduling | 2019-05 |
[87] | Monitoring and early warning method for a rock fall along railways based on vibration signal characteristics | Train vibration signal and rockfall vibration signals captured by sensors | Numerical data | Predictive maintenance, system monitoring and safety | 2019-04 |
[90] | Analytic Geomagnetic and Geoelectric Fields | Geomagnetic and geoelectric field values generated by analytic calculations | Numerical data | Situational intelligence, safety, risk assessment and predictive maintenance | 2019-01 |
[68] | Data on wind-induced responses of the hanging point for a high-speed railway in China | Measurements from micro-acceleration sensor and a laser displacement meter on the catenary systems of HSR | Label data and numerical data | Maintenance | 2018-12 |
[94] | Sleep Patterns of Railroad Dispatchers | Data on the work schedules and sleep patterns of railroad employees | Numerical data and label data | Security, safety and risk assessment | 2018-11 |
[77] | Towards the use of UHPFRC in railway bridges: the rehabilitation of Buna Bridge | Acceleration data from a roving test carried out during the rehabilitation of the bridge | Numerical data | Maintenance | 2018-10 |
[65] | Compression tests and direct shear test of two types of railway ballast | Measurement data from uniaxial compression tests and direct shear tests conducted on railway ballast | Numerical data | Maintenance | 2018-09 |
[45] | Condition of pantograph slide plates | Images of pantograph slide plates from various rolling stock vehicles | Image data | Predictive maintenance, fault detection and system monitoring | 2018-07 |
[22] | Data analysis and visualization of Indian Railways | Trip information on Indian Railways | Numerical data and label data | Timetable scheduling and traffic planning | 2018-07 |
[91] | Accidents in France from 2005 to 2016 | Detailed data on traffic accidents in France from 2005 to 2016 | Numerical data and label data | Safety and security | 2018-06 |
[21] | Commuter train timetable | Commuter train service in Stockholm during 2012, including timetables and passenger flow | Numerical data and label data | Timetable scheduling and passenger flow | 2018-05 |
[25] | Predicting Near Term Train Schedule Performance and Delay | Data on train deviations from planned schedules for resolving the re-scheduling problem | Numerical data, label data and other data | Timetable scheduling and train re-scheduling | 2018-02 |
[51] | Automated processing of railway track deflection signals obtained from velocity and acceleration measurements | Modeled and measured data for train passages to classify the range of total and downward deflection from train pass-by records | Numerical data | Predictive maintenance, fault detection and system monitoring | 2018-01 |
[42] | Prediction of rail and bridge noise from concrete railway viaducts using a multi-layer rail fastener model and a wavenumber domain method | Noise data in Hz and dB/m | Numerical data | Predictive maintenance and system monitoring | 2017 |
[53] | Investigating the Influence of Auxiliary Rails on Dynamic Behavior of Railway Transition Zone by a 3D Train-Track Interaction Model | Results from different sensitivity analyses (vehicle speed, vehicle load, number of auxiliary rails and railpad stiffness) performed with 3D models | Numerical data | Fault detection and predictive maintenance | 2017-12 |
[47] | Fatigue Assessment Method for pre-stressed concrete sleeper | Calculation of remaining fatigue life of concrete sleeper | Numerical data | Predictive maintenance | 2017-11 |
[83] | PETS 2017 | Data from on-board surveillance systems for protection of critical assets | Image data | Safety, security and system monitoring | 2017-07 |
[40] | Influence of rail fastener stiffness on railway vehicle interior noise | Interior noise (frequency in Hz, level in dB) for different fasteners at different train speeds, plus exterior noise and vibration spectra of train parts | Numerical data | Predictive maintenance and system monitoring | 2017-05 |
[30] | Experimental dataset for optimizing the freight rail operations | Operational data for the development of mathematical models | Label data and numerical data | Logistics and optimization | 2016-12 |
[96] | SBB CFF FFS-Passenger Frequency | Passenger frequency data from Swiss Federal Railways during 2014 | Numerical data | Passenger flow | 2016-08 |
[18] | Trains Express Régionaux: Points d’arrêts et horaires des lignes | Timetables of TER trains in France with stops and timetables | Numerical data and label data | Timetable scheduling | 2016-08 |
[26] | Routing Trains through a Railway Network: Joint optimization on train timetabling and maintenance task scheduling | Operational data for resolving the schedule optimization and maintenance task scheduling problem | Other data | Timetable scheduling and maintenance scheduling | 2016-07 |
[59] | Track geometry analytics | Historical detection readings for three types of track defects: surface, cross level and dip | Numerical data and other data | System monitoring, fault detection and predictive maintenance | 2015 |
[33] | Railroad Hump Yard Block-to-Track Assignment | Operational data for resolving the hump yard classification problem | Other data | Fleet management | 2014 |
[34] | Modeling Railroad Yard Capacity | Supporting files for resolving the hump yard capacity modeling problem | Other data | Fleet management | 2013 |
[35] | Movement Planner Algorithm Design for Dispatching on Multi-Track Territories | Supporting files for resolving the multi-track territories dispatching problem | Other data | Fleet management | 2012 |
[36] | Train Design Optimization Problem | Supporting files for resolving the block-to-train assignment problem | Other data | Fleet management and train routing | 2011 |
[37] | Locomotive Refueling Problem | Operational data for resolving the locomotive refueling problem | Other data | Asset management | 2010 |
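
Because the table above is pipe-delimited, its rows can be loaded into simple records and filtered by column. The following is a minimal sketch in Python, assuming the rows are copied verbatim as text and that no cell contains a pipe character. The names `COLUMNS`, `parse_rows` and `filter_by_data_type` are hypothetical helpers introduced here for illustration, and the two sample rows are taken from the table.

```python
from typing import Dict, List

# Hypothetical column keys, in the same order as the table header.
COLUMNS = ["ref", "name", "description", "data_types", "application", "last_updated"]


def parse_rows(text: str) -> List[Dict[str, str]]:
    """Parse pipe-delimited table rows into dictionaries keyed by COLUMNS."""
    records = []
    for line in text.strip().splitlines():
        cells = [cell.strip() for cell in line.split("|")]
        # Each row ends with a trailing pipe, which yields one empty final cell.
        if cells and cells[-1] == "":
            cells = cells[:-1]
        # Skip the header row, the separator row and any malformed row.
        if len(cells) != len(COLUMNS) or not cells[0].startswith("["):
            continue
        records.append(dict(zip(COLUMNS, cells)))
    return records


def filter_by_data_type(records: List[Dict[str, str]], keyword: str) -> List[Dict[str, str]]:
    """Keep records whose data-type cell mentions the keyword (case-insensitive)."""
    keyword = keyword.lower()
    return [r for r in records if keyword in r["data_types"].lower()]


SAMPLE = """
Ref. | Name/Title | Description | Data Type(s) | Railway Application | Last Updated |
---|---|---|---|---|---|
[45] | Condition of pantograph slide plates | Images of pantograph slide plates from various rolling stock vehicles | Image data | Predictive maintenance, fault detection and system monitoring | 2018-07 |
[96] | SBB CFF FFS-Passenger Frequency | Passenger frequency data from Swiss Federal Railways during 2014 | Numerical data | Passenger flow | 2016-08 |
"""

records = parse_rows(SAMPLE)
image_sets = filter_by_data_type(records, "image")
print([r["name"] for r in image_sets])
```

Note that the "Last Updated" column mixes year-only and year-month values, so the sketch keeps it as a string rather than parsing it as a date.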
References
- Schwab, K. The Fourth Industrial Revolution: What It Means and How to Respond. Foreign Affairs. December 2015. Available online: https://www.foreignaffairs.com/articles/2015-12-12/fourth-industrial-revolution (accessed on 3 February 2020).
- David, B. The Future of Intelligence is Artificial. International Railway Journal (IRJ). Available online: https://www.railjournal.com/in_depth/future-intelligence-artificial (accessed on 3 February 2020).
- European Parliamentary Research Service (EPRS), European Parliament. Artificial Intelligence in Transport. Current and Future Developments, Opportunities and Challenges. April 2019. Available online: https://www.europarl.europa.eu/RegData/etudes/BRIE/2019/635609/EPRS_BRI(2019)635609_EN.pdf (accessed on 3 February 2020).
- Innovation and Networks Executive Agency (INEA). Horizon 2020 Funding Areas. European Commission. Available online: https://ec.europa.eu/inea/en/horizon-2020 (accessed on 3 February 2020).
- Shift2rail.org, “About”. Available online: https://shift2rail.org/about-shift2rail/ (accessed on 3 February 2020).
- Nakhaee, M.C.; Hiemstra, D.; Stoelinga, M.; van Noort, M. The Recent Applications of Machine Learning in Rail Track Maintenance: A Survey. In Proceedings of the International Conference on Reliability, Safety, and Security of Railway Systems, Lille, France, 4–6 June 2019; Springer: Berlin/Heidelberg, Germany, 2019; pp. 91–105.
- Thilagavathy, N.; Harene, J.; Sherine, M.; Shanmugasundari, T. Survey on railway wheel defect detection using machine learning. AutAut Res. J. 2020, 11, 4.
- Liu, S.; Wang, Q.; Luo, Y. A review of applications of visual inspection technology based on image processing in the railway industry. Transp. Saf. Environ. 2019, 1, 185–204.
- Xie, P.; Li, T.; Liu, J.; Du, S.; Yang, X.; Zhang, J. Urban flow prediction from spatiotemporal data using machine learning: A survey. Inf. Fusion 2020, 59, 1–12.
- Wen, C.; Huang, P.; Li, Z.; Lessan, J.; Fu, L.; Jiang, C.; Xu, X. Train Dispatching Management With Data-Driven Approaches: A Comprehensive Review and Appraisal. IEEE Access 2019, 7, 114547–114571.
- Zhu, L.; Yu, F.R.; Wang, Y.; Ning, B.; Tang, T. Big Data Analytics in Intelligent Transportation Systems: A Survey. IEEE Trans. Intell. Transp. Syst. 2018, 20, 383–398.
- Ghofrani, F.; He, Q.; Goverde, R.; Liu, X. Recent applications of big data analytics in railway transportation systems: A survey. Transp. Res. Part C Emerg. Technol. 2018, 90, 226–246.
- Wu, Q.; Cole, C.; McSweeney, T. Applications of particle swarm optimization in the railway domain. Int. J. Rail Transp. 2016, 4, 167–190.
- Bešinović, N.; Tang, R.; Lin, Z.; Liu, R.; Tang, T.; De Donato, L.; Vittorini, V.; Wang, Z.; Flammini, F.; Pappaterra, M.J.; et al. Deliverable D1.2: Summary of Existing Relevant Projects and State-of-the-Art of AI Application in Railways, RAILS, Shift2Rail. Available online: https://rails-project.eu/wp-content/uploads/sites/73/2021/05/RAILS_D12_v23.pdf (accessed on 5 April 2020).
- Marrone, S.; De Donato, L.; Vittorini, V.; Nardone, R.; Tang, R.; Besinovic, N.; Flammini, F.; Goverde, R.M.P.; Lin, Z. Findings about the State-of-Practice. Deliverable D1.3 Application Areas (Chapter 5). 2021. Available online: https://rails-project.eu/wp-content/uploads/sites/73/2021/09/RAILS_D1_3_Application_Areas_v32.pdf (accessed on 15 August 2021).
- The Southeastern Pennsylvania Transportation Authority (SEPTA). Regional Rail: Predict Arrival Times of Philadelphia’s Regional Trains (Version 1). [dataset]. 2019. Available online: https://www.kaggle.com/septa/on-time-performance (accessed on 24 September 2020).
- Cecaj, A. Train Delays in Italy Bologna-Milan (Version 1). [dataset]. 2020. Available online: https://www.kaggle.com/alketcecaj/train-delays-in-italy-bolognamilan (accessed on 24 September 2020).
- Trains Express Régionaux. Trains Express Régionaux: Points D’arrêts et Horaires des Lignes. [dataset]. 2016. Available online: https://www.data.gouv.fr/en/datasets/trains-express-regionaux-points-darrets-et-horaires-des-lignes/ (accessed on 24 September 2020).
- Yinghui, W. Real-World Case Based on Batong Line in Beijing Railway Network. Mirror of Mendeley Data. [dataset]. 2020. Available online: https://figshare.com/articles/dataset/Real-world_case_based_on_Batong_line_in_Beijing_railway_network/11627880 (accessed on 5 October 2020).
- Pranav, B. NJ Transit + Amtrak (NEC) Rail Performance (Version 2). [dataset]. 2020. Available online: https://www.kaggle.com/pranavbadami/nj-transit-amtrak-nec-performance (accessed on 24 September 2020).
- Abderrahman, A.A. Commuter Train Timetable: Commuter Train Service in Stockholm 2012 (Version 2). [dataset]. 2012. Available online: https://www.kaggle.com/abdeaitali/commuter-train-timetable (accessed on 25 September 2020).
- Tanima, S. Data Analysis and Visualization of Indian Railways (Version 1). [dataset]. 2018. Available online: https://www.kaggle.com/tanimasarkhel/data-analysis-and-visualization-of-indian-railways (accessed on 25 September 2020).
- Harshit, G. Indian Railways Time Table for Trains Available (Version 1). [dataset]. 2019. Available online: https://www.kaggle.com/harsh16/indian-railways-time-table-for-trains-available (accessed on 25 September 2020).
- Binil, J. IRCTC-TrainInfo (Version 1). [dataset]. 2019. Available online: https://www.kaggle.com/binilj04/irctctraininfo (accessed on 26 September 2020).
- The Institute for Operations Research and the Management Sciences (INFORMS). Predicting Near Term Train Schedule Performance and Delay (Version 1). [dataset]. 2018. Available online: https://connect.informs.org/railway-applications/new-item3/problem-repository16 (accessed on 26 September 2020).
- The Institute for Operations Research and the Management Sciences (INFORMS). Routing Trains through a Railway Network: Joint Optimization on Train Timetabling and Maintenance Task Scheduling. [dataset]. 2016. Available online: https://connect.informs.org/railway-applications/new-item3/problem-repository16 (accessed on 26 September 2020).
- Yinghui, W. Four Small Cases for the Fairness Problem of Train Timetabling. [dataset]. 2020. Available online: https://mendeley.figshare.com/articles/dataset/Four_small_cases_for_the_fairness_problem_of_train_timetabling/12402911 (accessed on 26 September 2020).
- Harrod, S.; Cerreto, F.; Nielsen, O.A. OpenTrack simulation model files and output dataset for a Copenhagen suburban railway. Data Brief 2019, 25, 103952.
- Harrod, S.; Cerreto, F.; Nielsen, O.A. A closed form railway line delay propagation model. Transp. Res. Part C Emerg. Technol. 2019, 102, 189–209.
- Mahmoud, M.; Erhan, K.; Geoff, K.; Shi, Q.L. Experimental dataset for optimizing the freight rail operations. Data Brief 2016, 9, 492–500.
- Jesse, G.; (Surface Transportation Board). Trains Held Short. [dataset]. 2020. Available online: https://agtransport.usda.gov/Rail/Trains-Held-Short/iacs-9uck (accessed on 5 October 2020).
- The Institute for Operations Research and the Management Sciences (INFORMS) Railway Application Section (RAS). Integrated Train Blocking and Shipment Path Optimization (TBSP) (Version 1). [dataset]. 2019. Available online: https://connect.informs.org/railway-applications/new-item3/problem-repository16 (accessed on 1 November 2020).
- The Institute for Operations Research and the Management Sciences (INFORMS) Railway Application Section (RAS). Railroad Hump Yard Block-to-Track Assignment. [dataset]. 2014. Available online: https://connect.informs.org/railway-applications/new-item3/problem-repository16 (accessed on 1 November 2020).
- The Institute for Operations Research and the Management Sciences (INFORMS) Railway Application Section (RAS). Modeling Railroad Yard Capacity. [dataset]. 2013. Available online: https://connect.informs.org/railway-applications/new-item3/problem-repository16 (accessed on 1 November 2020).
- The Institute for Operations Research and the Management Sciences (INFORMS) Railway Application Section (RAS). Movement Planner Algorithm Design for Dispatching on Multi-Track Territories. [dataset]. 2012. Available online: https://connect.informs.org/railway-applications/new-item3/problem-repository16 (accessed on 1 November 2020).
- The Institute for Operations Research and the Management Sciences (INFORMS) Railway Application Section (RAS). Train Design Optimization Problem. [dataset]. 2011. Available online: https://connect.informs.org/railway-applications/new-item3/problem-repository16 (accessed on 1 November 2020).
- The Institute for Operations Research and the Management Sciences (INFORMS) Railway Application Section (RAS). Locomotive Refueling Problem. [dataset]. 2010. Available online: https://connect.informs.org/railway-applications/new-item3/problem-repository16 (accessed on 1 November 2020).
- Liu, C.; Thompson, D.; Griffin, M.J.; Entezami, M. Dataset for “Effect of Train Speed and Track Geometry on the Ride Comfort Of High-Speed Railways Based on ISO 2631-1”. University of Southampton. [dataset]. 2019. Available online: https://eprints.soton.ac.uk/432605/ (accessed on 5 October 2020).
- Liu, C.; Thompson, D.; Griffin, M.J.; Entezami, M. Effect of train speed and track geometry on the ride comfort in high-speed railways based on ISO 2631-1. Proc. Inst. Mech. Eng. Part F J. Rail Rapid Transit 2019, 234, 765–778.
- Li, L.; Thompson, D.; Xie, Y.; Zhu, Q.; Luo, Y.; Lei, Z. Dataset for Influence of Rail Fastener Stiffness on Railway Vehicle Interior Noise. University of Southampton. [dataset]. 2017. Available online: https://eprints.soton.ac.uk/428923/ (accessed on 10 August 2020).
- Li, L.; Thompson, D.; Xie, Y.; Zhu, Q.; Luo, Y.; Lei, Z. Influence of rail fastener stiffness on railway vehicle interior noise. Appl. Acoust. 2018, 145, 69–81.
- Li, Q.; Thompson, D. Dataset for Paper: Prediction of Rail and Bridge Noise from Concrete Railway Viaducts Using a Multi-Layer Rail Fastener Model and a Wavenumber Domain Method. University of Southampton. [dataset]. 2017. Available online: https://eprints.soton.ac.uk/411733/ (accessed on 10 August 2020).
- Li, Q.; Thompson, D. Prediction of rail and bridge noise arising from concrete railway viaducts by using a multilayer rail fastener model and a wavenumber domain method. Proc. Inst. Mech. Eng. Part F J. Rail Rapid Transit 2017, 232, 1326–1346.
- César, R.S.-O.; José, M.M.; Juan, D.C.-M.; José, L.; Garcia, B. Bearing Database (Version V1). [dataset]. 2020. Available online: https://zenodo.org/record/3898942 (accessed on 10 August 2020).
- Gehrig, U. Condition of Pantograph Slide Plates: Images From Pantograph Slide Plates of Various Rolling Stock Vehicles (Version 5). [dataset]. 2018. Available online: https://www.kaggle.com/gehrig/pantograph (accessed on 2 September 2020).
- van Hees, O. Finding Railway Fasteners in Image Data—ProRail (Version 4). [dataset]. 2020. Available online: https://www.kaggle.com/oscarvanhees/finding-railway-fasteners-in-image-data-prorail (accessed on 20 September 2020).
- Ruilin, Y.; Dan, L.; Chayut, N.; Rims, J.; Sakdirat, K. Fatigue Assessment Method for Pre-Stressed Concrete Sleeper (Version 2). [dataset]. 2017. Available online: https://zenodo.org/record/1155711#.YI6t_ej7Stp (accessed on 15 August 2020).
- You, R.; Li, D.; Ngamkhanong, C.; Janeliukstis, R.; Kaewunruen, S. Fatigue Life Assessment Method for Prestressed Concrete Sleepers. Front. Built Environ. 2017, 3, 68.
- van Hees, O. Image Data of Insulation—ProRail: Image Recognition Used for Asset Detection. [dataset]. 2019. Available online: https://www.kaggle.com/oscarvanhees/insulation-joint-training-set-prorail (accessed on 9 November 2020).
- van Hees, O. Image Data of Spark Erosion—ProRail (Version 3). [dataset]. 2020. Available online: https://www.kaggle.com/oscarvanhees/image-data-of-spark-erosion-prorail (accessed on 10 January 2021).
- Milne, D. Automated Processing of Railway Track Deflection Signals Obtained from Velocity and Acceleration Measurements. [dataset]. 2018. Available online: http://eprints.soton.ac.uk/id/eprint/419011 (accessed on 15 January 2021).
- Milne, D.; Pen, L.L.; Thompson, D.; Powrie, W. Automated processing of railway track deflection signals obtained from velocity and acceleration measurements. Proc. Inst. Mech. Eng. Part F J. Rail Rapid Transit 2018, 232, 2097–2110.
- Heydari-Noghabi, H.; Varandas, J.N.; Esmaeili, M.; Zakeri, J. Investigating the Influence of Auxiliary Rails on Dynamic Behavior of Railway Transition Zone by a 3D Train—Track Interaction Model. [dataset]. 2017. Available online: https://figshare.com/articles/dataset/Investigating_the_Influence_of_Auxiliary_Rails_on_Dynamic_Behavior_of_Railway_Transition_Zone_by_a_3D_Train-Track_Interaction_Model/5734317 (accessed on 15 August 2020).
- Heydari-Noghabi, H.; Varandas, J.N.; Esmaeili, M.; Zakeri, J. Investigating the Influence of Auxiliary Rails on Dynamic Behavior of Railway Transition Zone by a 3D Train-Track Interaction Model. Lat. Am. J. Solids Struct. 2017, 14, 2000–2018.
- Le Pen, L.; Milne, D.; Thompson, D.; Powrie, W. Evaluating Railway Track Support Stiffness from Trackside Measurements in the Absence of Wheel Load Data; University of Southampton: Southampton, UK, 2016.
- Le Pen, L.; Milne, D.; Thompson, D.; Powrie, W. Evaluating railway track support stiffness from trackside measurements in the absence of wheel load data. Can. Geotech. J. 2016, 53, 1156–1166.
- Milne, D.; le Pen, L.; Watson, G.; Masoudi, A. Data for: An Analysis of Railway Track Behaviour based on Distributed Optical Fibre Acoustic Sensing (Version 1). University of Southampton. [dataset]. 2020. Available online: https://eprints.soton.ac.uk/438063/ (accessed on 3 December 2020).
- Milne, D.; Masoudi, A.; Ferro, E.; Watson, G.; Le Pen, L. An analysis of railway track behaviour based on distributed optical fibre acoustic sensing. Mech. Syst. Signal Process. 2020, 142, 106769.
- The Institute for Operations Research and the Management Sciences (INFORMS) Railway Application Section (RAS). Track Geometry Analytics. [dataset]. 2015. Available online: https://connect.informs.org/railway-applications/new-item3/problem-repository16 (accessed on 1 November 2020).
- Liu, J.; Chen, S.; Lederman, G.; Kramer, D.B.; Noh, H.Y.; Bielak, J.; Berges, M. The DR-Train dataset: Dynamic Responses, GPS Positions and Environmental Conditions Of Two Light Rail Vehicles in Pittsburgh (Version 1.0). [dataset]. 2018. Available online: https://zenodo.org/record/1432702#.YLpmY6j7Sto (accessed on 29 July 2020).
- Liu, J.; Chen, S.; Lederman, G.; Kramer, D.B.; Noh, H.Y.; Bielak, J.H.G., Jr.; Kovačević, J.; Bergés, M. Dynamic responses, GPS positions and environmental conditions of two light rail vehicles in Pittsburgh. Sci. Data 2019, 6, 146.
- Suhr, B.; Six, K.; Skipper, W.A.; Lewis, R. 3D Scans of Two Types of Railway Ballast Including Shape Analysis Information (Version 1). [dataset]. 2020. Available online: https://zenodo.org/record/3689592 (accessed on 1 December 2020).
- Suhr, B.; Skipper, W.A.; Lewis, R.; Six, K. Shape analysis of railway ballast stones: Curvature-based calculation of particle angularity. Sci. Rep. 2020, 10, 6045.
- Suhr, B.; Six, K. Simple particle shapes for DEM simulations of railway ballast: Influence of shape descriptors on packing behaviour. Granul. Matter 2020, 22, 43.
- Suhr, B.; Six, K. Compression Tests and Direct Shear Test of Two Types of Railway Ballast (Version 1). [dataset]. 2018. Available online: https://zenodo.org/record/1423742#.YI6RR-j7Stp (accessed on 29 July 2020).
- Suhr, B.; Butcher, T.A.; Lewis, R.; Six, K. Cyclic Friction Tests of Ballast Stones Interfaces Under Varying Vertical Load (Version 1). [dataset]. 2020. Available online: https://zenodo.org/record/3893842 (accessed on 22 November 2020).
- Suhr, B.; Butcher, T.; Lewis, R.; Six, K. Friction and wear in railway ballast stone interfaces. Tribol. Int. 2020, 151, 106498.
- Xie, Q.; Zhi, X. Data on wind-induced responses of the hanging point for a high-speed railway in China. Data Brief 2018, 21, 2259–2261.
- Xie, Q.; Zhi, X. Wind tunnel test of an aeroelastic model of a catenary system for a high-speed railway in China. J. Wind. Eng. Ind. Aerodyn. 2018, 184, 23–33.
- Signorino, D.; Giordano, D.; Mariscotti, A.; Gallo, D.; Femine, A.D.; Balic, F.; Quintana, J.; Donadio, L.; Biancucci, A. Dataset of measured and commented pantograph electric arcs in DC railways. Data Brief 2020, 31, 105978.
- MyRailS. MyRailS: Accurate Measurements for Energy Efficiency in European Railway and Subway Systems. Available online: https://myrails.it/ (accessed on 1 October 2020).
- Mariscotti, A. Data sets of measured pantograph voltage and current of European AC railways. Data Brief 2020, 30, 105477.
- Arboleya, P.; Mohamed, B.; El-Sayed, I.; Gonzalez-Moran, C. 2 × 25 kv Railway Feeding System Simulation Database, IEEE Dataport. [dataset]. 2019. Available online: https://ieee-dataport.org/documents/2x25kv-railway-feeding-system-simulation-database (accessed on 10 September 2020).
- Mohamed, B.; Arboleya, P.; El-Sayed, I.; Gonzalez-Moran, C. High-Speed 2 × 25 kV Traction System Model and Solver for Extensive Network Simulations. IEEE Trans. Power Syst. 2019, 34, 3837–3847.
- Yuan, Z. Performance of Congestion Control Algorithms on High-Speed Railway Scenairo (Version 1), IEEE Dataport. [dataset]. 2020. Available online: https://ieee-dataport.org/documents/performance-congestion-control-algorithms-high-speed-railway-scenairo (accessed on 12 December 2020).
- Maes, K.; Lombaert, G. Monitoring Data for Railway Bridge KW51 In Leuven, Belgium, Before, During, and after Retrofitting (Version 1.0). [dataset]. 2020. Available online: https://zenodo.org/record/3745914 (accessed on 10 December 2020).
- Martin-Sanz, H.; Tatsis, K.; Stipanovic, I.; Damjanovic, D.; Sanja, A.; Brühwiler, E.; Chatzi, E. Towards the use of UHPFRC in railway bridges: The rehabilitation of Buna Bridge (Version 1). [dataset]. 2008. Available online: https://zenodo.org/record/2574457#.YI5hGOj7Stq (accessed on 8 August 2020).
- Martín-Sanz, H.; Tatsis, K.; Chatzi, E.; Brühwiler, E.; Stipanovic, I.; Mandic, A.; Damjanovic, D.; Sanja, A. Towards the use of UHPFRC in railway bridges: The rehabilitation of Buna Bridge. In Proceedings of the 5th International Symposium on Life-Cycle Civil Engineering (IALCCE 2018), Lake Como, Italy, 11–14 June 2018.
- Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes Dataset for Semantic Urban Scene Understanding. [dataset]. 2016. Available online: https://www.cityscapes-dataset.com (accessed on 10 August 2020).
- Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes Dataset for Semantic Urban Scene Understanding. 2016. Available online: https://www.cityscapes-dataset.com (accessed on 10 August 2020).
- Zendel, O.; Murschitz, M.; Zeilinger, M.; Steininger, D.; Abbasi, S.; Beleznai, C. RailSem19: A Dataset for Semantic Rail Scene Understanding (Version 1). [dataset]. 2019. Available online: https://wilddash.cc/railsem19 (accessed on 25 August 2020).
- Zendel, O.; Murschitz, M.; Zeilinger, M.; Steininger, D.; Abbasi, S.; Beleznai, C. A Dataset for Semantic Rail Scene Understanding. Conference on Computer Vision and Pattern Recognition (CVPR) Workshop, Long Beach, CA, USA, 16–20 June 2019.
- Patino, L.; Nawaz, T.; Cane, T.; Ferryman, J. PETS 2017. [dataset]. 2017. Available online: https://doi.org/10.1109/CVPRW.2017.264 (accessed on 23 August 2020).
- Patino, L.; Nawaz, T.; Cane, T.; Ferryman, J. PETS 2017: Dataset and Challenge. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 2126–2132.
- Riquelme, J.L.A.; Ruffo, M.; Tomás, R.; Riquelme, A.; Pagán, J.I.; Cano, M.; Pastor, J.L. 3D Point Cloud of a Railway Slope—MOMIT (Multi-Scale Observation And Monitoring of Railway Infrastructure Threats) EU Project—H2020-EU.3.4.8.3.—Grant Agreement ID: 777630. [dataset]. 2020. Available online: https://zenodo.org/record/3777996 (accessed on 2 December 2020).
- MOMIT Project Consortium. MOMIT: Multi-scale Observation and Monitoring of railway Infrastructure Threats. Available online: https://www.momit-project.eu/ (accessed on 1 October 2020).
- Yan, Y.; Li, T.; Liu, J.; Wang, W.; Su, Q. Monitoring and Early Warning Method For A Rock Fall Along Railways Based On Vibration Signal Characteristics. [dataset]. 2019. Available online: https://www.nature.com/articles/s41598-019-43146-1 (accessed on 1 October 2020).
- Yan, Y.; Li, T.; Liu, J.; Wang, W.; Su, Q. Monitoring and early warning method for a rockfall along railways based on vibration signal characteristics. Sci. Rep. 2019, 9, 6606.
- Li, K.; Wang, Y.; Lin, Q.; Cheng, Q.; Wu, Y. Experiments on Granular Flow Behavior and Deposit Characteristics: Implications For Rock Avalanche Kinematics (Version 3). [dataset]. 2020. Available online: https://zenodo.org/record/3930161#.YHM1Quj7Stp (accessed on 1 November 2020).
- Boteler, D.; Pirjola, R.; Marti, L. Analytic Geomagnetic and Geoelectric Fields, IEEE Dataport. [dataset]. 2019. Available online: https://ieee-dataport.org/open-access/analytic-geomagnetic-and-geoelectric-fields (accessed on 8 August 2020).
- Mimi, A.L. Accidents in France from 2005 to 2016 (Version 2). [dataset]. 2018. Available online: https://www.kaggle.com/ahmedlahlou/accidents-in-france-from-2005-to-2016 (accessed on 8 August 2020).
- Jasińska, D.; Żmihorski, M.; Krauze-Gryz, D.; Kotowska, D.; Werka, J.; Piotrowska, D.; Pärt, T. Data From: Linking Habitat Composition, Local Population Densities and Traffic Characteristics to Spatial Patterns of Ungulate-Train Collisions. [dataset]. 2019. Available online: https://datadryad.org/stash/dataset/doi:10.5061/dryad.870t013 (accessed on 8 August 2020).
- Jasińska, K.D.; Żmihorski, M.; Krauze-Gryz, D.; Kotowska, D.; Werka, J.; Piotrowska, D.; Pärt, T. Linking habitat composition, local population densities and traffic characteristics to spatial patterns of ungulate-train collisions. J. Appl. Ecol. 2019, 56, 2630–2640.
- Minasyan, N. Sleep Patterns of Railroad Dispatchers: How well Railroad Dispatchers Sleep (Version 1). [dataset]. 2018. Available online: https://www.kaggle.com/nairaminasyan/sleep-patterns (accessed on 8 August 2020).
- Geislinger, V. BART Ridership (Version 6). [dataset]. 2020. Available online: https://www.kaggle.com/mrgeislinger/bartridership (accessed on 20 August 2020).
- Bellanger, A. SBB CFF FFS—Passenger Frequency. [dataset]. 2016. Available online: https://data.world/antoinebell/sbb-passengerfrequence (accessed on 20 August 2020).
- Mengibar, C. D BAHN Travels Captures: Data Captured from Trains And Travels in Different Station Of Germany (Version 1). [dataset]. 2019. Available online: https://www.kaggle.com/chemamengibar/dbahn-travels-captures (accessed on 10 August 2020).
- Ansari, U. Indian Metro Data: Prediction of the Future Traffic (Version 1). [dataset]. 2019. Available online: https://www.kaggle.com/umairnsr87/indian-metro-data (accessed on 8 August 2020).
- Reddy, R. Predict Train Occupancy Time Series (Version 1). [dataset]. 2020. Available online: https://www.kaggle.com/gajjadarahul/predict-train-occupancy-time-series (accessed on 3 October 2020).
- Tyagi, A. Train Crowd Density: Details of Several Trains Along with Target Variable Being Crowd Density (Version 1). [dataset]. 2019. Available online: https://www.kaggle.com/akashtyagi08/trainn (accessed on 8 August 2020).
- Silva, F.B.E.; Forzieri, G.; Herrera, M.A.M.; Bianchi, A.; LaValle, C.; Feyen, L. HARCI-EU, a harmonized gridded dataset of critical infrastructures in Europe for large-scale risk assessments. Sci. Data 2019, 6, 126.
Search Portal | Short Description | URL |
---|---|---|
Data World | World’s largest open data and data collaboration community | https://data.world/ |
Data.gov | Open data from the government of the United States of America | https://www.data.gov/ |
Data.gov.uk | Open data from the government of the United Kingdom | https://data.gov.uk/ |
European Data Portal | Open data portals across 36 European countries | https://www.europeandataportal.eu/ |
FigShare | Open repository of data and papers published in academic research | https://figshare.com/ |
Google Dataset Search | Google’s dataset search engine | https://datasetsearch.research.google.com/ |
Humanitarian Data Exchange (HDX) | Open platform for sharing humanitarian data maintained by the United Nations (UN) | https://data.humdata.org/ |
IEEE DataPort | Dataset storage and search platform maintained by IEEE | https://ieee-dataport.org/ |
Kaggle | Data science and machine learning portal maintained by Google | https://www.kaggle.com/ |
ScienceDirect’s Data in Brief journal | Open access journal on published datasets and data articles maintained by Elsevier | https://www.sciencedirect.com/journal/data-in-brief |
Zenodo | Open access repository of research papers and datasets operated by CERN with support from the EU’s Horizon 2020 program | https://zenodo.org/ |
Type of Data | Datasets |
---|---|
Image data | (46,49,50,67,79,81,83) |
Numerical data | (16–25,28,30,31,32,34,38,40,42,44,47,51,53,55,57,59,60,65,68,70,72,73,75–77,81,87,89–92,94–100) |
Label data | (16–18,20–25,30,31,68,79,81,91,92,94,98) |
Other data | (25,26,27,32–37,59,60,62,66,85) |
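Railway Subdomain | Type of Data | Citations |
---|---|---|
Traffic Planning and Management | Image data | --- |
Traffic Planning and Management | Numerical data | (16–25,28,30,31,32,38,40,42,95,97–99) |
Traffic Planning and Management | Label data | (16–18,20–25,30,31,98) |
Traffic Planning and Management | Other data | (25,26,27,32–37) |
Maintenance and Inspection | Image data | (45,46,49,50) |
Maintenance and Inspection | Numerical data | (34,38,40,42,44,47,51,53,55,57,59,60,65,68,70,72,73,75–77,87,90) |
Maintenance and Inspection | Label data | (68) |
Maintenance and Inspection | Other data | (59,60,62,66,85) |
Safety and Security | Image data | (79,81,83) |
Safety and Security | Numerical data | (81,85,89–92,94) |
Safety and Security | Label data | (79,81,91,92,94) |
Safety and Security | Other data | (85) |
Passenger Mobility | Image data | --- |
Passenger Mobility | Numerical data | (20,21,95–100) |
Passenger Mobility | Label data | (20,21,98) |
Passenger Mobility | Other data | --- |
Autonomous Train Driving and Train Control | Image data | --- |
Autonomous Train Driving and Train Control | Numerical data | --- |
Autonomous Train Driving and Train Control | Label data | --- |
Autonomous Train Driving and Train Control | Other data | --- |
Revenue Management | Image data | --- |
Revenue Management | Numerical data | --- |
Revenue Management | Label data | --- |
Revenue Management | Other data | --- |
Transport Policy | Image data | --- |
Transport Policy | Numerical data | --- |
Transport Policy | Label data | --- |
Transport Policy | Other data | --- |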
Name/Title | Description | Last Updated | Link |
---|---|---|---|
ERAIL-Railway accident and incident links | Reports of railway accidents and incidents within the European Union | 2020-10 | https://data.europa.eu/euodp/en/data/dataset/erail-investigations |
Train Stations in Europe | Names, coordinates and properties of European railway stations | 2020-10 | https://www.kaggle.com/headsortails/train-stations-in-europe |
Grade Crossings Inventory | An inventory of the location and characteristics of railway crossings in Canada | 2020-09 | https://open.canada.ca/data/en/dataset/d0f54727-6c0b-4e5a-aa04-ea1463cf9f4c |
Railroad Crossings | Detailed information of all railway crossings in the United States | 2020-05 | https://hifld-geoplatform.opendata.arcgis.com/datasets/railroad-crossings |
HARCI-EU | Geospatial data of critical infrastructure, including railway networks | 2019-12 | https://figshare.com/articles/dataset/HARmonized_grids_of_Critical_Infrastructures_in_Europe_HARCI-EU_/7777301 |
Citylines: Transit systems of the world | Transportation line data from cities from around the world, and historical data of railway line development | 2019-03 | https://www.citylines.co/ |
National Railway Network-NRWN-GeoBase Series | Geometric descriptions and basic railway attributes | 2018-05 | https://open.canada.ca/data/en/dataset/ac26807e-a1e8-49fa-87bf-451175a859b8 |
Freight Analysis Framework (FAF) | Flows of goods among US regions for all modes of transportation, including railways | 2017-08 | https://www.kaggle.com/usdot/freight-analysis-framework |
WFP-Global railways | Geodata about global railways | 2017-05 | https://data.humdata.org/dataset/global-railways |
Rail Network | Linear network representing railway tracks and other data from Canadian railways | 2012-06 | https://open.canada.ca/data/en/dataset/c2c4f386-a736-4eaa-b5b6-28c3a8f75466 |
Railroad Bridges | Detailed information on all railway bridges in the United States | 2009-09 | https://hifld-geoplatform.opendata.arcgis.com/datasets/railroad-bridges |
National Rail Enquiries (NRE) | Real-time train data in Great Britain as a public API | Active | https://www.nationalrail.co.uk/46391.aspx |
Sydney Trains Service Interruptions RSS Feed | Real-time machine-readable feed of Sydney Trains information concerning service interruptions | Active | https://opendata.transport.nsw.gov.au/dataset/public-transport-realtime-alerts-0 |
Nederlandse Spoorwegen (Netherlands Railways) | A public REST API for Dutch Railways in the Netherlands | Active | https://www.ns.nl/en/travel-information/ns-api |
Trafiklab | A collection of APIs for public transport in Sweden | Active | https://www.trafiklab.se/ |
Google Transit APIs | Google’s API services for GTFS Static and GTFS Realtime | Active | https://developers.google.com/transit |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Pappaterra, M.J.; Flammini, F.; Vittorini, V.; Bešinović, N. A Systematic Review of Artificial Intelligence Public Datasets for Railway Applications. Infrastructures 2021, 6, 136. https://doi.org/10.3390/infrastructures6100136