Digitalised Predictive Maintenance in Railways: A Systematic Review of AI, BIM, and Digital Twins

Mutlu, Ugur; Kaewunruen, Sakdirat

doi:10.3390/infrastructures11030087


by Ugur Mutlu and Sakdirat Kaewunruen *

Department of Civil Engineering, University of Birmingham, Birmingham B15 2TT, UK

* Author to whom correspondence should be addressed.

Infrastructures 2026, 11(3), 87; https://doi.org/10.3390/infrastructures11030087

Submission received: 13 February 2026 / Revised: 2 March 2026 / Accepted: 6 March 2026 / Published: 8 March 2026

(This article belongs to the Special Issue Smart Mobility and Transportation Infrastructure)


Abstract

Railway infrastructure faces growing degradation risks from intensified operational loads and climate change, necessitating a paradigm shift from reactive repairs to digitalized predictive maintenance. This study explores the synergistic convergence of Artificial Intelligence (AI), Building Information Modeling (BIM), and Digital Twins (DT) to optimize asset management. A Systematic Literature Review was conducted, adhering to PRISMA guidelines and strictly selecting and analyzing 73 peer-reviewed articles from Web of Science and Scopus (2015–2026). The results reveal that while Supervised Learning remains the dominant paradigm for defect detection, Reinforcement Learning is emerging as a key tool for maintenance scheduling. However, a critical “Digital Twin Gap” is identified, where most systems function only as unidirectional digital representations rather than bidirectional, self-correcting twins. Furthermore, despite frequent sustainability claims, there is a marked absence of quantified environmental metrics in current research. Consequently, this paper concludes that future advancements must prioritize the development of “True Digital Twins” with autonomous actuation, ensure interoperability through Industry Foundation Classes (IFC), and integrate explicit “Green KPIs” to objectively validate the environmental benefits of digitalized maintenance strategies.

Keywords:

railway; predictive maintenance; digital twin; machine learning; building information modelling; artificial intelligence; deep learning; sustainability

1. Introduction

Rail systems are fundamental to the social, economic, and sustainable progression of modern nations [1,2,3]. Consequently, numerous countries are consistently expanding their investments in railway networks due to their capacity to transport significant quantities of freight and passengers with superior energy efficiency relative to alternative land-based transportation methods [4,5]. Compared with other modes of transport, such as air or road, rail is considered more environmentally friendly [6]. Recent data indicate that while the transport sector accounts for approximately 25% of Europe’s Greenhouse Gas (GHG) emissions, rail transport contributes only 0.4% [7]. For this reason, rail networks are fundamental to advancing the United Nations Sustainable Development Goals, specifically regarding the development of resilient infrastructure and innovation (Goal 9), as well as the creation of sustainable, inclusive urban environments (Goal 11). By 2030, Target 11.2 intends to ensure that safe, inclusive, affordable, and sustainable transportation networks are available to everyone. Similarly, Target 9.1 focuses on developing resilient, high-quality, and sustainable infrastructure to bolster economic growth and enhance societal welfare [8]. However, this sustainability advantage is contingent upon the reliability of the physical infrastructure. Significant economic losses occur when these vital infrastructures are unable to perform their designated functions, such as when physical breakdowns result in service interruptions [9].

The global demand for rail transport is consistently rising. To accommodate this growth, both the speed and the axle loads of rolling stock must be increased, which subjects the railway infrastructure to significantly higher mechanical stresses [10]. These intensified loads serve as a primary catalyst for the accelerated degradation of track components [11,12]. Simultaneously, climate change is increasing the frequency and severity of extreme events, which cause extensive damage to railway infrastructure [13,14].

Maintenance interventions serve as the critical mechanism for managing infrastructure degradation and ensuring the long-term integrity of railway assets. To preserve safety and passenger comfort, these activities must be executed with increasing frequency and precision; failure to optimize maintenance forces a choice between severe infrastructure deterioration and unsustainable maintenance expenditure. Efficiency is therefore paramount, given that maintenance must be performed more frequently to keep the network at the required standard.

Without efficient maintenance, the overall railway system may suffer from poor serviceability, availability, reliability, passenger comfort, efficiency, and safety. If maintenance is neglected, the infrastructure deteriorates rapidly. Conversely, if maintenance is performed too aggressively without precise data, valuable materials and energy are wasted. Therefore, the optimization of maintenance is not merely a technical or financial concern; it is a fundamental prerequisite for the sector’s long-term sustainability.

The starting point of this optimization is the implementation of a “maintenance policy,” which represents the strategic framework and managerial actions suggested by maintenance models to ensure a system consistently performs its required functions. Historically, railway operators have relied on corrective maintenance, a reactive approach where repairs are conducted only after defects are identified. Due to its reactive nature, corrective maintenance is inherently unplanned, leading to high operational disruptions and emergency repair costs that far exceed those of planned interventions. To mitigate these risks, the industry transitioned toward preventive maintenance, a strategy defined by routine, scheduled activities [15].

Although preventive maintenance successfully minimizes the unforeseen failures associated with corrective approaches, it frequently results in “over-maintenance”, where functional infrastructure is serviced or replaced prematurely, causing significant resource waste. To address these dual inefficiencies, the industry is undergoing a paradigm shift toward predictive maintenance. By leveraging advanced tools to forecast future defects, predictive maintenance aims to execute interventions only when strictly necessary, thereby maximizing the remaining useful life (RUL) of components while maintaining infrastructure integrity. By limiting interventions to components that genuinely require attention, this approach significantly enhances sustainability. Recent studies have demonstrated that transitioning from scheduled to predictive strategies can reduce the number of maintenance activities by up to 48%, directly leading to a 52% reduction in maintenance-related carbon emissions [16]. One of the objectives of the European Rail Traffic Management System (ERTMS) is to further improve predictive maintenance in railway infrastructure through improved early fault detection [6].

While the theoretical benefits of predictive maintenance are clear, its practical implementation is hindered by the sheer complexity of railway data. The maintenance phase of a railway project generates a vast volume of information, a consequence of both its extended duration and the inherent complexity of the rail system. Furthermore, data continues to accumulate and evolve throughout the entire project lifecycle, spanning from initial construction and commissioning to operations, ongoing maintenance, and eventual decommissioning [17]. Therefore, robust data management strategies are essential to ensure operational efficiency and reliability. To bridge this gap, research is increasingly focusing on the convergence of digital enablers.

Digital Twin (DT) technology has experienced significant growth across various sectors. A widely accepted understanding is that a DT is a virtual counterpart of a specific physical object [18]. Complementing this is Building Information Modeling (BIM), whose integration with machine learning (ML) can help resolve the data management problems described above. In addition, efficient data management allows the accumulated data to be utilized to improve railway maintenance [19]. Artificial Intelligence (AI) provides the analytic engine. Predictive maintenance based on data-driven models is one of the most promising approaches among emerging technologies, supported by the extraordinary growth of the AI sector [20].

AI serves as the overarching discipline for these computational advancements, with ML functioning as its core subset dedicated to data-driven performance improvement. Within the domain of railway maintenance, ML algorithms are generally categorized into three primary learning paradigms: Supervised Learning, which relies on labeled historical data to map inputs to outputs; Unsupervised Learning, which identifies hidden patterns or anomalies in unlabeled datasets; and Reinforcement Learning, which optimizes decision-making through agent-environment interactions. As illustrated in Figure 1, each paradigm encompasses a diverse array of specialized algorithms tailored to specific maintenance tasks such as defect detection, clustering, or scheduling optimization.

Despite the growing academic interest in digital transformation, existing research predominantly focuses on the individual application of technologies to address specific cost or technical challenges. While many studies examine the isolated benefits of BIM, DT, or ML for cost reduction and technical accuracy, there is a distinct gap regarding their combined use cases. Fewer studies explicitly investigate the synergistic potential of integrating these three domains into a unified maintenance framework. This paper provides a systematic literature review and critical analysis of the state of the art in digitalized predictive maintenance, bridging this gap by examining how the fusion of these technologies optimizes maintenance strategies. This paper aims to:

Identify and categorize the ML paradigms and algorithms currently applied to railway maintenance.
Correlate specific digital enablers with distinct maintenance optimization tasks (e.g., defect detection, RUL estimation) to determine technical trends and suitability.
Analyze the geographical and temporal distribution of research to identify global hubs, “Motor Themes,” and emerging areas of interest.
Highlight critical gaps to guide future research toward resilient and sustainable railway maintenance.
Assess how mature the integration of BIM and DT technologies is within predictive maintenance workflows.
Identify, categorize, and synthesize key empirical findings, applied algorithms, and targeted railway components from the selected literature.

The remainder of this paper is structured as follows: Section 2 details the research methodology, outlining the Systematic Literature Review process and PRISMA compliance used for data selection and filtration. Section 3 presents the comprehensive results and discussion, categorizing Machine Learning paradigms, evaluating the integration maturity of DTs and BIM, and analyzing the findings from cost and sustainability perspectives. This section also synthesizes critical technical barriers and proposes future research directions to address identified gaps. Section 4 discusses the practical applications of the study’s findings for industry stakeholders, alongside the inherent limitations of this review. Finally, Section 5 offers concluding remarks and summarizes the principal contributions of this work to the field of digitalized railway maintenance.

2. Methodology

To provide a comprehensive assessment of the state of the art in digitalized railway maintenance, this study employed a Systematic Literature Review approach. The methodology was designed to be transparent, replicable, and scientifically rigorous, strictly adhering to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [21]. The PRISMA checklist was followed throughout the methodology and reporting phases to maintain high standards of transparency. This systematic review was not registered in any prospective registry, and no prior protocol was published. This structured framework was selected to mitigate researcher bias, ensuring that the selection of evidence regarding predictive maintenance, BIM, DT, and AI remains objective and comprehensive.

The review process was executed in four distinct phases: identification, screening, eligibility, and data extraction and content analysis. To ensure the effective identification of relevant studies, a search string strategy was developed and iteratively refined throughout the review process. To align the dataset with the research objectives, pre-defined inclusion and exclusion criteria were employed to filter the search results. Following this selection, relevant data points were systematically retrieved from the literature and consolidated into a standardized extraction table [22]. Subsequently, the extracted information was critically assessed and categorized to address the specific research objectives. This explicit and systematic framework ensures that reliable results are obtained efficiently [23]. The efficiency of this process has been justified in various studies [24,25,26]. The following subsections outline the detailed steps of this approach.

2.1. Identification of Papers

To ensure comprehensive coverage, the search was conducted using the Web of Science (WoS) and Scopus digital libraries. These platforms are recognized as leading multidisciplinary databases, offering a vast repository of high-impact, peer-reviewed literature from leading academic journals [23]. The search string was constructed around four core concepts (railway, predictive maintenance, digital twin, and machine learning), each of which was expanded into a set containing the concept and its synonyms. The finalized Boolean search query, detailed in Table 1, was executed across the Title, Abstract, and Keywords fields of both digital libraries, resulting in a preliminary identification of 632 papers, as can be seen in Figure 1. Three criteria for inclusion/exclusion were then applied as follows:

Language: Papers written in English were included, and others were excluded.
Document Type: Peer-reviewed journal articles were included; conference papers, book chapters, reports and theses were excluded.
Time Interval: Papers dated between 2015 and 2026 were included, and others were excluded.

Following the application of the pre-defined inclusion/exclusion protocols, the dataset was refined to 142 records from Web of Science and 178 from Scopus. After removing duplicate entries, a total of 189 unique papers remained for the formal screening process.
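The identification steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the synonym lists, record fields, and DOIs below are hypothetical placeholders (the actual search terms are those in Table 1), and matching duplicates on a lower-cased DOI or title is one common convention, not necessarily the procedure the authors followed.

```python
# Hypothetical synonym sets; the review's real terms are given in its Table 1.
CONCEPT_SETS = [
    ["railway", "rail track", "rolling stock"],
    ["predictive maintenance", "condition-based maintenance"],
    ["digital twin", "building information modelling"],
    ["machine learning", "deep learning", "artificial intelligence"],
]

def build_query(concept_sets):
    """OR the synonyms inside each set, then AND the four sets together."""
    groups = ["(" + " OR ".join(f'"{t}"' for t in terms) + ")" for terms in concept_sets]
    return " AND ".join(groups)

def is_eligible(record):
    """Apply the three stated criteria: language, document type, time interval."""
    return (record["language"] == "English"
            and record["doc_type"] == "journal article"
            and 2015 <= record["year"] <= 2026)

def deduplicate(records):
    """Keep the first record per DOI (or per title when the DOI is missing)."""
    seen, unique = set(), []
    for r in records:
        key = r["doi"].lower() if r.get("doi") else r["title"].strip().lower()
        if key not in seen:
            seen.add(key)
            unique.append(r)
    return unique

# Placeholder records standing in for the merged WoS and Scopus exports.
records = [
    {"title": "Paper A", "doi": "10.1000/a", "language": "English",
     "doc_type": "journal article", "year": 2021},
    {"title": "Paper A", "doi": "10.1000/A", "language": "English",
     "doc_type": "journal article", "year": 2021},   # same paper, second database
    {"title": "Paper B", "doi": None, "language": "English",
     "doc_type": "conference paper", "year": 2022},  # excluded: document type
    {"title": "Paper C", "doi": "10.1000/c", "language": "English",
     "doc_type": "journal article", "year": 2013},   # excluded: time interval
]

screened = deduplicate([r for r in records if is_eligible(r)])
print(build_query(CONCEPT_SETS))
print(len(screened), "unique eligible record(s)")
```

Filtering before deduplication mirrors the order reported in the text, where the per-database counts are given after the criteria were applied and duplicates were removed afterwards.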

2.2. Screening and Eligibility

Following the initial identification and deduplication, the remaining 189 unique papers were subjected to a screening phase based on their titles and abstracts. The objective of this stage was to filter out studies that, while strictly meeting the search term requirements, did not align with the specific methodological or domain-related focus of this review.

A significant portion of the dataset was eliminated during this phase due to deviations from the core research objectives. The exclusion process was driven by the following constraints:

Maintenance Strategy Mismatch: Papers focusing exclusively on preventive maintenance strategies or health monitoring only were excluded, as this review specifically targets predictive and optimized maintenance frameworks.
Scope Limitations: Studies limited to defect detection without broader maintenance implications were removed to ensure a focus on holistic maintenance management.
Component Specificity: Research concentrating solely on electrical sub-components or large-scale civil structures (e.g., bridges) was excluded to prioritize core railway infrastructure and rolling stock assets.
Domain Relevance: Papers that were found to be unrelated to the railway sector or those lacking a clear maintenance dimension were discarded.

The quantitative distribution of papers eliminated due to these specific reasons is illustrated in Figure 2.

2.3. Data Extraction

Upon the finalization of the eligible dataset, the selected articles were subjected to a comprehensive full-text review to facilitate data extraction. Critical data attributes comprising the core research objectives, applied ML algorithms, and targeted railway components were synthesized into a structured table, presented as Table A1 in Appendix A.

The final phase of the methodology involved the critical synthesis and thematic analysis of this structured dataset to address the study’s primary aims [22]. This analytical framework was specifically designed to satisfy the defined research objectives through direct data interrogation, focusing on:

Mapping the geographical distribution of scholarly contributions to identify global research hubs.
Correlating specific ML algorithms with distinct maintenance optimization tasks to determine technical trends.
Evaluating the integration maturity of BIM and DT technologies in upgrading predictive maintenance workflows.

2.4. Use of GenAI

During the preparation of this study, the authors used the Image Creation tool of Gemini 3 Pro for the purpose of figure creation. The authors have reviewed and edited the output and take full responsibility for the content of this publication.

3. Results and Discussion

3.1. Distribution of the Papers

The annual distribution of the 73 selected publications, spanning from 2019 to 2025, is illustrated in Figure 3. While the research area showed a steady but modest output between 2019 and 2022, a significant increase in research activity is observed starting in 2023. The data reveals a peak in 2025 with 23 publications. This dramatic upward trajectory, particularly in the last three years, indicates that the integration of DTs, BIM, and ML into railway maintenance has transitioned from an emerging concept to a rapidly accelerating field of global interest.

The geographical distribution of the research, as depicted in Figure 4, highlights a broad international engagement with contributions from over 30 distinct countries. China emerges as the leading contributor with 10 papers, followed by Italy (six papers), Spain (five papers), and the United Kingdom (five papers).

Beyond individual country rankings, a regional analysis reveals a dominant European cluster, with nations such as France, Poland, Portugal, and the Netherlands contributing significantly to the discourse. Furthermore, the presence of contributions from diverse regions—including Australia, India, North America, and the Middle East—underscores that the optimization of railway infrastructure through advanced digitalization is a global priority. This widespread participation suggests that while certain regions lead in volume, the technical challenges of predictive maintenance are being addressed through a highly diverse and international research lens.

3.2. Thematic Analysis

To provide a deeper understanding of the conceptual evolution within the field, a thematic analysis was conducted using a strategic diagram based on the density (development degree) and centrality (relevance degree) of identified keywords. As illustrated in Figure 5, the research landscape is categorized into four distinct quadrants: Motor Themes, Niche Themes, Basic Themes, and Emerging or Declining Themes.

The Motor Themes quadrant represents the most well-developed and essential topics driving the current research agenda. Key clusters in this area include Predictive Maintenance, DT, Neural Networks, and Anomaly Detection. The high centrality and density of these themes indicate that the integration of data-driven predictive models with digital representations of railway assets constitutes the core focus of modern railway maintenance research.

The Niche Themes quadrant identifies highly specialized areas that are well-developed but maintain a lower level of centrality to the broader field. Notably, research into ballast fouling and specific risk assessment frameworks falls into this category. While these topics are technically mature, their application is often restricted to specific infrastructure components rather than holistic system optimization.

Basic Themes represent foundational pillars that are highly relevant to the industry but remain less “dense” in terms of recent theoretical development. Topics such as vibrations, failure analysis, and general inspection techniques are located here. These are considered transversal requirements that support more advanced predictive technologies found in the Motor Themes quadrant.

Finally, the Emerging or Declining Themes quadrant highlights nascent areas with significant future growth potential. Clusters such as real-time monitoring, safety, urban transportation, and multi-body dynamics models are currently positioned here. Given the overall acceleration of publications observed in recent years, these emerging themes—particularly real-time systems—are likely to transition toward the Motor Themes quadrant as digital transformation in the railway sector matures.
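The quadrant assignment described above can be sketched as a simple rule on the two axes. The snippet is a minimal illustration: the centrality and density scores are invented for the example, and using the median of each axis as the cut-off is an assumption, since the text does not state how the quadrant boundaries in Figure 5 were drawn.

```python
from statistics import median

def classify_themes(themes):
    """Assign each theme to a strategic-diagram quadrant.

    `themes` maps a theme name to a (centrality, density) pair.
    The cut-offs are the medians of each axis (an assumption).
    """
    c_cut = median(c for c, _ in themes.values())
    d_cut = median(d for _, d in themes.values())
    labels = {}
    for name, (centrality, density) in themes.items():
        if centrality >= c_cut and density >= d_cut:
            labels[name] = "Motor"                  # relevant and well developed
        elif centrality < c_cut and density >= d_cut:
            labels[name] = "Niche"                  # well developed, less central
        elif centrality >= c_cut and density < d_cut:
            labels[name] = "Basic"                  # central, less developed
        else:
            labels[name] = "Emerging or Declining"  # low on both axes
    return labels

# Hypothetical scores chosen only to exercise the four branches.
example_themes = {
    "predictive maintenance": (0.9, 0.8),
    "ballast fouling": (0.2, 0.7),
    "vibrations": (0.8, 0.3),
    "real-time monitoring": (0.3, 0.2),
}
quadrants = classify_themes(example_themes)
for theme, quadrant in quadrants.items():
    print(f"{theme}: {quadrant}")
```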

3.3. ML Algorithms

As a specialized branch of AI, ML focuses on the development of computer algorithms capable of learning from data and improving performance over time without explicit additional programming (Figure 6). The primary objective of ML is to create systems that automatically detect patterns and generate predictions based on those identified regularities. While widely applied in fields such as image processing and fraud detection, its application in predictive maintenance is central to the scope of this study. Based on the methodology and data requirements, ML is categorized into three primary learning paradigms: supervised, unsupervised, and reinforcement learning.

While Figure 6 visually maps the frequency and general relationships between the learning paradigms and their target components, understanding the operational rationale behind these applications requires a closer examination of each algorithm’s technical strengths. To address this, Table 2 provides a comparative summary detailing the specific effectiveness of the dominant AI architectures identified in the literature. This comparison highlights why certain models are uniquely suited to specific railway assets and predictive maintenance tasks.

3.3.1. Supervised Learning

Supervised learning is utilized to generate a functional mapping between input data and specific outputs. It is termed “supervised” because the training process requires labeled data, where humans act as teachers by providing the computer with features and corresponding ground-truth labels. The machine learns these patterns to provide the best estimation for future unlabeled inputs. The primary problem types within this paradigm are regression and classification. Drawing on labeled datasets from maintenance histories and finite element simulations, these models are trained to execute regression for continuous variables such as the Track Quality Index (TQI) or classification for categorical health states [27].

Tree-based algorithms, notably decision trees and Random Forest (RF), are frequently selected for their interpretability and robustness in handling the tabular, imbalanced datasets typical of railway maintenance logs. Historical Enterprise Resource Planning (ERP) data has been leveraged through decision tree and RF models to predict the timing and nature of maintenance activities for railway switches [28]. Similarly, predictive models based on RF have been applied to metro station assets such as elevators, outperforming alternative algorithms with a peak accuracy of 96.83% [29]. Additionally, RF is cited for its robustness against overfitting and its ability to handle mixed data types (categorical and numerical) effectively. Rolling stock maintenance rules derived from decision trees have demonstrated the potential to optimize labour by reducing staff requirements from four to two individuals [30]. Other application areas include predicting discrete defect counts using features such as tonnage and seasonality [15], estimating lateral track irregularities from vehicle dynamics [31], and predicting the risk of service failures on heavy haul lines [32]. Gradient boosting techniques are also tree-based algorithms, utilized for regression tasks requiring high precision. The lifespan of rail infrastructure has been modeled using high-performance boosting architectures; the application of CatBoost and XGBoost to multisource data yielded a coefficient of determination (R²) of 0.81 [33].
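For readers less familiar with the metric, the coefficient of determination reported for the boosting models is computed as one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below shows that calculation; the lifespan values are made up for illustration and are not taken from any reviewed study.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # unexplained variation
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total variation
    return 1.0 - ss_res / ss_tot

# Hypothetical observed and predicted rail lifespans (years).
observed = [10.0, 12.0, 15.0, 20.0, 23.0]
predicted = [11.0, 12.0, 14.0, 19.0, 24.0]
score = r_squared(observed, predicted)
print(round(score, 3))
```

A value of 1 means the predictions reproduce the observations exactly, while a value near 0 means the model does no better than always predicting the mean.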

Support Vector Machines (SVMs) and Artificial Neural Networks (ANNs) are often utilized as robust baselines or for specific regression tasks where deep learning might overfit due to data scarcity. An SVM classifier was employed to distinguish between vertical and lateral track alignment conditions based on car-body vibration data [34]. SVMs have also been selected to predict oil temperatures in tram components [35], the Mean Time to Repair (MTTR) for rolling stock CCTV systems [36], and anomalies in train braking pipes and bearings [37]. The method is chosen for its ability to maximize the margin between fault classes, which supports good generalization. A Feed-Forward ANN was trained to predict the wear of pantograph sliding strips [38]. Other use cases for ANNs include characterizing the “normal” behavior of car-body acceleration based on train speed and curvature [39], evaluating railway track buckling risks [40], and forecasting the TQI [41].

K-Nearest Neighbours (KNN) is another supervised learning algorithm, favored for its simplicity and its ability to capture local patterns by averaging the values of the k most similar historical instances. However, its accuracy is often unsatisfactory. In the reviewed literature, KNN serves either as a baseline for comparison against more complex models or as a core component in hybrid anomaly detection frameworks. For example, KNN achieved an accuracy of 78% in classifying railway track segments into “failure” or “non-failure” states based on track geometry data, whereas Long Short-Term Memory (LSTM) achieved 98.7% [42].
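A minimal sketch of the KNN idea for the failure/non-failure task is given below. It uses a majority vote among the k nearest neighbours, which is the classification counterpart of the averaging described above. The two features (a gauge deviation and a twist value, in millimetres) and all data points are hypothetical and only illustrate the mechanism.

```python
from collections import Counter
from math import dist

def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    neighbours = sorted(train, key=lambda item: dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical track segments: (gauge deviation, twist) -> state.
training_segments = [
    ((1.0, 0.5), "non-failure"),
    ((1.2, 0.7), "non-failure"),
    ((0.8, 0.4), "non-failure"),
    ((4.5, 3.0), "failure"),
    ((5.0, 3.5), "failure"),
    ((4.8, 2.9), "failure"),
]

print(knn_predict(training_segments, (4.6, 3.1)))
print(knn_predict(training_segments, (1.1, 0.6)))
```

Because every prediction depends only on raw distances to stored examples, the method is sensitive to feature scaling and to noisy neighbours, which is one plausible reason for its weaker accuracy relative to sequence models.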

Recurrent architectures are the primary choice for time-series forecasting, where the current state of an asset is heavily dependent on its historical degradation trajectory. For tasks involving temporal dependencies and RUL estimation, LSTM and Recurrent Neural Network (RNN) architectures are the principal methods [42]. LSTMs are specifically favored for their ability to capture “long-term dependencies” in data that deteriorates gradually over time, effectively addressing the vanishing gradient problem found in standard RNNs [43]. LSTM has been applied to predict specific track geometry parameters such as alignment, gauge, and twist for the upcoming year [44]. Gated Recurrent Unit (GRU) models were found to be effective for predicting longitudinal level defects in track geometry [44]. GRUs are selected for their computational efficiency compared to LSTMs (fewer gates) while maintaining performance on shorter sequences [44]. Additionally, attention mechanisms are integrated with temporal encoders in the RailCANet framework to detect anomalies by focusing on critical time steps within railway big data [45].
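The efficiency argument for GRUs can be made concrete by counting trainable parameters. The sketch below assumes the textbook formulation, in which each gate or candidate block has an input weight matrix, a recurrent weight matrix, and one bias vector; an LSTM layer has four such blocks and a GRU layer three. Library implementations may add extra bias terms, so the absolute numbers are indicative only, and the layer sizes are arbitrary.

```python
def block_params(input_dim, hidden_dim):
    """Parameters of one gate/candidate block: W (h x d), U (h x h), bias (h)."""
    return hidden_dim * input_dim + hidden_dim * hidden_dim + hidden_dim

def lstm_params(input_dim, hidden_dim):
    # Input, forget, and output gates plus the cell candidate: four blocks.
    return 4 * block_params(input_dim, hidden_dim)

def gru_params(input_dim, hidden_dim):
    # Update and reset gates plus the hidden candidate: three blocks.
    return 3 * block_params(input_dim, hidden_dim)

# Arbitrary example: 8 track-geometry features, 32 hidden units.
n_features, n_hidden = 8, 32
lstm_count = lstm_params(n_features, n_hidden)
gru_count = gru_params(n_features, n_hidden)
print(lstm_count, gru_count, gru_count / lstm_count)
```

Under these assumptions a GRU layer carries three quarters of the parameters of an LSTM layer of the same size, regardless of the chosen dimensions.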

Deep learning architectures, particularly Convolutional Neural Networks (CNNs), are predominantly used for automated feature extraction from high-dimensional, unstructured data such as vibration signals and acoustic spectrograms. A specialized variant of CNNs, known as a Graph Convolutional Network (GCN), was proposed to predict track failures. This architecture was chosen to capture the spatial dependencies and network topology of railway tracks, processing geometric measurements as nodes in a graph to predict intervention levels [46]. In the domain of component monitoring, a 1D-CNN was implemented to process raw signals from railway bogies [47]. Other use cases include classifying rail joints and defects [48], rail crack stages [49], and wheel out-of-roundness from axle-box vibration data [50], as well as rail head wear and rolling contact fatigue [51]. The core advantage cited is the ability to perform automated feature extraction, eliminating the need for manual feature engineering, which is prone to error in noisy railway environments [52].
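To illustrate what a single 1D convolutional filter does to a raw signal, the sketch below slides a three-tap kernel over a short synthetic vibration trace and applies a ReLU activation. In an actual 1D-CNN the kernel weights are learned during training; here they are hand-picked to respond to sharp transients, and the signal is invented for the example.

```python
def conv1d(signal, kernel):
    """Valid-mode 1D cross-correlation, the core operation of a 1D-CNN layer."""
    n, k = len(signal), len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k)) for i in range(n - k + 1)]

def relu(values):
    """Keep positive responses and zero out the rest."""
    return [max(0.0, v) for v in values]

# Synthetic trace: a flat signal with one impact-like transient.
vibration = [0.0, 0.0, 0.0, 5.0, -5.0, 0.0, 0.0, 0.0]
# Hand-picked kernel (an assumption); a trained network would learn its own.
transient_kernel = [-1.0, 2.0, -1.0]

feature_map = relu(conv1d(vibration, transient_kernel))
peak_index = feature_map.index(max(feature_map))
print(feature_map, peak_index)
```

The feature map is zero over the flat parts of the trace and peaks where the window covers the transient, which is the sense in which convolutional layers extract features without manual engineering.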

Beyond individual algorithm applications, current research increasingly leverages hybrid architectures for enhanced predictive accuracy. To enable proactive maintenance, a hybrid CNN-LSTM model was developed to forecast sensor signals 1, 3, and 6 h in advance. The LSTM component captured temporal dynamics to predict future signal values, which were then analyzed for anomalies to detect failures before they occurred [53]. In a more complex integration, an LSTM-Autoencoder was utilized to estimate the RUL of mechanical equipment. Within this framework, LSTM layers facilitated the automated extraction of features from dynamic time-series data, which were subsequently mapped to time-to-failure using regression techniques [54]. Additionally, Deep Neural Networks (DNNs) were utilized—in comparison with LSTM architectures—to predict wheel wear progression, specifically focusing on tread wear and flange height. These supervised predictions functioned as the state input for a Reinforcement Learning agent, demonstrating a hybrid strategy designed to manage wheel wear effectively, even under conditions of limited measurement data [55].
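Forecasting a signal several hours ahead is usually framed as a supervised problem by slicing the series into input windows and future targets. The sketch below shows that windowing step only; it assumes hourly sampling (so that horizons of 1, 3, and 6 steps correspond to 1, 3, and 6 h) and a hypothetical look-back of three samples, neither of which is specified in the text, and it does not reproduce the CNN-LSTM model itself.

```python
def make_windows(series, lookback, horizon):
    """Pair each `lookback`-long window with the value `horizon` steps after it."""
    inputs, targets = [], []
    for t in range(lookback, len(series) - horizon + 1):
        inputs.append(series[t - lookback:t])   # past samples fed to the model
        targets.append(series[t + horizon - 1])  # value to be forecast
    return inputs, targets

# Placeholder hourly sensor readings (a simple ramp for readability).
hourly_signal = list(range(10))

for horizon_h in (1, 3, 6):
    X, y = make_windows(hourly_signal, lookback=3, horizon=horizon_h)
    print(horizon_h, len(X), X[0], y[0])
```

Longer horizons leave fewer usable training pairs from the same record, which is one practical trade-off when extending the warning time.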

3.3.2. Unsupervised Learning

In unsupervised learning, the computer trains itself without human intervention, identifying patterns and providing insights from unlabeled data. This paradigm is particularly beneficial when desired labels are unknown or when mining for rules and summarizing data points. It is primarily used for clustering and association rules.

In railway maintenance, unsupervised learning addresses the lack of labeled failure data by establishing “normal” operational baselines. Autoencoders are frequently selected to detect anomalies based on high reconstruction errors, which also aids in assessing defect severity. K-Means clustering is utilized to group track sections or components with similar behaviors, effectively reducing uncertainty in degradation modeling. These methods allow machines to discover previously unknown knowledge or non-intuitive relationships within complex datasets.

Clustering algorithms are extensively used to group similar asset behaviors, impute missing data, and refine anomaly detection results. K-Means clustering is the most frequently cited unsupervised algorithm in the reviewed literature. Kasraei et al. employed K-Means clustering to determine optimal track geometry maintenance limits. The authors chose this method to minimize the uncertainty inherent in maintenance modeling by dividing the railway line into track sections with the “most similarity” [56]. K-means was also used for real-time geometry monitoring [57], rolling stock prognostics [58], and general defect grouping [59]. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is another clustering algorithm that is selected for its ability to handle noise and cluster categorical railcar data based on density [60].
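The grouping of track sections by similarity can be illustrated with a bare-bones K-Means (Lloyd's algorithm) on a single feature. This is a simplified sketch, not the procedure of any cited study: the degradation rates are invented, only one feature is used, and the initial centroids are taken from evenly spaced order statistics so that the example is deterministic.

```python
def kmeans_1d(values, k, iterations=50):
    """Cluster scalar values with Lloyd's algorithm; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation from evenly spaced order statistics.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda c: abs(v - centroids[c]))
            clusters[nearest].append(v)
        # Recompute each centroid as its cluster mean (keep it if the cluster is empty).
        updated = [sum(c) / len(c) if c else centroids[i] for i, c in enumerate(clusters)]
        if updated == centroids:
            break
        centroids = updated
    labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
    return centroids, labels

# Hypothetical degradation rates (mm/year) for nine track sections.
rates = [0.2, 0.3, 0.25, 1.1, 1.0, 1.2, 2.9, 3.1, 3.0]
centres, section_groups = kmeans_1d(rates, k=3)
print(section_groups)
```

Sections sharing a label could then be assigned a common maintenance limit, which is the sense in which clustering reduces uncertainty in degradation modeling.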

Autoencoders are the most prominent unsupervised architecture in the reviewed literature, primarily selected for their ability to learn efficient data codings and detect anomalies through reconstruction errors. One study implemented an undercomplete Autoencoder to localize track geometry faults, choosing this method to establish a baseline of “normal” track behavior (zero settlement) using a moving frame of reference. K-Means clustering was then integrated as the second stage of the fault localization pipeline, following the Autoencoder. While the Autoencoder identified where geometry changed, K-Means (with k = 3) was used to classify what type of change occurred (deterioration, maintenance improvement, or stability) based on the reconstruction loss and area-under-the-curve features [61]. Another study employed a hybrid LSTM-Autoencoder to estimate the RUL of mechanical equipment. The unsupervised autoencoder component was used for automatic feature extraction from dynamic time-series data, capturing complex temporal dependencies that were subsequently fed into a regression model [54]. Toribio et al. applied Autoencoders and Sparse Autoencoders as part of a comparative framework for early failure detection in metro Air Production Units [53].
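The reconstruction-error principle can be sketched without a neural network. The example below uses a linear one-unit bottleneck fitted along the principal axis of two correlated sensor channels, which is a PCA-style stand-in and not the autoencoders used in the cited studies. The data, the mean-plus-three-standard-deviations threshold, and all names are illustrative assumptions.

```python
import math
from statistics import mean, pstdev
from typing import List, Tuple

Point = Tuple[float, float]


def fit_linear_bottleneck(train: List[Point]) -> Tuple[Point, Point]:
    """Return (centre, unit direction) of the best 1-D linear code for 2-D data."""
    cx = mean([p[0] for p in train])
    cy = mean([p[1] for p in train])
    sxx = sum((x - cx) ** 2 for x, _ in train)
    syy = sum((y - cy) ** 2 for _, y in train)
    sxy = sum((x - cx) * (y - cy) for x, y in train)
    # Orientation of the principal axis of the 2x2 scatter matrix.
    theta = 0.5 * math.atan2(2.0 * sxy, sxx - syy)
    return (cx, cy), (math.cos(theta), math.sin(theta))


def reconstruction_error(p: Point, centre: Point, direction: Point) -> float:
    dx, dy = p[0] - centre[0], p[1] - centre[1]
    code = dx * direction[0] + dy * direction[1]       # encoder: 2-D -> 1-D
    rx, ry = code * direction[0], code * direction[1]  # decoder: 1-D -> 2-D
    return math.hypot(dx - rx, dy - ry)


# Hypothetical "normal" baseline: channel 2 is roughly twice channel 1.
healthy = [(0.0, 0.1), (1.0, 1.9), (2.0, 4.1), (3.0, 5.9), (4.0, 8.1), (5.0, 9.9)]
centre, direction = fit_linear_bottleneck(healthy)
train_errors = [reconstruction_error(p, centre, direction) for p in healthy]
threshold = mean(train_errors) + 3 * pstdev(train_errors)  # assumed rule of thumb


def is_anomaly(p: Point) -> bool:
    """Flag samples whose reconstruction error exceeds the baseline threshold."""
    return reconstruction_error(p, centre, direction) > threshold
```

A reading that breaks the learned relationship between channels, such as `(2.0, 8.0)`, reconstructs poorly and is flagged, while the size of the error gives a rough indication of severity.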

3.3.3. Reinforcement Learning

Reinforcement learning utilizes observations of an environment to take actions that maximize a predefined reward. The machine continuously learns from environmental interactions, exploring possibilities to optimize performance based on a decision-making function. Each action results in a feedback reward that reinforces or discourages specific behaviors.

This paradigm is specifically chosen to solve complex maintenance planning and sequential decision-making problems, such as rail renewal or autonomous inspection. Unlike supervised learning, which predicts a static state, RL enables systems to learn continuous maintenance policies by interacting with an environment to maximize long-term rewards. Common methods identified in this category include Q-Learning, Deep Q-Network (DQN), Advantage Actor-Critic (A2C), and Markov Decision Process (MDP) formulations.

DQN and standard Q-Learning are identified as foundational RL algorithms for maintenance scheduling and resource allocation, though they are often cited in comparative or review contexts within the reviewed sources. These algorithms are chosen for their ability to optimize decision-making policies in stochastic environments where the agent must be “greedy” to estimate future rewards [62]. In the systematic review by Bianchi et al., DQN is highlighted as a tool for rail renewal and maintenance planning, where the algorithm uses inputs like the Track Quality Index (TQI) and hazard index to select from an action space that includes tamping, grinding, and renewal. The review notes that while effective, DQN generally has longer processing times compared to A2C [63]. Other reviewed studies used the DQN algorithm for rail renewal and maintenance planning [27], wheel wear management [55], UAV path planning [64], and train rescheduling [65].
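The greedy estimate of future rewards mentioned above corresponds to the standard Q-Learning update rule, which DQN approximates with a neural network instead of a table. The sketch below shows a single tabular update. The state names, reward value, learning rate, and discount factor are illustrative assumptions, and the action set only borrows the tamping, grinding, and renewal options named in the text.

```python
from typing import Dict, Tuple

ACTIONS = ("do_nothing", "tamping", "grinding", "renewal")
QTable = Dict[Tuple[str, str], float]


def q_update(
    q: QTable,
    state: str,
    action: str,
    reward: float,
    next_state: str,
    alpha: float = 0.1,
    gamma: float = 0.9,
) -> float:
    """One Q-Learning step: move Q(s, a) toward r + gamma * max_a' Q(s', a')."""
    # Greedy bootstrap: best value currently known for the next state.
    best_next = max(q.get((next_state, a), 0.0) for a in ACTIONS)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Hypothetical situation: tamping a "poor" section costs 5 units now
# but leads to a "good" state that is already valued at 10.
q: QTable = {("good", "do_nothing"): 10.0}
new_value = q_update(q, "poor", "tamping", reward=-5.0, next_state="good")
```

In practice the state would encode indicators such as the TQI and hazard index rather than a single label.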

While often considered the mathematical framework underpinning RL, MDPs are explicitly utilized as a reinforcement learning technique to optimize maintenance policies for rolling stock components. Bhadouria et al. employed MDP to establish a decision framework for wheelset maintenance that minimizes long-term costs while extending the lifecycle and ensuring safety [66]. The authors modeled the degradation of wheelsets using three lifetime variables: time, mileage, and gross ton mileage. The MDP framework evaluates the “state” of the wheelset (based on tread diameter D) and chooses optimal maintenance actions—specifically “do-nothing,” “renewal,” or “turning” (re-profiling)—to avoid unnecessary costs and early lifecycle termination.
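How an MDP selects among the three actions can be sketched with value iteration on a deliberately small toy model. Everything numerical here is a hypothetical assumption: four discrete wear levels stand in for the tread diameter state, transitions are deterministic, and the costs and discount factor are invented for illustration. The model does not reproduce the formulation in [66].

```python
from typing import List, Tuple

ACTIONS = ("do_nothing", "turning", "renewal")
N_STATES = 4                          # wear level 0 (as new) .. 3 (at limit)
RISK_COST = [0.0, 1.0, 4.0, 100.0]    # assumed cost of running one more period
TURNING_COST, RENEWAL_COST, GAMMA = 5.0, 30.0, 0.9


def step(state: int, action: str) -> Tuple[float, int]:
    """Return (immediate cost, next state) for a deterministic toy model."""
    if action == "do_nothing":
        return RISK_COST[state], min(state + 1, N_STATES - 1)
    if action == "turning":
        return TURNING_COST, max(state - 1, 0)
    return RENEWAL_COST, 0  # renewal resets the wheelset to as-new


def value_iteration(tol: float = 1e-9) -> Tuple[List[float], List[str]]:
    """Minimise discounted long-term cost and extract the greedy policy."""
    values = [0.0] * N_STATES
    while True:
        new_values = [
            min(cost + GAMMA * values[nxt] for cost, nxt in (step(s, a) for a in ACTIONS))
            for s in range(N_STATES)
        ]
        converged = max(abs(a - b) for a, b in zip(new_values, values)) < tol
        values = new_values
        if converged:
            break
    policy = [
        min(ACTIONS, key=lambda a: step(s, a)[0] + GAMMA * values[step(s, a)[1]])
        for s in range(N_STATES)
    ]
    return values, policy


values, policy = value_iteration()
```

Under these assumed costs the policy leaves an as-new wheelset alone and intervenes before the limit state accumulates failure risk. A realistic model would also track the material removed by each turning, which is what eventually makes renewal unavoidable.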

The A2C algorithm represents the state-of-the-art application of RL in railway maintenance, specifically for integrating predictive maintenance with DTs. Railway maintenance requires continuous, sequential decision-making based on historical actions. Sresakoolchai and Kaewunruen therefore selected A2C, noting that traditional supervised and unsupervised learning techniques perform prediction only once [62]. Furthermore, A2C was chosen over DQN because it was reported to perform better (by approximately 20%) and to require shorter training times while providing more stable performance due to the Policy Gradient component. Tang et al. also note the application of Asynchronous Advantage Actor-Critic (A3C) and other Actor-Critic methods for metro scheduling. These algorithms are chosen for their efficiency in learning both a policy (Actor) and a value function (Critic) simultaneously, which stabilizes training in dynamic environments like metro circulation with stochastic demand [65]. Despite its highlighted advantages, the use of A2C in railway maintenance literature is quite limited.

3.4. Use of DTs and Building Information Modeling

True DTs are characterized by a bidirectional flow: data flows from the physical asset to the digital model (monitoring), and information/control flows from the digital model back to the physical asset (actuation or parameter updating). These flows are illustrated in Figure 7a,b.

3.4.1. Unidirectional Architectures

Based on a rigorous analysis of the selected papers, many of the studies reviewed utilize unidirectional data flow from physical to digital rather than true DTs (bidirectional automatic control). While many studies claim to use DTs, they are primarily using high-fidelity 3D simulations for visualization or as data repositories for predictive maintenance results.

Yang et al. created a system that consists of a “behavior model” (visual/Unity3D) and a “rule model” (LSTM-ARIMA prediction) [67]. The text mentions providing a “theoretical basis for real-time interaction,” but the implemented workflow involves sensors updating the virtual model to “visually display the physical equipment’s state” and predict faults. The output is for maintenance personnel to arrange plans, not for the DT to reconfigure the switch machine.

The study of Wysocki et al. focused on equipping trams with low-cost sensors to feed a cloud-based analytics platform. The system establishes baseline reference values and monitors real-time parameters against them. It provides “User Interfaces” and “Infrastructure Management” alerts but does not describe a feedback control loop to the tram’s ECU or control systems [35]. In another study, the digital model is used to simulate conditions (like bolt loosening) that are hard to capture in the field to train the AI. It does not appear to run in real-time synchronization with a specific physical rail joint. The “DT” here is a Finite Element model used to generate synthetic training data for a classifier [68].

Sahal et al. propose a framework where DTs “collaborate”, but the focus is on data sharing via blockchain. The “feedback” to the physical world is described conceptually as “consensus-based decision making”, but the technical implementation focuses on reading data from the physical asset into the ledger [69]. Marquez et al. highlighted the “Interaction” feature, but the feedback loop relies on “simple business rules” to interact with maintenance technicians rather than the asset itself. The DT acts as a sophisticated dashboard/alert system rather than an autonomous controller [70].

3.4.2. High-Level DT Integration

These papers explicitly demonstrate or define a feedback loop where the digital model influences the physical system or updates itself autonomously. Figure 8 illustrates the system control loop.

Nasim et al. implement a “closed-loop feedback mechanism”. Sensors feed the model (Physical → Digital), and the system performs “real-time calibration” where the FEM parameters are updated using an Extended Kalman Filter based on the sensor residuals (Digital → Model Update) [71]. The framework goes beyond simple visualization by incorporating “recursive model updating” to ensure the digital replica evolves with the physical asset’s degradation. Also, Torzoni et al. mathematically formalize the feedback loop using a “health-dependent control policy” computed offline to maximize expected rewards, moving beyond passive monitoring to active lifecycle management [72].
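The recursive model updating described above can be sketched with a scalar Extended Kalman Filter that calibrates one stiffness parameter from deflection residuals. This is a simplified stand-in for an FEM update, not the implementation in [71]: the measurement model (deflection = load / stiffness), the noise variances, and all numerical values are hypothetical assumptions.

```python
from typing import Tuple


def ekf_stiffness_update(
    k_est: float,
    p_est: float,
    load: float,
    measured_deflection: float,
    r_meas: float,
    q_proc: float = 0.0,
) -> Tuple[float, float]:
    """One EKF step for a single parameter modelled as a random walk."""
    # Predict: the parameter itself is unchanged, its uncertainty may grow.
    p_pred = p_est + q_proc
    # Nonlinear measurement model h(k) = load / k and its Jacobian dh/dk.
    predicted = load / k_est
    jac = -load / (k_est ** 2)
    # Update: weight the sensor residual by the Kalman gain.
    innovation_var = jac * p_pred * jac + r_meas
    gain = p_pred * jac / innovation_var
    residual = measured_deflection - predicted
    k_new = k_est + gain * residual
    p_new = (1.0 - gain * jac) * p_pred
    return k_new, p_new


# Hypothetical scenario: the asset has softened from 100 to 80 (arbitrary units).
TRUE_K, LOAD = 80.0, 1000.0
k_est, p_est = 100.0, 100.0
for _ in range(20):
    z = LOAD / TRUE_K  # noise-free synthetic sensor reading
    k_est, p_est = ekf_stiffness_update(k_est, p_est, LOAD, z, r_meas=0.01)
```

The digital parameter drifts toward the degraded physical value as residuals arrive, which is the sense in which the replica evolves with the asset.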

With the help of 5G technologies, a recent study claims “two-way real-time synchronization” with latency under 10 ms [73]. The integration of “Multiphysics simulation” with unsupervised learning is used to visualize transient states like “serpentine instability” in real time. Another system forms a “perception-prediction-optimization” closed loop, where the digital layer provides feedback for “intelligent decision-making” [47]. In the study of Ahmad et al., while the “feedback” is not automatic actuation like stopping a train, the DT generates “additional sensor data” (virtual sensors) via simulation and provides “predictive feedback to the physical system” in the form of grinding cycle recommendations [74].

3.4.3. Architectural Frameworks

Some studies acknowledge the necessity of bidirectional flow for a “true” DT but present it as a reference architecture rather than an implemented field application. De Donato et al. propose a high-level architecture that explicitly includes an “Actuating” capability in the Physical Layer to execute decisions received from the Service Layer. The authors define the DT as a “dynamic and self-evolving virtual replica... characterized by a bi-directional seamless communication” [75]. The architecture distinguishes between Primary Data (raw) and Secondary Data (AI pre-processed), enabling the integration of legacy assets that cannot be easily sensorized by using non-intrusive AI (e.g., video processing) to create the DT data stream. This study is one of the few sources that explicitly architect the actuation/control layer required for a true bidirectional DT, though it remains a guideline rather than a case study.

3.4.4. High-Level BIM Integration and the Use of IFC

The use of Industry Foundation Classes (IFC) is the hallmark of a “real” BIM integration that supports interoperability. Most papers ignore this, relying on proprietary formats or generic 3D meshes.

A fundamental challenge in digitalized maintenance is bridging the inherently static geometric data of BIM with the dynamic, continuous data required for AI learning loops. In a mature integration, the BIM model does not remain static; rather, it acts as a spatial and semantic anchor. Static 3D objects (e.g., a specific rail joint modeled in IFC) are mapped to dynamic IoT sensor streams. When the sensor detects a vibration anomaly, the data is fed into the ML algorithm (e.g., an LSTM network) to calculate the Remaining Useful Life (RUL). Crucially, this predictive output is then pushed back into the BIM environment.
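The anchor-and-write-back pattern described above can be sketched in a few lines. This is an illustration under stated assumptions only: the `BimElement` class, the sensor identifier, the property set name `Pset_PredictiveMaintenance`, the 30-day alert limit, and the linear-trend RUL estimate (a placeholder for an ML model such as an LSTM) are all hypothetical and do not correspond to a real IFC toolkit API.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class BimElement:
    """Minimal stand-in for a BIM object; global_id mimics an IFC GlobalId."""
    global_id: str
    name: str
    property_sets: Dict[str, Dict[str, object]] = field(default_factory=dict)


def estimate_rul_days(daily_readings: List[float], threshold: float) -> Optional[float]:
    """Placeholder for the ML model: extrapolate a linear trend to the threshold."""
    if len(daily_readings) < 2:
        return None
    slope = (daily_readings[-1] - daily_readings[0]) / (len(daily_readings) - 1)
    if slope <= 0:
        return None  # no degradation trend detected
    return max(0.0, (threshold - daily_readings[-1]) / slope)


def push_prediction(element: BimElement, rul_days: Optional[float]) -> None:
    """Write the predictive output back into the element's property sets."""
    pset = element.property_sets.setdefault("Pset_PredictiveMaintenance", {})
    pset["RemainingUsefulLifeDays"] = rul_days
    pset["MaintenanceRequired"] = rul_days is not None and rul_days < 30


# Static BIM object mapped to a dynamic sensor stream (identifiers are invented).
joint = BimElement(global_id="hypothetical-guid-001", name="Rail joint J-17")
sensor_to_element = {"acc-017": joint}

readings = [1.0, 1.5, 2.0, 2.5]  # assumed daily vibration RMS values
rul = estimate_rul_days(readings, threshold=4.0)
push_prediction(sensor_to_element["acc-017"], rul)
```

After the call, the prediction lives on the BIM object itself, so any tool that reads the model sees the current health state alongside the geometry.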

One of the most technically mature integrations of BIM with ML appears in the works by Sresakoolchai & Kaewunruen, who explicitly detail the software ecosystem and standards used, including IFC. The authors achieve what they classify as “6D BIM Level 3,” integrating schedule, cost, and maintenance data into the 3D model [44]. Their data flow moves from Track Geometry Cars (sensors) → ML Prediction → BIM Model storage. They also utilize Autodesk Dynamo and VBA scripts to automate the data exchange between the BIM environment (Civil 3D) and the ML algorithms, using “Property Set Definitions” to store predictive maintenance states (e.g., “Defective ballast = True”) directly within the BIM objects. This automated data exchange transforms the static 3D model into a dynamic repository that continuously reflects the AI-driven health state of the physical asset. The application of such multidimensional BIM is also expanding beyond core track geometry; for example, recent studies have successfully developed 6D BIM frameworks to manage climate change adaptation, lifecycle costs, and proactive risk mitigation for railway overhead line systems [76].

Another innovative paper on BIM integration combines physics-based FEM with IoT sensing and standardizes the geometry using IFC (BIM) and 3DTiles (GIS) [71]. By explicitly integrating BIM and IFC, it represents a mature convergence of structural engineering, geospatial data, and open standards.

3.5. Cost Perspective

The Cost perspective is rigorously quantified across multiple papers. Authors frequently use specific monetary values for maintenance activities, derailment risks, and comparative savings between predictive and corrective policies. Several papers provide explicit percentage reductions in expenditure, operational costs, or resource utilization, as shown in Table 3. However, a distinction must be made between papers that use financial cost functions for optimization (theoretical) and those that report actual monetary savings or efficiency gains (practical). Studies have focused on cost benefits under various categories like maintenance inspection costs, total maintenance costs, downtime costs and costs of failure.

To understand the economic advantages of digitalized maintenance, it is necessary to first establish how preventive maintenance (PvM) reduces total lifecycle costs when contrasted with reactive, run-to-failure (corrective) strategies. Corrective maintenance is universally recognized as the least efficient and most expensive form of asset management. Relying on reactive interventions leads to escalated repair costs, sudden equipment downtime, and severe logistical expenses. Preventive maintenance mitigates these high costs through several specific mechanisms:

Prevention of Catastrophic Failures: By addressing the underlying causes of deterioration before they manifest as critical faults, PvM averts severe safety and infrastructure incidents. Derailments and system failures incur massive financial penalties; for example, the literature quantifies the average direct cost of a broken rail at $525,000 per incident, with full derailments costing up to $1.5 million [32].
Asset Lifespan Extension: Intervening prior to failure physically extends the operational lifespan of railway infrastructure and rolling stock. In the context of railway wheelsets, which degrade progressively due to uniform wear and rolling contact fatigue, corrective turning requires the removal of a significantly larger portion of the tread diameter. This drastic material loss prematurely terminates the wheelset’s lifecycle, forcing a highly expensive total renewal. Fixed-interval preventive turning requires less material removal, thereby extending the component’s usable life and lowering total lifecycle costs.
Reduction of Unplanned Downtime: Unexpected component failures trigger costly operational downtime, service delays, and penalty fees. An unexpected one-day stoppage for railway machinery can incur massive capital losses, with benchmarks indicating downtime costs reaching 100,000 to 200,000 euros [59]. Furthermore, rolling stock unavailability directly impacts network congestion; empirical data indicates that a 1% increase in unavailability results in a 0.5% increase in annual delay minutes [36]. Scheduled interventions preserve operational availability by preventing these unexpected shutdowns.
Optimization of Workforce and Logistics: Because preventive maintenance is pre-planned based on predetermined time or mileage intervals, it can be executed during train-free sub-intervals. This allows infrastructure managers to strategically allocate human and material resources, balancing workloads and avoiding inefficient, ad hoc emergency travel across the network. Optimizing these stages for rolling stock traction and braking systems successfully reduced the required technical staff and equipment unavailability by 50% [30].
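The figures quoted in the list above lend themselves to simple back-of-the-envelope arithmetic. The sketch below uses the reported broken-rail cost [32] and the reported relationship between unavailability and delay minutes [36]. The number of incidents prevented, the programme cost, the baseline delay minutes, and the assumption that the delay relationship scales linearly are hypothetical inputs chosen purely for illustration.

```python
BROKEN_RAIL_COST_USD = 525_000.0          # average direct cost per incident [32]
DELAY_PCT_PER_UNAVAILABILITY_PCT = 0.5    # reported in [36]; linearity is assumed


def avoided_failure_cost(broken_rails_prevented: float) -> float:
    """Direct cost avoided if the given number of broken rails is prevented."""
    return broken_rails_prevented * BROKEN_RAIL_COST_USD


def projected_delay_minutes(baseline_minutes: float, unavailability_increase_pct: float) -> float:
    """Annual delay minutes after a given percentage rise in unavailability."""
    growth_pct = DELAY_PCT_PER_UNAVAILABILITY_PCT * unavailability_increase_pct
    return baseline_minutes * (1.0 + growth_pct / 100.0)


def net_benefit(broken_rails_prevented: float, programme_cost_usd: float) -> float:
    """Avoided failure cost minus the cost of running the maintenance programme."""
    return avoided_failure_cost(broken_rails_prevented) - programme_cost_usd


# Hypothetical scenario: two broken rails prevented by a 400,000 USD programme.
savings = net_benefit(2, 400_000.0)
# Hypothetical network with 100,000 delay minutes and a 2% rise in unavailability.
delay = projected_delay_minutes(100_000.0, 2.0)
```

Even this crude calculation separates a cost function used for optimization from a reported saving: the result is only as credible as the assumed number of prevented incidents.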

3.6. Sustainability Perspective

While “sustainability” is frequently mentioned as a motivating factor for railway research, there is a significant research gap regarding the quantification of environmental impacts directly attributable to specific maintenance methodologies. Most papers discuss sustainability in qualitative terms (e.g., “environmentally friendly,” “resource efficiency”) rather than providing calculated metrics (e.g., kg of CO₂ avoided, exact material waste reduction).

The majority of evaluated sources focus exclusively on reliability, prediction accuracy, and monetary cost reduction [79,81,82]. Consequently, performance indices such as specific carbon footprint reductions (kg of CO₂ avoided), energy consumption in kWh, or material waste in tons are notably absent from the results and evaluation sections of these technical papers. Several studies suggest that DT approaches pave the way for reduced emissions and energy usage through virtual simulation, yet they offer no calculated metrics to support these claims [68,77].

In the absence of direct emissions reporting, several researchers utilize Asset Life Extension as a quantitative proxy for sustainability, emphasizing the preservation of materials in the operational cycle. A discrepancy exists between theoretical models and empirical results. For instance, while Pratico and Fedele explicitly include environment-related costs and CO₂ emissions within their cost function, the paper does not provide a calculated reduction percentage for their case study [80]. Similarly, while Sresakoolchai et al. utilize Deep Reinforcement Learning with the stated goal of reducing emissions, their evaluation metrics remain focused on defect reduction and maintenance efficiency rather than specific tonnage of carbon saved [55].

3.7. Challenges, Gaps and Future Research Directions

The systematic review of the literature reveals several critical barriers that hinder the widespread industrial adoption of ML and Digital Twins in the railway sector. Addressing these deficiencies is essential for transitioning from theoretical models to resilient, real-time, and sustainable infrastructure management. The following subsections detail these integrated challenges and propose targeted research trajectories to guide future advancements.

3.7.1. Data Scarcity, Quality and the Safety Paradox

The most common challenge identified is the lack of high-quality, labeled datasets. A critical paradox in predictive maintenance is that railway operators prioritize safety and regularity, meaning components are maintained before they fail. Consequently, there is a severe deficiency of “run-to-failure” data required to train accurate RUL models [62]. For Reinforcement Learning scheduling applications, this manifests directly as the “cold start” problem: RL agents cannot effectively learn optimal policies for new or highly reliable railway infrastructures because they lack historical failure states to explore. This shortage makes it difficult to determine failure thresholds during the model training phase.

Datasets are often highly skewed, where the majority of observations represent “healthy” or “normal” states, while defect data is rare [59]. In the analysis of railway switches, models tended to be biased toward majority classes (e.g., “maintenance performed”), performing poorly on minority classes [29]. Similarly, in rail fastener detection, the dataset contained over 6000 healthy cases but only 47 cases of missing clamps, requiring augmentation techniques to be fit for purpose [83].
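The scale of this imbalance is easy to quantify. The sketch below takes the fastener figures from the text, treating “over 6000” as exactly 6000 for illustration, and computes the accuracy of a trivial classifier that always predicts “healthy”, together with inverse-frequency class weights. The weighting formula (total / (number of classes × class count)) is a commonly used heuristic and is an assumption here, not the augmentation approach applied in [83].

```python
from typing import Dict


def majority_baseline_accuracy(counts: Dict[str, int]) -> float:
    """Accuracy obtained by always predicting the most frequent class."""
    return max(counts.values()) / sum(counts.values())


def balanced_class_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Inverse-frequency weights: total / (n_classes * class_count)."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


# Approximate class counts from the rail fastener example in the text.
counts = {"healthy": 6000, "missing_clamp": 47}
baseline = majority_baseline_accuracy(counts)
weights = balanced_class_weights(counts)
```

A model can exceed 99% accuracy here without detecting a single missing clamp, which is why accuracy alone is a misleading metric and why the minority class needs a weight more than a hundred times that of the majority class.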

The railway environment is inherently noisy due to mechanical vibrations and environmental factors. Sensors often pick up background noise that interferes with fault detection. Furthermore, low-cost sensors used to enable scalable DTs often provide lower data quality, requiring more complex analytics to compensate for lack of precision [35]. In Ground Penetrating Radar (GPR) analysis, models trained on clean synthetic data degraded sharply when applied to noisy field data [84].

Manual labeling is described as resource-intensive. For instance, in track maintenance planning, only about 10% of data per region could be labeled because of human resource limitations [46]. In online ML contexts, manual labeling cannot keep up with the volume of data streams, and failure labels are often not generated again after a failure is fixed via the Failure Reporting, Analysis, and Corrective Action System (FRACAS) procedure [85].

To resolve this paradox and overcome the Reinforcement Learning cold start problem, future research must pivot toward three methodologies:

Synthetic Data Generation: DTs should be utilized for “fault injection” to safely generate synthetic failure data, acting as offline virtual environments where RL agents can safely explore catastrophic scenarios without real-world consequences.
Physics-Informed Modeling: Rather than relying solely on data, models should incorporate physical laws. Agustin et al. demonstrate the utility of training Neural Networks (MLP) on data generated by 3D Multibody Dynamics (MBD) simulations, effectively using physics to fill the data void.
Transfer Learning and Domain Adaptation: Methodologies must be developed to transfer knowledge from laboratory test rigs or simulations to the field, using domain adaptation techniques to account for environmental noise.

3.7.2. Model Interpretability and the “Black-Box” Nature of Deep Learning

The opacity of advanced ML and Deep Learning models acts as a barrier to adoption by infrastructure managers who require justification for maintenance actions. Advanced models like DNNs and Convolutional Neural Networks (CNNs) are frequently described as “black boxes,” making it difficult to justify predictions to railway engineers or safety certifiers [86]. In other words, they are “challenging to interpret for infrastructure managers,” despite providing accurate results [60]. This lack of transparency leaves decision-makers “unclear about the reasoning behind them” [81]. Stakeholders require actionable insights, not just probabilities. Authors explicitly note that while deep learning models are accurate, their results are difficult to interpret compared to decision trees or Bayesian networks [28].

Future research must prioritize the integration of explainable AI techniques to decompose complex model outputs into understandable feature contributions. There is also a need for standardized protocols that define how AI predictions should be justified to safety certifiers, moving the field toward a “Certified AI” framework for critical railway infrastructure.

3.7.3. Integration Maturity and Standardized Interoperability

The transition toward a fully digitalized railway network is impeded by low integration maturity and a lack of standardized protocols for cross-platform interoperability. Many Digital Twin applications currently rely on closed, proprietary software ecosystems. For long-term railway asset management, where physical infrastructure lifespans exceed 50 years, this creates a severe risk of “vendor lock-in”. While the individual benefits of BIM, DT, and ML are well-documented, their synergistic potential remains largely untapped in current practical applications.

The gap is the absence of a unified, bidirectional framework that connects sensing, prediction, and autonomous actuation within a standardized ecosystem. There is a distinct lack of field-implemented “True DTs” that go beyond passive dashboards to execute real-time parameter updates or maintenance reconfigurations. To transition from unidirectional monitoring to True Digital Twins, the literature suggests a layered technical roadmap relying on standardized protocols: (1) data acquisition layer (IoT and Edge), (2) semantic and spatial layer (BIM Integration—IFC), (3) synchronization layer (5G), and (4) actuation layer (control execution).

Future studies must prioritize the use of open-source architectures and open data standards like IFC to ensure cross-platform interoperability across structural engineering, geospatial data, and ML platforms. Research should shift toward creating reference architectures that explicitly include an “actuating” capability in the physical layer, allowing for autonomous or semi-autonomous execution of decisions generated by the DT.

3.7.4. Real-Time Deployment, Operational and Computational Constraints

The transition from offline analysis to real-time onboard monitoring introduces significant hardware and latency constraints. High-fidelity simulations like 3D track models for buckling and deep learning models (e.g., GRU, GCN with attention mechanisms) are computationally demanding [40]. This poses limitations when deploying models in resource-constrained environments or edge computing settings [45].

Real-time fault detection requires low-latency inference. The integration of complex architectures increases computational complexity, necessitating model compression or hardware acceleration for large-scale deployment [87].

The primary gap is the “deployment barrier” between high-performance laboratory algorithms and the practical requirements of large-scale, low-latency railway applications. There is a lack of research focusing on how to maintain high predictive precision while significantly reducing the computational footprint and synchronization latency of the maintenance system. Additionally, operational barriers such as harsh weather, strong winds, and variable lighting adversely influence UAV stability and sensor performance [64]. Similarly, car-body vibration signals for track monitoring are affected by the condition of the whole assembly (e.g., bogie vibrations superposing rail vibrations), making signal isolation difficult [80]. Finally, models trained on specific railway lines or vehicle types often fail to generalize to other networks due to variations in operational contexts [50].

To address these constraints, future research should pursue the following directions:

Model Compression and Hardware Acceleration: Future efforts must focus on creating lightweight versions of deep learning models suitable for edge deployment.
5G-Enabled Synchronization: Research should explore the integration of 5G technologies to enable ultra-low latency data synchronization between physical assets and DTs, supporting real-time “perception-prediction-optimization” loops.
Domain Generalization: Studies should prioritize the development of robust models that can generalize across different railway lines, vehicle types, and varying operational contexts, reducing the need for line-specific recalibration.
Resilient Sensing for Harsh Environments: Future work must address the impact of adverse environmental factors (e.g., extreme temperatures, wind, and mechanical vibration) on sensor stability and signal isolation to ensure data reliability during real-time deployment.
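As one concrete instance of the model compression direction listed above, the following sketch applies symmetric post-training quantization of floating-point weights to 8-bit integers. The reviewed sources do not prescribe a specific compression method, so this technique, the function names, and the example weights are illustrative assumptions.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Recover approximate float weights for inference."""
    return [q * scale for q in quantized]


# Hypothetical layer weights.
weights = [0.5, -1.27, 0.003, 1.0]
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Storing each weight in one byte instead of four reduces memory roughly fourfold, at the price of a rounding error bounded by half the scale factor. Whether that loss is acceptable for a safety-relevant detector would need to be validated on field data.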

3.7.5. Quantifying Environmental Impacts: Moving Beyond Qualitative Claims

The primary gap lies in the failure to treat sustainability as an explicit optimization variable rather than an inherent, unmeasured byproduct of efficiency. Without standardized environmental metrics, it is impossible to objectively compare the “green” performance of different maintenance methodologies or to support high-level climate adaptation strategies with verifiable data. To transition from qualitative sustainability claims to objective environmental quantification, future research must adopt a standardized framework of “Green KPIs”. Future maintenance models should explicitly measure and report the following metrics:

Avoided Logistic Emissions (kg CO₂e): The reduction in greenhouse gas emissions achieved by eliminating unnecessary physical inspection trips and ad hoc emergency maintenance routing.
Material Preservation Index (Tonnes): The exact mass of raw materials (e.g., steel from rails and wheelsets, concrete from sleepers) saved from premature scrapping due to the AI-driven extension of the asset’s RUL.
Operational Energy Optimization (kWh): The energy saved by preventing unplanned train stoppages, idling, and the rerouting of freight/passenger services caused by unexpected infrastructure failures.
Algorithmic Carbon Footprint (kg CO₂e): A critical counter-metric. Researchers must calculate and report the computational energy consumed to train and deploy complex Deep Learning models or run real-time Digital Twin simulations, ensuring the AI’s carbon cost does not outweigh the physical maintenance savings.
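The four metrics above can be combined into a simple reporting routine. The sketch below is a minimal illustration: every emission factor and input quantity is a hypothetical placeholder that would need to be replaced with audited, network-specific values, and the net figure deliberately excludes the material term because converting tonnes of steel to CO₂e requires a further factor not given in the text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class GreenKpiInputs:
    inspection_km_avoided: float     # physical inspection travel eliminated
    vehicle_kg_co2e_per_km: float    # assumed emission factor of the vehicle
    material_tonnes_preserved: float # steel/concrete saved from early scrapping
    energy_kwh_saved: float          # operational energy saved
    compute_kwh_used: float          # energy to train and run the models
    grid_kg_co2e_per_kwh: float      # assumed grid carbon intensity


def green_kpis(x: GreenKpiInputs) -> Dict[str, float]:
    """Compute the four proposed KPIs plus a net CO2e balance."""
    avoided_logistics = x.inspection_km_avoided * x.vehicle_kg_co2e_per_km
    operational = x.energy_kwh_saved * x.grid_kg_co2e_per_kwh
    algorithmic = x.compute_kwh_used * x.grid_kg_co2e_per_kwh
    return {
        "avoided_logistic_emissions_kg_co2e": avoided_logistics,
        "material_preservation_tonnes": x.material_tonnes_preserved,
        "operational_energy_optimization_kwh": x.energy_kwh_saved,
        "algorithmic_carbon_footprint_kg_co2e": algorithmic,
        # Counter-metric applied: computation is subtracted from the savings.
        "net_kg_co2e": avoided_logistics + operational - algorithmic,
    }


# Entirely hypothetical annual figures for one maintenance programme.
report = green_kpis(
    GreenKpiInputs(
        inspection_km_avoided=1000.0,
        vehicle_kg_co2e_per_km=0.5,
        material_tonnes_preserved=12.0,
        energy_kwh_saved=2000.0,
        compute_kwh_used=400.0,
        grid_kg_co2e_per_kwh=0.25,
    )
)
```

Reporting the algorithmic footprint alongside the savings makes it visible when a computationally heavy model erodes its own environmental benefit.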

3.7.6. The Physical Foundation: High-Fidelity Modeling and Structural Optimization

While the integration of AI and BIM provides the necessary data architecture for predictive maintenance, a truly bidirectional Digital Twin must be built upon a rigorous physical foundation. A critical gap in purely data-driven approaches is the oversight of the complex mechanical behavior of modern rolling stock. To accurately predict structural degradation, AI models must interface with recent advancements in high-fidelity FEM and multi-disciplinary optimization.

The transition from reactive to predictive maintenance requires robust material and component design frameworks to define the baseline performance metrics that AI is tasked to monitor. For instance, the implementation of dynamic analysis for advanced hybrid railcar structures [88], and the topological optimization of railway bolster beams factoring in complex manufacturing constraints [89], provide the essential structural baselines required for accurate degradation forecasting. By integrating these high-fidelity structural engineering principles, such as the development and experimental validation of physics-based Digital Twins for freight wagon monitoring [90], future researchers can effectively bridge the gap between static geometric BIM data and dynamic, self-correcting Digital Twins.

4. Applications and Limitations of the Study

As discussed in previous sections, the findings of this study can contribute to improving predictive maintenance management for railway infrastructure and rolling stock in several ways. Furthermore, although this study focused on identifying a broad range of digital technologies—including AI, BIM, and DTs—and their applications on a global scale, its results can be utilized for preliminary technology identification and strategy formulation within specific railway networks.

It should be noted that the findings of this literature review are also subject to some limitations, which are outlined as follows:

Rapid Technological Evolution: The identified algorithms, digital frameworks, and their applications are subject to the rapid evolution of AI. Furthermore, by methodologically restricting this review to peer-reviewed journal articles to ensure high empirical validity, some preliminary or cutting-edge theoretical advancements frequently published in early-stage conference proceedings may have been excluded. As the field of deep learning and DT technology progresses, continuous updates to this review are necessary to ensure the relevance and accuracy of the state of the art.
Contextual and Data Dependency: The effectiveness of the proposed predictive maintenance solutions is not conclusive across all operational environments, as performance heavily depends on local data availability, the age of the infrastructure, and the quality of installed sensors. The technical viability of specific digital enablers varies by region and operator, necessitating localized research to determine the most feasible solutions for specific assets.
Specific Asset Focus: This paper primarily focused on core railway track infrastructure and major rolling stock components. However, there is a need to explore other critical railway assets in greater depth, including power supply systems, catenary lines, and complex signaling equipment, which may require specialized sensing and data management frameworks. Furthermore, as this review strictly focused on the technical and operational optimization of maintenance, the critical dimensions of cybersecurity, data privacy, and the ethical management of infrastructure data remain unaddressed and require dedicated systematic investigation in future studies.

SWOT Analysis for Industrial Application

To assist infrastructure managers and industry practitioners in evaluating the transition toward digitalized predictive maintenance, the technical findings of this systematic review have been synthesized into a SWOT analysis, as shown in Table 4. This strategic framework outlines the immediate internal capabilities and barriers (Strengths and Weaknesses) alongside external operational and technological factors (Opportunities and Threats).

5. Conclusions

This study reviewed the state of the art in digitalized predictive maintenance for railway infrastructure and rolling stock, specifically examining the convergence of AI, BIM, and DTs. To this end, a systematic literature review was conducted in strict adherence to PRISMA guidelines, resulting in the final selection of 73 relevant peer-reviewed articles from an initial pool of 632 records. The completed PRISMA 2020 checklist, which ensured the methodological integrity of this selection process, is provided as Supplementary Material [91]. The collected dataset was then subjected to rigorous bibliometric and thematic content analysis to classify ML paradigms and evaluate the maturity of digital integration. The main findings are as follows:

Three primary technological pillars were identified as the foundation for digitalized predictive maintenance: AI, DT and BIM. These pillars rely on diverse data sources, including IoT sensors, track geometry cars, visual inspection data, and maintenance logs.
Three major ML paradigms were identified and categorized to address specific maintenance challenges: Supervised Learning, Unsupervised Learning, and Reinforcement Learning. Supervised Learning is the most dominant paradigm, primarily utilizing tree-based algorithms like RF for tabular maintenance logs and Deep Learning architectures (e.g., CNNs) for unstructured visual data. Unsupervised Learning (e.g., Autoencoders, K-Means) is critical for anomaly detection in the absence of labeled failure data. Reinforcement Learning (e.g., DQN, A2C) is emerging as a key tool for optimizing sequential decision-making tasks such as maintenance scheduling and renewal planning. This classification underscores the specialized utility of each algorithm type within the maintenance hierarchy.
Predictive maintenance applications were identified and categorized into physical targets and operational outcomes. Physical targets include core infrastructure assets such as rails, sleepers, switches, and rolling stock components like wheels and bearings. Operational outcomes were classified into defect detection, RUL estimation, and maintenance scheduling optimization. These actionable categorizations assist railway managers in selecting the appropriate algorithmic tool for specific asset classes and operational goals.
The identified digital technologies and algorithms were prioritized based on their occurrence in the literature to highlight well-established versus emerging methods. Supervised Learning algorithms, specifically RF and Neural Networks, were found to be the most frequently studied for defect prediction. In contrast, True DTs with bidirectional control and Reinforcement Learning applications such as A2C remain niche but rapidly growing areas of research. Additionally, predictive maintenance, anomaly detection, and DTs were identified as “Motor Themes” driving the current research agenda.
An informative relationship mapping was created to correlate specific ML algorithms with focused railway components. It demonstrated that CNNs are predominantly correlated with visual surface defects in rails and rolling stock, while LSTMs are strongly associated with time-series forecasting of track geometry degradation. The mapping also showed that Reinforcement Learning is most effective for high-level decision support in maintenance planning rather than direct defect detection. These insights facilitate the selection of optimal algorithmic architectures for specific maintenance tasks.
The complex interplay of digital integration and its limitations was also illustrated, specifically the DT Gap. The analysis revealed a prevalence of unidirectional digital models that only monitor assets, as opposed to bidirectional True DTs capable of autonomous actuation or parameter updates. Furthermore, the study highlighted the critical role of IFC in enabling interoperability between BIM and AI, a standard often neglected in favor of proprietary formats.
Four distinct levels of digital maintenance maturity were identified and discussed: (1) Visualization using 3D models, (2) Diagnosis using basic alerts and dashboards, (3) Self-correction using recursive model updating, and (4) Autonomous control using closed-loop feedback systems.
Five critical gaps in the current body of literature were also identified to guide future research. These include the “Safety Paradox” leading to data scarcity, the “Black-Box” nature of deep learning hindering interpretability, the lack of standardized bidirectional control in DTs, computational constraints for real-time edge deployment, and the absence of quantified environmental metrics to validate sustainability claims.
In complementing the existing literature, this study presents a comprehensive systematic review of the convergence of AI, BIM, and DTs in railway maintenance. It offers a structured classification of algorithms and their component-specific applications, maps the current state of interoperability and data flow, and provides an organized framework of challenges to support the transition toward fully autonomous and sustainable railway infrastructure management.
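
The four maturity levels and the unidirectional versus bidirectional distinction summarized in the findings above can be expressed as a small data structure. The following is a minimal Python sketch and is not taken from any of the reviewed studies: the `TwinCapabilities` flags, the `classify` rule (the most advanced capability present decides the level), and the reading that levels 3 and 4 correspond to bidirectional True DTs are illustrative assumptions.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Optional


class MaturityLevel(IntEnum):
    """Four levels of digital maintenance maturity, as listed in the findings."""
    VISUALIZATION = 1       # 3D models
    DIAGNOSIS = 2           # basic alerts and dashboards
    SELF_CORRECTION = 3     # recursive model updating
    AUTONOMOUS_CONTROL = 4  # closed-loop feedback systems


@dataclass(frozen=True)
class TwinCapabilities:
    """Hypothetical capability flags for one reported system (assumed names)."""
    has_3d_model: bool = False
    raises_alerts: bool = False
    updates_model_from_data: bool = False
    actuates_physical_asset: bool = False


def classify(cap: TwinCapabilities) -> Optional[MaturityLevel]:
    """Return the highest level whose capability is present, or None.

    Assumption: levels are ranked and the most advanced capability decides.
    """
    if cap.actuates_physical_asset:
        return MaturityLevel.AUTONOMOUS_CONTROL
    if cap.updates_model_from_data:
        return MaturityLevel.SELF_CORRECTION
    if cap.raises_alerts:
        return MaturityLevel.DIAGNOSIS
    if cap.has_3d_model:
        return MaturityLevel.VISUALIZATION
    return None


def is_bidirectional(level: Optional[MaturityLevel]) -> bool:
    """Assumed reading: self-correction and autonomous control imply a feedback path."""
    return level is not None and level >= MaturityLevel.SELF_CORRECTION


if __name__ == "__main__":
    # A monitoring-only system: 3D model plus dashboard alerts, no feedback.
    monitoring_only = TwinCapabilities(has_3d_model=True, raises_alerts=True)
    level = classify(monitoring_only)
    print(level.name, is_bidirectional(level))
```

Under these assumptions, a system that only visualizes and raises alerts is classified at the Diagnosis level and falls on the unidirectional side of the DT Gap. A stricter variant could require each level to include all lower-level capabilities.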

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/infrastructures11030087/s1: the completed PRISMA 2020 checklist [91].

Author Contributions

Conceptualization, U.M. and S.K.; methodology, U.M. and S.K.; software, U.M.; validation, U.M. and S.K.; formal analysis, U.M. and S.K.; investigation, U.M.; resources, S.K.; data curation, U.M.; writing—original draft preparation, U.M.; writing—review and editing, U.M. and S.K.; visualization, U.M.; supervision, S.K.; project administration, S.K.; funding acquisition, S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the European Commission, grant number H2020-MSCA-RISE No. 691135. In addition, the authors wish to thank the European Commission and the UKRI Engineering and Physical Sciences Research Council (EPSRC) for the financial sponsorship of the Re4Rail project (Grant No. EP/Y015401/1). The APC was funded by MDPI’s Invited Paper Initiative.

Data Availability Statement

All data supporting the findings of this study are available within the article itself and through the referenced sources.

Acknowledgments

The authors wish to thank the European Commission and the UKRI Engineering and Physical Sciences Research Council (EPSRC) for the financial sponsorship of the Re4Rail project. The first author gratefully thanks the Turkish Government and the Turkish Ministry of Education for the PhD scholarship at the University of Birmingham.

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:

A2C	Advantage Actor-Critic
A3C	Asynchronous Advantage Actor-Critic
AI	Artificial Intelligence
ANN	Artificial Neural Network
BIM	Building Information Modeling
CBM	Condition-Based Maintenance
CCTV	Closed-Circuit Television
CNN	Convolutional Neural Network
kg CO₂e	Kilograms of Carbon Dioxide Equivalent
DBSCAN	Density-Based Spatial Clustering of Applications with Noise
DNN	Deep Neural Network
DQN	Deep Q-Network
DT	Digital Twin
ECU	Electronic Control Unit
FEM	Finite Element Method
GRU	Gated Recurrent Unit
IFC	Industry Foundation Classes
IoT	Internet of Things
KNN	K-Nearest Neighbors
kWh	Kilowatt-hour
LSTM	Long Short-Term Memory
MDP	Markov Decision Process
ML	Machine Learning
PRISMA	Preferred Reporting Items for Systematic Reviews and Meta-Analyses
RF	Random Forest
RNN	Recurrent Neural Network
RUL	Remaining Useful Life
UAV	Unmanned Aerial Vehicle

Appendix A. List of Reviewed Papers

The papers reviewed in this study are listed in Table A1, sorted alphabetically by title.

Table A1. Summary of selected resources.

Ref.	Aim of Study	Focused Component	Validation	AI Paradigm
[27]	Develop optimal renewal and maintenance planning minimizing long-term costs and failure risks.	Track	Real-world Case Study	Deep Reinforcement Learning (DRL), Double Deep Q-Network (DDQN), Cox Hazard Model
[46]	Optimize maintenance schedules to enhance operational efficiency and safety.	Track	Real-world Case Study	Graph Convolutional Networks (GCN), GraphSAGE, MLP
[47]	Real-time monitoring and fault diagnosis of railway vehicle bogies.	Rolling Stock (Bogies)	Lab Experiment	Convolutional Neural Networks (CNNs)
[38]	Predict wear and damage of pantograph sliding strips.	Rolling Stock (Pantograph)	Real-world Case Study	Artificial Neural Networks (ANNs)
[42]	Compare ML algorithms for estimating future failure points of tracks.	Track	Real-world Case Study	LSTM, Random Forest, Decision Tree, KNN, SVM
[20]	Classify and analyze data-driven approaches for railway PdM.	Track, Rolling Stock	N/A	SVM, RF, CNN, RNN, Autoencoders
[53]	Detect failures in advance for Air Production Units.	Rolling Stock (Air Production Unit)	Real-world Case Study	CNN-LSTM (Forecasting), XGBOD, Autoencoders
[49]	Real-time condition assessment of rail tracks at turnout areas.	Switch/Crossing	Real-world Case Study	CNN, Sparse Bayesian Extreme Learning Machine (SBELM)
[63]	Explore combined use of InSAR and GPR for health monitoring.	Track	N/A	CNN, SVM, RF, XGBoost
[92]	Detect damages in gear transmission systems early.	Rolling Stock (Mechanical)	Simulation	Convolutional Neural Networks (CNNs)
[81]	Anticipate failures and support maintenance decisions with explainability.	Rolling Stock (Air Production Unit)	Real-world Case Study	Adaptive Random Forest Classifier (ARFC), Hoeffding Trees
[93]	Detect traffic anomalies using self-powered sensors.	N/A (General Traffic)	Lab Experiment	Autoencoder, CNN-LSTM, Contrastive Learning
[32]	Forecast risk of service failures to improve safety.	Track (Heavy Haul)	Real-world Case Study	Gradient Boosting, Random Forest, Decision Tree, MLP
[94]	Automate inspection for safety and efficiency.	Track	Real-world Case Study	Convolutional Neural Networks (CNNs)
[69]	Framework for Digital Twin collaboration to diagnose faults.	All	Simulation	Machine Learning (General)
[95]	Assessment of track structure condition using monitoring data.	Track	Real-world Case Study	Sparse Bayesian Extreme Learning Machine (SBELM)
[34]	Monitor track conditions using car-body vibration.	Track	Real-world Case Study	Support Vector Machine (SVM)
[39]	Characterize normal acceleration behavior to detect anomalies.	Rolling Stock (Car Body)	Real-world Case Study	Artificial Neural Networks (ANN)
[96]	Monitor conditions leading to high-impact loads.	Rolling Stock (Wheel)	Real-world Case Study	Logical Analysis of Data (LAD), Ant Colony Optimization (ACO)
[78]	Support maintenance decisions for degrading assets.	Rolling Stock (Wheelset), Track	Real-world Case Study	Markov Decision Process (MDP)
[37]	Detect failures in machinery before critical stages.	Rolling Stock (Air Production Unit)	Real-world Case Study	Half-Space-Trees, One Class K Nearest Neighbor (OCKNN)
[15]	Develop efficient inspection and maintenance policies.	Track	Real-world Case Study	Random Forest, Recurrent Neural Networks (RNNs)
[61]	Automatically localize track regions with high settlement rates.	Track	Real-world Case Study	Autoencoder, KMeans Clustering
[97]	Predict rail useful lifetime and analyze risk.	Track (Rail)	Real-world Case Study	Deep Neural Networks (DNNs), DeepSurv
[31]	Monitor lateral and cross-level track irregularities.	Track	Simulation	Random Forest, CNN
[60]	Assign health scores to railcars to prioritize maintenance.	Rolling Stock (Railcar)	Real-world Case Study	Random Forest, Decision Tree, DBSCAN, PCA
[48]	Detect rail defects, joints, and switches.	Track (Rail, Switch)	Real-world Case Study	Deep Neural Networks (DNNs), CNN
[74]	Predict rail surface damage (Rolling Contact Fatigue).	Track (Rail Surface)	Real-world Case Study	N/A (Physics-based simulation)
[98]	Review Digital Twin applications in transportation maintenance.	All	N/A	Machine Learning, Deep Reinforcement Learning
[73]	Anomaly detection for rail transportation.	Rolling Stock/Track	Lab Experiment	Unsupervised Machine Learning (LSTM Autoencoder)
[70]	Design a Digital Twin configuration for CBM applications.	Rolling Stock (Axle Bearings)	Real-world Case Study	Artificial Neural Networks (ANNs), Weibull Analysis
[54]	Estimate Remaining Useful Life (RUL) of equipment.	Rolling Stock/Mechanical	Simulation	LSTM Autoencoder
[80]	Investigate optimal maintenance strategy under various conditions.	Track	Simulation	AI-based algorithms (general mention)
[85]	Implement core PdM functionality using online machine learning.	Rolling Stock (Doors)	Real-world Case Study	LSTM-AE, Clustering (CheMoc)
[64]	Optimize UAV-based autonomous inspection.	Track	Simulation	Hybrid Deep Reinforcement Learning, DeepSeek
[55]	Manage wheel wear with limited measurement data.	Rolling Stock (Wheel)	Simulation	DNN, Deep Reinforcement Learning (Deep Q-Learning)
[82]	Predict track geometry defects using GPR data and explainable QNN.	Track (Subsurface)	Real-world Case Study	Quantum Neural Network (QNN), SHAP
[68]	Detect bolt preload conditions in insulated rail joints using a Digital Twin.	Track (Insulated Rail Joints)	Simulation/Lab Experiment	Decision Trees (Coarse/Medium/Fine)
[77]	Develop functions to classify track condition and predict service levels.	Track (Geometry)	Real-world Case Study	ANN, Linear/Exponential Regression
[45]	Propose a framework (RailCANet) for anomaly detection and maintenance.	Infrastructure/Rolling Stock	Real-world Case Study	RailCANet (GNN, CNN), LSTM, Transformer
[51]	Automate rail head wear detection using deep learning computer vision.	Track (Rail Head)	Real-world Case Study	Mask R-CNN, YOLOv8
[52]	Predict wheel–rail dynamic contact forces using a lightweight surrogate model.	Rolling Stock (Wheel-Rail)	Simulation (MBS/FEM)	CNN, LSTM, Transformer, Knowledge Distillation
[43]	Predict failures of train traction converter cooling systems.	Rolling Stock (Cooling System)	Real-world Case Study	LSTM
[33]	Predict rail track lifetime integrating environmental and operational data.	Track	Real-world Case Study	CATB, XGBoost, Random Forest, Decision Tree
[99]	Model degradation rates of track geometry local defects.	Track (Geometry)	Real-world Case Study	Regression Analysis
[50]	Detect wheel out-of-roundness using axlebox vibration.	Rolling Stock (Wheels)	Simulation (MBS)	OORNet (Deep Learning), 1DCNN
[56]	Determine optimal maintenance limits to minimize costs.	Track (Geometry)	Real-world Case Study	K-Means Clustering
[100]	Discuss AI applications for traffic and infrastructure maintenance.	General	N/A	N/A (General AI discussion)
[101]	Analyze track parameters’ effect on performance/LCC.	Track (Slab Track)	Simulation (FEM)	Linear Regression, KNN, Decision Tree, RF, MLP, Gradient Boosting
[102]	Develop TQI prediction model for sections not measured by cars.	Track (Geometry/TQI)	Real-world Case Study	Logit/Linear Regression
[103]	Evaluate approaches toward implementing predictive maintenance.	General	N/A	Review (SVM, RF, ANN, LSTM, etc.)
[67]	Propose PdM model for switch machines using Digital Twins.	Switch/Crossing	Experiment/Simulation	LSTM, ARIMA
[104]	Create a life prediction framework for wheels and rails.	Wheel-Rail Interface	Real-world Case Study	N/A (MBS Simulation)
[29]	Predict failure category and maintenance needs of station elevators.	Station Facilities (Elevators)	Real-world Case Study	Decision Tree, Random Forest, Gradient Boosted Tree
[30]	Create strategic decision support for rolling stock maintenance.	Rolling Stock (Traction/Braking)	Real-world Case Study	J48 (Decision Tree), M5P (Regression Tree)
[28]	Predict maintenance need, activity type, and trigger status.	Switch/Crossing	Real-world Case Study	Decision Tree, Random Forest, Gradient Boosted Trees
[41]	Forecast future TQI values using historical measurements.	Track (Geometry/TQI)	Real-world Case Study	General Regression Neural Network (GRNN)
[62]	Improve maintenance efficiency using DRL and Digital Twin.	Track (Geometry/Components)	Real-world Case Study	Advantage Actor Critic (A2C) (Deep Reinforcement Learning)
[40]	Evaluate track buckling risks using a surrogate ML model.	Track	Simulation	Multilayer Perceptron (MLP)
[58]	Predict remaining time to critical fault severity without RTF data.	Rolling Stock (Door Systems)	Real-world Case Study	K-Means Clustering
[105]	Predict vehicle defects to optimize maintenance scheduling.	Rolling Stock	Real-world Case Study	MLP, ANFIS, Particle Swarm Optimization (PSO)
[106]	Review data analytics techniques for track condition monitoring.	Track (Geometry)	N/A	Review (ANN, Bayesian, SVM, etc.)
[79]	Optimize scheduling to reduce costs and failure risk.	Rail Network	Real-world Case Study	N/A (Optimization Models)
[57]	Real-time track geometry monitoring using low-cost sensor fusion.	Track (Geometry)	Field Experiment	K-Means Clustering, Fuzzy Logic
[59]	Review data-driven models for track predictive maintenance.	Track	N/A	Review (Deep Learning, Ensemble, etc.)
[84]	Review SHM and digital tools for rail infrastructure.	Infrastructure	N/A	Review (AI, ML, DL, RL)
[36]	Predict system downtime and incident risk levels for CCTV.	Rolling Stock (CCTV)	Real-world Case Study	Bayesian Ridge, SVR, KNN, LSTM, CNN, SARIMAX
[107]	Develop a roadmap for data-driven PdM in SA railways.	Railway Industry	Real-world Case Study	N/A
[84]	Automated assessment of track bed stratigraphy/fouling.	Track (Ballast/Substructure)	N/A	AI (Specifics not detailed)
[75]	Propose guidelines and reference architecture for AI-DTs.	General/Maintenance	N/A	N/A (Architecture)
[35]	Develop DT framework for trams using low-cost sensors.	Rolling Stock (Tram)	Real-world Case Study	SVM, Random Forest, Huber Regression
[87]	Review predictive diagnostics methods for axle bearings.	Rolling Stock (Axle Bearings)	N/A	Review (Deep Learning, Hybrid)
[44]	Predict track geometry using 3D RNN models co-simulated with BIM.	Track (Geometry)	Real-world Case Study	RNN, LSTM, GRU, Attention

References

Sun, Q.; Wang, X.; Ma, F.; Han, Y.; Cheng, Q. Synergetic Effect and Spatial-Temporal Evolution of Railway Transportation in Sustainable Development of Trade: An Empirical Study Based on the Belt and Road. Sustainability 2019, 11, 1721.
Whiteing, T.; Menaz, B. Thematic Research Summary: Rail Transport; Transport Research Knowledge Centre—European Commission: Brussels, Belgium, 2009.
Souza, E.F.; Bragança, C.; Meixedo, A.; Ribeiro, D.; Bittencourt, T.N.; Carvalho, H. Drive-by Methodologies Applied to Railway Infrastructure Subsystems: A Literature Review—Part I: Bridges and Viaducts. Appl. Sci. 2023, 13, 6940.
Sharma, S.K.; Kumar, A. A comparative study of Indian and world wide railways. Int. J. Mech. Eng. Robot. Res. 2014, 1, 114–120.
Du, C.; Dutta, S.; Kurup, P.; Yu, T.; Wang, X. A review of railway infrastructure monitoring using fiber optic sensors. Sens. Actuators A Phys. 2020, 303, 111728.
Dekker, B.; Ton, B.; Meijer, J.; Bouali, N.; Linssen, J.; Ahmed, F. Point Cloud Analysis of Railway Infrastructure: A Systematic Literature Review. IEEE Access 2023, 11, 134355–134373.
European Environment Agency. Transport and Environment Report 2022; European Environment Agency: Copenhagen, Denmark, 2023; Available online: https://www.eea.europa.eu/en/analysis/publications/transport-and-environment-report-2022 (accessed on 17 November 2025).
Vieira, J.; Martins, J.P.; de Almeida, N.M.; Patrício, H.; Morgado, J.G. Towards Resilient and Sustainable Rail and Road Networks: A Systematic Literature Review on Digital Twins. Sustainability 2022, 14, 7060.
Berdica, K. An introduction to road vulnerability: What has been done, is done and should be done. Transp. Policy 2002, 9, 117–127.
Soleimanmeigouni, I.; Ahmadi, A.; Kumar, U. Track geometry degradation and maintenance modelling: A review. Proc. Inst. Mech. Eng. Part F J. Rail Rapid Transit 2016, 232, 73–102.
Lidén, T. Railway Infrastructure Maintenance—A Survey of Planning Problems and Conducted Research. Transp. Res. Procedia 2015, 10, 574–583.
Chandran, P.; Asber, J.; Thiery, F.; Odelius, J.; Rantatalo, M. An Investigation of Railway Fastener Detection Using Image Processing and Augmented Deep Learning. Sustainability 2021, 13, 12051.
Dobney, K.; Baker, C.J.; Quinn, A.D.; Chapman, L. Quantifying the effects of high summer temperatures due to climate change on buckling and rail related delays in south-east United Kingdom. Meteorol. Appl. 2009, 16, 245–251.
Sogabe, M.; Asanuma, K.; Nakamura, T.; Kataoka, H.; Goto, K.; Tokunaga, M. Deformation Behavior of Ballasted Track during Earthquakes. Q. Rep. RTRI 2013, 54, 104–111.
Gerum, P.C.L.; Altay, A.; Baykal-Gürsoy, M. Data-driven predictive maintenance scheduling policies for railways. Transp. Res. Part C Emerg. Technol. 2019, 107, 137–154.
Sresakoolchai, J.; Kaewunruen, S. Carbon emission reduction in railway maintenance using reinforcement learning. In Life-Cycle of Structures and Infrastructure Systems; CRC Press: Boca Raton, FL, USA, 2023.
Du, G.; Karoumi, R. Life cycle assessment of a railway bridge: Comparison of two superstructure designs. Struct. Infrastruct. Eng. 2013, 9, 1149–1160.
Kushwaha, D.; Kumar, A.; Harsha, S.P. Advancements and applications of digital twin in the railway industry: A literature review. Int. J. Rail Transp. 2024, 13, 865–890.
Bryde, D.; Broquetas, M.; Volm, J.M. The project benefits of Building Information Modelling (BIM). Int. J. Proj. Manag. 2013, 31, 971–980.
Davari, N.; Veloso, B.; de Assis Costa, G.; Pereira, P.M.; Ribeiro, R.P.; Gama, J. A Survey on Data-Driven Predictive Maintenance for the Railway Industry. Sensors 2021, 21, 5739.
Page, M.J.; Moher, D.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. PRISMA 2020 explanation and elaboration: Updated guidance and exemplars for reporting systematic reviews. BMJ 2021, 372, n160.
Lagap, U.; Ghaffarian, S. Digital post-disaster risk management twinning: A review and improved conceptual framework. Int. J. Disaster Risk Reduct. 2024, 110, 104629.
Carrera-Rivera, A.; Ochoa, W.; Larrinaga, F.; Lasa, G. How-to conduct a systematic literature review: A quick guide for computer science research. MethodsX 2022, 9, 101895.
Banihashemi, S.; Meskin, S.; Sheikhkhoshkar, M.; Mohandes, S.R.; Hajirasouli, A.; LeNguyen, K. Circular economy in construction: The digital transformation perspective. Clean. Eng. Technol. 2024, 18, 100715.
Sheikhkhoshkar, M.; Bril El Haouzi, H.; Aubry, A.; Hamzeh, F. An advanced exploration of functionalities as the underlying principles of construction control metrics. Smart Sustain. Built Environ. 2024, 13, 644–676.
Haghighi, E.; Kasraei, A.; Famurewa, S.; Strandberg, G.; Sas, G.; Garmabaki, A. Climate change risks on railway infrastructure: A systematic review and analysis. Sustain. Cities Soc. 2025, 129, 106504.
Mohammadi, R.; He, Q. A deep reinforcement learning approach for rail renewal and maintenance planning. Reliab. Eng. Syst. Saf. 2022, 225, 108615.
Bukhsh, Z.A.; Saeed, A.; Stipanovic, I.; Doree, A.G. Predictive maintenance using tree-based classification techniques: A case of railway switches. Transp. Res. Part C Emerg. Technol. 2019, 101, 35–54.
Alothman, A.; Malraj, M.; Vilventhan, A. Predictive Maintenance of Metro Rail Station Facilities Using Tree-Based Machine-Learning Algorithms. J. Perform. Constr. Facil. 2026, 40, 04025074.
Kalathas, I.; Papoutsidakis, M. Predictive maintenance using machine learning and data mining: A pioneer method implemented to greek railways. Designs 2021, 5, 5.
Kaviani, N.; Rannquist, A.; Fraseth, G.T.; Lau, A.; Ricci, S.; Rizzetto, L. Detecting lateral track irregularities by onboard measurements of lateral acceleration and displacements and Machine Learning algorithms. Ing. Ferrov. 2024, 79, 633–653.
Ghofrani, F.; Sun, H.; He, Q. Analyzing Risk of Service Failures in Heavy Haul Rail Lines: A Hybrid Approach for Imbalanced Data. Risk Anal. 2021, 42, 1852–1871.
Ji, K.; Wang, G.H.; Choi, I.; Jeon, J.-S. Machine learning based life prediction of rail tracks using environmental and operational factors. Dev. Built Environ. 2025, 23, 100718.
Tsunashima, H. Condition Monitoring of Railway Tracks from Car-Body Vibration Using a Machine Learning Technique. Appl. Sci. 2019, 9, 2734.
Wysocki, O.; Kuziemski, M.; Freitas, A.; Wasilczuk, M.; Czyżewicz, J. Toward Low-Cost Digital Twins for Urban Transportation Systems. IEEE Access 2025, 13, 120277–120292.
Rahman, M.; Alkali, B.; Jain, A.K.; Parrilla-Gutierrez, J.; Mcneil, C.; Nelson, J. The application of time series predictive maintenance model on rolling stock critical systems. Adv. Mech. Eng. 2025, 17, 16878132251384345.
Meira, J.; Veloso, B.; Bolón-Canedo, V.; Marreiros, G.; Alonso-Betanzos, A.; Gama, J. Data-driven predictive maintenance framework for railway systems. Intell. Data Anal. 2023, 27, 1087–1102.
Kuźnar, M.; Lorenc, A. A Method of Predicting Wear and Damage of Pantograph Sliding Strips Based on Artificial Neural Networks. Materials 2022, 15, 98.
Martínez-Llop, P.G.; de Dios Sanz Bobi, J.; Solano Jiménez, Á.; Sánchez, J.G. Condition-based maintenance for normal behaviour characterisation of railway car-body acceleration applying neural networks. Sustainability 2021, 13, 12265.
Agustin, D.; Wu, Q.; Bernal, E.; Spiryagin, M.; Cole, C. Railway track buckling evaluation using rigid-flexible multibody dynamic model and machine learning. Mech. Based Des. Struct. Mach. 2025, 53, 4830–4852.
Nedevska, T.I.; Zafirovski, Z. Predictive modeling of track quality index with neural networks. J. Appl. Eng. Sci. 2025, 23, 735–741.
Nigam, S.; Kumar, D. A predictive model based on the LSTM technique for the maintenance of railway track system. Int. J. Comput. Sci. Eng. 2025, 28, 10–20.
De Simone, L.; Caputo, E.; Cinque, M.; Galli, A.; Moscato, V.; Russo, S.; Cesaro, G.; Criscuolo, V.; Giannini, G. LSTM-based failure prediction for railway rolling stock equipment. Expert Syst. Appl. 2023, 222, 119767.
Sresakoolchai, J.; Kaewunruen, S. Track Geometry Prediction Using Three-Dimensional Recurrent Neural Network-Based Models Cross-Functionally Co-Simulated with BIM. Sensors 2023, 23, 391.
Quan, L.; Wang, M.; Baihang, L.; Ziwen, Z. Integration of deep learning and railway big data for environmental risk prediction models and analysis of their limitations. Front. Environ. Sci. 2025, 13, 1550745.
MajidiParast, S.; Monemi, R.N.; Gelareh, S. A graph convolutional network for optimal intelligent predictive maintenance of railway tracks. Decis. Anal. J. 2025, 14, 100542.
Li, D.; Su, D.; Tang, H.; Yu, D.; Yu, L.; Zhang, H.; Hu, G. A hardware-centered digital twin framework for real-time monitoring of railway vehicle bogies. Proc. Inst. Mech. Eng. Part C J. Mech. Eng. Sci. 2025, 239, 10242–10257.
Yang, C.; Sun, Y.; Ladubec, C.; Liu, Y. Developing Machine Learning-Based Models for Railway Inspection. Appl. Sci. 2021, 11, 13.
Zhou, L.; Chen, S.-X.; Ni, Y.-Q.; Liu, X.-Z. Advancement of data-driven SHM: A research paradigm on AE-based switch rail condition monitoring. J. Infrastruct. Intell. Resil. 2024, 3, 100107.
Ye, Y.; Zhu, B.; Huang, P.; Peng, B. OORNet: A deep learning model for on-board condition monitoring and fault diagnosis of out-of-round wheels of high-speed trains. Measurement 2022, 199, 111268.
Khan, A.T.; Jahanzaib, A.; Khan, A.R.; Ramzan, M.; Li, S. Intelligent Predictive Maintenance in Urban Rail Systems: Computer Vision Approaches for Enhanced Wear Detection. Transp. Res. Rec. 2025, 2680, 626–638.
Lin, Z.; Cho, C. Lightweight Knowledge Distillation-Based Surrogate Model for Wheel–Rail Dynamic Contact Force Prediction Using Hybrid CNN–LSTM–Transformer Networks. IEEE Sens. J. 2025, 25, 17239–17251.
Toribio, L.; Veloso, B.; Gama, J.; Zafra, A. A two-stage framework for early failure detection in predictive maintenance: A case study on metro trains. Neurocomputing 2026, 670, 132506.
Hu, L.; Dai, G. Estimate remaining useful life for predictive railways maintenance based on LSTM autoencoder. Neural Comput. Appl. 2022, 37, 22967–22978.
Sresakoolchai, J.; Ngamkhanong, C.; Kaewunruen, S. Hybrid learning strategies: Integrating supervised and reinforcement techniques for railway wheel wear management with limited measurement data. Front. Built Environ. 2025, 11, 1546957.
Kasraei, A.; Zakeri, J.A.; Bakhtiary, A. Optimal track geometry maintenance limits using machine learning: A case study. Proc. Inst. Mech. Eng. Part F J. Rail Rapid Transit 2020, 235, 876–886.
Fikri, A.A.; Subhan, M.F.N.; Suryanto, H.; Muhdi, K.D.; Pratama, D.F.; Iqbal, A. Sensor Fusion of Laser and Inertial Units with Kalman-KMeans-Fuzzy Framework for Real-Time Railway Geometry Monitoring. Bul. Ilm. Sarj. Tek. Elektro 2025, 7, 572–594.
Shimizu, M.; Perinpanayagam, S.; Namoano, B.; Starr, A. Real-Time Prognostics and Health Management Without Run-to-Failure Data on Railway Assets. IEEE Access 2023, 11, 28724–28734.
Xie, J.; Huang, J.; Zeng, C.; Jiang, S.-H.; Podlich, N. Systematic Literature Review on Data-Driven Models for Predictive Maintenance of Railway Track: Implications in Geotechnical Engineering. Geosciences 2020, 10, 425.
Ejlali, M.; Arian, E.; Taghiyeh, S.; Chambers, K.; Sadeghi, A.H.; Taghiye, E.; Cakdi, D.; Handfield, R.B. Developing hybrid machine learning models to assign health score to railcar fleets for optimal decision making. Expert Syst. Appl. 2024, 250, 123931.
Popov, K.; De Bold, R.; Chai, H.-K.; Forde, M.; Ho, C.; Hyslip, J.; Kashani, H.; Kelly, R.; Hsu, S.; Rippin, M. Data-driven track geometry fault localisation using unsupervised machine learning. Constr. Build. Mater. 2023, 377, 131141.
Sresakoolchai, J.; Kaewunruen, S. Railway infrastructure maintenance efficiency improvement using deep reinforcement learning integrated with digital twin based on track geometry and component defects. Sci. Rep. 2023, 13, 2439.
Koohmishi, M.; Kaewunruen, S.; Chang, L.; Guo, Y. Advancing railway track health monitoring: Integrating GPR, InSAR and machine learning for enhanced asset management. Autom. Constr. 2024, 162, 105378.
Wang, C. HTFN: A Hybrid Deep Reinforcement Learning Framework for UAV-Based Railway Inspection With Deepseek Integration. IEEE Access 2025, 13, 196821–196835.
Tang, R.; De Donato, L.; Bešinović, N.; Flammini, F.; Goverde, R.M.P.; Lin, Z.; Liu, R.; Tang, T.; Vittorini, V.; Wang, Z. A literature review of Artificial Intelligence applications in railway systems. Transp. Res. Part C Emerg. Technol. 2022, 140, 103679.
Bhadouria, A.S.; Braga, J.A.P.; Mishra, R.P.; Andrade, A.R. Optimizing maintenance in railway wheelsets from freight wagons: What is the most appropriate lifetime variable? Proc. Inst. Mech. Eng. Part F J. Rail Rapid Transit 2025, 09544097251400096.
Yang, J.; Sun, Y.; Cao, Y.; Hu, X. Predictive Maintenance for Switch Machine Based on Digital Twins. Information 2021, 12, 485.
Bianchi, G.; Freddi, F.; Giuliani, F.; La Placa, A. Implementation of an AI-based predictive structural health monitoring strategy for bonded insulated rail joints using digital twins under varied bolt conditions. Railw. Eng. Sci. 2025, 33, 703–720.
Sahal, R.; Alsamhi, S.H.; Brown, K.N.; O’Shea, D.; McCarthy, C.; Guizani, M. Blockchain-Empowered Digital Twins Collaboration: Smart Transportation Use Case. Machines 2021, 9, 193.
Crespo Marquez, A.; Marcos Alberca, J.A.; Guillén López, A.J.; De la Fuente Carmona, A. Digital twins in condition-based maintenance apps: A case study for train axle bearings. Comput. Ind. 2023, 151, 103980.
Nasim, M.; Rajabifard, A.; Chen, Y.; Samali, B. A demonstration of a digital twin framework for structural health monitoring: Application to bridge infrastructures. J. Infrastruct. Intell. Resil. 2026, 5, 100184.
Torzoni, M.; Tezzele, M.; Mariani, S.; Manzoni, A.; Willcox, K.E. A digital twin framework for civil engineering structures. Comput. Methods Appl. Mech. Eng. 2024, 418, 116584. [Google Scholar] [CrossRef]
Zhao, X.; He, D.; Fang, Z.; Chen, H.; Zhang, Y.; Zeng, L.; Pan, H.; Zhang, Z. Digital twin-enhanced self-powered sensing anomaly detection system for intelligent rail transportation. Nano Energy 2025, 144, 111338. [Google Scholar] [CrossRef]
Ahmad, S.; Spiryagin, M.; Wu, Q.; Bernal, E.; Sun, Y.; Cole, C.; Makin, B. Development of a Digital Twin for prediction of rail surface damage in heavy haul railway operations. Veh. Syst. Dyn. 2024, 62, 41–66. [Google Scholar] [CrossRef]
De Donato, L.; Dirnfeld, R.; Somma, A.; De Benedictis, A.; Flammini, F.; Marrone, S.; Saman Azari, M.; Vittorini, V. Towards AI-assisted digital twins for smart railways: Preliminary guideline and reference architecture. J. Reliab. Intell. Environ. 2023, 9, 303–317. [Google Scholar] [CrossRef]
Kaewunruen, S.; Zhaochen, Z.; Khongsomchit, L.; Lin, Y.-H.; Ndlovu, N.T. BIM-Driven Digital Twin for Climate Change Adaptation and Resilience of Railway Overhead Line System. Sustainability 2026, 18, 1909.
Nagy, R.; Horvát, F.; Fischer, S. Innovative Approaches in Railway Management: Leveraging Big Data and Artificial Intelligence for Predictive Maintenance of Track Geometry. Teh. Vjesn. 2024, 31, 1245–1259.
Braga, J.A.P.; Andrade, A.R. Data-driven decision support system for degrading assets and its application under the perspective of a railway component. Transp. Eng. 2023, 12, 100180.
Consilvio, A.; Di Febbraro, A.; Meo, R.; Sacco, N. Risk-based optimal scheduling of maintenance activities in a railway network. EURO J. Transp. Logist. 2019, 8, 435–465.
Praticò, F.G.; Fedele, R. Exploring the paradigm of railway predictive maintenance. Struct. Infrastruct. Eng. 2025, 1–12.
García-Méndez, S.; de Arriba-Pérez, F.; Leal, F.; Veloso, B.; Malheiro, B.; Burguillo-Rial, J.C. An explainable machine learning framework for railway predictive maintenance using data streams from the metro operator of Portugal. Sci. Rep. 2025, 15, 27495.
Alabintei, D.D.; Attoh-Okine, N. Hybrid Quantum Neural Network and Shapely Additive Explanations in Railway Track Geometry Modeling. ASME J. Risk Uncertain. Eng. Syst. Part B Mech. Eng. 2025, 11, 031206.
Bianchi, G.; Fanelli, C.; Freddi, F.; Giuliani, F.; La Placa, A. Systematic review railway infrastructure monitoring: From classic techniques to predictive maintenance. Adv. Mech. Eng. 2025, 17, 16878132241285631.
Mbubia Tchoua, E.; Tissier, J.; Martin, A.; Fargier, Y.; Ihamouten, A. The use of Ground Penetrating Radar and artificial intelligence for automated railway trackbed stratigraphy and Ballast Fouling assessment. Transp. Eng. 2026, 23, 100415.
Le-Nguyen, M.-H.; Turgis, F.; Fayemi, P.-E.; Bifet, A. Exploring the potentials of online machine learning for predictive maintenance: A case study in the railway industry. Appl. Intell. 2023, 53, 29758–29780.
Hassija, V.; Chamola, V.; Mahapatra, A.; Singal, A.; Goel, D.; Huang, K.; Scardapane, S.; Spinelli, I.; Mahmud, M.; Hussain, A. Interpreting Black-Box Models: A Review on Explainable Artificial Intelligence. Cogn. Comput. 2024, 16, 45–74.
Konecny, J.; Ozana, S.; Choutka, J.; Prauzek, M. Towards railways safety: A systematic review on predictive diagnostics for axle bearings. Measurement 2026, 257, 118510.
Cascino, A.; Meli, E.; Rindi, A. High-Fidelity Finite Element Modelling (FEM) and Dynamic Analysis of a Hybrid Aluminium–Honeycomb Railway Vehicle Carbody. Appl. Sci. 2026, 16, 549.
Cascino, A.; Meli, E.; Rindi, A. Development of a Methodology for Railway Bolster Beam Design Enhancement Using Topological Optimization and Manufacturing Constraints. Eng 2024, 5, 1485–1498.
Cascino, A.; Nencioni, L.; Lanzillo, L.; Mazzeo, F.; Strano, S.; Terzo, M.; Delle Monache, S.; Meli, E. Development and Experimental Validation of a Physics-Based Digital Twin for Railway Freight Wagon Monitoring. Sensors 2026, 26, 643.
Page, M.J.; McKenzie, J.E.; Bossuyt, P.M.; Boutron, I.; Hoffmann, T.C.; Mulrow, C.D.; Shamseer, L.; Tetzlaff, J.M.; Akl, E.A.; Brennan, S.E.; et al. The PRISMA 2020 statement: An updated guideline for reporting systematic reviews. BMJ 2021, 372, n71.
Koutsoupakis, J.; Giagopoulos, D.; Chatziparasidis, I. AI-based condition monitoring on mechanical systems using multibody dynamics models. Eng. Appl. Artif. Intell. 2023, 123, 106467.
Fang, Z.; Kong, L.; Chen, H.; Zhao, X.; Wang, Y.; Fan, C.; Zhang, Z.; Bai, L. An intelligent traffic anomaly detection system based on self-supervised learning and self-powered hybrid nano-sensor. Adv. Eng. Inform. 2025, 66, 103461.
Vračar, L.; Marinković, D.; Stojanović, M.; Milovančević, M. Automated Inspection System with GPS and Deep Learning in Urban Rail Safety and Efficiency. Acta Polytech. Hung. 2025, 22, 9–27.
Wang, S.; Gao, J.; Lin, C.; Li, H.; Huang, Y. Condition assessment of high-speed railway track structure based on sparse Bayesian extreme learning machine and Bayesian hypothesis testing. Int. J. Rail Transp. 2023, 11, 364–388.
Osman, H.; Yacout, S. Condition-based monitoring of the rail wheel using logical analysis of data and ant colony optimization. J. Qual. Maint. Eng. 2022, 29, 377–400.
Zeng, C.; Huang, J.; Wang, H.; Xie, J.; Zhang, Y. Deep Bayesian survival analysis of rail useful lifetime. Eng. Struct. 2023, 295, 116822.
Werbińska-Wojciechowska, S.; Giel, R.; Winiarska, K. Digital Twin Approach for Operation and Maintenance of Transportation System-Systematic Review. Sensors 2024, 24, 6069.
Rodrigues, P.; Teixeira, P.F. Modelling degradation rates of track geometry local defects: Lisbon-Porto line case study. Struct. Infrastruct. Eng. 2024, 20, 867–882.
Prasad, C.; Jamuar, S.S. Optimising Indian railways infrastructure by AI. J. Mob. Multimed. 2021, 17, 157–174.
Sainz-Aja, J.A.; Ferreño, D.; Pombo, J.; Carrascal, I.A.; Casado, J.; Diego, S.; Castro, J. Parametric analysis of railway infrastructure for improved performance and lower life-cycle costs using machine learning techniques. Adv. Eng. Softw. 2023, 175, 103357.
Yudariansyah, H.; Ismiyati, I.; Narendera, A. Prediction Model for Track Quality Index Categories on the Northern and Southern Railway Lines of Java. Period. Polytech. Transp. Eng. 2025, 53, 184–193.
Binder, M.; Mezhuyev, V.; Tschandl, M. Predictive Maintenance for Railway Domain: A Systematic Literature Review. IEEE Eng. Manag. Rev. 2023, 51, 120–140.
H-Nia, S.; Flodin, J.; Casanueva, C.; Asplund, M.; Stichel, S. Predictive maintenance in railway systems: MBS-based wheel and rail life prediction exemplified for the Swedish Iron-Ore line. Veh. Syst. Dyn. 2024, 62, 3–20.
Khalilzadeh, M.; Pamucar, D.; Heidari, A. Reducing Train Delays with Machine Learning-Based Predictive Maintenance for Railways. Decis. Mak. Appl. Manag. Eng. 2025, 8, 265–284.
Gonzalo, A.P.; Horridge, R.; Steele, H.; Stewart, E.; Entezami, M. Review of Data Analytics for Condition Monitoring of Railway Track Geometry. IEEE Trans. Intell. Transp. Syst. 2022, 23, 22737–22754.
Nethamba, L.; Grobbelaar, S. The Development of an Action Priority Matrix and Technology Roadmap for the Implementation of Data-Driven and Machine-Learning-Based Predictive Maintenance in the South African Railway Industry. S. Afr. J. Ind. Eng. 2023, 34, 318–335.

Figure 1. Artificial Intelligence hierarchy.

Figure 2. Flow diagram for resource selection process.

Figure 3. Annual distribution of the collected papers.

Figure 4. Global distribution of the selected papers.

Figure 5. Thematic map of reviewed papers.

Figure 6. Relations between ML types, algorithms and focused components.

Figure 7. Data flow types of the systems used in reviewed papers: (a) unidirectional data flow; (b) bidirectional data flow.

Figure 8. Cyber-physical system control loop.

Table 1. Final Boolean search queries.

Digital Library	Search Term
Web of Science	TS=(rail*) AND TS=(“predictive maintenance” OR “condition-based maintenance” OR “condition based maintenance”) AND (TS=(“digital twin*” OR “building information model*”) OR TS=(“machine learning” OR “artificial intelligence” OR “deep learning” OR “unsupervised learning” OR “reinforcement learning” OR “neural network*”))
Scopus	(TITLE-ABS-KEY (rail*)) AND (TITLE-ABS-KEY (“predictive maintenance” OR “condition-based maintenance” OR “condition based maintenance” OR “preventive maintenance”)) AND (TITLE-ABS-KEY (“digital twin*” OR “building information model*”) OR TITLE-ABS-KEY (“machine learning” OR “artificial intelligence” OR “deep learning” OR “unsupervised learning” OR “reinforcement learning” OR “neural network*”))
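
The two queries in Table 1 share the same three-part structure: a rail term, a maintenance term group, and either a digital-model or an AI term group. As a minimal sketch of how such strings could be assembled from keyword lists, the Python below rebuilds both. The function and variable names are hypothetical and not taken from the study, and straight quotes are used in place of the typographic quotes printed in the table. The term lists themselves are copied from Table 1.

```python
# Keyword groups copied from Table 1; all names below are illustrative only.
MAINTENANCE_TERMS = [
    "predictive maintenance",
    "condition-based maintenance",
    "condition based maintenance",
]
TWIN_TERMS = ["digital twin*", "building information model*"]
AI_TERMS = [
    "machine learning",
    "artificial intelligence",
    "deep learning",
    "unsupervised learning",
    "reinforcement learning",
    "neural network*",
]


def any_of(terms):
    """Join quoted terms with OR, e.g. "a" OR "b"."""
    return " OR ".join(f'"{t}"' for t in terms)


def wos_query():
    """Web of Science form: every group is wrapped in the TS= topic tag."""
    return (
        f"TS=(rail*) AND TS=({any_of(MAINTENANCE_TERMS)}) AND "
        f"(TS=({any_of(TWIN_TERMS)}) OR TS=({any_of(AI_TERMS)}))"
    )


def scopus_query():
    """Scopus form: TITLE-ABS-KEY groups; Table 1 adds one extra maintenance term here."""
    maintenance = MAINTENANCE_TERMS + ["preventive maintenance"]
    return (
        f"(TITLE-ABS-KEY (rail*)) AND (TITLE-ABS-KEY ({any_of(maintenance)})) AND "
        f"(TITLE-ABS-KEY ({any_of(TWIN_TERMS)}) OR TITLE-ABS-KEY ({any_of(AI_TERMS)}))"
    )


if __name__ == "__main__":
    print(wos_query())
    print(scopus_query())
```

Keeping the term groups in one place makes the only difference between the two printed queries easy to see: the Scopus string includes “preventive maintenance”, while the Web of Science string does not.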

Table 2. Comparative effectiveness of dominant AI architectures.

AI Architecture	Primary Asset Target	Advantage
Convolutional Networks (CNNs/GCNs)	Rails, Wheels, Track Surface	Excels at automated spatial feature extraction from complex, unstructured data such as visual images, acoustics, and vibration signals.
Recurrent Networks (LSTMs/RNNs)	Track Geometry, Component RUL	Highly effective at capturing long-term temporal dependencies and modeling continuous degradation trajectories over time.
Random Forest/Decision Trees (DT)	Switches, Station Facilities	Handles tabular, mixed, and imbalanced maintenance logs well; provides high interpretability, which is crucial for infrastructure managers.
Gradient Boosting (XGBoost/CatBoost)	Rail Infrastructure Lifespan, Heavy Haul	Delivers high-precision regression capabilities for complex, multi-source tabular datasets while resisting overfitting.
Support Vector Machines (SVM) & ANN	Machinery Components, Car-Body Vibration	Serves as a robust, highly generalizable baseline that maximizes the margin between fault classes; ideal when deep learning might overfit due to data scarcity.
Autoencoders (AE)	Track Geometry Faults, Machinery Anomalies	Effectively detects anomalies in entirely unlabeled datasets by learning “normal” operational baselines and identifying high reconstruction errors.
Clustering (K-Means/DBSCAN)	Maintenance Limits, Defect Grouping	Groups assets or track sections with similar degradation behaviors, significantly reducing uncertainty in maintenance modeling without requiring human labels.
Reinforcement Learning (A2C/DQN)	Maintenance Scheduling, Rail Renewal	Uniquely capable of optimizing sequential decision-making policies in stochastic environments to maximize long-term rewards (e.g., minimizing lifecycle costs).
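
The clustering row of Table 2 describes grouping track sections with similar degradation behaviour without human labels. The following is a minimal sketch of that idea using plain K-Means (Lloyd's algorithm) on scalar values. It is not the implementation used in any reviewed paper: the function name `kmeans_1d`, the deterministic initialisation, and the eight degradation rates are all assumptions made for illustration, and the data are synthetic.

```python
def kmeans_1d(values, k=2, iterations=20):
    """Plain Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic initialisation (an assumption): spread the starting
    # centroids evenly across the sorted data instead of sampling at random.
    centroids = [ordered[i * (len(ordered) - 1) // max(k - 1, 1)] for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            new_centroids.append(sum(members) / len(members) if members else centroids[c])
        if new_centroids == centroids:
            break  # converged: assignments can no longer change
        centroids = new_centroids
    return centroids, labels


# Synthetic degradation rates for eight hypothetical track sections.
rates = [0.10, 0.12, 0.11, 0.09, 0.45, 0.50, 0.48, 0.52]
centroids, labels = kmeans_1d(rates, k=2)
print(labels)  # slow-degrading sections share one label, fast-degrading the other
```

In practice the inputs would be multi-dimensional features rather than a single rate, and a library implementation with randomised restarts would normally be preferred; the sketch only shows the assign-then-update loop that the table entry refers to.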

Table 3. Summary of economic impacts and quantified cost-benefits.

Category	Source	Applied Activity	Baseline for Comparison	Quantified Benefit
Infrastructure & Track Geometry	[57]	Track Geometry Inspection	Manual/Analog	65–75% cost reduction
Infrastructure & Track Geometry	[56]	Track Maintenance	Sub-optimal alert limits	27–57% cost reduction
Infrastructure & Track Geometry	[62]	Track Maintenance	Field/Historical Data	21% reduction in activities
Infrastructure & Track Geometry	[77]	Track Geometry Maintenance	Preventive	Up to 30% saving
Infrastructure & Track Geometry	[15]	Track Inspection & Schedule	Current Policy	100% saving
Infrastructure & Track Geometry	[51]	Rail Head Wear Detection	Manual Graph Paper	Qualitative
Rolling Stock & Components	[30]	Traction & Braking Systems	Visual Control/Fixed Interval	Staff reduced from 4 to 2
Rolling Stock & Components	[66]	Freight Wagon Wheelsets	Time-based Maintenance	Avoids 12% increase
Rolling Stock & Components	[35]	Transmission/Suspension	Reactive/Run-to-Failure	Qualitative
Rolling Stock & Components	[58]	Machinery/Asset Health	Unexpected Failure	Qualitative
Rolling Stock & Components	[78]	Wheelset Reprofiling	Optimal CBM Policy	1% cost deviation
Rolling Stock & Components	[70]	Axle Bearings Maintenance	Preventive/Corrective	Qualitative
Network Scheduling & Planning	[79]	Network Maintenance Routing	Standard Scheduling	80% work reduction
Network Scheduling & Planning	[27]	Tamping, Grinding, Renewal	Conventional Planning	Qualitative
Network Scheduling & Planning	[36]	Rolling Stock Reliability	Rolling Stock Unavailability	Qualitative
General Frameworks & Safety	[32]	Heavy Haul Rail Lines	Contextual Cost of Failure	$525,000 per broken rail
General Frameworks & Safety	[80]	Deep Tech/Sensing	Reactive Maintenance	Qualitative
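
To make the balance between quantified and purely qualitative entries in Table 3 easier to read, the rows can be tallied per category. The sketch below does this in Python; the row data are transcribed from Table 3, while the names `ROWS` and `is_qualitative` are hypothetical. The counts describe only the entries of this table, not the full set of reviewed papers.

```python
from collections import Counter

# (category, source number, reported benefit), transcribed from Table 3.
ROWS = [
    ("Infrastructure & Track Geometry", 57, "65–75% cost reduction"),
    ("Infrastructure & Track Geometry", 56, "27–57% cost reduction"),
    ("Infrastructure & Track Geometry", 62, "21% reduction in activities"),
    ("Infrastructure & Track Geometry", 77, "Up to 30% saving"),
    ("Infrastructure & Track Geometry", 15, "100% saving"),
    ("Infrastructure & Track Geometry", 51, "Qualitative"),
    ("Rolling Stock & Components", 30, "Staff reduced from 4 to 2"),
    ("Rolling Stock & Components", 66, "Avoids 12% increase"),
    ("Rolling Stock & Components", 35, "Qualitative"),
    ("Rolling Stock & Components", 58, "Qualitative"),
    ("Rolling Stock & Components", 78, "1% cost deviation"),
    ("Rolling Stock & Components", 70, "Qualitative"),
    ("Network Scheduling & Planning", 79, "80% work reduction"),
    ("Network Scheduling & Planning", 27, "Qualitative"),
    ("Network Scheduling & Planning", 36, "Qualitative"),
    ("General Frameworks & Safety", 32, "$525,000 per broken rail"),
    ("General Frameworks & Safety", 80, "Qualitative"),
]


def is_qualitative(benefit):
    """True when the table reports no number for this entry."""
    return benefit.strip().lower() == "qualitative"


total_by_category = Counter(category for category, _, _ in ROWS)
qualitative_by_category = Counter(
    category for category, _, benefit in ROWS if is_qualitative(benefit)
)

for category, total in total_by_category.items():
    quantified = total - qualitative_by_category[category]
    print(f"{category}: {quantified} of {total} entries quantified")
```

Read this way, 10 of the 17 entries in Table 3 report a number, and the quantified entries are concentrated in the track geometry category.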

Table 4. SWOT analysis of digitalized railway maintenance.

Strengths (Internal)	Weaknesses (Internal)
Cost Reduction: Proven reductions in operational downtime and maintenance expenditures	The Safety Paradox: Severe lack of “run-to-failure” data due to strict safety regulations, hindering AI training
Asset Longevity: Maximizes the RUL of physical infrastructure, preventing premature renewals	The “Black-Box” Nature: Lack of interpretability in advanced AI models limits trust among safety certifiers
Automated Accuracy: Deep Learning (e.g., CNNs) provides highly accurate, automated defect detection from unstructured data	Computational Constraints: High latency and hardware requirements limit real-time edge deployment
Opportunities (External)	Threats (External)
Synthetic Data & Physics-Informed ML: Using Digital Twins and Multibody Dynamics to safely simulate failure data	Cybersecurity Risks: Bidirectional Digital Twins acting on critical national infrastructure face severe vulnerability to cyberattacks
Standardization (IFC): Using open-source BIM standards to enable interoperability across platforms	Vendor Lock-In: Over-reliance on proprietary, closed-source software ecosystems limits long-term 50+ year asset management
Green KPIs: Integrating environmental cost functions to objectively validate sustainability and carbon reduction	Harsh Environments: Extreme weather and mechanical vibrations severely disrupt real-world sensor and UAV data quality

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
