1. Introduction
Efficient and reliable train dwell time management plays a critical role in promoting sustainable railway operations by improving timetable stability, reducing energy-intensive idling, and enhancing passenger satisfaction. Understanding how dwell time behaves under regional operational constraints supports the optimisation of service schedules and resource allocation, both key pillars of sustainable transport systems [1]. Accurate prediction and management of dwell times are essential for optimising train schedules, ensuring smooth passenger flow, and enhancing the overall performance and service sustainability of transportation networks.
Historically, the academic and practical pursuit of comprehending and forecasting dwell times has led to the development of various models. These models have generally fallen into one of three categories: statistical-based models, simulation models, and so-called ‘advanced’ models [
1]. Statistical-based models are deemed helpful in comprehending the relationship between various variables and factors and can be generically applied to all appropriate stations on a rail network. Simulation-based models help comprehend passenger flow dynamics during dwell time for specific environments and circumstances, such as a specific station design and train type. The ‘advanced’ models aim to impart new technologies and complex thought processes to dwell time modelling, such as digital twins, fuzzy logic approach, and machine learning methods. All three categories of models have predominantly been developed for and based on metro or suburban types of railways, and it is relatively unknown how well dwell time models suit other types of railways, specifically regional railways.
Over the past two decades, the demand for regional rail travel has expanded markedly worldwide. In the United Kingdom, for example, regional rail patronage rose by approximately 42% between 2002/03 and 2008/09, representing an average annual growth rate of around 6% [
2]. This upward trend continued, reaching a 66% increase by 2014/15, accompanied by rises of 74% in passenger kilometres and 151% in passenger revenue over the same period [
3]. In the Australian context, regional rail has also experienced significant growth. In Victoria, annual regional passenger trips increased from 6.7 million in 2006 to 17.9 million in 2017, marking an overall expansion of 167% [
4]. Following the opening of the Regional Rail Link (RRL) in 2015, the Geelong line, Victoria’s busiest regional corridor, recorded a further 80% increase in ridership [
5]. Meanwhile, New South Wales maintained a considerably larger passenger base, carrying over 40 million regional rail journeys annually between 2017 and 2019, prior to the COVID-19 pandemic [
6].
Unlike urban railways, regional rail systems are characterised by single-track operations, longer interstation distances, and mixed-traffic interfaces with freight and suburban services. These factors introduce variability in dwell time through procedural, signalling, and crew coordination effects not typically captured in urban dwell models. For instance, single-track sections and junction scheduling introduce operational buffers that affect total dwell beyond passenger-related factors [
7,
8]. Our previous study [
9] utilised CCTV data from regional stations in Australia and identified unique operational patterns, including the “blinded phenomenon” affecting conductors during afternoon peaks. Recognising these distinctions is crucial for understanding why traditional dwell models, developed for metro networks, may underperform in regional contexts.
This study focuses on the regional railway network of Victoria, Australia, which serves as the basis for defining a regional rail system (see
Figure 1 for the network layout, adapted from [
9]). Approximately 70–95% of Victorian regional passengers travel between major provincial centres, such as Geelong, Ballarat, Bendigo, Seymour, and Traralgon, and Melbourne’s central business district (CBD). These regional cities are located roughly 70–165 kilometres from Melbourne and are connected through a series of intermediate regional stations (e.g., Ballan) and peri-urban stations (e.g., Rockbank). During peak periods, these stations typically experience 6–7 train services per hour, compared with 1.5–3 services per hour during off-peak times. Several lines also extend to outer destinations such as Warrnambool, Ararat, Maryborough, Swan Hill, Echuca, Shepparton, and Bairnsdale, although these operate with only 2–5 services daily. Despite offering fewer services than metropolitan networks, passenger behaviour and its influence on dwell times remain crucial because regional lines operate within constrained infrastructure environments—including single-track sections, flat junctions, and shared corridors with suburban and metropolitan services. For instance, the Geelong, Ballarat, and Bendigo corridors converge at Sunshine Station, approximately 13 kilometres from the Melbourne CBD, funnelling 18–19 regional trains per hour during peak periods. In such a setting, unplanned or extended dwell times can disrupt carefully timed paths through junctions, passing loops, or shared suburban tracks, causing cascading delays across both regional and metropolitan services.
The primary aim of this study is to evaluate the transferability and contextual reliability of established statistical dwell-time models within regional railway operations. We hypothesise that established statistical dwell time models, while effective for predicting passenger flow, will be insufficient for predicting total dwell time in regional settings due to significant and variable operational overheads. While existing research has largely focused on densely populated metropolitan systems, limited evidence exists on how these well-known analytical models perform under the operational and behavioural conditions unique to regional railways. Rather than seeking to introduce new theoretical paradigms or models, this study deliberately positions itself as a foundational contribution that validates and contextualises existing theories for non-urban settings. Such evidence-based validation is essential before any generalised theoretical framework can be extended across different rail typologies, as transferability without empirical grounding risks misrepresentation of regional operating realities [
9,
10]. Therefore, the originality of this paper lies in bridging the gap between theoretical constructs and applied operational environments—offering an empirically verified baseline that future, more theory-oriented research can build upon. In this way, the study contributes to methodological advancement through contextual robustness rather than through abstraction alone, aligning with calls for practical, evidence-driven extensions of public transport modelling theory [
10,
11,
12].
2. Related Works
In the realm of rail operations research, statistical dwell-time models play a pivotal role in enhancing the operational efficiency of rail systems. These models, which are fundamentally derived from multivariate regression analyses, can be categorised as either linear or non-linear. The evolution of dwell time models has seen significant contributions from various scholars, including Wirasinghe and Szplett [
13], Weston [
14], Lin and Wilson [
15], Lam et al. [
16], Puong [
17], and Douglas [
18]. A generalised representation of these models can be expressed as follows:
$$t = c + \beta\, b + \alpha\, a$$
where dwell time ($t$) is a function of passengers boarding ($b$) and alighting ($a$); the constant ($c$) represents a fixed allowance for operational and mechanical time; and the constants ($\beta$) and ($\alpha$) represent the boarding and alighting flow factors, respectively.
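As a minimal sketch of the generalised linear form described above, the function below adds a fixed operational time to boarding and alighting counts weighted by their flow factors. The function name, parameter names, and numeric values are illustrative assumptions and are not taken from any of the cited models.

```python
def linear_dwell_time(boarding, alighting, fixed_time,
                      boarding_factor, alighting_factor):
    """Generic linear dwell-time estimate in seconds.

    fixed_time       -- fixed operational and mechanical time (s)
    boarding_factor  -- seconds added per boarding passenger
    alighting_factor -- seconds added per alighting passenger
    """
    return fixed_time + boarding_factor * boarding + alighting_factor * alighting


# Illustrative values only (assumed, not calibrated to any dataset).
estimate = linear_dwell_time(boarding=8, alighting=5, fixed_time=10.0,
                             boarding_factor=2.0, alighting_factor=1.5)
print(estimate)  # 10 + 16 + 7.5 = 33.5
```

Non-linear variants replace the two weighted counts with other functions of the same inputs, while the fixed term plays the same role.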
The inception of these investigations by Wirasinghe and Szplett [
13] introduced a model that underscored the average passenger demand for boarding and alighting, coupled with a fixed operational time component. Subsequent developments by Weston [
14] tailored a non-linear model specific to the London Underground, incorporating variables such as the number of double-door-width doors on the train, the peak door factor, and the existing passenger count, aiming to account for the unequal distribution of passengers across the train. Lin and Wilson [
15] further diversified the landscape by including the number of car sets in their analysis, based on data from the Massachusetts Bay Transportation Authority (MBTA) Green Line.
The studies by Lam et al. [
16] and Puong [
17], although distinct in their approach, shared a common foundation in examining the impact of boarding and alighting passenger numbers. However, Puong’s [
17] model notably diverged by considering the per-door perspective and the influence of through-standing passengers on dwell time. This line of inquiry was further refined by Douglas [
18], who, while building upon Puong’s [
17] model, introduced modifications tailored to the Sydney suburban rail network. Douglas’s [
18] research not only accounted for the base variables of earlier models but also incorporated additional factors to better reflect the passenger behaviours encountered on the Sydney suburban system.
In addition to these developments, Douglas [
18] also highlighted a model devised for the Thameslink and Crossrail projects by John Rosser and Peter Howarth. This model, which utilised Weston’s work as a foundational concept, focuses on the intricacies of passenger boarding and alighting dynamics without delving into door operation times, thus presenting a specialised approach for specific rail projects. The academic community has shown a particular interest in the Weston [
14] model for its comprehensive and adaptable nature. Notably, studies by Harris [
19] and Harris and Anderson [
20] have underscored its applicability across a wide range of conditions and locations, despite identifying some overestimations in scenarios of high passenger volume.
More recently, research has advanced toward machine learning (ML), hybrid, and simulation-based dwell-time models that leverage large-scale, sensor-derived datasets. Simulation models, covered here for broader context, are useful for capturing cascading delays and passenger interactions in rail operations. Studies by Zhang et al. [
21] and Jiang et al. [
22] used micro and macro-simulation models to analyse passenger movement and the relationship between train and passenger delays, respectively. Yamamura and Inagi [
23] developed a multi-agent model to estimate train dwell times considering congestion. Jiang et al. [
24] introduced a time-driven micro-simulation model to optimise dwell times on Shanghai’s rail transit Line 8. Perkins et al. [
25] used agent-based modelling to reduce dwell times, finding that a combination of an active passenger information system and designated doors decreased loading times by 7.3%. Ahn et al. [
26] improved passenger satisfaction in Brisbane through an agent-based simulation that relayed carriage occupancy levels, leading to more evenly distributed occupancy and reduced crowding. Simulation models are effective for localised studies but can be time-intensive for broader network outcomes.
There are also those ‘advanced’ models that incorporate methods like ML, fuzzy logic, probabilistic/stochastic approaches, and digital twins. Alvarez et al. [
27] used a fuzzy-logic-based AI method to estimate dwell times at metro stations in Panama City, accounting for passenger preferences, yielding accurate approximations of actual dwell times. Wen-jun et al. [
28] proposed an extreme learning machine (ELM) neural network model to analyse and model factors influencing urban rail dwell times in Beijing, outperforming other algorithms and existing statistical models. Glatin and Clarke [
29] conducted a feasibility study for the Rail Safety and Standards Board on a real-time digital twin to reduce dwell-time variations on the UK Thameslink route, focusing on real-time passenger-flow prediction. Coulaud et al. [
10] developed a hybrid approach using machine learning to create statistical models for the Paris Metro, validating their approach with extensive railway operations and passenger flow data.
A hybrid data-driven approach combining deep reinforcement learning (Proximal Policy Optimisation) and machine learning was proposed in [
12] to optimise train trajectory reconstruction under service interruptions. Applied to the Wuhan–Guangzhou high-speed railway, the method achieved over 12% improvement in timetable rescheduling and a 20% reduction in train delays compared with conventional control decisions, demonstrating its effectiveness and practical applicability. A simulation-based Digital Twin (DT) prototype was developed by Padovano et al. [
30] for a major Italian railway station to enhance operational control. The case study demonstrates the DT’s effectiveness in synchronising virtual and physical systems, integrating diverse data sources, and achieving a practical balance between accuracy, performance, and scalability, thereby bridging the gap between theoretical models and real-world transport hub management. A study by Bapaume et al. [
31] introduces a computer vision–based deep learning framework for real-time prediction of passenger loads and train headways in urban metro systems, formulating the task as an image completion problem. Using three years of data from Paris Metro Line 9, France, the research compares several architectures, including transformer-based models, and demonstrates the framework’s robustness across both typical and atypical operating conditions, such as strikes and disruptions. Pritchard et al. [
32] introduced a data-driven approach proposing a novel metric referred to as excess probability of delays to quantify how specific factors affect train delays, using two years of Swedish railway data. Results show that train meets and passes, particularly on single-track lines, substantially increase dwell-time delays, contributing roughly 4% of total dwell-time delays and highlighting opportunities for targeted operational improvements.
Through review of speed optimisation and dwell time control as measures for intelligent traffic management in rail-bound public transportation systems, Abrecht et al. [
33] introduced operational target points and windows to improve throughput and reduce energy consumption. Case studies across three different rail-bound passenger systems in Germany demonstrate the potential benefits of these strategies, which can be implemented via Driver Advisory Systems (DAS) or Automatic Train Operation (ATO). A study by Kecman and Goverde [
34] develops and compares data-driven models for estimating running and dwell times in railway traffic, using high-granularity historical track-occupation data. Both global models (robust linear regression, regression trees, and random forests) and refined local models are evaluated, with local models showing the best performance in terms of accuracy and computational efficiency, demonstrating their applicability for real-time railway operation.
A peer-to-peer train rescheduling system using Genetic Algorithm–based local search and negotiation protocols, tested on a UK railway bottleneck, shows faster computation and comparable optimality to centralised approaches [
35]. A multi-agent system for real-time train rescheduling, which decomposes the network into single-junction levels and utilises a Condorcet voting–based collaboration mechanism [
36], demonstrates a 34% increase in line capacity compared with conventional methods on a UK railway network. In another study in the UK, a reinforcement learning–based Q-learning approach with a tiered reward mechanism for very-short-term train rescheduling in a bi-directional single-track corridor was proposed [
37], demonstrating improved solution quality, computational efficiency, and knowledge reusability compared with existing methods. A safety-oriented, origin-destination-based, time-dependent fare optimisation model, solved using an Iterated Local Search algorithm and applied to the Beijing Metro Batong Line in China, effectively reduced passenger accumulation risk and improved safety on overcrowded metro lines [38].
For readers interested in state-of-the-art modelling advancements, several recent reviews [
39,
40,
41] provide detailed discussions of the integration of digital twins, machine learning, and data-driven methods in railway operations and dwell-time analysis.
This extensive body of work on dwell time models not only exemplifies the depth of research dedicated to understanding and optimising rail system operations but also highlights the continuous need for model evolution to adapt to the nuanced demands of different rail systems and passenger behaviours worldwide.
3. Methodology
3.1. Data Collection and Locations
To obtain essential data on passenger boarding and alighting times (passenger flow time), door preferences, and total dwell times for regional trains, video-based observations from CCTV footage at Cobblebank and Rockbank stations, Victoria, Australia, were used. These stations were selected for their high-quality CCTV footage, capable of clearly distinguishing passenger movements in and out of the train and capturing the entire train length. The selection followed extensive discussions with the rail operator, who provided a sample of CCTV footage from various stations. The use of two stations—Cobblebank and Rockbank—was based on their representativeness of mid-tier regional stops with distinct passenger access configurations and consistent CCTV visibility, which ensured high data quality. This follows best practice in exploratory validation studies where observational precision outweighs geographic breadth [
42]. Future research can extend this dataset to additional stations and longer durations as higher-fidelity video and sensor data become available. The stations’ location with respect to the broader network is shown in
Figure 1. It is worth noting that a portion of the dataset used in this study was also included in our earlier work [
9]; however, the present study focuses specifically on the evaluation and comparison of model performance.
The configuration of the two stations is illustrated in Figure 2. At Cobblebank, passengers access Platform 1 via a staircase or a long ramp at the Ballarat end, whereas Platform 2 has two open entrances, one leading directly to the car park and another accessible via a staircase or ramp (see Figure 2a). In contrast, Rockbank has a large, central, open entrance on both platforms connecting to the overpass and parking area, which all passengers must use to board or alight (Figure 2b).
3.2. Ethics Consideration
The rail operator granted permission for the use of CCTV footage for research purposes. All extracted footage contained only non-identifiable visual information, ensuring that individual passengers could not be recognised. As the dataset consisted solely of secondary, de-identified material, the university’s ethics committee subsequently granted an exemption from full ethical review.
3.3. Data
CCTV footage was acquired for Cobblebank and Rockbank Stations throughout the day as two sets from Monday, 7 February 2022, to Wednesday, 9 February 2022, and from Monday, 5 September, to Friday, 10 September 2022. These time periods were selected because they were free of significant external influences on train patronage. The Victorian populace had adapted to the “new COVID-normal,” resuming in-person activities where possible.
The surveillance system footage obtained comprised short video files capturing only the periods when trains were at the station to minimise viewing risk. Each station had four camera angles, but no single camera covered all the necessary details. For complete visual coverage of the platform and train doors, videos from all four cameras were synchronised in Movavi Academic 23 (a video editing software) and merged into one composite file to enable thorough review. Data were collected from 398 VLocity train services at Cobblebank and Rockbank Stations. Each train stopped at the relevant marker on the platform depending on whether it was a 3-car or 6-car VLocity. Our sample of 398 services significantly exceeds the sample sizes reported in similar observational studies, such as Oliveira et al. [
42], which used only nine departures at one station.
The train services observed throughout the day were categorised as follows. Morning peak (AMP Up) services, arriving at Southern Cross Station (Melbourne) between 07:00 and 09:00, used Platform 1. Interpeak Up (INP Up) services, arriving at Southern Cross Station (Melbourne) between 09:01 and 15:30, used Platform 1. Interpeak Down (INP Down) services, departing Southern Cross Station (Melbourne) between 09:01 and 15:30, used Platform 2. Afternoon peak (PMP Down) services, departing Southern Cross Station (Melbourne) between 15:30 and 18:30, used Platform 2. Post PM Up (POP Up) services, arriving at Southern Cross Station (Melbourne) after 18:30, used Platform 1. Post PM Down (POP Down) services, departing Southern Cross Station (Melbourne) after 18:30, used Platform 2. Services in the counter-peak direction (i.e., AMP Down and PMP Up) were excluded due to very low service counts and patronage.
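The categorisation above can be sketched as a small lookup on direction and Southern Cross time. This is an illustration under stated assumptions rather than the procedure used in the study: the function name is hypothetical, a time of exactly 15:30 is assumed to fall in the interpeak period (the text lists 15:30 in both the interpeak and afternoon peak ranges), and services before 07:00 are assumed to be uncategorised.

```python
from datetime import time

# Conditions analysed in the study; counter-peak services are excluded.
ANALYSED = {"AMP Up", "INP Up", "INP Down", "PMP Down", "POP Up", "POP Down"}


def classify_service(direction, southern_cross_time):
    """Return the condition label for a service, or None if it is not analysed.

    direction           -- "Up" or "Down"
    southern_cross_time -- arrival time at Southern Cross for Up services,
                           departure time from Southern Cross for Down services
    """
    t = southern_cross_time
    if t < time(7, 0):
        return None            # assumption: no condition defined before 07:00
    if t <= time(9, 0):
        period = "AMP"
    elif t <= time(15, 30):    # assumption: 15:30 exactly counts as interpeak
        period = "INP"
    elif t <= time(18, 30):
        period = "PMP"
    else:
        period = "POP"
    label = f"{period} {direction}"
    return label if label in ANALYSED else None


print(classify_service("Up", time(8, 15)))     # AMP Up
print(classify_service("Down", time(16, 0)))   # PMP Down
print(classify_service("Down", time(8, 15)))   # None (counter-peak, excluded)
```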
The data breakdown and the number of services examined are detailed in
Table 1.
3.4. Model Selection
The most prominent statistical dwell time models in the current literature are those of Wirasinghe and Szplett [13], Weston [14], Lam et al. [16], Puong [17], and Douglas [18], all of which relate to heavy rail and are performance-tested in this study. Together, these five established statistical dwell-time models represent the chronological and methodological progression of regression-based approaches developed over the past four decades. They were selected because they are the most frequently cited and/or operationally applied dwell time formulations in both academic and practitioner literature, each contributing a distinct advancement to the understanding of how passenger boarding and alighting behaviour influences total dwell duration [
1,
43,
44].
While more recent studies have demonstrated the predictive superiority of data-driven and machine learning (ML) methods, such as random forests or neural networks [
10,
12], traditional regression models continue to play a critical role in rail operations research due to their transparency, interpretability, and low data requirements, allowing explicit estimation of how individual factors, such as passenger boarding, alighting activity, and operational constants, affect total dwell time. This feature is particularly valuable for railway planners and policymakers who require clear, quantifiable relationships rather than opaque algorithmic outputs [
45,
46]. In contrast, most ML techniques, while often delivering higher predictive accuracy, function as “black-box” systems, providing limited insight into causal mechanisms or the practical levers available for intervention [
47,
48]. Classical statistical frameworks, such as those five models used in this study, remain indispensable for benchmarking emerging algorithms, validating their outcomes, and explaining the physical and behavioural mechanisms underlying dwell-time variability [
43,
44]. Accordingly, this study examines these foundational models in a regional railway context, not to replicate past work but to test their contextual robustness and delineate the boundaries of their applicability, providing a reference point for future hybrid or ML-based model development.
Wirasinghe and Szplett [13] introduced a series of models, developed from survey observations (unknown time period), distinguishing between scenarios with dominant alighting, mixed flow, and dominant boarding. In the dominant alighting model, the total dwell time (t) is expressed as a linear function of the number of alighting (a) and boarding (b) passengers, with alighting passengers weighted more heavily. This model underscores the disproportionate impact of alighting passengers on dwell times. Conversely, the mixed flow model adopts a more balanced approach, albeit with different coefficients, suggesting a nuanced interaction between boarding and alighting processes. The dominant boarding model mirrors the structure of the dominant alighting model but assumes equal weights for both boarding and alighting, highlighting scenarios where boarding activities predominate. What this paper terms the ‘partial model (Pf)’ for Wirasinghe and Szplett [13] refers to the complete model without the ‘fixed time lost (l)’, which is intended to represent the operational nuances as a fixed constant.
The Weston [14] model, developed from survey observations (unknown time period), offers a complex formulation that incorporates not only the numbers of boarding and alighting passengers but also the train’s seating capacity, the number of through passengers, and the train’s door factors, among others. This comprehensive approach aims to capture the multifaceted dynamics affecting dwell time, accounting for physical constraints and passenger behaviours. What this paper terms the ‘partial model (Pf)’ for Weston [14] is the complete model without the ‘function time’, which likewise aims to represent the operational nuances as a fixed constant. Lam et al. [16] proposed a simpler model, developed from peak-period survey observations, that linearly relates dwell time to the number of boarding and alighting passengers, with a fixed time added to account for operational constants. This model emphasises the direct impact of passenger flow on dwell times, providing a straightforward method for estimation. Again, what this paper terms the ‘partial model (Pf)’ for Lam et al. [16] refers to the complete model without the 10.5 s fixed constant representing operational nuance.
Puong [17] developed a model from peak-period survey observations that focuses on the per-door dynamics of passenger boarding and alighting, considering the number of passengers per door and the cubic impact of standing passengers per door. This model highlights the significance of door-level passenger flows and the non-linear effects of passenger congestion on dwell times. The ‘partial model (Pf)’ for Puong [17], as defined in this paper, refers to the complete model without the 12.22 s fixed constant representing operational nuance. Douglas [18] further explored the non-linear dynamics of passenger interactions at the door level, incorporating both boarding and alighting passengers, as well as the estimated number of standing through-passengers per door. This model, developed from a combination of peak-period survey observations and controlled live simulations, accounts for the complex interplay between different types of passengers and their collective impact on dwell times. The ‘partial model (Pf)’ for Douglas [18], as defined in this paper, refers to the complete model without the 10 s fixed constant representing operational nuance. Both the complete and partial models used for testing in this study are stated in Table 2.
3.5. Performance Testing
The performance testing of the selected models began by segmenting the observed data into distinct ‘conditions’. These conditions corresponded to specific time periods and directions during which the data was recorded. The categories included AMP Up, INP Up, INP Down, PMP Down, POP Up, and POP Down.
For each observation under each condition, relevant recorded data—such as the number of boarders and alighters and the door that was used—was processed through each selected model. The models were evaluated in two distinct states: the partial state, which represented only passenger flow time, and the complete state, which represented total dwell time.
Each model generated predictions for both passenger flow time and total dwell time based on the input data for each condition. These predicted times were then compared against the observed times. The comparisons were visualised through plotting, which provided an intuitive understanding of how each model performed under each condition.
To quantitatively assess the model performance, the coefficient of determination (r²) was calculated. This statistic measured the proportion of variance in the observed times (both passenger flow and total dwell time) that was explained by the model predictions. An r² value close to 1 indicated a high proportion of variance explained by the model, suggesting good predictive accuracy. By combining visual and quantitative analyses, this methodology ensured a comprehensive evaluation of the selected models’ performance across different conditions.
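The per-condition evaluation described above can be sketched as follows. The text does not state which formula was used for r², so this sketch assumes the squared Pearson correlation between observed and predicted times; the alternative definition, one minus the ratio of residual to total sum of squares, can give different values when predictions are systematically biased. The function names and the sample records are hypothetical.

```python
def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted times."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


def r_squared_by_condition(records):
    """Group (condition, observed_s, predicted_s) records and score each group."""
    grouped = {}
    for condition, observed, predicted in records:
        obs, pred = grouped.setdefault(condition, ([], []))
        obs.append(observed)
        pred.append(predicted)
    return {c: r_squared(o, p) for c, (o, p) in grouped.items()}


# Hypothetical observations in seconds, for illustration only.
records = [
    ("AMP Up", 20.0, 18.0), ("AMP Up", 35.0, 30.0), ("AMP Up", 50.0, 47.0),
    ("INP Up", 25.0, 12.0), ("INP Up", 40.0, 15.0), ("INP Up", 30.0, 20.0),
]
for condition, score in r_squared_by_condition(records).items():
    print(condition, round(score, 4))
```

Running the same grouping once with partial-model predictions against observed passenger flow times, and once with complete-model predictions against observed total dwell times, gives the two sets of scores compared in the study.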
5. Discussion
5.1. Performance
The consistently higher r² values for passenger flow time only, compared to total dwell time, across studies suggest that while passenger flow can be predicted with a moderate degree of accuracy, total dwell time is influenced by additional factors not captured by passenger flow models alone. This discrepancy may be attributed to variables such as door operations, operator staff procedures, and other inefficiencies not directly related to the number of passengers.
For instance, in scenarios with high passenger flow, one might expect total dwell time to increase due to the longer time required for passengers to board and alight. However, the lower correlation with total dwell time suggests that the number of passengers is not the sole determinant of dwell time, particularly for a regional railway; operational factors play a significant role as well.
One additional point to be conscious of is that weak r² values in this context should not be completely disregarded, as they can still provide meaningful insights. In the social sciences, correlations of 0.20 to 0.30 are often deemed meaningful due to the complexity of human behaviour [49,50,51]. Therefore, even weak correlations have been identified and discussed to provide a comprehensive view of the data, identifying trends that contribute to understanding passengers’ boarding or alighting behaviours. This approach ensures no significant patterns or insights are overlooked.
5.2. Model Architecture
Given the higher performance of the models by Puong [17] and Douglas [18] in a regional railway context, it is worth considering why this is the case. One approach is to examine where these models were developed and the key variables within them.
The highest-performing models in this study, Puong [17] and Douglas [18], are similar in nature. Both account for the same key variables: the number of alighting and boarding passengers per door and the number of standing passengers per door. One difference is that Douglas [18] applies a power function of 0.7 to the number of boarders and alighters (similar to Weston [14]), as opposed to Puong’s [17] linear function for this component. Douglas [18] also adopts a linear function for standing through passengers that multiplies standing passengers by the combined total of boarding and alighting passengers, as opposed to Puong’s [17] cubic function. The ‘standing through passengers’ component of all models tested was effectively cancelled out because it was assumed to be zero: patronage data showed that the number of people on board at departure from the two study stations was less than the number of seats provided, and it was assumed that all passengers would have gravitated to a free seat.
Another difference between the models of Puong [17] and Douglas [18] is the magnitude of the flow rate factors used to estimate the time required for boarding and alighting. Puong [17] found in their dataset that boarders were slower than alighters, which explains the flow rate factors of 2.27 and 1.82 applied to boarders and alighters, respectively. Douglas [18], by contrast, found that alighters were slower than boarders, which explains the flow rate factors of 1.9 and 1.4 applied to alighters and boarders, respectively. It is also important to note that both models were developed from peak-period datasets.
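To make the structural difference concrete, the passenger flow terms described above can be sketched as follows. This is a minimal illustration, not either author's published equation. It uses only the flow rate factors and the 0.7 exponent quoted above, and omits each model's constant and standing-passenger term (the latter was zero in this study). It also assumes that the exponent is applied to boarders and alighters separately, which the description above does not confirm. The function names and example passenger counts are hypothetical.

```python
def puong_style_flow_time(boarders_per_door: float, alighters_per_door: float) -> float:
    """Linear passenger flow term using the factors quoted for Puong [17]."""
    # 2.27 s per boarder and 1.82 s per alighter (boarders slower than alighters).
    return 2.27 * boarders_per_door + 1.82 * alighters_per_door


def douglas_style_flow_time(
    boarders_per_door: float,
    alighters_per_door: float,
    exponent: float = 0.7,
) -> float:
    """Power-function passenger flow term using the factors quoted for Douglas [18]."""
    # ASSUMPTION: the 0.7 exponent is applied to each term separately.
    # 1.4 for boarders and 1.9 for alighters (alighters slower than boarders).
    return (
        1.4 * boarders_per_door ** exponent
        + 1.9 * alighters_per_door ** exponent
    )


if __name__ == "__main__":
    # Hypothetical per-door passenger counts, for illustration only.
    for boarders, alighters in [(2, 1), (10, 5), (20, 10)]:
        linear = puong_style_flow_time(boarders, alighters)
        power = douglas_style_flow_time(boarders, alighters)
        print(f"B={boarders:>2} A={alighters:>2}  linear={linear:6.2f} s  power={power:6.2f} s")
```

Under these assumptions the power form grows more slowly with passenger numbers than the linear form, so the two terms diverge most at busy doors. Because the constants are omitted, the outputs are not full dwell time estimates and should not be compared with the observed values.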
As this study found that the average passenger flow rates in the AMP Up and PMP Down were 1.99 and 1.42 s per passenger, respectively, peak-period passenger behaviour appears to align more closely with the dataset of Puong [17], where boarders were slower than alighters. Interestingly, the flow rates found in this study were essentially the inverse of those of Douglas [18]: the boarding rate found here was very close to the alighting rate reported by Douglas [18], and vice versa. A potential reason lies in the rolling stock used in each study. Puong [17] collected data from rolling stock with single-stream doors, whereas Douglas [18] collected data from the Millennium train sets, where each carriage had large double-stream doors. The VLocity train, from which the data in this study were collected, has single-stream doors and therefore aligns most closely with the train type used by Puong [17]. This may explain why the passenger behaviour observed in the PMP Down was more similar to that of Puong [17] than to that of Douglas [18], and in turn the slightly better passenger flow r² fit of the Puong [17] model (0.8553) compared with that of Douglas [18] (0.8124).
The condition that drew the poorest performance from the models by Puong [17] and Douglas [18] was the INP Up, where the models achieved r² values of only 0.2182 and 0.2302, respectively. A possible explanation for such a poor fit is that the INP Up saw the slowest average flow rate of all conditions (by at least a full second on average), with passengers boarding at an average rate of 3.63 s per passenger. This may indicate that passengers who travel between the peaks on a regional railway (e.g., tourists or occasional travellers) are less confident about the experience and require more time to board. This slower flow rate can therefore have a significant impact on dwell time estimates from known statistical models, as none of the models was developed using ‘off-peak’ data that capture ‘less confident’ travellers and the slower flow rates they may require.
5.3. Implications for Dwell Time Modelling
The prolonged and variable conductor times (C4) observed in this study are likely a direct outcome of the procedural and coordination requirements unique to regional rail operations. Unlike metropolitan systems, regional lines often involve single-track sections, flat junctions, and shared corridors with freight and suburban services, which necessitate additional crew communication, safety checks, and dispatch confirmation before train departure. These operational complexities, sometimes compounded by “blinded sections” or restricted sightlines, contribute to greater variability in conductor procedure time. This finding reinforces the need for dwell time models that explicitly account for regional operational characteristics, rather than relying solely on urban-based formulations.
The results of this study draw attention to three aspects of dwell time modelling for regional railways. The first is that the passenger flow time component of models from previous studies, taken on its own, can be appropriately applied to regional railways. The second is the need to better understand the time lost to operational factors. The third is how differing passenger flow rates and peak door percentages can affect the performance of dwell time models.
5.3.1. Passenger Flow Compatibility
The high correlations between observed and modelled passenger (PAX) flow times indicate that the models from previous studies are robust and can also be effectively applied to a regional railway. This suggests that the assumptions used to develop the passenger flow component of models in previous research, such as those by Wirasinghe and Szplett [13] and Douglas [18], also hold for a regional railway. This transferability is beneficial for a regional rail operator, who could use these existing models to forecast passenger flow time and its consequences for dwell time.
5.3.2. Time Lost to Operation Factors
The weaker correlations for total dwell time predictions across the studies highlight the need for a deeper understanding of the time lost to operational factors. Factors such as door operations (i.e., how long it takes the doors to open and close) are generally fixed for each train type, and this lost time can appropriately be treated as a fixed constant in models. Operational staff procedures that must be completed for both train and platform can also significantly affect the time lost, but these vary, as seen in the observed results in Table 3. This variability needs to be better understood to inform dwell time modelling. From a rail planning perspective, operators who use the available dwell time models may need to account for it quantitatively based on local network procedures.
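One way an operator might account for this quantitatively is to treat door operation as a fixed constant per train type and to derive the staff procedure allowance from locally observed times. The sketch below is illustrative only. The function name, the choice of the mean plus a weighted standard deviation as the allowance, and all numeric inputs are hypothetical and are not taken from Table 3.

```python
from statistics import mean, stdev
from typing import Sequence


def dwell_allowance(
    passenger_flow_s: float,
    door_operation_s: float,
    procedure_samples_s: Sequence[float],
    spread_weight: float = 1.0,
) -> float:
    """Combine modelled passenger flow time with operational lost time.

    passenger_flow_s:    output of any passenger flow model (seconds)
    door_operation_s:    fixed door open/close time for the train type (seconds)
    procedure_samples_s: locally observed staff procedure times (seconds)
    spread_weight:       how many standard deviations to add as a buffer
                         (ASSUMPTION: a planner-chosen value, not from the study)
    """
    if len(procedure_samples_s) < 2:
        raise ValueError("need at least two observed procedure times")
    # Allowance grows with both the typical procedure time and its variability.
    procedure_allowance = mean(procedure_samples_s) + spread_weight * stdev(procedure_samples_s)
    return passenger_flow_s + door_operation_s + procedure_allowance


if __name__ == "__main__":
    # Hypothetical observations of staff procedure time at one station.
    observed = [10.0, 20.0, 30.0]
    print(dwell_allowance(passenger_flow_s=25.0, door_operation_s=8.0, procedure_samples_s=observed))
```

Setting `spread_weight` to zero reduces the allowance to the average procedure time, which may suit a nominal dwell value, whereas a larger weight gives a more conservative timetable buffer.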
5.3.3. Flow Rates and Peak Door Percentage
Based on the findings of this study, passenger flow rates in the peak period and direction (see Figure 3, AMP U, PMP D) were the fastest compared to the off-peak findings. ‘The peak door percentage of total passengers per service’ was also lowest in the peak period and direction for both 3-car (see Figure 4, AMP U, PMP D) and 6-car (see Figure 5, AMP U, PMP D) VLocity trains, indicating that passengers in the peak periods were more inclined to ‘spread out’ when boarding or alighting.
These findings can be explained by the rationale that, during peak periods, passengers are typically commuters who are likely to be more confident in the station/train environment and to exhibit more efficient boarding and alighting behaviours than off-peak passengers. In the off-peak periods, a higher proportion of passengers can generally be expected to be less confident and efficient in the station/train environment [52], as off-peak travel is more typically associated with family/group travel, occasional travellers, tourists, or the elderly (i.e., fewer commuters) [53]. This is consistent with the findings of slower passenger flow rates and a higher percentage of passengers boarding through the peak door (e.g., a family or group of four friends would nearly always board through the same door to stay together rather than each finding a separate door).
It can, therefore, be suggested that passenger behaviour, and potentially passenger demographics (e.g., commuter, occasional traveller, elderly), can vary significantly across different time periods on a regional railway. This affects the performance of dwell time models (as seen in Figure 20) if the variations in both flow rates and peak door use are not considered together, so using one of the tested dwell time models to estimate dwell times for the entire day can produce results of varying accuracy. Timetable planners should consider using different dwell time values for peak vs. off-peak services to account for these behavioural differences.
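The combined effect of flow rate and peak door use can be illustrated with simple arithmetic. The sketch below approximates the passenger flow time at the busiest door as the number of passengers using that door multiplied by the average time per passenger. The flow rates of 1.99 s (AMP Up) and 3.63 s (INP Up) are the averages reported in Section 5.2. The function name, the passenger count, and the peak door shares are hypothetical values chosen only to show the direction of the effect.

```python
def peak_door_flow_time(
    total_passengers: int,
    peak_door_share: float,
    seconds_per_passenger: float,
) -> float:
    """Approximate passenger flow time (s) at the busiest door of a service."""
    if not 0.0 <= peak_door_share <= 1.0:
        raise ValueError("peak_door_share must be a fraction between 0 and 1")
    # Passengers using the busiest door, times the average time each one needs.
    return total_passengers * peak_door_share * seconds_per_passenger


if __name__ == "__main__":
    passengers = 20  # hypothetical boarders on one service

    # Peak: faster flow rate, passengers spread across doors (share is hypothetical).
    peak = peak_door_flow_time(passengers, peak_door_share=0.25, seconds_per_passenger=1.99)

    # Inter-peak: slower flow rate, passengers cluster at one door (share is hypothetical).
    inter_peak = peak_door_flow_time(passengers, peak_door_share=0.40, seconds_per_passenger=3.63)

    print(f"peak: {peak:.2f} s, inter-peak: {inter_peak:.2f} s")
```

For the same number of passengers, a slower flow rate and a more concentrated door choice both lengthen the time at the critical door, which is why a single all-day dwell value can misestimate one period or the other.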
5.4. Policy Implications
The findings have several important policy and operational implications for regional railway management. First, the strong and variable influence of C4 time (conductor procedures) suggests a need for standardised staff training and operational protocols across stations to reduce inconsistency in departure processes. Establishing clearer procedural guidelines and monitoring compliance could help minimise avoidable dwell time extensions.
Second, the results support the development of evidence-based timetable policies that incorporate more realistic dwell time allowances reflecting regional operational complexity, including mixed-traffic conditions, single-track operations, and manual dispatch procedures. Adjusting dwell buffers to match observed variability can enhance service punctuality and minimise cascading delays.
Third, policymakers should consider targeted infrastructure and technology investments, such as door-automation improvements, real-time dwell monitoring systems, or digital dispatch aids, to reduce dependence on manual coordination. These measures can strengthen both efficiency and safety without requiring large-scale infrastructure expansion.
Finally, the methodology used in this study demonstrates the value of micro-level video analysis as a policy tool for evaluating on-ground operational performance. Expanding this data-driven approach to more stations could enable continuous dwell time auditing and support more adaptive, regionally tailored railway policies.
5.5. Limitations
This study has several limitations that should be acknowledged. Firstly, the geographic and operational scope was confined to two stations, Cobblebank and Rockbank, within the Victorian regional railway network. These stations were selected based on the quality and suitability of available CCTV footage; however, they may not capture the broader diversity of infrastructure, operational conditions, or passenger demographics that exists across other regional stations. As such, the results may not be entirely generalisable to other regional railways. Although this study focused on two representative intermediate stations along the Ballarat line, the findings provide an important benchmark for understanding dwell dynamics in regional contexts. By systematically decomposing 398 train services into passenger flow and operational components, the study offers transferable methodological insights for other regional rail networks. The results emphasise that model reconstruction and system-wide extrapolation should account for regional heterogeneity in track layout, crew procedures, and scheduling policies. The relatively low explanatory power of the complete model likely reflects unobserved influences from infrastructure and procedural variables not captured in the dataset, such as the speed of door opening, the length of platform announcements, and crew coordination during departure authorisation. While these variables could further explain the variability within the C4 component, they were beyond the scope of this study due to data and access limitations. Future research should seek to disaggregate C4 into regulatory (“mandatory”) and behavioural (“discretionary”) subcomponents and test their effects using multi-level or random-effects models when detailed operational data are available. These findings thus provide an empirical basis for future research aimed at tailoring statistical or hybrid models to regional operating environments.
Secondly, the coverage of the data was limited to two observation periods (February and September 2022), spanning approximately ten days in total. Although these timeframes were chosen to avoid abnormal patterns in patronage (e.g., public holidays or major disruptions), they did not account for potential seasonal fluctuations or other temporal variations in passenger behaviour. This constraint may have limited the comprehensiveness of the findings from a ‘model responsiveness’ perspective; however, this study is intended more to inform high-level timetable planning, where it is important to establish the ‘normal’ expected dwell time at a station.
Furthermore, counter-peak services were excluded due to their low service frequencies and patronage, which meant that this study did not explore model performance under conditions of minimal boarding or alighting. Likewise, the study did not consider scenarios involving high service disruption, crowd surges, or emergency operational conditions, which could be relevant for resilience planning in regional rail networks.
The study also focused exclusively on established statistical dwell-time models and did not evaluate simulation-based or machine learning approaches. While this was appropriate given the study’s objectives, it limited the exploration of more computational or adaptive modelling methods that may be better suited to capturing the potentially more complex, non-linear nature of regional rail operations.
Given these limitations, the authors reiterate that this study is intended as a foundational piece that uses empirical evidence to highlight the limited ability of the tested models to accurately estimate dwell times at two regional stations. This study does not claim that the empirical findings presented here can be generalised to all regional railway stations, as the data scope was intentionally bounded to ensure data quality and control for local operational variability. Rather than representing a limitation, this focused design provides a robust empirical foundation for understanding model performance in a representative regional context. The findings should be interpreted as an evidence-based prompt for further exploration of dwell-time modelling in regional rail systems, where empirical studies remain limited. Building on this foundation, future research should broaden the spatial and temporal scope by incorporating more stations, larger sample sizes, and varying operational conditions, as well as by developing and testing new or hybrid models specifically calibrated for regional railway environments.
6. Conclusions
This study rigorously tested the performance of established statistical dwell time models against empirical data from a regional railway system, offering novel insights into the accuracy and reliability of these models in non-urban contexts. The findings highlighted several critical aspects of dwell time modelling for regional railways.
Firstly, the strong correlations between observed and modelled passenger flow times indicated that existing models, particularly those developed by Puong [
17] and Douglas [
18], were robust and could be effectively applied to regional railways. This transferability suggested that regional rail operators could utilise these models to forecast passenger flow times reliably.
Secondly, the weaker correlations for total dwell time predictions emphasised the need for a deeper understanding of operational factors that influenced dwell time. Although some models reached statistical significance, error metrics were notably higher, highlighting the challenges of predicting total dwell time. Variability in door operation times and staff procedures significantly impacted dwell time but was not adequately captured in existing models. This variability underscored the necessity for regional rail operators to account for these factors quantitatively based on local network procedures.
Thirdly, the study found that passenger flow rates and the percentage of passengers using peak doors varied significantly between peak and off-peak periods. During peak periods, passengers were typically more efficient in boarding and alighting, whereas off-peak periods saw slower flow rates due to potentially less confident travellers, such as tourists and occasional passengers. These behavioural differences affected the performance of dwell-time models and needed to be accounted for to ensure accurate predictions across different time periods.
Policymakers and rail operators should address several critical implications highlighted in this study. For instance, incorporating local operational variables, such as door operation times and staff procedures, into predictive models is essential, as current dwell time models fail to adequately account for the impact of regional-specific factors. Policy measures should prioritise integrating these variables to enhance the accuracy and reliability of predictions. Additionally, the variability in passenger behaviour between peak and off-peak periods necessitates differentiated operational strategies. Developing dynamic dwell-time models that consider these temporal variations would enable more efficient resource allocation and scheduling. Furthermore, refining dwell time predictions requires comprehensive data collection systems that capture detailed operational and passenger flow variables. Policymakers should invest in advanced monitoring technologies and data analytics to continuously update and improve these models, ensuring robust and adaptive solutions to evolving operational challenges. Since staff procedures significantly impact dwell times, policies should advocate for regular training and standardised protocols tailored to the regional context. This approach can streamline operations and reduce variability in dwell times. Recognising differences in passenger demographics and behaviours during various periods, policies should promote passenger-centric solutions, such as targeted information campaigns and improved station design, to facilitate smoother boarding and alighting processes.
It is important to note that this study was based on two representative intermediate stations along a single corridor, chosen for data quality and operational uniformity. Accordingly, the external validity of the findings is limited to similar regional contexts. The insights gained, however, provide an essential benchmark for extending model calibration to other station types and operational configurations.
Overall, this research contributed to the enhancement of scheduling and operational planning for policymakers and the regional rail services they represent by identifying strengths and limitations in existing models and the potential for necessary refinements. It serves as a foundation for future research into dwell time modelling for regional railways. The findings reinforce the importance of context-aware modelling for sustainable railway planning. By revealing the limitations of urban-derived dwell time models and identifying region-specific operational influences, the research underscores the need for adaptive, data-informed approaches that enhance service reliability and energy efficiency. Extending such modelling efforts across broader regional networks can support more sustainable mobility outcomes by reducing delays, optimising resource use, and improving passenger experience, aligning with the global agenda for low-carbon and equitable transport systems. Future work should focus on integrating the variability of operational factors and passenger behaviours into dwell-time models to improve their predictive accuracy on regional railways. In particular, future research should extend and refine this work in the following directions:
Develop correction factors or sub-models, including multi-level or random-effects formulations, that explicitly account for operational delay components and infrastructure or regulatory factors, particularly C4 time (conductor procedures), and are tailored to regional railway conditions.
Expand data collection to include a broader range of stations with diverse infrastructure, service frequency, and operational constraints, enabling broader model validation and generalisation.
Investigate hybrid modelling approaches that combine a statistical core (for passenger flow prediction) with rule-based or stochastic elements to better capture the operational variability inherent in regional railways.
These targeted extensions will help improve dwell time modelling accuracy, enhance operational planning, and support the development of more resilient and efficient regional rail systems.