Optimization of Bus Dispatching in Public Transportation Through a Heuristic Approach Based on Passenger Demand Forecasting

Barrera Hernandez, Javier Esteban; Tarazona Torres, Luis Enrique; Tabares, Alejandra; Álvarez-Martínez, David

doi:10.3390/smartcities8030087


by Javier Esteban Barrera Hernandez, Luis Enrique Tarazona Torres, Alejandra Tabares * and David Álvarez-Martínez

Industrial Engineering Department, Universidad de los Andes, Bogota 111711, Colombia

* Author to whom correspondence should be addressed.

Smart Cities 2025, 8(3), 87; https://doi.org/10.3390/smartcities8030087

Submission received: 21 March 2025 / Revised: 9 May 2025 / Accepted: 11 May 2025 / Published: 26 May 2025


Highlights

What are the main findings?

The adaptive heuristic achieves on average a 95% match to the MILP benchmark’s operational utility with a mean optimality gap below 5%, while preserving high service levels (>90%) across diverse demand profiles.
Computational times are reduced by up to 98% compared to the exact MILP model, enabling near real-time dispatch decisions on a standard computing platform.

What is the implication of the main finding?

The framework offers a scalable, implementable dispatching solution for mid-sized urban transport operators, balancing service quality and cost efficiency under uncertain demand.
By integrating machine-learning forecasts with a lightweight heuristic, the approach bridges the gap between theoretical optimization and practical, data-driven real-time operations.

Abstract

Accurate and adaptive bus dispatching is vital for medium-sized urban centers, where static schedules often fail to accommodate fluctuating passenger demand. In this work, we propose a dynamic heuristic that integrates machine learning-based demand forecasts into a discrete-time planning horizon, thereby enabling real-time adjustments to dispatch decisions. Additionally, we introduce a tailored mathematical model—grounded in mixed-integer linear programming and space-time flows—that serves as a benchmark to evaluate our heuristic’s performance under the operational constraints typical of traditional public transportation systems in Colombian mid-sized cities. A key contribution of this research lies in combining predictive modeling (using Prophet for passenger demand) with operational optimization, ensuring that dispatch frequencies adapt promptly to varying ridership levels. We validated our approach using a real-world case study in Montería (Colombia), covering eight representative routes over a full day (5:00–21:00). Numerical experiments show that: 1. Our heuristic matches or surpasses 95% of the optimal solution’s operational utility on most routes, with an average gap of 4.7%, relative to the benchmark mathematical model. 2. It maintains high service levels—above 90% demand coverage on demanding corridors—and robust bus utilization, without incurring excessive operating costs. 3. It reduces computation times by up to 98% compared to the optimization model, making it practically viable for daily scheduling where solving large-scale models exactly can be prohibitively time-consuming. Overall, these results underscore the heuristic’s practical effectiveness in boosting profitability, optimizing resource use, and rapidly adapting to demand fluctuations. The proposed framework thus serves as a scalable and implementable tool for transportation operators seeking data-driven dispatch solutions that balance operational efficiency and service quality.

Keywords:

demand forecasting; dispatch optimization; dynamic heuristic; transport fleet management; urban public transport

1. Introduction

This article addresses the challenge of dynamic bus dispatching under demand uncertainty in medium-sized cities. Traditional fixed-schedule systems often fail to respond effectively to fluctuations in passenger flow and traffic congestion, resulting in inefficient resource allocation, long waiting times, and decreased service reliability. These issues are particularly acute in developing countries where public transportation systems are constrained by limited budgets and outdated operational practices.

In the case of Montería, Colombia, public transport inefficiencies are evident. The system suffers from low service frequency, poor passenger satisfaction, and financial instability among service providers. These problems have led to an increase in the use of informal transportation modes and private vehicles, exacerbating urban congestion and undermining the sustainability of formal transit systems. The conditions in Montería reflect broader challenges shared by other mid-sized cities across Latin America.

Although numerous studies have proposed optimization-based approaches for transit planning, including Mixed Integer Linear Programming (MILP) models and predictive techniques, many of these assume static demand or require computational resources unsuitable for real-time applications. Few works explicitly address the integration of predictive analytics and dynamic decision-making under operational constraints specific to non-metropolitan systems in the Global South.

The main contributions of this article are:

We develop a hybrid dispatching framework that integrates short-term demand forecasting using Prophet with a MILP-based exact optimization model and an adaptive heuristic strategy for real-time responsiveness.
Our approach is tailored to the realities of traditional transportation systems in medium-sized Latin American cities, where dispatching decisions must balance operational costs and service quality under high uncertainty.
We validate our model using empirical data from Montería, demonstrating significant improvements in service efficiency and operational viability compared to both fixed-schedule and purely heuristic baselines.

The remainder of this paper is organized as follows: Section 2 reviews the relevant literature on dynamic scheduling and public transportation optimization. Section 3 details the methodological framework, including the forecasting model, optimization formulation, and heuristic components. Section 4 presents the case study and experimental results and discusses their implications. Section 5 concludes.

2. State-of-the-Art

The TNSP represents a central challenge in public transportation management, as it requires balancing multiple objectives—such as minimizing operating costs and maximizing responsiveness to passenger demand—in order to enhance service quality and operational efficiency, all while adhering to the system’s inherent operational constraints and limitations. In recent decades, various approaches, both exact and approximate, have been developed to address this problem from different perspectives, integrating mathematical, heuristic, and hybrid models to propose adaptable solutions for specific contexts [1]. However, although these approaches have demonstrated effectiveness in various applications, their implementation usually involves a high computational cost. This implies that adapting them to real-world environments still faces significant challenges due to the inherent complexity of implementing these methodologies [2].

Exact methods stand out by guaranteeing optimal solutions when modeling the TNSP, with mathematical programming, and linear programming in particular, among the most commonly used approaches. These models formulate operational constraints such as bus capacity, operation schedules, and passenger demand, and generally aim to minimize operational costs or maximize system efficiency. For example, Borndörfer et al. [3] and Zhou et al. [4] addressed the problem using Mixed Integer Linear Programming (MILP) to minimize operational costs and travel times, considering constraints such as vehicle capacity, synchronization between routes, and minimization of transfer times. In these studies, the problem was formulated as a multi-commodity flow model, which allowed for dynamically generating optimized transit lines. The second approach, moreover, uses column generation to handle the computational complexity of capacity constraints, achieving solutions applicable to real networks.

On the other hand, Pei et al. [5] explored the use of modular vehicles in transport networks through an MILP approach that dynamically adjusts vehicle capacities and frequencies to respond to fluctuations in demand. This solution showed a significant reduction in operational costs and travel times compared to traditional systems. Guan et al. [6] proposed a dispatch and route optimization model for public transportation systems with variable demand, using a hybrid LNS-genetic algorithm, which resulted in significant improvements in operational efficiency and user service. Van Oudheusden et al. [7] introduced a nonlinear programming approach to optimize frequencies and schedules, focusing on minimizing empty trips performed by the fleet and increasing the number of transported passengers. To cope with stochastic uncertainties, Van Berkum et al. [8] formulated the problem within a rolling horizon framework, dividing an operational day into predetermined intervals. This approach uses a convex nonlinear formulation that allows solving the problem to global optimality with a limited computational cost, simultaneously optimizing the dispatch times of all scheduled trips in each interval. Gkiotsalitis [9] demonstrated that periodic optimization through convex quadratic programming can minimize variations in vehicle departure intervals, improving service regularity in high-frequency lines, as evidenced in the case study presented in the 302 bus network in Singapore.

Another notable approach in the reviewed literature is that of Chen and Zhou [10], who implemented a Dynamic Programming (DP) algorithm to solve the dispatch problem in oversaturated systems. This approach efficiently handles constraints such as vehicle capacities and waiting times, applying valid inequalities to reduce the computational complexity of DP. The results demonstrated significant reductions in operational costs and waiting times, with practical applications in networks such as the Beijing metro and Tampa Bay.

Although exact methods guarantee optimal solutions, they face limitations in terms of scalability and applicability in dense networks or highly dynamic systems, as they require computationally expensive methods such as Branch and Bound to achieve optimality [11]. In these cases, heuristics provide fast and flexible solutions, sacrificing precision for efficiency. These methodologies explore the solution space using operational rules or adaptive algorithms. Hadas et al. [12] combined an analytical and iterative approach that jointly optimizes frequencies, schedules, and transfers in a public transportation network. This methodology adjusts bus dispatching based on stochastic simulations using historical data, balancing service quality and operational efficiency. Similarly, Eranki [13] developed an iterative heuristic focused on assigning departure schedules to maximize simultaneous arrivals at transfer points, classifying nodes according to their relevance in the allocation process.

Additionally, Gorev et al. [14] introduced a model that combines demand predictions based on machine learning with a heuristic for dynamic vehicle allocation. Validated on a European network, this approach reduced waiting times by 25% and improved fleet utilization by 18%. Furthermore, Berrebi et al. [15] proposed a model based on stochastic decision processes to mitigate “bus bunching”. This approach, tested on high-frequency circular routes, significantly reduced average waiting times.

Finally, hybrid methods have emerged as a versatile solution, combining the precision of exact approaches with the adaptability of heuristics. For example, Gkiotsalitis et al. [16] integrated dynamic programming with genetic algorithms to optimize dispatching in dense multimodal networks, achieving efficient route synchronization and adapting to variable scenarios. This approach stood out for its ability to handle large amounts of data and its integration with real-time predictive systems. Yao et al. [17] implemented dynamic simulations combined with relaxation techniques to adjust dispatching in real-time, improving the user experience while maintaining reasonable computational costs. In comparison, hybrid methods, although generally more computationally expensive than pure heuristics, offer a greater capacity to address complex constraints such as multimodal synchronization and variable flows.

In summary, Table 1 presents a structured comparison of the approaches reviewed in the literature to address the TNSP. This table highlights the objectives, methodologies used, key variables, constraints considered, case studies, and relevant results for each work, allowing the identification of strengths and limitations of different methods.

In recent years, technological advancements have enabled the development of more dynamic and adaptive methodologies to address the TNSP, integrating tools such as real-time data analysis, synchronization with mobile applications, and self-organization strategies [18]. The research presented in this article complements and extends these approaches by proposing a dynamic heuristic that incorporates demand predictions based on machine learning. This work addresses a gap identified in the State-of-the-Art: the need to integrate predictive tools with dynamic operational strategies, differentiating itself from traditional approaches by emphasizing practicality and adaptability. While exact analytical methods often excessively simplify real operational conditions, sacrificing their applicability in complex scenarios, the proposed methodology is designed to work with imperfect information and system-specific constraints. Thus, it becomes an accessible and applicable tool for medium and small-scale companies, offering a practical and low-computational-cost solution that meets the specific needs of these organizations.

The proposed method builds on the existing literature that addresses the interplay of demand forecasting and resource allocation for transit systems; yet, it offers a distinctive combination of features—ranging from the underlying space-time formulation to the explicit handling of dynamic passenger flows in medium-scale networks. The core advances of this study, when compared against prior work, can be summarized as follows:

While previous works often address dispatch optimization using either stand-alone ML techniques [14] or purely heuristic rules [12], our approach combines a machine learning demand-forecasting module with a discrete-time heuristic to yield both adaptability (to fluctuating passenger loads) and computational efficiency.
Several studies emphasize exact formulations (e.g., MILP or nonlinear programming) suitable for large metropolitan transit systems [3,7], yet the associated models become prohibitively expensive or over-simplified in mid-sized cities with traditional operation schemes. Our method leverages a space-time model specifically tailored to medium-scale fleets, ensuring that real operational nuances—like flexible stop structures and limited data collection—are captured.
Unlike purely simulated frameworks [4,6], this research uses real operational records from Montería and obtains solutions that stay within 5% of the optimal benchmark for high-demand corridors, closing a persistent gap between theoretical models and applied outcomes.
Exact methods [8] can handle rolling-horizon dispatch but become increasingly costly as the number of buses and time intervals grows. Our proposed heuristic not only adapts to short time intervals but also reduces computational time by over 90%, compared to the tested MILP, making it feasible for near real-time scheduling.
While many works focus on either high-frequency or low-demand scenarios exclusively [12,15], our model demonstrates robustness across a range of demand profiles (peak versus off-peak) and route lengths, which is crucial for non-metropolitan contexts where demand can vary drastically throughout the day.
In contrast to heuristic approaches that concentrate on single-route case studies or oversimplify fleet usage [13], we formulate explicit decision rules for bus reallocation (idle repositioning) and endpoint dispatch thresholds, thus providing a structured policy framework that operators can readily implement.

3. Methodological Framework

This section outlines the methodological framework adopted to address the dynamic bus dispatching problem in public transportation systems under demand uncertainty. The approach integrates three core components: (i) a short-term demand forecasting model; (ii) a dynamic heuristic for real-time dispatch decision-making; and (iii) an exact optimization benchmark based on MILP. Each component plays a complementary role in the design, execution, and evaluation of the proposed system, enabling a comprehensive assessment of performance in terms of operational efficiency and service quality.

The central challenge lies in balancing transportation supply and stochastic passenger demand over discrete-time intervals. Formally, we consider a finite-horizon, discrete-time planning problem in which decisions must be made at regular intervals regarding whether or not to dispatch a bus from a given terminal. Let $T = \{0, 1, \dots, H\}$ denote the set of time intervals within the planning horizon, and let $R$ represent the set of routes under consideration. Each route $r \in R$ has two operating directions, indexed by $s \in S = \{1, 2\}$, corresponding to the movement between a pair of terminals $(C_{1}, C_{2})$.

To support dispatch decisions, a demand forecasting model is developed based on Prophet [19], a modular and interpretable time series model capable of capturing trend shifts, seasonalities, and holiday effects. The output of this model, denoted as $D_{r, s, t}$, serves as an input to both the heuristic and the MILP formulation. Accurate forecasting is critical as it allows the system to anticipate demand surges and respond proactively rather than reactively.

Based on these predictions, a dynamic dispatch heuristic is formulated to make real-time decisions regarding bus deployment. The heuristic incorporates operational constraints such as bus capacity, travel time, depot routing, and terminal availability, as well as service-level thresholds including minimum occupancy and maximum waiting time. It simulates the temporal evolution of the system by updating state variables and re-evaluating dispatch decisions at each interval $t \in T$, generating a dispatch plan that seeks to optimize a utility function defined as the net operational profit:

$$Z = \text{Revenue} - \text{Operating Costs}.$$

To provide a performance benchmark, an exact optimization model is formulated using a space-time network representation of the bus system. This model is cast as a Mixed-Integer Linear Program (MILP), capturing the spatiotemporal constraints of vehicle flows, bus capacities, demand satisfaction, and movement feasibility. It yields a globally optimal dispatching solution under perfect information, and thus serves as a reference point against which the heuristic can be compared in empirical experiments.

The combined methodology is designed to achieve three objectives:

Forecasting precision: leverage data-driven methods to generate reliable passenger demand estimates under temporal variability.
Real-time adaptability: provide a computationally efficient dispatching policy that dynamically reacts to changing system conditions.
Benchmark validation: use the MILP model to validate the heuristic’s performance and quantify suboptimality gaps.

This integrative approach aligns with recent trends in the optimization of transportation systems, emphasizing the interplay between predictive analytics, real-time control, and rigorous mathematical modeling. The following subsections provide detailed formulations of the system assumptions, optimization model, dispatching heuristic, and statistical evaluation procedures.

3.1. Assumptions and System Characteristics

This study models the dynamic bus dispatching problem in Montería under a set of operational and structural conditions that are both representative of the local transit system and conducive to analytical modeling. These assumptions serve to define the system’s constraints, guide the design of the mathematical model and heuristic algorithm, and ensure coherence between the empirical environment and the proposed methodological framework.

The operational horizon spans from 5:00 a.m. to 9:00 p.m., discretized into fixed-length intervals that form the planning timeline. The system includes 21 bus routes, each configured as a bidirectional service between two terminals denoted $C_{1}$ and $C_{2}$. Each direction ($C_{1} \to C_{2}$ and $C_{2} \to C_{1}$) is treated independently, with its own demand function and travel time, reflecting observed asymmetries in passenger flows and route durations.
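The discretization of the operating day can be sketched as follows. This is only an illustration: the 10 min interval length is an assumption borrowed from the forecasting granularity, and the names `interval_index`, `INTERVAL_MIN`, and `N_INTERVALS` are hypothetical.

```python
from datetime import time

START = time(5, 0)    # start of operations (from the text)
END = time(21, 0)     # end of operations (from the text)
INTERVAL_MIN = 10     # assumed interval length in minutes


def minutes(t: time) -> int:
    """Minutes elapsed since midnight."""
    return t.hour * 60 + t.minute


# Number of fixed-length intervals in the operating day.
N_INTERVALS = (minutes(END) - minutes(START)) // INTERVAL_MIN


def interval_index(t: time) -> int:
    """Index of the planning interval that contains clock time t."""
    offset = minutes(t) - minutes(START)
    if not 0 <= offset <= minutes(END) - minutes(START):
        raise ValueError("time outside the operating horizon")
    return offset // INTERVAL_MIN
```

Under this assumption the 16 h horizon yields 96 intervals, so the end of service at 9:00 p.m. would correspond to index 96.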

The vehicle fleet is homogeneous in terms of capacity and performance, and all buses are housed in a common depot P. At the beginning of each operational day, buses are dispatched to terminals with the aim of balancing vehicle availability across both directions. However, current dispatching practices are manual and static: dispatch volumes are fixed by hour and informed by expected round-trip durations. This static planning does not incorporate real-time data or stochastic variability in operations, leading to desynchronization as the day progresses.

Demand is time-varying and distributed along the entire length of each route. While passenger boarding data from onboard turnstiles are used to estimate demand patterns, these data are incomplete: they exclude suppressed demand and lack continuous spatial resolution. Additionally, the absence of real-time technologies such as GPS and automated passenger counters further limits the accuracy of operational monitoring. Given these limitations, the model assumes directional symmetry in demand and employs time series forecasting to estimate interval-specific demand for each route and direction.

Operational costs and passenger fares are modeled as constant over time. These parameters define the utility function optimized by both the MILP formulation and the dynamic heuristic, which aim to maximize net revenue from operations.

To validate the proposed methodology, the study focuses on a carefully selected subset of eight routes that reflect a broad range of operating conditions. These routes—Pradera 27, Panzenu, Santander, Mogambo 22, Tambo Circunvalar, KM30, Dorado, and KM15—differ in spatial coverage, average demand intensity, and fleet assignment policies. This selection provides a robust test bed for evaluating methodological performance across diverse scenarios.

Overall, these assumptions provide a necessary abstraction of the transit system in Montería, enabling a rigorous mathematical treatment while preserving the empirical fidelity required for actionable insights.

3.2. Passenger Demand Forecasting Model

Short-term passenger demand forecasting is a critical input for dynamic bus dispatching, enabling anticipatory decision-making in systems subject to spatiotemporal variability. This subsection presents the methodological formulation and comparative evaluation of three time series forecasting models: Autoregressive Integrated Moving Average (ARIMA), Long Short-Term Memory Networks (LSTM), and Prophet. Each model is designed to predict the expected passenger count per route–direction–time triplet $(r, s, t) \in R \times S \times T$, given the historical observation filtration $F_{t - 1}$.

For each route–direction pair, the objective is to forecast

$$D_{r, s, t} = E[y_{r, s, t} \mid F_{t - 1}],$$

where $y_{r, s, t}$ denotes the number of passengers observed at interval $t$, and $F_{t - 1}$ is the sigma-algebra generated by past observations $\{y_{r, s, u}\}_{u < t}$.

3.2.1. Forecasting Models

ARIMA: A classical linear time series model grounded in the Box–Jenkins methodology. ARIMA assumes stationarity (after differencing), modeling temporal dynamics via autoregressive (AR) and moving average (MA) components:

$$\Phi(B)\,(1 - B)^{d}\, y_{t} = \Theta(B)\, \varepsilon_{t},$$

where $B$ is the backshift operator, $d$ is the order of differencing, and $\Phi(B)$, $\Theta(B)$ are polynomials with coefficients estimated via maximum likelihood. Although ARIMA is interpretable and effective for short-range dependencies, it struggles with multiple seasonalities and nonlinearity.
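To make the differencing term $(1 - B)^{d}$ concrete, the short sketch below applies first differences $d$ times to a series of passenger counts. The function name `difference` and the sample values are illustrative only. Fitting the AR and MA polynomials is not shown.

```python
def difference(series, d=1):
    """Apply (1 - B)^d to a series by taking first differences d times."""
    out = list(series)
    for _ in range(d):
        # (1 - B) y_t = y_t - y_{t-1}; each pass shortens the series by one.
        out = [curr - prev for prev, curr in zip(out, out[1:])]
    return out


# Hypothetical passenger counts for five consecutive intervals.
counts = [12, 15, 21, 30, 42]
first = difference(counts, d=1)
second = difference(counts, d=2)
```

In this example the first difference still trends upward while the second difference is constant, which is the kind of behavior that motivates choosing $d$ before estimating the remaining coefficients.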

LSTM Networks: A class of deep learning models specifically designed to capture long-range dependencies and nonlinear temporal patterns through memory cells and gated recurrence. The LSTM cell is defined as:

$$\begin{aligned}
i_{t} &= \sigma(W_{i} x_{t} + U_{i} h_{t - 1} + b_{i}), \\
f_{t} &= \sigma(W_{f} x_{t} + U_{f} h_{t - 1} + b_{f}), \\
o_{t} &= \sigma(W_{o} x_{t} + U_{o} h_{t - 1} + b_{o}), \\
c_{t} &= f_{t} \odot c_{t - 1} + i_{t} \odot \tanh(W_{c} x_{t} + U_{c} h_{t - 1} + b_{c}), \\
h_{t} &= o_{t} \odot \tanh(c_{t}),
\end{aligned}$$

where $x_{t}$ is the input sequence, $h_{t}$ the hidden state, and $c_{t}$ the cell state. LSTM Networks are highly expressive but demand extensive data, are computationally expensive, and lack transparency [20,21].
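The cell equations can be traced with a scalar toy version, in which every weight is a single number rather than a matrix. This is a didactic sketch under that simplifying assumption. The function `lstm_step` and the `params` layout are hypothetical and do not reflect the network configuration used in the study.

```python
import math


def sigmoid(v):
    """Logistic function used by the input, forget, and output gates."""
    return 1.0 / (1.0 + math.exp(-v))


def lstm_step(x, h_prev, c_prev, params):
    """One scalar LSTM step.

    params maps each gate name ("i", "f", "o", "c") to a tuple (W, U, b).
    Returns the new hidden state h_t and cell state c_t.
    """
    def pre_activation(gate):
        W, U, b = params[gate]
        return W * x + U * h_prev + b

    i = sigmoid(pre_activation("i"))          # input gate
    f = sigmoid(pre_activation("f"))          # forget gate
    o = sigmoid(pre_activation("o"))          # output gate
    candidate = math.tanh(pre_activation("c"))
    c = f * c_prev + i * candidate            # cell state update
    h = o * math.tanh(c)                      # hidden state
    return h, c
```

With all weights and biases set to zero, every gate equals 0.5 and the candidate is 0, so the cell state is simply halved at each step. This makes the role of the forget gate easy to see.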

Prophet: A Bayesian structural time series model introduced by Taylor and Letham [19]. Prophet decomposes the signal as:

$$y(t) = g(t) + s(t) + h(t) + \varepsilon_{t},$$

where $g(t)$ models the trend, $s(t)$ represents multiple seasonal components (daily, weekly, yearly) using Fourier expansions, and $h(t)$ accounts for holidays via regressors. The model supports automatic changepoint detection using sparsity-inducing priors and is implemented in Stan, allowing efficient posterior inference. Prophet is robust to missing data and outliers, and facilitates interpretability.
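The additive structure can be illustrated without the Prophet library itself. The sketch below evaluates a linear trend, a Fourier seasonal term, and a constant holiday offset. It is not Prophet's implementation, which uses a piecewise trend with changepoints and fits all coefficients by posterior inference. All coefficient values and function names here are hypothetical.

```python
import math


def fourier_seasonality(t, period, coeffs):
    """s(t) = sum_n [a_n cos(2*pi*n*t/P) + b_n sin(2*pi*n*t/P)].

    coeffs is a list of (a_n, b_n) pairs for n = 1, 2, ...
    """
    return sum(
        a * math.cos(2 * math.pi * n * t / period)
        + b * math.sin(2 * math.pi * n * t / period)
        for n, (a, b) in enumerate(coeffs, start=1)
    )


def additive_forecast(t, intercept, slope, period, coeffs, holiday_effect=0.0):
    """y(t) = g(t) + s(t) + h(t), with a simple linear trend as g(t)."""
    trend = intercept + slope * t
    return trend + fourier_seasonality(t, period, coeffs) + holiday_effect
```

For instance, `additive_forecast(0, 20.0, 0.1, 96, [(5.0, 0.0)])` combines a baseline of 20 passengers with a seasonal amplitude of 5. The seasonal term repeats every 96 intervals, which would represent one operating day if intervals are 10 min long.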

A preliminary qualitative comparison is presented in Table 2.

3.2.2. Evaluation Protocol

Each model was trained independently per route–direction series using a rolling window over 30 historical days. The prediction target is the number of passengers per 10 min interval. To ensure model comparability, hyperparameters were tuned using cross-validation on training folds, and performance was assessed on a hold-out set.
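One possible reading of the rolling-window scheme is sketched below: each day is forecast from the 30 days that precede it. The exact splitting procedure is not specified in the text, so the generator `rolling_windows` and its day-level indexing are assumptions made for illustration.

```python
def rolling_windows(n_days, window=30):
    """Yield (train_days, test_day) pairs for a rolling training window.

    train_days is a range of day indices; test_day is the day that follows.
    """
    for test_day in range(window, n_days):
        yield range(test_day - window, test_day), test_day


# With 32 days of history there are two train/test splits.
splits = list(rolling_windows(32))
```

Each split keeps the training window at a fixed length, so older days drop out as newer ones enter.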

Two evaluation metrics were used:

Root Mean Squared Error (RMSE): Sensitive to large errors, since deviations are squared before averaging.
Mean Absolute Percentage Error (MAPE): Scale-free and interpretable in relative error terms.
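Both metrics follow directly from their standard definitions, as in the sketch below. Skipping intervals with zero observed demand in the MAPE is an assumption added here to avoid division by zero. The text does not state how such intervals were handled.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error between observed and forecast counts."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mape(actual, predicted):
    """Mean absolute percentage error, in percent.

    Intervals with zero observed demand are skipped (assumption).
    """
    pairs = [(a, p) for a, p in zip(actual, predicted) if a != 0]
    return 100.0 * sum(abs((a - p) / a) for a, p in pairs) / len(pairs)
```

For observed counts `[10, 20, 40]` and forecasts `[12, 18, 40]`, the MAPE is 10% and the RMSE is the square root of 8/3.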

Detailed numerical results and statistical comparisons of forecasting accuracy for each model are presented in Section 4, where their implications on dispatch planning are further discussed. As stated in the conclusion, given its robustness, interpretability, and low implementation overhead, the Prophet model was selected as the forecasting engine for the dispatch heuristic developed in this study. Nonetheless, the alternative models serve as valuable performance benchmarks in the experimental comparison.

3.3. Space-Time MILP Formulation for Optimal Bus Dispatching

The following MILP formulation models the optimal bus dispatching problem over a discretized space-time network, capturing temporal decisions, operational feasibility, and passenger demand satisfaction for a given route $r \in R$. The formulation is defined over a planning horizon discretized in time intervals indexed by $t \in T = \{0, 1, \dots, H\}$, and locations $L_{r} = \{P, C1_{r}, C2_{r}\}$, representing the depot and the route-specific terminals. The sets of routes and directions are $R$ and $S = \{1, 2\}$, respectively.

$$\max Z = \alpha_{r} \sum_{b \in B_{r}} \sum_{s \in S} \sum_{t \in T} p_{b, s, t} - \beta_{r} \sum_{b \in B_{r}} \sum_{t \in T} z_{b, t} \tag{1}$$

$$V_{b, P, 0} = 1, \quad \forall b \in B_{r} \tag{2}$$

$$\sum_{l \in L_{r}} V_{b, l, t} = 1, \quad \forall b \in B_{r},\ \forall t \in T \tag{3}$$

$$V_{b, l, t + 1} = V_{b, l, t} - \sum_{\substack{l' \in L_{r} \\ t' = t + T_{r, l, l'}}} x_{b, l, t, l', t'} + \sum_{\substack{l' \in L_{r} \\ t'' = t + 1 - T_{r, l', l}}} x_{b, l', t'', l, t + 1}, \quad \forall b \in B_{r},\ \forall l \in L_{r},\ \forall t \in T \tag{4}$$

$$x_{b, l, t, l', t'} \leq V_{b, l, t}, \quad \forall b \in B_{r},\ \forall l, l' \in L_{r},\ \forall t \in T,\ t' = t + T_{r, l, l'} \tag{5}$$

$$z_{b, t} = 1 - V_{b, P, t}, \quad \forall b \in B_{r},\ \forall t \in T \tag{6}$$

$$\sum_{s \in S} p_{b, s, t} \leq C_{r} \cdot z_{b, t}, \quad \forall b \in B_{r},\ \forall t \in T \tag{7}$$

$$\sum_{b \in B_{r}} p_{b, s, t} + y_{s, t} = D_{r, s, t}, \quad \forall s \in S,\ \forall t \in T \tag{8}$$

$$m_{b, s, t} = \sum_{\substack{l, l' \in L_{r} \\ t' \leq t < t' + T_{r, l, l'}}} \delta_{r, (l, l'), s} \cdot x_{b, l, t', l', t' + T_{r, l, l'}}, \quad \forall b \in B_{r},\ \forall s \in S,\ \forall t \in T \tag{9}$$

Equations (1)–(9) jointly define the objective and the operational constraints for a specific route $r$. Equation (1) maximizes the net operational utility, computed as fare revenue minus operating costs. Constraints (2) and (3) ensure, respectively, that each bus starts at the depot and is present at a unique location in each interval. Equation (4) updates the location of each bus over time, in accordance with its movements. Constraint (5) ensures that buses only move from locations where they are present. Equation (6) encodes the operational status of a bus, indicating whether it is active in service or idle at the depot. Constraint (7) limits the passengers carried to the vehicle capacity. Constraint (8) balances demand in each direction and interval: passengers served plus unmet demand $y_{s, t}$ equal the forecast demand $D_{r, s, t}$. Finally, Equation (9) ensures that the binary variable $m_{b, s, t}$ reflects actual vehicle movement along direction $s$.
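As a small illustration of how the objective (1), the capacity constraint (7), and the demand balance (8) interact, the sketch below evaluates them for a hand-built candidate plan on a single route. It does not solve the MILP. Reading $\alpha_{r}$ as a fare per passenger and $\beta_{r}$ as a cost per active bus-interval is an assumption, and the dictionary layout and function names are hypothetical.

```python
def utility(p, z, alpha, beta):
    """Objective (1): fare revenue minus operating cost.

    p maps (bus, direction, interval) to passengers carried;
    z maps (bus, interval) to 1 when the bus is away from the depot.
    """
    return alpha * sum(p.values()) - beta * sum(z.values())


def capacity_ok(p, z, capacity, buses, directions, times):
    """Constraint (7): passengers per bus and interval within capacity."""
    return all(
        sum(p.get((b, s, t), 0) for s in directions)
        <= capacity * z.get((b, t), 0)
        for b in buses
        for t in times
    )


def unmet_demand(p, demand, buses):
    """Constraint (8) solved for y: demand minus passengers served."""
    return {
        (s, t): d - sum(p.get((b, s, t), 0) for b in buses)
        for (s, t), d in demand.items()
    }
```

For example, with two buses carrying 30 and 25 passengers, a fare of 2.0 and a cost of 20.0 per active bus-interval, the utility is 2.0 × 55 − 20.0 × 2 = 70. If demand in the first direction is 35, the remaining 5 passengers appear as unmet demand.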

The proposed model supports tactical and operational decision-making by precisely defining bus dispatches and assignments, optimizing operations independently for each route in the system. For each execution, specific parameters must be adjusted, such as passenger demand in each direction, bus availability, travel times, vehicle capacities, and other relevant elements, ensuring that decisions adapt to the particular conditions of the analyzed route. Table 3 defines all sets, parameters, and decision variables used in the formulation.

This model aims to guarantee optimal solutions for each route $r \in R$; however, its computational complexity increases with the size of the instance for that route. The number of binary variables scales proportionally to $|B_{r}| \cdot (|L_{r}| + |S|) \cdot |T|$, while the number of continuous variables grows as $|B_{r}| \cdot |S| \cdot |T| + |S| \cdot |T|$. Regarding constraints, their number increases as $|B_{r}| \cdot |S| \cdot |T| + |B_{r}| \cdot |T| \cdot |L_{r}|$, reflecting the number of equations required to ensure compliance with operational conditions and model consistency for the route. These characteristics imply high computational costs when working with routes that include multiple buses or a large number of time intervals, which may hinder direct model resolution for large-scale instances.
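The scaling expressions above can be evaluated for a given instance, as in the sketch below. The results are orders of magnitude implied by the stated expressions rather than exact solver statistics, and the function `model_size` is a hypothetical helper.

```python
def model_size(n_buses, n_locations, n_directions, n_intervals):
    """Evaluate the stated scaling terms for one route.

    Returns (binary_variables, continuous_variables, constraints).
    """
    binary = n_buses * (n_locations + n_directions) * n_intervals
    continuous = (n_buses * n_directions * n_intervals
                  + n_directions * n_intervals)
    constraints = (n_buses * n_directions * n_intervals
                   + n_buses * n_intervals * n_locations)
    return binary, continuous, constraints
```

For a hypothetical route with 10 buses, 3 locations, 2 directions, and 97 intervals, `model_size(10, 3, 2, 97)` gives 4850 binary variables, 2134 continuous variables, and 4850 constraints. Doubling the fleet roughly doubles each term, which illustrates why larger instances become costly.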

To improve computational feasibility in such cases, advanced optimization techniques can be employed to efficiently solve the model for each route. Among these techniques are valid inequalities, Gomory cuts combined with branch-and-bound strategies, and Benders decomposition, which are useful for reducing the feasible search space or decomposing the problem into more manageable subproblems, without compromising the quality of the obtained solution.

Another strategy to reduce complexity involves decreasing temporal granularity, meaning considering longer time intervals. While this simplifies the model by reducing the number of variables and constraints, it also limits its ability to capture time-dependent dynamics more accurately and adapt to specific fluctuations in demand or operational conditions at given moments. Therefore, any reduction in granularity must be carefully evaluated to balance computational simplification and accuracy in representing the operation of the route.

3.4. Bus Dispatch Heuristic

The proposed heuristic aims to optimize bus dispatching in a public transportation system by integrating passenger demand predictions and dynamic decisions based on operational conditions. Its design is based on a simulation of daily operations, structured in an iterative approach that periodically evaluates the state of demand and bus availability at discrete-time intervals. The primary objective is to maximize the operational performance of each route, measured through utility, which considers both the revenue generated by served passengers and the costs associated with fleet usage during operation.

3.4.1. Parameter Definition

The heuristic takes as input the predicted passenger demand for each route and direction, along with operational system parameters such as bus capacity, travel times, and specific heuristic parameters like the occupancy threshold required for dispatching and the maximum time a bus can remain at the terminal before being dispatched. As a result, the heuristic generates a detailed dispatch plan that specifies each bus’s activity on each route throughout the operation. Additionally, it provides performance metrics that evaluate the effectiveness of the decisions. Table 4 shows the description of the heuristic parameters.

The possible states of a bus are as follows:

At depot: the bus is at the depot (stored), available for assignment.
At terminal s: the bus is at terminal $C1$ or $C2$, waiting to be dispatched.
In route: the bus is traveling between terminals with passengers.
In transit to depot: the bus is returning empty to the depot from a terminal.
In transit to terminal: the bus is heading to a terminal from the depot.
Empty trip: the bus is traveling between terminals without passengers.
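The six states listed above can be encoded as an enumeration. This is only a sketch: the member names and string values are assumptions, and the helper `is_dispatchable` mirrors the terminal and direction pairing (C1 with direction 1, C2 with direction 2) that appears in Algorithm 2.

```python
from enum import Enum


class BusState(Enum):
    """Possible bus states; names and values are illustrative."""
    AT_DEPOT = "at_depot"
    AT_TERMINAL_C1 = "C1"
    AT_TERMINAL_C2 = "C2"
    IN_ROUTE = "in_route"
    IN_TRANSIT_TO_DEPOT = "in_transit_to_depot"
    IN_TRANSIT_TO_TERMINAL = "in_transit_to_terminal"
    EMPTY_TRIP = "empty_trip"


def is_dispatchable(state: BusState, direction: int) -> bool:
    """A bus can serve direction 1 from terminal C1 and direction 2 from C2."""
    return (state is BusState.AT_TERMINAL_C1 and direction == 1) or (
        state is BusState.AT_TERMINAL_C2 and direction == 2
    )
```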

3.4.2. Heuristic Description

Algorithm 1 details the operation of the bus dispatch heuristic. This algorithm dynamically evaluates demand and bus status at each time interval to make efficient decisions. The heuristic is based on a discrete simulation over a planning horizon divided into time intervals $T$. At each interval, the need to dispatch buses for each route and direction is evaluated based on accumulated demand and passenger wait time.

Initially, the accumulated demand $Q_{r, s, 0}$ and the waiting time $W_{r, s, 0}$ are set to zero, and the list of accumulation times $L_{r, s, 0}$ is set to empty, for all routes $r \in R$ and directions $s \in S$ (lines 1–5). These variables track the accumulated passenger volume, the waiting time, and the times when demand accumulates, which is essential for evaluating dispatch conditions at each time interval. Additionally, the state $S_{b}$ of every bus is initialized to at depot, and its availability time $A_{b}$ is set to the start of operations, indicating the initial location and availability of the fleet. This setup determines when and where buses can be assigned to routes. Variables tracking the total number of passengers transported and total kilometers traveled are also initialized to zero (lines 6–8). Finally, the dispatch plan is initialized; it stores all operational decisions for each bus at each time interval from the start to the end of operations, thus recording the fleet’s complete activity (line 9).

At each time interval $t \in T$ (line 10), the current time $h_{current}$ corresponding to that interval is updated (line 11). For instance, if the planning horizon starts at 5:00 and the intervals last $Δ t = 10$ min, $h_{current}$ for $t = 1$ will be 5:00, for $t = 2$ will be 5:10, and so on until $t = 96$, the final interval, which closes the horizon at 21:00. For each route $r \in R$ and each direction $s \in S$ (lines 12–13), accumulated demand $Q_{r, s, t}$ is updated by adding the predicted demand for the current interval $D_{r, s, t}$ to the total accumulated from the previous interval $Q_{r, s, t - 1}$ (line 14). Simultaneously, accumulated waiting time $W_{r, s, t}$ is increased by adding the interval duration $Δ t$ (line 15). Additionally, the current time $h_{current}$ is recorded in the list $L_{r, s, t}$, storing the times corresponding to elapsed intervals in the simulation (line 16).
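The time mapping and the accumulation step (lines 11 and 14–16 of Algorithm 1) can be sketched as follows. The function names are hypothetical, and the sketch assumes the convention of the example in the text, where $t = 1$ starts at 5:00; under that convention interval 96 begins at 20:50 and ends at 21:00.

```python
from datetime import datetime, timedelta


def interval_start(t: int, start: str = "05:00", delta_min: int = 10) -> str:
    """Clock time at which interval t begins, with t = 1 mapped to `start`."""
    base = datetime.strptime(start, "%H:%M")
    return (base + timedelta(minutes=(t - 1) * delta_min)).strftime("%H:%M")


def accumulate(q_prev, w_prev, times, demand_t, h_current, delta_min=10):
    """One accumulation step: returns updated (Q, W, L) for a route and direction."""
    return q_prev + demand_t, w_prev + delta_min, times + [h_current]


# Illustrative step: 12 passengers already waiting, 8 more predicted at 05:10.
q, w, times = accumulate(12, 10, ["05:00"], 8, interval_start(2))
print(q, w, times)
```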

Subsequently, the dispatch threshold $Θ^{'}$ is calculated as the product of the bus capacity $C_{r, s}$ and the required occupancy percentage $Θ_{r, s}$ (line 17). This value determines the minimum accumulated demand required to consider dispatching a bus, ensuring that buses operate at a reasonable occupancy level to optimize efficiency. The dispatch decision (line 18) is based on verifying whether any of the following conditions are met:

Accumulated demand $Q_{r, s, t}$ reaches or exceeds the dispatch threshold $Θ^{'}$: this condition implies that enough passengers are waiting to fill a bus to the required occupancy level, which justifies dispatching. If demand continues accumulating without dispatching, the vehicle’s capacity may not be sufficient to accommodate all accumulated demand, generating a service deficit. Therefore, this rule ensures that buses are dispatched before reaching a critical saturation point.
Accumulated waiting time $W_{r, s, t}$ reaches or exceeds the allowed limit $Γ_{r, s}$ : this condition ensures that passengers do not wait beyond a reasonable time before a bus is dispatched. It also respects operational constraints related to synchronizing trips between terminals. If this time is exceeded, it could affect the overall scheduling of the system, including the arrival and departure times of other buses. Thus, this rule seeks a balance between service level and operational punctuality.
It is the last time interval, $t = | T |$ : this condition ensures that all accumulated demand is served before the daily operation ends. Upon reaching the last interval, any remaining demand in the system must be dispatched to maximize the service level at the end of the day. This rule acts as a clearing mechanism to prevent passengers from being left unattended at the end of operations.
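The three conditions above combine into a single Boolean test, which corresponds to lines 17–18 of Algorithm 1. The following is a minimal sketch with hypothetical argument names; the capacity of 52 passengers is the value reported for the case study, while the 50% threshold and 30 min limit are illustrative.

```python
def should_dispatch(q: float, w: float, capacity: float, theta_pct: float,
                    gamma_min: float, t: int, horizon: int) -> bool:
    """Return True if any of the three dispatch conditions holds."""
    threshold = capacity * theta_pct / 100      # dispatch threshold (line 17)
    demand_trigger = q >= threshold             # enough accumulated demand
    wait_trigger = w >= gamma_min               # waiting-time limit reached
    last_interval = t == horizon                # clearing rule at end of day
    return demand_trigger or wait_trigger or last_interval


# Illustrative values: 52-seat bus, 50% threshold (26 passengers), 30 min limit.
print(should_dispatch(q=26, w=0, capacity=52, theta_pct=50, gamma_min=30, t=5, horizon=96))
```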

Algorithm 1 Efficient Bus Dispatch Heuristic

Input: $D_{r, s, t}$, $T_{r, s}$, $C_{r, s}$, $| B_{r} |$, ${Distance}_{r, s}$
Parameters: $R$, $S = \{1, 2\}$, $T = \{0, 1, \dots, H\}$, $Θ_{r, s}$, $Γ_{r, s}$, $Δ t$, $K$, $τ_{C1, P}$, $τ_{C2, P}$, $λ$, $μ$
Output: Bus dispatch plan, performance metrics (U, NS, CU, UP, IPK)
1: for each $r \in R$ do
2:     for each $s \in S$ do
3:         $Q_{r, s, 0} \leftarrow 0$, $W_{r, s, 0} \leftarrow 0$, $L_{r, s, 0} \leftarrow []$, $Q_{ns, r, s, 0} \leftarrow 0$
4:     end for
5: end for
6: for each $b \in B$ do
7:     $S_{b} \leftarrow$ ’at_depot’, $A_{b} \leftarrow$ start time, $P_{b} \leftarrow 0$, $K_{b} \leftarrow 0$
8: end for
9: Initialize Dispatch Plan ← empty
10: for each $t \in T$ do
11:     $h_{current} \leftarrow$ time corresponding to interval $t$
12:     for each $r \in R$ do
13:         for each $s \in S$ do
14:             $Q_{r, s, t} \leftarrow Q_{r, s, t - 1} + D_{r, s, t}$
15:             $W_{r, s, t} \leftarrow W_{r, s, t - 1} + Δ t$
16:             $L_{r, s, t}$.append($h_{current}$)
17:             $Θ^{'} \leftarrow C_{r, s} \times Θ_{r, s} / 100$
18:             if $Q_{r, s, t} \geq Θ^{'}$ or $W_{r, s, t} \geq Γ_{r, s}$ or $t = H$ then
19:                 $b \leftarrow$ SelectAvailableBus($r, s, h_{current}$)
20:                 if $b \neq$ null then
21:                     Dispatch bus $b$ on route $r$, direction $s$, at time $h_{dispatch} \leftarrow L_{r, s, t}[0]$
22:                     $S_{b} \leftarrow$ ’in_route_s’
23:                     if $s = 1$ then
24:                         $A_{b} \leftarrow h_{dispatch} + T_{r, 1}$
25:                     else
26:                         $A_{b} \leftarrow h_{dispatch} + T_{r, 2}$
27:                     end if
28:                     $P_{served} \leftarrow \min (C_{r, s}, Q_{r, s, t})$
29:                     $P_{b} \leftarrow P_{b} + P_{served}$
30:                     $K_{b} \leftarrow K_{b} + {Distance}_{r, s}$
31:                     $Q_{r, s, t} \leftarrow Q_{r, s, t} - P_{served}$
32:                     $W_{r, s, t} \leftarrow 0$
33:                     if $Q_{r, s, t} > 0$ then
34:                         $L_{r, s, t} \leftarrow [t]$
35:                     else
36:                         $L_{r, s, t} \leftarrow []$
37:                     end if
38:                 else
39:                     $Q_{ns, r, s, t} \leftarrow Q_{ns, r, s, t - 1} + Q_{r, s, t}$
40:                     $Q_{r, s, t} \leftarrow 0$, $W_{r, s, t} \leftarrow 0$, $L_{r, s, t} \leftarrow []$
41:                 end if
42:             end if
43:             Schedule($b, r, h_{current}, S_{b}$)
44:         end for
45:     end for
46:     EvaluateIdleBuses($r, s, h_{current}, t, B, τ_{C1, P}, τ_{C2, P}$)
47: end for
48: SendBusesToDepot($h_{current}, t, R, S, B, τ_{C1, P}, τ_{C2, P}$)
49: ComputeGlobalMetrics($P_{b}, K_{b}, Q_{ns, r, s, t}, D_{r, s, t}, α, β$)
50: return Bus dispatch plan, performance metrics (U, NS, CU, UP, IPK)

If any condition is met, an available bus is selected for dispatch by calling the function SelectAvailableBus (line 19). This function looks for a bus $b$ that is available at the terminal corresponding to direction $s$ and whose availability allows satisfying the accumulated demand up to that moment. The bus must be available from the time $h_{dispatch}$, corresponding to the first recorded time in the list $L_{r, s, t}$, or earlier.

If an available bus $b$ is found (line 20), it is dispatched on route $r$, direction $s$, at the dispatch time $h_{dispatch}$ (line 21). Subsequently, the bus state $S_{b}$ is updated to in route, and its availability $A_{b}$ is adjusted considering the travel time $T_{r, s}$ between the terminals corresponding to the journey direction (lines 22–26).

Continuing with the updates, the served demand is set as the minimum between the bus capacity $C_{r, s}$ and the accumulated demand $Q_{r, s, t}$ up to the current interval, so a bus never carries more passengers than its capacity allows. Any excess demand is not ignored but remains accumulated in $Q_{r, s, t}$ to be served in future dispatches if buses are available at that time. Additionally, the bus and route performance metrics are updated, including the number of passengers transported, kilometers traveled, and occupancy percentage (lines 28–31).

After dispatching, the accumulated waiting time $W_{r, s, t}$ is reset to zero, as most or all waiting passengers have been served, and any new waiting time starts counting from that moment. Additionally, the list $L_{r, s, t}$ is cleared to record only the waiting intervals associated with the new accumulated demand. However, if there is remaining demand, the list must start accumulating from $t$, as there is pending demand that needs to be reevaluated (lines 32–37).

On the other hand, if no available bus is found (line 38), the accumulated demand $Q_{r, s, t}$ cannot be served in that interval. It is added to the unmet-demand record $Q_{ns, r, s, t}$ for later analysis (line 39), and the accumulators $Q_{r, s, t}$, $W_{r, s, t}$, and $L_{r, s, t}$ are reset (line 40). This record helps evaluate operational limitations, identify excessive accumulation patterns, and determine possible adjustments in resource allocation, whether by increasing the fleet, modifying dispatch thresholds, or adjusting operational planning. Finally, the bus schedule is recorded in the dispatch plan, considering the previously made dispatch decisions (line 43).
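The bookkeeping that follows a triggered dispatch condition can be sketched as a single function covering both branches. It follows lines 28–40 of Algorithm 1 as printed; the function name and the returned tuple layout are hypothetical, and bus selection itself is reduced to a Boolean flag.

```python
def update_after_trigger(q, q_unmet, capacity, bus_available, t):
    """Return (served, Q, W, L, Q_ns) once a dispatch condition has fired."""
    if bus_available:
        served = min(capacity, q)        # line 28: never exceed bus capacity
        q -= served                      # line 31: leftover stays accumulated
        times = [t] if q > 0 else []     # lines 33-36: restart the list from t
    else:
        served = 0
        q_unmet += q                     # line 39: record unmet demand
        q, times = 0, []                 # line 40: reset the accumulators
    return served, q, 0, times, q_unmet  # W is reset to zero in both branches


# 60 passengers waiting for a 52-seat bus: 8 remain for the next dispatch.
print(update_after_trigger(q=60, q_unmet=0, capacity=52, bus_available=True, t=7))
```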

As observed, the heuristic uses a combination of demand-based and wait-time-based criteria to decide when to dispatch a bus. These criteria ensure a balance between operational efficiency and the service level offered to passengers. The occupancy threshold $Θ_{r, s}$ controls the minimum percentage of bus occupancy before dispatching. This ensures that buses are not dispatched with very low occupancy, optimizing fleet utilization and reducing operational costs, thereby achieving higher operational utility. The maximum waiting time $Γ_{r, s}$ ensures that passengers do not wait beyond a reasonable time, maintaining a good service level and preventing excessive passenger accumulation.

At the end of processing all routes and directions in interval t, the function EvaluateIdleBuses is called (line 46). Algorithm 2 details the precise instructions for this function, which aims to optimize the allocation of available buses at the terminals, dynamically evaluating whether they should remain at their current location, be sent to the depot, or be moved to the opposite terminal. The decision is based on estimated future demand and available capacity at each terminal.

Algorithm 2 Function EvaluateIdleBuses

Input: $r$, $s$, $h_{current}$, $t$, $B$, $τ_{C1, P}$, $τ_{C2, P}$
Parameters: $R$, $S = \{1, 2\}$, $K$
Output: Updated bus state $S_{b}$ and availability time $A_{b}$ for idle buses
1: for each $r \in R$ do
2:     for each $s \in S$ do
3:         $B_{available} \leftarrow \{b \in B \mid ((S_{b} = ’C1’ \land s = 1) \lor (S_{b} = ’C2’ \land s = 2)) \land A_{b} \leq h_{current}\}$
4:         $D_{future} \leftarrow \sum_{k = 1}^{K} D_{r, s, t + k}$
5:         $C_{total} \leftarrow | B_{available} | \times C$
6:         if $C_{total} > D_{future}$ then
7:             $B_{idle} \leftarrow$ SelectIdleBuses($B_{available}$)
8:             for each $b \in B_{idle}$ do
9:                 $s^{'} \leftarrow 2$ if $s = 1$ else $s^{'} \leftarrow 1$
10:                 $D_{opposite} \leftarrow \sum_{k = 1}^{K} D_{r, s^{'}, t + k}$
11:                 $C_{opposite} \leftarrow | \{b^{'} \in B \mid ((S_{b^{'}} = ’C1’ \land s^{'} = 1) \lor (S_{b^{'}} = ’C2’ \land s^{'} = 2)) \land A_{b^{'}} \leq h_{current}\} | \times C$
12:                 if $C_{opposite} < D_{opposite}$ then
13:                     SendEmptyBus($b, h_{current}, s, r$)
14:                 else
15:                     SendBusToDepot($b, h_{current}, s, τ_{C1, P}, τ_{C2, P}$)
16:                 end if
17:             end for
18:         end if
19:     end for
20: end for
21: return $S_{b}, A_{b}$

For each route $r \in R$ and direction $s \in S$, buses in the state at terminal s whose availability time $A_{b}$ is at or before the current time $h_{current}$ are identified. Next, the future demand $D_{future}$ is calculated by summing the demand predictions $D_{r, s, t^{'}}$ for the next $K$ intervals ($t^{'} > t$) within a defined horizon. This future demand is compared with the total capacity of the available buses at the terminal, calculated as the product of the number of present buses and their individual capacity.

If the present bus capacity exceeds the estimated future demand, some of the buses at the terminal are considered idle. For these buses, the following two options are evaluated:

Sending to the opposite terminal: The future demand at the opposite terminal is estimated, along with the available capacity there. If the capacity is insufficient to meet future demand in the opposite direction, the bus is sent empty. The bus state is updated to in transit towards the opposite terminal, and its availability $A_{b}$ is adjusted, considering the travel time in that direction.
Sending to the depot: If the capacity at the opposite terminal is sufficient to meet its future demand, the bus is sent to the depot to optimize operating costs and reduce idleness. In this case, the bus state is updated to in transit to depot, and its availability $A_{b}$ is adjusted to reflect its arrival at the depot.

Thus, the function ensures dynamic bus management, maximizing their utilization based on the real needs of the transportation system and adjusting to demand fluctuations. By strategically redistributing buses, operational efficiency is optimized, reducing costs associated with idleness.
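The decision made for an idle bus can be condensed into a small function. This is a sketch under simplifying assumptions: the function name and the returned labels are hypothetical, capacity is homogeneous, and the rule that picks which available buses count as idle (SelectIdleBuses in Algorithm 2) is not modelled.

```python
def idle_bus_action(demand_here: float, buses_here: int,
                    demand_opposite: float, buses_opposite: int,
                    capacity: float) -> str:
    """Decide what to do with a bus waiting at a terminal.

    Demands are the forecasts summed over the next K intervals.
    """
    if buses_here * capacity <= demand_here:
        return "stay"                      # no surplus capacity at this terminal
    if buses_opposite * capacity < demand_opposite:
        return "send_empty_to_opposite"    # opposite terminal is short of capacity
    return "send_to_depot"                 # both terminals are covered


# Two 52-seat buses facing 40 forecast passengers, while the opposite
# terminal has one bus for 120 forecast passengers.
print(idle_bus_action(40, 2, 120, 1, 52))
```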

At the end of all time intervals, the function SendBusesToDepot (line 48) is executed to return all buses to the depot at the end of daily operations. This function iterates through the list of all buses, and if a bus is not in the at depot state, it updates its state to in transit to depot and updates its availability $A_{b}$ considering the travel time. This recorded time marks when the bus completes its operation if it was still active. This process ensures that all buses are in the depot at the start of the next operational day, consistent with real-world practice.

Finally, the function ComputeGlobalMetrics is executed with the accumulated data ($P_{b}$, $K_{b}$, $Q_{ns, r, s, t}$, $D_{r, s, t}$) (line 49). This function computes the following global performance metrics, providing a comprehensive evaluation of the effectiveness and efficiency of the implemented heuristic:

Operational Utility (U): evaluates the system’s profitability by calculating the difference between total revenue and total operating costs.

$U_{r} = \sum_{b \in B_{r}} α \times P_{b, r} - \sum_{b \in B_{r}} β \times z_{b, r}, \forall r \in R$

(10)

Total revenue is calculated as the sum of passengers transported by each bus $P_{b, r}$ multiplied by the fare $α$ for each route r. On the other hand, total operating costs are computed as the sum of operating costs per unit of time $β$ multiplied by the total operating time $z_{b, r}$ for each bus b in each route r. This formula represents the total operational utility for each route r in the system.
Service Level (NS): measures the proportion of total demand met by the system.

${NS}_{r} = (\frac{\sum_{b \in B_{r}} P_{b, r}}{\sum_{s \in S} \sum_{t \in T} D_{r, s, t}}) \times 100, \forall r \in R$

(11)

This metric compares the number of passengers transported to the total predicted demand for each route r, reflecting the system’s efficiency in meeting demand.
Average Capacity Utilization (CU): indicates the average occupancy percentage of buses during trips.

${CU}_{r} = (\frac{\sum_{b \in B_{r}} P_{b, r}}{\sum_{s \in S} | B_{r} | \times C_{r, s}}) \times 100, \forall r \in R$

(12)

It reflects the proportion of bus capacity that is effectively utilized during trips, comparing the total passengers transported to the total available capacity for each route r.
Average Utilization (UP): shows the proportion of time buses are in active operation (on route) compared to the total available operational time.

${UP}_{r} = (\frac{\sum_{b \in B_{r}} \text{Time on route}_{b, r}}{\sum_{b \in B_{r}} \text{Total operation time}_{b, r}}) \times 100, \forall r \in R$

(13)

This metric is useful for assessing operational efficiency, showing how much time buses actually spend on the route compared to the total available operating time for each route r.
Passengers per Kilometer Index (IPK): measures efficiency in terms of bus occupancy relative to the distance traveled.

${IPK}_{r} = \frac{\sum_{b \in B_{r}} P_{b, r}}{\sum_{b \in B_{r}} K_{b, r}}, \forall r \in R$

(14)

This allows the evaluation of the system’s operational efficiency by relating the number of transported passengers to the distance traveled for each route r.
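Equations (10)–(14) can be evaluated together for a single route, as in the sketch below. The data class, the function name, and all numeric values are hypothetical; the sketch also assumes that the total operation time in Equation (13) is the same quantity as $z_{b, r}$ in Equation (10).

```python
from dataclasses import dataclass


@dataclass
class BusTotals:
    passengers: float      # P_{b,r}
    km: float              # K_{b,r}
    operating_time: float  # z_{b,r}, assumed equal to total operation time
    time_on_route: float   # time spent on route


def route_metrics(buses, total_demand, capacities, fare, cost_per_time):
    """Return U, NS, CU, UP and IPK for one route (Equations (10)-(14))."""
    pax = sum(b.passengers for b in buses)
    op_time = sum(b.operating_time for b in buses)
    return {
        "U": fare * pax - cost_per_time * op_time,
        "NS": pax / total_demand * 100,
        "CU": pax / sum(len(buses) * c for c in capacities) * 100,
        "UP": sum(b.time_on_route for b in buses) / op_time * 100,
        "IPK": pax / sum(b.km for b in buses),
    }


# Illustrative route with two buses and one capacity value per direction.
fleet = [BusTotals(100, 40, 5, 4), BusTotals(60, 20, 3, 2)]
metrics = route_metrics(fleet, total_demand=200, capacities=[52, 52],
                        fare=2.0, cost_per_time=10.0)
print(metrics)
```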

These metrics provide essential insights to assess the effectiveness of the heuristic and identify potential areas for improvement. They also facilitate comparisons between different configurations or operational strategies. Additionally, they offer a solid foundation for making adjustments to operational parameters or resource allocation, thereby optimizing the overall system’s performance, as will be discussed later.

3.4.3. Heuristic Parameter Tuning

A sensitivity analysis is performed to identify the optimal values for the two key parameters in the bus dispatch heuristic: the dispatch threshold and the maximum waiting time for each route. This analysis aims to evaluate the impact of different parameter configurations on the performance of the public transportation system and determine the combinations that maximize operational utility U, achieving a balance between operational efficiency and service quality.

The analysis begins by defining a discrete search space for both parameters. The dispatch threshold is evaluated for values ranging from 10% to 100%, in increments of 2%, while the maximum waiting time is analyzed within a range of 0 to 60 min, in increments of two minutes. This yields 46 threshold values and 31 waiting-time values, for a total of 1426 parameter combinations, each evaluated independently.
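The size of this search space is easy to confirm by enumerating it. The variable names below are illustrative.

```python
from itertools import product

thresholds = range(10, 101, 2)   # dispatch threshold in percent: 10, 12, ..., 100
max_waits = range(0, 61, 2)      # maximum waiting time in minutes: 0, 2, ..., 60

# Every (threshold, waiting time) pair is one heuristic configuration.
grid = list(product(thresholds, max_waits))
print(len(thresholds), len(max_waits), len(grid))
```

Each pair in `grid` would be passed to one run of the heuristic for a given route.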

For each parameter combination, the heuristic is executed, and the combination that maximizes operational utility is selected. To break ties, a composite metric (MP) is defined that synthesizes the key performance metrics (service level, average capacity utilization, and average utilization), acting as a global indicator of system performance by balancing demand fulfillment and resource efficiency. The weighting prioritizes demand fulfillment with a weight of 0.4 and assigns a weight of 0.3 to both average capacity utilization and average utilization. The composite metric is calculated as a linear combination of these values, as shown in Equation (15).

MP = 0.4 \cdot NS + 0.3 \cdot CU + 0.3 \cdot UP

(15)

The tie-breaking selection among configurations with equal utilities is based on identifying the combination of dispatch threshold and maximum waiting time that maximizes the composite metric, establishing the heuristic configuration for each route that enables the best possible operation within the system’s constraints.
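The selection rule amounts to a lexicographic maximum: utility first, then the composite metric of Equation (15). A minimal sketch follows; the dictionary keys and the three sample results are hypothetical, and ties are detected by exact equality of the utility values.

```python
def composite_metric(ns: float, cu: float, up: float) -> float:
    """Equation (15): MP = 0.4 NS + 0.3 CU + 0.3 UP."""
    return 0.4 * ns + 0.3 * cu + 0.3 * up


def best_configuration(results: list[dict]) -> dict:
    """Pick the run with the highest utility, breaking ties by MP."""
    return max(results,
               key=lambda r: (r["U"], composite_metric(r["NS"], r["CU"], r["UP"])))


# Illustrative results for three (threshold, waiting time) configurations.
results = [
    {"theta": 50, "gamma": 20, "U": 100.0, "NS": 90.0, "CU": 60.0, "UP": 70.0},
    {"theta": 60, "gamma": 20, "U": 100.0, "NS": 95.0, "CU": 60.0, "UP": 70.0},
    {"theta": 70, "gamma": 10, "U": 90.0, "NS": 99.0, "CU": 99.0, "UP": 99.0},
]
best = best_configuration(results)
print(best["theta"], best["gamma"])
```

The third configuration has the best composite metric but a lower utility, so it is not selected; the composite metric only decides between the first two.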

3.5. Experimental Design

In order to compare the Baseline, Heuristic, and MILP dispatch strategies, we employed a fully crossed repeated-measures design. For each route, thirty independent demand scenarios were generated by running the forecasting model thirty times, so that each run produced a complete daily demand trajectory. Applying all three methods to the same set of scenarios controls for day-to-day demand variability and enhances statistical power for within-route comparisons. Moreover, the sample size of $N = 30$ is justified by the central limit theorem, which ensures that the sampling distribution of the mean approximates normality, even if the underlying distribution is not normal, provided the sample is sufficiently large.

Subsequently, a one-way repeated-measures analysis of variance (ANOVA) was conducted to compare the mean Utility and mean Service Level across the three methods and to determine whether these means differed significantly. Prior to the ANOVA, its assumptions were assessed: the normality of within-subject differences was tested using the Shapiro–Wilk test, and the sphericity of the covariance matrix was evaluated with Mauchly’s test. In cases of sphericity violation, the Greenhouse–Geisser correction was applied to the degrees of freedom of the F statistic to maintain appropriate control of the Type I error rate. The ANOVA was formulated with the hypotheses

$H_{0} : μ_{Baseline} = μ_{Heuristic} = μ_{MILP}, \quad H_{A} : \exists i \neq j : μ_{i} \neq μ_{j},$

where $μ_{i}$ denotes the mean of the metric for method $i$.

When the ANOVA indicated a significant effect ($p < 0.05$), Tukey’s HSD adapted for repeated measures was used for post hoc pairwise comparisons among the methods. For each comparison of $i$ vs. $j$, the following hypotheses were tested:

$H_{0}^{(i, j)} : μ_{i} = μ_{j} \quad \text{vs.} \quad H_{A}^{(i, j)} : μ_{i} \neq μ_{j}.$

This protocol ensures a rigorous, high-power statistical evaluation capable of detecting genuine differences in Utility and Service Level among dispatch strategies under realistic demand variability. In doing so, it establishes whether the heuristic achieves significant improvements over the baseline and how closely it approximates the exact model’s performance.
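For readers unfamiliar with the repeated-measures layout, the sketch below computes the F statistic of a one-way repeated-measures ANOVA from a scenario-by-method table using the standard sum-of-squares partition. It is not the authors' analysis code: the function name and the toy data are hypothetical, and the p-value, the Shapiro–Wilk and Mauchly tests, the Greenhouse–Geisser correction, and the post hoc comparisons are all omitted.

```python
def rm_anova_f(data: list[list[float]]) -> tuple[float, int, int]:
    """F statistic for a one-way repeated-measures ANOVA.

    data[i][j] is the metric for scenario i (the "subject") under method j.
    Returns (F, df_methods, df_error).
    """
    n, k = len(data), len(data[0])
    grand = sum(sum(row) for row in data) / (n * k)
    method_means = [sum(row[j] for row in data) / n for j in range(k)]
    scenario_means = [sum(row) / k for row in data]

    ss_total = sum((x - grand) ** 2 for row in data for x in row)
    ss_methods = n * sum((m - grand) ** 2 for m in method_means)
    ss_scenarios = k * sum((s - grand) ** 2 for s in scenario_means)
    ss_error = ss_total - ss_methods - ss_scenarios  # residual after removing scenario effects

    df_methods, df_error = k - 1, (n - 1) * (k - 1)
    f_stat = (ss_methods / df_methods) / (ss_error / df_error)
    return f_stat, df_methods, df_error


# Toy table: 3 scenarios (rows) evaluated under 3 methods (columns).
toy = [[1, 2, 4], [2, 3, 6], [3, 5, 7]]
f_stat, df1, df2 = rm_anova_f(toy)
print(round(f_stat, 3), df1, df2)
```

Removing the between-scenario sum of squares from the error term is what distinguishes this design from an ordinary one-way ANOVA and is the source of the extra power mentioned above.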

4. Computational Experiments and Results Analysis

This section presents the computational evaluation of the proposed dynamic dispatch heuristic and its comparison against both the baseline manual scheduling currently used in the public transportation system of Montería and an exact optimization model formulated as a MILP. The analysis is based on real operational data from eight representative bus routes, selected for their diversity in demand patterns and structural characteristics.

The evaluation proceeds in several stages. First, we describe the operational context of the case study, the limitations of the current dispatch strategy, and the technical specifications of the computational environment used in simulations. Next, we assess the performance of the demand forecasting models (Prophet, ARIMA, and LSTM), examining their calibration, predictive accuracy, and impact on downstream dispatch decisions.

Following the demand prediction, we present the results obtained from the baseline rule, the exact optimization model, and the proposed heuristic. Each method is evaluated on key performance indicators including operational utility, service level, vehicle occupancy, and system utilization. A comparative analysis quantifies the relative gaps in performance between the heuristic and the optimization model, and highlights the heuristic’s advantages in terms of computational efficiency and near-optimal quality.

Finally, a statistical validation is performed using repeated-measures ANOVA and Tukey’s HSD tests across multiple simulated demand scenarios, providing robust evidence on the significance and consistency of observed differences between methods.

4.1. Case Study Overview and Operational Context

The proposed models were tested on the public transportation system of Montería, Colombia, a mid-sized city operating 21 urban bus routes between 5:00 a.m. and 9:00 p.m. All routes use homogeneous buses with a capacity of 52 passengers, and vehicles are dispatched manually by operators based on average travel times and experiential heuristics, without real-time coordination or explicit demand modeling. This leads to mismatches between service supply and actual demand, particularly during peak hours or under variability in return times.

Despite recording boarding data through turnstiles, the system suffers from incomplete demand capture, directional registration errors, and a lack of historical travel-time logs. To compensate, estimated average travel times were obtained from an urban mobility application. Eight routes were selected for experimentation—Pradera 27, Panzenu, Santander, Mogambo 22, Tambo Circunvalar, KM30, Dorado, and KM15—chosen for their diversity in operational characteristics and demand patterns.

All models (demand forecasting, MILP optimization, and the adaptive heuristic) were implemented in Python 3.11.7. Simulations were conducted on a Windows 11® machine with an Intel Core i5-12450H processor (12th generation, 2.4 GHz) and 8 GB RAM. Full implementation details and source code are available at: https://github.com/javibarrera10/DynamicHeuristicForPublicTransportation, accessed on 1 May 2025.

4.2. Passenger Demand Forecasting Performance

In this subsection, we develop and evaluate predictive models tailored to the specific context of Montería’s public transport system, where available data is sparse, incomplete, and biased by operational limitations.

The analysis begins with a description of the preprocessing procedures needed to transform raw boarding data into usable time series. Three alternative forecasting approaches are implemented—Prophet, ARIMA, and LSTM—to explore trade-offs between interpretability, flexibility, and accuracy. Each model is trained and validated using a structured cross-validation process, with hyperparameters tuned to minimize forecast error.

We then compare the models’ predictive performance across eight representative routes using standard accuracy metrics (RMSE and MAE), and analyze how their forecasts differ in capturing temporal patterns of demand. The results guide the selection of Prophet as the forecasting model used in the subsequent dispatch optimization, based on its robustness and ease of integration. A final subsection discusses model limitations and implications for downstream decision-making.

4.2.1. Data Preparation and Model Configuration

Passenger demand forecasting was based on historical operational records collected in Montería, with timestamps at 10 min intervals formatted as YYYY-MM-DD HH:MM:SS. The data underwent a rigorous preprocessing pipeline that included:

Elimination of missing or inconsistent entries.
Standardization of timestamp formats.
Alignment of time indices across all routes.
Aggregation of demand by route and direction within each interval.

To capture external influences on passenger flow, the model incorporated engineered regressors: binary flags indicating holidays, weekends, and peak-hour windows were added to the dataset. These features enhanced the capacity of Prophet to detect conditional seasonal effects and behavioral shifts.

Three forecasting models were evaluated: Prophet, ARIMA, and LSTM. Prophet was tuned through a grid search over the following hyperparameter space:

changepoint_prior_scale ∈ {0.001, 0.01, 0.1, 0.5}
seasonality_prior_scale ∈ {0.01, 0.1, 1.0, 10.0}

Time-series cross-validation was performed with 180-day training windows, a 30-day spacing between cutoffs, and 30-day horizons.
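The grid and the cross-validation layout can be sketched without the forecasting library itself. The cutoff rule below is an assumed rolling-origin scheme in which 180 days is the minimum training window; it is not claimed to reproduce Prophet's internal logic, and the start and end dates are hypothetical. In the Prophet library, each parameter dictionary would typically be passed to the `Prophet` constructor and the three window lengths to `prophet.diagnostics.cross_validation` as `initial`, `period`, and `horizon`.

```python
from datetime import date, timedelta
from itertools import product

# The 4 x 4 hyperparameter grid described in the text.
param_grid = [
    {"changepoint_prior_scale": c, "seasonality_prior_scale": s}
    for c, s in product([0.001, 0.01, 0.1, 0.5], [0.01, 0.1, 1.0, 10.0])
]


def rolling_cutoffs(start: date, end: date, initial_days: int = 180,
                    period_days: int = 30, horizon_days: int = 30) -> list[date]:
    """Cutoff dates spaced `period_days` apart, each leaving a full horizon
    after it and at least `initial_days` of history before it."""
    cutoffs = []
    cutoff = end - timedelta(days=horizon_days)
    while cutoff - timedelta(days=initial_days) >= start:
        cutoffs.append(cutoff)
        cutoff -= timedelta(days=period_days)
    return sorted(cutoffs)


# Hypothetical one-year history.
cutoffs = rolling_cutoffs(date(2024, 1, 1), date(2024, 12, 31))
print(len(param_grid), len(cutoffs), cutoffs[0], cutoffs[-1])
```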

ARIMA models were specified as seasonal ARIMA, with order selection via Akaike Information Criterion (AIC). Candidate parameters included:

$p, q \in \{0, 1, 2\}, d \in \{0, 1, 2\}$
$P, Q \in \{0, 1\}, D \in \{0, 1\}, s = 7$

Each route’s model was independently optimized.

The LSTM network used a sliding-window input representation of 18 time steps (covering 3 h), enriched with cyclical time encodings (e.g., sin(hour), cos(hour)). The architecture consisted of two stacked LSTM layers (64 and 32 units) with BatchNormalization and Dropout(0.2). Training was performed using the Adam optimizer and MSE loss, with early stopping and learning rate decay. All inputs were normalized (zero mean, unit variance), and outputs were rescaled post-prediction.
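The input representation can be illustrated in plain Python. The sketch assumes a one-step-ahead target and a 24 h period for the cyclical encoding; both function names are hypothetical, and the network itself is not reproduced here.

```python
import math


def cyclical_hour(hour: float) -> tuple[float, float]:
    """Encode the hour of day on the unit circle so that 23:00 and 00:00 are close."""
    angle = 2 * math.pi * hour / 24
    return math.sin(angle), math.cos(angle)


def sliding_windows(series: list[float], window: int = 18):
    """Split a series into inputs of `window` steps and the value that follows each."""
    inputs = [series[i:i + window] for i in range(len(series) - window)]
    targets = [series[i + window] for i in range(len(series) - window)]
    return inputs, targets


# With 10 min bins, 18 steps cover 3 h of history per training sample.
inputs, targets = sliding_windows(list(range(20)), window=18)
print(len(inputs), targets)
```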

4.2.2. Model Evaluation and Accuracy Metrics

The final models, selected per route using the best validation performance, were evaluated across eight representative routes in Montería. The metrics used were Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE). Table 5 summarizes the comparative accuracy.
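For reference, the two metrics follow their standard definitions, sketched below with illustrative numbers rather than data from the study.

```python
import math


def rmse(actual: list[float], predicted: list[float]) -> float:
    """Root Mean Squared Error: penalizes large errors more heavily."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: list[float], predicted: list[float]) -> float:
    """Mean Absolute Error: average size of the error in passengers."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Illustrative passenger counts per 10 min interval.
observed = [10, 20, 30]
forecast = [12, 18, 33]
print(rmse(observed, forecast), mae(observed, forecast))
```

Because RMSE squares the errors, it is always at least as large as MAE, and a wide gap between the two points to a few large misses, such as underestimated peaks.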

On high-demand routes such as Pradera 27, Panzenu, and Santander, the LSTM model consistently outperformed others, achieving the lowest RMSE and MAE values, which indicates superior forecasting accuracy. In contrast, on low-demand routes like KM15 and KM30, Prophet demonstrated strong performance, suggesting that its ability to model seasonality is particularly effective in low-volume scenarios. For routes with pronounced seasonal patterns, such as Mogambo 22 and Dorado, Prophet delivered competitive results due to its built-in seasonality components, although LSTM outperformed it in terms of RMSE on Dorado, highlighting the potential of nonlinear models in specific cases. ARIMA provided reasonable performance in scenarios with stable demand patterns, such as Tambo Circunvalar, but its sensitivity to parameter tuning poses a risk of overfitting if not carefully managed. Despite the strengths shown by all three models in different contexts, Prophet was selected as the final forecasting model due to its robust handling of seasonality, interpretability, and consistent performance across routes.

4.2.3. Illustrative Demand Profiles

Figure 1 highlights the daily demand profile (observed vs. predicted) for two contrasting routes: one with higher volume and clear peak patterns (Pradera 27) and another with lower, sporadic demand (KM15). Each plot compares Prophet, ARIMA, and LSTM forecasts against the actual recorded passenger counts over a sample three-day period.

As shown in Figure 1, all three models reproduce the overarching morning rise and evening fall in passenger demand, but they differ in how they handle sudden shifts. On the high-volume Pradera 27 route, the LSTM’s nonlinear architecture most closely tracks the sharp morning surge, minimizing peak-underestimation that can lead to bus shortages. Prophet, by contrast, provides a smooth baseline and broad confidence intervals—advantages on low-volume routes like KM15—yet its smoothed peaks often lag real demand, risking idle capacity during critical windows. ARIMA sits between these extremes, performing well under gradual demand changes but proving sensitive to parameter settings when faced with abrupt dips or spikes. Residual patterns and peak-error metrics (difference between forecasted and actual maximum) reveal that LSTM yields the smallest peak miss, Prophet delivers the highest coverage of true values within its forecast intervals, and ARIMA falls in the middle. While small errors in 10 min bins have limited immediate impact on dispatch, consistent under- or over-prediction at peak times can degrade service regularity—highlighting the need to choose the model whose error profile best aligns with the operational cost of missed versus excess trips.

4.2.4. Passenger Demand Prediction in Montería

The demand prediction model was run with a 10 min prediction interval to determine the demand to be met on each route on 8 April 2025. This date was chosen because its temporal characteristics place it in the group with the largest amount of available historical data. The model successfully generated predictions for all routes; however, only 8 of the 21 available routes are included in the present analysis. The optimal calibration parameters for each route's model were obtained through cross-validation, and the results are presented in Table 6.

Table 6 presents the calibration results of the Prophet model. A significant variability in parameter values across different routes can be observed, suggesting that each route exhibits unique characteristics in its demand patterns. For example, the KM 15 route has the lowest RMSE (3.19) with parameters of 0.010 and 10, indicating excellent prediction accuracy due to a calibration that favors strong seasonality and a relatively stable trend.

In contrast, the Pradera 27 route shows the highest RMSE (14.89) with parameters of 0.5 and 10, which could indicate greater flexibility in detecting trend changes and a possible complexity in demand patterns that the model has struggled to capture accurately. Additionally, routes like KM 30 and Dorado also exhibit low RMSE values, reflecting a good fit of the selected parameters in capturing their specific dynamics. Overall, the customized calibration of parameters for each route has optimized prediction accuracy, evidenced by the low RMSE values, indicating high precision in forecasts.

Figure 2 illustrates the predicted passenger demand for different routes as estimated by the Prophet model. It highlights that the Pradera 27, Panzenu, and Santander routes exhibit the highest demand throughout the day, with averages of 43, 30, and 28 passengers every ten minutes, respectively. In contrast, routes such as KM30 and KM15 show significantly lower and more stable demand levels, with daily averages of six and five passengers per ten-minute interval.

Overall, the routes share a similar pattern in demand variation throughout the day. From the beginning of operations, demand progressively increases, reaching its peak between 7:00 a.m. and 8:00 a.m. Subsequently, a relatively stable trend is observed, with slight fluctuations until around 5:00 p.m., when demand begins to decline steadily across all routes, reaching its lowest levels at the end of the day.

4.2.5. Model Limitations

Despite the mentioned capabilities, the model presents certain limitations. One of the main limitations is that its performance depends on the quality and quantity of available historical data. Specifically, if there are insufficient data in specific groups, such as low-traffic hours or underutilized routes, predictions may not adequately reflect the real variability of demand. Additionally, it is assumed that observed past patterns will remain consistent in the future, which may not be valid in highly irregular scenarios or those subject to disruptive changes.

Another significant limitation lies in the granularity of the configured prediction interval. If intervals are too short (e.g., a few minutes), the model may fail to capture the inherent demand variability adequately, generating noisier predictions with wider confidence intervals. Conversely, excessively long intervals may dilute finer patterns, compromising the practical utility of predictions for operational planning. Therefore, the prediction interval must be carefully selected based on specific operational needs and the structure of the historical data.

Additionally, although Prophet robustly models seasonal and trend effects, it is not designed to handle highly nonlinear patterns or complex variable interactions, such as those arising when demand depends on multiple external factors like weather and traffic conditions. In such cases, more sophisticated models, such as neural networks or hybrid approaches combining different prediction techniques, may be required.

Once the demand is determined, and considering the parameters of the different routes and their respective directions, the heuristic and the optimization model are executed. In the following sections, the performance obtained from each method will be individually evaluated, followed by a comparative analysis.

4.3. Operational Results by Dispatch Strategy

This section evaluates the operational outcomes of three dispatch strategies: a baseline approach based on fixed headways, an exact MILP optimization model, and a heuristic procedure calibrated through parameter tuning.

4.3.1. Baseline Performance

Table 7 presents the results obtained under the baseline strategy. As expected, this approach delivers consistent service regardless of fluctuations in demand, leading to inefficiencies in both over- and under-provision of vehicles. While operationally simple, the strategy results in higher overall waiting times and increased total distance traveled, as it cannot adapt to temporal or spatial demand variability.

Table 7 shows both the utility and the performance metrics when implementing the dispatch rule currently used by the company with the demand forecasts generated by the prediction model.

In the baseline configuration, fixed intervals result in a substantial loss of demand on the routes with the highest passenger flows (e.g., Pradera 27 covers only 49% of demand) and in highly variable utilization. Because these routes offer the greatest revenue opportunities, operating inefficiently there is very costly for the system and directly reduces profits. Some medium-demand routes, such as Mogambo 22 and Dorado, do meet all demand under the continuous deployment that this rule prescribes, but they operate inefficiently, with low vehicle utilization (32% and 34%, respectively). Low-demand routes, such as KM30, suffer from both poor service (15%) and negative utility; that is, the system currently covers them at a loss. Overall, the baseline dispatch rule produces low levels of service and operational efficiency, making it clear that other approaches need to be explored to enable the company to improve its operation.

4.3.2. MILP Optimization Model

To implement the optimization model described in Section 3.2, the Gurobi optimizer version 11.0.0 was used. Considering the size and complexity of the problem, execution stopped when either a time limit of 3600 s or a GAP of 0.5% was reached, where the GAP is the relative difference between the best solution found and the best known bound on the optimal value.
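For reference, the stopping rule can be written compactly. The symbols below are introduced here for illustration and do not appear in the original formulation: z_inc is the objective value of the best solution found, z_bnd is the solver's best bound, and t is the elapsed solve time in seconds. The gap expression follows the relative MIP gap definition used by Gurobi.

```latex
\[
\text{GAP} = \frac{\lvert z_{\text{bnd}} - z_{\text{inc}} \rvert}{\lvert z_{\text{inc}} \rvert},
\qquad
\text{stop when } \text{GAP} \le 0.005 \ \text{ or } \ t \ge 3600~\text{s}.
\]
```

Under this rule, a reported solution can sit up to 0.5% away from the best bound even when the solver stops early.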

Table 8 presents the results of the MILP model implementation for each of the analyzed routes in the case study. The results show that, for all routes, the model reached a relative MILP gap below 0.5% within the established time limit.

Additionally, the highest utility values obtained by the MILP model for each route are reported. It is observed that routes with higher passenger demand yield the highest utility values. In particular, the Pradera 27 and Panzenu routes stand out with utilities of 12,481 and 9328, respectively. In contrast, routes with lower demand, such as KM30 and KM15, show the lowest operational utilities, with values of 815 and 1033, respectively. These results confirm the direct relationship between passenger demand and generated utility, reflecting the impact of demand on system efficiency and profitability.

Table 9 presents the performance metrics corresponding to the optimal solution obtained by the model for each instance. The results show that the dispatch strategy established by the model allows for servicing more than 94% of the demand in all instances, except for the KM30 route, where a service level of 49.36% is reached. This outcome is expected since, although the model aims to maximize the serviced demand, the availability of a single bus to cover both directions of the route limits its operational capacity. In each time interval, half of the demand corresponding to one direction is lost while the bus operates in the opposite direction. This result highlights the model’s robustness under severe constraints, although it is clearly limited by available resources.

To achieve the reported service levels and operational utilities, the model opted to utilize the entire available fleet in routes such as Pradera 27, KM30, and KM15. In contrast, other routes achieved their objectives without employing all available vehicles, resulting in operational cost savings. Specifically, Panzenu and Santander used six out of seven available buses, Mogambo 22 deployed three out of seven enabled buses, Tambo Circunvalar utilized three out of five assigned buses, and the Dorado route required only two out of the seven available buses.

In terms of the number of trips, high-demand routes such as Pradera 27 required a higher number of journeys, accumulating a total of 169 trips. Conversely, lower-demand routes like KM15 and KM30 completed 53 and 22 trips, respectively, with all trips on KM30 being carried out by a single bus. In general, buses on high-demand routes exhibit high average occupancy levels during their journeys. For example, in the Pradera 27, Panzenu, and Santander routes, vehicles operate with average capacity utilization levels above 90%. In contrast, routes such as Mogambo, Tambo Circunvalar, and Dorado operate with average occupancy levels close to 80%. Lower-demand routes, KM15 and KM30, show significantly lower performance in terms of capacity utilization, with averages of 32.63% and 53.76%, respectively.

Additionally, the results indicate that on routes such as Pradera 27, Panzenu, Santander, KM30, and KM15, buses exhibit utilization levels exceeding 80%. This suggests that, for most of the operational time, vehicles are actively performing trips in directions 1 and 2, with minimal waiting times at terminals or empty relocations. On the other hand, routes such as Mogambo, Tambo Circunvalar, and Dorado show lower average utilization levels of 40.87%, 56.11%, and 27.68%, respectively, indicating that buses on these routes spend a significant portion of their operational time idle.

4.3.3. Heuristic Strategy

When executing the heuristic, a sensitivity analysis is performed to determine the parameters that allow maximizing utility for each route.

Figure 3 presents the sensitivity of utility to changes in the dispatch threshold and the maximum waiting time for each evaluated route. The results show differentiated behaviors depending on the demand level of each route. For low-demand routes such as KM30, Dorado, KM15, Tambo Circunvalar, and Mogambo 22, utility increases as the dispatch threshold rises, reaching its peak at low or medium values of this metric. Beyond these points, utility stabilizes, indicating that changes in the dispatch threshold no longer significantly influence performance.

On the other hand, medium-to-high demand routes such as Santander and Panzenu exhibit a progressive increase in utility as the dispatch threshold increases, reaching their highest values at high thresholds.

The highest-demand route, Pradera 27, exhibits a distinctive pattern: its utility remains stable despite variations in the dispatch threshold until reaching a value close to 80%. Beyond this point, utility declines considerably with further threshold increases, which may be due to a loss of operational efficiency under these conditions.

Regarding the maximum waiting time, a more uniform pattern is observed across different routes. In general, utility experiences minor fluctuations when modifying this metric (lower sensitivity), with the highest utility values achieved when setting maximum waiting times between 20 and 30 min. Beyond this range, utility tends to stabilize, suggesting that longer waiting times do not generate any additional significant impact on operational utility.

Table 10 presents the best-performing parameters obtained through the sensitivity analysis to maximize utility for each route. The results show that for high-demand routes such as Pradera, Panzenu, and Santander, high dispatch thresholds are preferred. In these routes, buses are dispatched only when the accumulation of passengers approaches full capacity, with optimal values of 96%, 88%, and 94%, respectively. Conversely, for low-demand routes such as Mogambo 22, KM30, and Dorado, better performance is achieved with lower dispatch thresholds, slightly above half of the bus capacity. This reflects a preference for more frequent dispatches, even if full occupancy is not guaranteed. The optimal thresholds for these routes are 70%, 60%, and 68%, respectively.

A particular case is observed in route KM15, where the identified optimal dispatch threshold is significantly lower, at 30%. This behavior can be explained by the fact that KM15 is the route with the lowest demand and the least availability of buses, requiring a single vehicle to serve demand in both directions. In this context, the objective is to maximize dispatch frequency, ensuring that the bus remains in constant movement to cover as much demand as possible.

Regarding the second heuristic adjustment parameter, the results indicate that for high-demand routes such as Pradera, Panzenu, and Santander, the optimal maximum waiting time at the terminal is 30 min. Meanwhile, for lower-demand routes, the maximum waiting time is 20 min. This reflects a strategy in which longer waiting times are allowed on high-demand routes, as passenger accumulations tend to be higher, promoting more efficient use of bus capacity. However, this approach may compromise vehicle utilization since it prioritizes efficiency in transporting larger volumes of passengers.
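To make the role of the two tuning parameters concrete, the following is a deliberately simplified sketch of a threshold-plus-maximum-wait dispatch rule at a single terminal. It is not the heuristic evaluated in this paper: it assumes a bus is always available, ignores travel times, directions, and depot movements, and the function name `simulate_dispatch` and its arguments are hypothetical.

```python
def simulate_dispatch(demand, capacity, threshold, max_wait_min, step_min=10):
    """Simplified single-terminal dispatch rule (illustrative assumption).

    demand: forecast arrivals per interval of length step_min.
    threshold: fraction of bus capacity that triggers a dispatch.
    max_wait_min: dispatch anyway once passengers have waited this long.
    Returns a list of (interval_index, passengers_boarded).
    """
    waiting = 0      # passengers currently queued at the terminal
    waited_min = 0   # minutes the current queue has been building
    dispatches = []

    for t, arrivals in enumerate(demand):
        waiting += arrivals
        if waiting == 0:
            waited_min = 0
            continue

        load_trigger = waiting >= threshold * capacity
        wait_trigger = waited_min >= max_wait_min

        if load_trigger or wait_trigger:
            load = min(waiting, capacity)  # passengers beyond capacity stay queued
            dispatches.append((t, load))
            waiting -= load
            waited_min = 0                 # simplification: clock restarts after a dispatch
        else:
            waited_min += step_min

    return dispatches
```

In this toy setting, a high threshold delays dispatches until buses are nearly full, while the maximum waiting time caps how long a small queue can be left unserved, which mirrors the trade-off described for high-demand and low-demand routes.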

Table 11 presents the performance metrics obtained for each route when executing the heuristic calibrated with the best-performing parameters. The results show that Pradera 27, Panzenu, Santander, and Dorado stand out for their high service level and operational efficiency. Panzenu and Santander meet 100% and 97.92% of the demand, respectively, while maintaining excellent bus capacity utilization (96.68% and 99.6%), indicating that during their operation, vehicles are consistently picking up passengers with minimal waiting times or empty trips. Additionally, these routes exhibit passenger-per-kilometer (IPK) indices above 2 (Panzenu at 2.68 and Santander at 2.16), reflecting an efficient operation that is well balanced between user satisfaction and resource optimization, ensuring that buses pick up a significant number of passengers per kilometer traveled. Dorado, with an IPK of 2.78 and an average utilization of 96.97%, demonstrates a highly efficient operation, maximizing bus usage without compromising service quality.

On the other hand, Mogambo 22 also achieves a perfect service level (100%) and the highest global IPK of 2.85, demonstrating excellent efficiency in passenger transport per kilometer and successfully meeting available demand. However, its average utilization is moderate (72.08%), and its average capacity utilization is 78.72%, suggesting that while the buses are effective on each journey, they spend a significant amount of time not picking up passengers, and their trips are on average about 79% full. In contrast, Tambo Circunvalar maintains a high service level (97.92%) and good capacity utilization (91.42%), although its average utilization is somewhat low (62.55%), indicating that buses are dispatched less frequently but operate near full capacity each time they complete a trip.

KM30 and KM15 present significant challenges in terms of efficiency and demand satisfaction. KM30 only manages to meet 50% of the demand with an IPK of 1.52 and an average capacity utilization of 54.46%, indicating a highly inefficient operation and a resource allocation that fails to cover existing demand. KM15, while achieving an almost perfect service level (99.58%), does so with a very low average capacity utilization (31.5%) and an IPK of 1.42, reflecting that the fleet capacity assigned for this day may be overestimated given the demand, as only a small portion of the active buses’ capacity on this route is actually utilized.
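The four indicators discussed above can be computed from a list of completed trips. The sketch below uses plausible definitions that are assumptions made here for illustration (the paper's formal definitions may differ in detail); the `Trip` record and the function `route_metrics` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Trip:
    passengers: int      # passengers carried on the trip
    capacity: int        # bus capacity
    distance_km: float   # trip length
    duration_min: float  # time spent on the passenger trip


def route_metrics(trips, total_demand, buses_used, operating_min):
    """Assumed definitions of service level, capacity use, bus use and IPK."""
    if not trips or total_demand <= 0 or buses_used <= 0 or operating_min <= 0:
        raise ValueError("inputs must describe at least one trip and positive totals")

    served = sum(t.passengers for t in trips)
    return {
        # Share of forecast demand that was actually transported.
        "service_level": served / total_demand,
        # Average load factor per trip.
        "capacity_utilization": sum(t.passengers / t.capacity for t in trips) / len(trips),
        # Share of available bus time spent on passenger trips.
        "bus_utilization": sum(t.duration_min for t in trips) / (buses_used * operating_min),
        # Passengers per kilometer traveled.
        "ipk": served / sum(t.distance_km for t in trips),
    }
```

With definitions of this kind, a route can reach a service level near 100% and still show low capacity utilization, as reported for KM15, because the first indicator depends on demand served while the second depends on how full each trip is.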

An example of the dispatch schedule established by the heuristic is presented in Figure 4, which illustrates the specific scheduling for the Pradera 27 route. The figure shows how buses begin operations in the stored in depot state (red), from where some of them proceed to travel either from the depot to terminal 1 (orange) or terminal 2 (yellow) to arrive at the terminals in time to start their respective trips in direction 1 (dark green) and direction 2 (light green). This ensures that passenger demand is met from the start of operations, scheduled for 5:00 a.m.

Throughout the day, it is evident that buses remain in active operation for most of the time, continuously running trips to serve passengers. In most cases, buses start a new trip immediately after completing the previous one. However, there are also moments when there is no immediate demand to be met. In these situations, buses remain in waiting at terminal 1 (dark blue) or terminal 2 (light blue), ready to be dispatched when demand requires it.

Additionally, empty trips (pink or purple), which occur when a bus is not needed at one terminal and must move to the other to meet future demand, are not observed for this route. When such movements do occur, they reflect the necessity of balancing bus availability between terminals. At the end of the daily operation, all buses make return trips from the terminals (terminal 1 or terminal 2) to the depot, where they are stored again to be ready for the next day’s operation.

4.4. Comparative Analysis of Dispatch Approaches

4.4.1. Performance Gaps

A comparative evaluation is now performed to measure the effectiveness of the heuristic in obtaining efficient dispatch solutions. This evaluation focuses primarily on the main objective: operational utility. However, it is also examined through other key metrics that provide relevant insights into the system, such as service level, the average capacity used by buses, and the utilization of buses for passenger pickup during operation. To quantify the differences between the solutions obtained by the optimization model and the heuristic, the comparison is established using the relative gap, calculated both for utility and for each of the performance metrics.

\text{Relative Gap} = \frac{\text{Model Value} - \text{Approximate Value}}{\text{Model Value}} \qquad (16)
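As a minimal sketch, Equation (16) translates directly into code. The function name `relative_gap` is an assumption made here, and the numbers in the usage note are hypothetical rather than values from the tables.

```python
def relative_gap(model_value, approximate_value):
    """Relative gap of an approximate method against the MILP model, Equation (16).

    Positive: the approximate method falls short of the model.
    Negative: the approximate method exceeds the model.
    """
    if model_value == 0:
        raise ValueError("relative gap is undefined when the model value is zero")
    return (model_value - approximate_value) / model_value
```

For a hypothetical model utility of 100 and heuristic utility of 95, the function returns 0.05, that is, a 5% gap. For a positive model value, a gap above 100% can only arise when the approximate value is negative, which is consistent with a route that operates at a loss under the baseline rule.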

Table 12 presents the relative gaps in operational utility and performance metrics for the eight analyzed instances.

Overall, it is observed that the heuristic generates lower operational utilities compared to those obtained by the optimization model. This is particularly evident in the Tambo Circunvalar, KM 15, and Mogambo 22 routes, where the gaps are 11.84%, 7.07%, and 8.05%, respectively. On the other hand, routes such as Pradera 27, KM 30, and Panzenu show smaller differences between the utilities obtained by the heuristic and the model, with gaps of 4.91%, 4.79%, and 1.56%, respectively. This superior performance of the MILP model compared to the heuristic is attributed to its ability to explore dispatch alternatives more thoroughly. The model achieves a better balance between service level, vehicle capacity utilization on each trip, and operational efficiency, allowing it to identify solutions with more favorable cost–benefit relationships for system operation. By contrast, the fixed-interval baseline dispatch rule produces utility gaps above 31% on every route—66.78% on Pradera 27, 60.47% on Panzenu, 60.34% on Santander, 31.10% on Mogambo 22, 41.42% on Tambo Circunvalar, 116.05% on KM 30, 39.70% on Dorado, and 61.23% on KM 15—demonstrating an overall far poorer performance.

For example, in the Pradera 27 route, it is evident that the heuristic prioritizes maximizing the capacity utilization of buses, ensuring that they operate with near-full occupancy. This approach results in an 8.58% increase in capacity utilization compared to the optimization model. However, by waiting for sufficient demand to accumulate before dispatching vehicles, the dispatch frequency is reduced, leading to a 17.91% decrease in bus utilization. This reduction in frequency also causes an increase in unmet demand during certain intervals, negatively impacting the service level, which drops by 3.50% compared to the model. These combined factors contribute to a 4.91% utility gap for the heuristic. The baseline rule in the same route exhibits a utility gap of 66.78%, a service-level gap of 49.61%, a capacity-utilization gap of 85.00% and a utilization gap of 95.00%, underscoring its inability to match demand with fixed headways.

Similar situations are observed in the KM 30 route, where the heuristic aims to improve bus capacity utilization. However, this strategy compromises other key metrics, such as service level and utilization, resulting in a 4.79% reduction in operational utility. In comparison, the baseline yields a utility gap of 116.05%, a service-level gap of 68.72%, a capacity-utilization gap of 60.00% and a utilization gap of 65.00%.

In the KM 15 route, the heuristic approach allows for serving a higher number of passengers, increasing the service level by 5.68%. However, capacity utilization decreases by 3.46% and bus utilization drops by 26.14% compared to the model, leading to a 7.07% utility gap. Conversely, the baseline experiences a 61.23% utility gap, a 25.47% service-level gap, a 48.00% capacity-utilization gap and a 55.00% utilization gap.

This behavior is not consistent across all routes, as improving certain metrics does not always imply an increase in operational utility. Instead, an equilibrium between metrics can be reached to maximize utility differently for each instance. For example, in the Panzenu and KM 30 routes, the heuristic outperforms the MILP model in service level and capacity utilization—with increases of 1.47% and 7.05% in Panzenu, and 1.30% in both metrics for KM 30—while bus utilization decreases by 14.53% and 9.50%, respectively. The baseline, however, under-serves by 71.02% and 15.44% in service level and records capacity–utilization gaps of 89.22% and 70.00%, leading to utility gaps of 60.47% (Panzenu) and 116.05% (KM 30).

In the case of the Tambo Circunvalar route, the heuristic achieves better performance in terms of operational costs by improving the efficiency of bus usage, though this approach negatively impacts the service level and results in an 11.84% utility gap. By comparison, the baseline loses 41.42% of potential utility and has a 6.54% service-level gap.

In contrast to the previously analyzed behaviors, the utilities obtained for the Santander and Dorado routes present negative relative gaps for the heuristic (−0.28% and −1.92%), indicating that the heuristic achieved better performance than the optimization model in these instances. The baseline gaps on those routes are 60.34% and 39.70%, respectively. This occurs because, for these routes, the heuristic improved both the level of demand coverage and the efficiency of bus utilization simultaneously.

These cases are particularly relevant because, in theory, the utility of the heuristic solution should be equal to or lower than that of the optimization model, as the latter seeks to explore all possible alternatives to find an optimal solution. However, the MILP model was not solved to full optimality because of the 0.5% MILP gap stopping criterion, so with more computation time it might match or exceed the heuristic on those routes. Overall, even though the heuristic incurs small losses relative to the model, it closes the vast majority of the gap to optimality and vastly outperforms the fixed-interval baseline in all four metrics, while retaining minimal computational costs.

Another relevant aspect to consider is the assumptions related to the transit times used by the model. To adapt to the discretization required by the optimization approach, transit times had to be adjusted, which could introduce slight discrepancies. Although these discrepancies are minimal, they may impact the final results by slightly influencing the operational performance calculated in specific instances.

4.4.2. Execution Time Comparison

Figure 5 presents the computational execution times of both the heuristic model and MILP model. As expected, the results show that the computational times for the optimization model are significantly higher than those for the heuristic approach. Specifically, routes such as Pradera 27, Panzenu, and Mogambo 22 require 173.34, 119.41, and 469.61 s, respectively, to find a solution, making them the most computationally demanding. However, for routes like KM30 and KM15, which have lower demand and fewer available buses, the optimization model finds solutions considerably faster, even recording execution times lower than those of the heuristic.

In contrast, the computational performance of the heuristic model exhibits remarkable stability across different instances. The shortest execution time is observed for the KM30 route, with 1.42 s, while the longest corresponds to the Santander route, with 3.83 s. This represents a difference of only 2.41 s between the fastest and the slowest instance.

These results highlight how, in scenarios with higher demand and greater vehicle availability, the optimization model significantly increases its computational complexity, requiring more resources to find solutions. On the other hand, the heuristic approach maintains execution times on the order of just a few seconds, making it an efficient and robust tool. Although it does not guarantee optimal solutions, the heuristic achieves remarkable performance, with minimal gaps in operational utility compared to the optimization model. This characteristic positions it as a valuable alternative for public transport systems of similar or larger scale than Montería, where computational challenges can become costly or even infeasible to solve using exact optimization techniques. The heuristic’s robustness against variations in system size is a key strength, consolidating its practical utility in real-world contexts.

4.5. Statistical Comparison and Results

In this section we first present descriptive summaries of Utility and Service Level across the three dispatch methods and all eight routes, then report global tests of differences via repeated-measures ANOVA, and finally detail pairwise contrasts using Tukey’s HSD. All analyses use N = 30 demand scenarios.

4.5.1. Descriptive Statistics

Table 13 summarizes, for each route and dispatch method, the sample mean and standard deviation of Utility and Service Level over N = 30 demand scenarios. Across all corridors, the MILP model achieves the highest mean Utility, with absolute improvements over the baseline ranging from $1424 (Mogambo 22) to $6833 (Pradera 27). The heuristic rule provides substantial gains relative to the baseline on high-demand routes (e.g., +$4563 on Pradera 27, +$3963 on Panzenu) and closes approximately half the gap between baseline and MILP on moderate-demand corridors (e.g., Mogambo 22: +$3309 vs. +$3424). In contrast, on low-volume corridors (KM30, KM 15) the baseline sometimes yields negative or marginal profit (baseline: −$132, +$400; heuristic: +$607, +$448), whereas the MILP method consistently secures positive returns (+$692, +$588).

Service Level follows a similar pattern: MILP achieves near-complete coverage (mean 97–98%), the heuristic raises baseline coverage by 13–24 pp on most routes (e.g., from 49.1% to 72.8% on Pradera 27), but offers no improvement where demand saturates capacity (100% on Mogambo 22 and Dorado). The baseline method fluctuates widely from 15.4% (KM30) to 100% (Mogambo 22, Dorado), indicating that fixed headways fail to adapt to varying demand patterns. The reported standard deviations are small, relative to the mean differences, confirming that these patterns hold consistently across scenarios.

4.5.2. Global Significance Tests (ANOVA)

Table 14 presents the results of one-way repeated-measures ANOVAs assessing the effect of dispatch method on Utility and Service Level for each route. The numerator degrees of freedom is k − 1 = 2 (three methods) and the denominator is (N − 1)(k − 1) = 29 × 2 = 58. All F-statistics for Utility exceed 23,000 (up to 89,713 for Dorado), and those for Service Level exceed 8000 except on the saturated routes (Mogambo 22: 813.7; Dorado: 1214.7), with p ≪ 10^{-8}. These extremely large F-values indicate that between-method variability overwhelmingly dominates scenario-to-scenario noise. The somewhat lower F for Service Level on saturated routes arises because two methods plateau at 100%, leaving minimal residual variance. Overall, these findings confirm that dispatch strategy exerts a highly significant, systematic impact on both profit and coverage across all corridors.
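The F-statistic and degrees of freedom described above follow from the standard one-way repeated-measures decomposition, sketched below in plain Python. The function name `rm_anova` and the table layout (one row per demand scenario, one column per dispatch method) are assumptions made for illustration; this is not the analysis script used for Table 14.

```python
def rm_anova(scores):
    """One-way repeated-measures ANOVA.

    scores[i][j] is the outcome (e.g., Utility) of scenario i under method j.
    Returns (F, df_method, df_error).
    """
    n = len(scores)
    if n < 2:
        raise ValueError("need at least 2 scenarios")
    k = len(scores[0])
    if k < 2 or any(len(row) != k for row in scores):
        raise ValueError("need a complete table with at least 2 methods")

    grand = sum(sum(row) for row in scores) / (n * k)
    method_means = [sum(row[j] for row in scores) / n for j in range(k)]
    scenario_means = [sum(row) / k for row in scores]

    # Partition the total sum of squares into method, scenario and residual parts.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_method = n * sum((m - grand) ** 2 for m in method_means)
    ss_scenario = k * sum((m - grand) ** 2 for m in scenario_means)
    ss_error = ss_total - ss_method - ss_scenario

    df_method = k - 1
    df_error = (n - 1) * (k - 1)
    if ss_error <= 0:
        raise ValueError("no residual variance; F is undefined")

    f_stat = (ss_method / df_method) / (ss_error / df_error)
    return f_stat, df_method, df_error
```

With N = 30 scenarios and k = 3 methods, the function returns degrees of freedom (2, 58), matching the values stated above. Because scenario-to-scenario variation is removed from the error term, even modest differences between methods produce very large F-values when each method behaves consistently across scenarios.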

4.5.3. Pairwise Comparisons

Table 15 and Table 16 report Tukey’s HSD pairwise contrasts for Utility and Service Level, respectively, based on repeated-measures residuals (error df = 58, family-wise α = 0.05). Every comparison between methods is significant at p < 0.001, except where the mean difference is zero.

For Utility, all method pairs exhibit highly significant differences (p < 0.001). The heuristic captures 48–67% of the MILP’s advantage over the baseline. For instance, on Pradera 27, the heuristic achieves a mean gain of $4563 (95% CI [4480, 4647]) compared to $6834 (95% CI [6750, 6917]) for MILP. Even on low-volume routes, the heuristic outperforms the baseline (KM30: +$740; KM15: +$99), while MILP amplifies these gains. On Mogambo 22, under saturated demand, the heuristic nearly matches MILP, with a minimal difference of +$114 (95% CI [55, 174]).
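As a worked check of the captured share, the Pradera 27 figures quoted above can be combined as follows. The notation Δ for the mean gain over the baseline is introduced here for illustration.

```latex
\[
\text{Captured share}_{\text{Pradera 27}}
= \frac{\Delta_{\text{heuristic}}}{\Delta_{\text{MILP}}}
= \frac{4563}{6834}
\approx 0.668
\]
```

This places Pradera 27 at the upper end of the reported 48–67% range.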

For Service Level, the heuristic improves coverage by 7.9–39.1 percentage points (pp) over the baseline (e.g., +23.7 pp on Pradera 27, +39.1 pp on KM30), while MILP achieves up to 45.7 pp gains. On routes with 100% coverage (Mogambo 22, Dorado), the heuristic yields no further improvement (mean Δ = 0, non-significant), and MILP sacrifices 1.9–2.6 pp of coverage for higher profits.

These findings show that Tukey’s HSD, by controlling the family-wise error rate, allows each method’s contribution to the achievable gains to be quantified with confidence. The heuristic consistently delivers most of MILP’s improvements, significantly surpassing the fixed-interval baseline, though it may underperform on extremely sparse or fully saturated routes. This nuanced analysis underscores the trade-offs between simplicity and optimality in transit dispatch.

5. Conclusions

This study addresses the challenge of optimizing bus dispatching in public transportation systems by implementing an adaptive heuristic based on passenger demand prediction. This methodology enables transportation companies to respond swiftly and efficiently to demand fluctuations, offering a flexible, real-time solution that adapts to operational conditions and forecasted variations. The heuristic demonstrates robustness across various operational scenarios, including changes in demand, fleet sizes, and travel times, making it applicable to diverse contexts.

The proposed approach significantly advances medium-scale public transportation management by combining accurate machine learning-based predictions with dynamic operational decision-making. This balance ensures efficient resource use while meeting passenger needs. The results show that the heuristic achieves operational efficiency, service quality, and fleet utilization comparable to more complex optimization models, but with significantly lower computational costs. These characteristics make it a valuable tool for cost-effective and time-efficient bus dispatching, particularly for public transportation companies in Colombia.

An extensive statistical evaluation using repeated-measures ANOVA and Tukey’s HSD post-hoc tests confirmed that the observed differences in utility and service level between dispatch methods are statistically significant for every route. Across all routes, the heuristic outperforms the fixed-interval baseline with robust gains, and on several routes, its performance is statistically indistinguishable from the MILP model. These findings provide rigorous empirical support for adopting the heuristic as a near-optimal and reliable dispatch strategy in practice.

However, the heuristic has limitations. Its performance depends heavily on the accuracy of the demand prediction model, which relies on the quality and quantity of historical data. In scenarios with insufficient or biased data, the heuristic’s effectiveness could be compromised. Assumptions of discrete-time intervals restrict operational decisions to predefined intervals, limiting the ability to capture detailed dynamics, such as passenger boardings and alightings at intermediate stops, which affect the representation of the real state of buses along their routes.

Future research could improve the heuristic by enhancing model accuracy, incorporating intermediate decision-making along routes, and including additional stops to better capture passenger flows. Implementing acceleration techniques in the prediction model could mitigate the high computational costs associated with cross-validation and parameter tuning. Integrating real-time information and additional variables, such as traffic conditions and local events, could also enrich the predictive model, improving its accuracy and adaptability. To ensure reproducibility and support future research, source code, processed datasets, and analysis scripts are available in our public repository: https://github.com/javibarrera10/DynamicHeuristicForPublicTransportation (accessed on 1 May 2025).

Author Contributions

Formal analysis, J.E.B.H., L.E.T.T., A.T. and D.Á.-M.; funding acquisition, A.T. and D.Á.-M.; investigation, J.E.B.H. and D.Á.-M.; project administration, D.Á.-M.; supervision, D.Á.-M. and A.T.; visualization, J.E.B.H.; writing – original draft, J.E.B.H., L.E.T.T., A.T. and D.Á.-M.; writing – review and editing, J.E.B.H., L.E.T.T., A.T. and D.Á.-M. All authors have read and agreed to the published version of the manuscript.

Funding

The article processing charge (APC) was funded by Universidad de los Andes.

Data Availability Statement

The original data presented in the study are openly available in DynamicHeuristicForPublicTransportation at https://github.com/javibarrera10/DynamicHeuristicForPublicTransportation (accessed on 5 May 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

Abbreviations

TNSP	Transit Network Scheduling Problem
MILP	Mixed-Integer Linear Programming
ML	Machine Learning
RMSE	Root Mean Square Error
ARIMA	AutoRegressive Integrated Moving Average
LSTM	Long Short-Term Memory
DP	Dynamic Programming
LNS	Large Neighborhood Search
SHFs	Simulated Historical Forecasts
DNP	Departamento Nacional de Planeación

References

Guihaire, V.; Hao, J.K. Transit network design and scheduling: A global review. Transp. Res. Part A Policy Pract. 2008, 42, 1251–1273.
Magnanti, T.L.; Wong, R.T. Network design and transportation planning: Models and algorithms. Transp. Sci. 1984, 18, 1–55.
Borndörfer, R.; Grötschel, M.; Pfetsch, M.E. A Path-Based Model for Line Planning in Public Transport. Technical Report. 2005. Available online: http://webdoc.sub.gwdg.de/ebook/serien/ah/reports/ZIBreport/ZR-05-18.pdf (accessed on 5 May 2025).
Zhou, X.; Wei, G.; Zhang, Y.; Wang, Q.; Guo, H. Optimizing Multi-Vehicle Demand-Responsive Bus Dispatching: A Real-Time Reservation-Based Approach. Sustainability 2023, 15, 5909.
Pei, M.; Lin, P.; Du, J.; Li, X.; Chen, Z. Vehicle dispatching in modular transit networks: A mixed-integer nonlinear programming model. Transp. Res. Part E Logist. Transp. Rev. 2021, 147, 102240.
Guan, D.; Wu, X.; Wang, K.; Zhao, J. Vehicle Dispatch and Route Optimization Algorithm for Demand-Responsive Transit. Processes 2022, 10, 2651.
van Oudheusden, D.L.; Zhu, W. Trip frequency scheduling for bus route management in Bangkok. Eur. J. Oper. Res. 1995, 83, 439–451.
Gkiotsalitis, K.; van Berkum, E.C. An exact method for the bus dispatching problem in rolling horizons. Transp. Res. Part C Emerg. Technol. 2020, 110, 143–165.
Gkiotsalitis, K. A model for the periodic optimization of bus dispatching times. Appl. Math. Model. 2020, 82, 785–801.
Chen, Z.; Li, X.; Zhou, X. Operational design for shuttle systems with modular vehicles under oversaturated traffic: Discrete modeling method. Transp. Res. Part B Methodol. 2019, 122, 1–19.
Gkiotsalitis, K.; Cats, O. Reliable frequency determination: Incorporating information on service uncertainty when setting dispatching headways. Transp. Res. Part C Emerg. Technol. 2018, 88, 187–207.
Hadas, Y.; Shnaiderman, M. Public-transit frequency setting using minimum-cost approach with stochastic demand and travel time. Transp. Res. Part B Methodol. 2012, 46, 1068–1084.
Eranki, A. A Model to Create Bus Timetables to Attain Maximum Synchronization Considering Waiting Times at Transfer Stops. Master’s Thesis, Department of Industrial and Management Systems Engineering, University of South Florida, Tampa, FL, USA, 2004. Available online: http://scholarcommons.usf.edu/etd/1025 (accessed on 15 February 2025).
Gorev, A.; Popova, O.; Solodkij, A. Demand-responsive transit systems in areas with low transport demand of “smart city”. Transp. Res. Procedia 2020, 50, 160–166.
Berrebi, S.J.; Watkins, K.E.; Laval, J.A. A real-time bus dispatching policy to minimize passenger wait on a high frequency route. Transp. Res. Part B Methodol. 2015, 81, 377–389.
Gkiotsalitis, K.; Alesiani, F. Robust timetable optimization for bus lines subject to resource and regulatory constraints. Transp. Res. Part E Logist. Transp. Rev. 2019, 128, 30–51.
Yao, E.; Liu, T.; Lu, T.; Yang, Y. Optimization of electric vehicle scheduling with multiple vehicle types in public transport. Sustain. Cities Soc. 2020, 52, 101862.
Van Lieshout, R.N.; Bouman, P.C.; van den Akker, M.; Huisman, D. A self-organizing policy for vehicle dispatching in public transit systems with multiple lines. Transp. Res. Part B Methodol. 2021, 152, 46–64.
Taylor, S.J.; Letham, B. Forecasting at Scale. Am. Stat. 2018, 72, 37–45.
Sherstinsky, A. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) network. Phys. D Nonlinear Phenom. 2020, 404, 132306.
Song, X.; Liu, Y.; Xue, L.; Wang, J.; Zhang, J.; Wang, J.; Jiang, L.; Cheng, Z. Time-series well performance prediction based on Long Short-Term Memory (LSTM) neural network model. J. Pet. Sci. Eng. 2020, 186, 106682.

Figure 1. Illustrative forecast profiles for Pradera 27 (left, high demand) and KM15 (right, low demand).

Figure 2. Passenger demand predicted by the Prophet model for the eight study routes.

Figure 3. Effect of the dispatch threshold and maximum waiting time on operational utility.

Figure 4. Bus dispatch schedule for the Pradera 27 route established by the heuristic.

Figure 5. Comparison of computational time spent for the execution of the MILP and heuristic models.

Table 1. Comparison of approaches in the State-of-the-Art of the TNSP.

Reference	Objective	Methodology	Variables	Constraints	Application
Borndörfer et al. [3]	Minimize operational costs and waiting times	MILP with column generation	Frequencies, optimized lines	Vehicle capacity, travel times	Potsdam, Germany
Zhou et al. [4]	Efficient allocation in on-demand systems	Bipartite graphs + Kuhn-Munkres Algorithm	Real-time allocation	Vehicle capacity, time constraints	Simulation
Pei et al. [5]	Optimization of frequencies and modular capacities	MILP	Demand, vehicle capacities	Demand fluctuation	Simulation (China)
Guan et al. [6]	Optimize dispatch in variable demand systems	LNS-genetic algorithm	Passenger flows, routes	Vehicle conservation, capacity	Simulation (urban network)
Van Oudheusden et al. [7]	Minimize operational costs - reduction of empty trips	Nonlinear programming	Frequencies, travel times	Vehicle capacity, operational costs	Bangkok
Gkiotsalitis et al. [8]	Optimize dispatch in a rolling horizon	Convex nonlinear formulation	Dispatch intervals, schedules	Capacity, service regularity	Simulation
Gkiotsalitis [9]	Minimize variations in departure intervals	Convex quadratic programming	Intervals, departure times	Regularity, capacity	Line 302, Singapore
Chen and Zhou [10]	Optimize frequencies and capacities in oversaturated conditions	MILP + Dynamic Programming	Frequencies, dynamic demand	Vehicle capacity, oversaturated traffic	Beijing and Tampa Bay
Hadas et al. [12]	Optimize frequencies and schedules in public networks	Stochastic simulations	Frequencies, schedules	Regularity, historical demand	Simulation
Eranki [13]	Maximize arrivals at transfer points	Iterative heuristic	Departure schedules, transfers	Node synchronization	Simulation
Gorev et al. [14]	Optimize allocation in low-demand systems	Prediction + heuristics	Demand, vehicle assignment	Fleet availability	European network
Berrebi et al. [15]	Mitigate bus bunching in high-frequency routes	Stochastic decision processes	Dispatch times, intervals	Regularity, real-time demand	Circular routes
Yao et al. [17]	Adjust real-time dispatches	Dynamic simulations + relaxation	Demand, dispatch times	Dynamic constraints	Asian network
Van Lieshout et al. [18]	Optimize decentralized dispatching	Self-organization strategies	Dispatch times, routes	Limited resources, regular intervals	Göttingen, Germany
Our work	Optimize operational utility	Dispatch heuristic + demand prediction model	Passenger demand, distances, number of buses, travel times	Vehicle capacity, allowed movements, accumulations	Public transport network of Montería, Colombia

Table 2. Comparison of time series prediction models.

Problem Characteristics	Prophet	ARIMA	LSTM
Predictions with small datasets	√	√	X
Seasonalities–calendar effects	√	X	√
Training speed	√	√	X
Explainability	√	√	X
Robustness against outliers	√	X	√
Configuration complexity	√	X	X
Adjustment to complex nonlinear trends	√	X	√

Table 3. Parameters and Variables Used in the Route-Specific MILP Model.

Symbol	Type	Description
$R$	Set	Set of available routes in the system.
$S$	Set	Set of directions, $S = \{1, 2\}$.
$T$	Set	Set of time intervals, $T = \{0, 1, \dots, H\}$, where $H$ is the total number of intervals in the planning horizon.
$B_{r}$	Set	Set of available buses assigned to route r.
$L_{r}$	Set	Set of locations relevant for route r, $L_{r} = \{P, C1_{r}, C2_{r}\}$.
$D_{r, s, t}$	Parameter	Passenger demand on route r for direction $s \in S$ in time interval $t \in T$ .
$C_{r}$	Parameter	Maximum capacity of each bus assigned to route r (in number of passengers).
$T_{r, l, l^{'}}$	Parameter	Travel time in intervals from location $l \in L_{r}$ to $l^{'} \in L_{r}$ on route r.
$α_{r}$	Parameter	Fixed fare per transported passenger on route r.
$β_{r}$	Parameter	Fixed operating cost per bus in service per interval on route r.
$δ_{r, (l, l^{'}), s}$	Parameter	Binary indicator for route r: 1 if movement between $l, l^{'} \in L_{r}$ corresponds to direction s, 0 otherwise.
$x_{b, l, t, l^{'}, t^{'}}$	Variable	1 if bus $b \in B_{r}$ moves from $l \in L_{r}$ at time t to $l^{'} \in L_{r}$ at time $t^{'}$ , 0 otherwise.
$V_{b, l, t}$	Variable	1 if bus $b \in B_{r}$ is at location $l \in L_{r}$ at time t, 0 otherwise.
$z_{b, t}$	Variable	1 if bus $b \in B_{r}$ is in operation (outside the depot P) at time t, 0 otherwise.
$p_{b, s, t}$	Variable	Number of passengers transported by bus $b \in B_{r}$ in direction s in interval t.
$y_{r, s, t}$	Variable	Unmet demand on route r in direction s in interval t.
$m_{b, s, t}$	Variable	1 if bus $b \in B_{r}$ performs a movement associated with direction s during interval t, 0 otherwise.
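
Read together, the symbols in Table 3 suggest an objective of fare revenue minus operating cost for a given route r. The expression below is a hedged reconstruction from the table's descriptions alone, not a quotation of the formulation in Section 3.3. In particular, the summation ranges and the absence of any penalty on unmet demand $y_{r, s, t}$ are assumptions.

```latex
\max \; \sum_{b \in B_{r}} \sum_{s \in S} \sum_{t \in T} \alpha_{r} \, p_{b, s, t}
\;-\; \sum_{b \in B_{r}} \sum_{t \in T} \beta_{r} \, z_{b, t}
```

The first term counts fare income from transported passengers. The second charges the per-interval operating cost whenever a bus is outside the depot.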

Table 4. Heuristic Parameters and Variables.

Symbol	Type	Description
$Θ_{r, s}$	Parameter	Occupancy threshold (%) for dispatching on route r, direction s.
$Γ_{r, s}$	Parameter	Maximum time a bus can remain at terminal s waiting to be dispatched, in minutes.
$Δ t$	Parameter	Duration of each time interval in minutes.
K	Parameter	Future demand evaluation horizon in number of intervals.
$τ_{C 1, P}$	Parameter	Transit time from terminal $C 1$ to depot P for route r, in minutes.
$τ_{C 2, P}$	Parameter	Transit time from terminal $C 2$ to depot P for route r, in minutes.
$τ_{P, C 1}$	Parameter	Transit time from depot P to terminal $C 1$ for route r, in minutes.
$τ_{P, C 2}$	Parameter	Transit time from depot P to terminal $C 2$ for route r, in minutes.
$α$	Parameter	Fare per transported passenger.
$β$	Parameter	Operating cost per bus in service per time interval.
$Q_{r, s, t}$	Variable	Accumulated demand on route r, direction s, up to interval t.
$W_{r, s, t}$	Variable	Accumulated waiting time on route r, direction s, up to interval t.
$L_{r, s, t}$	Variable	List of hours with unmet accumulated demand for route r, direction s, up to interval t.
$Q_{ns, r, s, t}$	Variable	Accumulated unmet demand on route r, direction s, up to interval t.
$S_{b}$	Variable	Current state of bus b.
$A_{b}$	Variable	Time at which bus b will be available for operation.
$P_{b}$	Variable	Total number of passengers transported by bus b.
$K_{b}$	Variable	Kilometers traveled by bus b.
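
The parameters in Table 4 suggest a threshold-or-timeout dispatch rule. The following is a minimal sketch under that assumption, not the authors' implementation. The function `should_dispatch` and its argument names are hypothetical. The rule shown dispatches a bus once accumulated demand reaches the occupancy threshold $Θ_{r, s}$ of bus capacity, or once the bus has waited $Γ_{r, s}$ minutes, which is one plausible reading of the table.

```python
def should_dispatch(accumulated_demand, capacity, threshold_pct,
                    waited_min, max_wait_min):
    """Return True when a waiting bus should leave the terminal.

    Assumed rule (a hypothetical reading of Table 4):
    - occupancy trigger: queued passengers fill at least threshold_pct
      percent of the bus capacity, or
    - timeout trigger: the bus has waited max_wait_min minutes or more.
    """
    # Passengers beyond capacity cannot board, so occupancy is capped at 100%.
    occupancy_pct = 100.0 * min(accumulated_demand, capacity) / capacity
    return occupancy_pct >= threshold_pct or waited_min >= max_wait_min


# Threshold (96%) and waiting time (30 min) are the Pradera 27 values from
# Table 10; the 50-passenger capacity is an illustrative assumption.
print(should_dispatch(48, 50, 96, 10, 30))  # occupancy trigger fires
print(should_dispatch(20, 50, 96, 10, 30))  # neither trigger fires
print(should_dispatch(20, 50, 96, 30, 30))  # timeout trigger fires
```

Combining the two triggers keeps buses well loaded on busy corridors while bounding passenger waiting on sparse ones, which is consistent with the lower thresholds tuned for low-demand routes in Table 10.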

Table 5. Comparison of forecast accuracy across three models: RMSE (passengers per 10 min interval) and MAPE (%).

Route	Prophet RMSE	Prophet MAE	ARIMA RMSE	ARIMA MAE	LSTM RMSE	LSTM MAE
Pradera 27	13.24	11.28	24.27	16.09	7.00	4.41
Panzenu	8.92	7.27	17.31	11.31	5.58	3.55
Santander	9.21	7.67	17.33	11.39	5.58	3.42
Mogambo 22	0.77	0.33	15.55	10.06	5.31	3.45
Tambo Circunvalar	5.49	4.59	8.91	5.34	3.59	2.27
KM30	4.57	3.10	7.89	4.69	5.25	3.64
Dorado	4.73	3.95	7.40	4.39	3.16	1.93
KM15	0.72	0.18	5.84	3.24	3.46	1.93
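
For readers less familiar with the RMSE and MAE columns, the sketch below shows how the two error measures are computed from a series of observed and forecast passenger counts. The sample values are invented for illustration and do not come from the study data.

```python
import math


def rmse(actual, predicted):
    """Root mean square error: penalizes large errors more heavily."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def mae(actual, predicted):
    """Mean absolute error: average size of the forecast error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


# Made-up passenger counts per interval (illustration only).
actual = [10, 12, 20, 8]
predicted = [12, 11, 15, 8]

print(round(rmse(actual, predicted), 2))
print(mae(actual, predicted))
```

Because squaring weights large deviations more, MAE can never exceed RMSE for the same forecast, a pattern that every RMSE and MAE pair in Table 5 follows.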

Table 6. Adjusted parameters and root mean square error (RMSE) by route.

Route	Changepoint Scale	Seasonality Scale	RMSE
Pradera 27	0.500	10	14.89
Panzenu	0.010	0.10	8.94
Santander	0.001	0.10	9.94
Mogambo 22	0.010	0.10	9.01
Tambo Circunvalar	0.010	0.1	6.67
KM30	0.001	1	4.56
Dorado	0.010	10	5.27
KM15	0.010	10	3.19
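
The two tuned columns in Table 6 plausibly correspond to Prophet's `changepoint_prior_scale` and `seasonality_prior_scale` settings. A grid search over such a pair can be sketched as follows. This is an illustration under assumptions: the candidate values are simply those that appear in Table 6, `best_params` and `toy_score` are hypothetical names, and the scoring function is a stand-in for fitting a model and returning its cross-validated RMSE.

```python
from itertools import product

# Candidate values taken from Table 6. The full grid searched in the
# study is not stated here, so this grid is an assumption.
CHANGEPOINT_SCALES = [0.001, 0.01, 0.5]
SEASONALITY_SCALES = [0.1, 1.0, 10.0]


def best_params(score):
    """Return the (changepoint, seasonality) pair with the lowest score.

    `score` is a hypothetical callable that would fit a forecasting model
    with the given scales and return its cross-validated RMSE.
    """
    grid = product(CHANGEPOINT_SCALES, SEASONALITY_SCALES)
    return min(grid, key=lambda pair: score(*pair))


def toy_score(changepoint, seasonality):
    """Stand-in for a real evaluation; it does not fit any forecast."""
    return abs(changepoint - 0.01) + abs(seasonality - 0.1)


print(best_params(toy_score))
```

In practice the `score` callable would wrap the forecasting library's own fitting and cross-validation routines, which is where the computational cost of tuning arises.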

Table 7. Performance metrics for each Montería route under the baseline dispatch rule.

Route	Total Passengers	Trips Completed	Total Kilometers	Service Level (%)	Global IPK	Capacity Utilized (%)	Utilization (%)	Utility ($)
Pradera 27	4021	82	1640.00	49.10	2.45	93.63	41.72	4145.50
Panzenu	3554	75	1405.50	71.02	2.53	89.22	40.76	3689.50
Santander	3291	71	1704.50	62.21	1.93	87.55	49.58	3158.00
Mogambo 22	3682	77	1097.00	100.00	3.36	91.87	32.01	4065.00
Tambo Circ.	2228	53	968.5	92.03	2.30	82.65	42.95	2406.00
KM30	182	5	94.00	15.44	1.94	70.00	18.99	−131.00
Dorado	2938	77	1155.00	100.00	2.54	66.32	33.68	2576.00
KM15	695	19	219.00	70.06	3.17	70.50	22.12	400.00
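
The Global IPK column in Table 7 is consistent with passengers carried per kilometer traveled. The sketch below checks that reading against three rows of the table. The definition is inferred from the reported values rather than quoted from the text, and the function name `ipk` is hypothetical.

```python
# (total passengers, total kilometers) copied from Table 7.
BASELINE = {
    "Pradera 27": (4021, 1640.00),
    "Panzenu": (3554, 1405.50),
    "KM30": (182, 94.00),
}


def ipk(passengers, kilometers):
    """Passengers per kilometer (assumed definition of Global IPK)."""
    return passengers / kilometers


for route, (passengers, kilometers) in BASELINE.items():
    print(route, round(ipk(passengers, kilometers), 2))
```

The rounded results (2.45, 2.53, and 1.94) match the Global IPK values listed for these routes.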

Table 8. Stopping GAP and utility per route obtained with the MILP optimization model.

Route	MILP Gap (%)	Utility ($)
Pradera 27	0.49	12,481
Panzenu	0.48	9328
Santander	0.24	7962
Mogambo 22	0.41	5897
Tambo Circunvalar	0.36	4109
KM30	0.18	815
Dorado	0.10	4272
KM15	0.50	1033

Table 9. Performance metrics for each Montería route obtained with the optimization model.

Route	Buses Used	Total Passengers	Trips Completed	Global IPK	Service Level (%)	Capacity Utilization (%)	Utilization (%)
Pradera 27	8	8038	169	2.31	97.55	91.41	96.92
Panzenu	6	5844	124	2.56	98.55	90.31	82.67
Santander	6	5296	112	2.08	98.29	90.83	83.07
Mogambo 22	3	3526	78	3.18	98.60	86.70	40.87
Tambo Circ.	3	2602	56	2.50	98.49	86.16	56.11
KM30	1	615	22	1.54	49.36	53.76	96.39
Dorado	2	2616	64	2.81	97.90	78.61	27.68
KM15	2	899	53	1.22	94.23	32.63	96.08

Table 10. Dispatch parameters and utility per route.

Route	Dispatch Threshold (%)	Waiting Time (min)	Utility ($)
Pradera 27	96	30	11,868.50
Panzenu	88	30	9182.50
Santander	94	30	7984.00
Mogambo 22	70	20	5422.00
Tambo Circunvalar	94	20	3622.50
KM30	60	20	776.00
Dorado	68	20	4354.00
KM15	30	20	960.00

Table 11. Performance metrics for each Montería route obtained with the heuristic.

Route	Buses Used	Total Passengers	Trips Completed	Global IPK	Service Level (%)	Capacity Utilized (%)	Utilization (%)
Pradera 27	8	7844	152	2.58	94.17	99.25	85.25
Panzenu	6	5930	118	2.68	100.00	96.68	85.68
Santander	6	5386	104	2.16	100.00	99.60	89.68
Mogambo 22	4	3576	88	2.85	100.00	78.72	79.51
Tambo Circ.	5	2587	55	2.57	97.92	91.42	68.21
KM30	1	623	22	1.52	50.00	54.46	79.16
Dorado	2	2672	64	2.78	100.00	80.29	89.76
KM15	2	950	58	1.42	99.58	31.50	82.36

Table 12. Relative performance gaps of heuristic and baseline vs. model.

Route	Utility Gap (%): Heuristic	Utility Gap (%): Baseline	Service Level Gap (%): Heuristic	Service Level Gap (%): Baseline	Capacity Util Gap (%): Heuristic	Capacity Util Gap (%): Baseline	Utilization Gap (%): Heuristic	Utilization Gap (%): Baseline
Pradera 27	4.91	66.78	3.50	49.61	−8.58	85.00	−14.53	95.00
Panzenu	1.56	60.47	−1.47	27.94	−7.05	82.00	−9.61	88.00
Santander	−0.28	60.34	−1.74	36.76	−9.66	80.00	−4.41	90.00
Mogambo 22	8.05	31.10	−1.42	1.42	9.20	50.00	5.17	55.00
Tambo Circ.	11.84	41.42	0.58	6.54	−6.10	70.00	−11.48	75.00
KM30	4.79	116.05	−1.30	68.72	−1.30	60.00	9.50	65.00
Dorado	−1.92	39.70	−2.15	2.10	−2.14	75.00	−0.11	80.00
KM15	7.07	61.23	−5.68	25.47	3.46	48.00	26.14	55.00
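
The heuristic utility gaps in Table 12 can be reproduced from the utilities in Tables 8 and 10 as the relative shortfall with respect to the MILP value. The sketch below shows the arithmetic for three routes. The function name `utility_gap` is hypothetical, and the formula is inferred from the reported figures.

```python
# Utilities ($) copied from Table 8 (MILP) and Table 10 (heuristic).
MILP_UTILITY = {"Pradera 27": 12481, "Panzenu": 9328, "Santander": 7962}
HEURISTIC_UTILITY = {"Pradera 27": 11868.50, "Panzenu": 9182.50,
                     "Santander": 7984.00}


def utility_gap(reference, value):
    """Relative gap in percent of `value` with respect to `reference`."""
    return 100.0 * (reference - value) / reference


for route, milp in MILP_UTILITY.items():
    gap = utility_gap(milp, HEURISTIC_UTILITY[route])
    print(route, round(gap, 2))
```

The results (4.91, 1.56, and −0.28) agree with the Heuristic column of the Utility Gap in Table 12. A negative gap means the heuristic's utility exceeded the MILP value.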

Table 13. Descriptive statistics for Utility and Service Level over $N = 30$ scenarios.

Route	Method	Utility ($)	Service Level (%)
Pradera 27	Baseline	4164 ± 54	49.1 ± 0.84
	Heuristic	8727 ± 214	72.8 ± 1.92
	MILP	10,997 ± 81	94.8 ± 1.57
Panzenu	Baseline	3700 ± 45	70.9 ± 1.01
	Heuristic	7664 ± 171	86.3 ± 0.82
	MILP	8758 ± 133	98.2 ± 0.33
Santander	Baseline	3131 ± 60	62.3 ± 0.79
	Heuristic	4118 ± 119	70.2 ± 0.62
	MILP	6794 ± 81	97.8 ± 0.38
Mogambo 22	Baseline	4053 ± 25	100.0 ± 0.00
	Heuristic	7363 ± 133	100.0 ± 0.00
	MILP	7477 ± 98	98.1 ± 0.36
Tambo Circ.	Baseline	2403 ± 49	91.9 ± 0.76
	Heuristic	2896 ± 49	95.8 ± 0.70
	MILP	3347 ± 45	98.2 ± 0.46
KM30	Baseline	−132 ± 5	15.4 ± 0.30
	Heuristic	607 ± 15	49.5 ± 0.61
	MILP	692 ± 24	49.1 ± 0.83
Dorado	Baseline	2569 ± 82	100.0 ± 0.00
	Heuristic	3035 ± 84	100.0 ± 0.00
	MILP	4249 ± 71	97.4 ± 0.41
KM15	Baseline	400 ± 16	70.0 ± 1.01
	Heuristic	448 ± 21	83.4 ± 0.75
	MILP	588 ± 32	88.1 ± 1.39

Table 14. Repeated-measures ANOVA results for Utility and Service Level.

Route	Utility F	Utility df	Utility p	Service Level F	Service Level df	Service Level p
Pradera 27	21,734.4	(2, 29)	$4.13 \times 10^{-84}$	19,779.4	(2, 29)	$6.32 \times 10^{-83}$
Panzenu	25,083.6	(2, 29)	$6.50 \times 10^{-86}$	9346.5	(2, 29)	$1.67 \times 10^{-73}$
Santander	23,753.8	(2, 29)	$3.15 \times 10^{-85}$	28,450.1	(2, 29)	$1.69 \times 10^{-87}$
Mogambo 22	23,726.8	(2, 29)	$3.25 \times 10^{-85}$	813.7	(2, 29)	$3.68 \times 10^{-43}$
Tambo Circ.	58,510.9	(2, 29)	$1.42 \times 10^{-96}$	28,062.0	(2, 29)	$2.52 \times 10^{-87}$
KM30	23,415.3	(2, 29)	$4.77 \times 10^{-85}$	27,458.2	(2, 29)	$4.73 \times 10^{-87}$
Dorado	89,713.1	(2, 29)	$5.92 \times 10^{-102}$	1214.7	(2, 29)	$4.60 \times 10^{-48}$
KM15	24,330.3	(2, 29)	$1.57 \times 10^{-85}$	19,191.7	(2, 29)	$1.51 \times 10^{-82}$

Table 15. Pairwise Tukey HSD comparisons on Utility (USD).

Route	Comparison	Mean Δ	CI Lower	CI Upper	p
Pradera 27	Baseline vs. Heuristic	4563.4	4479.7	4647.1	<0.001
	Baseline vs. MILP	6833.6	6749.9	6917.4	<0.001
	Heuristic vs. MILP	2270.2	2186.5	2354.0	<0.001
Panzenu	Baseline vs. Heuristic	3963.1	3884.4	4041.8	<0.001
	Baseline vs. MILP	5057.7	4978.9	5136.4	<0.001
	Heuristic vs. MILP	1094.6	1015.8	1173.3	<0.001
Santander	Baseline vs. Heuristic	987.1	931.7	1042.5	<0.001
	Baseline vs. MILP	3663.1	3607.7	3718.5	<0.001
	Heuristic vs. MILP	2676.0	2620.5	2731.4	<0.001
Mogambo 22	Baseline vs. Heuristic	3309.9	3250.6	3369.2	<0.001
	Baseline vs. MILP	3424.1	3364.8	3483.4	<0.001
	Heuristic vs. MILP	114.2	54.9	173.5	<0.001
Tambo Circ.	Baseline vs. Heuristic	493.3	336.0	677.4	<0.001
	Baseline vs. MILP	944.2	914.9	973.5	<0.001
	Heuristic vs. MILP	450.9	382.6	520.2	<0.001
KM30	Baseline vs. Heuristic	739.8	693.5	775.4	<0.001
	Baseline vs. MILP	824.3	813.9	834.6	<0.001
	Heuristic vs. MILP	860.2	849.9	870.5	<0.001
Dorado	Baseline vs. Heuristic	464.7	415.8	513.5	<0.001
	Baseline vs. MILP	1680.1	1631.2	1728.9	<0.001
	Heuristic vs. MILP	1215.4	1166.6	1264.2	<0.001
KM15	Baseline vs. Heuristic	98.8	65.2	118.7	<0.001
	Baseline vs. MILP	187.8	173.3	202.3	<0.001
	Heuristic vs. MILP	89.0	75.6	106.2	<0.001

Table 16. Pairwise Tukey HSD comparisons on Service Level (%).

Route	Comparison	Mean $Δ$	CI Lower	CI Upper	p
Pradera 27	Baseline vs. Heuristic	23.67	22.73	24.60	<0.001
	Baseline vs. MILP	45.71	44.78	46.64	<0.001
	Heuristic vs. MILP	22.04	21.11	22.98	<0.001
Panzenu	Baseline vs. Heuristic	15.39	14.92	15.87	<0.001
	Baseline vs. MILP	27.32	26.85	27.80	<0.001
	Heuristic vs. MILP	11.93	11.45	12.41	<0.001
Santander	Baseline vs. Heuristic	7.93	7.55	8.31	<0.001
	Baseline vs. MILP	35.56	35.18	35.94	<0.001
	Heuristic vs. MILP	27.63	27.25	28.01	<0.001
Mogambo 22	Baseline vs. Heuristic	0.00	−0.13	0.13	1.000
	Baseline vs. MILP	−1.88	−2.01	−1.75	<0.001
	Heuristic vs. MILP	−1.88	−2.01	−1.75	<0.001
Tambo Circ.	Baseline vs. Heuristic	3.90	1.82	4.58	<0.001
	Baseline vs. MILP	6.25	5.85	6.65	<0.001
	Heuristic vs. MILP	2.35	1.95	3.75	<0.001
KM30	Baseline vs. Heuristic	39.10	37.28	40.52	<0.001
	Baseline vs. MILP	33.68	33.30	34.06	<0.001
	Heuristic vs. MILP	−0.41	−0.96	0.77	<0.001
Dorado	Baseline vs. Heuristic	0.00	−0.14	0.14	1.000
	Baseline vs. MILP	−2.59	−2.73	−2.44	<0.001
	Heuristic vs. MILP	−2.59	−2.73	−2.44	<0.001
KM15	Baseline vs. Heuristic	13.39	12.27	14.94	<0.001
	Baseline vs. MILP	18.12	17.46	18.79	<0.001
	Heuristic vs. MILP	4.73	4.06	5.39	<0.001

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Barrera Hernandez, J.E.; Tarazona Torres, L.E.; Tabares, A.; Álvarez-Martínez, D. Optimization of Bus Dispatching in Public Transportation Through a Heuristic Approach Based on Passenger Demand Forecasting. Smart Cities 2025, 8, 87. https://doi.org/10.3390/smartcities8030087

