Article

A Data-Driven Approach to Estimating Passenger Boarding in Bus Networks

by Gustavo Bongiovi 1,†, Teresa Galvão Dias 2,†, Jose Nauri Junior 3,† and Marta Campos Ferreira 2,*,†

1 Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, s/n, 4200-465 Porto, Portugal
2 Instituto de Engenharia de Sistemas e Computadores, Tecnologia e Ciência (INESC TEC), Faculdade de Engenharia, Universidade do Porto, Rua Dr. Roberto Frias, s/n, 4200-465 Porto, Portugal
3 Agência Reguladora de Serviços Públicos Delegados do Estado do Ceará (Arce), Av. Gen. Afonso Albuquerque Lima, Cambeba, Fortaleza 60822-325, Brazil
* Author to whom correspondence should be addressed.
† These authors contributed equally to this work.
Appl. Sci. 2026, 16(3), 1384; https://doi.org/10.3390/app16031384
Submission received: 19 December 2025 / Revised: 9 January 2026 / Accepted: 27 January 2026 / Published: 29 January 2026

Abstract

This study explores the application of multiple predictive algorithms under general versus route-specialized modeling strategies to estimate passenger boarding demand in public bus transportation systems. Accurate estimation of boarding patterns is essential for optimizing service planning, improving passenger comfort, and enhancing operational efficiency. This research evaluates a range of predictive models to identify the most effective techniques for forecasting demand across different routes and times. Two modeling strategies were implemented: a generalistic approach and a specialized one. The latter was designed to capture route-specific characteristics and variability. A real-world case study from a medium-sized metropolitan region in Brazil was used to assess model performance. Results indicate that ensemble-tree-based models, particularly XGBoost, achieved the highest accuracy and robustness in handling nonlinear relationships and complex interactions within the data. Compared to the generalistic approach, the specialized approach demonstrated superior adaptability and precision, making it especially suitable for long-term and strategic planning applications. It reduced the average RMSE by 19.46% (from 13.84 to 11.15) and the MAE by 17.36% (from 9.60 to 7.93), while increasing the average R2 from 0.289 to 0.344. However, these gains came with higher computational demands and a larger mean Forecast Bias (from 0.002 to 0.560), indicating a need for bias correction before operational deployment. The findings highlight the practical value of predictive modeling for transit authorities, enabling data-driven decision making in fleet allocation, route planning, and service frequency adjustment. Moreover, accurate demand forecasting contributes to cost reduction, improved passenger satisfaction, and environmental sustainability through optimized operations.

1. Introduction

Urban mobility is a critical factor in the development of modern cities. As urban populations expand, the challenges of ensuring efficient, reliable, and accessible public transportation become increasingly significant. Public transportation plays a critical role in promoting sustainability in urban environments. It aids in reducing traffic congestion, lowering greenhouse gas emissions, and improving air quality, thereby contributing towards the United Nations’ Sustainable Development Goal (SDG) 13: Climate Action [1,2].
Moreover, public transportation provides equitable access to mobility, allowing individuals from various socioeconomic backgrounds to commute conveniently [3]. This access is crucial for fostering inclusive urban development and supporting economic growth by connecting people to jobs, education, and other essential services [4].
The accurate estimation of passenger boarding patterns in public transportation systems is essential for optimizing service operations, enhancing passenger comfort, reducing operational costs, and improving the daily lives of millions of commuters [2,5]. By optimizing routes and schedules based on actual demand, this research helps ensure that everyone, including those with lower incomes, has reliable access to essential services, shorter commute times, and better access to education, promoting social equity and contributing towards SDG 10: Reduced Inequality.
Highly populated cities exemplify the need for a detailed analysis of passenger boarding demand. As urban populations grow, the complexity of public transport networks increases, requiring careful planning to ensure that operations effectively meet passengers’ mobility needs. Understanding and addressing the unique challenges faced by densely populated cities is therefore crucial for developing data-driven strategies that enhance the efficiency, equity, and sustainability of public transportation systems worldwide.
This study focuses on developing and comparing different predictive methods, including those proposed in the literature, for estimating passenger boarding demand in public bus transportation. The primary goal is to identify the most accurate and efficient modeling approaches to support demand forecasting. Such forecasts can optimize operations by reducing overcrowding, improving passenger comfort, and increasing service efficiency. By developing reliable models for predicting where and when passengers board, transport authorities can improve route planning and allocate resources more effectively. The research is applied to a case study in a medium-sized metropolitan region in Brazil, providing practical insights into the applicability of these methods in real-world contexts.
The contributions and originality of this research lie not only in the application and comparative evaluation of multiple predictive algorithms to estimate passenger boarding demand in public bus transportation, but also in the employment of two distinct modeling approaches to examine the trade-offs between estimation accuracy and computational efficiency.
The first is a generalistic strategy, where a single predictive model is trained on data from a representative, high-coverage route and then deployed across the entire network for prediction. This approach offers operational simplicity, low computational cost, and ease of maintenance, as it requires developing and updating only one model. However, it may fail to capture the unique flow patterns and operational singularity of individual routes. The second is a specialized strategy, which involves developing and training a dedicated model for each route and travel direction. This approach is designed to capture route-specific characteristics and variability, potentially yielding higher accuracy, but at the cost of increased computational demands for training and maintaining multiple models.
By employing different methodological frameworks alongside a broad suite of algorithms, from linear regression to Deep Learning, this study seeks to identify the most effective techniques for accurately estimating boarding demand while balancing model performance and resource requirements.
The remainder of the article is structured as follows: the next section explores the state of the art regarding Automated Fare Collection systems and forecasting methods. Section 3 details the methodological approach followed to conduct this study. Section 4 presents the results of the study and Section 5 presents the main conclusions.

2. State of the Art

This section introduces Automated Fare Collection systems and presents the state of the art of the most common methods for forecasting passenger demand in public transport.

2.1. Automated Fare Collection Systems

Automated Fare Collection (AFC) systems have emerged as pivotal components in modernizing public transportation networks. Their integration within Intelligent Transportation Systems (ITSs) has transformed the landscape of urban mobility by enhancing data accuracy and service efficiency, as they automatically detect the number of passengers boarding through the use of intelligent cards [6].
The advancement from traditional manual counting methods, such as travel surveys, offers significant benefits, not only in terms of data collection efficiency but also in the accuracy and reliability of the data captured. These methods were labor-intensive and prone to human error, and are now being replaced by automated systems that ensure consistent data collection across extensive networks. This shift is crucial in urban centers where managing peak-time congestion and optimizing fleet allocation can dramatically improve service quality and passenger satisfaction [7].
Furthermore, AFC systems synergize well with Automated Passenger Counting (APC) systems, facilitating a more integrated approach to transportation management; while AFC systems streamline fare collection and improve financial accountability, APC systems enrich the dataset with passenger flow dynamics. This combined data resource is invaluable for conducting comprehensive transportation studies, developing predictive models, and refining service routes based on actual usage patterns [8,9].
Current research continues to explore the potential of APC systems in reducing operational costs and enhancing passenger experience. For instance, studies have shown that real-time passenger count data can be used to dynamically adjust service frequency, allocate resources more efficiently during peak hours, and support targeted marketing strategies by identifying high-usage segments [10].

2.2. Forecasting Methods

In this subsection, a comprehensive analysis of some of the most common methods for forecasting passenger demand in public transport is presented. Table 1 provides an overview of the methods’ applications, advantages, and limitations.

2.2.1. Linear Regression

Linear regression remains a foundational statistical tool for predicting passenger boarding due to its straightforward implementation and interpretability. Its utility in quantifying relationships between passenger demand and influencing factors, such as the last observed passenger count, headway deviation, and environmental conditions like temperature and precipitation, has been demonstrated in numerous public transport studies. For instance, Sun et al. [11] highlighted the model’s adaptability to the complexities of transit data, while ref. [12] expanded its application to real-time APC systems, using direct and interactive variables to forecast short-term passenger trends and enhance operational planning.
Despite its simplicity, its predictive performance is highly competitive, often rivaling or even surpassing sophisticated machine learning algorithms in certain contexts [13]. However, it can produce unrealistic outputs, such as negative boarding values, and its efficacy diminishes in environments characterized by high variability and inconsistent predictor influences. Nevertheless, comparative analyses have shown that standard linear regression can be more suitable for boarding data than specialized count regression models, solidifying its status as a reliable baseline [12].

2.2.2. Elastic Net

Elastic Net (EN) combines Lasso (L1) and Ridge (L2) regularization, addressing the issues of multicollinearity and overfitting commonly found in transport demand data [14]. By balancing variable selection and coefficient shrinkage, it improves model stability and interpretability when predictors are highly correlated.
Geçici and Gürkaş-Aydin [15] applied the EN model to forecast hourly passenger numbers in Istanbul’s multimodal network, demonstrating its robustness against feature correlation and its ability to provide accurate and interpretable predictions. Likewise, Porat et al. [16] incorporated EN within a micromobility demand framework, confirming its reliability for handling mixed spatial–temporal data and maintaining accuracy competitive with more complex algorithms.
Despite its linear structure, Elastic Net remains a strong baseline model for transport forecasting, particularly when feature relationships are numerous and moderately correlated.

2.2.3. Random Forest

Random Forest (RF) is a technique that builds multiple decision trees and merges them to obtain more accurate and stable predictions. In the context of passenger demand, this model addresses the limitations often found in traditional regression models by effectively processing large datasets and accommodating the random variability in bus data. Its ensemble approach, which constructs numerous decision trees and aggregates their predictions, ensures robust performance against overfitting. This methodology minimizes the influence of outliers and noise, enhancing the predictive accuracy of demand estimates [17].
The effectiveness of the RF model in forecasting passenger numbers on buses was explored [12], using real-time data from bus operations, such as APC systems and weather information. The RF approach excels in managing complex, nonlinear data interactions that frequently characterize transit data, making it suitable for capturing intricate patterns of passenger flow.
Wood et al. [12] demonstrated the application of the Random Forest model in the transportation sector through comprehensive analysis. The algorithm not only predicts future bus boarding but also adapts to new patterns as data become available. This adaptability is crucial in the dynamic environment of public transportation, where passenger behaviors and external conditions like weather can swiftly alter demand dynamics.

2.2.4. XGBoost

XGBoost is a decision-tree-based ensemble machine learning algorithm that uses a gradient-boosting framework, designed to optimize both computational speed and model performance [18]. XGBoost has been employed to classify the demand levels of trains based on query logs and external data sources, as demonstrated by [19]. This approach leverages the nonlinear and complex relationships inherent in the varied datasets typically used in transport studies, including timestamps, station codes, and external conditions such as weather.
The algorithm can handle large datasets and its robustness against overfitting makes it particularly suited for the dynamic and often-unpredictable patterns observed in public transit usage data.
Vandewiele et al. [19] trained the XGBoost model on historical data to predict demand levels, processing features extracted from query logs and incorporating additional data sources such as time of day, routes, and passenger feedback. This prediction model aids in anticipating demand rates, enhancing the management of train capacities, and potentially leading to improved service quality and reduced overcrowding.

2.2.5. LightGBM

LightGBM is an advanced gradient-boosting framework that stands out for its efficiency in processing large data volumes. It builds on decision tree algorithms by optimizing traditional gradient-boosting methods, making it particularly suitable for scenarios where the dataset features nonlinear relationships and complex dependencies [20].
One study used LightGBM to analyze various data inputs [21], including real-time vehicle locations, passenger counts from APC systems, and external factors such as time and weather conditions. The algorithm’s robustness against overfitting and its capability to handle large datasets and complex feature interactions efficiently offer a sophisticated approach to predicting vehicle demand.
Furthermore, Gallo et al. [21] illustrated the application of LightGBM within a network-wide public transport occupancy prediction framework, focusing on the Zurich transport system. The proposed framework integrates LightGBM to predict demand across multiple lines, accounting for interactions between them, which enhances prediction accuracy significantly. The success of this method in Zurich’s complex urban transport network highlights LightGBM’s potential to enhance operational intelligence and support the development of more responsive public transportation systems.

2.2.6. LSTM

Long Short-Term Memory (LSTM) models are a type of Recurrent Neural Network, a Deep Learning approach that is particularly well-suited for time series prediction [22]. They are very effective in capturing long-term dependencies and patterns in time series data due to their ability to retain information over extended periods [23], offering robust performance in scenarios involving nonlinear and long-range temporal dependencies [24,25].
LSTM models excel at handling temporal variability, which is crucial for analyzing public transit arrival and departure events, and are generally quite accurate and can predict occupancy across multiple steps in a time series. However, their main disadvantages include a complicated training procedure and a significant lack of interpretability, often requiring surrogate models to explain their “black box” results [26].

3. Methodological Approach

The methodology used in this study was adapted from the Cross-Industry Standard Process for Data Mining (CRISP-DM) methodology, a structured, cyclical approach for planning, organizing, and implementing data mining projects proposed by [27]. The process ensures a comprehensive and systematic workflow, enhancing the reliability and validity of the research outcomes.

3.1. Data Understanding

The data used in this study are derived from the AFC system of the bus network of a medium-sized metropolitan region in Brazil: Fortaleza. With a population of over 3 million inhabitants, its six-zone network structure supports the rigorous evaluation of the proposed modeling strategies. The dataset integrates tap-on validations from the metropolitan fare system (“Bilhete Único”) with General Transit Feed Specification (GTFS) schedules and route geometry, which enables mapping boardings to fare segments, separating trips by direction, and testing models on routes with different footprints (single-zone and multi-zone). These characteristics are particularly important for the specialized approach adopted in this work: multi-zone lines test the models’ ability to capture spatial heterogeneity across zones, while bidirectional trip data allow the assessment of direction-specific demand dynamics.
The GTFS data of the selected bus lines were composed of four datasets, providing a standardized structure for schedules, routes, trips, and stop and zone locations, used to estimate the zone where each boarding took place. The combination of AFC and GTFS data provides both the spatial granularity and temporal volume needed to evaluate whether specialized models materially outperform the generalistic model in contexts with route heterogeneity and strong directional effects.
Initially, the data comprised 15 AFC datasets, each corresponding to one high-demand bus line, totaling 11,141,341 raw validation records (approximately 1.11 GB). Each validation corresponds to a single boarding event recorded when a passenger (or fare collector) validated access to the bus. Each dataset contains transactional and operational variables, including the following: (i) temporal information, the most relevant group (transaction timestamp; trip opening and closing times; service date); (ii) operational identifiers (trip direction; route id); (iii) boarding counters (trip initial and final turnstile readings); and (iv) fare and passenger attributes (fare; type of passenger; passenger ID).
From this initial universe, a strategic sample of six bus lines was selected. Table 2 presents their corresponding summary statistics. The selection criteria were designed to ensure diversity in operational characteristics and robust coverage of the network, based on three key metrics: (1) passenger demand volume, (2) number of scheduled trips, and (3) number of zones covered. This sample represents 21.3% of the entire network’s passenger demand and 17.8% of all bus trips within the metropolitan area, and includes lines that collectively cover all zones, ensuring broad representativeness. Of the six lines, five operate in multiple zones while one operates in a single zone: the center of the metropolitan area.
Since each line operates in two travel directions, each direction was treated as an independent dataset, resulting in 12 datasets. Each line–direction combination was treated as a “route”. For instance, Route 10 corresponds to line 1 with travel direction 0, while Route 11 corresponds to line 1 with travel direction 1; line 1, in contrast, encompasses both directions of the same line.
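The route identifiers above follow a simple concatenation of line number and travel direction. A minimal sketch of this encoding (the function name is illustrative, not from the study):

```python
def route_id(line: int, direction: int) -> int:
    """Encode a (line, direction) pair as a route identifier.

    For example, line 1 with direction 0 becomes Route 10,
    and line 1 with direction 1 becomes Route 11.
    """
    return line * 10 + direction
```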
After the data preparation, described in the following subsections, the final aggregated datasets comprised 291,460 records, each corresponding to the number of boarding events in a given zone for a specific trip, which constitute an effective sample size used for model training and evaluation.
Figure 1 shows that, as expected, the busiest days are regular workdays, Monday–Friday, with a noticeable drop during the weekend, especially on Sunday. Lines 5 and 6 exhibit the highest number of boarding instances, peaking around 30; however, it is important to note that line 5 operates in only three zones and line 6 in only one. In contrast, line 2, which operates in six zones, has the lowest and relatively stable boarding numbers throughout the week. The data highlight passenger trends across different lines, indicating higher demand on weekdays and a substantial decline during weekends. This consistent pattern of fluctuation across all bus lines suggests a seasonality that spans the entire analyzed network, legitimizing the use of a generalistic model.
Figure 2 reveals the peak periods of demand across routes. During the morning peak (04:30–07:00), lines 1, 5 and 6 display the highest boarding levels, with sharp increases relative to off-peak periods, indicating strong commuter-oriented demand. Line 5 shows a particularly sharp morning ramp-up, consistent with inter-zonal work-related travel, while line 6, despite operating within a single zone, also concentrates substantial demand during this period. In the evening peak (15:00–18:30), lines 1, 5 and 6 again display an absolute increase in passenger volume, suggesting a return-flow symmetry. Line 2, by contrast, maintains relatively even boarding levels across both peak and non-peak periods, as the only line that covers all zones. Each line shows unique overall fluctuations, with different tendencies in the deviations. These distinct patterns justify the use of a specialized approach to capture the unique characteristics of each line.

3.2. Data Preparation

The data preparation phase involved tasks such as data integration, data cleaning, data transformation, and feature engineering, which are detailed in the following subsections.

3.2.1. Data Integration

Since the datasets lacked boarding-zone information, it had to be estimated from the GTFS data, which provide the expected time each bus passes through stops within designated zones. The route data were merged with the GTFS based on common identifiers to construct a robust dataset containing the boarding zone inferred from each transaction timestamp.

3.2.2. Data Cleaning

Missing values were checked for each bus route and each variable individually by applying the dropna function, which drops rows with any null values. Missing values were detected only for the variable cartao_xml, in all datasets. This column refers to the card used to validate entry to the bus: some passengers pay with cash instead of using a card, in which case the fare collector uses their general card to allow entry, and no value appears in this column.
Illogical values were identified only on the total_turns variable. This column refers to the total number of boardings in the trip, and negative values were detected as a result of the turnstile system resetting after reaching 99,999. These values were corrected by adding 99,999 to the negative counts, thereby restoring an accurate number of boardings.
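The wrap-around correction described above can be sketched as follows (the function name is illustrative; it follows the text’s rule of adding 99,999 back to negative counts):

```python
COUNTER_MODULUS = 99_999  # the turnstile counter resets after reaching 99,999

def correct_turn_count(total_turns: int) -> int:
    """Restore a trip's boarding count when the turnstile counter wrapped.

    A wrap-around shows up as a negative total_turns value; per the text,
    it is corrected by adding the counter modulus back.
    """
    return total_turns + COUNTER_MODULUS if total_turns < 0 else total_turns
```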
To ensure numerical stability during model training and to limit the influence of extreme sparsity and unusually large aggregate values, a heuristic filtering step was applied to the trip_time variable, retaining rows with values between 0.25 and 2 times the mean. For line 4, which serves the most urbanized corridor and is heavily influenced by traffic effects, the limits were relaxed to between 0.15 and 2.5 times the mean. These thresholds were not derived from a theoretical distributional model but were adopted as practical bounds to reduce the impact of highly atypical observations that may arise from data noise, reporting irregularities, or rare operational conditions. The purpose of this filtering was to improve model robustness rather than to enforce strict statistical assumptions. Consequently, results should be interpreted within the context of these preprocessing choices, and future work should explore data-driven or adaptive thresholding strategies.
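A minimal sketch of this mean-based heuristic filter (the function name is illustrative; the default bounds match the main configuration, while line 4 would use 0.15 and 2.5):

```python
def within_bounds(values, lo_factor=0.25, hi_factor=2.0):
    """Keep observations between lo_factor*mean and hi_factor*mean.

    Defaults correspond to the main configuration in the text;
    line 4 would use lo_factor=0.15 and hi_factor=2.5.
    """
    mean = sum(values) / len(values)
    return [v for v in values if lo_factor * mean <= v <= hi_factor * mean]
```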
Furthermore, occasional bus trips outside the programmed schedules with a very low number of boardings were detected, suggesting technical or logistical operations. Thus, a threshold was defined: trips without a programmed schedule and with fewer than 50 occurrences (fewer than approximately one trip per week over the year) were excluded from the dataset. On average, this process removed 1.57% of rows from the datasets.
It is important to emphasize that these preprocessing thresholds are heuristic in nature and were selected to support model stability rather than to represent universal or optimal values.

3.2.3. Data Transformation

The data were converted to the correct types, categorical variables were encoded, and all integer variables were cast to the int32 format, ensuring the datasets were in an optimal format for processing. To streamline the dataset and reduce its dimensionality, redundant variables were eliminated by selecting only the relevant columns, i.e., those that contribute significantly to the model’s predictive power.

3.2.4. Feature Engineering

The target variable total_boardings was created by grouping by trip, zone, and date–hour of departure and counting the number of passengers boarding in each zone. This grouping yields the number of passengers per zone for each trip, establishing the target variable used for prediction.
Furthermore, period_bus_time was created to simplify the analysis of bus time data. The bus times were categorized into half-hour intervals instead of continuous-time data. This process converts each bus time entry into a categorical variable representing a specific interval, such as 00:00–00:30, 00:30–01:00, and so on, up to 23:30–00:00.
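The half-hour binning can be sketched as follows (the function name mirrors the feature name; the label format is an assumption):

```python
def period_bus_time(hour: int, minute: int) -> str:
    """Label a departure time with its half-hour interval.

    For example, 07:40 maps to '07:30-08:00' and 23:45 wraps
    around midnight to '23:30-00:00'.
    """
    start = 0 if minute < 30 else 30
    end_hour, end_min = (hour, 30) if start == 0 else ((hour + 1) % 24, 0)
    return f"{hour:02d}:{start:02d}-{end_hour:02d}:{end_min:02d}"
```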
The granular components of the date were extracted: the day of the month, the month, the quarter of the year, the weekday, and a weekend indicator. These components helped to capture different seasonal patterns in bus demand. To capture other temporal dependencies through intra-month variations, lag features were created for the number of passengers per zone: based on total_boardings, the number of passengers boarding 7 and 14 days prior to the current date was calculated. This step produced the features boardings_zone_lag_7 and boardings_zone_lag_14.
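A pandas sketch of these feature-engineering steps on a toy frame (column names follow the text but the data are illustrative; the lag shift assumes one row per zone and day, so sparse real data may need reindexing first):

```python
import pandas as pd

# Toy boarding-event records; one row per validation.
events = pd.DataFrame({
    "trip_id": [1, 1, 1, 2, 2],
    "zone": ["A", "A", "B", "A", "A"],
    "date": pd.to_datetime(["2024-01-01"] * 3 + ["2024-01-02"] * 2),
})

# Target: count boarding events per trip, zone and departure date.
target = (events.groupby(["trip_id", "zone", "date"])
                .size().rename("total_boardings").reset_index())

# Lag features: boardings in the same zone 7 and 14 days earlier.
daily = (target.groupby(["zone", "date"])["total_boardings"]
               .sum().reset_index().sort_values(["zone", "date"]))
for lag in (7, 14):
    daily[f"boardings_zone_lag_{lag}"] = (
        daily.groupby("zone")["total_boardings"].shift(lag)
    )
```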
Finally, the selection included the variables shown in Table 3.

3.3. Modeling

For this study, six predictive models commonly used in the literature, as discussed in Section 2, were adopted based on their effectiveness in handling time series data in public transportation systems.
The time series cross-validation (TSCV) technique was adopted to avoid overfitting and enhance the model’s ability to handle unseen data. Specifically, an expanding window strategy with five splits (K = 5) was used. Respecting the temporal chronology, TSCV divides the data into K folds, each representing a different time segment. In each iteration, the model is trained on past data and tested on future data.
To ensure the best model configuration and optimal performance, TSCV was combined with hyperparameter tuning. Tuning optimizes the hyperparameters that define the predictive model; the search for the best configuration was conducted over the predefined distributions of hyperparameters stated in Table 4, using RandomizedSearchCV from the sklearn library with a total of 50 iterations [28].
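A minimal sketch of combining an expanding-window time series split with randomized search, on synthetic data and with a reduced search budget (the study used 50 iterations and the distributions in Table 4; the features, targets, and parameter values here are illustrative):

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV, TimeSeriesSplit

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))            # stand-in features
y = 3 * X[:, 0] + rng.normal(size=200)   # stand-in boarding counts

# K = 5 expanding-window folds: each fold trains on all past data
# and tests on the next time segment, respecting chronology.
tscv = TimeSeriesSplit(n_splits=5)

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions={"n_estimators": [50, 100],
                         "max_depth": [3, 5, None]},
    n_iter=4,                            # reduced here; the study used 50
    cv=tscv,
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
search.fit(X, y)
```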
Two approaches to developing the models were adopted, one generalistic approach and one specialized. The first uses data from Route 20 (line 2, direction 0), the largest route that operates in all six zones, to create a model that can be deployed to any bus route. This approach simplifies model deployment and maintenance, enhances efficiency, and requires fewer computational resources.
The generalistic model was trained using data from Route 20 (line 2, direction 0), which was selected as a representative reference route based on operational considerations. Line 2 is the only one that operates across all six spatial zones of the network, ensuring exposure to the full range of spatial demand patterns present in the area. Furthermore, exploratory data analysis revealed that line 2 exhibits low variability in boardings across time and zones, indicating more stable demand patterns compared to other routes. This reduced dispersion minimizes the influence of route-specific anomalies. Route 20 was selected over Route 21 due to its higher number of validations.
On the other hand, the specialized approach involves developing individual models for each route. This allows us to tailor the model to the specific usage patterns of each route, potentially leading to more accurate and context-specific predictions.
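The contrast between the two deployment strategies can be sketched as follows (the train function is a placeholder for the full TSCV and tuning pipeline; route labels are illustrative):

```python
# Placeholder for the full training pipeline (TSCV + hyperparameter tuning).
def train(route: str) -> dict:
    return {"trained_on": route}

routes = ["10", "11", "20", "21"]

# Generalistic: one model fitted on the reference route, reused network-wide.
general_model = train("20")
generalistic = {route: general_model for route in routes}

# Specialized: a dedicated model per route (line + direction).
specialized = {route: train(route) for route in routes}
```

The generalistic dictionary maps every route to the same object (one model to maintain), while the specialized dictionary holds one independent model per route, trading computational cost for route-specific fit.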
The LSTM network architecture comprises a single LSTM layer followed by a fully connected output layer. The LSTM layer uses the ReLU activation function and includes a tunable number of hidden units and batch size, as described in Table 4. A dense layer with one neuron produces the final boarding demand prediction. No dropout or additional recurrent layers were employed, limiting model complexity and reducing the risk of overfitting. The model was trained using the MSE loss function with the Adam or RMSprop optimizer, selected during hyperparameter tuning.
The LSTM model was not used in the generalistic approach: because of its ability to capture detailed temporal patterns, it tends to fit closely to the specific characteristics of the data it is trained on, limiting its transferability to other routes.
The algorithm utilized to train, test, evaluate and deploy the models for each approach followed a structured workflow to ensure the optimal configuration and performance of the predictive models.

3.4. Evaluation

To evaluate the performance of the applied models, four metrics were adopted: Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), R-squared (R2), and Forecast Bias. Each provides distinct insights into the model’s accuracy.
RMSE is the square root of the average squared discrepancy between predicted and actual values, giving more weight to larger errors. According to [29], this metric is particularly important in contexts where large errors may have severe implications, such as public transportation planning.
$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}$$
where $n$ is the number of observations, $y_i$ are the observed values, and $\hat{y}_i$ are the predicted values.
MAE estimates the average absolute differences between the predicted and actual values. It is robust and provides a straightforward average magnitude of errors, while being less sensitive to higher deviations.
$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\left|y_i - \hat{y}_i\right|$$
where $n$ is the number of observations, $y_i$ are the observed values, and $\hat{y}_i$ are the predicted values.
R2, or the coefficient of determination, measures the proportion of variance in the dependent variable that is explained by the model. It ideally ranges from 0 to 1, where values closer to 1 indicate that the model explains a high proportion of the variability, suggesting better performance. However, values below 0 are possible when the model performs worse than a simple mean model, indicating very poor predictive power.
$$R^2 = 1 - \frac{\sum_{i=1}^{n}\left(y_i - \hat{y}_i\right)^2}{\sum_{i=1}^{n}\left(y_i - \bar{y}\right)^2}$$
where $y_i$ are the observed values, $\hat{y}_i$ are the predicted values, and $\bar{y}$ is the mean of the observed values.
Forecast Bias measures the average difference between predicted and actual values, providing insights into the overall tendency of the model to over-predict or under-predict. A Forecast Bias close to 0 indicates unbiased predictions, meaning the model does not systematically deviate from the true values.
$$\mathrm{Forecast\ Bias} = \frac{1}{n}\sum_{i=1}^{n}\left(\hat{y}_i - y_i\right)$$
where $\hat{y}_i$ are the predicted values, $y_i$ the observed values, and $n$ the total number of observations.
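The four metrics follow directly from their definitions; a minimal NumPy implementation is:

```python
# Direct implementations of the four evaluation metrics.
import numpy as np


def rmse(y, y_hat):
    """Root Mean Squared Error: penalizes large errors more heavily."""
    return float(np.sqrt(np.mean((np.asarray(y) - np.asarray(y_hat)) ** 2)))


def mae(y, y_hat):
    """Mean Absolute Error: average error magnitude, less sensitive to outliers."""
    return float(np.mean(np.abs(np.asarray(y) - np.asarray(y_hat))))


def r2(y, y_hat):
    """Coefficient of determination; can be negative when worse than the mean model."""
    y, y_hat = np.asarray(y), np.asarray(y_hat)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


def forecast_bias(y, y_hat):
    """Positive values indicate systematic over-prediction, negative under-prediction."""
    return float(np.mean(np.asarray(y_hat) - np.asarray(y)))
```

Note that a model can have zero Forecast Bias (over- and under-predictions cancel out) while still having a large RMSE, which is why the metrics are reported together.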

4. Results

This section presents the results obtained through the application of the proposed methodology for the generalistic and specialized approaches, to be further compared in the next section. These indicators allow assessing not only the overall accuracy of the models but also their ability to generalize across different operational contexts.

4.1. Generalistic Performance

Table 5 summarizes the results obtained from the generalistic approach, in which a single model was trained on the largest route and then deployed to all routes. After a comprehensive hyperparameter tuning process, the best configuration for each model was selected according to validation performance.
ElasticNet and XGBoost achieved the most consistent results across different metrics. Although the R2 scores remained negative, the lower RMSE and MAE values of ElasticNet (7.76 and 5.83, respectively) highlight its superior ability to minimize prediction errors on average. This suggests that the regularization effect introduced by the ElasticNet penalty effectively balanced bias and variance, enhancing the model’s generalization capacity in a heterogeneous dataset.
The results also indicate that more complex ensemble models, such as Random Forest, did not outperform linear methods. This may be attributed to the limited availability of explanatory features or the presence of high noise levels in the input data, which constrain the capacity of tree-based methods to identify meaningful nonlinear relationships. The small absolute bias values observed for the best models (below ±1) confirm that systematic over- or under-prediction tendencies were minimal, demonstrating a relatively balanced forecast behavior.
In Table 5, the R2 scores indicate that, on the test set of Route 20 itself, the predictions of each tuned model are less accurate than using the simple mean of Route 20’s training data as a constant forecast. This phenomenon occurs when a model is too regularized or when the time series has very low explainable variance relative to its mean. Route 20, as the most stable and multi-zonal line, presents a challenging forecasting scenario in which the global mean is a strong, hard-to-beat baseline. However, this result is specific to the model selection process on this particular route. Crucially, when the selected best model (ElasticNet) is subsequently deployed as the generalistic approach to other routes, as shown in Table 6, it yields positive R2 values for all 12 routes. This demonstrates that the patterns learned from Route 20, while not surpassing the mean on its own test set, contain transferable knowledge that provides better-than-mean predictions for the other routes.
The best model, ElasticNet, was then deployed to the individual routes; Table 6 summarizes the key metrics. The R2 results show low to moderate explanatory performance across the different routes, with Route 11 exhibiting the highest score of 0.41, meaning the model explains 41% of the variance in boardings for this route, the best fit among all routes. Additionally, the bias values across the routes are close to zero.

4.2. Specialized Performance

In this approach, twelve independent models were created, one for each bus route, allowing for greater adaptation to the specific operational and behavioral patterns of each line. The performance results obtained through the execution of Algorithm 1 and the manual selection of the best-performing models are summarized in Table 7.
Algorithm 1 Predictive models development
1: results ← [ ]
2: for each model do
3:    define all possible hyperparameter combinations H
4:    model_results ← [ ]
5:    for i ← 1 to 50 do // hyperparameter tuning
6:       randomly select a hyperparameter combination h ∈ H
7:       create five folds from the trainset and testset
8:       fold_metrics ← [ ]
9:       for each fold do // time-series cross-validation
10:         fit the trainset with hyperparameters h
11:         make predictions on the testset
12:         fold_metrics ← fold_metrics ∪ {performance metrics}
13:      end for
14:      model_results ← model_results ∪ {average(fold_metrics)}
15:   end for
16:   results ← results ∪ {model_results}
17: end for
18: return results
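Algorithm 1 can be sketched in runnable form for a single model family as follows. This is an illustrative sketch, not the study's exact code: the use of scikit-learn's `TimeSeriesSplit` for the time-series cross-validation, `RandomForestRegressor` as the example model, and RMSE as the sole fold metric are assumptions of the example.

```python
# Illustrative sketch of Algorithm 1 for one model family: random hyperparameter
# search with time-series cross-validation, scored here by average fold RMSE.
import random

import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import TimeSeriesSplit


def random_search(X, y, param_space, n_iter=50, n_splits=5, seed=0):
    rng = random.Random(seed)
    results = []
    for _ in range(n_iter):                              # hyperparameter tuning loop
        h = {k: rng.choice(v) for k, v in param_space.items()}
        fold_metrics = []
        for train_idx, test_idx in TimeSeriesSplit(n_splits=n_splits).split(X):
            model = RandomForestRegressor(**h, random_state=0)
            model.fit(X[train_idx], y[train_idx])        # fit on earlier data only
            pred = model.predict(X[test_idx])            # predict on later data
            fold_metrics.append(np.sqrt(np.mean((y[test_idx] - pred) ** 2)))
        results.append((h, float(np.mean(fold_metrics))))
    return min(results, key=lambda r: r[1])              # best (h, avg RMSE) pair
```

`TimeSeriesSplit` keeps the chronological order of observations, so each fold trains on the past and tests on the future, which is the property the pseudocode's "time-series cross validation" requires.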
Overall, the ensemble-tree-based models frequently emerged as the stronger performers across the different lines, with XGBoost being the best model in 8 out of the 12 routes. The variability in model performance across lines highlights the importance of tailoring the choice of model to the specific characteristics and challenges of each dataset. These results are in line with the findings reported in [19].
Model effectiveness varies with the specific characteristics of each route: for instance, the LSTM was the best model for Route 20, while the regression models consistently under-performed on the remaining routes, indicating that they are not well suited to predicting passenger numbers on the analyzed bus network.
Compared to the generalistic approach (Table 6), the specialized models generally achieved a substantial improvement in the key metrics. For instance, Routes 10, 11, 30, and 60 showed gains of more than 0.25 in R2, confirming that route-specific training enhances model adaptability and precision while reducing the error metrics.
However, it is important to highlight that in this approach the bias increased considerably on some routes, particularly Routes 40, 41, and 50, which exhibit high spatial heterogeneity in demand across zones and strong temporal asymmetry between peak and off-peak periods. This effect is amplified in routes with uneven zone-level boarding distributions and limited representation of low-demand intervals in the training data.
Although the selection procedure identifies the best-performing model based on aggregated error metrics, the results for Route 50 indicate that a bias-aware model selection may be beneficial. For instance, in this case, the linear regression model achieved a substantially lower bias (bias: 2.17) while maintaining competitive accuracy (RMSE: 17.79, MAE: 13.76), suggesting that constraining model selection by acceptable bias thresholds can effectively mitigate systematic Forecast Bias without significantly degrading overall predictive performance.
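Such a bias-aware selection rule could look like the following sketch: among candidate models whose absolute Forecast Bias stays under a threshold, pick the one with the lowest RMSE. The candidate records are illustrative; only the linear-regression figures (RMSE 17.79, MAE 13.76, bias 2.17) and the 6.69 bias come from the text, while the remaining XGBoost values are hypothetical placeholders.

```python
# Sketch of bias-aware model selection: constrain by an acceptable bias
# threshold, then minimize RMSE within the admissible set.

def select_bias_aware(candidates, bias_threshold):
    """candidates: list of dicts with 'name', 'rmse', and 'bias' keys."""
    admissible = [c for c in candidates if abs(c["bias"]) <= bias_threshold]
    pool = admissible or candidates  # fall back to all models if none qualify
    return min(pool, key=lambda c: c["rmse"])


# Route 50 illustration; the XGBoost RMSE below is a hypothetical placeholder.
candidates = [
    {"name": "xgboost", "rmse": 16.90, "bias": 6.69},            # accurate but biased
    {"name": "linear_regression", "rmse": 17.79, "bias": 2.17},  # slightly worse, far less biased
]
```

With a bias threshold of, say, 3 boardings, the rule would prefer the linear regression despite its marginally higher RMSE, which is exactly the trade-off discussed above.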
Based on the results, Route 10 demonstrated one of the best performances. As shown in Figure 3, the predictions tend to match the actual values, highlighting the model’s ability to estimate the number of boardings during both peak and off-peak periods. There is a noticeable pattern of high demand on regular weekdays, especially on the first trips of the day, which the model captures well. During non-peak times, the model also maintains a reasonable level of accuracy, though with slight deviations from the actual values.
Figure 4 exhibits the aggregated actual values and predictions for each day of the month. It shows a consistent cyclical pattern, indicating regular fluctuations tied to weekly cycles, such as higher values on weekdays and lower values on weekends. The alignment of peaks and troughs suggests that the model captures the timing of these fluctuations, with predictions occasionally slightly over- or underestimating the actual values.

5. Discussion

ElasticNet had the best overall results for the generalistic approach, indicating that it not only minimizes the error metrics but also maintains a relatively low bias in its forecasts. However, R-squared was poor for all models, suggesting that they struggled to explain the variance in the data. The results indicate that predictions vary significantly between different lines and directions, likely due to differences in passenger patterns and demand trends. These performances suggest that, while the model is somewhat robust, there is room for adjustments to further enhance its reliability.
In methodological terms, the relatively strong performance of ElasticNet compared to more complex ensemble methods suggests that linear relationships, when properly regularized, can still yield competitive predictive accuracy [12]. This finding supports the notion that lightweight, transparent models can serve as effective baselines in transportation analytics, as they offer a balance of reasonable accuracy, computational efficiency, and interpretability.
As supported by Table 8, the specialized route-level approach successfully captured local trends that might be lost in the generalistic framework. This improvement is especially relevant for practical implementation in small- and medium-scale transportation systems, where operational heterogeneity across lines can undermine the reliability of a single unified model [30].
The performance variability across lines highlights the importance of tailoring the modeling approach to the unique characteristics of each route and direction, as the literature suggests that modeling each travel direction separately provides higher accuracy [31]. For instance, while Route 10 achieved an R2 of 0.733, indicating strong explanatory power, other routes such as Routes 41 and 50 presented low or even negative R2 values, suggesting that additional contextual or temporal features may be required to improve predictive accuracy. Such discrepancies may also stem from route-specific factors such as irregular schedules, inconsistent passenger behaviors, or seasonal anomalies.
Furthermore, the bias values in Table 7 reveal interesting patterns: while some routes maintain near-zero bias, indicating well-balanced forecasts, others exhibit significant positive bias, reflecting a consistent overestimation of demand. This suggests that, although the specialized models capture general trends effectively, they may still benefit from a bias correction or calibration stage before real-world deployment [32]. As mentioned in Section 4.2, a bias-aware model selection would be beneficial to mitigate bias while maintaining prediction accuracy. Alternatively, a quantile-regression-based calibration could be employed to align predicted and observed demand distributions [33]; while these techniques were not implemented in the present study, they represent promising extensions for improving the robustness of specialized approach.
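One simple form such a calibration stage could take is an additive bias correction: estimate the mean residual on a held-out calibration set and subtract it from future predictions. This is a minimal sketch of the idea, not a technique implemented in the study.

```python
# Minimal additive bias-correction sketch: learn the constant offset that
# zeroes the Forecast Bias on a calibration set, then subtract it at
# prediction time.
import numpy as np


def fit_bias_correction(y_true, y_pred):
    """Return the mean residual (predicted minus actual) on calibration data."""
    return float(np.mean(np.asarray(y_pred) - np.asarray(y_true)))


def apply_bias_correction(y_pred, offset):
    """Shift raw predictions so their average deviation from truth is zero."""
    return np.asarray(y_pred) - offset
```

This only removes the systematic over- or under-prediction; it leaves RMSE-relevant scatter untouched, which is why richer alternatives such as quantile-regression-based calibration are worth exploring.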
Error metrics and R2 reveal the superior performance of the specialized approach over the generalistic one, with better R2 values for eight bus routes, an RMSE reduction of 19.46%, and an MAE reduction of 17.36%. However, this approach exhibited more bias, particularly for Routes 40, 41, and 50. For instance, Route 50 had a Forecast Bias of 6.69, compared to only 0.0015 with the generalistic model, highlighting a tendency to systematically overestimate demand. Despite the improved accuracy in error metrics and R2, the specialized models thus showed significant bias issues.
Specialized models are fine-tuned for the specific characteristics of each route, offering more accurate predictions of passenger flow compared to generalized models [34]. However, it is important to highlight that the development of specialized models requires a significantly longer computation time. In a practical scenario where this specialized approach would need to be applied to numerous bus lines and directions, the cumulative computation time would increase substantially and, therefore, so would costs [12]. This extended processing period is critical, especially when scaling the model to cover the entire public bus transportation network.
A potential strategic application is a hybrid framework: employing specialized, high-accuracy models for core, high-demand routes where optimization yields the greatest operational benefit, while using the efficient generalistic approach for peripheral or lower-volume services.
The findings from the evaluation of various predictive models for estimating passenger boarding demand in public bus transportation have several significant implications for operations and strategic planning. Accurate forecasts, derived from choosing the model that best fits each route, enable transport operators to optimize resource allocation and enhance service planning. By accurately estimating the expected number of boardings per route, direction, and time period, operators can adjust service levels to better match demand, reducing the overcrowding and underutilization that currently affect public transport [35,36]. This improved alignment between supply and passenger needs enhances the efficiency and responsiveness of the transportation system.
Furthermore, precise demand forecasting contributes to cost optimization. With a clearer understanding of boarding patterns, transport managers can make informed decisions on fleet deployment, vehicle scheduling, and resource distribution, reducing operational expenses such as fuel, maintenance, and idle time while improving asset utilization. These efficiencies translate directly into an enhanced passenger experience, including shorter wait times, reduced crowding, and more reliable service, which are key factors in increasing the attractiveness of public transport for daily commuters.
Beyond operational improvements, predictive modeling supports strategic and long-term decision making. Insights into evolving demand patterns enable proactive planning for route expansion, service redesign, or the introduction of new lines. Forecasting also provides a foundation for managing fluctuations in passenger volumes, allowing authorities to implement adaptive measures such as dynamic capacity allocation, special-event scheduling, or variable pricing strategies.
A key advantage of this approach is its adaptability and scalability. Developing tailored, route-specific models significantly enhances prediction accuracy, while maintaining the potential to scale across an entire transport network. This adaptability allows agencies to deploy data-driven solutions that address local variations in demand while contributing to overall network efficiency.
Finally, integrating predictive modeling into public transport operations promotes environmental sustainability [30,37]. Optimized service alignment reduces unnecessary trips, fuel consumption, and emissions, supporting broader urban sustainability goals. Additionally, adopting predictive analytics fosters a data-informed organizational culture, driving further innovation in service management, passenger engagement, and operational excellence.
Within this context, the predictive performance obtained in this study is consistent with values reported in the literature. Several studies report strong predictive performance for passenger demand forecasting tasks. For instance, Ref. [38] reported forecasting results for Thane and Mumbai, with MAE values between 4.338 and 5.561 and RMSE between 8.752 and 11.267 using LightGBM and XGBoost models on large metropolitan datasets. Similarly, Ref. [39] achieved MAE and RMSE values of 3.13 and 4.78, respectively, for station-level predictions in Salamanca. Comparable results have been reported for large Chinese cities, such as Guangzhou and Dalian, where station-level models achieved RMSE values between 3.58 and 4.76 [40,41]. Spanos et al. [42] reported RMSE values ranging from approximately 8.6 to 29.8, depending on network complexity and city size (Tampere, Frankfurt, Carinthia, and Trikala). Similarly, a study on the large network of Qingdao reported MAE and RMSE values of 14.91 and 19.80, respectively [43]. The results obtained with the generalistic approach in this study fall within this range, with average RMSE and MAE values of 13.84 and 9.60 across all routes. More importantly, the specialized approach reduced these errors to average values of 11.15 (RMSE) and 7.93 (MAE), corresponding to reductions of approximately 19.5% and 17.4%, respectively.
However, these results are obtained under substantially different conditions, including different demand volumes, spatial conditions, travel patterns and other characteristics. As the present study focuses on a medium-sized metropolitan region characterized by heterogeneous routes, multi-zone operations, and pronounced variability in passenger behavior, these contextual differences directly affect the scale and distribution of the target variable, making the direct numerical comparisons of MAE and RMSE values across studies with different localities and data inherently problematic.
The main contribution of this work lies in demonstrating, through quantitative evidence under identical data and evaluation conditions, that a specialized approach yields measurable and systematic performance gains over a transferable generalistic approach. This controlled comparison shows that specialized models systematically improve explanatory power, with R2 values increasing for eight out of twelve routes, and reduce prediction error. The results also reveal that this improved accuracy may come at the cost of increased Forecast Bias and longer computing times for specific routes.
In summary, incorporating data-driven approaches to estimate bus passenger boarding in public transport systems strengthens operational performance, passenger satisfaction, and sustainability. By embedding these models into both day-to-day management and strategic planning, transport authorities can build more efficient, equitable, and resilient mobility systems that better serve the evolving needs of urban populations.

6. Conclusions

This study demonstrates the potential of predictive modeling to enhance the understanding and management of passenger boarding demand in public bus systems. Among the evaluated approaches, specialized models proved especially valuable despite their higher computational cost, as they capture route-specific dynamics more effectively and weigh the most relevant variables for each case. For long-term applications and strategic planning aimed at improving passenger satisfaction, these tailored models offer greater precision and reliability. Their success highlights the importance of adopting context-sensitive approaches in public transportation, where recognizing route-level variability is essential to efficient service delivery.
The comparative analysis of predictive algorithms revealed that ensemble-tree-based methods, particularly XGBoost, consistently outperformed other models across the evaluated routes. Their ability to model nonlinear relationships and complex interactions makes them well suited for routes with highly variable passenger demand.
The implications of these findings extend beyond technical performance. Accurate passenger demand estimation enables transit authorities to adjust service frequency, select appropriate vehicle types, and allocate resources based on actual ridership patterns and seasonal trends. Such precision planning can reduce overcrowding during peak hours, minimize operational costs during off-peak periods, and improve both passenger comfort and overall service efficiency. Ultimately, this contributes to a more adaptive and sustainable public transport system.
While this work suffices to demonstrate that the specialized approach outperforms the generalistic one, the feasibility of scaling the specialized approach to a full network requires careful consideration. Developing and maintaining hundreds of individual route models entails significant computational and administrative overhead, as noted in Section 5. This challenge highlights the practical value of the generalistic approach as a scalable solution.
Future work should explicitly test, on a larger and more representative sample of routes, the performance degradation of the generalistic approach and possible computational-efficiency gains for the specialized approach. A direct comparison under these scaled conditions would provide clearer, actionable guidance for transit agencies.
Future research should also address the absence of alighting and stop-by-stop boarding data. This gap prevents the construction of full Origin–Destination (OD) matrices and the direct measurement of in-vehicle occupancy levels. Without this information, models may overestimate demand in zones that are actually alighting-dominated and underestimate it where passengers concentrate transfers. Prediction accuracy is also reduced, since trip-chaining effects cannot be modeled, and systematic bias can be induced in zones that behave primarily as origins or sinks (e.g., terminals with high alighting or residential zones with high boardings).
Therefore, future refinements should prioritize the integration of detailed stop-level boarding data and contextual variables, such as weather conditions, traffic states, special events, holidays, and school schedules, to better capture demand drivers and reduce unexplained variability. In addition, studying and testing trip-chaining assumptions and probabilistic OD estimation techniques, supported by auxiliary data sources, would allow partial reconstruction of alighting flows and in-vehicle occupancy patterns, mitigating spatial bias. Finally, extending the proposed framework to multimodal transport networks would enable the analysis of intermodal transfers and network-wide demand propagation, offering a more comprehensive representation of urban mobility dynamics.
In summary, this work highlights the value of data-driven, route-specific modeling for optimizing public transport operations. By leveraging predictive analytics, authorities can move towards more efficient, equitable, and sustainable urban mobility systems.

Author Contributions

G.B.: investigation, formal analysis, methodology, writing—original draft. J.N.J.: data curation, validation and writing—review and editing. T.G.D.: conceptualization, methodology, validation, supervision, writing—review and editing. M.C.F.: conceptualization, methodology, validation, supervision, writing—review and editing. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The datasets analyzed during the current study are not publicly available due to institutional restrictions.

Conflicts of Interest

The authors declare that they have no competing interests.

References

  1. Garus, A.; Mourtzouchou, A.; Suarez, J.; Fontaras, G.; Ciuffo, B. Exploring Sustainable Urban Transportation: Insights from Shared Mobility Services and Their Environmental Impact. Smart Cities 2024, 7, 1199–1220.
  2. Karjalainen, L.E.; Juhola, S. Framework for Assessing Public Transportation Sustainability in Planning and Policy-Making. Sustainability 2019, 11, 1028.
  3. Litman, T. Evaluating Public Transportation Health Benefits; Victoria Transport Policy Institute for the American Public Transportation Association: Victoria, BC, Canada, 2015.
  4. Glaeser, E.L.; Kahn, M.E. The greenness of cities: Carbon dioxide emissions and urban development. J. Urban Econ. 2010, 67, 404–418.
  5. Cervero, R. Transport infrastructure and Global Competitiveness: Balancing mobility and livability. Ann. Am. Acad. Political Soc. Sci. 2009, 626, 210–225.
  6. Pelletier, M.P.; Trépanier, M.; Morency, C. Smart card data use in public transit: A literature review. Transp. Res. Part C Emerg. Technol. 2011, 19, 557–568.
  7. Barry, J.J.; Newhouser, R.; Rahbee, A.; Sayeda, S. Origin and Destination Estimation in New York City with Automated Fare System Data. Transp. Res. Rec. 2002, 1817, 183–187.
  8. Wang, W.; Attanucci, J.P.; Wilson, N.H. Bus passenger Origin-Destination estimation and related analyses using automated data collection systems. J. Public Transp. 2011, 14, 131–150.
  9. Zhao, J.; Rahbee, A.; Wilson, N.H.M. Estimating a rail passenger trip Origin-Destination Matrix using automatic data collection systems. Comput.-Aided Civ. Infrastruct. Eng. 2007, 22, 376–387.
  10. Baratti, L. Automated Passenger Counting (APC) Systems and Their Use in Transport Companies. Benchmarking Among Different APC Systems: Technical, Functional and Economic Analysis. Ph.D. Thesis, Politecnico di Torino, Turin, Italy, 2021.
  11. Sun, W.X.; Song, T.; Zhong, H. Study on Bus Passenger Capacity Forecast Based on Regression Analysis including Time Series. In Proceedings of the 2009 International Conference on Measuring Technology and Mechatronics Automation, Zhangjiajie, China, 11–12 April 2009; pp. 381–384.
  12. Wood, J.; Yu, Z.; Gayah, V.V. Development and evaluation of frameworks for real-time bus passenger occupancy prediction. Int. J. Transp. Sci. Technol. 2023, 12, 399–413.
  13. Kuhn, M.; Johnson, K. Applied Predictive Modeling; Springer: New York, NY, USA, 2013.
  14. Zou, H.; Hastie, T. Regularization and variable selection via the elastic net. J. R. Stat. Soc. Ser. B Stat. Methodol. 2005, 67, 301–320.
  15. Geçici, E.; Aydin, Z.G. Estimation of future number of passes and optimization of number of trips based on Istanbul hourly public transportation data. Istanb. Univ.—J. Electr. Electron. Eng. 2024, 24, 238–246.
  16. Porat, O.; Fire, M.; Ben-Elia, E. A Comprehensive Machine Learning Framework for Micromobility Demand Prediction. arXiv 2025, arXiv:2507.02715.
  17. Biau, G.; Scornet, E. A random forest guided tour. TEST 2016, 25, 197–227.
  18. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ’16), San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
  19. Vandewiele, G.; Colpaert, P.; Janssens, O.; Van Herwegen, J.; Verborgh, R.; Mannens, E.; Ongenae, F.; De Turck, F. Predicting Train Occupancies based on Query Logs and External Data Sources. In Proceedings of the 26th International Conference on World Wide Web Companion (WWW ’17 Companion), Perth, Australia, 3–7 April 2017; pp. 1469–1474.
  20. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. LightGBM: A highly efficient gradient boosting decision tree. In Proceedings of the Advances in Neural Information Processing Systems 30 (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017.
  21. Gallo, F.; Sacco, N.; Corman, F. Network-Wide Public Transport Occupancy Prediction Framework with multiple line interactions. IEEE Open J. Intell. Transp. Syst. 2023, 4, 815–832.
  22. Pasini, K.; Khouadjia, M.; Same, A.; Ganansia, F.; Oukhellou, L. LSTM Encoder-Predictor for Short-Term Train Load Forecasting; Springer: Cham, Switzerland, 2020; pp. 535–551.
  23. Hochreiter, S.; Schmidhuber, J. Long Short-Term memory. Neural Comput. 1997, 9, 1735–1780.
  24. Gers, F.A.; Schmidhuber, J.; Cummins, F. Learning to Forget: Continual Prediction with LSTM. Neural Comput. 2000, 12, 2451–2471.
  25. Graves, A. Supervised Sequence Labelling with Recurrent Neural Networks; Springer: Berlin/Heidelberg, Germany, 2012.
  26. Monje, L.; Carrasco, R.A.; Rosado, C.; Sánchez-Montañés, M. Deep Learning XAI for bus passenger forecasting: A use case in Spain. Mathematics 2022, 10, 1428.
  27. Shearer, C. The CRISP-DM model: The new blueprint for data mining. J. Data Warehous. 2000, 5, 13–22.
  28. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine Learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.
  29. Chai, T.; Draxler, R.R. Root mean square error (RMSE) or mean absolute error (MAE)? Arguments against avoiding RMSE in the literature. Geosci. Model Dev. 2014, 7, 1247–1250.
  30. Hoppe, J.; Schwinger, F.; Haeger, H.; Wernz, J.; Jarke, M. Improving the prediction of passenger numbers in public transit networks by combining Short-Term forecasts with Real-Time occupancy data. IEEE Open J. Intell. Transp. Syst. 2023, 4, 153–174.
  31. Baro, J.; Khouadjia, M. Passenger flow forecasting on transportation network: Sensitivity analysis of the spatiotemporal features. In Proceedings of the 2021 International Conference on Data Mining Workshops (ICDMW), Virtual, 7–10 December 2021; pp. 734–741.
  32. Mikkelsen, L.M.; Schwefel, H.P.; Madsen, T.K.; Burggraf, A. Comparison of WLAN Probe and Light Sensor-Based Estimators of bus occupancy using live deployment data. Sensors 2022, 22, 4111.
  33. Feldman, S.; Bates, S.; Romano, Y. Calibrated multiple-output quantile regression with representation learning. J. Mach. Learn. Res. 2023, 24, 1–48.
  34. Lv, W.; Lv, Y.; Ouyang, Q.; Ren, Y. A Bus Passenger Flow Prediction Model Fused with Point-of-Interest Data Based on Extreme Gradient Boosting. Appl. Sci. 2022, 12, 940.
  35. Cantwell, M.; Caulfield, B.; O’Mahony, M. Examining the Factors that Impact Public Transport Commuting Satisfaction. J. Public Transp. 2009, 12, 1–21.
  36. Patra, S.S.; Vanajakshi, L. Application of Low-Cost IoT sensors for smart public transportation. Transp. Dev. Econ. 2024, 10, 25.
  37. Gupta, S.; Khanna, A.; Talusan, J.P.; Said, A.; Freudberg, D.; Mukhopadhyay, A.; Dubey, A. A Graph Neural Network Framework for Imbalanced Bus Ridership Forecasting. In Proceedings of the 2024 IEEE International Conference on Smart Computing (SMARTCOMP), Osaka, Japan, 29 June–2 July 2024; pp. 14–21.
  38. Patel, M.; Patel, S.B.; Swain, D.; Shah, S. Unleashing the potential of boosting techniques to optimize Station-Pairs passenger flow forecasting. Procedia Comput. Sci. 2024, 235, 32–44.
  39. Mariñas-Collado, I.; Sipols, A.E.; Santos-Martín, M.T.; Frutos-Bernal, E. Clustering and Forecasting Urban Bus Passenger Demand with a Combination of Time Series Models. Mathematics 2022, 10, 2670.
  40. Zhai, H.; Tian, R.; Cui, L.; Xu, X.; Zhang, W. A novel hierarchical hybrid model for Short-Term bus passenger flow Forecasting. J. Adv. Transp. 2020, 2020, 7917353.
  41. Zou, L.; Shu, S.; Lin, X.; Lin, K.; Zhu, J.; Li, L. Passenger Flow Prediction Using Smart Card Data from Connected Bus System Based on Interpretable XGBoost. Wirel. Commun. Mob. Comput. 2022, 2022, 5872225.
  42. Spanos, G.; Lalas, A.; Votis, K.; Tzovaras, D. Principal component Random forest for passenger demand forecasting in cooperative, connected, and automated mobility. Sustainability 2025, 17, 2632.
  43. Han, Y.; Wang, C.; Ren, Y.; Wang, S.; Zheng, H.; Chen, G. Short-Term prediction of bus passenger flow based on a hybrid optimized LSTM network. ISPRS Int. J. Geo-Inf. 2019, 8, 366.
Figure 1. Average boardings by day of the week across lines in 2023.
Figure 2. Average boardings by hour across lines in 2023.
Figure 3. Route 10 actual vs. predicted boardings by trip for the second week of May 2023.
Figure 4. Route 10 actual vs. predicted boardings by day for May 2023.
Table 1. Comparison of predictive models.
| Method | Applications | Key Limitations |
| --- | --- | --- |
| Linear Regression | Short-term forecasts; impact analysis of operational factors; computationally efficient; highly interpretable. | Assumes linearity; sensitive to outliers/collinearity; can yield illogical predictions. |
| Elastic Net | Forecasting with many correlated features (e.g., spatial–temporal data); automatic variable selection and regularization. | Inherits the linearity constraint; may not capture complex nonlinear passenger flow patterns. |
| Random Forest | Real-time occupancy prediction; modeling nonlinear interactions; robust to outliers and overfitting. | Low interpretability; computationally intensive with large forests. |
| XGBoost | High-accuracy forecasting on complex, mixed data types (including native handling of missing data). | Requires extensive tuning; computationally heavy; low interpretability. |
| LightGBM | Large-scale, network-wide forecasting; applications requiring fast training and low memory use. | Prone to overfitting on small datasets; sensitive to hyperparameter configuration. |
| LSTM | Capturing long-term, complex temporal dependencies. | Very low interpretability; data-hungry; computationally heavy. |
Table 2. Bus lines’ summary statistics.
| Line | Zones | Boardings (Average) | Boardings (Annual) | Trip Time (Average) | Trip Time (Median) |
| --- | --- | --- | --- | --- | --- |
| 1 | 4 | 17.32 | 855 k | 1.20 | 1.20 |
| 2 | 6 | 10.05 | 913 k | 1.97 | 1.95 |
| 3 | 3 | 14.96 | 871 k | 1.19 | 1.18 |
| 4 | 3 | 26.03 | 1121 k | 1.43 | 1.47 |
| 5 | 4 | 13.90 | 742 k | 1.19 | 1.17 |
| 6 | 1 | 27.19 | 597 k | 0.90 | 0.90 |
Table 3. Variables selected for training.
| Variable | Description | Type |
| --- | --- | --- |
| period_bus_time | Period of bus operation time | int * |
| day | Day of the month (1 to 31) | int |
| month | Month of the year (1 to 12) | int |
| quarter | Quarter of the year (1 to 4) | int |
| weekday | Weekday (1 to 7) | int * |
| weekend | Whether the day is a weekend | boolean |
| zone_id | ID of the zone | int |
| boardings_zone_lag_7 | Boardings in the zone lagged 7 days | int |
| boardings_zone_lag_14 | Boardings in the zone lagged 14 days | int |
| total_boardings | Boardings in the zone | int |
| direction ** | Direction of trip (0 = Outward; 1 = Back) | boolean |

* Categorical variable. ** Variable used only to split the data frame in two by travel direction.
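The calendar and lag features listed in Table 3 can be derived with pandas. A minimal sketch, under the assumption that boardings are first aggregated per zone per day (toy data; column values are illustrative only):

```python
import pandas as pd

# Toy daily boardings for a single zone; in the real data, records
# cover every zone_id and the lags are computed within each zone.
idx = pd.date_range("2023-05-01", periods=21, freq="D")  # 2023-05-01 is a Monday
df = pd.DataFrame({"date": idx, "zone_id": 1, "total_boardings": range(21)})

# Calendar features from Table 3.
df["day"] = df["date"].dt.day
df["month"] = df["date"].dt.month
df["quarter"] = df["date"].dt.quarter
df["weekday"] = df["date"].dt.dayofweek + 1  # 1 = Monday ... 7 = Sunday
df["weekend"] = df["weekday"] >= 6

# 7- and 14-day lagged boardings, shifted within each zone so that
# one zone's history never leaks into another's.
df = df.sort_values(["zone_id", "date"])
grouped = df.groupby("zone_id")["total_boardings"]
df["boardings_zone_lag_7"] = grouped.shift(7)
df["boardings_zone_lag_14"] = grouped.shift(14)
```

Note that the first 7 (respectively 14) days of each zone's series receive missing lag values, which must be dropped or imputed before training.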
Table 4. Hyperparameter distributions for tuning.
| Model | Hyperparameter | Distribution |
| --- | --- | --- |
| ElasticNet | regressor_alpha | uniform(0.001, 100) |
| | regressor_l1_ratio | uniform(0.1, 0.9) |
| | regressor_fit_intercept | [True, False] |
| | regressor_normalize | [True, False] |
| | regressor_max_iter | randint(1000, 5000) |
| | regressor_tol | uniform(1 × 10⁻⁴, 1 × 10⁻²) |
| | regressor_warm_start | [True, False] |
| Random Forest | n_estimators | [100, 200] |
| | max_depth | [10, 20, None] |
| XGBoost | n_estimators | randint(100, 500) |
| | learning_rate | uniform(0.001, 0.2) |
| | max_depth | randint(3, 13) |
| | min_child_weight | randint(1, 11) |
| | gamma | uniform(0, 0.2) |
| | subsample | uniform(0.6, 0.4) |
| | colsample_bytree | uniform(0.6, 0.4) |
| | reg_alpha | uniform(0, 1) |
| | reg_lambda | uniform(0, 1) |
| LightGBM | n_estimators | randint(100, 500) |
| | learning_rate | uniform(0.001, 0.2) |
| | num_leaves | randint(31, 128) |
| | max_depth | [−1, 10, 20, 30] |
| | min_data_in_leaf | randint(20, 201) |
| | feature_fraction | uniform(0.6, 0.4) |
| | bagging_fraction | uniform(0.6, 0.4) |
| | bagging_freq | randint(0, 11) |
| LSTM | units | [10, 50, 100] |
| | optimizer | [‘adam’, ‘rmsprop’] |
| | batch_size | [10, 50, 100] |
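Distributions of this form map directly onto scikit-learn's RandomizedSearchCV, where list entries are sampled uniformly and scipy.stats objects are sampled from their distribution. Note that scipy's uniform(loc, scale) draws from [loc, loc + scale], so an entry such as uniform(0.6, 0.4) covers the range [0.6, 1.0]. A minimal sketch for the Random Forest row of Table 4 (synthetic data; the iteration budget and fold count are assumptions, as the paper's values are not reproduced here):

```python
import numpy as np
from scipy.stats import uniform
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import RandomizedSearchCV

# Synthetic regression data standing in for the boarding features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = 3 * X[:, 0] + rng.normal(size=200)

# Random Forest row of Table 4: both entries are lists, sampled uniformly.
param_distributions = {
    "n_estimators": [100, 200],
    "max_depth": [10, 20, None],
}

search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions=param_distributions,
    n_iter=4,                 # assumption: iteration budget not stated here
    cv=3,
    scoring="neg_root_mean_squared_error",
    random_state=0,
)
search.fit(X, y)
print(search.best_params_)

# Continuous rows, e.g., XGBoost's learning_rate uniform(0.001, 0.2),
# sample from [0.001, 0.201] under scipy's (loc, scale) convention.
lr_dist = uniform(0.001, 0.2)
sample = lr_dist.rvs(random_state=0)
```

The same pattern applies to the XGBoost and LightGBM rows by swapping in the corresponding estimator and dictionary.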
Table 5. Generalistic approach model selection.
| Model | RMSE | MAE | R² | Bias |
| --- | --- | --- | --- | --- |
| Linear Regression | 8.065 | **5.949** | −0.266 | **−0.837** |
| **ElasticNet** | **7.761** | **5.825** | **−0.188** | **0.076** |
| Random Forest | 8.801 | 6.451 | −0.716 | 1.021 |
| XGBoost | **7.733** | 6.182 | **−0.232** | 1.936 |
| LightGBM | 7.833 | 6.174 | −0.274 | 1.698 |

Bold values indicate the two best results for each metric, while the bold model name denotes the overall best-performing model.
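For reference, the four metrics reported in Tables 5–8 can be computed as follows. Treating Bias as the mean forecast error (mean of predicted minus actual) is an assumption here, since the paper's exact sign convention is not reproduced in this excerpt:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([12.0, 8.0, 15.0, 10.0])   # observed boardings (toy values)
y_pred = np.array([10.0, 9.0, 13.0, 12.0])   # model predictions (toy values)

rmse = np.sqrt(mean_squared_error(y_true, y_pred))
mae = mean_absolute_error(y_true, y_pred)
r2 = r2_score(y_true, y_pred)
bias = float(np.mean(y_pred - y_true))  # assumption: Bias = mean forecast error

print(rmse, mae, r2, bias)
```

A bias near zero with a large RMSE indicates that over- and under-predictions cancel out on average, which is why both metrics are reported side by side.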
Table 6. Performance metrics for generalistic approach.
| Route | RMSE | MAE | R² | Bias |
| --- | --- | --- | --- | --- |
| 10 | 13.023 | 8.808 | 0.393 | 0.001 |
| 11 | 12.954 | 9.252 | 0.410 | 0.000 |
| 20 | 8.198 | 5.883 | 0.132 | 0.000 |
| 21 | 9.022 | 5.831 | 0.237 | −0.001 |
| 30 | 14.909 | 9.495 | 0.345 | 0.021 |
| 31 | 12.594 | 8.763 | 0.166 | 0.001 |
| 40 | 11.731 | 8.275 | 0.351 | −0.003 |
| 41 | 12.683 | 8.951 | 0.204 | 0.001 |
| 50 | 19.308 | 13.472 | 0.359 | 0.002 |
| 51 | 16.806 | 12.492 | 0.239 | −0.003 |
| 60 | 18.057 | 11.636 | 0.358 | 0.000 |
| 61 | 16.802 | 12.351 | 0.272 | −0.001 |
Table 7. Performance metrics for specialized approach.
| Route | Best Model | RMSE | MAE | R² | Bias |
| --- | --- | --- | --- | --- | --- |
| 10 | XGBoost | 8.553 | 6.173 | 0.733 | 0.618 |
| 11 | XGBoost | 9.774 | 7.275 | 0.577 | 0.899 |
| 20 | LSTM | 7.684 | 5.774 | −0.184 | 0.965 |
| 21 | XGBoost | 7.587 | 4.989 | 0.345 | −0.328 |
| 30 | XGBoost | 11.227 | 7.046 | 0.621 | −0.931 |
| 31 | Random Forest | 8.899 | 6.162 | 0.397 | 0.134 |
| 40 | XGBoost | 8.926 | 6.745 | 0.107 | 1.996 |
| 41 | XGBoost | 12.462 | 8.459 | 0.027 | −2.439 |
| 50 | XGBoost | 16.289 | 12.936 | −0.032 | 6.689 |
| 51 | LightGBM | 15.198 | 11.042 | 0.346 | 0.031 |
| 60 | XGBoost | 13.864 | 9.089 | 0.631 | −0.023 |
| 61 | LightGBM | 13.309 | 9.516 | 0.555 | −0.886 |
Table 8. Generalistic and specialized results comparison.
| Statistic | RMSE (Gen.) | MAE (Gen.) | R² (Gen.) | Bias (Gen.) | RMSE (Spec.) | MAE (Spec.) | R² (Spec.) | Bias (Spec.) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Sum | 166.086 | 115.208 | 3.467 | 0.019 | 133.772 | 95.206 | 4.123 | 6.723 |
| Mean | 13.841 | 9.601 | 0.289 | 0.002 | 11.148 | 7.934 | 0.344 | 0.560 |
| Median | 12.989 | 9.101 | 0.308 | 0.000 | 10.500 | 7.160 | 0.371 | 0.083 |

Gen. = generalistic model; Spec. = specialized model.
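The Sum, Mean, and Median rows simply aggregate the per-route columns of Tables 6 and 7. A sketch for the generalistic RMSE column (small last-digit differences from Table 8 are expected, since the per-route values are themselves rounded):

```python
import pandas as pd

# Per-route RMSE of the generalistic model, as reported in Table 6.
rmse = pd.Series([13.023, 12.954, 8.198, 9.022, 14.909, 12.594,
                  11.731, 12.683, 19.308, 16.806, 18.057, 16.802])

summary = rmse.agg(["sum", "mean", "median"]).round(3)
print(summary)  # mean ≈ 13.841, median ≈ 12.989, matching Table 8
```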