Application of Integrated Geospatial Analysis and Machine Learning in Identifying Factors Affecting Ride-Sharing Before/After the COVID-19 Pandemic
Abstract
1. Introduction
2. Literature Review
3. Materials and Methods
- TNC trips data for 2019 (https://data.cityofchicago.org/Transportation/Transportation-Network-Providers-Trips-2019/iu3g-qa69/about_data (accessed on 2 February 2025)) and 2023 (https://data.cityofchicago.org/Transportation/Transportation-Network-Providers-Trips-2023-2024-/n26f-ihde/about_data (accessed on 2 February 2025))
- American Community Survey (U.S. Census Bureau, Washington, DC, USA) (socio-economic data) (https://www.census.gov/programs-surveys/acs (accessed on 12 March 2025))
- Smart location dataset (built environment data) (https://www.epa.gov/smartgrowth/smart-location-mapping#SLD (accessed on 15 January 2025))
- Census tract boundaries, TIGER/Line (https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.html (accessed on 12 March 2025))
4. Methodology
4.1. XGBoost
4.2. XGBoost Model Evaluation
- Accuracy, the share of predictions that match the observed value of the dependent variable.
- Log loss (cross-entropy loss), a standard metric for probabilistic classification. Lower values indicate better calibration of the predicted probabilities.
- Area under the ROC curve (AUC), a commonly used metric that evaluates a classifier’s ability to distinguish between classes across all possible thresholds. The ROC curve plots the true positive rate (TPR) against the false positive rate (FPR). An AUC of 0.5 indicates random guessing, and 1.0 indicates a perfect classifier. Mathematically, it is the integral of TPR over FPR from 0 to 1.
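The display equations for these three metrics were not preserved in this copy. As a sketch only, the standard textbook forms are given below for the binary case. The notation is an assumption and may differ from the authors': $N$ is the number of observations, $y_i \in \{0, 1\}$ the observed class, $\hat{y}_i$ the predicted class, and $\hat{p}_i$ the predicted probability that $y_i = 1$.

```latex
\begin{align}
\text{Accuracy} &= \frac{1}{N} \sum_{i=1}^{N} \mathbf{1}\!\left[\hat{y}_i = y_i\right] \\
\text{Log Loss} &= -\frac{1}{N} \sum_{i=1}^{N} \left[ y_i \log \hat{p}_i + (1 - y_i) \log\left(1 - \hat{p}_i\right) \right] \\
\text{AUC} &= \int_{0}^{1} \mathrm{TPR} \; d(\mathrm{FPR})
\end{align}
```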
4.3. Model Interpretation—SHAP
5. Results
5.1. Descriptive Analysis
5.2. Statistical Analysis
6. Discussion
7. Conclusions
Limitations and Future Studies
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Variable | Description | Source |
---|---|---|
Walkability index (0–20) | A composite score (0 = least walkable, 20 = most walkable) reflecting how suitable an area is for walking, based on land use mix, street connectivity, and pedestrian infrastructure. | EPA Smart Location |
Job accessibility (D5BR) | Jobs within a 45-min transit commute. | EPA Smart Location |
Average transit frequency (D4D) | The aggregate frequency of peak hour transit service per square mile in an area. | EPA Smart Location |
Road network density (mi/sq mi) | Total length of roadway facilities in miles per square mile of land area. | EPA Smart Location |
Job density (jobs/acre) | The number of jobs located per acre of an area, indicating job concentration. | EPA Smart Location |
Population density (pop/acre) | Number of people per acre in an area, representing residential concentration. | American Community Survey |
Job/Pop ratio | The ratio of jobs to population in an area. | EPA Smart Location |
Median income (USD 1000/household) | Median annual household income in thousands of dollars. | American Community Survey |
Percentage of non-white population | Proportion of the population in an area that identifies as a race or ethnicity other than non-Hispanic white. | American Community Survey |
Percentage of bachelor’s or higher degrees | Portion of the population with a bachelor’s degree or higher in an area. | American Community Survey |
Percentage of households with no car | Portion of households in an area that do not own a car. | American Community Survey |
Bus stop density (stops/sq mi) | Number of bus stops per square mile of an area. | Chicago Metropolitan Agency for Planning |
Rail station density (station/sq mi) | Number of urban rail stations per square mile of an area. | Chicago Metropolitan Agency for Planning |
Crime density (crimes/sq mi) | Number of crime incidents per square mile of an area. | Chicago Police Department (Chicago, IL, USA) |
Variable | Mean | Std | Min | Max |
---|---|---|---|---|
Total number of trips | 79,383 | 230,770 | 39 | 3,307,656 |
Average cost (USD/trip) | 11 | 2 | 8 | 23 |
Number of shared trips requested | 12,953 | 29,504 | 16 | 463,168 |
Number of unshared trips | 66,430 | 202,071 | 23 | 2,844,488 |
Total shared (matched) trips | 10,091 | 23,792 | 11 | 389,279 |
Shared (matched) trip percentage | 22.1% | 9.0% | 3.6% | 41.2% |
Crime density (crimes/sq mi) | 1685 | 1585 | 14 | 15,014 |
Walkability index (0–20) | 14.51 | 1.73 | 8.58 | 19.67 |
Job accessibility by transit (D5BR) | 539 | 247 | 11 | 1498 |
Total transit frequency (D4D) | 1885 | 4120 | 15 | 59,709 |
Average transit frequency (D4D) | 637 | 1145 | 14 | 12,637 |
Employment | 1678 | 12,395 | 0 | 331,288 |
Population | 3413 | 1872 | 347 | 20,087 |
Road network density (mi/sq mi) | 30.41 | 7.25 | 9.04 | 68.66 |
Job density (jobs/acre) | 11.54 | 54.56 | 0 | 1216.63 |
Population density (pop/acre) | 29.49 | 24.46 | 0.75 | 403.82 |
Job/pop | 0 | 2 | 0 | 44 |
Median income (USD 1000/household) | 60.472 | 34.599 | 11.146 | 194.167 |
Percentage of bachelor’s or higher degrees | 15.5% | 11.4% | 0% | 45.5% |
Percentage of non-white population | 69.4% | 29.5% | 7.6% | 100% |
Percentage of households with no car | 26.6% | 15.1% | 0.5% | 75.3% |
Bus stop density (stops/sq mi) | 631 | 332 | 0 | 3029 |
Rail station density (station/sq mi) | 10 | 35 | 0 | 399 |
Variable | Mean | Std | Min | Max |
---|---|---|---|---|
Total number of trips | 49,504 | 142,267 | 19 | 1,827,425 |
Average cost (USD/trip) | 18.9 | 1.8 | 14.5 | 32.7 |
Number of shared trips requested | 1529 | 2931 | 0 | 38,347 |
Number of unshared trips | 47,975 | 139,572 | 19 | 1,789,078 |
Total shared (matched) trips | 784 | 1574 | 0 | 22,411 |
Shared (matched) trip percentage | 3.2% | 2.1% | 0% | 9.9% |
Crime density (crimes/sq mi) | 1709 | 1379 | 14 | 15,332 |
Walkability index (0–20) | 14.5 | 1.7 | 8.6 | 19.7 |
Job accessibility by transit (D5BR) | 539 | 247 | 11 | 1498 |
Total transit frequency (D4D) | 1885 | 4120 | 15 | 59,709 |
Average transit frequency (D4D) | 637 | 1145 | 14 | 12,637 |
Employment | 1678 | 12,395 | 0 | 331,288 |
Population | 3425 | 1860 | 375 | 19,889 |
Road network density (mi/sq mi) | 30.41 | 7.25 | 9.04 | 68.66 |
Job density (jobs/acre) | 11.54 | 54.56 | 0 | 1216.63 |
Population density (pop/acre) | 29.55 | 24.44 | 0.72 | 411.31 |
Job/Pop | 0.42 | 1.89 | 0 | 45.06 |
Median income (USD 1000/household) | 74.046 | 41.361 | 13.438 | 250.001 |
Percentage of bachelor’s or higher degrees | 16.5% | 11.5% | 0% | 48.5% |
Percentage of non-white population | 69.9% | 28.7% | 10.4% | 100% |
Percentage of households with no car | 4.4% | 3.9% | 0% | 28.2% |
Bus stop density (stops/sq mi) | 631 | 332 | 0 | 3029 |
Rail station density (station/sq mi) | 10 | 35 | 0 | 399 |
Model | 2019 R² | 2023 R² |
---|---|---|
XGBoost (Training Dataset) | 0.98 | 0.95 |
XGBoost (Test Dataset) | 0.87 | 0.85 |
OLS (Entire Dataset) | 0.722 | 0.647 |
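The table above contrasts goodness of fit on training and test data. As a minimal sketch of how such values can be computed and compared, the snippet below implements the coefficient of determination with NumPy and takes the train/test difference using the XGBoost figures reported in the table. The function names `r_squared` and `generalization_gap` are hypothetical helpers introduced for illustration, not the authors' code.

```python
import numpy as np


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)         # unexplained variation
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)  # total variation
    return 1.0 - ss_res / ss_tot


def generalization_gap(r2_train, r2_test):
    """Drop in R^2 from training to test data (hypothetical helper)."""
    return r2_train - r2_test


# XGBoost values as reported in the table above.
gap_2019 = generalization_gap(0.98, 0.87)
gap_2023 = generalization_gap(0.95, 0.85)
print(f"train/test gap 2019: {gap_2019:.2f}, 2023: {gap_2023:.2f}")
```

A gap of roughly 0.1 in both years suggests some overfitting to the training tracts, although the test-set values remain above the OLS fit reported in the same table.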
Variable | Coefficient | Std Error | T-Statistic | p-Value | VIF |
---|---|---|---|---|---|
Constant | 41.6080 | 1.629 | 25.537 | 0 | |
Bachelor’s Degree % | −0.5371 | 0.02 | −27.324 | 0 | 4.4 |
Population Density | −0.0731 | 0.009 | −8.065 | 0 | 3.9 |
Job Density | −0.016 | 0.004 | −4.366 | 0 | 1.4 |
Job Accessibility | 0.3409 | 0.083 | 4.108 | 0 | 8.5 |
Crime Density | 0.1129 | 0.014 | 7.819 | 0 | 3.8 |
Walkability Index | −0.9426 | 0.118 | −7.985 | 0 | 8.6 |
Bus Stop Density | 0.1853 | 0.067 | 2.766 | 0.006 | 8 |
Rail Station Density | −0.0076 | 0.005 | −1.486 | 0.138 | 1.2 |
R² | 0.722 | | | | |
Adjusted R² | 0.719 | | | | |
Variable | Coefficient | Std Error | T-Statistic | p-Value | VIF |
---|---|---|---|---|---|
Constant | 9.285 | 0.426 | 21.812 | 0 | |
Bachelor’s Degree % | −0.0971 | 0.005 | −19.93 | 0 | 4.4 |
Population Density | −0.0249 | 0.002 | −10.668 | 0 | 3.7 |
Job Density | −0.0039 | 0.001 | −4.195 | 0 | 1.3 |
Job Accessibility | 0.1055 | 0.022 | 4.813 | 0 | 8.6 |
Crime Density | 0.0472 | 0.005 | 10.435 | 0 | 4.8 |
Walkability Index | −0.3545 | 0.031 | −11.51 | 0 | 8.6 |
Bus Stop Density | 0.0219 | 0.018 | 1.231 | 0.219 | 8.2 |
Rail Station Density | −0.0028 | 0.001 | −2.047 | 0.041 | 1.25 |
R² | 0.647 | | | | |
Adjusted R² | 0.643 | | | | |
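The OLS tables above report a variance inflation factor (VIF) for each predictor and an adjusted R². The sketch below shows one common way to obtain both, assuming the textbook definitions: VIF_j = 1 / (1 - R_j²), where R_j² comes from regressing predictor j on the remaining predictors, and adjusted R² = 1 - (1 - R²)(n - 1) / (n - p - 1). The function names and the synthetic data are illustrative assumptions; the source does not state how the authors computed these quantities.

```python
import numpy as np


def variance_inflation_factors(X):
    """Return VIF_j = 1 / (1 - R_j^2) for every column j of X.

    R_j^2 is obtained by regressing column j on all other columns
    plus an intercept, using ordinary least squares.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    vifs = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        ss_res = resid @ resid
        ss_tot = np.sum((target - target.mean()) ** 2)
        vifs[j] = ss_tot / ss_res  # algebraically equal to 1 / (1 - R_j^2)
    return vifs


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R^2 for the number of predictors in the model."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Synthetic illustration (not the study data): column c is nearly a copy
# of column a, so both should show a high VIF; column b is independent.
rng = np.random.default_rng(0)
a = rng.normal(size=500)
b = rng.normal(size=500)
c = a + 0.1 * rng.normal(size=500)
vifs = variance_inflation_factors(np.column_stack([a, b, c]))
print(np.round(vifs, 1))
```

Thresholds of 5 or 10 are often quoted as rules of thumb for problematic collinearity. By that reading, the values near 8.5 reported for job accessibility, walkability index, and bus stop density would merit attention, while the remaining predictors sit comfortably lower.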
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Published by MDPI on behalf of the International Society for Photogrammetry and Remote Sensing. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Allahyari, A.; Peiravian, F. Application of Integrated Geospatial Analysis and Machine Learning in Identifying Factors Affecting Ride-Sharing Before/After the COVID-19 Pandemic. ISPRS Int. J. Geo-Inf. 2025, 14, 291. https://doi.org/10.3390/ijgi14080291