Investigating the Nonlinear Effect of Built Environment Factors on Metro Station-Level Ridership under Optimal Pedestrian Catchment Areas via the Machine Learning Method
Abstract
1. Introduction
2. Literature Review
Author | Analysis Methods | Main Independent Variables | Dependent Variables | PCA | Travel Mode | Analysis Area |
---|---|---|---|---|---|---|
Estupiñán et al. (2008) [56] | Two Stage Least Square | socioeconomic characteristics | Daily ridership | 250 m buffer zone | BRT | Bogotá, Colombia |
Sohn et al. (2010) [31] | OLS/SEM | socioeconomic characteristics, accessibility, land use and density | Average weekday ridership | 250 m buffer zone | Metro | Seoul, Republic of Korea |
Loo et al. (2010) [35] | OLS | socioeconomic characteristics, accessibility, rail transit service | Average weekday ridership | N/A | Metro | New York, USA and Hong Kong, China |
Gutiérrez et al. (2011) [33] | OLS | socioeconomic characteristics, land use and density | Monthly ridership | Threshold of change | Metro | Madrid, Spain |
Sung et al. (2011) [34] | OLS | socioeconomic characteristics, accessibility | Ridership by time of day, week, and mode of transport | 500 m buffer zone | Metro, Bus | Seoul, Republic of Korea |
Cardozo et al. (2012) [36] | GWR/OLS | socioeconomic characteristics, accessibility | Monthly ridership | 800 m and 200 m buffer zone | Metro | Madrid, Spain |
Zhao et al. (2013) [4] | OLS | socioeconomic characteristics, accessibility, land use and density | Annual average weekday ridership | 800 m buffer zone | Metro | Nanjing, China |
Zhao et al. (2013) [16] | OLS | accessibility, rail transit service | Ridership between stations | 800 m buffer zone | Metro | Nanjing, China |
Sung et al. (2014) [1] | SER | land use and density, socioeconomic characteristics, rail transit service | Average weekday ridership | 250, 500, 750, 1000, and 1500 m buffer zone | Metro | Seoul, Republic of Korea |
Jun et al. (2015) [22] | MGWR | land use and density, socioeconomic characteristics | | 600 m buffer zone | Metro | Seoul, Republic of Korea |
Calvo et al. (2019) [37] | GWR | land use and density, socioeconomic characteristics | Average weekday ridership | N/A | Metro | Madrid, Spain |
Ding et al. (2019) [21] | Gradient Boosting regression trees (GBRT) | socioeconomic characteristics, accessibility, land use and density, rail transit service | Average inbound ridership on weekdays | 400 m buffer zone | Metro | Washington, DC, USA |
Li et al. (2020) [23] | GWR | land use and density, socioeconomic characteristics | Weekday ridership, weekend ridership, weekday morning peak arrivals and evening peak departures average, weekday morning peak departures and evening peak arrivals average | 800 m buffer zone | Metro | Guangzhou, China |
Gan et al. (2020) [3] | Gradient Boosting regression trees (GBRT) | socioeconomic characteristics, land use and density, rail transit service | OD ridership in different time periods of a day | 800 m buffer zone | Metro | Nanjing, China |
Andersson et al. (2021) [39] | GWR | “5D” of Built Environment | Seasonal daily traffic volume | 600 m buffer zone | Metro | Taipei, China |
Wang et al. (2022) [17] | MGWR | “7D” of Built Environment | Alighting ridership during the morning peak hours | Overlapped by 1000 m radius circular buffer zone and Thiessen polygon | Metro | Beijing, China |
Du et al. (2022) [18] | Gradient Boosting regression trees (GBRT) | socioeconomic characteristics, land use and density, rail transit service | Weekday daily ridership, weekend ridership, weekday morning peak ridership, weekday evening peak ridership | 800 m grid distance | Metro | Xi’an, China |
3. Study Scope and Data
3.1. Study Scope and Data Sources
3.2. Explanatory Variables of the Built Environment
4. Methods
4.1. Pedestrian Catchment Areas (PCA) Delineation for Metro Stations
4.2. eXtreme Gradient Boosting (XGBoost)
4.3. Explanation of Machine Learning Models: SHAP (Shapley Additive exPlanations)
4.4. Mean Absolute Percentage Error
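As a minimal sketch of the metric named in this heading, the mean absolute percentage error averages the absolute relative errors between observed and predicted values and expresses the result in percent. The function name `mape` and the ridership figures below are hypothetical, chosen only for illustration; they are not taken from the study.

```python
def mape(actual, predicted):
    """Mean absolute percentage error, returned in percent.

    Assumes both sequences have the same length and that no observed
    value is zero (MAPE is undefined when an actual value is 0).
    """
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    # Average of |actual - predicted| / |actual| over all observations.
    total = sum(abs((a - p) / a) for a, p in zip(actual, predicted))
    return 100.0 * total / len(actual)


# Hypothetical station-level ridership values, for illustration only.
observed = [100.0, 200.0, 400.0]
estimated = [110.0, 180.0, 400.0]
error_percent = mape(observed, estimated)
```

A lower value indicates a closer fit, which is how the metric is typically used to compare candidate models or catchment area sizes.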
5. Results and Discussion
5.1. Optimal Metro Stations PCA for Different Zones
5.2. Global Impact on Metro Ridership
5.3. Nonlinear Effects on Metro Ridership
5.4. Spatial Heterogeneity Effect on Metro Ridership
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
1. Sung, H.; Choi, K.; Lee, S.; Cheon, S. Exploring the impacts of land use by service coverage and station-level accessibility on rail transit ridership. J. Transp. Geogr. 2014, 36, 134–140.
2. Chiou, Y.-C.; Jou, R.-C.; Yang, C.-H. Factors affecting public transportation usage rate: Geographically weighted regression. Transp. Res. Part A Policy Pract. 2015, 78, 161–177.
3. Gan, Z.; Yang, M.; Feng, T.; Timmermans, H.J.P. Examining the relationship between built environment and metro ridership at station-to-station level. Transp. Res. Part D Transp. Environ. 2020, 82, 102332.
4. Zhao, J.; Deng, W.; Song, Y.; Zhu, Y. What influences Metro station ridership in China? Insights from Nanjing. Cities 2013, 35, 114–124.
5. Li, S.; Liu, X.; Li, Z.; Wu, Z.; Yan, Z.; Chen, Y.; Gao, F. Spatial and Temporal Dynamics of Urban Expansion along the Guangzhou–Foshan Inter-City Rail Transit Corridor, China. Sustainability 2018, 10, 593.
6. Shen, Q.; Chen, P.; Pan, H. Factors affecting car ownership and mode choice in rail transit-supported suburbs of a large Chinese city. Transp. Res. Part A Policy Pract. 2016, 94, 31–44.
7. Cullinane, S. The relationship between car ownership and public transport provision: A case study of Hong Kong. Transp. Policy 2002, 9, 29–39.
8. Goodwin, P.B. Car ownership and public transport use: Revisiting the interaction. Transportation 1993, 20, 21–33.
9. Nguyen-Phuoc, D.Q.; Currie, G.; De Gruyter, C.; Young, W. Congestion relief and public transport: An enhanced method using disaggregate mode shift evidence. Case Stud. Transp. Policy 2018, 6, 518–528.
10. Badland, H.M.; Rachele, J.N.; Roberts, R.; Giles-Corti, B. Creating and applying public transport indicators to test pathways of behaviours and health through an urban transport framework. J. Transp. Health 2017, 4, 208–215.
11. Currie, G. Quantifying spatial gaps in public transport supply based on social needs. J. Transp. Geogr. 2010, 18, 31–41.
12. Cervero, R.; Day, J. Suburbanization and transit-oriented development in China. Transp. Policy 2008, 15, 315–323.
13. Huang, X.; Cao, X.; Cao, X.; Yin, J. How does the propensity of living near rail transit moderate the influence of rail transit on transit trip frequency in Xi’an? J. Transp. Geogr. 2016, 54, 194–204.
14. Central People’s Government of the People’s Republic of China. Modern Comprehensive Transport System Development Plan for the Fourteenth Five-Year Plan. Available online: http://www.gov.cn/zhengce/content/2022-01/18/content_5669049.htm (accessed on 12 April 2023).
15. Gazette, P. Beijing Underground’s Daily Passenger Volume Breaks 11 Million, Some Stations Will Start to Take Temporary Flow Restriction Measures in the Morning Rush Hour. Available online: https://baijiahao.baidu.com/s?id=1758039502843960181&wfr=spider&for=pc (accessed on 23 May 2023).
16. Zhao, J.; Deng, W.; Song, Y.; Zhu, Y. Analysis of Metro ridership at station level and station-to-station level in Nanjing: An approach based on direct demand models. Transportation 2013, 41, 133–155.
17. Wang, Z.; Song, J.; Zhang, Y.; Li, S.; Jia, J.; Song, C. Spatial Heterogeneity Analysis for Influencing Factors of Outbound Ridership of Subway Stations Considering the Optimal Scale Range of “7D” Built Environments. Sustainability 2022, 14, 16314.
18. Du, Q.; Zhou, Y.; Huang, Y.; Wang, Y.; Bai, L. Spatiotemporal exploration of the non-linear impacts of accessibility on metro ridership. J. Transp. Geogr. 2022, 102, 103380.
19. Ji, S.; Wang, X.; Lyu, T.; Liu, X.; Wang, Y.; Heinen, E.; Sun, Z. Understanding cycling distance according to the prediction of the XGBoost and the interpretation of SHAP: A non-linear and interaction effect analysis. J. Transp. Geogr. 2022, 103, 103414.
20. Zhuang, C.; Li, S.; Tan, Z.; Gao, F.; Wu, Z. Nonlinear and threshold effects of traffic condition and built environment on dockless bike sharing at street level. J. Transp. Geogr. 2022, 102, 103375.
21. Ding, C.; Cao, X.; Liu, C. How does the station-area built environment influence Metrorail ridership? Using gradient boosting decision trees to identify non-linear thresholds. J. Transp. Geogr. 2019, 77, 70–78.
22. Jun, M.-J.; Choi, K.; Jeong, J.-E.; Kwon, K.-H.; Kim, H.-J. Land use characteristics of subway catchment areas and their influence on subway ridership in Seoul. J. Transp. Geogr. 2015, 48, 30–40.
23. Li, S.; Lyu, D.; Huang, G.; Zhang, X.; Gao, F.; Chen, Y.; Liu, X. Spatially varying impacts of built environment factors on rail transit ridership at station level: A case study in Guangzhou, China. J. Transp. Geogr. 2020, 82, 102631.
24. Li, S.; Lyu, D.; Liu, X.; Tan, Z.; Gao, F.; Huang, G.; Wu, Z. The varying patterns of rail transit ridership and their relationships with fine-scale built environment factors: Big data analytics from Guangzhou. Cities 2020, 99, 102580.
25. Shao, Q.; Zhang, W.; Cao, X.; Yang, J.; Yin, J. Threshold and moderating effects of land use on metro ridership in Shenzhen: Implications for TOD planning. J. Transp. Geogr. 2020, 89, 102878.
26. Cervero, R.; Kockelman, K. Travel Demand and the 3Ds: Density, Diversity, and Design. Transp. Res. Part D Transp. Environ. 1997, 2, 199–219.
27. Frank, L.D.; Pivo, G. Impacts of Mixed Use and Density on Utilization of Three Modes of Travel: Single-Occupant Vehicle, Transit, and Walking. Transp. Res. Rec. 1994, 1466, 44–52.
28. Messenger, T.; Ewing, R. Transit-Oriented Development in the Sun Belt. Transp. Res. Rec. 1996, 1552, 145–153.
29. Lee, C.; Moudon, A.V. Correlates of Walking for Transportation or Recreation Purposes. J. Phys. Act. Health 2006, 3, S77–S98.
30. Kuby, M.; Barranda, A.; Upchurch, C. Factors influencing light-rail station boardings in the United States. Transp. Res. Part A Policy Pract. 2004, 38, 223–247.
31. Sohn, K.; Shim, H. Factors generating boardings at Metro stations in the Seoul metropolitan area. Cities 2010, 27, 358–368.
32. Zhao, J.; Deng, W. Relationship of Walk Access Distance to Rapid Rail Transit Stations with Personal Characteristics and Station Context. J. Urban Plan. Dev. 2013, 139, 311–321.
33. Gutiérrez, J.; Cardozo, O.D.; García-Palomares, J.C. Transit ridership forecasting at station level: An approach based on distance-decay weighted regression. J. Transp. Geogr. 2011, 19, 1081–1092.
34. Sung, H.; Oh, J.-T. Transit-oriented development in a high-density city: Identifying its association with transit ridership in Seoul, Korea. Cities 2011, 28, 70–82.
35. Loo, B.P.Y.; Chen, C.; Chan, E.T.H. Rail-based transit-oriented development: Lessons from New York City and Hong Kong. Landsc. Urban Plan. 2010, 97, 202–212.
36. Cardozo, O.D.; García-Palomares, J.C.; Gutiérrez, J. Application of geographically weighted regression to the direct forecasting of transit ridership at station-level. Appl. Geogr. 2012, 34, 548–558.
37. Calvo, F.; Eboli, L.; Forciniti, C.; Mazzulla, G. Factors influencing trip generation on metro system in Madrid (Spain). Transp. Res. Part D Transp. Environ. 2019, 67, 156–172.
38. Lu, B.; Yang, W.; Ge, Y.; Harris, P. Improvements to the calibration of a geographically weighted regression with parameter-specific distance metrics and bandwidths. Comput. Environ. Urban Syst. 2018, 71, 41–57.
39. Andersson, D.E.; Shyr, O.F.; Yang, J. Neighbourhood effects on station-level transit use: Evidence from the Taipei metro. J. Transp. Geogr. 2021, 94, 103127.
40. Yu, L.; Cong, Y.; Chen, K. Determination of the Peak Hour Ridership of Metro Stations in Xi’an, China Using Geographically-Weighted Regression. Sustainability 2020, 12, 2255.
41. Cheng, L.; Chen, X.; De Vos, J.; Lai, X.; Witlox, F. Applying a random forest method approach to model travel mode choice behavior. Travel Behav. Soc. 2019, 14, 1–10.
42. Hagenauer, J.; Helbich, M. A comparative study of machine learning classifiers for modeling travel mode choice. Expert Syst. Appl. 2017, 78, 273–282.
43. Zhao, X.; Yan, X.; Yu, A.; Van Hentenryck, P. Prediction and behavioral analysis of travel mode choice: A comparison of machine learning and logit models. Travel Behav. Soc. 2020, 20, 22–35.
44. Liu, M.; Liu, Y.; Ye, Y. Nonlinear effects of built environment features on metro ridership: An integrated exploration with machine learning considering spatial heterogeneity. Sustain. Cities Soc. 2023, 95, 104613.
45. Liang, W.; Luo, S.; Zhao, G.; Wu, H. Predicting Hard Rock Pillar Stability Using GBDT, XGBoost, and LightGBM Algorithms. Mathematics 2020, 8, 765.
46. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
47. Kim, S.; Lee, S. Nonlinear relationships and interaction effects of an urban environment on crime incidence: Application of urban big data and an interpretable machine learning method. Sustain. Cities Soc. 2023, 91, 104419.
48. Sun, B.; Sun, T.; Jiao, P.; Tang, J. Spatio-Temporal Segmented Traffic Flow Prediction with ANPRS Data Based on Improved XGBoost. J. Adv. Transp. 2021, 2021, 5559562.
49. Ran, D.; Jiaxin, H.; Yuzhe, H. Application of a Combined Model based on K-means++ and XGBoost in Traffic Congestion Prediction. In Proceedings of the 2020 5th International Conference on Smart Grid and Electrical Automation (ICSGEA), Zhangjiajie, China, 13–14 June 2020; pp. 413–418.
50. Lv, C.X.; An, S.Y.; Qiao, B.J.; Wu, W. Time series analysis of hemorrhagic fever with renal syndrome in mainland China by using an XGBoost forecasting model. BMC Infect. Dis. 2021, 21, 839.
51. Tang, J.; Zheng, L.; Han, C.; Liu, F.; Cai, J. Traffic Incident Clearance Time Prediction and Influencing Factor Analysis Using Extreme Gradient Boosting Model. J. Adv. Transp. 2020, 2020, 6401082.
52. Liu, J.; Wang, B.; Xiao, L. Non-linear associations between built environment and active travel for working and shopping: An extreme gradient boosting approach. J. Transp. Geogr. 2021, 92, 103034.
53. Yang, L.; Ao, Y.; Ke, J.; Lu, Y.; Liang, Y. To walk or not to walk? Examining non-linear effects of streetscape greenery on walking propensity of older adults. J. Transp. Geogr. 2021, 94, 103099.
54. Yang, C.; Chen, M.; Yuan, Q. The application of XGBoost and SHAP to examining the factors in freight truck-related crashes: An exploratory analysis. Accid. Anal. Prev. 2021, 158, 106153.
55. Zhou, S.; Liu, Z.; Wang, M.; Gan, W.; Zhao, Z.; Wu, Z. Impacts of building configurations on urban stormwater management at a block scale using XGBoost. Sustain. Cities Soc. 2022, 87, 104235.
56. Estupiñán, N.; Rodríguez, D.A. The relationship between urban form and station boardings for Bogotá’s BRT. Transp. Res. Part A Policy Pract. 2008, 42, 296–306.
57. Sun, L.S.; Wang, S.W.; Yao, L.Y.; Rong, J.; Ma, J.M. Estimation of transit ridership based on spatial analysis and precise land use data. Transp. Lett. 2016, 8, 140–147.
58. Thompson, G.; Brown, J.; Bhattacharya, T. What Really Matters for Increasing Transit Ridership: Understanding the Determinants of Transit Ridership Demand in Broward County, Florida. Urban Stud. 2012, 49, 3327–3345.
59. Ewing, R.; Cervero, R. Travel and the Built Environment. J. Am. Plan. Assoc. 2010, 76, 265–294.
60. De Gruyter, C.; Saghapour, T.; Ma, L.; Dodson, J. How does the built environment affect transit use by train, tram and bus? J. Transp. Land Use 2020, 13, 625–650.
61. An, D.; Tong, X.; Liu, K.; Chan, E.H.W. Understanding the impact of built environment on metro ridership using open source in Shanghai. Cities 2019, 93, 177–187.
62. Chen, E.; Ye, Z.; Wang, C.; Zhang, W. Discovering the spatio-temporal impacts of built environment on metro ridership using smart card data. Cities 2019, 95, 102359.
63. Jiang, Y.; Christopher Zegras, P.; Mehndiratta, S. Walk the line: Station context, corridor type and bus rapid transit walk access in Jinan, China. J. Transp. Geogr. 2012, 20, 1–14.
64. Gao, W.; Wang, W.; Dimitrov, D.; Wang, Y. Nano properties analysis via fourth multiplicative ABC indicator calculating. Arab. J. Chem. 2018, 11, 793–801.
65. Lundberg, S.M.; Lee, S.-I. A Unified Approach to Interpreting Model Predictions. In Proceedings of the 31st Annual Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 4765–4774.
66. de Myttenaere, A.; Golden, B.; Le Grand, B.; Rossi, F. Mean Absolute Percentage Error for regression models. Neurocomputing 2016, 192, 38–48.
Built Environment Category | Interfering Factor | Unit |
---|---|---|
Density | Building density | m²/km² |
Diversity | Mixed utilization of land | |
Design | Road density | km/km² |
 | Floor area ratio | |
Destination accessibility | Number of entrances and exits | quantity |
 | Number of commercial facilities | |
 | Number of office facilities | |
 | Number of public service facilities | |
Distance to transit | Density of bus lines | km/km² |
Demand management | Number of parking lots | quantity |
 | Number of bus stops | |
Demographics | Population | quantity |
Parameter | Implication | Value |
---|---|---|
max_depth | Maximum tree depth; controls model complexity and can be used to prevent overfitting | 8 |
eta | Learning rate; shrinks the contribution of each new tree at every boosting step, making the fitting process more conservative | 0.20 |
subsample | Random sampling ratio; the proportion of training samples drawn at random for each tree, which can be used to prevent overfitting | 0.75 |
colsample_bytree | Column sampling rate; the fraction of features (columns) sampled at random for each tree | 0.80 |
n_estimators | Number of trees (boosting rounds) | 461 |
gamma | Leaf node split threshold; the minimum loss reduction required to split a node further | 0.20 |
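The tabulated settings can be collected into a configuration for the `xgboost` Python package, as in the sketch below. This is an illustration under stated assumptions rather than the authors' code: the helper name `build_model` is hypothetical, the squared-error objective is an assumption (the table does not state the loss), and `eta` is passed under its scikit-learn-style alias `learning_rate`.

```python
# Hyperparameter values copied from the table above.
XGB_PARAMS = {
    "max_depth": 8,
    "learning_rate": 0.20,    # listed as "eta" in the table
    "subsample": 0.75,
    "colsample_bytree": 0.80,
    "n_estimators": 461,
    "gamma": 0.20,
}


def build_model(params=None):
    """Return an XGBRegressor configured with the tabulated values.

    Hypothetical helper: the objective below is an assumption, since
    the source table does not name the loss function.
    """
    # Imported lazily so the parameter dict is usable without xgboost installed.
    from xgboost import XGBRegressor

    settings = dict(XGB_PARAMS if params is None else params)
    return XGBRegressor(objective="reg:squarederror", **settings)
```

Calling `build_model().fit(X, y)` on a feature matrix of built environment variables and a ridership vector would then train a model with these settings; `X` and `y` are placeholders for data not reproduced here.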
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wang, Z.; Li, S.; Li, Y.; Liu, D.; Liu, S.; Chen, N. Investigating the Nonlinear Effect of Built Environment Factors on Metro Station-Level Ridership under Optimal Pedestrian Catchment Areas via the Machine Learning Method. Appl. Sci. 2023, 13, 12210. https://doi.org/10.3390/app132212210