Sentiment Analysis and Topic Modeling in Transportation: A Literature Review
Abstract
1. Introduction
2. Related Work
3. Review Methodology
3.1. Database
3.2. Bibliographic Analysis
3.3. Selected Studies
4. Findings from the Literature Review
4.1. Social Media Data
4.2. Sentiment Analysis
4.3. Topic Modeling
5. Discussion
6. Conclusions
Author Contributions
Funding
Conflicts of Interest
Appendix A
Ref. | Author(s) | Social Media | Data Size | Location |
---|---|---|---|---|
[11] | Candelieri and Archetti | Yes (Twitter) | Not mentioned | Milan (ITA) |
[17] | Lopez-Fuentes et al. | Yes (Twitter) | 5818 tweets | Global |
[44] | Ajik et al. | No (Google Forms) | 216 responses | Katsina (NGA) |
[8] | Sun and Yin | No (Abstracts) | 17,163 articles from 22 journals | Cambridge and Gainesville (USA) |
[72] | Egger and Yu | Yes (Twitter) | 31,800 tweets | Salzburg and Vienna (AUT) |
[93] | Ali et al. | Yes (Twitter, Facebook, TripAdvisor) | 30,000 tweets; 1851 TripAdvisor reviews; Facebook (not mentioned) | New York (USA) and London (GBR) |
[73] | Kuhn | No | 25,706 records | Global |
[80] | Roque et al. | No | 54 RSI reports | Dublin (IRL) |
[45] | Sari and Ruldeviyani | Yes (Twitter) | 340 tweets | Jakarta (IDN) |
[75] | Hidayatullah and Ma’arif | Yes (Twitter) | 20,932 tweets | Java (IDN) |
[34] | Baj-Rogowska | Yes (Facebook) | 18,326 posts | Gdańsk (POL) |
[6] | Zayet et al. | No | 74 studies | Kuala Lumpur and Petaling Jaya (MYS) |
[21] | Hirata and Matsuda | Yes (Twitter) | 17,597 tweets | Kobe and Tokyo (JPN) |
[46] | Cao et al. | Yes (Weibo, blogs, forums) | Not mentioned | Changsha and Beijing (CHN) |
[47] | Papapicco | Yes (Twitter) | 774 tweets | Bari (ITA) |
[48] | Trivedi and Serasiya | Yes (Twitter) | Not mentioned | Ahmedabad (IND) |
[27] | Salas et al. | Yes (Twitter) | Not mentioned | West Midlands (GBR) |
[30] | Politis et al. | Yes (Twitter) | 418,624 tweets | London (GBR) |
[32] | Serna et al. | Yes (TripAdvisor) | 2000 TripAdvisor reviews | Donostia-San Sebastián (ESP) |
[76] | Moreno and Iglesias | Yes (Twitter) | 215,387 tweets | Madrid (ESP) |
[36] | Othman et al. | Yes (Facebook and Twitter) | 500 posts/tweets | Klang Valley (MYS) |
[24] | Chaturvedi et al. | Yes (Twitter) | Not mentioned | Delhi, Mumbai, Bangalore, and Hyderabad (IND) |
[77] | Kinra et al. | Yes (Twitter, newspaper articles) | 157,000 tweets and 1338 newspaper articles | Copenhagen (DNK) |
[12] | Effendy et al. | Yes (Twitter) | 1201 tweets | Bandung and Jakarta (IDN) |
[31] | Serna et al. | Yes (Minube) | 43,251 comments | Arrasate, Valencia, and Elgoibar (ESP) |
[49] | Candelieri et al. | Yes (Twitter) | 45,000 tweets | Milan (ITA) |
[78] | Esztergár-Kiss | No (Abstracts) | 310 project abstracts | Budapest (HUN) |
[40] | Bhardwaj et al. | Yes | Not mentioned | Global |
[81] | Méndez et al. | Yes (Twitter) | 91,186 tweets | Santiago (CHL) |
[50] | Lazic et al. | Yes (Twitter) | 14,640 tweets | Belgrade (SRB) |
[18] | Jacques et al. | Yes (Twitter) | 68,916 tweets | Île-de-France (FRA) |
[51] | Windasari et al. | Yes (Twitter) | 2000 tweets | Semarang (IDN) |
[52] | Luong and Houston | Yes (Twitter) | 8515 tweets | Los Angeles (USA) |
[71] | Wu et al. | No (Hotline data) | 223,599 complaints | Hohhot (CHN) |
[26] | Mishra and Panda | Yes (Twitter) | 92,271 tweets | Rourkela (IND) |
[35] | Garzia et al. | Yes (Twitter) | 121,536 tweets | London (GBR) and Rome (ITA) |
[82] | Dou et al. | Yes (Dianping.com) | 52,087 reviews | Shanghai (CHN) |
[29] | Muguro et al. | Yes (Twitter) | 1,000,000 tweets | Nairobi (KEN) |
[83] | Ye et al. | Yes (Weibo) | 13,738 messages | Shanghai (CHN) |
[53] | Pinem | Yes (Turnbackhoax.com data) | Over 100 hoax cases | Yogyakarta (IDN) |
[84] | Tamakloe et al. | No (Abstracts) | 421 abstracts | Seoul (KOR) |
[54] | Preoțiuc-Pietro et al. | Yes (Twitter) | 1971 tweets | Florence (ITA) |
[85] | Kabbani et al. | Yes (Twitter) | 4432 tweets | Calgary (CAN) |
[94] | Shokoohyar et al. | Yes (Twitter) | 216,120 tweets | USA |
[55] | Shin | Yes (Yelp) | 1075 reviews | New York, Washington, D.C., and Chicago (USA) |
[15] | Anastasia and Budi | Yes (Twitter) | 126,405 tweets | Depok (IDN) |
[42] | Ali et al. | Yes (Twitter and reviews) | 1851 reviews and tweets | Incheon (KOR) |
[56] | Fan et al. | No (Noise complaint records) | 2032 complaint records | Bukit Panjang, Singapore (SGP) |
[37] | Saragih and Girsang | Yes (Facebook and Twitter) | 1200 comments | Jakarta (IDN) |
[13] | Sari et al. | Yes (Twitter) | 2000 tweets | Yogyakarta (IDN) |
[22] | Chen et al. | Yes (Twitter) | 296,924 tweets | New York (USA) |
[28] | Lock and Pettit | Yes (Twitter) | 55,000 tweets | Sydney (AUS) |
[57] | Seliverstov et al. | No (Web reviews) | 1130 reviews | Saint Petersburg (RUS) |
[58] | Adilah et al. | Yes (Instagram) | 1000 comments | Jakarta (IDN) |
[59] | Gao et al. | Yes (Weibo) | 3266 Weibo posts | Shanghai (CHN) |
[79] | Pineda-Jaramillo et al. | No (TripAdvisor reviews) | 1947 reviews | Mount Etna, Sicily (ITA) |
[60] | Styawati et al. | Yes (Google Play Store reviews) | 14,688 Gojek reviews and 15,945 Grab reviews | Bandar Lampung (IDN) |
[25] | Beck et al. | No (Google Maps reviews) | 8371 comments | São Paulo (BRA) |
[39] | Tran et al. | Yes (Twitter) | 517,000 tweets | Vancouver (CAN) |
[43] | Myoya et al. | Yes (Twitter) | Not mentioned | Nairobi (KEN), Johannesburg (ZAF), Dar es Salaam (TZA) |
[61] | Atmadja et al. | Yes (Twitter) | 565 tweets | Bandung and Samarinda (IDN) |
[86] | Aksan and Akdağ | Yes (Twitter) | 206,205 tweets (UK) and 36,418 tweets (India) | London, Birmingham, Manchester (GBR) and New Delhi, Mumbai, Bengaluru (IND) |
[62] | Gupta et al. | Yes (Facebook and Twitter) | 1103 feedback | New Delhi (IND) and Edmonton (CAN) |
[23] | Fen et al. | Yes (Twitter) | 1235 tweets | Kuala Lumpur (MYS) |
[63] | Kumalasari and Handayani | Yes (Instagram) | 1584 reviews | Surabaya (IDN) |
[19] | Jaman et al. | Yes (Twitter) | 2053 tweets | Karawang (IDN) |
[64] | Bakalos, Papadakis, and Litke | Yes (Twitter, Reddit) | 5047 tweets | Athens (GRC) |
[7] | Verma | No (bibliometric review) | 353 articles | Global |
[16] | Pratama et al. | Yes (Twitter) | 2160 tweets | Jakarta (IDN) |
[14] | Ashari, Irawan, and Setianingsih | Yes (Instagram) | 3600 comments | Bandung (IDN) |
[33] | Vitetta | Yes (platforms unspecified) | Not mentioned | Florence, Rome, Naples, Bari, Reggio Calabria (ITA) |
[87] | Ali et al. | Yes (Twitter, TripAdvisor) | Not mentioned | Incheon (KOR) |
[65] | Rohwinasakti, Irawan, and Setianingsih | Yes (Instagram) | 278,179 | Bandung and Jakarta (IDN) |
[20] | Lin et al. | No (Online app reviews) | 9200 reviews | Tainan (TWN) |
[66] | Gitto and Mancuso | Yes (blogs) | 895 sentences from blog content | Amsterdam (NLD), Frankfurt (DEU), London (GBR), Madrid (ESP), Paris (FRA) |
[67] | Martin-Domingo et al. | Yes (Twitter) | 4392 tweets | London (GBR) |
[38] | Aksan and Akdağ | Yes (Twitter) | 206,205 tweets | London, Birmingham, Manchester (GBR) |
[68] | Aldisa, Maulana, and Aldinugroho | Yes (Twitter) | 5 h of tweets collected per service | Greater Jakarta (IDN) |
[41] | Giancristofaro and Panangadan | Yes (Instagram) | 1010 posts | California (USA) |
[69] | Arsarini et al. | Yes (Twitter, YouTube, Google Search) | 5000 data points | Bali (IDN) |
References
1. Ziedan, A.; Brakewood, C.; Watkins, K. Will Transit Recover? A Retrospective Study of Nationwide Ridership in the United States during the COVID-19 Pandemic. J. Public Transp. 2023, 25, 100046.
2. Martí, P.; Serrano-Estrada, L.; Nolasco-Cirugeda, A. Social Media Data: Challenges, Opportunities and Limitations in Urban Studies. Comput. Environ. Urban Syst. 2019, 74, 161–174.
3. Wang, S.; Zhao, Z.; Xie, Y.; Ma, M.; Chen, Z.; Wang, Z.; Su, B.; Xu, W.; Li, T. Recent Surge in Public Interest in Transportation: Sentiment Analysis of Baidu Apollo Go Using Weibo Data. arXiv 2024, arXiv:2408.10088v1.
4. Avetisyan, L.; Zhang, C.; Bai, S.; Pari, E.M.; Feng, F.; Bao, S.; Zhou, F. Design a Sustainable Micro-Mobility Future: Trends and Challenges in the United States and European Union Using Natural Language Processing Techniques. arXiv 2022, arXiv:2210.11714.
5. Wang, X.; Jiang, W.; Luo, Z. Combination of Convolutional and Recurrent Neural Network for Sentiment Analysis of Short Texts. In Proceedings of the 26th International Conference on Computational Linguistics: Technical Papers, Osaka, Japan, 11–16 December 2016; pp. 2428–2437.
6. Zayet, T.M.A.; Ismail, M.A.; Varathan, K.D.; Noor, R.M.D.; Chua, H.N.; Lee, A.; Low, Y.C.; Singh, S.K.J. Investigating Transportation Research Based on Social Media Analysis: A Systematic Mapping Review. Scientometrics 2021, 126, 6383–6421.
7. Verma, S. Sentiment Analysis of Public Services for Smart Society: Literature Review and Future Research Directions. Gov. Inf. Q. 2022, 39, 101708.
8. Sun, L.; Yin, Y. Discovering Themes and Trends in Transportation Research Using Topic Modeling. Transp. Res. Part C Emerg. Technol. 2017, 77, 49–66.
9. Kherwa, P.; Bansal, P. Topic Modeling: A Comprehensive Review. EAI Endorsed Trans. Scalable Inf. Syst. 2020, 7, 159623.
10. Tonkin, E.L. A Day at Work (with Text): A Brief Introduction. In Working with Text: Tools, Techniques and Approaches for Text Mining; Elsevier: Amsterdam, The Netherlands, 2016; pp. 23–60. ISBN 9781843347491.
11. Candelieri, A.; Archetti, F. Analyzing Tweets to Enable Sustainable, Multi-Modal and Personalized Urban Mobility: Approaches and Results from the Italian Project TAM-TAM. In WIT Transactions on the Built Environment; WITPress: Boston, MA, USA, 2014; Volume 138, pp. 373–379.
12. Effendy, V.; Novantirani, A.; Sabariah, M.K. Sentiment Analysis on Twitter about the Use of City Public Transportation Using Support Vector Machine Method. Int. J. Inf. Commun. Technol. 2016, 2, 57–66.
13. Sari, E.Y.; Wierfi, A.D.; Setyanto, A. Sentiment Analysis of Customer Satisfaction on Transportation Network Company Using Naive Bayes Classifier. In Proceedings of the International Conference on Computer Engineering Network, and Intelligent Multimedia, London, UK, 3–5 July 2019; Institute of Electrical and Electronics Engineers: Los Alamitos, CA, USA, 2019.
14. Ashari, D.S.; Irawan, B.; Setianingsih, C. Sentiment Analysis on Online Transportation Services Using Convolutional Neural Network Method. In Proceedings of the International Conference on Electrical Engineering, Computer Science and Informatics (EECSI), Semarang, Indonesia, 20–21 October 2021; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2021; Volume 1, pp. 335–340.
15. Anastasia, S.; Budi, I. Twitter Sentiment Analysis of Online Transportation Service Providers. In Proceedings of the 2016 International Conference on Advanced Computer Science and Information Systems, Malang, Indonesia, 15–16 October 2016; IEEE: New York, NY, USA, 2016.
16. Pratama, M.O.; Satyawan, W.; Jannati, R.; Pamungkas, B.; Raspiani; Syahputra, M.E.; Neforawati, I. The Sentiment Analysis of Indonesia Commuter Line Using Machine Learning Based on Twitter Data. In Proceedings of the Journal of Physics: Conference Series, Crete, Greece, 18 April 2019; Institute of Physics Publishing: London, UK, 2019; Volume 1193.
17. Lopez-Fuentes, L.; Farasin, A.; Zaffaroni, M.; Skinnemoen, H.; Garza, P. Deep Learning Models for Road Passability Detection during Flood Events Using Social Media Data. Appl. Sci. 2020, 10, 8783.
18. Jacques, S.; Farahnak, F.; Kosseim, L. Sentiment Analysis of Tweets on Transport from Île-de-France. ACL Anthol. 2018, 2, 239–248.
19. Jaman, J.H.; Abdulrohman, R.; Suharso, A.; Sulistiowati, N.; Dewi, I.P. Sentiment Analysis on Utilizing Online Transportation of Indonesian Customers Using Tweets in the Normal Era and the Pandemic COVID-19 Era with Support Vector Machine. Adv. Sci. Technol. Eng. Syst. 2020, 5, 389–394.
20. Lin, X.M.; Ho, C.H.; Xia, L.T.; Zhao, R.Y. Sentiment Analysis of Low-Carbon Travel APP User Comments Based on Deep Learning. Sustain. Energy Technol. Assess. 2021, 44, 101014.
21. Hirata, E.; Matsuda, T. Examining Logistics Developments in Post-Pandemic Japan through Sentiment Analysis of Twitter Data. Asian Transp. Stud. 2023, 9, 100110.
22. Chen, X.; Wang, Z.; Di, X. Sentiment Analysis on Multimodal Transportation during the COVID-19 Using Social Media Data. Information 2023, 14, 113.
23. Fen, C.W.; Ismail, M.A.; Zayet, T.M.A.; Varathan, K.D. Sentiment Analysis of Users’ Perception Towards Public Transportation Using Twitter. Int. J. Technol. Manag. Inf. Syst. 2020, 2, 92–101.
24. Chaturvedi, N.; Toshniwal, D.; Parida, M. Twitter to Transport: Geo-Spatial Sentiment Analysis of Traffic Tweets to Discover People’s Feelings for Urban Transportation Issues. J. East. Asia Soc. Transp. Stud. 2019, 13, 210–220.
25. Beck, D.; Teixeira, M.; Maróstica, J.; Ferasso, M. Quality Perception of São Paulo Transportation Services: A Sentiment Analysis of Citizens’ Satisfaction Regarding Bus Terminuses. Rev. Gest. Ambient. Sustentabilidade 2024, 13, e23392.
26. Mishra, D.N.; Panda, R.K. Decoding Customer Experiences in Rail Transport Service: Application of Hybrid Sentiment Analysis. Public Transp. 2023, 15, 31–60.
27. Salas, A.; Georgakis, P.; Nwagboso, C.; Ammari, A.; Petalas, I. Traffic Event Detection Framework Using Social Media. In Proceedings of the IEEE International Conference on Smart Grid and Smart Cities (ICSGSC), Singapore, 23–26 July 2017.
28. Lock, O.; Pettit, C. Social Media as Passive Geo-Participation in Transportation Planning–How Effective Are Topic Modeling & Sentiment Analysis in Comparison with Citizen Surveys? Geo-Spat. Inf. Sci. 2020, 23, 275–292.
29. Muguro, J.; Njeri, W.; Matsushita, K.; Sasaki, M. Road Traffic Conditions in Kenya: Exploring the Policies and Traffic Cultures from Unstructured User-Generated Data Using NLP. IATSS Res. 2022, 46, 329–344.
30. Politis, I.; Georgiadis, G.; Kopsacheilis, A.; Nikolaidou, A.; Papaioannou, P. Capturing Twitter Negativity Pre- vs. Mid-COVID-19 Pandemic: An LDA Application on London Public Transport System. Sustainability 2021, 13, 13356.
31. Serna, A.; Gerrikagoitia, J.K.; Bernabé, U.; Ruiz, T. Sustainability Analysis on Urban Mobility Based on Social Media Content. In Transportation Research Procedia; Elsevier: Amsterdam, The Netherlands, 2017; Volume 24, pp. 1–8.
32. Serna, A.; Soroa, A.; Agerri, R. Applying Deep Learning Techniques for Sentiment Analysis to Assess Sustainable Transport. Sustainability 2021, 13, 2397.
33. Vitetta, A. Sentiment Analysis Models with Bayesian Approach: A Bike Preference Application in Metropolitan Cities. J. Adv. Transp. 2022, 2022, 2499282.
34. Baj-Rogowska, A. Sentiment Analysis of Facebook Posts: The Uber Case. In Proceedings of the 8th IEEE International Conference on Intelligent Computing and Information Systems, Madurai, India, 13–15 December 2018; Volume 430.
35. Garzia, F.; Borghini, F.; Moretti, A.; Lombardi, M.; Ramalingam, S. Emotional Analysis of Safeness and Risk Perception of Transports and Travels by Car and Motorcycle in London and Rome during the COVID-19 Pandemic. In Proceedings of the International Carnahan Conference on Security Technology, Hatfield, UK, 11–15 October 2021; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2021; Volume 1.
36. Othman, N.; Hussin, M.; Mahmood, R.A.R. Sentiment Evaluation of Public Transport in Social Media Using Naïve Bayes Method. Int. J. Eng. Adv. Technol. 2019, 9, 2305–2308.
37. Saragih, M.H.; Girsang, A.S. Sentiment Analysis of Customer Engagement on Social Media in Transport Online. In Proceedings of the International Conference on Sustainable Information Engineering and Technology, Batu, Indonesia, 24–25 November 2017; IEEE: New York, NY, USA, 2017.
38. Aksan, A.; Akdağ, H.C. Public Opinion on UK Public Transportation Through Sentiment Analysis and Topic Modeling. In Proceedings of the 31st IEEE Conference on Signal Processing and Communications Applications, SIU 2023, Istanbul, Turkey, 5–8 July 2023; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2023.
39. Tran, M.; Draeger, C.; Wang, X.; Nikbakht, A. Monitoring the Well-Being of Vulnerable Transit Riders Using Machine Learning Based Sentiment Analysis and Social Media: Lessons from COVID-19. Environ. Plan B Urban Anal. City Sci. 2023, 50, 60–75.
40. Bhardwaj, R.; Vaidya, T.; Poria, S. Towards Solving NLP Tasks with Optimal Transport Loss. J. King Saud Univ. Comput. Inf. Sci. 2022, 34, 10434–10443.
41. Giancristofaro, G.T.; Panangadan, A. Predicting Sentiment toward Transportation in Social Media Using Visual and Textual Features. In Proceedings of the 2016 IEEE 19th International Conference on Intelligent Transportation Systems (ITSC), Rio de Janeiro, Brazil, 1–4 November 2016; IEEE: New York, NY, USA, 2016.
42. Ali, F.; Kwak, D.; Khan, P.; Islam, S.M.R.; Kim, K.H.; Kwak, K.S. Fuzzy Ontology-Based Sentiment Analysis of Transportation and City Feature Reviews for Safe Traveling. Transp. Res. Part C Emerg. Technol. 2017, 77, 33–48.
43. Myoya, R.L. Analysing Public Transport User Sentiment. Master’s Thesis, University of Pretoria, Pretoria, South Africa, 2024.
44. Ajik, E.D.; Suleiman, A.B.; Ibrahim, M. Enhancing User Experience through Sentiment Analysis for Katsina State Transport Agency: A TextBlob Approach. FUDMA J. Sci. 2023, 7, 117–122.
45. Sari, I.C.; Ruldeviyani, Y. Sentiment Analysis of the COVID-19 Virus Infection in Indonesian Public Transportation on Twitter Data: A Case Study of Commuter Line Passengers. In Proceedings of the 2020 International Workshop on Big Data and Information Security, IWBIS 2020, Depok, Indonesia, 17 October 2020; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2020; pp. 23–28.
46. Cao, J.; Zeng, K.; Wang, H.; Cheng, J.; Qiao, F.; Wen, D.; Gao, Y. Web-Based Traffic Sentiment Analysis: Methods and Applications. IEEE Trans. Intell. Transp. Syst. 2014, 15, 844–853.
47. Papapicco, C. SentiSfaction: New Cultural Way to Measure Tourist COVID-19 Mobility in Italy. Mediterr. J. Soc. Behav. Res. 2023, 7, 29–41.
48. Trivedi, M.; Serasiya, S. Traffic Issues Categorization of Indian Cities Using Word2Vec by Social Media Data. J. Emerg. Technol. Innov. Res. 2020, 7, 576–580.
49. Candelieri, A.; Archetti, F. Detecting Events and Sentiment on Twitter for Improving Urban Mobility. In Proceedings of the International Conference on Autonomous Agents and Multiagent Systems, Istanbul, Turkey, 4–8 May 2015; pp. 106–115.
50. Lazić, J.; Krstić, A.; Vujnović, S. Sentiment Analysis Using Optimal Transport Loss Function. In Proceedings of the 10th International Conference on Electrical, Electronic and Computing Engineering, IcETRAN 2023, East Sarajevo, Bosnia and Herzegovina, 5–8 June 2023; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2023.
51. Pertiwi Windasari, I.; Nurul Uzzi, F.; Iman Satoto, K. Sentiment Analysis on Twitter Posts: An Analysis of Positive or Negative Opinion on GoJek. In Proceedings of the 2017 4th International Conference on Information Technology, Computer, and Electrical Engineering, Semarang, Indonesia, 18–19 October 2017; ISBN 9781538639474.
52. Luong, T.T.B.; Houston, D. Public Opinions of Light Rail Service in Los Angeles, an Analysis Using Twitter Data. In Proceedings of the iConference 2015 Proceedings, Newport Beach, CA, USA, 24–27 March 2015.
53. Pinem, Y.A. Corpus-Based Analysis of Online Hoax Discourse on Transportation Subject Picturing Indonesian Issue. Ling. Cult. 2021, 15, 7067.
54. Preoțiuc-Pietro, D.; Gaman, M.; Aletras, N. Automatically Identifying Complaints in Social Media. In Proceedings of the Association for Computational Linguistics, Minneapolis, MN, USA, 2–7 June 2019; Volume 5008, pp. 5008–5019.
55. Shin, E.J. A Comparative Study of Bike-Sharing Systems from a User’s Perspective: An Analysis of Online Reviews in Three U.S. Regions between 2010 and 2018. Int. J. Sustain. Transp. 2021, 15, 908–923.
56. Fan, Y.; Teo, H.P.; Wan, W.X. Public Transport, Noise Complaints, and Housing: Evidence from Sentiment Analysis in Singapore. J. Reg. Sci. 2021, 61, 570–596.
57. Seliverstov, Y.; Seliverstov, S.; Malygin, I.; Korolev, O. Traffic Safety Evaluation in Northwestern Federal District Using Sentiment Analysis of Internet Users’ Reviews. Transp. Res. Procedia 2020, 50, 626–635.
58. Adilah, M.T.; Supendar, H.; Ningsih, R.; Muryani, S.; Solecha, K. Sentiment Analysis of Online Transportation Service Using the Naïve Bayes Methods. J. Phys. Conf. Ser. 2020, 1641, 012093.
59. Gao, S.; Ran, Q.; Su, Z.; Wang, L.; Ma, W.; Hao, R. Evaluation System for Urban Traffic Intelligence Based on Travel Experiences: A Sentiment Analysis Approach. Transp. Res. Part A Policy Pract. 2024, 187, 104170.
60. Styawati; Nurkholis, A.; Aldino, A.A.; Samsugi, S.; Suryati, E.; Cahyono, R.P. Sentiment Analysis on Online Transportation Reviews Using Word2Vec Text Embedding Model Feature Extraction and Support Vector Machine (SVM) Algorithm. In Proceedings of the 2021 International Seminar on Machine Learning, Optimization, and Data Science, ISMODE 2021, Jakarta, Indonesia, 29–30 January 2022; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2022; pp. 163–167.
61. Atmadja, A.R.; Uriawan, W.; Pritisen, F.; Maylawati, D.S.; Arbain, A. Comparison of Naive Bayes and K-Nearest Neighbours for Online Transportation Using Sentiment Analysis in Social Media. J. Phys. Conf. Ser. 2019, 1402, 077029.
62. Gupta, P.; Mehlawat, M.K.; Khaitan, A.; Pedrycz, W. Sentiment Analysis for Driver Selection in Fuzzy Capacitated Vehicle Routing Problem with Simultaneous Pick-Up and Drop in Shared Transportation. IEEE Trans. Fuzzy Syst. 2021, 29, 1198–1211.
63. Kumalasari, A.T.; Handayani, W. Sentiment Analysis to Improve the Quality of Public Services “Suroboyo Bus”. Indones. Interdiscip. J. Sharia Econ. 2024, 7, 6407–6426.
64. Bakalos, N.; Papadakis, N.; Litke, A. Public Perception of Autonomous Mobility Using ML-Based Sentiment Analysis over Social Media Data. Logistics 2020, 4, 12.
65. Rohwinasakti, S.; Irawan, B.; Setianingsih, C. Sentiment Analysis on Online Transportation Service Products Using K-Nearest Neighbor Method. In Proceedings of the International Conference on Computer Information, and Telecommunication Systems CITS 2021, Istanbul, Turkey, 11–13 November 2021; Institute of Electrical and Electronics Engineers Inc.: New York, NY, USA, 2021.
66. Gitto, S.; Mancuso, P. Improving Airport Services Using Sentiment Analysis of the Websites. Tour. Manag. Perspect. 2017, 22, 132–136.
67. Martin-Domingo, L.; Martín, J.C.; Mandsberg, G. Social Media as a Resource for Sentiment Analysis of Airport Service Quality (ASQ). J. Air Transp. Manag. 2019, 78, 106–115.
68. Aldisa, R.T.; Maulana, P.; Aldinugroho, M. Sentiment Analysis of Public Transportation Services on Twitter Social Media Using the Method Naïve Bayes Classifier. Int. J. Inf. Syst. Technol. Akreditasi 2021, 5, 466–475.
69. Ayu Putu Savita Arsarini, D.; Ketut Gede Darma Putra, I.; Kadek Dwi Rusjayanthi, N. Public Sentiment Analysis of Online Transportation in Indonesia through Social Media Using Google Machine Learning. J. Ilm. Merpati 2021, 9, 153–164.
70. Wagner, S.; Fernández, D.M. Analyzing Text in Software Projects. In The Art and Science of Analyzing Software Data; Elsevier Inc.: Amsterdam, The Netherlands, 2015; pp. 39–72. ISBN 9780124115439.
71. Wu, R.; Shao, C.; Zhuge, C.; Wang, X.; Yin, X. What Do People Complain about Transport Service? Text Mining of Hotline Data Using LDA Model. 2022. Available online: https://ssrn.com/abstract=4305469 (accessed on 24 November 2024).
72. Egger, R.; Yu, J. A Topic Modeling Comparison Between LDA, NMF, Top2Vec, and BERTopic to Demystify Twitter Posts. Front. Sociol. 2022, 7, 886498.
73. Kuhn, K.D. Using Structural Topic Modeling to Identify Latent Topics and Trends in Aviation Incident Reports. Transp. Res. Part C Emerg. Technol. 2018, 87, 105–122.
74. Tamakloe, R.; Park, D. Discovering Latent Topics and Trends in Autonomous Vehicle-Related Research: A Structural Topic Modelling Approach. Transp. Policy 2023, 139, 1–20.
75. Hidayatullah, A.F.; Ma’arif, M.R. Road Traffic Topic Modeling on Twitter Using Latent Dirichlet Allocation. In Proceedings of the 2017 International Conference on Sustainable Information Engineering and Technology (SIET), Malang, Indonesia, 24–25 November 2017.
76. Moreno, A.; Iglesias, C.A. Understanding Customers’ Transport Services with Topic Clustering and Sentiment Analysis. Appl. Sci. 2021, 11, 10169.
77. Kinra, A.; Beheshti-Kashi, S.; Buch, R.; Nielsen, T.A.S.; Pereira, F. Examining the Potential of Textual Big Data Analytics for Public Policy Decision-Making: A Case Study with Driverless Cars in Denmark. Transp. Policy 2020, 98, 68–78.
78. Esztergár-Kiss, D. Horizon 2020 Project Analysis by Using Topic Modelling Techniques in the Field of Transport. Transp. Telecommun. 2024, 25, 266–277.
79. Pineda-Jaramillo, J.; Fazio, M.; Le Pira, M.; Giuffrida, N.; Inturri, G.; Viti, F.; Ignaccolo, M. A Sentiment Analysis Approach to Investigate Tourist Satisfaction towards Transport Systems: The Case of Mount Etna. Transp. Res. Procedia 2023, 69, 400–407.
80. Roque, C.; Lourenço Cardoso, J.; Connell, T.; Schermers, G.; Weber, R. Topic Analysis of Road Safety Inspections Using Latent Dirichlet Allocation: A Case Study of Roadside Safety in Irish Main Roads. Accid. Anal. Prev. 2019, 131, 336–349.
81. Mendez, J.T.; Lobel, H.; Parra, D.; Herrera, J.C. Using Twitter to Infer User Satisfaction with Public Transport: The Case of Santiago, Chile. IEEE Access 2019, 7, 60255–60263.
82. Dou, M.; Gu, Y.; Gong, J. How Do People Perceive the Quality of Urban Transport Service? New Insights from Online Reviews of Shanghai Metro System. J. Urban Manag. 2024, 13, 705–719.
83. Ye, Q.; Chen, X.; Zhang, H.; Ozbay, K.; Zuo, F. Public Concerns and Response Pattern toward Shared Mobility Using Social Media Data. In Proceedings of the IEEE Intelligent Transportation Systems Conference (ITSC), Auckland, New Zealand, 1 January 2019; pp. 619–624.
84. Tamakloe, R.; Park, D.; Chang, H. Discovering Research Topics, Trends, and Perspectives in COVID-19-Related Transportation Journal Articles. Int. J. Urban Sci. 2022, 26, 710–738.
85. Kabbani, O.; Klumpenhouwer, W.; El-Diraby, T.; Shalaby, A. What Do Riders Say and Where? The Detection and Analysis of Eyewitness Transit Tweets. J. Intell. Transp. Syst. Technol. Plan. Oper. 2023, 27, 347–363.
86. Aksan, A.; Akdağ, H.C. Comparative Analysis of Public Transportation Through Sentiment Analysis and Topic Modeling. In Industrial Engineering in the Industry 4.0 Era; Springer Science and Business Media Deutschland GmbH: Cham, Switzerland, 2024; pp. 3–15.
87. Ali, F.; El-Sappagh, S.; Ali, A.; Kwak, K.S.; Ei-Sappagh, S.; Kwak, K.S.; Kwak, D. Sentiment Analysis of Transportation Using Word Embedding and LDA Approaches. In Proceedings of the Korea Institute of Communications and Information Sciences 2018 Winter General Academic Conference, Incheon, Republic of Korea, 25 January 2018.
88. Çaylak, P.Ç.; Kayakuş, M.; Eksili, N.; Yiğit Açikgöz, F.; Coşkun, A.E.; Ichimov, M.A.M.; Moiceanu, G. Analysing Online Reviews Consumers’ Experiences of Mobile Travel Applications with Sentiment Analysis and Topic Modelling: The Example of Booking and Expedia. Appl. Sci. 2024, 14, 11800.
89. Özkara, Y.; Bilişli, Y.; Yildirim, F.S.; Kayan, F.; Başdeğirmen, A.; Kayakuş, M.; Yiğit Açıkgöz, F. Analysing Social Media Discourse on Electric Vehicles with Machine Learning. Appl. Sci. 2025, 15, 4395.
90. Deng, J.; Liu, Y. Research on Sentiment Analysis of Online Public Opinion Based on RoBERTa–BiLSTM–Attention Model. Appl. Sci. 2025, 15, 2148.
91. Melhem, W.Y.; Abdi, A.; Meziane, F. Deep Learning Classification of Traffic-Related Tweets: An Advanced Framework Using Deep Learning for Contextual Understanding and Traffic-Related Short Text Classification. Appl. Sci. 2024, 14, 11009.
92. Laynes-Fiascunari, V.; Gutierrez-Franco, E.; Rabelo, L.; Sarmiento, A.T.; Lee, G. A Framework for Urban Last-Mile Delivery Traffic Forecasting: An In-Depth Review of Social Media Analytics and Deep Learning Techniques. Appl. Sci. 2023, 13, 5888.
93. Ali, F.; Kwak, D.; Khan, P.; El-Sappagh, S.; Ali, A.; Ullah, S.; Kim, K.H.; Kwak, K.S. Transportation Sentiment Analysis Using Word Embedding and Ontology-Based Topic Modeling. Knowl. Based Syst. 2019, 174, 27–42.
94. Shokoohyar, S.; Ghomi, V.; Jafari Gorizi, A.; Liang, W.; Sinclair, A. Impact of COVID-19 Outbreak and Vaccination on Ride-Sharing Services: A Social Media Analysis. Transp. Lett. 2024, 16, 527–541.
Ref./Year | Method | Application | Findings |
---|---|---|---|
[44] 2023 | Sentiment analysis using the TextBlob library; preprocessing and polarity scoring of comments from survey data. | Improving customer satisfaction for Katsina State Transport Authority through insights from user feedback. | 71% positive sentiments; 9% negative, highlighting service areas needing improvement. Recommendations for operational and service enhancements based on feedback. |
[45] 2020 | Sentiment analysis using Naïve Bayes and Decision Tree algorithms on Twitter data; preprocessing included data cleaning, case folding, filtering, stemming, and tokenizing. | Analyzing public sentiment toward COVID-19 transmission risks among commuter line passengers in Indonesia. | Naïve Bayes outperformed Decision Tree with an average accuracy of 73.59%. Positive sentiment was the most frequent (135 tweets), followed by negative (152 tweets) and neutral (53 tweets). Sentiment predominantly reflected appeals and calls for COVID-19 prevention and control. |
[6] 2021 | Systematic mapping review of 74 transportation-related studies using social media data published between 2008 and 2018. | Identifying trends, approaches, data types, and platforms in transportation research leveraging social media analysis; providing insights for future research. | Twitter is the most frequently used platform (72% of studies). Text data is the most commonly analyzed attribute, often combined with sentiment analysis or topic modeling. Challenges include limited access to data and insufficient integration of metadata like location. |
[46] 2014 | Rule-based sentiment analysis framework for processing traffic-related web data; data preprocessed with text segmentation and rule construction. | Analyzing public sentiment on traffic-related issues (e.g., “yellow light rule” and fuel prices) to support intelligent transportation systems (ITS) and policy evaluation. | Rule-based approach achieved higher accuracy for traffic-related sentiment analysis compared to existing algorithms. Demonstrated the value of public sentiment as a supplementary “social sensor” in ITS decision making. |
[47] 2023 | Sentiment analysis using SentiStrength, combined with Chi-square and t-tests, applied to Twitter data. | Measuring customer satisfaction and online reputation of Trenitalia rail services during the pre-COVID-19 (2019) and COVID-19 (2020) periods. | Prevalence of neutral sentiment (41% in 2019; 64% in 2020). Mixed emotions highlighted in tweets reflect the impact of the pandemic on mobility and service expectations. Differences in sentiment attributed to contextual factors, emphasizing the need for qualitative analysis alongside quantitative methods. |
[48] 2020 | Word2Vec classification algorithm and semantic analysis for analyzing traffic-related tweets; preprocessing included data cleaning, hashtag and URL removal, and multilingual tweet conversion. | Categorizing traffic-related issues (e.g., accidents, congestion, potholes) in Indian cities for improved traffic management. | Tweets categorized into positive, negative, and neutral sentiments. Traffic issues such as accidents, congestion, and potholes identified; significant disparities observed in tweet activity between regions of Ahmedabad. |
[49] 2015 | Sentiment analysis using Support Vector Machines (SVM) and delta TF-IDF; event detection based on keyword matching. | Improving urban mobility in Milan by analyzing Twitter data for detecting transport events and assessing user sentiment toward public transportation services. | SVM proved effective for sentiment classification with high accuracy. Event detection and sentiment analysis enhanced the TAM-TAM platform’s capability for trip optimization and service evaluation. |
[50] 2023 | Optimal Transport (OT) loss applied to sentiment analysis for minimizing misclassifications between opposite sentiment classes; compared against traditional cross-entropy loss. | Analyzing user sentiment toward U.S. airlines using the Kaggle Twitter US Airline Sentiment dataset. | OT loss reduced misclassifications between positive and negative sentiment classes compared to cross-entropy. Overall accuracy was slightly improved, demonstrating OT’s potential for enhancing sentiment classification tasks. Results emphasize OT’s utility for handling imbalanced classes in real-world applications. |
[51] 2017 | Sentiment analysis using Support Vector Machine (SVM) with unigram and TF-IDF for feature extraction; preprocessing included text cleaning, stemming, and tokenization. | Analyzing public sentiment about the online transportation service GoJek, categorizing tweets as positive or negative to assess customer satisfaction. | Achieved 86% accuracy in classifying tweets. Positive sentiment was identified with 100% precision, while negative sentiment showed 67.44% precision. Highlighted the potential of sentiment analysis for improving service quality and understanding customer feedback. |
[52] 2015 | Sentiment analysis using an English opinion lexicon; unigram-based clustering for topic analysis; text preprocessing included emoji translation and data cleaning. | Understanding public opinions about Los Angeles’ light rail system to enhance service delivery and policy-making. | Temporal analysis revealed weekly patterns, such as more positive tweets on Mondays and negative ones during weekends. Word clustering identified frequent topics (e.g., “delay,” “disable,” “dies,” “fatal”). Highlighted Twitter’s potential as a tool for real-time feedback on transit services. |
[53] 2021 | Corpus-based analysis using Indonesian Web Corpus (IWaC) and hoax news collection from Turnbackhoax.com; socio-pragmatic approach applied to interpret discourse prosody. | Analyzing the spread of transportation-related hoaxes in Indonesia to identify underlying social issues, such as distrust in government and unawareness of transportation laws. | Most hoaxes related to land transportation, including toll roads, police ticketing, and public transport issues. Themes often reflect real social concerns, such as economic inequality, governance mistrust, and lack of infrastructure awareness. Emphasized the role of hoaxes in mirroring unresolved societal problems in transportation. |
[54] 2019 | Complaint classification using logistic regression and neural models (MLP and LSTM); dataset created via manual annotation of Twitter complaints across nine domains, including transport. | Identifying and analyzing complaints on social media to enhance customer service and support dialogue systems. | Predictive performance reached an F1-score of up to 79% (using logistic regression with bag-of-words features). Transport domain exhibited lower performance due to linguistic variations in complaint expression. Proposed models significantly outperformed baseline sentiment analysis tools in complaint classification. |
[55] 2021 | Content analysis of Yelp reviews using a coding scheme developed from literature and refined via manual review; analyzed factors affecting user evaluations and complaints. | Investigating perceptions of bike-sharing systems in New York, Washington, D.C., and Chicago, focusing on regional and temporal variations in user satisfaction. | Average ratings were consistent across regions (around 2.6/5). Key factors influencing satisfaction included pricing, bike quality, and customer service. Highlighted the role of social media in detecting service issues and planning improvements. |
[56] 2021 | Sentiment analysis using the SentimentR tool for analyzing noise complaint records; Difference-in-Differences (DID) approach for causal analysis; instrumental variables used to validate proximity effects. | Investigating the impact of a new bus route on noise complaints and housing prices in Bukit Panjang, Singapore, to understand the trade-off between accessibility and environmental externalities. | The new bus route increased noise complaints by 10.9% for residents within 100 m compared to those 100–200 m away. Noise complaints were most pronounced for mid-level floors (5th–8th). Housing prices decreased by 3% with a 1-point increase in noise sentiment, offsetting 17.8% of the accessibility benefit. Demonstrated the importance of noise insulation in transit-oriented developments. |
[57] 2020 | Sentiment analysis using a Naïve Bayes classifier and a linear classifier with stochastic gradient descent optimization; data mining from web sources using the Scrapy framework; text vectorization through Bag-of-Words and TF-IDF Vectorizer. | Analyzing road conditions in the Northwestern Federal District of Russia through user reviews to identify problematic areas and provide recommendations for traffic safety improvements. | Classifier achieved 71.94% accuracy, categorizing reviews into positive and negative. Roads with positive reviews covered 75% of total length, while negatively reviewed roads accounted for 25%. Created a visual map highlighting road conditions, supporting data-driven decision making for traffic management. |
[58] 2020 | Sentiment analysis using Naïve Bayes Classifier (NBC); preprocessing included tokenization, stopword removal, and TF-IDF term weighting. | Evaluating public sentiment about Gojek online transportation services based on Instagram comments, classifying sentiments as positive or negative to inform service improvements. | Achieved an accuracy of 81% using NBC. Positive comments reflected satisfaction with promotions and service reliability, while negative comments highlighted technical issues and long waiting times. Demonstrated the utility of Instagram as a feedback channel for enhancing transportation services. |
[59] 2024 | Sentiment analysis using BERT (Bidirectional Encoder Representations from Transformers) for text classification and lexicon-based sentiment analysis; combined with Analytic Hierarchy Process (AHP) for cross-validation. | Developing a comprehensive evaluation system for urban traffic intelligence in Shanghai, focusing on public travel experiences gathered from surveys and social media (Weibo). | Key indicators identified: safety, efficiency, comfort, environmental friendliness, reliability, convenience, flexibility, information accessibility, and affordability. Areas needing improvement: affordability, safety, and comfort. High positive sentiment towards flexibility, environmental friendliness, and information accessibility. Demonstrated scalability for application to other cities with minimal input data. |
[60] 2021 | Sentiment analysis using Word2Vec (Skip-gram model) for feature extraction and Support Vector Machine (SVM) with various kernels (RBF, Linear, Polynomial) for classification. | Evaluating public sentiment on Gojek and Grab reviews from the Google Play Store to identify service strengths and weaknesses. | Gojek achieved a higher performance score: 89% accuracy, 94% precision, 86% recall, and 90% F1-score. Grab: 87% accuracy, 94% precision, 85% recall, and 89% F1-score. Demonstrated Word2Vec and SVM as effective tools for sentiment classification in online transportation reviews. |
[43] 2024 | Multilingual opinion mining using AfriBERTa, AfroXLMR, and AfroLM models; preprocessing involved data cleaning, trend analysis, and handling code-mixed languages. | Investigating public sentiment toward rail, bus, and mini-bus taxi systems in Kenya, South Africa, and Tanzania through Twitter data, focusing on user satisfaction and multilingual insights. | Positive sentiments were observed for reliability and accessibility, while negative sentiments focused on delays, overcrowding, and safety issues. Code-mixed datasets highlighted the complexity of user expressions, emphasizing the need for multilingual NLP tools. Demonstrated the alignment and gaps between user sentiment and service provider ratings. Results offered actionable insights for improving public transport policies and service quality in multilingual contexts. |
[61] 2019 | Sentiment analysis using Naïve Bayes Classifier (NBC) and K-Nearest Neighbors (KNN); preprocessing included data cleaning, tokenization, stemming, and filtering. | Comparing the performance of NBC and KNN for sentiment classification of tweets related to online transportation services (e.g., Gojek and Grab). | NBC achieved 66.15% accuracy, while KNN slightly outperformed with 67.69% accuracy. Demonstrated that KNN provides better classification accuracy than NBC for small datasets. Identified positive and negative sentiments related to service quality, driver behavior, and application functionality. Recommendations included expanding datasets and integrating spam detection for improved accuracy. |
[62] 2021 | Sentiment analysis using Natural Language Processing (NLP) to extract driver ratings from customer feedback; hybrid Genetic Algorithm (GA) and fuzzy simulation to optimize vehicle routing in uncertain conditions. | Improving shared transportation systems by integrating customer sentiment into driver selection for the Vehicle Routing Problem (VRP) with simultaneous pick-up and drop services. | NLP-based sentiment analysis provided actionable insights into driver performance, linking driver selection to improved customer satisfaction. Fuzzy simulations demonstrated robust handling of uncertainties in travel durations and customer demands. Genetic Algorithm optimization reduced travel times while incorporating customer sentiment into routing decisions. |
[63] 2024 | Sentiment analysis using Random Forest for classification; data preprocessing involved tokenization, stemming, and case folding. Word clouds were used for visualization. | Evaluating customer feedback about the Suroboyo Bus on Instagram to identify key areas for service improvement and enhance public transportation quality in Surabaya. | Positive sentiments focused on the affordability and convenience of services. Negative sentiments highlighted issues with bus stops, routes, schedules, and the mobile app. Random Forest achieved an accuracy of 71.27% in classifying sentiments. Recommendations include optimizing bus schedules, improving the app, and addressing service gaps identified through sentiment analysis. |
[64] 2020 | Sentiment analysis using BERT for natural language processing; data captured via APIs from Twitter and Reddit. Preprocessing included tokenization, lemmatization, and removal of irrelevant data. | Assessing public acceptance of autonomous mobility by identifying fears and concerns regarding self-driving technology through social media data. | 61.66% of Twitter posts expressed positive opinions, compared to 71.72% on Reddit. Negative sentiments were associated with safety concerns, cybersecurity fears, and employment issues. Identified specific concerns like combining autonomous and conventional vehicles, liability in accidents, and the potential decline of driving as a hobby. Recommendations include addressing technophobia and enhancing public understanding of autonomous technology. |
[7] 2022 | Comprehensive bibliometric review of 353 research articles using VOSviewer for co-citation and network analysis. Data preprocessing included keyword identification, clustering, and co-occurrence mapping. | Mapping the thematic and intellectual structure of sentiment analysis in public services for smart society, including applications in traffic congestion, governance, and urban planning. | Identified motor themes like information retrieval and supervised learning as drivers for smart societies. Emerging themes included social media data for urban planning and location-based services. Sentiment analysis facilitates innovation, transparency, citizen participation, and improved efficiency in public service management. Highlighted gaps in cross-domain application and the need for advanced algorithms to process unstructured social media data. |
[65] 2020 | Sentiment analysis using the K-Nearest Neighbor (KNN) algorithm; preprocessing included tokenization, case folding, stopword removal, and stemming. Data extracted from Instagram comments related to two Indonesian online transportation providers (BRG and KJG). | Analyzing user satisfaction with online transportation services to identify positive and negative customer sentiments, aiding service improvement. | BRG: 35.6% positive comments, 64.4% negative comments. KJG: 35.9% positive comments, 64.1% negative comments. Achieved 94.4% accuracy, precision, recall, and F1-score with a training-test split of 95:5. Demonstrated KNN’s effectiveness for analyzing social media data to gauge public opinion. |
[66] 2017 | Sentiment analysis using KNIME and Semantria API; document-level analysis of blog content using a dictionary-based and machine learning approach. | Assessing customer satisfaction levels with aviation and non-aviation services at five major European airports based on blog data. | Non-aviation services (food, beverage, shopping) had higher positive sentiment (55%) than aviation services (check-in, baggage claim, security control) with only 33% positive feedback. Identified critical areas for improvement, including food and beverage quality, and check-in and security procedures. Provided actionable insights for airport management to prioritize resource allocation and enhance passenger experience. |
[67] 2019 | Sentiment analysis using Theysay and Twinword tools for analyzing 4392 tweets related to Heathrow Airport’s services. Preprocessing included keyword extraction and mapping to Airport Service Quality (ASQ) attributes. | Evaluating passenger sentiment on various airport service aspects to complement traditional methods like surveys, providing insights into areas needing improvement. | Positive sentiments were higher for attributes like WiFi, Food and Beverage, and Lounge services. Negative sentiments focused on Waiting, Parking, Passport Control, and Staff interactions. The analysis highlighted gaps in service quality, guiding actionable improvements. |
[68] 2021 | Sentiment analysis using the Naïve Bayes Classifier (NBC). Data preprocessing included tokenization, stemming, and filtering. Tweets related to Gojek, Grab, Commuter Line, and Transjakarta services were analyzed. | Evaluating public sentiment about major transportation services in Greater Jakarta based on Twitter data to identify service strengths and weaknesses. | Commuter Line: 33.3% positive probability and no negative tweets. Gojek: 8.4% positive and 50.4% negative probabilities. Grab: 6.7% positive and 79% negative probabilities. Transjakarta: 2.4% positive and 11.9% negative probabilities. Overall, negative sentiment outweighed positive sentiment for most services except for the Commuter Line. Recommendations included leveraging social media feedback to improve service quality and customer interaction. |
[69] 2021 | Sentiment analysis using Google Machine Learning API (Natural Language API); data preprocessing included data crawling, cleansing, and filtering. Sentiment labeling was based on a threshold system (positive, neutral, negative). | Evaluating public sentiment toward online transportation providers in Indonesia (Gojek, Grab, and Bluebird) to inform service improvements. | Sentiment distribution for Gojek: 495 positive, 853 neutral, 1054 negative tweets. Sentiment distribution for Grab: 385 positive, 406 neutral, 429 negative tweets. Sentiment distribution for Bluebird: 21 positive, 49 neutral, 17 negative tweets. Gojek received the highest proportion of negative sentiments. Performance of the sentiment analysis system achieved 82.6% accuracy, 82.2% precision, and 83.3% recall. Demonstrated Twitter as the most valuable platform for sentiment data collection compared to other platforms like YouTube and Google Search. |
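
Many of the studies summarized above report accuracy, precision, recall, and F1-score for a sentiment classifier. As a minimal sketch of how these figures relate, the following Python snippet computes them for one target class from paired true and predicted labels. The label lists are hypothetical and invented purely for illustration, and the function names (`f1_from`, `binary_metrics`) are our own rather than taken from any reviewed study. The last two lines check that the rounded precision and recall reported for [60] are consistent with the F1-scores reported in the same row.

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    f1: float


def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binary_metrics(y_true, y_pred, positive="positive") -> BinaryMetrics:
    """Accuracy, precision, recall and F1 with `positive` as the target class."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    correct = sum(t == p for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    accuracy = correct / len(pairs) if pairs else 0.0
    return BinaryMetrics(accuracy, precision, recall, f1_from(precision, recall))


# Hypothetical labels, invented for illustration only.
y_true = ["positive", "positive", "negative", "negative", "positive", "negative"]
y_pred = ["positive", "negative", "negative", "positive", "positive", "negative"]
metrics = binary_metrics(y_true, y_pred)

# Consistency check on the rounded figures reported for [60].
gojek_f1 = f1_from(0.94, 0.86)  # reported F1-score: 90%
grab_f1 = f1_from(0.94, 0.85)   # reported F1-score: 89%
```

Because F1 is the harmonic mean of precision and recall, a study that reports only accuracy (as several rows above do) cannot be compared directly with one that reports F1, particularly when the sentiment classes are imbalanced.
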
Ref./Year | Method | Application | Findings |
---|---|---|---|
[73] 2018 | Structural Topic Modeling (STM) applied to aviation safety incident reports (ASRS database), using metadata (flight mission, phase of flight, etc.) to uncover latent topics and trends. | Identifying themes and trends in aviation safety to inform safety priorities and future research. | Demonstrated STM’s utility for integrating metadata and narrative text for aviation safety insights. |
[80] 2019 | Latent Dirichlet Allocation (LDA) applied to Road Safety Inspection (RSI) reports to identify patterns in roadside safety issues and interventions. | Assessing road safety on Irish roads, focusing on run-off-road crashes and identifying common problems and corresponding interventions. | Common roadside hazards include poles, roadside barriers, and walls. Issues are more frequently identified than solutions, indicating better intervention strategies are needed. Lack of application of “clear zone” and “forgiving roadside” concepts in Irish RSI reports. |
[81] 2019 | Sentiment analysis using manual classification and topic modeling with MALLET; validation of results via comparative analysis with traditional surveys. | Measuring user satisfaction with the Transantiago public transport system in Santiago, Chile, using Twitter data to complement traditional surveys. | Twitter primarily captures negative sentiment (75%) compared to surveys, which report more balanced feedback. Thematic topics from tweets aligned well with operational issues yet revealed biases towards rush hours and higher socioeconomic areas. Twitter data provides broader spatial coverage than surveys but offers less depth per stop or service. |
[82] 2024 | Structural Topic Model (STM) applied to 52,087 online reviews from Dianping.com; preprocessing included text segmentation, stopword removal, and part-of-speech tagging. | Assessing the quality of service in the Shanghai Metro system by identifying key themes, analyzing sentiment polarity, and exploring temporal and spatial patterns. | Temporal analysis showed a rise in experience-related topics and a decline in physical operations-related topics. Spatial clustering revealed differing priorities across station types: residential areas emphasized commuting needs, business districts highlighted design and operational aspects, and other areas focused on cleanliness and security. |
[83] 2019 | Latent Dirichlet Allocation (LDA) and dictionary-based sentiment analysis applied to 13,738 Weibo messages; preprocessing included keyword filtering and term weighting using TF-IDF. | Understanding public concerns and sentiment patterns related to ride-hailing security following a Didi driver murder case to improve service platforms and government regulations. | Four key concerns: ride-hailing service quality (35%), platform accountability (25%), government regulation on market entry, and case-related discussions. Negative sentiment peaked after the incident, highlighting dissatisfaction with driver background checks and crisis response. Positive sentiment was associated with competition driving service improvements. |
[84] 2022 | Structural Topic Model (STM) applied to abstracts from 421 COVID-19-related transportation research articles to identify latent themes and analyze trends and perspectives. | Investigating how the pandemic influenced transportation research priorities and comparing research focuses between high-income countries (HICs) and middle- and low-income countries (MLICs). | Key topics included travel behavior changes, airport financial performance, air transport recovery, and logistics optimization. Emerging areas included shipping emissions, active transportation, and traffic safety. HIC authors emphasized shared mobility and safety, while MLIC researchers focused on logistics efficiency and pandemic mitigation strategies. |
[85] 2023 | Sentiment analysis using Valence Aware Dictionary and Sentiment Reasoner (VADER) combined with topic modeling and geospatial analysis; preprocessing involved tweet cleaning, geocoding, and location matching. | Detecting and analyzing eyewitness transit-related tweets for incident management in Calgary Transit, identifying urgent issues, and supporting service improvement. | Safety and security incidents were the most frequently reported topics, followed by ride quality and travel time. Tweets were primarily negative or neutral, reflecting complaints about crowded buses, safety concerns, and service delays. Spatial analysis highlighted high tweet activity in downtown Calgary, enabling targeted service responses. Demonstrated potential for integrating social media insights into real-time transit management. |
[86] 2024 | Sentiment analysis using RoBERTa for polarity classification (positive, neutral, negative) combined with Latent Dirichlet Allocation (LDA) for topic modeling. Data preprocessing included tokenization, lemmatization, and removal of stopwords. | Understanding public sentiment and identifying key issues and strengths in public transportation in the UK and India based on Twitter data. | Recommendations include addressing affordability and safety in the UK and enhancing public transport infrastructure in India. |
[87] 2018 | Sentiment analysis using Latent Dirichlet Allocation (LDA) for topic modeling and Word2Vec (skip-gram model) for feature representation. Data preprocessing included filtering, stopword removal, lemmatization, and cleaning. Classification applied machine learning models, including Decision Tree, SVM, Logistic Regression, Naïve Bayes, Random Forest, and Neural Network. | Analyzing transportation-related data from Twitter, New York Times, and TripAdvisor to predict polarity and enhance Intelligent Transportation Systems (ITS). | The proposed system improved the accuracy of sentiment classification by integrating topic modeling and word embedding. Neural network models outperformed traditional classifiers in sentiment prediction. Highlighted the importance of domain-specific ontologies for refining feature extraction. |
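
Several of the studies above rely on Latent Dirichlet Allocation (LDA). To make the underlying idea concrete, the sketch below implements LDA with collapsed Gibbs sampling in plain Python. It is illustrative only: the reviewed studies used established toolkits (for example, MALLET in [81]), not this code, and the toy documents, the hyperparameter values (`alpha`, `beta`), the number of iterations, and the function name `lda_gibbs` are all assumptions made for the example.

```python
import random
from collections import Counter


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA to tokenized documents with collapsed Gibbs sampling."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]         # tokens per (doc, topic)
    topic_word = [Counter() for _ in range(num_topics)]  # tokens per (topic, word)
    topic_total = [0] * num_topics                       # tokens per topic
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assignments[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the full conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1

    # Smoothed per-document topic proportions; each row sums to 1.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    # Most frequent words assigned to each topic.
    top_words = [
        [w for w, c in tw.most_common(3) if c > 0] for tw in topic_word
    ]
    return theta, top_words


# Hypothetical toy corpus, invented for illustration only.
docs = [
    "bus delay late crowded bus".split(),
    "delay late bus stop crowded".split(),
    "fare price ticket expensive fare".split(),
    "ticket price fare cheap expensive".split(),
]
theta, top_words = lda_gibbs(docs, num_topics=2)
```

Here `theta` holds the estimated topic proportions of each document and `top_words` lists the words most often assigned to each topic. On real corpora, the preprocessing steps listed in the table (stopword removal, lemmatization, TF-IDF filtering) and the choice of the number of topics strongly affect the result.
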
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Torres, E.C.M.; de Picado-Santos, L.G. Sentiment Analysis and Topic Modeling in Transportation: A Literature Review. Appl. Sci. 2025, 15, 6576. https://doi.org/10.3390/app15126576