Multitask Learning for Crash Analysis: A Fine-Tuned LLM Framework Using Twitter Data
Highlights
- Demonstrates the effectiveness of a novel multitask learning (MTL) framework utilizing large language models (LLMs) for real-time analysis of road traffic crashes (RTCs) through the integration of social media data.
- A GPT-2 model fine-tuned with a language-modeling objective outperformed the baseline models, GPT-4o mini in zero-shot mode and XGBoost, across the classification and information retrieval tasks on which it was benchmarked.
- The study collected and curated a dataset of 26,226 RTC-related tweets from Australia over a year. Fifteen unique features were extracted from these tweets, six used in classification tasks and nine in information retrieval tasks.
- Developed an advanced automated labeling system using GPT-3.5, followed by rigorous expert verification to ensure the accuracy and reliability of feature extraction from tweets. The resulting meticulously curated dataset serves as a foundational resource for training and validating subsequent models, establishing a new standard for RTC analysis.
- Offers a transformative approach to traffic safety analytics, providing detailed, timely insights crucial for emergency responders, urban planners, and policymakers.
- By leveraging cutting-edge AI techniques within an MTL framework, this study demonstrates a transformative approach to real-time RTC analysis, setting the stage for future advancements in the field.
- The curated dataset generated in this research not only advances traffic safety measures but also serves as a valuable resource for extracting insights, developing models, and conducting further research. This resource provides a solid foundation for future studies aimed at enhancing road safety.
Abstract
1. Introduction
- Develop a multitask learning framework: design a comprehensive MTL framework that integrates classification and information retrieval tasks, surpassing traditional multi-class classifiers.
- Label tweets and verify via domain experts: implement an automated labeling system using GPT-3.5, followed by expert verification to ensure the accuracy and reliability of the extracted features. The resulting dataset is used for fine-tuning the GPT-2 model.
- Fine-tune GPT models for multitask objectives: fine-tune GPT-2 for simultaneous classification and information extraction tasks, ensuring the model can handle both types of tasks efficiently.
- Incorporate GPT-4o mini zero-shot as a baseline: utilize GPT-4o mini in a zero-shot setting to establish a performance baseline, enabling a robust comparison with the fine-tuned GPT-2 model and highlighting the effectiveness of task-specific fine-tuning.
- Evaluate and test model efficiency and applicability: rigorously test the model’s performance using real-world Twitter data to assess its effectiveness and applicability in real-time RTC monitoring and analysis.
- Provide a dataset of tweets related to traffic crashes: make this dataset available for further model development or to gain insights about traffic crashes using social media.
2. Background
2.1. Multitask Learning
2.2. Large Language Models (LLMs)
2.3. Prompt Engineering
2.4. Fine-Tuning LLMs
2.5. Traffic Crash Detection and Analysis
3. Literature Review
3.1. Social Media as a Data Source for Traffic Analysis
3.2. Advances in NLP and Large Language Models (LLMs)
3.3. Addressing the Research Gap: Multitask Learning with LLMs
4. Methodology
4.1. Data Collection
4.2. Data Preprocessing
4.3. Data Labeling
Algorithm 1: Dataset Curation Process pseudocode
1: procedure DATASET_CURATION()
2:   D ← LOAD_RAW_DATA()
3:   D ← REMOVE(#, duplicates, emojis, etc., from D)
4:   auth ← AUTHENTICATE(API_KEY)
5:   P ← CREATE_PROMPTS(D)
6:   F ← DEFINE_FEATURES()
7:   M_config ← CONFIGURE_MODEL(GPT_3_5, F)
8:   M ← CONFIGURE_MODEL(M_config)
9:   PD ← BATCH_PROCESS(M, D)
10:  do
11:    SLEEP(120)
12:    PD ← CONTINUE_PROCESS(M)
13:  loop until ALL_PROCESSED(PD)
14:  if NEED_ITERATION() then
15:    go to step 3
16:  endif
17:  J ← TO_JSON(PD)
18:  C ← TO_CSV(J)
19:  FD ← FINAL_CLEAN(C)
20:  SD ← STANDARDIZE(FD)
21:  if ¬VALIDATE(SD) then
22:    raise ERROR(“Validation Failed”)
23:  endif
24:  return SD
25: end DATASET_CURATION
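Step 3 of Algorithm 1 (removing "#" symbols, duplicates, and emojis) can be illustrated with a minimal Python sketch. The function names and the emoji code-point ranges below are assumptions made for illustration; the paper does not specify how the cleaning was implemented, and the ranges are a rough approximation rather than an exhaustive emoji list.

```python
import re

# Assumed emoji ranges (pictographs, misc symbols/dingbats, regional flags).
# This is an approximation for illustration, not an exhaustive definition.
EMOJI_PATTERN = re.compile(
    "[\U0001F300-\U0001FAFF\U00002600-\U000027BF\U0001F1E6-\U0001F1FF]"
)


def clean_tweet(text: str) -> str:
    """Drop '#' symbols and emojis, then collapse runs of whitespace."""
    text = text.replace("#", "")
    text = EMOJI_PATTERN.sub("", text)
    return re.sub(r"\s+", " ", text).strip()


def clean_corpus(tweets):
    """Clean every tweet and drop empty results and exact duplicates,
    keeping the first occurrence of each cleaned text."""
    seen = set()
    cleaned = []
    for tweet in tweets:
        text = clean_tweet(tweet)
        if text and text not in seen:
            seen.add(text)
            cleaned.append(text)
    return cleaned
```

Deduplicating after cleaning, rather than before, means that two tweets differing only in hashtags or emojis are treated as the same text; whether the original pipeline did this is not stated.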
4.4. Manual Verification and Post-Processing
4.5. Final Curated and Annotated Dataset Description
4.6. Model Training and Fine-Tuning
4.7. Model Evaluation
5. Experimental Setup
5.1. Dataset
5.2. Computing Environment
5.3. Fine Tuning of GPT-2
5.3.1. Data Preparation
5.3.2. Tokenization and Input Preparation
- Truncation and Padding: The input sequences were truncated to a maximum length of 256 tokens so that every sequence fit within the configured input length, which keeps processing efficient and reduces computational overhead. Shorter sequences were padded to the same length so that the model received inputs of consistent size, which is required for batch processing during training.
- Label Preparation for Classification Tasks: For classification tasks, labels were assigned to the prompts. The input sequences were tokenized and paired with their corresponding labels. The model was then trained to predict the correct labels based on the input tokens, enabling it to perform tasks such as classifying whether a tweet is related to a road traffic crash (RTC).
- Handling Long Inputs: If an input prompt exceeded the maximum length of 256 tokens, it was truncated to fit. The truncation was performed carefully to preserve the essential parts of the input, so that the context required for accurate predictions was maintained.
- Input Preparation for Information Retrieval (IR) Tasks: In IR tasks, the model was fine-tuned to generate the correct answer based on a provided prompt. The prompt and expected response were tokenized together, with the model learning to predict the sequence of tokens that corresponds to the correct answer. This approach allowed the model to handle a wide range of queries related to RTCs, such as extracting the number of injuries or identifying the location of a crash.
- Tokenization Process: The entire process of tokenization and input preparation can be represented by Equation (1). This formula illustrates how the prompt and response are combined and tokenized into a format suitable for GPT-2, ensuring that the input is ready for fine-tuning. Figure 3 presents a flowchart of the step-by-step process from input tweets and questions to the tokenized format ready for GPT-2 fine-tuning.
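The truncation, padding, and prompt-plus-response steps listed above can be sketched in plain Python operating on lists of token ids. This is an illustration under stated assumptions rather than the authors' implementation: the function names are hypothetical, truncation simply keeps the first 256 tokens (the text does not say which part of a long input was preserved), and the padding id is supplied by the caller (GPT-2 ships without a dedicated padding token, so the end-of-sequence id is often reused for this purpose).

```python
MAX_LEN = 256  # maximum sequence length stated in the text


def prepare_input(token_ids, pad_id, max_len=MAX_LEN):
    """Truncate to max_len, then right-pad to max_len.

    Returns (input_ids, attention_mask), where the mask is 1 for real
    tokens and 0 for padding.
    """
    ids = list(token_ids[:max_len])  # assumption: keep the leading tokens
    mask = [1] * len(ids)
    n_pad = max_len - len(ids)
    return ids + [pad_id] * n_pad, mask + [0] * n_pad


def build_ir_example(prompt_ids, response_ids, eos_id, pad_id, max_len=MAX_LEN):
    """Concatenate prompt and expected response ids, append an
    end-of-sequence id, and apply the same truncation and padding."""
    combined = list(prompt_ids) + list(response_ids) + [eos_id]
    return prepare_input(combined, pad_id, max_len)
```

For example, `prepare_input([1, 2, 3], pad_id=0, max_len=5)` yields the ids `[1, 2, 3, 0, 0]` with mask `[1, 1, 1, 0, 0]`. The attention mask lets the model ignore padded positions during training.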
5.3.3. Model Training Process
5.3.4. Multitask Learning Objective
- x_t is the token at position t;
- x_{<t} represents the sequence of tokens before position t;
- θ denotes the model parameters;
- T is the total number of tokens in the sequence;
- P(x_t | x_{<t}; θ) is the probability of token x_t given the preceding tokens x_{<t} and model parameters θ.
- N is the number of examples in the dataset;
- C is the number of classes in the classification task;
- y_ic is the true label for example i and class c;
- ŷ_ic is the predicted probability for example i and class c.
- α is the weighting factor for the language modeling loss; and
- β is the weighting factor for the classification loss.
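The symbol definitions above are consistent with the standard language-modeling loss, the standard cross-entropy classification loss, and a weighted sum of the two. The following LaTeX block writes these out as a sketch. The loss names \mathcal{L}_{\mathrm{LM}} and \mathcal{L}_{\mathrm{cls}} are labels chosen here, and any normalization (for example, averaging over T or N) is not recoverable from the definitions, so none is shown.

```latex
\begin{align}
\mathcal{L}_{\mathrm{LM}}(\theta) &= -\sum_{t=1}^{T} \log P\left(x_t \mid x_{<t}; \theta\right) \\
\mathcal{L}_{\mathrm{cls}} &= -\sum_{i=1}^{N} \sum_{c=1}^{C} y_{ic} \log \hat{y}_{ic} \\
\mathcal{L} &= \alpha \, \mathcal{L}_{\mathrm{LM}} + \beta \, \mathcal{L}_{\mathrm{cls}}
\end{align}
```

Setting β = 0 recovers pure language-model fine-tuning, while α = 0 trains the classification objective alone; intermediate values trade the two tasks off against each other.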
5.3.5. Training Setup and Configuration
5.3.6. Optimization and Learning Rate Scheduling
- θ: the parameters or weights of the model;
- η: the learning rate, a hyperparameter that controls how much to change the model parameters in response to the gradient of the loss function;
- ∇_θ L: the gradient of the loss function L with respect to the parameters θ;
- L: the loss function that measures how well the model’s predictions match the actual data.
- η_t: the learning rate at time t;
- η_min: the minimum learning rate;
- η_max: the maximum learning rate;
- t: the current time step or iteration;
- T: the total time or the period over which the learning rate schedule is applied;
- π: the mathematical constant pi, approximately equal to 3.14159.
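The two sets of symbols above match a gradient-descent update, θ ← θ − η ∇_θ L, and a cosine-annealed learning rate, η_t = η_min + ½ (η_max − η_min)(1 + cos(π t / T)). The sketch below implements both in plain Python under that reading. The function names and the example learning-rate bounds are hypothetical, and the optimizer actually used for fine-tuning may apply a more elaborate update than this plain gradient step.

```python
import math


def cosine_lr(t, total_steps, lr_min, lr_max):
    """Cosine-annealed learning rate:
    eta_t = eta_min + 0.5 * (eta_max - eta_min) * (1 + cos(pi * t / T)).
    """
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * t / total_steps))


def gradient_step(theta, grad, lr):
    """Plain gradient-descent update, theta <- theta - eta * grad,
    applied element-wise to a list of parameters."""
    return [p - lr * g for p, g in zip(theta, grad)]


if __name__ == "__main__":
    # Hypothetical bounds, chosen only to show the shape of the schedule.
    for step in (0, 250, 500, 750, 1000):
        print(step, cosine_lr(step, total_steps=1000, lr_min=1e-6, lr_max=5e-5))
```

The schedule starts at η_max when t = 0, passes through the midpoint of the two bounds at t = T/2, and reaches η_min at t = T, so early updates are large and later ones progressively smaller.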
5.4. Baseline Benchmark Models
5.4.1. Prompt Engineering Using Zero-Shot GPT-4o Mini
5.4.2. XGBoost Model Training
6. Analysis and Results
6.1. Fine-Tuned GPT-2 Performance
6.2. Baseline Model Performance: GPT-4o Mini and XGBoost
6.2.1. Classification Task Performance
6.2.2. Information Retrieval Task Performance
6.2.3. Comparative Analysis of Baseline Models
Classification Task Performance
Information Retrieval Task Performance
7. Discussion
7.1. Multitask Learning Framework for Enhanced RTC Analysis
7.2. Comparison with GPT-4o Mini Zero-Shot Baseline
8. Conclusions and Future Work
8.1. Key Contributions
- Development of a multitask learning (MTL) framework for classification and information retrieval: this study introduced an MTL framework that uses LLMs to handle multiple classification and information retrieval tasks simultaneously, enabling a more comprehensive and efficient analysis of RTCs.
- Curated dataset: We developed a curated dataset specifically designed to fine-tune models for more accurate RTC analysis. This dataset includes a variety of labels and classes relevant to road traffic crashes, contributing significantly to the research community and providing a valuable resource for further RTC-related studies.
- Automated labeling using prompt engineering: The study employed prompt engineering techniques to automate the labeling process, enhancing both the efficiency and accuracy of data annotation. This automation is essential for scaling the analysis to larger datasets while maintaining consistency in labeling.
- Benchmarking fine-tuning vs. prompt engineering: By comparing the performance of fine-tuned models against those utilizing prompt engineering in a zero-shot setting, this study sets a benchmark for future research. This comparison provides valuable insights into the strengths and limitations of different methodologies, guiding the development of more effective approaches in RTC analysis.
8.2. Limitations and Future Directions
8.3. Implications for Research and Practice
Author Contributions
Funding
Data Availability Statement
Acknowledgments
Conflicts of Interest
Appendix A. Example Labeled Data from Traffic Incident Tweets
Tweet | Road Accident | Severity | Culprit | Culprit Apprehended | No of Injured | No of Deaths |
---|---|---|---|---|---|---|
“23 yr old man charged over Caboolture hit and run that killed Collin Young” | Yes | Fatal | 23 yr old man | Yes | None | 1 |
“Teenage driver arrested, woman in critical condition after alleged hit-and-run in Darwin” | Yes | Critical | Teenage driver | Yes | 1 | None |
“A professional Perth golfer accused of a hit-and-run that killed an elderly man on the freeway in October 2019 claims he was suffering a medical episode at the time and did not have control of his body.” | Yes | Fatal | Professional Perth golfer | No | None | 1 |
“#Duingal—Three stable patients have been transported to Bundaberg Hospital after a traffic incident on the Bruce Highway at 12.46pm.” | Yes | Mild | None | None | 3 | 0 |
“UPDATE: Peel St remains closed northbound at Swan St due to a 2-car crash with a person trapped. Diversions are via Swan St, Johnson St, Tribe St, Moore Creek Rd, Browns Ln to rejoin Manilla Rd near Hallsville. Continue to allow extra travel time.” | Yes | Moderate | None | None | 1 | None |
“#Beenleigh—Two vehicle traffic incident with three patients, all with minor injuries. Paramedics transporting two patients to Logan Hospital in a stable condition.” | Yes | Mild | None | None | 3 | 0 |
“Monday—Mixed 4’s: Slap That Ace beat Hit n Run (23–15)” | No | None | None | None | None | None |
Brisbane City: several lanes are closed on Countess Street due to a traffic incident where a truck has crashed into a rail bridge. Motorists are advised to use Hale Street as an alternate route, and avoid the area or expect delays. | Yes | Moderate | Truck | None | None | None |
7NEWS understands a 13-year-old boy has died at the Children’s Hospital after yesterday’s stolen car crash in Oakey. A 14-year-old boy remains on life support—while another teenager has been released from the Base Hospital and charged. | Yes | Fatal | Stolen car | Yes | 2 | 1 |
Location of Accident | Contributing Factor | Type of Car Involved | Crash Event Type | Driver Error | Collision Type | Case Scenario | Sentiment | Emotions |
---|---|---|---|---|---|---|---|---|
Caboolture | Hit-and-run | None | Hit-and-run | None | None | Hit-and-run resulting in fatality | Negative | Sadness |
Darwin | Alleged hit-and-run | Unknown | Hit-and-Run | Reckless driving | Unknown | Alleged hit-and-run | Negative | Sadness |
Freeway in Perth | Medical episode | Unknown | Hit-and-run | None | None | Medical episode leading to fatal hit-and-run | Negative | Sadness |
Bruce Highway | None | None | Traffic incident | None | None | Stable patients transported to hospital | Neutral | Neutral |
Peel St at Swan St | Driver error | None | 2-car collision | Yes | None | Person trapped | Neutral | Fear |
Beenleigh | None | Two vehicles | Traffic incident | None | None | None | Neutral | Neutral |
None | None | None | None | None | None | None | Neutral | None |
Countess Street, Brisbane City | Crash into a rail bridge | Truck | Collision with a rail bridge | None | Vehicle-structure collision | Traffic incident | Negative | Fear |
Oakey | Stolen car crash | Stolen car | Collision | None | Car crash | Stolen car crash resulting in death and injuries | Negative | Sadness |
Appendix B. Prompt Templates for Zero-Shot GPT-4o Mini
# Imports are not shown in the original listing; these LangChain paths are assumed.
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_openai import ChatOpenAI

# Shared wrapper: the tweet text plus one field-specific instruction.
prompt = """
Given this tweet below:
<tweet>
{text}
</tweet>
{instruction}
"""

road_accident_prompt = """
I want you to classify for this field:
- Road Accident: (Yes or No)
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Pick the best answer (only one value from the list provided) as the value for this field. Use the values' spellings as they have been provided you in your response.
- Do not be unnecessarily verbose or make additional statements.
Your response:"""

severity_prompt = """
I want you to classify for this field:
- Severity: (none, unknown, mild, moderate, severe, critical, or fatal)
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Pick the best answer (only one value from the list provided) as the value for this field. Use the values' spellings as they have been provided you in your response.
- Do not be unnecessarily verbose or make additional statements.
Your response: """

culprit_prompt = """
I want you to extract information for this field:
- Culprit:
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Do not be unnecessarily verbose or make additional statements.
Your response: """

culprit_appehended_prompt = """
I want you to classify for this field:
- Culprit apprehended: (unknown, Yes or No)
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Pick the best answer (only one value from the list provided) as the value for this field. Use the values' spellings as they have been provided you in your response.
- Do not be unnecessarily verbose or make additional statements.
Your response: """

num_injured_prompt = """I want you to extract information for this field:
- No of injured:
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Do not be unnecessarily verbose or make additional statements.
Your response:"""

num_deaths_prompt = """
I want you to extract information for this field:
- No of deaths:
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Do not be unnecessarily verbose or make additional statements.
Your response: """

location_prompt = """I want you to extract information for this field:
- Location of accident:
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Do not be unnecessarily verbose or make additional statements.
Your response:"""

contributing_factor_prompt = """I want you to extract information for this field:
- Contributing factor:
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Do not be unnecessarily verbose or make additional statements.
Your response:"""

type_of_car_involved_prompt = """I want you to extract information for this field:
- type of car involved:
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Do not be unnecessarily verbose or make additional statements.
Your response:"""

crash_event_type_prompt = """I want you to extract information for this field:
- crash event type:
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Do not be unnecessarily verbose or make additional statements.
Your response:"""

driver_error_prompt = """I want you to extract information for this field:
- driver error:
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Do not be unnecessarily verbose or make additional statements.
Your response:"""

collision_type_prompt = """I want you to classify for this field:
- collision type: (single-vehicle crashes, 'types of car' accidents, broadside collision, chain reaction car accidents, hit and run accidents, stationary object collision, pedestrian accidents or not applicable)
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Pick the best answer (only one value from the list provided) as the value for this field. Use the values' spellings as they have been provided you in your response.
- Do not be unnecessarily verbose or make additional statements.
Your response:"""

case_scenario_prompt = """I want you to extract information for this field:
- case scenario:
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Do not be unnecessarily verbose or make additional statements.
Your response:"""

sentiment_prompt = """I want you to classify for this field:
- sentiment: (positive, negative or neutral)
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Pick the best answer (only one value from the list provided) as the value for this field. Use the values' spellings as they have been provided you in your response.
- Do not be unnecessarily verbose or make additional statements.
Your response:"""

emotions_prompt = """I want you to classify for this field:
- emotions: (fear, anger, sadness, happy, neutral, disgust, love, confusion, curiosity, gratitude, sympathy or empathy)
RULES
- You must provide a value for this field above.
- if the tweet does not contain enough information to answer, you must return 'None'.
- Pick the best answer (only one value from the list provided) as the value for this field. Use the values' spellings as they have been provided you in your response.
- Do not be unnecessarily verbose or make additional statements.
Your response:"""

# Build the zero-shot chain: prompt template -> GPT-4o mini -> plain string output.
model_prompt = PromptTemplate.from_template(prompt)
model = ChatOpenAI(model="gpt-4o-mini", temperature=0, streaming=False)  # .bind(response_format={"type": "json_object"})
chain = model_prompt | model | StrOutputParser()
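The listing above defines one instruction per field but does not show how the fields are queried. A minimal sketch of one plausible driver loop follows; it asks for one field at a time and collects the replies in a dictionary. The names `extract_features` and `ask` are hypothetical, and `ask` stands in for whatever sends a prompt to the model, so the sketch runs without network access.

```python
# Same wrapper text as the appendix's `prompt` template, written with explicit newlines.
PROMPT = "Given this tweet below:\n<tweet>\n{text}\n</tweet>\n{instruction}\n"


def extract_features(tweet, instructions, ask):
    """Query one field at a time.

    `instructions` maps a field name to its instruction text, and `ask`
    is any callable that takes a full prompt string and returns the
    model's reply as a string.
    """
    results = {}
    for field, instruction in instructions.items():
        reply = ask(PROMPT.format(text=tweet, instruction=instruction))
        results[field] = reply.strip()
    return results


if __name__ == "__main__":
    # A stub in place of a real model call, for demonstration only.
    def fake_model(full_prompt):
        return " Yes "

    fields = {"road_accident": "I want you to classify for this field:\n- Road Accident: (Yes or No)"}
    print(extract_features("2-car crash on Peel St", fields, fake_model))
```

With the chain defined in the appendix, the equivalent call per field would be `chain.invoke({"text": tweet, "instruction": instruction})`, since the template expects the `text` and `instruction` variables.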
References
- Sahana, S.; Palaniappan, D.; Bobade, S.D.; Rafi, S.M.; Kannadasan, B.; Jayapandian, N. Deep learning ensemble model for the prediction of traffic accidents using social media data. J. Pharm. Negat. Results 2022, 13, 485–495.
- Jaradat, S.; Alhadidi, T.I.; Ashqar, H.I.; Hossain, A.; Elhenawy, M. Exploring traffic crash narratives in Jordan using text mining analytics. arXiv 2024, arXiv:2406.09438.
- Gutierrez-Osorio, C.; González, F.A.; Pedraza, C.A. Deep learning ensemble model for the prediction of traffic accidents using social media data. Computers 2022, 11, 126.
- Kumar, K.P.K.; Geethakumari, G. Detecting misinformation in online social networks using cognitive psychology. Hum.-Centric Comput. Inf. Sci. 2014, 4, 14.
- Stieglitz, S.; Mirbabaie, M.; Ross, B.; Neuberger, C. Social media analytics – Challenges in topic discovery, data collection, and data preparation. Int. J. Inf. Manag. 2018, 39, 156–168.
- Atefeh, F.; Khreich, W. A survey of techniques for event detection in Twitter. Comput. Intell. 2015, 31, 132–164.
- Batrinca, B.; Treleaven, P.C. Social media analytics: A survey of techniques, tools and platforms. AI Soc. 2015, 30, 89–116.
- Brown, T.B.; Mann, B.; Ryder, N.; Subbiah, M.; Kaplan, J.; Dhariwal, P.; Neelakantan, A.; Shyam, P.; Sastry, G.; Askell, A.; et al. Language models are few-shot learners. arXiv 2020, arXiv:2005.14165.
- Pei, X.; Li, Y.; Xu, C. GPT self-supervision for a better data annotator. arXiv 2023, arXiv:2306.04349.
- Caruana, R. Multitask learning. Mach. Learn. 1997, 28, 41–75.
- Radford, A.; Wu, J.; Child, R.; Luan, D.; Amodei, D.; Sutskever, I. Language Models Are Unsupervised Multitask Learners. 2019. Available online: https://openai.com/index/better-language-models/ (accessed on 15 July 2024).
- Chalapathy, R.; Chawla, S. Deep learning for anomaly detection: A survey. arXiv 2019, arXiv:1901.03407.
- Kutela, B.; Mwekh’iga, R.J.; Kilaini, A.M.; Magehema, R.T.; Mbatta, G. Leveraging social media data to understand spatial and severity of roadway crashes in Tanzania. J. Saf. Stud. 2022, 7, 27–51.
- Ruder, S. An overview of multitask learning in deep neural networks. arXiv 2017, arXiv:1706.05098.
- Zhang, Y.; Yang, Q. A survey on multitask learning. IEEE Trans. Knowl. Data Eng. 2022, 34, 5586–5609.
- Liu, S.; Wang, Z.; Liu, X. Jointly learning multi-task sequences and language models with shared hidden layers. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing, Hong Kong, China, 7 November 2019; pp. 5939–5948.
- Bingel, J.; Søgaard, A. Identifying beneficial task relations for multitask learning in deep neural networks. In Proceedings of the 15th Conference of the European Chapter of the Association for Computational Linguistics, Valencia, Spain, 3–7 April 2017; pp. 164–169.
- Thrun, S.; Pratt, L. Learning to Learn; Kluwer Academic Publishers: Dordrecht, The Netherlands, 1998.
- Liu, T.; Ma, X.; Liu, L.; Liu, X.; Zhao, Y.; Hu, N.; Ghafoor, K.Z. LAMBERT: Leveraging Attention Mechanisms to Improve the BERT Fine-Tuning Model for Encrypted Traffic Classification. Mathematics 2024, 12, 1624.
- Zhou, Y.; Li, Z.; Tian, S.; Ni, Y.; Liu, S.; Ye, G.; Chai, H. SilverSight: A multi-task Chinese financial large language model based on adaptive semantic space learning. arXiv 2024, arXiv:2404.04949.
- Zhao, W.X.; Zhou, K.; Li, J.; Tang, T.; Wang, X.; Hou, Y.; Min, Y.; Zhang, B.; Zhang, J.; Dong, Z.; et al. A survey of large language models. arXiv 2023, arXiv:2303.18223.
- Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Advances in Neural Information Processing Systems; The MIT Press: Cambridge, MA, USA, 2017.
- Pennington, J.; Socher, R.; Manning, C.D. GloVe: Global vectors for word representation. In Proceedings of the EMNLP, Doha, Qatar, 25–29 October 2014; pp. 1532–1543.
- Yang, J.; Jin, H.; Tang, R.; Han, X.; Feng, Q.; Jiang, H.; Yin, B.; Hu, X. Harnessing the power of LLMs in practice: A survey on ChatGPT and beyond. arXiv 2023, arXiv:2304.13712.
- Kojima, T.; Gu, S.S.; Reid, M.; Matsuo, Y.; Iwasawa, Y. Large language models are zero-shot reasoners. In Proceedings of the NeurIPS, New Orleans, LA, USA, 28 November–9 December 2022.
- Ye, X.; Durrett, G. The unreliability of explanations in few-shot prompting. In Proceedings of the NeurIPS, New Orleans, LA, USA, 28 November–9 December 2022.
- Hasan, S.; Ukkusuri, S.V. Location contexts of user check-ins to model urban geo life-style patterns. PLoS ONE 2015, 10, e0124819.
- Radford, A.; Narasimhan, K.; Salimans, T.; Sutskever, I. Improving Language Understanding by Generative Pre-Training. 2018. Available online: https://openai.com/index/language-unsupervised/ (accessed on 15 July 2024).
- Ni, M.; He, Q.; Gao, J. Forecasting the subway passenger flow under event occurrences with social media. IEEE Trans. Intell. Transp. Syst. 2017, 18, 1623–1632.
- Shirky, C. The political power of social media: Technology, the public sphere, and political change. Foreign Aff. 2011, 90, 28–41.
- Ye, Q.; Chen, X.; Ozbay, K.; Li, T. Mining social media data for transport policy: Approaches, challenges, and recommendations. In Proceedings of the 2022 IEEE 25th International Conference on Intelligent Transportation Systems (ITSC), Macau, China, 8–12 October 2022; IEEE: Piscataway, NJ, USA; pp. 785–794.
- Demertzis, K.; Iliadis, L.; Anezakis, V.-D. MOLESTRA: A multitask learning approach for real-time big data analytics. In Proceedings of the 2018 Innovations in Intelligent Systems and Applications (INISTA), Thessaloniki, Greece, 3–5 July 2018; IEEE: Piscataway, NJ, USA.
- Wang, G.; Kim, J. The prediction of traffic congestion and incident on urban road networks using Naive Bayes classifier. In Proceedings of the ATRF, Melbourne, Australia, 16–18 November 2016.
- Liu, X.; He, P.; Chen, W.; Gao, J. Multi-task deep neural networks for natural language understanding. In Proceedings of the NAACL-HLT, Minneapolis, MN, USA, 2–7 June 2019; pp. 4487–4496.
- Zhang, Z.; He, Q.; Zhu, S. Potentials of using social media to infer the longitudinal travel behavior: A sequential model-based clustering method. Transp. Res. Part C Emerg. Technol. 2017, 85, 396–414.
- Raffel, C.; Shazeer, N.; Roberts, A.; Lee, K.; Narang, S.; Matena, M.; Zhou, Y.; Li, W.; Liu, P.J. Exploring the limits of transfer learning with a unified text-to-text transformer. J. Mach. Learn. Res. 2020, 21, 1–67.
- D’Andrea, E.; Ducange, P.; Bechini, A.; Renda, A.; Marcelloni, F. Real-time detection of traffic from Twitter stream analysis. IEEE Trans. Intell. Transp. Syst. 2015, 16, 2269–2283.
- Mehri, S.; Eskenazi, M. USR: An unsupervised and reference free evaluation metric for dialog generation. arXiv 2020, arXiv:2005.00456.
- Vishwakarma, M.; Kesswani, N. A new two-phase intrusion detection system with Naïve Bayes machine learning for data classification and elliptic envelop method for anomaly detection. Decis. Anal. J. 2023, 7, 100233.
- Liu, Z.; He, S.; Ding, F.; Tan, H.; Liu, Y. Exploring the potential of social media data in interpreting traffic congestion: A case study of Jiangsu Freeways. In Proceedings of the CICTP 2023, Beijing, China, 14–17 July 2023.
- Ding, Y.; Tao, H.; Zhang, R.; Cheng, Y.; Wang, H. Social media-based traffic situational awareness under extreme weather. In Proceedings of the CICTP 2023, Beijing, China, 14–17 July 2023.
- Yang, X.; Bekoulis, G.; Deligiannis, N. Traffic event detection as a slot filling problem. Eng. Appl. Artif. Intell. 2023, 123, 106202.
- Zheng, O.; Abdel-Aty, M.; Wang, Z.; Ding, S.; Wang, D.; Huang, Y. Avoid: Autonomous vehicle operation incident dataset across the globe. arXiv 2023, arXiv:2303.12889.
- Jaradat, S.; Nayak, R.; Paz, A.; Elhenawy, M. Ensemble Learning with Pre-Trained Transformers for Crash Severity Classification: A Deep NLP Approach. Algorithms 2024, 17, 284.
- Luceri, L.; Boniardi, E.; Ferrara, E. Leveraging large language models to detect influence campaigns on social media. arXiv 2023, arXiv:2311.07816.
- Yang, K.; Zhang, T.; Kuang, Z.; Xie, Q.; Huang, J.; Ananiadou, S. MentaLLaMA: Interpretable mental health analysis on social media with large language models. In Proceedings of the ACM Web Conference, Singapore, 13–17 May 2024; pp. 4489–4500.
- Kim, S.; Kim, K.; Jo, C.W. Accuracy of a large language model in distinguishing anti- and pro-vaccination messages on social media: The case of human papillomavirus vaccination. Prev. Med. Rep. 2024, 42, 102723.
- Li, M.; Conrad, F. Advancing annotation of stance in social media posts: A comparative analysis of large language models and crowd sourcing. arXiv 2024, arXiv:2406.07483.
- Xue, H.; Zhang, C.; Liu, C.; Wu, F.; Jin, X. Multi-task prompt words learning for social media content generation. arXiv 2024, arXiv:2407.07771.
- Liu, J.; Siu, M. Enhancing mental health condition detection on social media through multi-task learning. medRxiv 2024.
- Ilias, L.; Askounis, D. Multitask learning for recognizing stress and depression in social media. arXiv 2023, arXiv:2305.18907.
- Aduragba, O.T.; Yu, J.; Cristea, A.I. Multi-task learning for personal health mention detection on social media. arXiv 2022, arXiv:2212.05147.
- Bruns, A.; Burgess, J.; Highfield, T. A ‘big data’ approach to mapping the Australian Twittersphere. In Advancing Digital Humanities; Palgrave Macmillan: London, UK, 2014.
- Peters, M.E.; Neumann, M.; Iyyer, M.; Gardner, M.; Clark, C.; Lee, K.; Zettlemoyer, L. Deep contextualized word representations. In Proceedings of the NAACL-HLT 2018, New Orleans, LA, USA, 1–6 June 2018; pp. 2227–2237.
- Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the NAACL-HLT 2019, Minneapolis, MN, USA, 2–7 June 2019; pp. 4171–4186.
- Gal-Tzur, A.; Grant-Muller, S.; Kuflik, T.; Minkov, E.; Nocera, S.; Shoor, I. The potential of social media in delivering transport policy objectives. Transp. Policy 2014, 32, 115–123.
- Chen, T.; Guestrin, C. XGBoost: A scalable tree boosting system. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; Association for Computing Machinery: New York, NY, USA, 2016; pp. 785–794.
- Rathje, S.; Mirea, D.-M.; Sucholutsky, I.; Marjieh, R.; Robertson, C.E.; Van Bavel, J.J. GPT is an Effective Tool for Multilingual Psychological Text Analysis. Proc. Natl. Acad. Sci. USA 2024, 121, e2308950121.
- Manning, C.D.; Raghavan, P.; Schütze, H. Introduction to Information Retrieval; Cambridge University Press: Cambridge, UK, 2008.
- Lin, C.-Y. ROUGE: A package for automatic evaluation of summaries. In Proceedings of the Text Summarization Branches Out, Barcelona, Spain, 25–26 July 2004; Association for Computational Linguistics: Stroudsburg, PA, USA; pp. 74–81.
- Papineni, K.; Roukos, S.; Ward, T.; Zhu, W.-J. BLEU: A method for automatic evaluation of machine translation. In Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Stroudsburg, PA, USA, 7–12 July 2002; Association for Computational Linguistics: Stroudsburg, PA, USA, 2002; pp. 311–318.
- Morris, A.C.; Maier, V.; Green, P. From WER and RIL to MER and WIL: Improved evaluation measures for connected speech recognition. In Proceedings of the Interspeech, Jeju Island, Republic of Korea, 4–8 October 2004; pp. 2765–2768.
- Wolf, T.; Debut, L.; Sanh, V.; Chaumond, J.; Delangue, C.; Moi, A.; Cistac, P.; Rault, T.; Louf, R.; Funtowicz, M.; et al. Transformers: State-of-the-art natural language processing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Online, 16–20 November 2020; Liu, Q., Schlangen, D., Eds.; Association for Computational Linguistics: Stroudsburg, PA, USA, 2020; pp. 38–45.
Study | Authors (Year) | Domain | Model Used | Data Type | Key Findings |
---|---|---|---|---|---|
Exploring the Potential of Social Media Data in Interpreting Traffic Congestion: A Case Study of Jiangsu Freeways [40] | [40] | Traffic Congestion Analysis | Document Frequency-Based Method | Sina Weibo Microblogs | Identified congestion-prone areas using Sina Weibo data, demonstrating the potential for traffic analysis through social media. |
Social Media-Based Traffic Situational Awareness under Extreme Weather [41] | [41] | Traffic Situational Awareness | LSTM Classifier | Weibo Data | Enhanced traffic situational awareness under extreme weather with 93.8%–95.8% accuracy using an LSTM classifier. |
Traffic Event Detection as a Slot Filling Problem [42] | [42] | Traffic Event Detection | CNN, LSTM, BERT | Twitter Data | Addressed traffic event detection from Twitter data as a text classification and slot filling problem, achieving high performance scores. |
Identification and Classification of Road Traffic Incidents in Panama City through Social Media Stream Analysis [43] | [43] | Traffic Incident Identification | SVM, Naïve Bayes, Random Forest, XGBoost | Twitter Data | Achieved high precision rates in traffic incident identification and classification using machine learning models. |
Deep Learning Ensemble Model for the Prediction of Traffic Accidents Using Social Media Data [3] | [3] | Traffic Accident Prediction | GRU, CNN | Social Media Data, Bogota Climate Information | Proposed a deep learning ensemble model for traffic accident prediction, outperforming baseline algorithms. |
Twitter-informed Prediction for Urban Traffic Flow Using Machine Learning [44] | [44] | Urban Traffic Flow Prediction | Random Forest, Gradient Boosting | Twitter Data, PeMS Data | Combined Twitter data with traffic and weather information to enhance traffic flow prediction accuracy. |
Leveraging Large Language Models to Detect Influence Campaigns on Social Media [45] | [45] | Influence Campaign Detection | Large Language Models (LLMs) | Multilingual Social Media Datasets | Showcased superior performance in detecting and adapting to influence campaigns using LLMs incorporating user metadata and network structures. |
MentaLLaMA: Interpretable Mental Health Analysis on Social Media with Large Language Models [46] | [46] | Mental Health Analysis | MentaLLaMA, LLMs | IMHI dataset, Social Media Data | Introduced MentaLLaMA, achieving state-of-the-art correctness in mental health analysis on social media and generating high-quality explanations. |
Accuracy of a Large Language Model in Distinguishing Anti- And Pro-vaccination Messages on Social Media: The Case of Human Papillomavirus Vaccination [47] | [47] | Sentiment Analysis | ChatGPT, LLMs | Facebook, Twitter Data | Assessed ChatGPT’s accuracy in sentiment analysis of pro- and anti-vaccination messages on social media. |
Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing [48] | [48] | Stance Annotation | LLMs, Crowdsourcing | Twitter Data | Compared LLMs with human annotators on stance annotation and found that LLMs perform well in the cases where human annotators also perform well. |
Multitask Prompt Words Learning for Social Media Content Generation [49] | [49] | Content Generation | Multi-modal Information Fusion | Social Media Data | Introduced a multitask prompt word learning framework, improving the quality and relevance of social media content generation. |
Enhancing Mental Health Condition Detection on Social Media through Multitask Learning [50] | [50] | Mental Health Detection | BERT, Multitask Learning | Reddit, SWMH, PsySym | Multitask learning was used to enhance mental health condition detection, outperforming single-task and large language models. |
Multitask Learning for Recognizing Stress and Depression in Social Media [51] | [51] | Stress and Depression Detection | BERT, Attention Fusion | Reddit, Stress, and Depression Datasets | Introduced multitask learning frameworks for recognizing stress and depression, outperforming existing methods. |
Multitask Learning for Personal Health Mention Detection on Social Media [52] | [52] | Health Mention Detection | Multitask Learning | Annotated Social Media Data | Enhanced personal health mention detection by incorporating emotional information as an auxiliary task in a multitask learning framework. |
Attribute | Definition |
---|---|
tweet_id | Unique identifier of a tweet, as provided by Twitter/X. |
username | Name of the Twitter account that posted the tweet. |
text | Full body text of the tweet, including all hashtags. |
created_at | Time the tweet was posted. |
in_reply_to_tweet_id | If the tweet is a reply, the ID of the tweet it replied to; otherwise, the value is blank. |
retweeted_tweet_id | If the tweet is a retweet, the ID of the retweeted tweet; otherwise, the value is blank. |
quoted_tweet_id | If the tweet is a quote, this shows the quoted tweet’s ID; otherwise, the value is blank. |
favorite_count | The number of favorites this tweet received at the time of collection. |
location | Location from which the tweet was posted; can be blank. |
“““ Given this tweet below: <tweet> {text} </tweet> I want you to extract information for the following fields: - Road Accident: Yes or No - Severity: mild, moderate, fatal - Driver: - Driver apprehended: - No of injured: - No of deaths: - Location of accident: - Contributing factor: - type of car involved: - crash event type: - driver error: - collision type: - case scenario: - sentiment: (positive, negative, or neutral) - emotions: (pick only one of these: fear, anger, sadness, happy, neutral, disgust, love, confusion, gratitude, sympathy or empathy) Provide your answer in a JSON dictionary. RULES - You must provide a value to each field above. - if the tweet does not contain enough information to answer any of the above field, you must set that field to the value “None.” Your response:”““ |
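The labeling prompt above asks for a JSON dictionary with a value for every field and the string "None" where the tweet gives no information. A minimal sketch of how such a prompt could be filled in and its response validated is shown here; the function names `build_prompt` and `parse_response` are hypothetical, the prompt wording is abbreviated, and no call to any model API is made.

```python
import json

# Field names copied from the labeling prompt above.
FIELDS = [
    "Road Accident", "Severity", "Driver", "Driver apprehended",
    "No of injured", "No of deaths", "Location of accident",
    "Contributing factor", "type of car involved", "crash event type",
    "driver error", "collision type", "case scenario", "sentiment",
    "emotions",
]


def build_prompt(text: str) -> str:
    """Insert one tweet into an abbreviated version of the labeling prompt."""
    field_lines = "\n".join(f"- {name}:" for name in FIELDS)
    return (
        f"Given this tweet below: <tweet> {text} </tweet>\n"
        "I want you to extract information for the following fields:\n"
        f"{field_lines}\n"
        "Provide your answer in a JSON dictionary.\n"
        "Your response:"
    )


def parse_response(raw: str) -> dict:
    """Parse a JSON reply; any missing or unparseable field becomes 'None'."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        data = {}
    if not isinstance(data, dict):
        data = {}
    # Match keys case-insensitively (an assumption about reply variability).
    lowered = {str(key).strip().lower(): value for key, value in data.items()}
    return {name: str(lowered.get(name.lower(), "None")) for name in FIELDS}


labels = parse_response('{"Road Accident": "Yes", "severity": "fatal"}')
```

Filling absent fields with "None" mirrors the rule stated in the prompt, so downstream standardization always sees all fifteen keys.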
Feature | GPT Label | Standardized Value |
---|---|---|
No of Deaths | “1 (pregnant mate)” | “1” |
No of Deaths | “at least 12” | “≥12” |
No of Deaths | “dependent on scenario” | “unknown” |
No of Deaths | “one million Americans” | “unknown” |
Driver Apprehended | “one male detained, second tracked by pd falco” | “yes” |
Driver Apprehended | “varies” | “unknown” |
Driver Apprehended | “no information” | “unknown” |
Driver Apprehended | “not applicable” | “no” |
Crash Event Type | “hit my parked car” | “single-vehicle crashes” |
Crash Event Type | “reckless joyride and crashing into a house” | “vehicle crashing into a building” |
Crash Event Type | “road accidents” | “traffic accidents” |
Crash Event Type | “roll-over” | “vehicle rollover” |
Collision Type | “animal-vehicle” | “single-vehicle crashes” |
Collision Type | “back collision” | “backing collision” |
Collision Type | “broadside collision” | “broadside collision” |
Collision Type | “hit and run accidents” | “hit and run accidents” |
Severity | “catastrophic” | “fatal” |
Severity | “mild to moderate” | “moderate” |
Severity | “serious non-life threatening” | “moderate” |
Severity | “tragic” | “fatal” |
Driver Error | “abandoning critically injured passenger” | “failure to render aid” |
Driver Error | “attempted hit-and-run” | “hit-and-run attempt” |
Driver Error | “confusion” | “not applicable” |
Driver Error | “dangerous driving, impaired, impaired 80+” | “impaired driving” |
Sentiment | “confusion” | “negative” |
Sentiment | “shocked” | “negative” |
Sentiment | “unknown” | “neutral” |
Emotions | “amusement” | “happy” |
Emotions | “annoyed” | “angry” |
Emotions | “anxious” | “fear” |
Contributing Factor | “alleged shooting” | “not applicable” |
Contributing Factor | “allegedly caused a sickening three-car crash” | “possible contributing factor” |
Contributing Factor | “bad interview” | “not applicable” |
Contributing Factor | “banana peel thrown onto road” | “possible contributing factor” |
Type of Car Involved | “conservation police officer’s vehicle and another vehicle” | “police officer’s vehicle, another vehicle” |
Type of Car Involved | “grey, 2 door car with dark tinted windows, loud exhaust, and black hardtop” | “grey, 2 door car with dark tinted window and black hardtop” |
Type of Car Involved | “unidentified” | “unidentified” |
No of Injured | “1 (deputy)” | “1” |
No of Injured | “at least 5” | “≥5” |
No of Injured | “dozens” | “12+” |
No of Injured | “none” | “0” |
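The count rows in the table above follow a few regular patterns, which the sketch below encodes as rules. It is only an illustration: the function `standardize_count` is hypothetical, its rules are inferred from the example rows, and the source does not say whether the standardization was rule-based or manual.

```python
import re


def standardize_count(label: str) -> str:
    """Map a free-text count label to a standardized value.

    The rules are inferred from the example rows in the table above.
    """
    text = label.strip().lower()
    if text == "none":
        return "0"
    if text == "dozens":
        return "12+"
    # "at least N" becomes a lower bound.
    bound = re.fullmatch(r"at least (\d+)", text)
    if bound:
        return f"≥{bound.group(1)}"
    # A leading integer followed by extra detail, e.g. "1 (deputy)".
    leading = re.match(r"(\d+)\b", text)
    if leading:
        return leading.group(1)
    # Anything else is treated as not recoverable.
    return "unknown"


examples = ["1 (deputy)", "at least 5", "dozens", "none", "one million Americans"]
standardized = [standardize_count(label) for label in examples]
```

Categorical features such as severity or sentiment would more naturally use a lookup table of observed labels, since their mappings are not pattern-based.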
Serial Number | Tweet | API Filter Classification | GPT-2 Model Classification | Context |
---|---|---|---|---|
1 | R.T. @9NewsGoldCoast: The Maroons will travel from the Gold Coast to Western Australia today aiming to execute a hit-and-run mission in the west to regain the Origin Shield. #9News https://t.co/qfEJem4RJO (accessed on 7 November 2023) | Crash-related | Non-crash-related | The tweet uses “hit-and-run” in the context of a sports mission, not a traffic crash. |
2 | If she’s still hung over some guy whether for good or for bad my guy flee the scene. No try fix wetin you no spoil (accessed on 7 November 2023) | Crash-related | Non-crash-related | The phrase “flee the scene” is used in a relationship context, not related to a traffic incident. |
3 | Monday—Mixed 4’s: Slap That Ace beat Hit n Run (23–15) https://t.co/n0PAcPwi5f (accessed on 7 November 2023) | Crash-related | Non-crash-related | The term “Hit n Run” is part of a sports score update, unrelated to any traffic crash. |
4 | The best interview I have seen Laura do, and I’ll admit I’m not always a fan. A car crash of an interview for the PM. Gave me absolutely no faith in sorting out the country’s issues. @bbclaurak (accessed on 7 November 2023) | Crash-related | Non-crash-related | “Car crash” is used metaphorically to describe a disastrous interview, not an actual traffic incident. |
Statistic | Value |
---|---|
Number of samples | 26,226 |
Number of samples after preprocessing | 19,834 |
No. of features generated for each tweet | 15 |
Total number of samples | 19,834 × 15 = 297,510 |
Training samples | 287,010 |
Testing samples | 10,500 |
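The sample counts in this table follow from expanding every preprocessed tweet into one sample per feature. The short check below reproduces the arithmetic; the 700-tweet test split is taken from the distribution tables, and the variable names are illustrative.

```python
# Values taken from the statistics and distribution tables.
n_tweets_after_preprocessing = 19_834
n_features_per_tweet = 15
n_test_tweets = 700

# Each tweet yields one sample per feature.
total_samples = n_tweets_after_preprocessing * n_features_per_tweet
test_samples = n_test_tweets * n_features_per_tweet
train_samples = total_samples - test_samples

print(total_samples, train_samples, test_samples)
```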
Was There a Road Traffic Accident? | ||||||||
Dataset | Yes | No | Unknown | Total | ||||
Training Set | 16,702 | 2431 | 1 | 19,134 | ||||
Testing Set | 605 | 95 | 0 | 700 | ||||
Total | 17,307 | 2526 | 1 | 19,834 | ||||
Severity: How bad was the road traffic accident? | ||||||||
Dataset | Fatal | Mild | Moderate | None | Severe | Critical | Unknown | Total |
Training Set | 6824 | 6736 | 2302 | 2269 | 619 | 287 | 97 | 19,134 |
Testing Set | 277 | 228 | 83 | 86 | 13 | 9 | 4 | 700 |
Total | 7101 | 6964 | 2385 | 2355 | 632 | 296 | 101 | 19,834 |
Was the culprit driver identified? | ||||||||
Dataset | Yes | No | Unknown | Reportedly | Some | Total | ||
Training Set | 2339 | 15,682 | 1106 | 6 | 1 | 19,134 | ||
Testing Set | 102 | 559 | 39 | 0 | 0 | 700 | ||
Total | 2441 | 16,241 | 1145 | 6 | 1 | 19,834 | ||
What is the collision type? | ||||||||
Dataset | Training Set | Testing Set | Total | |||||
Hit-And-Run | 3049 | 110 | 3159 | |||||
Pedestrian | 1486 | 46 | 1532 | |||||
Broadside | 390 | 16 | 406 | |||||
Single-Vehicle Crashes | 179 | 6 | 185 | |||||
Stationary Object | 133 | 9 | 142 | |||||
Chain Reaction | 127 | 6 | 133 | |||||
Rear-End | 101 | 0 | 101 | |||||
Type of Car | 103 | 4 | 107 | |||||
Rollover | 32 | 0 | 32 | |||||
Intersection | 10 | 0 | 10 | |||||
Sideswipe | 3 | 0 | 3 | |||||
Side-Impact | 2 | 0 | 2 | |||||
Backing | 2 | 0 | 2 | |||||
Not Applicable | 13,517 | 503 | 14,020 | |||||
Total | 19,134 | 700 | 19,834 | |||||
What is the sentiment? | ||||||||
Dataset | Negative | Positive | Neutral | Total | ||||
Training Set | 13,348 | 2388 | 3398 | 19,134 | ||||
Testing Set | 490 | 83 | 127 | 700 | ||||
Total | 13,838 | 2471 | 3525 | 19,834 | ||||
What is the emotion? | ||||||||
Dataset | Training Set | Testing Set | Total | |||||
Sadness | 6788 | 249 | 7037 | |||||
Anger | 4980 | 183 | 5163 | |||||
Fear | 3481 | 112 | 3593 | |||||
Neutral | 2187 | 87 | 2274 | |||||
Happy | 922 | 36 | 958 | |||||
Disgust | 309 | 12 | 321 | |||||
Gratitude | 207 | 11 | 218 | |||||
Confusion | 202 | 5 | 207 | |||||
Sympathy | 30 | 2 | 32 | |||||
Love | 24 | 1 | 25 | |||||
Curiosity | 3 | 1 | 4 | |||||
Empathy | 1 | 1 | 2 | |||||
Total | 19,134 | 700 | 19,834 |
No. | Feature | Task |
---|---|---|
1 | Was there a road traffic accident? | Classification |
2 | How bad was the road traffic accident? | Classification |
3 | Was the driver identified? | Classification |
4 | What is the collision type? | Classification |
5 | What is the sentiment? | Classification |
6 | What is the emotion? | Classification |
7 | Who is the driver? | Information Retrieval (IR) |
8 | How many people were injured? | Information Retrieval (IR) |
9 | How many people died? | Information Retrieval (IR) |
10 | What is the location of the accident? | Information Retrieval (IR) |
11 | What is the contributing factor to the accident? | Information Retrieval (IR) |
12 | What is the car type involved? | Information Retrieval (IR) |
13 | What is the crash event type? | Information Retrieval (IR) |
14 | What was the driver error? | Information Retrieval (IR) |
15 | What was the scenario in this narrative? | Information Retrieval (IR) |
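One way to pose all fifteen features to a single language model is to append each question to the tweet as a prompt, so that every tweet yields fifteen text samples. The sketch below assumes a `<tweet> <prompt> <question>: <answer>` template, inferred from the `<prompt>` marker visible in one example row of the qualitative results table; the exact template and the function name `build_samples` are assumptions.

```python
# The fifteen questions listed in the feature table above.
QUESTIONS = [
    "Was there a road traffic accident?",
    "How bad was the road traffic accident?",
    "Was the driver identified?",
    "What is the collision type?",
    "What is the sentiment?",
    "What is the emotion?",
    "Who is the driver?",
    "How many people were injured?",
    "How many people died?",
    "What is the location of the accident?",
    "What is the contributing factor to the accident?",
    "What is the car type involved?",
    "What is the crash event type?",
    "What was the driver error?",
    "What was the scenario in this narrative?",
]


def build_samples(tweet: str, answers: dict) -> list:
    """Expand one tweet into one prompt-formatted training string per question."""
    samples = []
    for question in QUESTIONS:
        answer = answers.get(question, "unknown")
        # Assumed template: tweet, prompt marker, lower-cased question, answer.
        samples.append(f"{tweet} <prompt> {question.lower()}: {answer}")
    return samples


samples = build_samples(
    "Car hit a pole",
    {"Was there a road traffic accident?": "yes"},
)
```

Under this framing, classification and information retrieval differ only in whether the expected answer comes from a closed label set or is free text.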
Hyperparameter | Value |
---|---|
Model Name | GPT-2 Medium |
Max Length | 256 tokens |
Batch Size | 6 |
Learning Rate | 5 × 10⁻⁵ |
Model/Tokenizer Name | GPT2-medium |
Weight Decay | 0.0 |
Number of Epochs | 12 |
Optimizer | AdamW |
Gradient Accumulation Steps | 128 |
Warmup Steps | 100 |
Evaluation Strategy | Steps |
Save Steps | 32 |
Eval Steps | 32 |
Logging Steps | 4 |
Output Directory | “/path/to/output_dir” |
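The hyperparameters above can be collected in a small configuration object, which also makes the effective batch size explicit. This is a sketch: the class `FineTuneConfig` and its field names are illustrative rather than the authors' code, and the effective batch size assumes training on a single device.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters from the table above (field names are illustrative)."""

    model_name: str = "gpt2-medium"
    max_length: int = 256  # tokens
    batch_size: int = 6
    learning_rate: float = 5e-5
    weight_decay: float = 0.0
    num_epochs: int = 12
    optimizer: str = "adamw"
    gradient_accumulation_steps: int = 128
    warmup_steps: int = 100
    save_steps: int = 32
    eval_steps: int = 32
    logging_steps: int = 4
    output_dir: str = "/path/to/output_dir"

    @property
    def effective_batch_size(self) -> int:
        # Samples contributing to one optimizer update on a single device.
        return self.batch_size * self.gradient_accumulation_steps


cfg = FineTuneConfig()
print(cfg.effective_batch_size)
```

With a batch size of 6 and 128 accumulation steps, each optimizer update would see 768 samples, which may explain why saving and evaluation are scheduled every 32 steps.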
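Feature | Class Value | Precision | Recall | F1-Score |
---|---|---|---|---|
Was there a road traffic accident? | Yes | 0.825 | 0.842 | 0.833 |
Was there a road traffic accident? | No | 0.975 | 0.972 | 0.974 |
What is the severity of the road traffic accident? | Fatal | 0.933 | 0.953 | 0.943 |
What is the severity of the road traffic accident? | Moderate | 0.638 | 0.530 | 0.579 |
What is the severity of the road traffic accident? | Critical | 1.000 | 0.778 | 0.875 |
What is the severity of the road traffic accident? | None | 0.877 | 0.744 | 0.805 |
What is the severity of the road traffic accident? | Mild | 0.754 | 0.833 | 0.792 |
What is the severity of the road traffic accident? | Unknown | 0.000 | 0.000 | 0.000 |
What is the severity of the road traffic accident? | Severe | 0.812 | 1.000 | 0.897 |
What is the collision type? | Single vehicle | 0.400 | 0.333 | 0.364 |
What is the collision type? | Multiple Vehicle | 0.000 | 0.000 | 0.000 |
What is the collision type? | Broadside | 0.647 | 0.688 | 0.667 |
What is the collision type? | Chain reaction | 0.000 | 0.000 | 0.000 |
What is the collision type? | Hit-and-run | 0.589 | 0.509 | 0.546 |
What is the collision type? | Stationary object | 0.500 | 0.333 | 0.400 |
What is the collision type? | Pedestrian | 0.342 | 0.283 | 0.310 |
What is the collision type? | Not applicable | 0.852 | 0.913 | 0.881 |
What is the sentiment? | Negative | 0.901 | 0.967 | 0.933 |
What is the sentiment? | Positive | 0.887 | 0.759 | 0.818 |
What is the emotion? | Sympathy | 0.000 | 0.000 | 0.000 |
What is the emotion? | Gratitude | 0.750 | 0.545 | 0.632 |
What is the emotion? | Love | 1.000 | 1.000 | 1.000 |
What is the emotion? | Disgust | 0.500 | 0.167 | 0.250 |
What is the emotion? | Anger | 0.723 | 0.814 | 0.766 |
What is the emotion? | Neutral | 0.643 | 0.517 | 0.573 |
What is the emotion? | Confusion | 0.000 | 0.000 | 0.000 |
What is the emotion? | Curiosity | 0.000 | 0.000 | 0.000 |
What is the emotion? | Sadness | 0.784 | 0.892 | 0.835 |
What is the emotion? | Empathy | 0.000 | 0.000 | 0.000 |
What is the emotion? | Happy | 0.698 | 0.833 | 0.759 |
What is the emotion? | Fear | 0.765 | 0.580 | 0.660 |
Was the driver identified? | Yes | 0.958 | 0.902 | 0.929 |
Was the driver identified? | No | 0.936 | 0.971 | 0.953 |
Was the driver identified? | Unknown | 0.500 | 0.308 | 0.381 |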
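The F1-score in each row is the harmonic mean of that row's precision and recall, and the per-class scores can be averaged into one figure per task. The sketch below checks one row and computes the unweighted (macro) mean of the emotion F1-scores. That mean agrees with the 0.456 reported for GPT-2 on the emotion task in the model comparison table, which suggests, although the text here does not state it, that the comparison table uses macro averages.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def macro_average(values: list) -> float:
    """Unweighted mean over classes, regardless of class frequency."""
    return sum(values) / len(values)


# Per-class F1-scores for "What is the emotion?" from the table above.
EMOTION_F1 = {
    "Sympathy": 0.000, "Gratitude": 0.632, "Love": 1.000, "Disgust": 0.250,
    "Anger": 0.766, "Neutral": 0.573, "Confusion": 0.000, "Curiosity": 0.000,
    "Sadness": 0.835, "Empathy": 0.000, "Happy": 0.759, "Fear": 0.660,
}

fatal_f1 = f1_score(0.933, 0.953)  # "Fatal" severity row
emotion_macro_f1 = macro_average(list(EMOTION_F1.values()))

print(round(fatal_f1, 3), round(emotion_macro_f1, 3))
```

Because macro averaging weights every class equally, classes with one or two test tweets (such as Love or Empathy) move the task-level score as much as Sadness does with 249.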
Feature | BLEU-4 | ROUGE-I | WER |
---|---|---|---|
How many people were injured? | 0.15 | 0.85 | 0.15 |
How many people died? | 0.15 | 0.87 | 0.13 |
What was the location? | 0.26 | 0.80 | 0.27 |
What are the contributing factors? | 0.16 | 0.78 | 0.26 |
What car was involved? | 0.15 | 0.80 | 0.22 |
What is the crash event type? | 0.47 | 0.81 | 0.28 |
What was the case scenario? | 0.15 | 0.58 | 0.75 |
What was the driver error? | 0.23 | 0.75 | 0.37 |
Was the culprit driver identified? | 0.21 | 0.80 | 0.28 |
Average metric performance across all features | 0.22 | 0.78 | 0.30 |
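Word error rate (WER) is the word-level edit distance between a prediction and its reference, divided by the number of reference words. The sketch below is a plain dynamic-programming implementation, not necessarily the tooling used in the study; the example strings are taken from the location rows of the qualitative results table.

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance / number of reference words."""
    ref_words = reference.split()
    hyp_words = hypothesis.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix so far and hyp[:j].
    prev = list(range(len(hyp_words) + 1))
    for i, ref_word in enumerate(ref_words, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp_words, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref_words)


exact = wer("melbourne", "melbourne")
partial = wer(
    "hampton park in front of kilberry valley primary school",
    "hampton park",
)
verbose = wer("unknown", "no location given")
```

Because insertions are counted against the reference length, a prediction much longer than its reference yields a WER above 1, which is how averages such as 2.0715 can arise for verbose outputs.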
Tweets for Each Feature | GPT-3.5 Labels | Model Prediction |
---|---|---|
What is the collision type? | ||
Six people were hurt, including four pedestrians, in a serious hit-and-run crash in River North, Chicago police said. | hit-and-run | pedestrian |
R.T. @DerrickRBickley: Like a “tough, all-action thriller” and would love a “book you can’t put down” THE HIT-AND-RUN MAN at or your favourite digital store and PAPERBACK HARDBACK AUDIOBOOK plus other formats | hit-and-run | not applicable |
Woman dies from injuries two weeks after hit and run in Darwin’s CBD | pedestrian | hit-and-run |
@yourallon Hit and run mission, we were there for about 5 min. | hit-and-run | not applicable |
How many people were injured? | ||
Harrison Grey in an induced coma after Bees Creek Road, Darwin hit and run | 1 | 1 |
Three people and a dog have avoided serious injury after a multi-car crash in Oxenford. It happened at the intersection of Hope Island Road and the Pacific Highway exit. | 3 | 3 |
R.T. @DerrickRBickley: Fancy a reader-acclaimed FIVE/FOUR star but no E-reader? THE HIT AND RUN MAN is available in three print formats: PAPERBACK HARDBACK (Amzn) (B&N) LARGE PRINT | unknown | 0 |
R.T. @9NewsSyd: Three teenage boys on their way home from school have been hit by a car at a Sydney pedestrian crossing. The driver leaving the scene without stopping to help. @LizzyBryan1 | 3 | 3 |
What were the contributing factors? | ||
There are reports a motorcycle was seen leaving the scene in Gnangara after the shooting. | shooting | shooting |
R.T. @mlmartens: @MissionRCMP It was a hit and run. Why are excusing the behaviour of the driver? | driver error | hit-and-run |
FEED: BUS @TransLinkSEQ: Buses travelling along Waterworks Rd, The Gap are delayed up to 15 min due to a traffic accident. | not applicable | not applicable |
2 Alberta students dead, 3 in serious condition after single-vehicle car crash <prompt> what is the contributing factor to the accident?: | unknown | unknown |
What is the location? | ||
A witness to a fatal hit-and-run crash in Melbourne in which a woman was dragged alongside a moving car, has told how neighbours desperately tried to save the woman. | melbourne | melbourne |
R.T. @MaggieRaworth: A man has been killed after a hit and run incident in Hampton Park this morning. Police have created a crime scene in front of Kilberry Valley Primary School. @9NewsMelb | hampton park in front of kilberry valley primary school | hampton park |
R.T. @manny745: A Walk to Revenge A drizzly Manchester night, a hit and run. A chance meeting decades later leads ultimately to the exposure of a transatlantic crime family. | manchester | manchester |
@sultan_sydney hit and run sultsssssss | sydney | unknown |
What is the case scenario? | ||
A car tied to two separate hit-and-run incidents in Lexington, including one involving a child, has been found and the driver questioned, according to Lexington Police. | child involved in one incident | hit-and-run incident involving a child |
An elderly man who suffered critical injuries in an alleged hit-and-run in Sydney has undergone a number of surgeries, as police share footage of the vehicle they believe is involved. | elderly man suffered critical injuries | elderly man undergoing surgery |
Bicyclist killed in Bronx hit and run; driver fled in another car with baby. | hit-and-run resulting in fatality | hit-and-run resulting in fatality |
The family of a woman allegedly run down in a hit and run on a suburban street while walking her dog has made a heartbreaking plea for her recovery from critical injuries. | woman walking her dog hit by a car | hit-and-run while walking her dog |
What is the driver error? | ||
@etsysofttotouch @BandB_CBS And covered up her drunken hit and run of Darla and slept with her daughter’s love Rick! | driving under the influence | driving under the influence |
Child in serious condition after being struck by hit-and-run driver evading Queensland police | fleeing from law enforcement | evading police |
@laurarichards99 @jessradio I hope you are sitting—you may start to feel the rage after reading this. It is actually based around a hit and run case in Oz but somehow it has become this, a petition about changing the law against women? | reckless driving | not applicable |
Judge Raoul Neave strikes again. A decade after going easy on hit-and-run investment banker Guy Hallwright, he’s done it again to a recidivist drink driver who ran a red light, drunk, killed an innocent driver then fled the scene. | running red light | driving under the influence |
What is the crash event type? | ||
Police have seized a car used in a suspected hit and run that killed a father of four in Fairfield yesterday morning. | hit-and-run | hit-and-run |
@BLUEfingers2021 Only if you’re in a new Jag. Also, it’s obligatory to drive into the side of a house and repeatedly attempt to leave the scene. | single-vehicle | not applicable |
1 driver taken to hospital following 3-vehicle collision in downtown Cedar City | chain reaction car accidents | chain reaction car accidents |
TRAFFIC LIGHTS WENT DOWN, IMPAIRED CHARGES LAID after single vehicle collision 3:20 am Sept 18 at Main & Dundas St, Cambridge. Vehicle struck pole, traffic lights fell. Cambridge man 23 charged w dangerous driving, impaired, impaired 80+. Intersection reopened after repairs. | single-vehicle crashes | single-vehicle crashes |
Classification Task | Model | Accuracy | Precision | Recall | F1-Score |
---|---|---|---|---|---|
Is there a road traffic accident? | XGBoost | 0.950 | 0.920 | 0.850 | 0.880 |
Is there a road traffic accident? | GPT-2 | 0.954 | 0.900 | 0.907 | 0.903 |
Is there a road traffic accident? | GPT-4 | 0.856 | 0.741 | 0.912 | 0.780 |
Was the driver identified? | XGBoost | 0.930 | 0.870 | 0.700 | 0.750 |
Was the driver identified? | GPT-2 | 0.924 | 0.798 | 0.727 | 0.755 |
Was the driver identified? | GPT-4 | 0.610 | 0.315 | 0.303 | 0.307 |
What is the severity of the accident? | XGBoost | 0.820 | 0.780 | 0.730 | 0.740 |
What is the severity of the accident? | GPT-2 | 0.831 | 0.716 | 0.691 | 0.699 |
What is the severity of the accident? | GPT-4 | 0.476 | 0.469 | 0.551 | 0.380 |
What is the emotion in this tweet? | XGBoost | 0.680 | 0.420 | 0.340 | 0.360 |
What is the emotion in this tweet? | GPT-2 | 0.743 | 0.489 | 0.446 | 0.456 |
What is the emotion in this tweet? | GPT-4 | 0.496 | 0.372 | 0.389 | 0.308 |
What is the collision type? | XGBoost | 0.770 | 0.530 | 0.380 | 0.430 |
What is the collision type? | GPT-2 | 0.777 | 0.370 | 0.340 | 0.352 |
What is the collision type? | GPT-4 | 0.533 | 0.243 | 0.263 | 0.203 |
What is the sentiment in the tweet? | XGBoost | 0.860 | 0.830 | 0.720 | 0.770 |
What is the sentiment in the tweet? | GPT-2 | 0.879 | 0.849 | 0.780 | 0.810 |
What is the sentiment in the tweet? | GPT-4 | 0.861 | 0.786 | 0.811 | 0.797 |
Feature | GPT-2 BLEU-4 | GPT-4 BLEU | GPT-2 ROUGE-I | GPT-4 ROUGE-L | GPT-2 WER | GPT-4 WER |
---|---|---|---|---|---|---|
How many people were injured? | 0.15 | 0.0203 | 0.85 | 0.1143 | 0.15 | 0.8857 |
How many people died? | 0.15 | 0.1331 | 0.87 | 0.7486 | 0.13 | 0.2514 |
What was the location? | 0.26 | 0.2105 | 0.80 | 0.6644 | 0.27 | 0.3506 |
What are the contributing factors? | 0.16 | 0.0041 | 0.78 | 0.0162 | 0.26 | 1.7179 |
What car was involved? | 0.15 | 0.1164 | 0.80 | 0.6324 | 0.22 | 0.3689 |
What is the crash event type? | 0.47 | 0.0530 | 0.81 | 0.1146 | 0.28 | 0.9752 |
What was the case scenario? | 0.15 | 0.0144 | 0.58 | 0.0976 | 0.75 | 11.9861 |
What was the driver error? | 0.23 | 0.0017 | 0.75 | 0.0081 | 0.37 | 1.2303 |
Who was the culprit? | 0.21 | 0.0529 | 0.80 | 0.2964 | 0.28 | 0.8771 |
Average Metric Performance | 0.22 | 0.0674 | 0.78 | 0.2992 | 0.30 | 2.0715 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Jaradat, S.; Nayak, R.; Paz, A.; Ashqar, H.I.; Elhenawy, M. Multitask Learning for Crash Analysis: A Fine-Tuned LLM Framework Using Twitter Data. Smart Cities 2024, 7, 2422-2465. https://doi.org/10.3390/smartcities7050095