Social Media Analytics for Disaster Response: Classification and Geospatial Visualization Framework
Abstract
1. Introduction
- We developed and trained a high-performance deep learning model for the accurate classification of diverse disaster-related content from social media data.
- We designed and implemented an interactive web-based platform that integrates user geolocation with classified social media posts, supporting emergency responders in monitoring and assessing crisis situations more effectively.
2. Related Works
2.1. Existing Dataset
- SWDM2013 [5] includes two distinct collections: the Joplin set, which contains 4400 labeled tweets from the May 2011 tornado that struck Joplin, Missouri, and the Sandy set, featuring 2000 tweets from Hurricane Sandy in October 2012. These datasets provide valuable insights into the social media discourse during two significant disaster events and serve as foundational data for early research in disaster-related sentiment analysis and classification.
- The Disaster Response Data resource [4] contains 30,000 tweets categorized into 36 classes, collected from several disasters, including the 2010 earthquakes in Haiti and Chile, the 2010 floods in Pakistan, and Hurricane Sandy in the United States in 2012. Because it spans diverse disaster contexts, this dataset is particularly valuable for training more generalized crisis classification systems.
- CrisisNLP [6] is another comprehensive dataset featuring 50,000 human-annotated tweets collected from 19 disaster events between 2013 and 2015. These tweets are categorized using multiple labeling schemes, with classes designed for humanitarian disaster response and health-related emergencies. The diversity of annotations allows researchers to explore various aspects of disaster response, including needs identification, resource allocation, and medical emergencies.
- CrisisMMD [7] adopts a multimodal and multitask approach, consisting of 18,000 labeled tweets accompanied by associated images. This dataset supports both text- and image-based classification, enabling researchers to develop models that analyze not only textual information but also visual content. This capability is particularly useful for tasks like assessing infrastructure damage, visualizing rescue operations, and understanding the broader impact of disasters.
- TREC Incident Streams [8], developed for the TREC-IS 2018 evaluation challenge, consists of 19,784 tweets labeled for identifying actionable information and assessing the criticality of information. This dataset is specifically designed for enhancing decision-making during disasters by helping emergency responders distinguish between high-priority and low-priority information.
- Eyewitness Messages [9] focuses on analyzing eyewitness accounts on Twitter. It classifies 14,000 tweets into four distinct categories across events such as floods, earthquakes, fires, and hurricanes. This dataset is particularly valuable for understanding firsthand experiences during disasters, which can provide crucial, real-time information for emergency responders.
- Disaster Tweet Corpus 2020 [10] consolidates several existing datasets focused on classifying various disaster event types. This resource offers a unified platform for researchers to train models capable of recognizing different types of disaster-related tweets. Additionally, Alam et al. [11] integrate eight data sources, yielding a comprehensive collection of 166,100 tweets for informativeness classification and 141,500 tweets for humanitarian classification tasks. These consolidated resources enhance the ability to train robust models that generalize effectively across multiple disaster scenarios.
2.2. Text Classification Models
2.3. Disaster Management
2.4. Contributions of This Work
3. Methods
3.1. Dataset and Data Cleaning
3.1.1. Dataset
- Caution and advice: Notifications of issued or lifted warnings, along with guidance and safety tips related to the disaster.
- Sympathy and support: Posts expressing prayers, thoughts, and emotional support.
- Requests or urgent needs: Reports of urgent needs or requests for supplies, including food, water, clothing, money, medical supplies, or blood.
- Displaced people and evacuations: Reports of individuals who have relocated due to the crisis, including temporary evacuations.
- Injured or dead people: Reports of individuals who have been injured or killed because of the disaster.
- Missing or found people: Reports of individuals who are missing or have been found following the disaster event.
- Infrastructure and utility damage: Reports of damage to infrastructure, including buildings, houses, roads, bridges, power lines, communication poles, or vehicles.
- Rescue, volunteering, or donation efforts: Reports of rescue, volunteering, or donation efforts, including transportation to safe locations, evacuations, the distribution of medical aid or food, shelter provisions, and donations of money, goods, or services.
- Other relevant information: Tweets that do not fit into any of the above categories but still contain important information relevant to humanitarian aid.
- Not humanitarian: Tweets that do not contain information related to humanitarian aid.
- Don’t know or can’t judge: Tweets that are irrelevant or cannot be assessed, including those written in non-English languages.
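A classifier needs these category names as integer class ids. The following is a minimal sketch of one way to build that mapping. The identifiers `slugify`, `HUMANITARIAN_LABELS`, `LABEL2ID`, and `ID2LABEL` are hypothetical, and the id order simply follows the list above; it is an assumption, not the encoding used by the authors.

```python
import re

def slugify(name: str) -> str:
    """Turn a human-readable category name into a lowercase identifier."""
    # Drop straight and curly apostrophes, then collapse other punctuation and spaces.
    no_apostrophes = re.sub(r"['’]", "", name.lower())
    return re.sub(r"[^a-z0-9]+", "_", no_apostrophes).strip("_")

# Category names copied from the list above; the ordering is an assumption.
CATEGORY_NAMES = [
    "Caution and advice",
    "Sympathy and support",
    "Requests or urgent needs",
    "Displaced people and evacuations",
    "Injured or dead people",
    "Missing or found people",
    "Infrastructure and utility damage",
    "Rescue, volunteering, or donation efforts",
    "Other relevant information",
    "Not humanitarian",
    "Don’t know or can’t judge",
]

HUMANITARIAN_LABELS = [slugify(name) for name in CATEGORY_NAMES]
LABEL2ID = {label: idx for idx, label in enumerate(HUMANITARIAN_LABELS)}
ID2LABEL = {idx: label for label, idx in LABEL2ID.items()}
```

Individual events do not necessarily contain every category, so a per-event experiment would typically restrict this mapping to the labels that actually occur in that event's data.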
3.1.2. Data Cleaning
3.2. Models
- BERT: BERT is a transformer-based model designed to understand the context of words by simultaneously analyzing both their left and right surroundings. It is pre-trained on large corpora using two objectives: masked language modeling and next sentence prediction. This dual-training strategy allows BERT to capture complex relationships between words, making it highly effective for a wide range of natural language processing tasks, including sentiment analysis, question-answering, and text classification.
- RoBERTa: RoBERTa builds upon BERT’s foundation by refining the pretraining process. It eliminates the next sentence prediction task and instead trains with larger batches, more data, and longer schedules. These adjustments enhance the model’s ability to capture subtle linguistic patterns, leading to improved performance across various benchmarks and making it more robust for complex text classification tasks.
- DistilBERT: DistilBERT is a condensed version of BERT that prioritizes efficiency while retaining much of the original model’s performance. It uses a technique known as knowledge distillation, where a smaller model learns from a larger, pre-trained model. This results in a faster, lighter model that is particularly suitable for applications with limited computational resources, such as mobile devices or real-time processing systems.
- BigBird: BigBird extends the capabilities of traditional transformer models by introducing a sparse attention mechanism, allowing it to process much longer sequences without dramatically increasing computational costs. This innovation makes BigBird especially suitable for tasks that involve lengthy documents or sequences, such as document summarization, genome analysis, or processing large-scale social media streams during disasters.
- ALBERT: ALBERT enhances BERT’s efficiency by reducing the number of parameters through cross-layer parameter sharing and factorized embeddings. Despite having far fewer parameters, ALBERT achieves competitive results, helped by a sentence-order prediction objective that replaces next sentence prediction during pretraining. Its memory-efficient design makes it well suited for environments where computational resources are constrained while still delivering strong performance on various NLP tasks.
- ELECTRA: ELECTRA introduces a novel pretraining method that diverges from the traditional masked language modeling used in BERT. Instead, it uses a generator-discriminator framework, where a discriminator learns to differentiate between real tokens and those generated by a smaller model. This approach leads to more sample-efficient training, allowing ELECTRA to achieve high performance with reduced computational costs, making it an effective solution for large-scale text classification tasks.
- ModernBERT: ModernBERT is a recent redesign of the BERT-style encoder that brings newer architectural and training practices to the original recipe. It combines rotary positional embeddings and alternating local and global attention with efficiency techniques such as unpadding, and it supports input sequences of up to 8192 tokens. ModernBERT is designed to improve on earlier encoders in both accuracy and efficiency, making it well suited to complex text classification tasks in dynamic environments such as disaster response systems.
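As an illustration of how the seven encoders listed above could be set up for fine-tuning, the sketch below uses the Hugging Face `transformers` auto classes. The checkpoint identifiers in `CHECKPOINTS` and the helper name `load_classifier` are assumptions made for this example; the text above does not state which model variants or sizes were used.

```python
# Hub checkpoint ids are assumptions; the source does not name specific variants.
CHECKPOINTS = {
    "BERT": "bert-base-uncased",
    "RoBERTa": "roberta-base",
    "DistilBERT": "distilbert-base-uncased",
    "BigBird": "google/bigbird-roberta-base",
    "ALBERT": "albert-base-v2",
    "ELECTRA": "google/electra-base-discriminator",
    "ModernBERT": "answerdotai/ModernBERT-base",
}

def load_classifier(model_name: str, num_labels: int):
    """Return (tokenizer, model) with a freshly initialised classification head."""
    # Imported lazily so the registry can be inspected without transformers installed.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    checkpoint = CHECKPOINTS[model_name]
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForSequenceClassification.from_pretrained(
        checkpoint, num_labels=num_labels
    )
    return tokenizer, model
```

A call such as `load_classifier("RoBERTa", num_labels=9)` would download the pretrained weights and attach an untrained nine-way output layer, which then has to be fine-tuned on labeled tweets before its predictions are meaningful.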
3.3. Visualization
3.3.1. Web Server and User Interface (UI)
3.3.2. Model Deployment and Visualization
4. Results
4.1. Social Media Text Classification
4.1.1. Experiment Details
4.1.2. Metrics
- $\mathrm{F1}_i$ is the F1 score for class $i$.
- $w_i$ is the weight for class $i$, defined as the proportion of samples belonging to that class.
- $n$ is the total number of classes.
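Read together, these definitions describe a support-weighted average of per-class F1 scores, that is, the sum over all classes of the class weight times the class F1. The sketch below computes that quantity in plain Python; the function names `per_class_f1` and `weighted_f1` are hypothetical, and the code is an illustration of the metric rather than the evaluation script used for the reported results.

```python
from collections import Counter

def per_class_f1(y_true, y_pred, label):
    """F1 score for one class, treating it as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    # By convention, F1 is 0 when the class has no true or predicted samples.
    return 2 * tp / denom if denom else 0.0

def weighted_f1(y_true, y_pred):
    """Average per-class F1, weighting each class by its share of true samples."""
    support = Counter(y_true)
    total = len(y_true)
    return sum(
        (count / total) * per_class_f1(y_true, y_pred, label)
        for label, count in support.items()
    )

# Toy example: three samples of class "a" and one of class "b".
score = weighted_f1(["a", "a", "a", "b"], ["a", "a", "b", "b"])
```

In the toy example, class "a" has an F1 of 0.8 with weight 0.75 and class "b" has an F1 of 2/3 with weight 0.25, so the weighted score is about 0.767. Weighting by class share keeps rare categories from dominating the score, but it also means that poor performance on a rare class has little effect on it.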
4.1.3. Experimental Results
4.2. Case Study
5. Discussion
5.1. Classification and Data Source
5.2. System, Model and Response Improvement
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- [1] Li, J.; Stephens, K.K.; Zhu, Y.; Murthy, D. Using Social Media to Call for Help in Hurricane Harvey: Bonding Emotion, Culture, and Community Relationships. Int. J. Disaster Risk Reduct. 2019, 38, 101212.
- [2] Toraman, C.; Kucukkaya, I.E.; Ozcelik, O.; Sahin, U. Tweets Under the Rubble: Detection of Messages Calling for Help in Earthquake Disaster. arXiv 2023, arXiv:2302.13403.
- [3] Cantini, R.; Cosentino, C.; Marozzo, F.; Talia, D.; Trunfio, P. Harnessing Prompt-Based Large Language Models for Disaster Monitoring and Automated Reporting from Social Media Feedback. Online Soc. Netw. Media 2025, 45, 100295.
- [4] Alam, F.; Qazi, U.; Imran, M. HumAID: Human-Annotated Disaster Incidents Data from Twitter with Deep Learning Benchmarks. Proc. Int. AAAI Conf. Web Soc. Media 2021, 15, 933–942.
- [5] Imran, M.; Elbassuoni, S.; Castillo, C.; Diaz, F.; Meier, P. Practical Extraction of Disaster-Relevant Information from Social Media. In Proceedings of the 22nd International Conference on World Wide Web, New York, NY, USA, 13 May 2013.
- [6] Imran, M.; Mitra, P.; Castillo, C. Twitter as a Lifeline: Human-Annotated Twitter Corpora for NLP of Crisis-Related Messages. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC’16), Portorož, Slovenia, 1 May 2016.
- [7] Alam, F.; Ofli, F.; Imran, M. CrisisMMD: Multimodal Twitter Datasets from Natural Disasters. Proc. Int. AAAI Conf. Web Soc. Media 2018, 12, 100198.
- [8] McCreadie, R.; Buntain, C.; Soboroff, I. TREC Incident Streams: Finding Actionable Information on Social Media. In Proceedings of the 16th International Conference on Information Systems for Crisis Response and Management, Valencia, Spain, 19 May 2019.
- [9] Zahra, K.; Imran, M.; Ostermann, F.O. Automatic Identification of Eyewitness Messages on Twitter during Disasters. Inf. Process. Manag. 2020, 57, 102107.
- [10] Wiegmann, M.; Kersten, J.; Klan, F. Analysis of Detection Models for Disaster-Related Tweets. In Proceedings of the 17th Annual International Conference on Information Systems for Crisis Response and Management, Blacksburg, VA, USA, 19 May 2020.
- [11] Alam, F.; Sajjad, H.; Imran, M.; Ofli, F. CrisisBench: Benchmarking Crisis-Related Social Media Datasets for Humanitarian Information Processing. Proc. Int. AAAI Conf. Web Soc. Media 2021, 15, 923–932.
- [12] Dong, Z.S.; Meng, L.; Christenson, L.; Fulton, L. Social Media Information Sharing for Natural Disaster Response. Nat. Hazards 2021, 107, 2077–2104.
- [13] Burel, G.; Alani, H. Crisis Event Extraction Service (CREES)—Automatic Detection and Classification of Crisis-Related Content on Social Media. In Proceedings of the 15th International Conference on Information Systems for Crisis Response and Management, Rochester, NY, USA, 18 May 2018.
- [14] Nguyen, D.; Mannai, K.A.A.; Joty, S.; Sajjad, H.; Imran, M.; Mitra, P. Robust Classification of Crisis-Related Data on Social Networks Using Convolutional Neural Networks. Proc. Int. AAAI Conf. Web Soc. Media 2017, 11, 632–635.
- [15] Alam, F.; Ofli, F.; Imran, M.; Alam, T.; Qazi, U. Deep Learning Benchmarks and Datasets for Social Media Image Classification for Disaster Response. In Proceedings of the 2020 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM), The Hague, The Netherlands, 7–10 December 2020; pp. 151–158.
- [16] Li, Y.; Osei, F.B.; Hu, T.; Stein, A. Urban Flood Susceptibility Mapping Based on Social Media Data in Chengdu City, China. Sustain. Cities Soc. 2023, 88, 104307.
- [17] Otal, H.T.; Stern, E.; Canbaz, M.A. LLM-Assisted Crisis Management: Building Advanced LLM Platforms for Effective Emergency Response and Public Collaboration. In Proceedings of the 2024 IEEE Conference on Artificial Intelligence (CAI), Singapore, 11–14 June 2024; pp. 851–859.
- [18] Shah, S.A.; Seker, D.Z.; Hameed, S.; Draheim, D. The Rising Role of Big Data Analytics and IoT in Disaster Management: Recent Advances, Taxonomy and Prospects. IEEE Access 2019, 7, 54595–54614.
- [19] Bukar, U.A.; Sayeed, M.S.; Amodu, O.A.; Razak, S.F.A.; Yogarayan, S.; Othman, M. Leveraging VOSviewer Approach for Mapping, Visualisation, and Interpretation of Crisis Data for Disaster Management and Decision-Making. Int. J. Inf. Manag. Data Insights 2025, 5, 100314.
- [20] Devlin, J.; Chang, M.-W.; Lee, K.; Toutanova, K. BERT: Pre-Training of Deep Bidirectional Transformers for Language Understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, 2–7 June 2019.
- [21] Liu, Y.; Ott, M.; Goyal, N. RoBERTa: A Robustly Optimized BERT Pretraining Approach. arXiv 2019, arXiv:1907.11692.
- [22] Clark, K.; Luong, M.-T.; Le, Q.V.; Manning, C.D. ELECTRA: Pre-Training Text Encoders as Discriminators Rather Than Generators. arXiv 2020, arXiv:2003.10555.
- [23] Lan, Z.; Chen, M.; Goodman, S. ALBERT: A Lite BERT for Self-Supervised Learning of Language Representations. arXiv 2020, arXiv:1909.11942.
- [24] Sanh, V.; Debut, L.; Chaumond, J.; Wolf, T. DistilBERT, a Distilled Version of BERT: Smaller, Faster, Cheaper and Lighter. arXiv 2020, arXiv:1910.01108.
- [25] Zaheer, M.; Guruganesh, G.; Dubey, K.A. Big Bird: Transformers for Longer Sequences. Adv. Neural Inf. Process. Syst. 2020, 33, 17283–17297.
- [26] Warner, B.; Chaffin, A.; Clavié, B. Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference. arXiv 2024, arXiv:2412.13663.
- [27] Cajun Navy. Available online: https://en.wikipedia.org/wiki/Cajun_Navy (accessed on 8 March 2025).
- [28] Sloan, L.; Morgan, J. Who Tweets with Their Location? Understanding the Relationship between Demographic Characteristics and the Use of Geoservices and Geotagging on Twitter. PLoS ONE 2015, 10, e0142209.
Event Name | Caution and Advice | Displaced People and Evacuations | Infrastructure and Utility Damage | Injured or Dead People | Missing or Found People | Not Humanitarian | Don’t Know or Can’t Judge | Other Relevant Information | Requests or Urgent Needs | Rescue, Volunteering, or Donation Efforts | Sympathy and Support | Total |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Ecuador Earthquake | 30 | 3 | 70 | 555 | 10 | 23 | 18 | 81 | 91 | 394 | 319 | 1594 |
Canada Wildfires | 106 | 380 | 251 | 4 | - | 79 | 13 | 311 | 20 | 934 | 161 | 2259 |
Italy Earthquake | 10 | 3 | 54 | 174 | 7 | 9 | 10 | 52 | 30 | 312 | 579 | 1240 |
Kaikoura Earthquake | 493 | 87 | 312 | 105 | 3 | 224 | 19 | 311 | 24 | 207 | 432 | 2217 |
Hurricane Matthew | 36 | 38 | 178 | 224 | - | 76 | 5 | 328 | 53 | 326 | 395 | 1659 |
Sri Lanka Floods | 28 | 9 | 17 | 46 | 4 | 20 | 2 | 56 | 34 | 319 | 40 | 575 |
Hurricane Harvey | 541 | 688 | 1271 | 698 | 10 | 410 | 42 | 1767 | 333 | 2823 | 635 | 9164 |
Hurricane Irma | 613 | 755 | 1881 | 894 | 8 | 615 | 60 | 2358 | 126 | 1590 | 567 | 9467 |
Hurricane Maria | 220 | 131 | 1427 | 302 | 11 | 270 | 39 | 1568 | 711 | 1977 | 672 | 7328 |
Mexico Earthquake | 35 | 4 | 167 | 254 | 14 | 38 | 3 | 109 | 61 | 984 | 367 | 2036 |
Maryland Floods | 70 | 3 | 79 | 56 | 140 | 77 | 1 | 137 | 1 | 73 | 110 | 747 |
Greece Wildfires | 26 | 7 | 38 | 495 | 20 | 74 | 4 | 159 | 25 | 356 | 322 | 1526 |
Kerala Floods | 139 | 56 | 296 | 363 | 7 | 456 | 65 | 955 | 590 | 4294 | 835 | 8056 |
Hurricane Florence | 1310 | 637 | 320 | 297 | - | 1060 | 95 | 636 | 54 | 1478 | 472 | 6359 |
California Wildfires | 139 | 368 | 422 | 1946 | 179 | 1318 | 68 | 1038 | 79 | 1415 | 472 | 7444 |
Cyclone Idai | 89 | 57 | 354 | 433 | 19 | 80 | 11 | 407 | 143 | 1869 | 482 | 3944 |
Midwestern U.S. Floods | 79 | 8 | 140 | 14 | 1 | 389 | 27 | 273 | 46 | 788 | 165 | 1930 |
Hurricane Dorian | 1369 | 802 | 815 | 60 | 1 | 874 | 46 | 1444 | 179 | 987 | 1083 | 7660 |
Pakistan Earthquake | 71 | - | 125 | 401 | 1 | 213 | 32 | 154 | 19 | 152 | 823 | 1991 |
Total | 5404 | 4036 | 8163 | 7321 | 435 | 6305 | 560 | 12,144 | 2619 | 21,278 | 8931 | 77,196 |
Disaster | Class | BERT | RoBERTa | DistilBERT | BigBird | ALBERT | ELECTRA | ModernBERT |
---|---|---|---|---|---|---|---|---|
2016 Ecuador Earthquake | 8 | 0.823 | 0.868 | 0.853 | 0.859 | 0.832 | 0.870 | 0.900 |
2016 Canada Wildfires | 8 | 0.754 | 0.786 | 0.780 | 0.751 | 0.759 | 0.780 | 0.842 |
2016 Italy Earthquake | 6 | 0.823 | 0.872 | 0.851 | 0.843 | 0.818 | 0.867 | 0.900 |
2016 Kaikoura Earthquake | 9 | 0.731 | 0.741 | 0.732 | 0.736 | 0.730 | 0.752 | 0.746 |
2016 Hurricane Matthew | 9 | 0.751 | 0.775 | 0.768 | 0.759 | 0.750 | 0.767 | 0.796 |
2017 Sri Lanka Floods | 8 | 0.686 | 0.729 | 0.718 | 0.708 | 0.690 | 0.721 | 0.845 |
2017 Hurricane Harvey | 9 | 0.730 | 0.759 | 0.747 | 0.739 | 0.739 | 0.748 | 0.791 |
2017 Hurricane Irma | 9 | 0.705 | 0.717 | 0.719 | 0.704 | 0.712 | 0.719 | 0.733 |
2017 Hurricane Maria | 9 | 0.706 | 0.725 | 0.722 | 0.710 | 0.714 | 0.720 | 0.767 |
2017 Mexico Earthquake | 8 | 0.834 | 0.852 | 0.847 | 0.842 | 0.836 | 0.837 | 0.915 |
2018 Maryland Floods | 8 | 0.684 | 0.764 | 0.708 | 0.690 | 0.699 | 0.756 | 0.751 |
2018 Greece Wildfires | 9 | 0.702 | 0.742 | 0.735 | 0.711 | 0.715 | 0.732 | 0.836 |
2018 Kerala Floods | 9 | 0.723 | 0.741 | 0.728 | 0.736 | 0.733 | 0.739 | 0.821 |
2018 Hurricane Florence | 9 | 0.755 | 0.776 | 0.763 | 0.760 | 0.759 | 0.770 | 0.810 |
2018 California Wildfires | 10 | 0.741 | 0.768 | 0.757 | 0.758 | 0.749 | 0.756 | 0.804 |
2019 Cyclone Idai | 10 | 0.733 | 0.789 | 0.759 | 0.749 | 0.740 | 0.791 | 0.844 |
2019 Midwestern U.S. Floods | 7 | 0.719 | 0.766 | 0.750 | 0.744 | 0.720 | 0.755 | 0.802 |
2019 Hurricane Dorian | 9 | 0.674 | 0.689 | 0.680 | 0.678 | 0.674 | 0.677 | 0.698 |
2019 Pakistan Earthquake | 8 | 0.797 | 0.823 | 0.833 | 0.799 | 0.798 | 0.811 | 0.873 |
Average | - | 0.740 | 0.772 | 0.760 | 0.751 | 0.745 | 0.766 | 0.814 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
He, C.; Hu, D. Social Media Analytics for Disaster Response: Classification and Geospatial Visualization Framework. Appl. Sci. 2025, 15, 4330. https://doi.org/10.3390/app15084330