Utilizing Volunteered Geographic Information for Real-Time Analysis of Fire Hazards: Investigating the Potential of Twitter Data in Assessing the Impacted Areas
Abstract
1. Introduction
Related Work
- Location extraction from the posted text [12,13,14]: These studies extract names of places from the text of the messages. This task is known as named entity recognition (NER), a sub-category of natural language processing (NLP). A few studies [15] also use pattern recognition with regular expressions (RegEx) to extract specific location names. In the final step, the extracted place names are geocoded by geoparsing.
- Direct location extraction: Locations can be extracted directly from the metadata delivered with the text data when accessing VGI, e.g., via an application programming interface (API). Twitter, for example, delivers a JSON object with a coordinates field or a place field describing where the tweet was created [16]. However, users fill these fields voluntarily, so the information is rarely available. Studies suggest that coordinates are given in about 0.2% to 1.5% of posted tweets, whereas place is given in about 2% of tweets [17,18].
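Direct location extraction from tweet metadata can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes the field layout of a Twitter API v1.1 tweet object (`coordinates` as a GeoJSON point, `place` with a `full_name`), and the function name `extract_location` is hypothetical.

```python
import json


def extract_location(tweet_json):
    """Return ("coordinates", (lon, lat)), ("place", name), or None.

    Assumes a v1.1-style tweet object; both fields are optional and
    usually empty because users fill them voluntarily.
    """
    tweet = json.loads(tweet_json)

    coords = tweet.get("coordinates")
    if coords and coords.get("type") == "Point":
        lon, lat = coords["coordinates"]  # GeoJSON order: longitude, latitude
        return ("coordinates", (lon, lat))

    place = tweet.get("place")
    if place and place.get("full_name"):
        return ("place", place["full_name"])

    return None  # no usable location metadata


# Illustrative tweet objects (invented for this sketch)
with_coords = json.dumps(
    {"text": "smoke", "coordinates": {"type": "Point", "coordinates": [-118.0, 34.2]}}
)
with_place = json.dumps(
    {"text": "smoke", "coordinates": None, "place": {"full_name": "Monrovia, CA"}}
)
without_location = json.dumps({"text": "smoke", "coordinates": None, "place": None})
```

Preferring exact coordinates over the coarser place field is a design choice of this sketch; a place would still have to be geocoded before it can be mapped.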
2. Materials and Methods
2.1. Case Studies
2.2. Input Data
- Population dataset: The WorldPop population density dataset 2020 with a resolution of 30 arc-seconds (approximately 1 km at the equator) was used [40].
- Land cover dataset: The worldwide available product Copernicus Global Land Cover Layers (CGLS-LC100) Collection 3 of 2019 with a 100 m resolution was used [41].
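As a rough check on the stated resolutions, the ground size of a 30 arc-second cell can be estimated with a spherical approximation. The sketch below is illustrative only: the function name `cell_width_km` and the example latitude of 34° are assumptions and are not taken from the dataset documentation.

```python
import math

# Length of one degree of longitude at the equator (equatorial circumference / 360)
EQUATOR_KM_PER_DEGREE = 40075.017 / 360.0


def cell_width_km(arc_seconds, latitude_deg=0.0):
    """Approximate east-west width of a grid cell on a spherical Earth."""
    degrees = arc_seconds / 3600.0
    return degrees * EQUATOR_KM_PER_DEGREE * math.cos(math.radians(latitude_deg))


at_equator = cell_width_km(30)        # roughly 0.93 km
at_34_north = cell_width_km(30, 34)   # narrower, since meridians converge
```

The east-west width shrinks with the cosine of the latitude, so a population cell covers several 100 m land cover cells in each direction, and more of them near the equator than at mid-latitudes.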
2.3. Methods
2.3.1. Approximation of the Barycenter
- Population density: More people in an area means a higher probability of someone witnessing and reporting a wildfire.
- Land cover: Land cover diversity impacts the visibility and detectability of wildfires and fire propagation.
2.3.2. Approximation of the Areal Location
- We approximate the hazard location from the names of places mentioned in the texts of tweets about the respective hazard (see Figure 4, (1)). We refer to this method as the location by viewing angle (LVA) approach. We extract the mentioned places with two NLP methods: NER and pattern recognition with regular expressions (RegEx) (e.g., [42,43,44]). For NER, we employ spaCy [45], an open-source NLP library for Python. We use it to detect the entities GPE (geopolitical entities: countries, cities, states) and LOC (non-GPE locations, mountain ranges, bodies of water). We apply RegEx in addition to NER because NER recognizes general place names, whereas RegEx lets us extract more specific areas. For RegEx, we employ an algorithm developed in-house that consists of two steps: name detection and geoparsing. It searches for places such as Mt. Wilson or Monrovia Peak by combining a keyword search (e.g., Peak) with RegEx patterns that capture the associated nouns (e.g., Monrovia). After extracting the place names, we geoparse them, i.e., we convert the textual descriptions of places into geographic identifiers such as coordinates. Next, we check the viewing angle. People may not be able to see the places they mention from the locations of their tweets, as obstructions such as mountains could block the view. Therefore, we check the plausibility of the viewing angle by testing for obstructions along the line of sight from the tweet location to the mentioned place. As a result, we obtain the locations that the speaker can see and at which the speaker is therefore likely to be observing the hazard. On the obtained points, we apply two separate methods to get more independent results, following the principle that the more people place the hazard at a location, the more likely the hazard is to be there. These are:
- Kernel density estimation on viewing angle points: On the points resulting from (1), we conduct kernel density estimation as implemented in ArcGIS [46]. It places a kernel (a smooth, continuous function) on each data point and sums these kernels to create a smooth representation of the underlying probability distribution. We then extract the areas above a specific density, in which the hazard is more likely to be present.
- Non-outlier estimation on viewing angle points: We use an Isolation Forest, as implemented in ArcGIS [46], to detect non-outlier points. It isolates instances using binary splits and constructs an ensemble of decision trees; outliers are the instances that require fewer splits to be isolated. We then calculate the convex hull spanned by the non-outlier points.
- In this step, we consider blocked road information (see Figure 4, (2)). Road authorities often post information about such emergencies or hazards. We search for tweets that mention such information or were posted by the responsible agencies and extract their locations. This method extracts the exact road locations via RegEx patterns written for roads. We geoparse the results and obtain points of blocked road information, typically two road intersections per tweet (e.g., Angeles Crest Hwy & Upper Big Tujunga Rd). We can then extract the closed road segments between those two points.
- Finally, we consider distance information in the texts (see Figure 4, (3)). We search for tweets that mention a distance and buffer their location with this distance. We obtain a circle on which the hazard seen by the speaker might lie. To account for coarse estimates by speakers, we apply a buffer around this circle with a width of 30% of the initial distance. This assumption is based on the idea that people gauge distances more accurately when hazards are closer to them, so the tolerance grows with the distance. Furthermore, we limit the buffer areas by land cover plausibility: a buffered area is not considered where it overlaps a land cover class that cannot plausibly contain fire, such as water or bare rock.
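The name-detection step used for place names and for blocked roads could look roughly like the sketch below. The patterns are assumptions written for this illustration (the keyword lists `FEATURE_WORDS` and `ROAD_SUFFIX` are hypothetical and far from complete); the authors' own patterns are not given in the text.

```python
import re

# Hypothetical keyword lists; a real system would need many more entries.
FEATURE_WORDS = r"(?:Peak|Canyon|Ridge|Lake|Creek)"
ROAD_SUFFIX = r"(?:Hwy|Rd|Ave|Blvd|St|Dr)"

# "Mt. Wilson" style names, or capitalized nouns followed by a feature keyword
PLACE_PATTERN = re.compile(
    r"\b(?:Mt\.|Mount)\s+[A-Z][a-z]+"
    r"|\b(?:[A-Z][a-z]+\s+)+" + FEATURE_WORDS + r"\b"
)

# One road: capitalized nouns followed by a road-type suffix
ROAD = r"(?:[A-Z][a-z]+\s+)+" + ROAD_SUFFIX + r"\b"

# An intersection written as "<road> & <road>"
INTERSECTION = re.compile(r"(" + ROAD + r")\s*&\s*(" + ROAD + r")")


def find_places(text):
    """Return all place-name candidates in reading order."""
    return PLACE_PATTERN.findall(text)


def find_intersection(text):
    """Return the two road names of the first intersection, or None."""
    match = INTERSECTION.search(text)
    return (match.group(1), match.group(2)) if match else None


places = find_places("Flames near Mt. Wilson, smoke rising behind Monrovia Peak.")
roads = find_intersection("CLOSED: Angeles Crest Hwy & Upper Big Tujunga Rd due to fire.")
```

The extracted strings are only candidates; they still have to be geoparsed into coordinates before any spatial analysis.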
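The viewing-angle plausibility check can be illustrated with a simple line-of-sight test on an elevation profile. This is a sketch under stated assumptions, not the authors' procedure: the profile is assumed to be sampled from a terrain model along the straight line between tweet location and mentioned place, the observer eye height of 1.7 m is assumed, and Earth curvature and atmospheric refraction are ignored.

```python
def is_visible(profile, observer_height=1.7):
    """Check whether the last profile point is visible from the first.

    profile: list of (distance_m, elevation_m) tuples ordered by distance,
    starting at the observer and ending at the mentioned place.
    """
    (d_start, z_start), (d_end, z_end) = profile[0], profile[-1]
    eye = z_start + observer_height

    for distance, elevation in profile[1:-1]:
        # Height of the straight sight line above this sample
        fraction = (distance - d_start) / (d_end - d_start)
        sight_line = eye + fraction * (z_end - eye)
        if elevation > sight_line:
            return False  # terrain rises above the sight line
    return True


# Invented profiles: a valley floor versus a ridge between observer and peak
clear_view = [(0, 300), (5000, 900), (10000, 1700)]
blocked_view = [(0, 300), (5000, 1200), (10000, 1700)]
```

Mentioned places that fail such a test would be dropped before the density and outlier analyses.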
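The idea behind kernel density estimation on the viewing angle points can be shown in a few lines of plain Python. The sketch assumes projected (planar) coordinates and a Gaussian kernel chosen for simplicity; the kernel, bandwidth, and density threshold used by the ArcGIS tool in the study are not stated here, and the names `kde_at` and `dense_cells` are hypothetical.

```python
import math


def kde_at(points, x, y, bandwidth):
    """Gaussian kernel density estimate at (x, y) from 2D points."""
    norm = 1.0 / (2.0 * math.pi * bandwidth ** 2 * len(points))
    total = 0.0
    for px, py in points:
        squared_distance = (x - px) ** 2 + (y - py) ** 2
        total += math.exp(-squared_distance / (2.0 * bandwidth ** 2))
    return norm * total


def dense_cells(points, grid, bandwidth, fraction):
    """Keep grid cells whose density is at least `fraction` of the maximum."""
    densities = [kde_at(points, x, y, bandwidth) for x, y in grid]
    cutoff = fraction * max(densities)
    return [cell for cell, density in zip(grid, densities) if density >= cutoff]


# Invented points: three mentions close together and one isolated mention
mentions = [(0, 0), (1, 0), (0, 1), (10, 10)]
grid_cells = [(0.3, 0.3), (10, 10), (5, 5)]
likely_cells = dense_cells(mentions, grid_cells, bandwidth=1.0, fraction=0.5)
```

Cells near the cluster survive the threshold, while the cell at the isolated mention does not, which mirrors the principle that agreement between several speakers makes a location more plausible.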
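The distance-based step can be read as a ring test: a candidate location is plausible if its distance from the tweet location lies within 30% of the distance mentioned in the text. The sketch below follows that reading, which is an interpretation of the description rather than the authors' implementation; the haversine distance on a spherical Earth and the function names are assumptions of this illustration.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_ring(distance_km, tolerance=0.30):
    """Inner and outer radius of the buffered circle."""
    return distance_km * (1 - tolerance), distance_km * (1 + tolerance)


def in_ring(observer, candidate, distance_km, tolerance=0.30):
    """True if the candidate lies inside the buffered circle around the observer."""
    inner, outer = distance_ring(distance_km, tolerance)
    actual = haversine_km(observer[0], observer[1], candidate[0], candidate[1])
    return inner <= actual <= outer


# Invented example: a tweet at (34.0, -118.0) reports a fire "10 km away"
observer = (34.0, -118.0)
about_10_km_north = (34.09, -118.0)
about_22_km_north = (34.2, -118.0)
```

Because the tolerance is proportional to the mentioned distance, the ring widens for far-away reports. The land cover plausibility filter would then remove the parts of the ring that fall on water or bare rock.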
2.3.3. General Information about the Evaluation of the Applied Methods
3. Results
4. Discussion
4.1. General Analysis
4.2. Advantages of Using VGI Data in a Hazardous Event
4.3. Limitations
4.4. Considerations for Practical Application
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Songwathana, K. The relationship between natural disaster and economic development: A panel data analysis. Procedia Eng. 2018, 212, 1068–1074.
- Wisner, B.; Blaikie, P.; Cannon, T.; Davis, I. At Risk: Natural Hazards, People’s Vulnerability and Disasters; Routledge: Oxfordshire, UK, 2014.
- Hao, H.; Wang, Y. Leveraging multimodal social media data for rapid disaster damage assessment. Int. J. Disaster Risk Reduct. 2020, 51, 101760.
- Florath, J.; Keller, S. Supervised Machine Learning Approaches on Multispectral Remote Sensing Data for a Combined Detection of Fire and Burned Area. Remote Sens. 2022, 14, 657.
- Dittrich, A.; Lucas, C. Is this Twitter event a disaster? In Connecting a Digital Europe through Location and Place, Proceedings of the AGILE’2014 International Conference on Geographic Information Science, Castellon, Spain, 3–6 June 2014; AGILE Digital Editions: Castellon, Spain, 2014.
- Wang, Y.; Wang, T.; Ye, X.; Zhu, J.; Lee, J. Using social media for emergency response and urban sustainability: A case study of the 2012 Beijing rainstorm. Sustainability 2016, 8, 25.
- Guan, X.; Chen, C. Using social media data to understand and assess disasters. Nat. Hazards 2014, 74, 837–850.
- Wang, Z.; Ye, X.; Tsou, M.H. Spatial, temporal, and content analysis of Twitter for wildfire hazards. Nat. Hazards 2016, 83, 523–540.
- Panteras, G.; Wise, S.; Lu, X.; Croitoru, A.; Crooks, A.; Stefanidis, A. Triangulating social multimedia content for event localization using Flickr and Twitter. Trans. GIS 2015, 19, 694–715.
- Jurgens, D.; Finethy, T.; McCorriston, J.; Xu, Y.T.; Ruths, D. Geolocation prediction in twitter using social networks: A critical analysis and review of current practice. In Proceedings of the Ninth International AAAI Conference on Web and Social Media, Oxford, UK, 26–29 May 2015.
- Davis, C.A.; Pappa, G.L.; De Oliveira, D.R.R.; Arcanjo, F.L. Inferring the location of twitter messages based on user relationships. Trans. GIS 2011, 15, 735–751.
- MacEachren, A.M.; Robinson, A.C.; Jaiswal, A.; Pezanowski, S.; Savelyev, A.; Blanford, J.; Mitra, P. Geo-twitter analytics: Applications in crisis management. In Proceedings of the 25th International Cartographic Conference, Paris, France, 3–8 July 2011; pp. 3–8.
- Laylavi, F.; Rajabifard, A.; Kalantari, M. A multi-element approach to location inference of twitter: A case for emergency response. ISPRS Int. J. Geo-Inf. 2016, 5, 56.
- Huang, C.Y.; Tong, H.; He, J.; Maciejewski, R. Location Prediction for Tweets. Front. Big Data 2019, 2, 5.
- Gelernter, J.; Balaji, S. An algorithm for local geoparsing of microtext. GeoInformatica 2013, 17, 635–667.
- Dittrich, A. Real-Time Event Analysis and Spatial Information Extraction from Text Using Social Media Data. Ph.D. Thesis, Karlsruhe Institute of Technology, Karlsruhe, Germany, 2016.
- Burton, S.H.; Tanner, K.W.; Giraud-Carrier, C.G.; West, J.H.; Barnes, M.D. “Right time, right place” health communication on Twitter: Value and accuracy of location information. J. Med. Internet Res. 2012, 14, e2121.
- Huang, B.; Carley, K.M. A large-scale empirical study of geotagging behavior on twitter. In Proceedings of the 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, Vancouver, BC, Canada, 27–30 August 2019; pp. 365–373.
- Ajao, O.; Hong, J.; Liu, W. A survey of location inference techniques on Twitter. J. Inf. Sci. 2015, 41, 855–864.
- Kim, M.G.; Koh, J.H. Recent research trends for geospatial information explored by Twitter data. Spat. Inf. Res. 2016, 24, 65–73.
- Benson, E.; Haghighi, A.; Barzilay, R. Event discovery in social media feeds. In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies, Portland, OR, USA, 19–24 June 2011; pp. 389–398.
- Han, S.; Ciravegna, F. Rumour Detection on Social Media for Crisis Management. In Proceedings of the ISCRAM, Valencia, Spain, 19–22 May 2019.
- Imran, M.; Mitra, P.; Castillo, C. Twitter as a lifeline: Human-annotated twitter corpora for NLP of crisis-related messages. arXiv 2016, arXiv:1605.05894.
- De Albuquerque, J.P.; Herfort, B.; Brenning, A.; Zipf, A. A geographic approach for combining social media and authoritative data towards identifying useful information for disaster management. Int. J. Geogr. Inf. Sci. 2015, 29, 667–689.
- Cervone, G.; Sava, E.; Huang, Q.; Schnebele, E.; Harrison, J.; Waters, N. Using Twitter for tasking remote-sensing data collection and damage assessment: 2013 Boulder flood case study. Int. J. Remote Sens. 2016, 37, 100–124.
- Goffi, A.; Bordogna, G.; Stroppiana, D.; Boschetti, M.; Brivio, P.A. Knowledge and data-driven mapping of environmental status indicators from remote sensing and VGI. Remote Sens. 2020, 12, 495.
- Poser, K.; Dransch, D. Volunteered geographic information for disaster management with application to rapid flood damage estimation. Geomatica 2010, 64, 89–98.
- Yang, W.; Mu, L. GIS analysis of depression among Twitter users. Appl. Geogr. 2015, 60, 217–223.
- Ghosh, D.; Guha, R. What are we ‘tweeting’ about obesity? Mapping tweets with topic modeling and Geographic Information System. Cartogr. Geogr. Inf. Sci. 2013, 40, 90–102.
- Gerber, M.S. Predicting crime using Twitter and kernel density estimation. Decis. Support Syst. 2014, 61, 115–125.
- Hultquist, C.; Simpson, M.; Cervone, G.; Huang, Q. Using nightlight remote sensing imagery and twitter data to study power outages. In Proceedings of the 1st ACM SIGSPATIAL International Workshop on the Use of GIS in Emergency Management, Bellevue, WA, USA, 3–6 November 2015; pp. 1–6.
- Bao, J.; Liu, P.; Yu, H.; Xu, C. Incorporating twitter-based human activity information in spatial analysis of crashes in urban areas. Accid. Anal. Prev. 2017, 106, 358–369.
- Forati, A.M.; Ghose, R. Examining Community Vulnerabilities through multi-scale geospatial analysis of social media activity during Hurricane Irma. Int. J. Disaster Risk Reduct. 2022, 68, 102701.
- Benevenuto, F.; Rodrigues, T.; Almeida, V.; Almeida, J.; Gonçalves, M. Detecting spammers and content promoters in online video social networks. In Proceedings of the 32nd International ACM SIGIR Conference on Research and Development in Information Retrieval, Boston, MA, USA, 19–23 July 2009; pp. 620–627.
- Ratkiewicz, J.; Conover, M.; Meiss, M.; Gonçalves, B.; Patil, S.; Flammini, A.; Menczer, F. Truthy: Mapping the spread of astroturf in microblog streams. In Proceedings of the 20th International Conference Companion on World Wide Web, Hyderabad, India, 28 March–1 April 2011; pp. 249–252.
- Castillo, C.; Mendoza, M.; Poblete, B. Information credibility on Twitter. In Proceedings of the 20th International Conference on World Wide Web, Hyderabad, India, 28 March–1 April 2011; pp. 675–684.
- Adnan, M.; Longley, P.A.; Khan, S.M. Social dynamics of twitter usage in London, Paris, and New York City. First Monday 2014, 19, 5.
- Sloan, L. Who tweets in the United Kingdom? Profiling the Twitter population using the British social attitudes survey 2015. Soc. Media+ Soc. 2017, 3, 2056305117698981.
- Ponukumati, P.; Regonda, S.K. Twitter—A New Citizen Science Solution for Urban Flood Database# Urban Floods# Flood Database. 2023. Available online: https://assets.researchsquare.com/files/rs-3045515/v1/efdd999e-3494-4ee6-b920-c657b07e36c8.pdf?c=1689058812 (accessed on 18 December 2023).
- Worldpop. 2022. Available online: https://www.worldpop.org/ (accessed on 4 September 2023).
- Buchhorn, M.; Lesiv, M.; Tsendbazar, N.E.; Herold, M.; Bertels, L.; Smets, B. Copernicus global land cover layers: collection 2. Remote Sens. 2020, 12, 1044.
- Agarwal, A.; Toshniwal, D. Face off: Travel habits, road conditions and traffic city characteristics bared using twitter. IEEE Access 2019, 7, 66536–66552.
- Utomo, M.N.Y.; Adji, T.B.; Ardiyanto, I. Geolocation prediction in social media data using text analysis: A review. In Proceedings of the 2018 International Conference on Information and Communications Technology (ICOIACT), Yogyakarta, Indonesia, 6–7 March 2018; pp. 84–89.
- Martínez, N.J.F.; Periñán-Pascual, C. Knowledge-based rules for the extraction of complex, fine-grained locative references from tweets. RAEL Rev. Electrón. Lingüíst. Apl. 2020, 19, 136–163.
- Explosion. spaCy—Industrial-Strength Natural Language Processing. 2023. Available online: https://spacy.io/ (accessed on 15 November 2023).
- ESRI. ArcGIS Pro. 2023. Available online: https://www.esri.com/en-us/arcgis/products/arcgis-pro/overview (accessed on 18 December 2023).
- Gallardo, R. Digital Divide Index. 2020. Available online: https://storymaps.arcgis.com/stories/8ad45c48ba5c43d8ad36240ff0ea0dc7 (accessed on 18 December 2023).
- Takahashi, T.; Igata, N. Rumor detection on twitter. In Proceedings of the 6th International Conference on Soft Computing and Intelligent Systems, and the 13th International Symposium on Advanced Intelligence Systems, Kobe, Japan, 20–24 November 2012; pp. 452–457.
Fire | Bobcat Fire | Camp Fire | Var, France | Landiras, France |
---|---|---|---|---|
Fire Starting Date | 06/09/2020 | 08/11/2018 | 16/08/2021 | 12/07/2022 |
Fire Duration | until 19/10/2020 (43 days) | until 25/11/2018 (17 days) | until 26/08/2021 (10 days) | until 25/07/2022 (13 days) |
Total Fire Area | ∼469 km² | ∼620 km² | ∼57 km² | ∼138 km² |
Fire Behavior | 98 km² within 4 days, +/− constant spreading | 620 km² within 3 days, spreading extremely quickly, but then remaining within almost the same boundaries until extinguished | 57 km² within 22 h, spreading very quickly, but then remaining within the same boundaries until extinguished | +/− constant spreading |
Land Cover | Forest/shrub land cover, mountainous, no population | Shrub land cover, mountainous, little population overall, but one town | Forest/shrub land cover, hilly, little population | Woodland/forest land cover, flat to hilly, little population |
Population Density/ Distribution | High density, irregular distribution around possible fire area | Low density, irregular distribution around and INSIDE possible fire area | Low density, regular distribution around possible fire area | Low density, irregular distribution around possible fire area |
Tweet Behavior | More tweets, less place information in text | Fewer tweets, less place information in text | Fewer tweets, more place information in text | Fewer tweets, more place information in text |
Agencies (Fire/Road) | Using Twitter | Using Twitter | Not using Twitter | Not using Twitter |
First Available Remote Sensing Data | 4 days after fire start (10/09/2020) | 3 days after fire start (11/11/2018) | 1 day after fire start (17/08/2021) | 5 days after fire start (17/07/2022) |
Fire | Bobcat Fire | Camp Fire | Var, France | Landiras, France |
---|---|---|---|---|
Period before RS data available | 06/09/2020–10/09/2020 | 08/11/2018–10/11/2018 | 16/08/2021–17/08/2021 | 12/07/2022–16/07/2022 |
# of tweets with coordinates | 236 | 52 | 3 | 0 |
# of tweets with place | 578 | 157 | 82 | 30 |
© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Florath, J.; Chanussot, J.; Keller, S. Utilizing Volunteered Geographic Information for Real-Time Analysis of Fire Hazards: Investigating the Potential of Twitter Data in Assessing the Impacted Areas. Fire 2024, 7, 6. https://doi.org/10.3390/fire7010006