1. Introduction
The catastrophic impacts of climate change are global. Nevertheless, some regions of the world are currently more sensitive to the impacts of anthropogenic climate change [
1,
2]. For instance, Australia is one of these sensitive regions [
3]. Australia’s range of unique climates has contributed to the country’s history of diverse natural hazard-related disasters, including bushfires, flooding, hurricanes, cyclones, earthquakes, and tsunamis [
4], where a natural hazard-related disaster is defined as “an extreme event that occurs naturally and causes harm to humans or to other things that we care about, though usually the focus is on humans—which, we might note, is anthropocentric” [
5]. Natural hazard-related disasters have presented increasingly significant challenges to the country’s urban and rural environments, as a result of higher density development and population growth within these areas [
4]. Particularly, anthropogenic climate change has increased the severity and frequency of these disastrous events drastically [
6,
7,
8].
The construction of urban infrastructure within modern cities aims to benefit the local population by providing easier access to essential goods and services, as well as supporting environmental and social factors for the community [
9,
10]. Nonetheless, with Australian cities expanding their footprint, natural hazard-related disasters impact larger areas and populations and make communities more vulnerable [
11,
12,
13]. In this sense, the adaptability of cities to climate change requires smart infrastructure and urban design that are resilient to the impacts of natural hazard-related disasters [
14,
15,
16].
In order to build climate resilient urban infrastructure, the existing and forthcoming impacts of climate change on cities and societies should be adequately estimated [
17,
18]. This also includes identifying the existing and future natural hazard-related disaster damages on urban infrastructures and properties [
19,
20]. The traditional methods of such estimations have their limitations, particularly their inability to offer accurate and timely estimates due to data collection challenges [
21,
22]. Hence, this emphasizes the need for an improved method of damage estimation for these disastrous events [
23,
24,
25]. Developing a novel damage estimation method will assist Australian communities with their recovery process and enhance resilience within urban environments.
The Royal Commission into National Natural Disaster Arrangements, which was established in response to the 2019–2020 bushfires, determined in 2020 that natural hazard-related disasters will impact Australian communities more frequently, with an increased likelihood of areas experiencing simultaneous disasters before they have fully recovered. The concurrent nature and extreme intensity of these threats can be attributed to human influences on climate change, as well as increased hazard exposure through coastal and regional developments [
26,
27,
28].
In addition to the higher frequency of natural hazard-related disasters, increased economic and material losses are also resulting from damage caused in urban areas [
29]. Redevelopment and recovery from these impacts consume natural resources, further contributing to climate change as a result [
26]. Consequently, improvements in disaster risk reduction and community restoration should be prioritized by future developments, to help manage multiple, nation-wide natural hazard-related disaster events and reduce their negative effects on urban environments [
30].
Moreover, while the ingenuity of humankind has driven unimaginable advancements in science and technology, particularly over the last two centuries [
31], our insatiable nature, provoked by ruthless capitalism and irresponsible innovation, is causing irreversible damage to the planet’s fragile ecosystems [
32,
33,
34]. This in turn has initiated exacerbating changes in the climate and led to an increasingly high frequency and severity of natural hazard-related disasters [
35,
36]. The situation and risks are worse for cities, where population densities and levels of human activity are high. There is a need to monitor, prevent, and manage the continuing damage to our environment and climate, particularly around large population clusters such as cities and urban regions. This includes improving the climate resilience of cities and urban regions and their adaptability to climate change.
Responsible innovation and use of technology for social good, however, can help in generating some positive outcomes in addressing the challenges our cities and urban regions face [
37,
38,
39]. For instance, artificial intelligence-driven big data analytics has a major role to play in urban resilience and climate change adaptability of cities [
40,
41]. Social media analytics enables dynamic interactions with public and hence can be used by governments as sensors, information dissemination, and social intervention tools [
42].
Moreover, scholars investigating the effects of natural hazard-related disasters frequently lack demographic and growth pattern data for urban populations [
29]. Social media platforms such as Twitter can provide diverse perspectives on climate change impacts and natural hazard-related disaster events affecting individuals’ areas, communities, and cities (e.g., identifying emotional changes over space and time in the context of a natural hazard-related disaster). This allows for more accessible data collection from different communities and provides additional sources for quantitative research [
43,
44].
In recent years, big data social media analytics has become a popular approach to investigate various urban issues [
45]. For instance, some studies focused on multiple human languages and a range of diverse topics including COVID-19-related governance matters [
46,
47], while others explored traffic-related event detection [
48] and the detection of critical diseases and symptoms in and across cities [
49]. Additionally, some scholars used big data social media analytics in examining urban logistics [
50], public sentiment analyses of government services [
51], and disaster severity prediction [
23]. Some other urban scholars conducted studies by using social media analytics techniques to capture public perception of smart city concepts and technologies [
52], and public perception of artificial intelligence and urban planning technologies [
53]. Furthermore, some studies also benefited from social media analytics techniques in developing technology-led urban solutions for disaster management.
Considering the significance of managing climate change-induced natural hazard-related disasters across cities and urban areas, this paper contributes a methodological approach that uses social media analytics to improve government awareness of, and government-public interaction on, disaster-related issues. Specifically, this study aims to form an understanding of how social media analytics can be utilized to assist government authorities in estimating the damages linked to natural hazard-related disaster impacts on urban centers in the age of climate change. To this end, this study analyzes crowdsourced natural hazard-related disaster big data from Twitter users in the testbed case study of Australian states and territories. The methodological approach employs the social media analytics method and conducts sentiment and content analyses of location-based Twitter messages, or tweets (n = 131,673), on natural hazard-related disaster events from Australia’s states and territories between 1 January 2019 and 31 December 2020. The study informs authorities on an innovative way to analyze the geographic distribution and occurrence frequency of various disasters and their damages based on geo-tweet analysis.
Following this introduction,
Section 2 provides a concise review of the literature and highlights the knowledge gap. Subsequently,
Section 3 introduces the methodology of the research, while
Section 4 presents the findings of the data analysis. These results are evaluated and discussed in
Section 5 of the paper. Lastly,
Section 6 summarizes the study and its findings, as well as highlights future recommendations.
3. Materials and Methods
The case study employed in this research paper examines Twitter data localized to the states and territories of Australia. As the research question presents social media analytics as a metric for estimating disaster damage to property and land, it was pertinent that the case study center upon a geographical area in which natural hazard-related disasters are not uncommon, as well as one whose population includes a significant portion of social media users. To illustrate the impact natural hazard-related disasters have had upon Australia, CANSTAR [
72] has produced the following hierarchy of natural hazard-related disasters faced by Australians between 1900 and 2015 (
Table 1), ordered by number of fatalities.
Heatwaves, floods, cyclones, fires, storms, wind, landslides, tornadoes, and hailstorms each have severe impacts in Australia, not only on human life but also on property and land. While the damages of such events have previously been difficult to capture accurately and in a timely manner using common metrics [
73], real-time big data from widely adopted social media platforms may present a solution. As of 2021, 79.9% of the Australian population are active users of some form of social media, spending an average of one hour and 46 min on various social media platforms [
74].
Twitter is the sixth most used social media platform nationwide [
74]. While other platforms such as Facebook, Snapchat or Instagram may represent a larger portion of the Australian market, Twitter’s comparatively concise 280-character limit on users’ posts and messaging allows for extraction of big data that is more readily available for processing, analysis, and evaluation for the purposes of this research. A case can be made that the research framework of this study could be tailored to apply to other, more robust social media inputs from various platforms to estimate natural hazard-related disaster damage.
Instead of using a traditional data collection method, the methodological approach applied in this study employs a contemporary one. Social media is an ever-changing platform where people can share their opinions, and it has recently become a new source of qualitative data that serves as the main data source in many studies. The major benefit of analyzing social media data is that it offers an opportunity to engage with a larger group of people in a relatively unbiased setting. Furthermore, it allows researchers to engage with people from broader geographic areas with the help of the location of social media users, which is tagged in their posts. Accordingly, geo-Twitter data have proven to be a successful data type and have been adopted in this study. Geo-Twitter analysis increases efficiency in analyzing large datasets of shared opinions and real-time information on ongoing social issues.
Initially, sentiment and content analyses were completed for all of the location-based Twitter messages, as seen in
Figure 1. To do this, the original dataset was obtained from the QUT Digital Observatory (
https://www.qut.edu.au/institute-for-future-environments/facilities/digital-observatory/digital-observatory-databank, accessed on 5 April 2021). The original dataset included 203,291 tweets. These tweets were then filtered down to 131,673 tweets using four data filtering processes: frequency analysis, location, date, and relevance filters. While a bot filter was used to remove mass-produced Twitter messages, VPN users, proxies, and fake accounts were not investigated, as we believe they would be present in minimal quantities in a large dataset of over 100,000 messages. The extraction of Twitter messages from the Australian public concerning natural hazard-related disaster damages was carried out for the purpose of a qualitative social media analysis. In the extraction and analysis of these data, several different software tools and processes were utilized.
Firstly, we selected a time period for the analysis. Accordingly, any tweets originating outside Australia or posted outside the period from 1 January 2019 to 31 December 2020 were removed from the dataset. The reason for selecting a 24-month period was to capture the latest natural hazard-related disasters in Australia. In addition, this was done to ease the analysis task, as there have been over 400,000 tweets on disasters and damages shared annually in Australia during the past five years. During this screening process, although the general term ‘disaster’ was used as a keyword, a manual check was conducted to make sure these disasters were actually ‘natural hazard-related disasters’. Similarly, for the keyword ‘damage’, a manual screening was undertaken to make sure the meaning of ‘damage to infrastructure or property or livelihood’ was maintained.
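The date and location screening described above can be illustrated with a minimal Python sketch. The record layout (`Tweet` with `text`, `state`, and `posted` fields), the function name `filter_tweets`, and the demonstration window are assumptions made for illustration only; they do not reproduce the study's actual tooling or study period.

```python
from dataclasses import dataclass
from datetime import date

# Codes of the Australian states and territories.
AU_STATES = {"NSW", "VIC", "QLD", "SA", "WA", "TAS", "NT", "ACT"}


@dataclass(frozen=True)
class Tweet:
    text: str
    state: str    # hypothetical field: state/territory code taken from the geotag
    posted: date  # hypothetical field: posting date


def filter_tweets(tweets, start, end):
    """Keep tweets geotagged to Australia and posted within [start, end]."""
    return [t for t in tweets
            if t.state in AU_STATES and start <= t.posted <= end]


sample = [
    Tweet("Bushfire smoke over the city", "NSW", date(2020, 1, 4)),
    Tweet("Flood damage downtown", "TX", date(2020, 1, 4)),     # outside Australia
    Tweet("Storm damage to roofs", "QLD", date(2018, 11, 2)),   # outside the window
]

# Illustrative window only, not the study period.
kept = filter_tweets(sample, start=date(2019, 12, 1), end=date(2020, 2, 29))
print([t.text for t in kept])
```

The manual relevance check on the ‘disaster’ and ‘damage’ keywords is a human step and is deliberately not automated in this sketch.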
Secondly, NVivo, a content analysis software package, was used to filter out all repeated tweets. A word frequency analysis was also conducted using NVivo, with the aim of identifying important themes, concepts, and clusters.
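The study performed this step in NVivo; as a rough stand-in, the same two operations (removing repeated tweets and counting word frequencies) could be sketched in Python as follows. The normalization rule, the tiny stop-word list, and the function names are assumptions for illustration and do not mirror NVivo's internal behavior.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not NVivo's list).
STOPWORDS = {"the", "a", "in", "of", "to", "and", "from", "is"}


def normalise(text):
    """Lower-case the text and keep only alphabetic word tokens."""
    return " ".join(re.findall(r"[a-z']+", text.lower()))


def deduplicate(texts):
    """Drop tweets whose normalised text has already been seen."""
    seen, unique = set(), []
    for t in texts:
        key = normalise(t)
        if key not in seen:
            seen.add(key)
            unique.append(t)
    return unique


def word_frequency(texts):
    """Count non-stop-word tokens across all tweets."""
    counts = Counter()
    for t in texts:
        counts.update(w for w in normalise(t).split() if w not in STOPWORDS)
    return counts


tweets = [
    "Bushfire damage in the valley",
    "bushfire damage in the valley!",   # repeat of the first tweet
    "Flood damage to the bridge",
]
unique = deduplicate(tweets)
freq = word_frequency(unique)
print(freq.most_common(3))
```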
Thirdly, a word co-occurrence analysis identified tweets that discussed both disasters and damages in a single Twitter message. NVivo was also employed for this analysis.
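A simplified view of such a co-occurrence filter is sketched below in Python. The two term lists are hypothetical and much shorter than any list a real analysis would use, and exact token matching stands in for whatever matching NVivo applies.

```python
import re

# Hypothetical term lists, for illustration only.
DISASTER_TERMS = {"bushfire", "bushfires", "flood", "floods", "wind", "winds", "disaster"}
DAMAGE_TERMS = {"damage", "damaged", "damages", "destroyed"}


def tokens(text):
    """Return the set of lower-cased alphabetic tokens in a tweet."""
    return set(re.findall(r"[a-z]+", text.lower()))


def cooccurring(tweets):
    """Keep tweets that mention at least one disaster term AND one damage term."""
    result = []
    for t in tweets:
        words = tokens(t)
        if words & DISASTER_TERMS and words & DAMAGE_TERMS:
            result.append(t)
    return result


sample = [
    "Bushfire damage to homes near the coast",
    "Floods expected this weekend",
    "Roof damaged, waiting on insurance",
    "Strong winds destroyed the fence",
]
both = cooccurring(sample)
print(both)
```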
Fourthly, a spatial analysis was conducted to complement the content analysis. This involved separating tweets by location and linking them to help categorize themes, concepts, disasters, and damages based on these locations. This analysis created an overview of disaster and damage clusters for each state/territory in Australia.
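At its simplest, separating tweets by location amounts to a per-state tally of categories. The Python sketch below assumes each tweet has already been reduced to a hypothetical `(state, category)` pair; it illustrates the aggregation idea only and is not the spatial analysis procedure used in the study.

```python
from collections import Counter, defaultdict


def counts_by_state(records):
    """Build {state: Counter(category -> number of tweets)} from (state, category) pairs."""
    table = defaultdict(Counter)
    for state, category in records:
        table[state][category] += 1
    return table


# Hypothetical records, for illustration only.
records = [
    ("NSW", "bushfire"),
    ("NSW", "bushfire"),
    ("NSW", "damage"),
    ("QLD", "floods"),
]
table = counts_by_state(records)
for state, counter in sorted(table.items()):
    print(state, dict(counter))
```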
Then, a sentiment analysis was conducted using WEKA [
75,
76] to further analyze the word content. These words were classified on a scale of one to three to measure their sentiment. The following scale was used to conduct the sentiment analysis: 1 = positive sentiment, 2 = negative sentiment, and 3 = neutral sentiment. The sentiment of these specific words is showcased in
Table 2 with exemplary tweets.
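The three-point coding can be made concrete with a small Python sketch. The study used WEKA for classification; the lexicon-based rule below is a deliberately simple substitute, and both word lists are hypothetical. Only the output codes (1 = positive, 2 = negative, 3 = neutral) follow the scale given in the text.

```python
import re

# Hypothetical mini-lexicons; a real classifier would be trained or use a full lexicon.
POSITIVE = {"safe", "thanks", "relief", "rescued"}
NEGATIVE = {"destroyed", "lost", "devastating", "damage"}


def sentiment_code(text):
    """Return 1 (positive), 2 (negative), or 3 (neutral) for a tweet."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    score = len(words & POSITIVE) - len(words & NEGATIVE)
    if score > 0:
        return 1
    if score < 0:
        return 2
    return 3


examples = [
    "Thanks to the crews, everyone is safe",
    "Homes destroyed by the bushfire",
    "Wind warning issued for tomorrow",
]
codes = [sentiment_code(t) for t in examples]
print(codes)
```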
Finally, a network analysis was conducted to present the relationships between disaster and damage themes, concepts, and clusters. Based on the time-stamp of each tweet, the process involved narrowing the scope of the research to ‘clusters’ of tweets. The rationale behind this was that a cluster of disaster-related tweets around the same time would most likely indicate that a significant disaster was taking place.
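One simple way to form such time-based clusters is to group tweets whose posting dates are separated by no more than a fixed gap. The Python sketch below uses this gap-based rule; the rule itself and the parameters `max_gap_days` and `min_size` are assumptions, since the text does not specify how the clusters were delimited.

```python
from datetime import date


def temporal_clusters(dates, max_gap_days=3, min_size=3):
    """Group dates into runs separated by more than max_gap_days.

    Returns a list of (first_date, last_date, tweet_count) tuples for runs
    containing at least min_size tweets.
    """
    clusters, current = [], []
    for d in sorted(dates):
        if current and (d - current[-1]).days > max_gap_days:
            if len(current) >= min_size:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(d)
    if len(current) >= min_size:
        clusters.append((current[0], current[-1], len(current)))
    return clusters


# Hypothetical posting dates, for illustration only.
posted = [
    date(2019, 11, 8), date(2019, 11, 9), date(2019, 11, 9), date(2019, 11, 10),
    date(2020, 1, 2),                                   # isolated tweet, no cluster
    date(2020, 2, 1), date(2020, 2, 2), date(2020, 2, 3),
]
clusters = temporal_clusters(posted)
for first, last, size in clusters:
    print(first, last, size)
```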
5. Discussion
5.1. How Social Media Analytics Can Be Used in Estimating Natural Hazard-Related Disaster Damage
Climate change-induced natural hazard-related disasters are the reality of our time [
78,
79]. In adapting our communities and cities to these disastrous events, innovative approaches have become essential; social media data analytics is one of these innovative approaches [
80,
81]. Social media have become an important alternative information channel to traditional media during emergencies and natural hazard-related disasters. Given that in the age of climate change, the severity and frequency of natural hazard-related disasters are on the rise, it is critical to benefit from innovative technology solutions, such as social media data analytics [
82,
83]. Data obtained from these social media platforms can be used to warn others about unsafe areas and to raise funds for disaster relief [
84].
At the onset of disasters and emergencies, local and national governments are tasked with response and rescue. In the event of an emergency, local government areas need to know the type and location of the disaster, the severity and geographical extent of the impact, and which sectors (infrastructure, economic, environmental, or social) are affected. While the dataset of this study was derived from past data, current emergencies require live information and actionable reports for decision-making. The themes and overall sentiment of people who may be impacted by these disasters, as captured in the findings, can be found in
Table 12.
The dataset collected from Twitter was divided into five categories: bushfire, damage, disaster, floods, and winds. These disasters were very common and usually affected a large portion of the population. The statistics of the tweet dataset for all five categories used in this study are provided below.
Total number of tweets for Bushfires: 40,148;
Total number of tweets for Damages: 14,275;
Total number of tweets for Disasters: 10,971;
Total number of tweets for Floods: 7352;
Total number of tweets for Winds: 6003.
Situational awareness and information sharing: Cluster 1 had the lowest negative (n = 10,764) and positive (n = 2408) sentiment of the four clusters. Bushfire was the most mentioned negative (n = 3748) and positive (n = 10,746) sentiment keyword. The cluster dates were similar between QLD, NSW, and ACT, which represented the major bushfire that impacted the eastern states. VIC and NT were also impacted by bushfires during the May 2019 period. However, Cluster 1 for WA and TAS appeared in the October 2019 period, which also represented bushfires.
Cluster 2 had the highest negative (n = 38,555) and positive (n = 5520) sentiment of the four clusters. Bushfire was also the most mentioned negative (n = 22,347) and positive (n = 5520) sentiment keyword. The cluster dates were similar between all the states and territories. This shows that a major disaster directly or indirectly impacted all the communities in Australia. This is evident, as Australia experienced one of the worst bushfire seasons in its recorded history. This caused massive damage throughout the country, with fires in each state and territory. The east coast (QLD, VIC, NSW) experienced widespread destruction from mega-blazes, such as the Currowan bushfire, which was just one of many catastrophic bushfires during the September 2019 to March 2020 period. In terms of the impact, Australia saw 34 fatalities, 3500 homes lost, and 18.7 million hectares of area burnt. It is noted that TAS was not directly impacted by the fires; however, people there were sharing information and reacting highly negatively.
Cluster 3 had the second highest negative (n = 16,314) and positive (n = 3228) sentiment of the four clusters. The most mentioned negative (n = 8097) and positive (n = 3228) keyword was also bushfire. The cluster dates were also very similar between all the states and territories. Nonetheless, the difference between positive and negative tweets diminished compared with Cluster 2. This is because the bushfires were being brought under control and there was more positive sentiment within the community.
Cluster 4 had the third highest negative (n = 15,266) and positive (n = 2832) sentiment of the four clusters. Bushfire was still the most mentioned negative (n = 6252) and positive (n = 1196) keyword. There was also a strong correlation between the states and territories. There was a range of small bushfires that began from severe weather. Nevertheless, the impact was low, which translated into minimal negative sentiment from the community.
The clusters show that individuals use social media to gather and disperse useful information regarding disasters in Australia. The use of Twitter to spread awareness can be categorized into two groups: situational awareness and information sharing [
85]. Clusters 1 and 2 were examples of situational awareness as the tweets provided a useful insight into time and safety of a critical situation [
86]. These tweets can assist first responders in assessing the amount of damage and victims’ locations and needs. Information sharing was evident in Clusters 3 and 4, as the disaster had already occurred, and could be used for directing needed resources into local communities that may have felt an additional burden from the bushfires. Both situational awareness and information sharing help accelerate disaster response and alleviate both property and human losses in crisis management [
87].
Sentiment analysis is a technique that can detect posts relevant to situational awareness. It is useful for better understanding the dynamics of the network, including user feelings, panic, and concerns, and for identifying sentiment polarity during disaster events [
88]. The sentiment analysis revealed that, across these clusters, the most frequently used keyword was ‘bushfire’ as a natural hazard-related disaster in Australia. It was also the most common negative keyword throughout all the states and territories: VIC (n = 4642), QLD (n = 4764), NSW (n = 7700), ACT (n = 1633), NT (n = 189), SA (n = 264), TAS (n = 346), and WA (n = 2452). The most frequent positive keyword was also ‘bushfire’ in VIC (n = 1139), QLD (n = 353), NSW (n = 1124), NT (n = 21), TAS (n = 62), and WA (n = 62). In the ACT, by contrast, ‘damage’ (n = 112) and, in SA, ‘wind’ (n = 321) were the most frequent positive keywords.
Acknowledging the negative sentiment during the bushfire crisis, as seen in Clusters 1 and 2, helps authorities find answers to their questions and make better decisions regarding disaster event assistance. As the 2019–2020 bushfire was the worst bushfire event that Australia has ever experienced, the high negative sentiment identified in the findings could have been used to convey information about the devastation, the recovery situation, and donation requests to the public in more efficient ways [
84].
Further, it is important for government agencies to capture community perceptions and demands immediately after a disaster event. At present, governments or related emergency agencies generally learn about community demands several months after an incident [
22]. However, social media offers live or near real-time updates about community vulnerabilities in a disaster zone. Therefore, governments should take the necessary actions to benefit from the situational awareness opportunities that social media channels provide. To that end, responsible government agencies should establish and maintain a strong presence within social networks. By being within these networks, the authorities can quickly identify immediate community vulnerabilities and demands. At the same time, these agencies should also take measures to filter out the misinformation that exists in social media networks [
64].
5.2. Myths and Facts about Social Media Data for Disaster Damage Assessments
The emergence of social networks and crowdsourcing in disaster damage assessment-related studies has enabled the application of inclusive disaster management approaches more than ever before. However, since the first application of social media data in managing the Haiti earthquake and tsunami, many myths and facts have emerged that discourage or encourage the use of social media data in disaster management. Among these, the most critical and valid arguments are: (a) social inequality in the usage of social media (the digital divide); (b) limited information from severely damaged areas (spatial heterogeneity) [
89]; (c) low information accuracy, and (d) inability to do a detailed disaster damage assessment.
The most common criticism is that not all people have equal access to social media, which is mostly referred to as the ‘digital divide’ in the literature. Still, the validity of this argument is decreasing considerably over time [
90]. Social media and related platforms became popular around 2006. There were 3.4 billion social media users by January 2019, and the number is growing at an increasing rate. Especially during the COVID-19 pandemic, people used more online platforms to work and study, and the continuous lockdowns increased the use of social media. As a result, digital devices and technology overtook many of the priorities that even low-income people had before the pandemic [
91]. Even where low bandwidth, limited coverage, and other related issues persist, people have devised their own ways to overcome them [
92].
Spatial heterogeneity in generating information from highly damaged areas during a disaster is the second critical argument against the use of social media data in disaster damage assessment. Besides, there is a high possibility of receiving more tweets from the areas with more people [
93]. Consequently, it is hard to determine the extent of disaster damage simply by using disaster damage-related data. For this reason, this study sought to identify ‘emerging unusual Twitter peaks’ in each state.
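To illustrate why peaks relative to a state's own baseline are more informative than raw volumes, the Python sketch below flags days whose tweet count exceeds that state's mean by a multiple of its standard deviation. This threshold rule, the parameter `k`, and the daily counts are assumptions for illustration; the study does not specify how unusual peaks were defined.

```python
from statistics import mean, pstdev


def unusual_peaks(daily_counts, k=2.0):
    """Return indices of days whose count exceeds mean + k * standard deviation."""
    mu = mean(daily_counts)
    sigma = pstdev(daily_counts)
    if sigma == 0:
        return []  # a flat series has no peaks
    return [i for i, c in enumerate(daily_counts) if c > mu + k * sigma]


# Hypothetical daily tweet counts for a populous and a sparsely populated state.
populous_state = [100, 110, 95, 105, 100, 400, 102]
small_state = [2, 3, 2, 2, 3, 12, 2]

peaks_large = unusual_peaks(populous_state)
peaks_small = unusual_peaks(small_state)
print(peaks_large, peaks_small)
```

Because each series is compared with its own baseline, the spike in the small state is flagged even though its absolute volume is far below an ordinary day in the populous state.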
Low information accuracy is the third critical argument. This argument is not specific to disaster damage assessment; it applies to most research studies that rely on social media data analytics. The sharing of rumors and false information and the generation of imaginary information could reduce the accuracy of social media data. During a disaster event, social media usage increases and, with it, the possibility of false information being shared. Therefore, studies need to adopt the specific methods established in the literature to limit the spread of misinformation. The presence of the respective authorities in social media networks can also help provide verified information to the public.
Mostly, geolocated social media data or social media data with location information are used in disaster management-related studies. Still, the accuracy of the locations tagged in tweets may not be perfect [
94]. Among the many social media platforms, Twitter is the most prominent one that provides geotagged information for research purposes. Sometimes, tagged locations may not represent the exact location of the sender. This is a possible drawback of using geotagged social media data for disaster management. However, this is also becoming an outdated concern, as social media platforms allow people to tag distant locations. Consequently, any person can tag any location that relates to the content discussed in the message. Moreover, the circulation of geotagged images and videos on social media is becoming increasingly common and provides more location-specific, trustworthy information than text messages [
95].
The inability to do a detailed disaster damage assessment is the fourth argument, which applies especially to disaster damage-related studies. Disaster damage assessments can range from the national level to the local level. This study demonstrated the possibility of obtaining a bigger picture of disaster damages at the state level. Therefore, national level damage assessments can be derived from the state level assessments. In addition, with the prevalence of geotagged social media messages, images, and videos of certain locations, damaged buildings, and so on, local level disaster damage assessment studies are likely to become popular in future research, since monitoring the environment through the cameras of handheld mobile devices has already become more a fashion than a practice [
96].
6. Conclusions
Climate change-induced natural hazard-related disasters have become frequent events of our time sparing no corners of the world [
97]. They are so common and disruptive that even disaster terminology has been subject to change over the last decade [
98,
99,
100]. These more severe and frequent disasters are causing catastrophic results for many urban centers around the globe [
101]. Due to the magnitude of the problem, almost every time, local, regional, and national emergency service authorities fail to manage the externalities of these climate change-induced natural hazard-related disasters [
102]. Novel and innovative solutions are therefore needed to strengthen these authorities’ capabilities to combat the disruptive impacts of these disasters [
103]. These solutions will also contribute to the knowledge-based development of cities, and in return will help in the adaptability of cities to climate change impacts [
104,
105].
The study reported in this paper focused on capturing the emotional state of local populations during natural hazard-related disasters by analyzing disaster damage-related geo-tweets. This analysis, in the context of Australian states and territories, sheds light on how to analyze the geographic distribution and occurrence frequency of various disasters and their damages based on the geo-tweets analyzed. By doing so, the paper showcases the advantages of using social media data, i.e., Twitter data, as an effective and (to a degree) unbiased source for natural hazard-related disaster analytics. This provides insights into the impact that a particular disastrous event may impose on a local community or a city/region. In sum, the study informs authorities on an innovative way to analyze the geographic distribution and occurrence frequency of various disasters and their damages based on geo-tweet analysis.
In terms of prospective studies, we will expand the methodological approach to incorporate social media data from other networks, such as Facebook and Instagram, to increase the dataset and capture broader audiences. In addition, future studies will also collect data from the social media accounts of disaster and emergency services-related agencies. Investigating how information could be pre-processed to be immediately usable by the corresponding authorities is another future research direction. Additionally, we will explore the use of real-time social media feeds and algorithmic analysis to provide timely, critical, and deeper insights into real-time public perception of a natural hazard-related disaster event. Lastly, in this study, our geographic unit of analysis was states and territories. Thus, the paper presents an aggregated view on the use of social media analytics in detecting natural hazard-related disaster impacts. Our prospective studies will focus on a more disaggregated level of analysis that includes cities, local government areas, and suburbs.