Big Data in Natural Disaster Management: A Review

Abstract: The age of big data has opened new options for natural disaster management, primarily because of the varied possibilities it provides for visualizing, analyzing, and predicting natural disasters. From this perspective, big data has radically changed the ways in which human societies adopt natural disaster management strategies to reduce human suffering and economic losses. In a world that is now heavily dependent on information technology, the prime objective of computer experts and policy makers is to make the best of big data by sourcing information from varied formats and storing it in ways that allow it to be used effectively during the different stages of natural disaster management. This paper presents a systematic review of the literature on the role of big data in natural disaster management and highlights the present status of the technology in providing meaningful and effective solutions. It presents the findings of several researchers on varied scientific and technological perspectives that have a bearing on the efficacy of big data in facilitating natural disaster management. In this context, this paper reviews the major big data sources, the associated achievements in different disaster management phases, and emerging technological topics associated with leveraging this new ecosystem of big data to monitor and detect natural hazards, mitigate their effects, assist in relief efforts, and contribute to the recovery and reconstruction processes.


Introduction
Natural disasters can be defined as a combination of natural hazards and vulnerabilities that endanger communities incapable of withstanding the resulting adversities [1]. Human beings invariably face threats of natural as well as human-made disasters, which often lead to massive damage, human suffering, and negative economic impacts. The main characteristics of natural disasters are unpredictability, limited availability of resources in impacted areas, and dynamic changes in the environment [2]. Unpredictability implies that severe impacts on people and property during natural disasters cannot be predicted with acceptable accuracy [3]. The issue of limited resources emerges because unpredictability makes it difficult to allocate adequate resources in advance. Dynamic changes in the environment arise because it is difficult to predict the movement of people and the damage that a natural disaster may cause; such changes are hard to predict from data that pertain to normal periods [4]. Introducing disaster management policies and applying appropriate levels of information technology and equipment offer immense potential for enhancing disaster management capabilities. In addition, evolving trends have opened massive technological resources for reducing disaster risks [5].
Big data is defined as the technological paradigm that allows researchers to efficiently analyze the vast quantities of data made available through current practices [6,7]. It is the collection of scientific and engineering methods and tools that help in making the best of massive amounts of available data. Big data addresses not only storage issues, but also issues related to the accessibility, distribution, analysis, and effective visual presentation of data and analysis results. Big data has now become a crucial element of communication, complementing the conventional exchange of intentional and explicit messages, such as first responders talking over a voice connection or a text message warning citizens of an approaching natural disaster [8,9]. More precisely, communication also entails understanding and monitoring the entire body of public and openly available communication, such as messages and content that are publicly exchanged on social media. In such situations, people may be exchanging messages to report their condition to their loved ones or to appeal for help. Big data allows researchers to analyze all of these communications in detail, which yields valuable information that is valid for the population at large, such as information about a disease outbreak.
In general, the disaster management cycle comprises four distinct phases: "mitigation", "preparedness", "response", and "recovery". The goal of the mitigation phase is to minimize the effects of a disaster (building warning codes and risk zones, risk analysis, public education). The main focus of the preparedness phase is on planning how to respond to a disaster. It includes preparedness plans, emergency exercises, and training, as well as the development and implementation of early warning systems. Response activities provide the services required to save lives, safeguard property, and protect the environment during a disaster. "Recovery" is the process of returning systems to normal levels after a disaster.
In recent years, the literature on disaster management has mostly focused on the potential of using specific kinds of data for natural disaster management [10-12]. In this context, this paper reviews the major big data sources, the associated achievements in different disaster management phases, and emerging technological topics associated with leveraging this new ecosystem of big data to monitor and detect natural hazards, mitigate their effects, assist in relief efforts, and contribute to the recovery and reconstruction processes.

Review Methodology
In the literature review process, we followed a systematic approach to selecting papers related to "big data" and "disaster management". First, we searched Google Scholar with the two keywords and obtained 4432 results. Second, we manually selected the papers that were most relevant to our topic, which led to a total of 223 articles. Third, we filtered the list to include only journal articles, leading to 149 articles from 101 journals. It is very likely, though, that we missed several articles on the same topic. These articles were reviewed individually for analysis. Starting from two articles in 2011, the number of articles in the field grew gradually, with nine articles in 2012, 15 in 2013, 18 in 2014, 46 in 2015, 25 in 2016, 21 in 2017, and 12 in 2018 (Figure 1). The topic "big data" in combination with "disaster management" peaked in 2015; since then, researchers may have turned to other related topics or emerging technologies benefiting disaster management. The rest of this paper discusses three perspectives based on the 149 articles: (1) major data sources; (2) big data contributions in different disaster phases; and (3) emerging technologies benefiting from big data and disaster management.

Major Data Sources
To the best of the authors' knowledge, there is no standard definition of "big data" in disaster management. A Federal Geographic Data Committee (FGDC) report [13] reviewed the emerging data collections, including real-time spatiotemporal data (e.g., GPS data), broadening of participation (e.g., Volunteered Geographic Information and social media), small satellites, and unmanned aerial vehicles (UAVs). A United Nations (UN) report [14] illustrated examples of big data sources in disaster resilience, including exhaust data (mobile-based, financial transactions, transportation, online traces), digital content (social media and crowdsourcing), and sensing data (physical sensing devices and remote sensing). In this paper, we define big data as the integration of diverse data sources and the capability to analyze and use the data (usually in real time) to benefit the population involved in the disaster situation. The concept of big data thus goes beyond the datasets themselves, regardless of their size.
In this section, we review the major big data sources for disaster management, especially the emerging ones, including satellite imagery, aerial imagery and videos from unmanned aerial vehicles (UAVs), sensor web and Internet of Things (IoT), airborne and terrestrial Light Detection and Ranging (LiDAR), simulation, spatial data, crowdsourcing, social media, and mobile GPS and Call Detail Records (CDRs). Figure 2 illustrates the distribution of the reviewed articles by data source and year of publication. Article numbers increased for almost all major data sources during 2014-2016, when the topic "big data" was popular in "disaster management", and decreased in 2017. Satellite imagery, crowdsourcing, and social media are the most popular data sources for disaster management.

Satellite Imagery
Satellite remote sensing technology provides qualitative and quantitative opportunities for varied functions such as assessing post-disaster damage, responding through operational assistance [15-17], and risk reduction. The most remarkable contribution of remote sensing imagery is post-disaster damage assessment through change detection [18]. Remote sensing is applied in disaster management through higher-resolution, multidimensional, and multi-technique methods.
Recently developed high-resolution satellite imagery facilitates the collection of detailed texture information before and after a natural disaster for change detection. Such information is crucial to obtain after a disaster occurs. Information collected through high-resolution satellite imagery may pertain to the structural deformation of land areas, directional changes and creation of water bodies, and details about damaged building stock in the disaster-affected area [19,20]. Consequently, high-resolution remote sensing techniques can be used to locate the affected area accurately and to suggest and initiate rescue methods for the immediate location and recovery of victims [19]. In addition to two-dimensional information, satellites can also produce three-dimensional images with height information in the form of stereo images, which helps in identifying and measuring the intensity of damage [21,22]. Three-dimensional analysis of pre- and post-disaster stereo image pairs allows collapsed buildings and affected regions to be detected by comparing differences in building height, and allows the heights and volumes of piled-up debris to be estimated [23-25]. To deal with the limitations of optical satellite imagery under cloud cover, in rain, and at nighttime, active sensors such as synthetic aperture radar (SAR) can be used to extend observational capability during a natural disaster [18,26].
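The height comparison described above can be illustrated with a minimal sketch. It assumes that the pre- and post-disaster height grids (for example, surface models derived from stereo pairs) are already co-registered on the same grid; the function name, the 3 m drop threshold, and the toy values are hypothetical choices for illustration and are not taken from the cited studies.

```python
import numpy as np

def height_change_summary(pre_dsm, post_dsm, cell_area_m2, drop_threshold_m=3.0):
    """Compare pre- and post-disaster surface heights on the same grid.

    Returns a boolean mask of cells whose height dropped by at least
    drop_threshold_m (candidate collapsed structures) and a rough volume
    of material piled above the old surface (a debris proxy).
    """
    pre = np.asarray(pre_dsm, dtype=float)
    post = np.asarray(post_dsm, dtype=float)
    if pre.shape != post.shape:
        raise ValueError("height grids must be co-registered on the same grid")
    diff = post - pre  # negative: surface dropped; positive: material piled up
    collapsed_mask = diff <= -drop_threshold_m
    debris_volume_m3 = float(diff[diff > 0].sum() * cell_area_m2)
    return collapsed_mask, debris_volume_m3

# Toy 3x3 grids in meters: a 10 m building partly collapses and debris
# spreads onto the neighboring open ground (assumed values).
pre = np.array([[10.0, 10.0, 0.0],
                [10.0, 10.0, 0.0],
                [0.0, 0.0, 0.0]])
post = np.array([[2.0, 3.0, 1.5],
                 [10.0, 10.0, 1.0],
                 [0.0, 0.0, 0.0]])
mask, volume = height_change_summary(pre, post, cell_area_m2=4.0)
print(int(mask.sum()), volume)
```

In practice the threshold would have to be tuned to the vertical accuracy of the height data, which this sketch does not model.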
Besides identifying the impacts or changes brought about by disasters, satellite imagery has been used intensively for disaster risk reduction, including identifying human settlements [27], assessing flood risk [28], and reducing landslide risk [29].

UAV-Based Aerial Imagery and Videos
Aerial imagery captured via unmanned aerial vehicles (UAVs) is playing an increasingly important role in disaster response because of its efficiency in providing situational awareness. Aerial imagery can be captured at a high spatial resolution and processed much faster than satellite imagery [30]. Beyond the improvements offered by oblique imagery acquired from piloted platforms, UAVs provide additional advantages [31]. These include fully controlled flight, very high resolution (VHR) imagery of up to 2 cm resolution that allows the detection of fine cracks, and a large degree of image overlap, which allows the generation of very detailed point clouds. With the use of UAVs, first responders can better understand which structures were affected by a given event and can determine the extent of the damage to these structures.
UAVs can carry various types of sensors, including still and video cameras, infrared and ultraviolet sensors, radiation sensors, weather sensors, spectrum analyzers, and LiDAR reflectors. UAV imagery and videos can provide real-time, decision-relevant situational information for transportation planners who survey damaged roads and evacuation routes, and can support transport logistics [32].
UAVs are now recognized as a reliable data source for disaster information and for improving the estimation of the scale of damage [33]. With the combined use of UAV imagery, satellite, and aircraft data, it becomes possible to identify façade and roof damage to buildings by integrating geometrical transformation and environmental information [34]. Combining UAV imagery with crowdsourcing enables the annotation of features of interest, such as damaged shelters and roads blocked by debris [30].
However, UAVs are still in the development stage and several issues need to be resolved, such as short battery life, which leads to a limited area of coverage; unforeseen behavior in different atmospheric conditions; limited scope of pilot training for users; and legislation that severely limits the use of UAVs in most countries. UAV technology also raises privacy issues. Community concerns about exposing victims' locations need to be addressed by officials and UAV developers. Privacy and individual rights need to be protected alongside public safety during UAV operations in a complex disaster environment [35,36].

Wireless Sensor Web and Internet of Things
Chen et al. [37] utilized existing Wireless Sensor Network (WSN) technologies to develop an early warning system for natural disasters. WSN technologies provide reliable data transmission, incorporate data fusion from heterogeneous sensors, and minimize energy consumption. Erdeji et al. [38] reviewed the overall structure of the WSN and UAV systems for natural disasters, and proposed the Inundation Monitoring and Alarm Technology in a System of Systems to incorporate smartphone data with WSN for enhanced situational awareness. Erman et al. [39] integrated low-cost embedded devices based on WSN and UAVs to improve the response time in critical situations, minimize latency, and maximize the delivery success ratio. WSN has also been used extensively to facilitate communication between a disaster-impacted population and rescue teams when traditional communication infrastructure fails [40,41]. Tuna et al. [42] utilized a group of mobile robots to explore an unknown region after a disaster and used WSN to extend the communication range for detecting human presence.
Although widely deployed in disaster management, WSN lacks overall coordination of heterogeneous data sources and protocols from "socio-techno-economic perspectives" [43]. WSN is an essential part of Internet of Things (IoT) technologies. Ray et al. [43] reviewed several IoT-enabled disaster management systems, including BRINCO (a notification system for earthquake and tsunami warning), BRCK (a communication system for low-connectivity areas), and GRILLO (an earthquake alarm sensor network). The advantages of IoT include its capability to compensate for the poor infrastructure of vulnerable populations, especially in developing countries, and its role as an alternative means of communication: IoT-enabled devices (battery-powered wireless devices) can improve data network resilience during disaster situations [44].

LiDAR
Airborne and terrestrial Light Detection and Ranging (LiDAR) makes it possible to extract high-quality elevation models and other features, providing reliable information about on-the-ground conditions during a disaster situation. LiDAR equipment is relatively expensive, and collecting and processing data with it can often prove to be time consuming. However, digital elevation models (DEMs) generated from LiDAR data can have a very high resolution and are very accurate. Unlike aerial photography, LiDAR scanners collect data at a very fine (centimeter-scale) resolution and can gather information about the ground surface below the vegetation. This ability is very useful for geological mapping and for measuring geological features, including monitoring volcano growth and predicting eruption patterns [45,46]. In addition, LiDAR data is highly sensitive to water and thus proves to be a suitable data source for flood prediction and assessment [47,48]. LiDAR has been useful in identifying and assessing elevation changes or structural damage after natural disasters [49]. For example, Kwan and Ransberger [50] demonstrated the use and analysis of LiDAR data before and after Hurricane Katrina in detecting transport network obstructions during a disaster response. They also demonstrated how the data helped in reducing the time taken by first responders to reach disaster sites. Similarly, Moya et al. [51] detected collapsed buildings after the earthquake that struck Kumamoto, Japan on 16 April 2016 by using digital surface models (DSMs) acquired during LiDAR flights before and after the earthquake.

Simulation Data
Numerical simulation or forecasting is one of the most significant contributions to the prediction of natural disasters arising from meteorological phenomena, land surface phenomena, and various types of pollution [52-54]. In addition, 3D modeling has been helpful in predicting the potential damage and in assessing the changes occurring after the disaster [55,56].
Large amounts and different types of observation data are generated during natural disasters, and these data can be used to produce, verify, validate, and improve models that represent the complexity of all facets of disaster management. Agent-based models are suitable for exploring human behavior and rapid low-level environmental changes using available high-resolution data. Dou et al. [57] developed an agent-based framework for human rescue operations in a landslide disaster to evaluate a contingency plan. The framework extracts information from high-resolution remote sensing images, simulates a landslide environment based on a three-dimensional landslide geological model, and uses a multi-agent simulation approach to simulate individuals' behavior under dynamic disaster scenarios. The simulations have positive effects on the evacuation process, e.g., speeding up the process and reducing the number of casualties. Mas et al. [58] reviewed agent-based models for tsunami mitigation and evacuation planning through case studies in Indonesia, Thailand, Japan, and Peru. Kureshi et al. [59] combined agent-based models with physical sensors and adaptively managed the heterogeneous collection of data resources and agent-based models to create what-if scenarios in order to determine the best course of action.
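To make the idea of agent-based what-if scenarios concrete, the following is a deliberately small sketch and not a reconstruction of any of the frameworks cited above. It assumes agents on a grid who each move one cell per tick toward a single exit, and an exit that lets only a fixed number of agents through per tick; the class, function names, and parameters are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Agent:
    row: int
    col: int
    evacuated: bool = False

def step_toward(agent, exit_cell):
    """Move one grid cell toward the exit (rows first, then columns)."""
    exit_row, exit_col = exit_cell
    if agent.row != exit_row:
        agent.row += 1 if exit_row > agent.row else -1
    elif agent.col != exit_col:
        agent.col += 1 if exit_col > agent.col else -1

def simulate_evacuation(agents, exit_cell, exit_capacity=1, max_ticks=1000):
    """Return the number of ticks until all agents have left, or None."""
    for tick in range(1, max_ticks + 1):
        passed = 0  # agents that went through the exit during this tick
        for agent in agents:
            if agent.evacuated:
                continue
            if (agent.row, agent.col) == exit_cell:
                # At the exit: leave only if capacity remains this tick.
                if passed < exit_capacity:
                    agent.evacuated = True
                    passed += 1
            else:
                step_toward(agent, exit_cell)
        if all(a.evacuated for a in agents):
            return tick
    return None

def make_agents():
    # Fresh agents for each run, because the simulation mutates them.
    return [Agent(0, 1), Agent(1, 0), Agent(0, 0)]

# What-if comparison: a narrow exit versus a wider one.
ticks_narrow = simulate_evacuation(make_agents(), exit_cell=(0, 0), exit_capacity=1)
ticks_wide = simulate_evacuation(make_agents(), exit_cell=(0, 0), exit_capacity=2)
print(ticks_narrow, ticks_wide)
```

Running the same population under two exit capacities is the simplest form of a what-if scenario; realistic models would add terrain, hazards, and behavioral rules calibrated against observation data.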

Vector-Based Spatial Data
Vector-based spatial data provides fundamental support for disaster management, including disaster forecasts about the extent of a particular hazard or disaster, vulnerability analysis for critical facilities (hospital, school, shelter, fire station, etc.) and human beings (age, gender, socioeconomic status, etc.), damage assessment on the actual impact of a hazard, resource inventory (supplies, equipment, vehicles, etc.), and infrastructure (transportation networks and utility grids). Tomaszewski et al. [60] reviewed the importance of geographic information systems (GIS) and major GIS data sources for disaster response, including Federal Emergency Management Agency (FEMA) GIS Data Feeds, Geonames, US Census Bureau TIGER, the National Map, World Bank Data, and OpenStreetMap. Herold and Sawada [61] reviewed the GIS applications of disaster management, with an emphasis on developing countries.

Crowdsourcing
Both crowdsourcing and social media data are contributed by the public. While crowdsourcing data are actively contributed, social media data are mostly passively contributed, because the contributors are not aware that their activities on social media are leading to data collection [62].
Active crowdsourcing platforms have been developed for users to actively contribute the required information [63-65]. These platforms are usually developed and implemented by members of the affected public or by non-governmental organizations (NGOs) such as Ushahidi (https://www.ushahidi.com/). The purpose of such platforms is to improve disaster response and resource allocation based on real-time reports from disaster victims. Despite several success stories, the challenges of using active crowdsourcing platforms are still worthy of consideration, particularly regarding the credibility of crowdsourced data and the value of integrating it into the decision-making process [66].
The analysis and processing of crowdsourced disaster data require varied tools and automation processes in view of the data's noisy nature, large volume, and fast streaming speed. The significance of crowdsourced data can be prioritized according to the location of the disaster area and the analyzed data [67]. Online platforms and mobile applications have been established to collect and distribute crowdsourced data during and after disasters [68]. dos Santos Rocha et al. [12] discussed the strategies that disaster managers can adopt to improve the performance of digital volunteers. They also asserted that crowd sensing and collaborative mapping can be used in functions involving distributed intelligence, participatory engagement, and self-mobilization, and they suggested adopting a feedback mechanism among the stakeholders to enhance the effectiveness of communication.
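One simple way to prioritize crowdsourced reports by location is to rank them by distance from the disaster area. The sketch below is an assumption-laden illustration and not the method of the cited work: the report fields (`id`, `lat`, `lon`, `text`), the single-point disaster center, and the 50 km radius are all hypothetical, and great-circle distance is computed with the standard haversine formula.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

def prioritize_reports(reports, center, radius_km):
    """Keep reports within radius_km of the center, nearest first."""
    center_lat, center_lon = center
    scored = []
    for report in reports:
        distance = haversine_km(center_lat, center_lon, report["lat"], report["lon"])
        if distance <= radius_km:
            scored.append((distance, report))
    scored.sort(key=lambda pair: pair[0])
    return [report for _, report in scored]

# Hypothetical reports around an assumed disaster center at (0.0, 0.0).
reports = [
    {"id": "a", "lat": 0.0, "lon": 0.1, "text": "road blocked"},
    {"id": "b", "lat": 0.0, "lon": 1.0, "text": "power outage"},
    {"id": "c", "lat": 0.05, "lon": 0.0, "text": "people trapped"},
]
ranked = prioritize_reports(reports, center=(0.0, 0.0), radius_km=50.0)
ranked_ids = [r["id"] for r in ranked]
print(ranked_ids)
```

Distance alone ignores report content and credibility; a fuller pipeline would combine it with text analysis of the kind discussed in this section.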
Government agencies have been slow in launching their own crowdsourcing platforms for disaster management. This might be due to concerns about the technological and human resource costs and the risks of adopting and managing crowdsourcing for government operations. However, government agencies were criticized for their poor management of Hurricane Katrina in 2005, particularly for inadequate information collection compared with citizen efforts on response and recovery [69,70]. In recent years, government agencies have started to interact with citizens during disaster situations to obtain urgent real-time information and to better coordinate search and rescue operations [71,72]. FEMA coordinated different types of crowdsourcing on image analysis, text messages, and mapathons (https://www.data.gov/event/fema-disaster-crowdsourcing-exchange/) for Hurricanes Harvey, Irma, and Maria and the California wildfires of 2017.

Social Media
Social media platforms, including Twitter, YouTube, Foursquare, and Flickr, have been contributing significantly to disaster management. Geotagged social media data can be harvested as a stream from the APIs provided by the social media companies.
Social media services have contributed significantly to disaster management as a tool for communicating information during disasters [73]. Social media is increasingly being used by both NGOs and government disaster management agencies to determine public sentiment and reaction to an event [74]. The multidirectional flows of communication and information that online crisis crowdsourcing platforms facilitate can make response and recovery efforts more efficient [70].
Although social media provides crowdsourced data only implicitly, it is being used effectively in disaster management. Granell and Ostermann [75] observed that Twitter is more effective for detection and prediction and less significant for recovery and response functions. Other data sources such as spatial video, UAV, and phone call data can provide strong evidence (e.g., in-situ images) of the current situation in the affected area during post-disaster stages.
There are varied ways of using social media in disaster management, including data collection, analytic workflow, narrative construction, extraction of disaster-relevant information, geolocation pattern/text/image analytics, and the broadcasting of information through social media platforms [76].
A number of authors have shown that real-time analytics based on social media data provide good opportunities to detect and monitor events automatically [77,78]. Text messages are the basic source of analysis [79]. Visual analytics of social media data facilitate spatiotemporal analysis and create a spatial decision support environment that assists in evacuation planning and disaster management [80]. Given that social media does not rely only on text messages and provides more useful information through images and videos posted by users, visual analytics and image/video-based analysis are becoming more important in extracting the key information from social media posts [81].
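As an illustration of automatic event detection from text messages, the sketch below counts keyword-matching messages per time window and flags windows whose count rises well above a quiet baseline. This is a generic burst-detection heuristic chosen for illustration, not the approach of the cited studies; the keyword set, window length, baseline length, and the factor `k` are assumptions.

```python
from statistics import mean, pstdev

def keyword_counts(messages, keywords, window_s):
    """Count messages containing any keyword in each fixed time window.

    messages: iterable of (timestamp_seconds, text) pairs.
    Returns a list where index i is the count for window i.
    """
    lowered = [k.lower() for k in keywords]
    buckets = {}
    for timestamp, text in messages:
        if any(k in text.lower() for k in lowered):
            window = int(timestamp // window_s)
            buckets[window] = buckets.get(window, 0) + 1
    if not buckets:
        return []
    return [buckets.get(i, 0) for i in range(max(buckets) + 1)]

def detect_bursts(counts, baseline_len, k=3.0):
    """Flag windows after the baseline whose count exceeds mean + k * std."""
    baseline = counts[:baseline_len]
    mu = mean(baseline)
    # Floor the spread at 1.0 so a very quiet baseline does not over-trigger.
    sigma = max(pstdev(baseline), 1.0)
    threshold = mu + k * sigma
    return [i for i, c in enumerate(counts) if i >= baseline_len and c > threshold]

# Hypothetical messages bucketed into 60-second windows.
messages = [
    (10, "Flood on Main St"),
    (20, "nice weather today"),
    (70, "FLOOD rising fast"),
    (75, "flood water in the basement"),
]
per_window = keyword_counts(messages, {"flood"}, window_s=60)

# Hypothetical hourly counts: six quiet hours, then a sharp rise.
hourly = [2, 3, 2, 4, 3, 2, 30, 41, 5]
burst_windows = detect_bursts(hourly, baseline_len=6)
print(per_window, burst_windows)
```

Keyword matching is crude and will miss paraphrases and misspellings; it stands in here for whatever text classifier a real system would use.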

Mobile GPS and Call Detail Records
Mobile GPS (Global Positioning System) has emerged as an effective means of gathering mobile sensing data because it can be used to detect human mobility and behavior during large-scale natural disasters. Horanont et al. [82] made use of data collected after the 2011 Great Japan Earthquake and provided useful information on how humans react in disaster scenarios and how the evacuation process can be monitored on a real-time basis. GPS data also allow the location, magnitude, and other details of an earthquake fault to be calculated automatically by using one or more of the three basic components of GPS: absolute location, relative movement, and time transfer [83]. Similarly, Song et al. [84] developed a human mobility model to assess the disaster behavior and mobility patterns of people after the Great East Japan Earthquake and the Fukushima nuclear accident. They found that the behavior of human beings and their mobility following large-scale disasters sometimes correlate with their mobility patterns during normal times. In addition, they found that mobility patterns of human beings are also strongly impacted by factors such as their social relationships, intensity of disaster, extent of damage, availability of government shelters, news reporting, and population flows. GPS can also be used to determine the resilience of transportation systems during a natural disaster. Donovan and Work [85] conducted quantitative research on measuring transportation resilience by observing about 700 million taxi trips in New York City, which they used to analyze and assess the resilience of the transportation infrastructure to Hurricane Sandy.
The call monitoring and recording applications used by telecommunication companies generate an extremely large amount of call detail records (CDRs) in real time, and companies constantly need to analyze these data and make the required changes to boost productivity. The volume of calls and data captured by the call monitoring applications is so large that it is impossible to analyze it manually and reach conclusions on the behavior of the network.
CDR datasets are collections of spatiotemporal traces that can characterize individual mobility and social network behaviors at very fine scales. CDR datasets contain information about the location and time at which a communication (call/SMS) is made, along with unique identifiers for the sender and receiver. These data may be valuable for disaster management when they are used to estimate population size and density in a region or city. The estimate is based on the number of phone subscribers that are present in the coverage area of each cellular tower [86,87]. Information on population distribution before and after a disaster can be useful when assessing exposure risk and response needs. Ghurye et al. [88] used call records with granular behavior models to evaluate the similarities and differences between normal and disaster patterns, using data on the floods that occurred in Rwanda in 2012. Results showed that disasters tend to disrupt both mobility patterns and communication behaviors, while recovery efforts can take several weeks.
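The tower-based population estimate described above can be sketched as a count of distinct subscribers per tower, compared before and after an event. The record layout `(subscriber_id, tower_id, timestamp)` is a simplified, hypothetical stand-in for real CDR fields, and the sketch ignores market share and coverage-area corrections that an actual population estimate would need.

```python
from collections import defaultdict

def subscribers_per_tower(records):
    """Count distinct subscribers observed at each tower.

    records: iterable of (subscriber_id, tower_id, timestamp) tuples.
    """
    seen = defaultdict(set)
    for subscriber_id, tower_id, _timestamp in records:
        seen[tower_id].add(subscriber_id)
    return {tower: len(ids) for tower, ids in seen.items()}

def relative_change(before, after):
    """Relative change in subscriber counts per tower (None if no baseline)."""
    changes = {}
    for tower in set(before) | set(after):
        base = before.get(tower, 0)
        current = after.get(tower, 0)
        changes[tower] = (current - base) / base if base else None
    return changes

# Hypothetical records: subscribers shift from tower T1 toward tower T2.
before_records = [("u1", "T1", 1), ("u2", "T1", 2), ("u1", "T1", 3), ("u3", "T2", 4)]
after_records = [("u1", "T2", 10), ("u2", "T2", 11), ("u3", "T2", 12), ("u4", "T1", 13)]

before = subscribers_per_tower(before_records)
after = subscribers_per_tower(after_records)
change = relative_change(before, after)
print(before, after, change)
```

A drop at one tower together with a rise at a neighboring tower is the kind of signal that could indicate displacement, although it can also reflect damaged towers rather than moving people.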

Usage of Big Data in Disaster Management Phases
The disaster management cycle is a continuous process involving different phases of planning and preparing for expected disasters, including long-term mitigation (i.e., "mitigation/prevention"); short-term preparation and prevention (i.e., "preparedness"); reducing impacts of a disaster through response and rescue efforts (i.e., "response"); and restoration through clean-up and reconstruction (i.e., "recovery") [89]. In this section, we classify the 149 articles into the four phases and review the "mitigation/prevention" phase in two aspects: (1) long-term risk assessment and reduction, and (2) forecasting and prediction; the "preparedness" phase in two aspects: (1) monitoring and detection, and (2) early warning; the "response" phase in two aspects: (1) damage assessment, and (2) post-disaster coordination and response; and the post-disaster "recovery" phase (Figure 3). Table 1 shows the disaster management phases, major data sources, and the application fields that were mapped based on the literature reviewed in this paper.

Mitigation/Prevention

Long-term Risk Assessment and Reduction
Satellite images allow experts to identify geographical and infrastructure risks. Kwak [93] developed a multiple satellite-based flood mapping and monitoring system to assist in risk assessment during disaster risk management. In this system, the Moderate Resolution Imaging Spectroradiometer (MODIS)-derived synchronized floodwater index was used to detect the maximum extent of flooding based on annual time-series data for the year 2015. Similarly, Skakun et al. [15] utilized Landsat-5/7 satellite images from 1989 to 2012 and produced a flood risk assessment study that included direct damage categories such as dwelling units, roads, health facilities, and schools. Among these facilities, buildings are one of the most important factors to assess for risk reduction, since they represent human settlements in villages and cities. Building collapse is one of the major sources of casualties in disaster scenarios such as hurricanes and earthquakes, and population distributions can be estimated from building stocks [91].
User-generated data help fill past research gaps and provide new opportunities for research on risk assessment. For example, McCormick [94] introduced a crowdsourcing system for oil spill risk assessment, which detected factors that were not found in official expert and government risk assessment reports. Residents observed exposures, such as smells, smoke, and other potential risk factors, that were not detected at harmful levels by expert risk assessors. After the 2015 Nepal Earthquake, Wilson et al. [88] used call detail records (phone metadata recording the numbers and times of calls) to estimate population distribution and socioeconomic status for risk assessment. It became evident that the analysis of mobile data and online content can help policymakers understand the behavior of communities, which allows researchers to test their response to disaster management plans and training [10]. According to Horita et al. [94], combining sensor data with user-generated data proves to be an effective risk management strategy. They developed a spatial decision support system that integrated information provided by wireless sensor networks (WSN) and volunteered geographic information (VGI) for flood risk management. This integration can support the decision-making of disaster management agencies by improving the maintainability and assessment of WSN, as well as by providing data from poorly gauged or ungauged areas through VGI.

Forecasting and Predicting
Predictive analytics can be a powerful tool for natural disaster management. Natural disasters are difficult to forecast because of the complexity of the physical phenomena and the variability of the parameters involved [96][97][98]. Existing natural disaster forecasting approaches mostly rely on underlying physical models and parameters. The prediction capabilities of these approaches have nevertheless been enhanced in recent years through higher resolution results, better physical schemes, and broader access to new data.
Higher resolution prediction models have been developed for meteorological events, including tropical cyclones, hurricanes, winter storms, and general weather prediction. Murakami et al. [52] introduced the High-Resolution Forecast-Oriented Low Ocean Resolution (FLOR) model (HiFLOR), which has high-resolution (~25-km mesh) atmosphere and land components and more moderate-resolution (~100-km mesh) sea ice and ocean components. The finer horizontal resolution (from the original 50 km to 25 km) was achieved by directly reducing the horizontal grid spacing of the cubed sphere in the underlying model scheme. This resulted in a more realistic simulation of the structure, global distribution, and seasonal and inter-annual variations of tropical cyclones. In addition, the higher resolution model was able to simulate and predict extremely intense tropical cyclones (Saffir-Simpson hurricane categories 4 and 5) and their inter-annual variations. This was the first time that a global coupled model was able to simulate such extremely intense tropical cyclones in multi-century simulations, sea surface temperature restoring simulations, and retrospective seasonal predictions. Another way to increase prediction resolution is through nested modeling. Goldenberg et al. [54] presented the 2012 version of the Hurricane Weather Research and Forecasting (HWRF) Model (H212). As an upgrade of the 2007 version, H212 has a horizontal resolution of up to 3 km, which makes it the first operational model to run at convection-permitting resolution. The foundational improvement was that the horizontal resolution of HWRF was increased to a 27-km outer domain with 9-km intermediate and 3-km innermost moving nests.
Better underlying physical schemes have been developed with the help of prediction models by integrating big earth observation data [95,96]. As argued by Zhang et al. [95], the most recent upgrades in boundary layer physics have benefited from analyses of in situ aircraft observations in the low-level eye-wall region of major hurricanes. The boundary layer height is an important parameter because it regulates the vertical distribution of turbulent fluxes and helps determine where turbulent fluxes are negligible. Zhang et al. [97] identified different height scales in the hurricane boundary layer in terms of the height of the maximum tangential wind speed, the inflow layer depth, and the mixed layer depth. By analyzing hundreds of GPS dropsondes released from aircraft, they found that there is a clear separation of the thermodynamical and dynamical boundary layer depths. They also found that both dynamical and thermodynamical boundary layer height scales tend to decrease as distance from the storm center is reduced. These findings were integrated into the operational Hurricane Weather Research and Forecasting (HWRF) Model [98], and prediction results showed considerable improvements in the simulated storm size, surface inflow angle, near-surface wind profile, and kinematic boundary layer heights in simulations carried out with the improved physics [95].
Prediction capability has also improved partly because of greater access to new data. A pertinent example is the Cyclone Global Navigation Satellite System (CYGNSS), a NASA earth science mission launched in 2016 that focuses on tropical cyclones and tropical convection [99]. The goal of CYGNSS is to support significant improvements in our ability to forecast tropical cyclone track, intensity, and storm surge through better observations and, ultimately, better understanding of inner-core processes. Another method of integrating observation data into prediction models is data assimilation, which is defined (https://www.ecmwf.int/en/research/dataassimilation) as "typically a sequential time-stepping procedure, in which a previous model forecast is compared with newly received observations, the model state is then updated to reflect the observations, a new forecast is initiated, and so on." Zhang and Weng [100] assimilated high-resolution airborne radar observations into hurricane prediction. With this assimilation, four to five days before the storms made landfall, the system produced good deterministic and probabilistic forecasts not only of track and intensity, but also of the spatial distributions of surface wind and rainfall.
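The sequential forecast, compare, update cycle in the quoted definition can be sketched for a single scalar state. This is a minimal illustration that assumes Gaussian errors and a scalar Kalman-style update; it is not the scheme used by any of the operational systems cited here, which assimilate millions of observations into high-dimensional model states. The function names, the persistence model, and the variance values are all hypothetical.

```python
def assimilate(forecast, forecast_var, observation, obs_var):
    """Blend one forecast with one observation (scalar Kalman-style update)."""
    gain = forecast_var / (forecast_var + obs_var)   # weight given to the observation
    analysis = forecast + gain * (observation - forecast)
    analysis_var = (1.0 - gain) * forecast_var       # uncertainty shrinks after the update
    return analysis, analysis_var


def run_cycle(state, var, observations, obs_var, model_step, model_var):
    """Sequential time-stepping: forecast, compare with observation, update, repeat.

    Assumes a persistence-like model whose error variance simply grows by
    model_var at every step (an illustrative simplification).
    """
    analyses = []
    for obs in observations:
        forecast = model_step(state)        # previous analysis starts the new forecast
        forecast_var = var + model_var
        state, var = assimilate(forecast, forecast_var, obs, obs_var)
        analyses.append(state)
    return analyses


# Toy example: persistence model, three observations of a quantity near 2.0.
analyses = run_cycle(0.0, 1.0, [2.0, 2.0, 2.0], obs_var=1.0,
                     model_step=lambda x: x, model_var=0.0)
print(analyses)
```

Each analysis moves closer to the observed value as the observations accumulate, while the weight given to each new observation falls as the state estimate becomes more certain.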

Monitoring and Detection
Effective monitoring improves the management of disasters. One of the major sources of big data for disaster monitoring and detection is remote sensing data, which are available in different spatial and temporal resolutions with an adequate level of accuracy. Satellite remote sensing has long been used for timely and near-real-time disaster detection [105]. Disasters such as fires or floods can be managed effectively with multitemporal remotely sensed imagery, which is imagery captured over the same location at several points in time, usually days apart. Such imagery can be used to monitor how the event is spreading and helps decision makers develop and implement mitigation strategies [104,106,107]. The major goal of the early monitoring phase is to define a boundary that delimits the affected area, so that preliminary information on the event can be generated. Large-coverage remote sensing data with a low spatial resolution (>30 m) are appropriate for this phase, whereas high- and very-high-resolution data could be a waste of resources. Automatic extraction of data on the disaster-impacted area makes the required information available quickly.
The second major source of big data for disaster monitoring and detection is user-generated data available on various platforms, including social media websites and VGI. According to Arribas-Bel [110], each user is considered a sensor that contributes data to fill gaps in the availability of authentic data about disasters [126,127]. Earle et al. [108] investigated the capability of using Twitter for rapid earthquake detection. Their investigation was based on the observation that people start posting messages on Twitter the moment they feel the tremors. The peaks of the earthquake-related tweet-frequency time series were found to correlate with the times at which the events started. Cervone et al. [127] analyzed flood management efforts during the Boulder flood disaster in 2013. Real-time Twitter data were monitored to identify hotspots and disaster-related keywords, which were used to prioritize the collection of remote-sensing images.
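A peak in a tweet-frequency time series can be flagged with a simple ratio test. The sketch below uses a short-term-average over long-term-average (STA/LTA) ratio, a trigger that is common in seismology; it is an assumed stand-in here and should not be read as the exact procedure of the cited study. The window lengths, threshold, and floor value are hypothetical.

```python
def sta_lta_triggers(counts, short_window=2, long_window=10,
                     threshold=3.0, floor=1.0):
    """Return indices where recent tweet counts jump above the background level.

    counts: earthquake-related tweets per time bin (e.g. per minute).
    The long window ends where the short window begins, so a sudden burst
    does not inflate its own background estimate.
    """
    triggers = []
    for end in range(long_window + short_window, len(counts) + 1):
        short = counts[end - short_window:end]
        background = counts[end - short_window - long_window:end - short_window]
        sta = sum(short) / short_window
        lta = max(sum(background) / long_window, floor)  # floor avoids division by ~0
        if sta / lta >= threshold:
            triggers.append(end - 1)   # index of the latest bin in the short window
    return triggers


# Twelve quiet bins followed by a burst of earthquake-related tweets.
series = [1] * 12 + [10, 12]
print(sta_lta_triggers(series))
```

With these toy counts, the trigger fires in the bins where the burst begins, which mirrors the idea that tweet-frequency peaks line up with event onset.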
The integration of traditional remote sensing data and newly emerged big social-sensing data has improved understanding of the location, timing, causes, and impacts of natural disasters. This greatly enhances the speed and effectiveness of the disaster response. Musaev et al. [111] developed a landslide detection system that integrates multiple physical sensors (USGS seismic network, NASA TRMM rainfall network) and social media (Twitter, YouTube, and Instagram) to detect multi-hazards of varied origins and compositions. The authors collected social media data that were systematically preprocessed through keyword filtering, stop-word and stop-phrase filtering, geotag filtering, classification filtering, and blacklist filtering. The Earth's surface was represented as fixed-size grid cells, and the geotagged social media posts were grouped by their coordinates to determine areas where landslides might have occurred. Data from the physical and social sensors were integrated on the fixed grid using a Bayesian-based relevance ranking strategy that favored landslide events detected by both social and physical sensors. Jongman et al. [116] integrated near-real-time satellite data and social media information for better flood monitoring. They found that satellite imagery is better suited for monitoring large floods, while Twitter can be used to monitor floods of any size, as long as people share their observations and discussions on social media. Crowdsourced hazard detection techniques can enlist citizens to provide information by posting pictures of the disaster's activity.
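The grid-based grouping step described above can be sketched as follows. This covers only the binning of geotagged posts into fixed-size cells; the Bayesian relevance ranking and the fusion with physical sensors are not reproduced. The cell size, the minimum post count, and the function names are assumptions for illustration.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.5):
    """Map a coordinate to the index of its fixed-size grid cell."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def count_posts_per_cell(posts, cell_deg=0.5):
    """Count geotagged posts falling in each grid cell.

    posts: iterable of (latitude, longitude) pairs for posts that already
    passed the keyword and classification filters.
    """
    return Counter(grid_cell(lat, lon, cell_deg) for lat, lon in posts)


def candidate_cells(posts, min_posts=3, cell_deg=0.5):
    """Cells with enough posts to be examined as possible event locations."""
    counts = count_posts_per_cell(posts, cell_deg)
    return sorted(cell for cell, n in counts.items() if n >= min_posts)


posts = [(10.1, 20.1), (10.2, 20.3), (10.4, 20.4), (-0.1, 5.0)]
print(candidate_cells(posts))
```

Using `math.floor` rather than integer truncation keeps cells consistent across the equator and the prime meridian, where coordinates change sign.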

Early Warning
Sensor data are usually the major data sources for early warning system development. Mandl et al. [109] demonstrated the "Namibia Flood SensorWeb" early warning system based on integrated satellite and ground sensor data for flood situational awareness and early warning. Different types of data are integrated in the system, including river gauge, TRMM rainfall, MODIS flood extent, Radarsat SAR-derived flood extent, river flow estimated from the Terra satellite, and flood model data. The system provides a sensor planning service (query sensors and initiate data acquisition), a web notification service (provide alerts for events such as data product availability), a sensor observation service (provide access to raw data), and a web processing service (provide data processing functions to create high-level data products). Poslad et al. [112] proposed a semantic early warning system based on IoT, in which metadata are used to enrich sensor data acquisition, and ontology models describing multilevel knowledge bases are used to support decision making and workflow orchestration.
Social media can also support early warning. Carley et al. [76] examined the use of Twitter in Indonesia for early warning and disaster planning. The coverage, spatiotemporal patterns, and identification of opinion leaders were assessed to determine the suitability of Twitter for early warning. The paper concluded that with careful collection, assessment, and coordination with official disaster Twitter sites, Twitter is capable of supporting early warning, and that local Twitter opinion leaders have a critical role in the early warning process.

Damage Assessment
The best-known data source for damage assessment is remote sensing imagery. Different types and resolutions of remote sensing data are in demand depending on the level of detail required. Low-resolution but large-scale remote sensing data provide a quick initial assessment of the impacted area when in situ observations are not yet available. The initial assessment also guides disaster responders in prioritizing the areas to be inspected with higher resolution data. Detailed damage assessment, including buildings and roads, requires higher resolution and three-dimensional (3D) data to provide accurate information about the intensity of the damage. Using Formosat-2 and Satellite pour l'Observation de la Terre (SPOT)-5 satellite images, Liou et al. [19] were able to identify the structural deformation of land areas, changes in the directions of rivers, the creation of new lakes, and the water levels of rivers and lakes in the area affected by the 2008 Wenchuan earthquake. The method used by Liou et al. [19] for analyzing such satellite imagery was mainly visual interpretation, even though methods of automatic extraction were also developed [20]. Very-high-resolution satellite imagery (with a spatial resolution finer than 1 m) is well suited for detecting disruption to transportation networks and for identifying open spaces that can be used for locating shelters [113,128].
UAV-based aerial imagery is an emerging means of data collection for disaster management and provides a better resolution than that available from manned aircraft. UAVs provide more flexible data acquisition, which improves the quality of the point clouds that can be derived for this purpose. UAV imagery and videos allow the disaster-impacted area to be analyzed at a high spatiotemporal resolution. Using UAV data with a resolution better than 1 cm, Fernandez Galarreta et al. [114] identified a number of damage features for each building, focusing on aspects such as total collapse, collapsed roofs, rubble piles, and inclined façades. The analysis was first conducted through visual inspection by experts, who used a 3D point cloud (generated from the raw UAV imagery) to look for damage features. Thereafter, a more detailed object-based damage feature extraction from imagery was carried out for the façades and roofs of the buildings that did not show any damage features in the previous step. Damage information acquired from the 3D point cloud and the imagery was aggregated to identify the extent of major damage.
Crowdsourced platforms, where residents of disaster-impacted places can upload images and descriptions, have been found to be helpful for damage assessment. During the 2010 Haiti Earthquake, the Global Earth Observation Catastrophe Assessment Network (GEO-CAN) was established by ImageCat Inc. The firm trained over 600 image professionals to identify collapsed and heavily damaged structures over an affected area of 1000 square kilometers [115]. Similarly, for Super Typhoon Haiyan, which swept the Philippines in 2013, volunteers conducted a remote damage assessment based on satellite imagery on an OpenStreetMap-based application, led by the Humanitarian OpenStreetMap Team [129]. Social media data were also collected for Typhoon Haiyan from the Chinese microblog "Weibo" to estimate the extent of damage in the typhoon-impacted area in China. Twitter was used as a big data tool during and after Hurricane Sandy, which struck the East Coast of the United States in 2012 [130]. FEMA organized groups of public and private agencies to analyze the tweets and plot the locations where aid and resources were most needed.

Post-Disaster Coordination and Response
Post-disaster response, such as search and rescue operations, must be conducted quickly and efficiently. A major problem in this regard is the lack of communication and situational awareness during disasters, which forces first responder teams to improvise and thus lessens the efficiency of the rescue mission. Disasters are mostly characterized by limited resources and dynamic changes in the environment, and under such circumstances it is difficult to use the limited resources effectively to provide the best communication services. Big data analytics offers possible solutions for understanding the situation in disaster areas, because it has the potential to use the limited resources optimally.
Aerial ad-hoc networks have the advantage that they can be deployed in critical situations where terrestrial mobile devices might not operate. However, their implementation is challenging in terms of mobility management and coverage lifetime. Di Felice et al. [117] investigated the use of low-altitude aerial mesh networks created by Small Unmanned Aerial Vehicles (SUAVs). The objective was to re-establish connectivity among isolated end-user devices located on the ground. They proposed a distributed mobility algorithm based on the virtual spring model, through which the SUAV-based mesh nodes can self-organize into a mesh structure that guarantees Quality of Service over the aerial links and connects the maximum number of devices. They also proposed a distributed charging scheduling scheme through which constant coverage of sensing devices can be guaranteed over the disaster scenario. Mosterman et al. [118] proposed an automated disaster response system that integrates various kinds of autonomous vehicles, including ground vehicles for setting up local stations, fixed-wing aircraft for assessing infrastructure damage, and rotorcraft for delivering disaster supplies. These vehicles are coordinated and controlled by the proposed cyberspace operation system, which integrates information and assists disaster response automation. Lu et al. [119] designed and implemented a system called TeamPhone, which equips smartphones with communication capabilities during disaster recovery and consists of a messaging system and a self-rescue system. The messaging system integrates cellular networking, ad-hoc networking, and opportunistic seamless networking and enables effective communication between rescue workers. The self-rescue system efficiently groups the smartphones of trapped survivors and sends out disaster messages to facilitate rescue operations.
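The general idea behind a virtual spring model can be sketched in a few lines: each aerial link behaves like a spring that pulls two nodes together when they drift too far apart and pushes them apart when they crowd each other. The code below is a generic Hooke's-law illustration in two dimensions and is not the algorithm of the cited work, which also accounts for link quality and ground devices. The rest length, stiffness, and function names are hypothetical.

```python
import math


def spring_force(node, neighbors, rest_length=100.0, stiffness=0.1):
    """Sum of virtual spring forces acting on one node from its linked neighbors."""
    fx = fy = 0.0
    for nx, ny in neighbors:
        dx, dy = nx - node[0], ny - node[1]
        dist = math.hypot(dx, dy)
        if dist == 0.0:
            continue                                  # direction undefined, skip
        magnitude = stiffness * (dist - rest_length)  # > 0 attracts, < 0 repels
        fx += magnitude * dx / dist
        fy += magnitude * dy / dist
    return fx, fy


def step(positions, links, rest_length=100.0, stiffness=0.1):
    """Move every node once along its net force, using only local neighbor data."""
    new_positions = []
    for i, pos in enumerate(positions):
        neighbors = [positions[j] for j in links[i]]
        fx, fy = spring_force(pos, neighbors, rest_length, stiffness)
        new_positions.append((pos[0] + fx, pos[1] + fy))
    return new_positions


# Two linked nodes that start 200 m apart drift toward the 100 m rest length.
positions = [(0.0, 0.0), (200.0, 0.0)]
links = {0: [1], 1: [0]}
for _ in range(20):
    positions = step(positions, links)
print(positions)
```

Because each node needs only the positions of its own neighbors, the update can run independently on every vehicle, which is what makes this kind of scheme distributed.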
The different sources of big data used by entities involved in assessing disaster damage need to be effectively coordinated in order to confirm the damage, identify destruction that may have gone undetected, and prevent the duplication of efforts by disaster responders. Cervone et al. [127] found that the integration of user-generated data (crowdsourcing and social media) and physical sensing data (satellite and aerial imagery) can enhance the accuracy of the former and increase the temporal resolution of the latter, thus providing a more comprehensive, compact dataset. For example, after the Haiti earthquake, coordinating systems such as the UN inter-agency OneResponse website, the Sahana Free and Open Source Disaster Management System, and the crowdsourcing platform Ushahidi were established to coordinate a massive amount of information [120]. These systems allow the public to provide information about missing persons or to track missing people's locations. Such comprehensive data are valuable in designing and executing response operations, which means they should be made available as early as possible so as to limit the death toll and additional loss of property [19].

Recovery
The post-disaster recovery phase is the period after initial relief has been provided and is characterized by efforts directed at bringing normalcy back to people's lives and improving the overall circumstances. The major data source for post-disaster recovery monitoring is remote sensing data, including satellite and aerial imagery. Based on change detection from multi-temporal remote sensing data, needs for reconstruction around damaged areas can be detected and monitored. The change detection methodology is similar to that used in damage assessment. For example, de Alwis Pitts and So [121] utilized an object-based change detection mechanism to identify the changes before and after two earthquake events: the Van earthquake in eastern Turkey in 2011 and the Kashmir earthquake in northwest Pakistan in 2005. They used high-resolution satellite imagery, including WorldView-2 (0.46 m for the panchromatic band and 1.85 m for the multispectral band), GeoEye-1 (0.41 m for the panchromatic band and 1.65 m for the multispectral band), and QuickBird-2 (0.61 m for the panchromatic band and 2.44 m for the multispectral band). Pre- and post-disaster imagery was acquired, and road information was obtained from OpenStreetMap. The changes in edges, texture, and gradient of primary roads were calculated, and changes in open green spaces were also detected. The quantified information contributed to the observation of disaster recovery over time. Contreras et al. [122] reported the progress of recovery efforts after the earthquake at L'Aquila, Italy in 2009. The recovery evaluation was based on remote sensing and ground observations. They used QuickBird imagery to detect the progress of recovery for damaged buildings. GPS and other ground observations were collected progressively (every two years). The evaluation revealed the socio-economic progress made after the earthquake.
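The core of change detection on multi-temporal imagery is a comparison of co-registered scenes from two dates. The sketch below is a deliberately simple pixel-based differencing example; the cited work uses object-based analysis of edges, texture, and gradient, which is not reproduced here. The images are plain nested lists of grayscale values, and the threshold and function names are hypothetical.

```python
def change_mask(before, after, threshold=30):
    """Mark pixels whose brightness changed by more than `threshold`.

    before, after: co-registered grayscale images of equal size, given as
    lists of rows with values in 0-255.
    """
    return [
        [abs(a - b) > threshold for b, a in zip(row_before, row_after)]
        for row_before, row_after in zip(before, after)
    ]


def changed_fraction(before, after, threshold=30):
    """Share of pixels flagged as changed between the two dates."""
    mask = change_mask(before, after, threshold)
    total = sum(len(row) for row in mask)
    changed = sum(sum(row) for row in mask)
    return changed / total if total else 0.0


# Toy 2 x 2 scenes: one pixel brightens strongly between the two dates.
scene_2011 = [[100, 100], [100, 100]]
scene_2013 = [[100, 180], [100, 105]]
print(changed_fraction(scene_2011, scene_2013))
```

Tracking this fraction over a sequence of acquisition dates gives a rough indicator of how much of an area is still changing, for example through debris removal or reconstruction.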
The literature has widely reported outcomes derived from user-generated data obtained from social media and VGI platforms during post-disaster recovery. A major challenge for disaster management in utilizing such data is its limited capability to handle big data. Afzalan et al. [124] examined how active and influential members of Facebook groups aided disaster recovery after Hurricane Sandy. Network analysis was carried out to find influential members, and a web survey was conducted to learn about their background and volunteer activity inside and outside Facebook. Disaster recovery organizations can therefore approach these influential people and collaborate with them in conducting recovery procedures. Yan et al. [125] utilized geotagged Flickr photos to monitor and assess post-disaster tourism recovery after the Philippines earthquake and Typhoon Haiyan in 2013. The geotagged Flickr photos were analyzed through quality enhancement (of both locational accuracy and thematic accuracy) and through quantitative and qualitative investigation of the available visual content. Results showed spatiotemporal patterns of the recovery status and trends.

Two Emerging Topics-Evolutionary Technologies
Figure 4 illustrates the increasing popularity of "machine learning" and a significant popularity of "cyberinfrastructure" applied in the research of "big data" in "disaster management". Here, we review the two major topics, machine learning and big data cyberinfrastructure, as evolutionary technologies which facilitate disaster management in various ways.

Machine Learning
A great deal of uncertainty is created when different sources of big data are integrated for the purposes of disaster awareness and response. Manual interpretation and analysis of the integrated data are no longer adequate, which is why sophisticated automatic analysis methods are required to make the process efficient and effective. Machine learning was introduced in the field of disaster management two decades ago and has since become one of the most effective methods for eliminating unrelated data and speeding up analysis in disaster situations, which supports rapid prediction and the identification of optimal response strategies.
Text classification of social media data identifies disaster response-related tweets and facilitates the rapid identification of disaster relief content. Imran et al. [80] developed the Artificial Intelligence for Disaster Response (AIDR) system to classify messages that people post during disasters into a set of user-defined categories of information (e.g., "needs", "damage", etc.). For this purpose, the system continuously ingests data from Twitter, processes it using machine learning classification techniques, and leverages human participation (through crowdsourcing) in real time. AIDR has been successfully tested to classify informative vs. non-informative tweets posted during the 2013 Pakistan Earthquake. Based on AIDR, Bejiga et al. [131] proposed a hybrid crowdsourcing and real-time machine learning solution to rapidly process large volumes of aerial data for disaster response in a time-sensitive manner. Crowdsourcing was used to annotate features of interest in aerial images (such as damaged shelters and roads blocked by debris). These human-annotated features were then used to train a supervised machine learning system to recognize such features in new, unseen images.
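The supervised classification step described above, in which human-labeled messages train a model that then labels new messages, can be illustrated with a small multinomial Naive Bayes classifier. This is a generic sketch and not the classifier used inside AIDR, whose internals are not described here. The class name, the training messages, and the category labels are made up for illustration.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one smoothing over whitespace tokens."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.class_counts = Counter()            # label -> number of messages
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        return text.lower().split()

    def train(self, labeled_messages):
        """labeled_messages: iterable of (text, label), e.g. crowd-annotated tweets."""
        for text, label in labeled_messages:
            self.class_counts[label] += 1
            for word in self.tokenize(text):
                self.word_counts[label][word] += 1
                self.vocabulary.add(word)

    def predict(self, text):
        """Return the label with the highest log posterior score."""
        total_messages = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, n_messages in self.class_counts.items():
            score = math.log(n_messages / total_messages)
            n_words = sum(self.word_counts[label].values())
            for word in self.tokenize(text):
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (n_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical human-labeled examples in two user-defined categories.
classifier = NaiveBayesTextClassifier()
classifier.train([
    ("need water and food", "needs"),
    ("we need medical help", "needs"),
    ("bridge collapsed road blocked", "damage"),
    ("houses damaged by flood", "damage"),
])
print(classifier.predict("need food urgently"))
print(classifier.predict("road blocked"))
```

In a human-in-the-loop setting, newly annotated messages would be passed to `train` as they arrive, so the model keeps adapting to the vocabulary of the unfolding event.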
Training machines to detect damage reduces the need for human interaction and improves the performance of dynamic decision making. During the damage assessment stage, visual interpretation of satellite/aerial imagery and videos often proves to be time consuming, inaccurate, and costly. In addition, ancillary data such as UAV products, LiDAR, or GIS databases are mostly unavailable in regions where the death toll is highest [123]. Machine learning algorithms actively adapt to and learn the problem without relying on statistical assumptions about the data distribution [132]. They generally achieve better accuracy than traditional classification and change detection methods. They work with nonlinear datasets [133], allow learning with limited training data [134], and successfully solve classification problems in which classes are difficult to distinguish. For example, Cooner et al. [135] used high spatial resolution imagery acquired by WorldView-1 and QuickBird-2 to investigate the effectiveness of Artificial Neural Networks and Random Forests in detecting damage to buildings caused by the 2010 Haiti earthquake.
More recently, deep learning has emerged as one of the breakthroughs in machine learning that can be applied to disaster management. For example, Cha et al. [136] proposed a convolutional neural network approach to detect concrete cracks without the need to calculate defect features, which are normally affected by noise in the data. The proposed approach handled images with varied characteristics, including strong light spots, shadows, and very thin cracks, and demonstrated robust performance in comparison to traditional edge detection methods (Canny and Sobel), which fail to provide meaningful crack information. Pouyanfar and Chen [137] proposed an ensemble deep learning framework to extract information from YouTube disaster videos in the context of flood, damage, fire, mud-rock, tornado, and lightning. Videos are first preprocessed into shots by boundary detection and key-frame selection, and then features are extracted from each key frame through deep learning reference models. Well-known classifiers, such as Decision Tree and Support Vector Machine, were applied to the features for classification.
In addition to real-time damage assessment and disaster monitoring, predictive damage assessment tools can now leverage machine learning and the analysis of historical and big data to improve damage forecasting models. Asim et al. [138] evaluated four machine learning techniques, including the pattern recognition neural network, recurrent neural network, random forest, and linear programming boost ensemble classifier, to model relationships between calculated seismic parameters and future earthquake occurrences. In addition, by automating such processes, it becomes possible to respond to evolving disaster scenarios.

Big Data Cyberinfrastructure
In the context of data recovery and security, Chang [139] proposed a private cloud approach that allows data to be restored to multiple sites with multiple methods, enabling an organization to recover close to 100 percent of its data. To achieve this, data centers must adopt multi-purpose approaches to ensure that all the data in the Big Data system can be recovered and retrieved without a prolonged downtime or a complex recovery process.
Disaster management becomes more efficient if data is acquired from different sources at a higher spatial and temporal resolution. However, challenges emerge because of the constantly increasing volume of image and video data. Emerging technological innovations, including social media, location-based systems, radio frequency identification, and big data analytics, are considered powerful tools that may help during the disaster management cycle. Processing and analyzing heterogeneous big disaster data requires efficient data collection, aggregation, information extraction, visualization, and distribution. The growth of data and the need for efficient distribution make the development and operation of cyberinfrastructure very demanding. Wan et al. [140] developed a cloud-based flood cyber-infrastructure that collects, organizes, visualizes, and manages several global flood databases to facilitate location-based visualization and statistical analysis for authorities and the public. This system allows crowdsourced data to be collected from smart phones or the internet to report new flood events.
Belaud et al. [141] proposed a cloud computing platform for scientific simulation that facilitates efficient natural hazards management through the analysis of big data and its exchange between distant locations. This platform provides pre- and post-processing visualization services, scalable compression and transmission methods for large 3D scientific data sets, collaborative virtual environments, and 3D visualization. Puthal et al. [142] presented a framework that supports disaster event detection and the generation of alerts by analyzing the data stream, which includes efficient data collection, data aggregation, and efficient dissemination. One of the goals of such a framework is to support an end-to-end security architecture that protects the data stream from unauthorized manipulation, as well as from the leakage of sensitive information. Hua et al. [143] proposed a near-real-time and cost-efficient scheme, called SmartEye, for the cloud-assisted disaster environment, which provides efficient image sharing for disaster detection and scene recognition by de-duplicating and aggregating similar features into one single flow. Wilson et al. [88] deployed a computational architecture and analytical capacity within nine days of the Nepal earthquake of 25 April 2015 to provide spatiotemporally detailed estimates of population displacements from call detail records, based on the movements of 12 million de-identified mobile phone users. The analysis revealed the detailed mobility patterns of people after the earthquake and the patterns of their return to affected areas.
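The idea of estimating population displacement from call detail records can be illustrated with a simplified sketch. It assumes that each record has already been reduced to a (de-identified user, timestamp, region) triple, and that a user's location in a period is the region where most of their records fall. Both the record layout and the "most frequent region" rule are assumptions made for illustration, not the method of Wilson et al. [88], and the function name is hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, Iterable, List, Tuple

Record = Tuple[Hashable, float, str]  # (de-identified user id, timestamp, region)


def displacement_flows(records: Iterable[Record],
                       event_time: float) -> Counter:
    """Count users whose most frequent region changed after the event.

    Returns a Counter mapping (origin_region, destination_region) to the
    number of users who moved. Users with no records on one side of the
    event are skipped, since their movement cannot be inferred.
    """
    before: Dict[Hashable, List[str]] = defaultdict(list)
    after: Dict[Hashable, List[str]] = defaultdict(list)
    for user, timestamp, region in records:
        side = before if timestamp < event_time else after
        side[user].append(region)

    flows: Counter = Counter()
    for user, pre_regions in before.items():
        post_regions = after.get(user)
        if not post_regions:
            continue
        origin = Counter(pre_regions).most_common(1)[0][0]
        destination = Counter(post_regions).most_common(1)[0][0]
        if origin != destination:
            flows[(origin, destination)] += 1
    return flows
```

Running the same computation over successive post-event time windows would give a rough picture of how flows evolve and when users return to their origin regions.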
Bartoli et al. [144] proposed an efficient architecture for a smart public safety platform that performs a smart and functional integration of heterogeneous components, such as a smart data gathering and analysis system, a novel professional communication system, wireless sensor networks, and social networks. It integrates advanced infrastructures and analysis methods to coordinate the information flow between first responders, public authorities, and citizens. It allows responders and authorities to access the information available on the platform by using an advanced broadband communication system that can receive multimedia data for more effective and efficient operations in the field. Trono et al. [145] developed a Disruption Tolerant Network (DTN) distributed computing system for disaster map data generation and sharing. The system distributes computing tasks across multiple computing resources to minimize individual computation loads. First responders and volunteers act as mobile sensing nodes, and their GPS traces are collected for real-time disaster relief efforts.
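Distributing computing tasks so that no single node carries an excessive load can be sketched with a simple greedy heuristic. The sketch below assigns the most expensive remaining task to the currently least-loaded node. The source does not state which scheduling rule Trono et al. [145] use, so this heuristic, the cost estimates, and the function name are assumptions made purely for illustration.

```python
import heapq
from typing import Dict, List, Tuple


def assign_tasks(task_costs: Dict[str, float],
                 node_ids: List[str]) -> Tuple[Dict[str, List[str]], Dict[str, float]]:
    """Greedy load balancing: largest task first, to the least-loaded node.

    task_costs maps a task name to its estimated computation cost.
    Returns (assignment, loads), where assignment maps each node to its
    tasks and loads maps each node to its total estimated cost.
    """
    # Min-heap of (current load, node id); the least-loaded node is on top.
    heap = [(0.0, node) for node in node_ids]
    heapq.heapify(heap)
    assignment: Dict[str, List[str]] = {node: [] for node in node_ids}

    # Sort by decreasing cost; ties are broken by task name for determinism.
    ordered = sorted(task_costs.items(), key=lambda item: (-item[1], item[0]))
    for task, cost in ordered:
        load, node = heapq.heappop(heap)
        assignment[node].append(task)
        heapq.heappush(heap, (load + cost, node))

    loads = {node: sum(task_costs[t] for t in tasks)
             for node, tasks in assignment.items()}
    return assignment, loads
```

This heuristic does not guarantee an optimal split, but it is cheap to compute, which matters when nodes have limited resources and intermittent connectivity.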

Big Data Collection
In disaster management scenarios, a rapidly generated, large volume of data (e.g., 452,000 tweets/min, extreme weather simulation TB/day) needs to be analyzed in real time for immediate action. Archives of historical data need to be shared with researchers online, and services for the evaluation and validation of analysis methods or models need to be standardized and made publicly available.
Disaster management requires a large variety of heterogeneous data from different sources, which can fill in the gaps between one another and provide valuable information for all phases of the disaster management cycle. Sensors might generate different types of data, including time series, semi-structured data, and textual data, and these data might include noise and misinformation. Integrating multiple sources of data might improve data quality and completeness, but each source needs to be validated individually before integration. Data integration in the context of disaster management can benefit from the semantics or properties associated with the data itself.
Noise and misinformation in big data are almost inevitable, and much of it is unintentional, especially in social media and crowdsourced data. In addition, data privacy and accuracy remain among the main challenges associated with big data collection, and protocols and analytical methods for dealing with these issues are crucially required during disaster management. Integrating multiple sources of data can help eliminate such noise and misinformation. Streaming data, such as sensor outputs and crowdsourced data, needs to be complemented with anomaly detection to identify incorrect data caused by system failure or flawed data collection methods. Machine learning can help automate the data integration and filtering process, and ultimately increase data quality.
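Anomaly detection on a sensor stream can be sketched with a rolling median and median absolute deviation (MAD), a robust alternative to a mean and standard deviation test. This is one common technique chosen for illustration and is not prescribed by the reviewed literature. The window size, the threshold, and the `min_scale` floor (a stand-in for sensor resolution) are assumed values, and the function name is hypothetical.

```python
from collections import deque
from statistics import median
from typing import Iterable, List


def flag_anomalies(stream: Iterable[float],
                   window: int = 5,
                   threshold: float = 3.5,
                   min_scale: float = 0.05) -> List[bool]:
    """Flag readings that deviate strongly from recent accepted readings.

    A reading is flagged when its distance from the rolling median,
    divided by the scaled MAD, exceeds `threshold`. Flagged readings are
    not added to the window, so one faulty value does not distort the
    baseline. The first `window` readings are never flagged.
    """
    recent: deque = deque(maxlen=window)
    flags: List[bool] = []
    for value in stream:
        is_anomaly = False
        if len(recent) == window:
            center = median(recent)
            mad = median(abs(x - center) for x in recent)
            # 1.4826 makes the MAD comparable to a standard deviation
            # for normally distributed data; min_scale avoids a zero scale.
            scale = max(1.4826 * mad, min_scale)
            is_anomaly = abs(value - center) / scale > threshold
        flags.append(is_anomaly)
        if not is_anomaly:
            recent.append(value)
    return flags
```

Because flagged readings are excluded from the baseline, a sustained genuine change (for example, a rapidly rising water level) would keep being flagged. A production system would need a rule for accepting such level shifts, ideally by cross-checking against other data sources.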

Big Data Analytics
With the synthesis of multi-platform, multi-scale, and multi-discipline data, predictive modeling of natural disasters should become more efficient. Research and activities that make use of the synthesized information and predictive analysis results are expected to enhance our ability to respond and plan adaptively. It has been found that crowdsourced data, especially those provided by disaster-affected people, have significant value during the disaster awareness phase. However, analytical methods are still needed to reliably and accurately integrate these crowdsourced data with physical sensing data (e.g., satellite, UAV) and authoritative data (e.g., terrain data, census data). Only then can the disaster be effectively characterized in terms of situational awareness, spatiotemporal damage patterns, and community resilience. Consequently, decision makers can use the analytical results to design rescue and response efforts, in terms of both space and time, for the affected populations and communities.

Cyberinfrastructures
There is a crucial need to design and develop cyberinfrastructures in which big data is effectively integrated into the disaster sensing, analysis, and response phases to support first responders and disaster management agencies in real-time decision making. Such cyberinfrastructure capabilities provide decision makers and responders from different organizations and disciplines with shared knowledge and a communication platform that allows the disaster response process to be conducted effectively. Research efforts and related activities are still required to look into the challenges emanating from big sensing data, particularly in the context of the growing data volume of streaming videos, including efficient data management, fast data transfer, and intuitive data visualization.

Conclusions
In analyzing the recent achievements associated with leveraging Big Data for disaster management, this paper has presented the findings of several researchers on varied scientific and technological perspectives that have a bearing on the efficacy of big data in facilitating disaster management. It has become apparent that in the present age of information technology, a major objective of scientists is to analyze the varied aspects of big data and to find ways of making the best of the available technologies in storing the available information in well-integrated structures and using it for the welfare of human societies, particularly in the context of using big data to deal effectively with natural disasters. The paper has analyzed major big data sources that are valuable in disaster management. A detailed analysis has been conducted to highlight the significance of different big data sources in various disaster management phases. The main challenges pertain to dealing effectively with data collection and management technologies, developing efficient systems for mitigating the adversities associated with natural disasters, and managing disasters in ways that result in minimum losses to human lives and property. Other challenges pertain to developing algorithms and systems that can be used in resolving operational issues and attaining greater accuracy in predicting disasters. The paper has highlighted the need for further research on big data applications that enhance the efficiency of the public sector in developing technology to mitigate the adverse effects of natural disasters. Overall, further research efforts are needed to look into the challenges emerging from Big Sensing Data, particularly in the context of the growing data volume of streaming videos, including efficient data management, fast data transfer, and intuitive data visualization.

Figure 1. Distribution of reviewed articles by year of publication.

Figure 2. Distribution of reviewed articles by major data sources and year of publication.

Figure 4. Distribution of the reviewed articles with the subtopics "machine learning" and "cyberinfrastructure", respectively, over recent years.

Table 1. Mapping disaster management phases with major data sources and application fields.
