Post-Disaster Building Database Updating Using Automated Deep Learning: An Integration of Pre-Disaster OpenStreetMap and Multi-Temporal Satellite Data

: First responders and recovery planners need accurate and quickly derived information about the status of buildings as well as newly built ones to both help victims and to make decisions for reconstruction processes after a disaster. Deep learning and, in particular, convolutional neural network (CNN)-based approaches have recently become state-of-the-art methods to extract information from remote sensing images, in particular for image-based structural damage assessment. However, they are predominantly based on manually extracted training samples. In the present study, we use pre-disaster OpenStreetMap building data to automatically generate training samples to train the proposed deep learning approach after the co-registration of the map and the satellite images. The proposed deep learning framework is based on the U-net design with residual connections, which has been shown to be an e ﬀ ective method to increase the e ﬃ ciency of CNN-based models. The ResUnet is followed by a Conditional Random Field (CRF) implementation to further reﬁne the results. Experimental analysis was carried out on selected very high resolution (VHR) satellite images representing various scenarios after the 2013 Super Typhoon Haiyan in both the damage and the recovery phases in Tacloban, the Philippines. The results show the robustness of the proposed ResUnet-CRF framework in updating the building map after a disaster for both damage and recovery situations by producing an overall F 1 -score of 84.2%.


Introduction
Post-disaster map updating is one of the essential tasks to support officials/governments to make decisions, policies, and plans for both the response phase to conduct emergency actions and the recovery phase to return to normalcy after the event-even to build back better as per the Sendai Framework [1].Buildings constitute an essential land cover class in the affected area.Consequently, updating the building database is vital to provide accurate information related to demolition, reconstruction, and building modification that is taking place during the response and recovery phases [2].Building map updating requires new building data for detecting changes in the status of the buildings and the identification of newly built ones.Satellite remote sensing (RS) has become an essential and quick tool for acquiring suitable geospatial data given its synoptic coverage and the fact that it is readily available.In addition, the availability of free high-resolution images that are provided by platforms such as Google Earth has also been attracting researchers in the remote sensing domain to focus on image-based building detection and mapping [3,4].
Building database updating can be done based on two general frameworks: building detection/extraction from mono temporal RS data and building change detection using multi-temporal RS data [5][6][7].The second framework is the focus of this study and usually comprises two main steps: 1-extracting buildings and 2-detecting changes.Change detection approaches can be grouped based on the type of data they use: 1.
Multi-temporal RS data: The multi-temporal data are directly considered to detect changes, for example, from multi-temporal satellite images with a pixel-by-pixel or an object-/feature-based change analysis by comparing two images [8,9].2.
Multi-temporal RS and map data: In this approach, the multi-temporal RS data are classified using additional support from existing maps by providing guidance in training area selection [10] or excluding non-building pixels based on a probability analysis [11,12].Then, the maps or the classified building images are compared to detect changes in buildings in an object-oriented manner [13,14].

3.
Monocular RS and old map data: In many cases, pre-disaster high-resolution RS data of the affected region do not exist, precluding method 1 from being used.However, the old geo-databases containing building information can be used to guide the method to find changes in the building stock [15][16][17].This method is more complicated than the previous one because it contains a level of generalization and abstraction [18,19], and existing databases may not accurately reflect the immediate pre-disaster situation.However, the method can provide valuable information about relevant feature classes [20].4.
Height-related data: Approaches that use height data such as Digital Surface Models (DSMs), including height information obtained through Light Detection And Ranging (LiDAR) and Unmanned Aerial Vehicle (UAV) data.Height-related data from DSMs and LiDAR data are generally utilized as changed or non-changed features to detect building changes [21][22][23][24][25].
In this paper, we propose a framework to provide automatic updating of the building database from very high resolution (VHR) satellite images and outdated map data.OpenStreetMap (OSM) data were selected to be used as the reference building layer due to their free availability.OSM provides global coverage of crowdsourced geoinformation and has become the premier Volunteered Geographic Information (VGI)-derived cartographic database, though with spatially varying data quality.OSM has been proven to be even more accurate than proprietary data in some areas (e.g., generally in Europe) [26], while not offering the same accuracy or completeness in many more remote parts of the world [27,28].Since a wide range of volunteers, including highly-experienced and amateurs, contributes to the OSM data collection, certain limitations apply when dealing with OSM data [29].OSM building datasets contain a number of errors: a) omission errors, which indicate actual buildings that are not mapped in OSM, b) spatial mismatch/shift of the building footprints compared to the satellite images, c) geometric errors, which indicate that the size and/or shape of the actual buildings do not match with the mapped ones, and d) thematic information/label errors, which indicate that the information regarding the types/use of the buildings does not match the actual building use/information.In addition, e) mismatch of the building rooftop with the footprints can occur in satellite images due to the incident angle of the sensor during their acquisition.
Since the present study is dedicated to automatic building database updating from multi-temporal VHR satellite images and (potentially outdated) OSM map data, we primarily discuss the previous studies that used RS datasets guided by old map data to generate new building maps.
Fiset et al. [18] introduced the use of maps to guide a change detection algorithm to update road network maps.Afterwards, Bentabet et al. [19] proposed a new approach to refine and update road vector data using Synthetic-Aperture Radar (SAR) images.Regarding building change detection using existing map data, Knudsen and Olsen [10] developed a method that combines supervised and unsupervised comparisons and tested their method to update Denmark's map database.They used a conventional classification method to extract buildings, which produced a high amount of false positives in detecting buildings.Bouziani et al. [15] proposed a heuristic technique based on contextual information obtained from old map data.Their method was based on object-based image analysis (OBIA) classification and fuzzy logic-based rules to extract the changed buildings from VHR images.However, the inclusion of rules based on color, size, and spectral information results in the accuracy being strongly correlated with the probability of properly tuning the OBIA parameters.Building change detection was done in [30] by first segmenting DSM information that was produced from laser data.Then, aerial images and laser data were used to obtain a classification map, which was further refined through information from an old map.Results were promising and exhibited accuracy values in terms of completeness and correctness of about 85% for buildings that were larger than 60 m 2 .However, their method is highly dependent on the quality of the DSM data, which are critical for the removal of non-building features.Le Bris and Chehata [16] conducted a comparison analysis including methods that rely on images and old maps and strategies based on multi-temporal images for building map updating.They concluded that such methods were not appropriate for operational uses.Similar approaches were investigated by Malpica et al. [31] and Gharibi et al. [17], who used old map data and a LiDAR-based nDSM to guide a Support Vector Machine (SVM) classification and Level Set method for building change detection.Although their methods provided good results, nDSM data were again critical for building detection.This departs from what we propose in the present paper, in which only multispectral VHR satellite images were used, therefore without considering any products that include height information.Furthermore, most of the previous studies did not consider the building change in different scenarios, particularly in a disaster situation.For example, a pre-event building may either be damaged and rebuilt or demolished, change into a different shape or type during recovery, or be rebuilt in a new place.
Concurrent with the development of map-guided building change detection methods, computer vision and RS data processing methods have evolved and based on recent advances in computer hardware systems, researchers can readily run deep neural network-based models.Deep learning and convolutional neural networks (CNN) have been investigated and have become state-of-the-art networks for many computer vision tasks.These methods have also been used for RS data processing problems such as scene classification [32,33], hyperspectral image classification [34][35][36], object detection [37,38], image retrieval [39], multi-modal data fusion [40,41], and change detection [42][43][44].However, the developed deep learning-based change detection methods aim at detecting scene-based changes rather than a specific object and need further processes to be used in RS applications.Deep learning, in particular CNN, has also been used for disaster-based applications, such as structural damage assessment [45][46][47][48][49], as well as landslide [50,51] and fire detection [52].Most of the developed methods for building damage assessments require VHR UAV images and/or 3D point clouds and aim at only assessing structural damages.However, changes in the status of the buildings months and years after a disaster provide crucial information for post-disaster recovery assessment, which is addressed in the present paper.In a recent study, Ji et al. [53] proposed a method to detect collapsed buildings after an earthquake using pre-and post-disaster satellite imagery.Their method made use of CNN-based features that were used in a random forest classifier to do the detection work.Although the method was able to extract collapsed buildings with high accuracy, the building map was manually generated, which was a subjective and time-consuming task.This problem was overcome by our fully-automated methodology based on the OSM building map.Furthermore, their method was aimed at detecting collapsed building after an earthquake, which is inherently unsuitable for detecting new buildings or changes in building size or shapes during the recovery/reconstruction phase as exploited in our current work.
In the present paper, we adapted the deep residual U-net (ResUnet) developed by [54] as the classifier, and the training area was selected automatically using the pre-disaster building OSM data after a preprocessing step that co-registered the OSM and satellite data.The network was first trained on the pre-disaster OSM data and pre-disaster satellite images and was then fine-tuned using the same building map after conducting a building-based change detection process from a post-disaster satellite image.The change detection step was done based on two textural measurements to select appropriate training areas.Two measures, i.e., Variation-Histogram of the Oriented Gradients (V-HOG) and Edge Density Index (EDI), were considered to perform the change detection.This step was essential to exclude those buildings that may have been destroyed or damaged during the disaster from the training set when retraining the network for the period just after the disaster.Furthermore, by fine-tuning the pre-trained network on the post-disaster images after conducting the change detection step, the proposed method was able to detect buildings in the post-disaster (recovery) time that were newly constructed and to extract changes in the size or shapes of the existing buildings.As the final step, Conditional Random Field (CRF) was performed to refine the boundaries and improve the classification results, which is similar to the methods that were investigated in [55,56].The proposed framework was evaluated using WorldView2 satellite images of Tacloban, the Philippines, which was hit by Typhoon Haiyan (Yolanda) in 2013.Images were acquired one month before the disaster, plus three days and then four years after the disaster.

Materials and Methods
In this paper, we propose a framework for updating the building database after a disaster through an automated ResUnet-CRF, using outdated OSM building data and multi-temporal satellite images (Figure 1).The proposed approach consists of four main steps.
same building map after conducting a building-based change detection process from a post-disaster satellite image.The change detection step was done based on two textural measurements to select appropriate training areas.Two measures, i.e., Variation-Histogram of the Oriented Gradients (V-HOG) and Edge Density Index (EDI), were considered to perform the change detection.This step was essential to exclude those buildings that may have been destroyed or damaged during the disaster from the training set when retraining the network for the period just after the disaster.Furthermore, by fine-tuning the pre-trained network on the post-disaster images after conducting the change detection step, the proposed method was able to detect buildings in the post-disaster (recovery) time that were newly constructed and to extract changes in the size or shapes of the existing buildings.As the final step, Conditional Random Field (CRF) was performed to refine the boundaries and improve the classification results, which is similar to the methods that were investigated in [55,56].The proposed framework was evaluated using WorldView2 satellite images of Tacloban, the Philippines, which was hit by Typhoon Haiyan (Yolanda) in 2013.Images were acquired one month before the disaster, plus three days and then four years after the disaster.

Materials and Methods
In this paper, we propose a framework for updating the building database after a disaster through an automated ResUnet-CRF, using outdated OSM building data and multi-temporal satellite images (Figure 1).The proposed approach consists of four main steps.

Step 1: Co-Registration of OSM Data and Satellite Images
Since the aim of this study was to use OSM building data as a mask to extract building training samples from the pre-disaster image for a CNN-based approach, we implemented simple preprocesses to create accurate training samples.Since it was observed that the shift on the building footprints was not systematic and the shift direction and amount differs substantially across the area, initially the downloaded OSM data for the pre-disaster time were separated into five sections/regions.Then, rubber sheeting was implemented in ArcGIS, which can also handle small geometric correction as well as shifting the vector maps to adjust and align the building map to the building rooftops in the pre-disaster image (Figure 2).In order to achieve good results from the application of the rubber sheeting method, five geographically well-distributed points within each region of interest were used so that the points cover at least the center and the four main directions.Furthermore, the post-disaster satellite images were co-registered/rectified according to the pre-disaster image using ArcGIS by selecting geo-rectification points.

Step 1: Co-registration of OSM Data and Satellite Images
Since the aim of this study was to use OSM building data as a mask to extract building training samples from the pre-disaster image for a CNN-based approach, we implemented simple preprocesses to create accurate training samples.Since it was observed that the shift on the building footprints was not systematic and the shift direction and amount differs substantially across the area, initially the downloaded OSM data for the pre-disaster time were separated into five sections/regions.Then, rubber sheeting was implemented in ArcGIS, which can also handle small geometric correction as well as shifting the vector maps to adjust and align the building map to the building rooftops in the pre-disaster image (Figure 2).In order to achieve good results from the application of the rubber sheeting method, five geographically well-distributed points within each region of interest were used so that the points cover at least the center and the four main directions.Furthermore, the post-disaster satellite images were co-registered/rectified according to the predisaster image using ArcGIS by selecting geo-rectification points.

Step 2: Training Patch Generation from the Pre-Disaster Image
Pre-processed data from step 1 were used to automatically generate training samples from the pre-disaster image.Although the mismatch between OSM building footprints and the actual buildings in the pre-disaster images was mostly corrected for in step 1, some matching errors remained.For example, in the case where a building near a vegetated area has a mismatch, the building mask might contain vegetation pixels.In addition, even for a correct match of building and OSM map, some non-building pixels might end up inside the training samples, e.g., a tree may partially cover a rooftop of a building.This might also occur where buildings are next to the sea/water bodies, which may lead to the inclusion of water pixels in the training samples.Hence, to overcome these issues, the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI) indices were computed for the pre-disaster image to exclude vegetated areas, trees, and water bodies from the building mask.To do so, NDVI and NDWI masks were computed based on pre-defined thresholds and those pixels falling into the masks were removed from the building

Step 2: Training Patch Generation from the Pre-Disaster Image
Pre-processed data from step 1 were used to automatically generate training samples from the pre-disaster image.Although the mismatch between OSM building footprints and the actual buildings in the pre-disaster images was mostly corrected for in step 1, some matching errors remained.For example, in the case where a building near a vegetated area has a mismatch, the building mask might contain vegetation pixels.In addition, even for a correct match of building and OSM map, some non-building pixels might end up inside the training samples, e.g., a tree may partially cover a rooftop of a building.This might also occur where buildings are next to the sea/water bodies, which may lead to the inclusion of water pixels in the training samples.Hence, to overcome these issues, the Normalized Difference Vegetation Index (NDVI) and Normalized Difference Water Index (NDWI) indices were computed for the pre-disaster image to exclude vegetated areas, trees, and water bodies from the building mask.To do so, NDVI and NDWI masks were computed based on pre-defined thresholds and those pixels falling into the masks were removed from the building training class.Training patches with a height and width of 512 pixels were extracted from the entire image.Moreover, in order to increase the training samples, more patches were generated from the same area by shifting the starting point of the generation of the patches by 100 pixels in both x and y directions.This procedure was conducted three times to obtain different image patches from the same area and then the suitable training samples were selected from those to train the network.In total, 780 image patches were selected to be used as the initial training sample set.In addition, 542 image patches were selected for each of the disaster and post-disaster images to be used for fine-tuning of the model.

Step 3: Detecting Damaged and Demolished Buildings
The OSM building map represents the pre-disaster time; however, since some buildings get damaged during the disaster and are later demolished, direct use of those maps for the training area selection from post-disaster images will lead to inaccurate results as rubble and damaged buildings are included.Hence, the damaged and destroyed buildings should be excluded from the building footprint map before using them for training area selection for post-disaster time images.Since the extraction of the buildings will be based on the advanced proposed deep learning approach, a simple yet accurate method was developed only for the identification of the intact and damaged buildings from the provided OSM building map.
Two measurements based on the Histogram of the Oriented Gradients (HOG) and the edge detection results of the satellite images, namely Variation-HOG (V-HOG) and Edge Density Index (EDI), were used to conduct the change detection between the pre-disaster and post-disaster satellite images.The change detection was performed only on the building masks of the images to distinguish the damaged/demolished and intact buildings in the post-disaster image.

Variation of HOG (V-HOG)
HOGs provide powerful features [57] for image representation, which are particularly robust for image-based object classification.They were initially developed for pedestrian identification [58], however they were then found to be robust features in different applications [58], including for remote sensing data [38,[59][60][61] and for image-based damage detection [62].
The standard approach was used to extract the HOG features, which starts by computing the gradient angles of the image and their magnitude distributions.Then, the images were split into cells of size (a × b).Gradient images were split into overlapping blocks in a manner that each block contained 50% overlap with the cells.Then, the orientation of the gradients was computed based on the defined bin size.The histogram of the oriented gradients was computed as a vector and concatenated for each block after adding the normalized magnitude of the gradients.Since damaged areas have a larger HOG distribution compared to intact buildings, we considered the V-HOG to compute the variation of the normalized magnitude of the gradients of the bins to detect damaged buildings (Figure 3e).Hence, a higher variation of the HOG descriptor (higher V-HOG value) represented damaged areas, while small V-HOG values indicated intact buildings.The V-HOG can be computed for each block or each pixel, similarly to the HOG.However, HOG features may show high variation in some cases due to color differences between pixel values in building roofs.For example, a building may contain more than one color in its rooftop, which was overcome by conducting a building-based change analysis rather than considering only the mono temporal image.This rasterized value can be used simply by defining a threshold in the change in mean of V-HOG from the pre-to the post-disaster image to distinguish damaged/demolished buildings from the intact ones.

Edge Density Index (EDI)
Edge detection results have previously been employed to refine image classification [63] and object boundary detection [64].In our study, we used the edge detection results to detect changes in building status, i.e. to differentiate among damaged, demolished, and intact buildings.Since damaged buildings are expected to contain greater variations in their pixel values inside the building footprints due to the presence of debris/rubble, they were expected to contain more edge pixels when compared to intact buildings, which had more stable color variation.Accordingly, the number of edge pixels along a building that was damaged during a disaster was higher than that of buildings in the pre-disaster situation that were not damaged (Figure 3c,f).Since the size of a building could vary from large factories to very small slum dwellings, the number of edge pixels should be considered based on the corresponding building size.Hence, Edge Density Index (EDI) was proposed, which measures the percentage of edge pixels within a building area, and if the change was higher than the defined threshold, the building was considered to be damaged/demolished.Edges were detected using the Canny edge detector [65], and its two parameters were set to extract even weak edge pixels from the images.
After a disaster and during the reconstruction process, the rooftop color of a building may change and thus, conventional techniques that perform direct change detection such as pixel valuebased subtraction methods [66,67] were not suitable for this aim.However, the two proposed indices were not sensitive to the changes in the rooftop colors of buildings from the pre-to the post-disaster scenario.In addition, since the change detection is at a building level and is followed by an advanced deep learning approach, a simple yet accurate method is required rather than more complicated methods that include contextual information [68].The buildings in each image patch were considered individually and each building was taken into account at each time.Subsequently, the intact buildings were extracted.Furthermore, NDVI and NDWI were used to remove vegetated areas and water bodies and to refine the results from the building mask in the post-disaster image.Only predisaster OSM building data were used for the damage analysis (Figure 3a) and thus, the damaged

Edge Density Index (EDI)
Edge detection results have previously been employed to refine image classification [63] and object boundary detection [64].In our study, we used the edge detection results to detect changes in building status, i.e., to differentiate among damaged, demolished, and intact buildings.Since damaged buildings are expected to contain greater variations in their pixel values inside the building footprints due to the presence of debris/rubble, they were expected to contain more edge pixels when compared to intact buildings, which had more stable color variation.Accordingly, the number of edge pixels along a building that was damaged during a disaster was higher than that of buildings in the pre-disaster situation that were not damaged (Figure 3c,f).Since the size of a building could vary from large factories to very small slum dwellings, the number of edge pixels should be considered based on the corresponding building size.Hence, Edge Density Index (EDI) was proposed, which measures the percentage of edge pixels within a building area, and if the change was higher than the defined threshold, the building was considered to be damaged/demolished.Edges were detected using the Canny edge detector [65], and its two parameters were set to extract even weak edge pixels from the images.
After a disaster and during the reconstruction process, the rooftop color of a building may change and thus, conventional techniques that perform direct change detection such as pixel value-based subtraction methods [66,67] were not suitable for this aim.However, the two proposed indices were not sensitive to the changes in the rooftop colors of buildings from the pre-to the post-disaster scenario.In addition, since the change detection is at a building level and is followed by an advanced deep learning approach, a simple yet accurate method is required rather than more complicated methods that include contextual information [68].The buildings in each image patch were considered individually and each building was taken into account at each time.Subsequently, the intact buildings were extracted.Furthermore, NDVI and NDWI were used to remove vegetated areas and water bodies and to refine the results from the building mask in the post-disaster image.Only pre-disaster OSM building data were used for the damage analysis (Figure 3a) and thus, the damaged buildings (in the event time case) and demolished buildings (in the recovery case) were detected based on changes in the mean V-HOG and EDI.

Step 4: Updating the Building Database
The output of step 3 was affected by three main problems: 1) buildings that were present in the images but were missing in the OSM building map data could not be detected by the procedure that was implemented in step 3; 2) inaccuracies may occur especially due to mismatches and inaccuracies of OSM building map data.This, for example, will classify an intact building as damaged due to connected and adjacent buildings; and 3) it only gives changes of buildings existing before the disaster, therefore missing the capability to extract newly constructed buildings.To overcome these relevant issues, step 3 was followed by step 4, which is based on an automated deep learning-based network, as we will detail later.Furthermore, since the method is a pixel-level classification, it can extract the shape and size of the buildings and, thus, their changes.The method is primarily based on the adapted deep residual U-net [54] to automatically train and detect buildings for post-disaster situations.U-net has been shown to be reliable for image segmentation tasks [69][70][71] and residual connection has also been demonstrated as one of the effective network designs to detect building damages [41].Figure 4 shows the DeepResUnet-CRF design that was used in our study.We only used historical OSM data for the initial training of the network from the pre-disaster image, in which there were inaccuracies even after refinements of the OSM building maps.Therefore, only suitable ones were selected to train the network.In addition, transfer learning has been shown to be an effective method to accelerate the training process and increase the accuracy performance of the network in several computer vision [72][73][74] and remote sensing applications [71,75,76].Hence, the Resnet34 trained network from ImageNet was considered as a pre-trained network.Given that every satellite image may differ from the other ones in terms of image characteristics (e.g., radiometric range values) and changes in building properties (e.g., colors) after a disaster, the network that was trained on the pre-disaster situation/images may not provide accurate results for the post-disaster situations/images.Hence, the results of step 3 were used to generate new samples from the associated post-disaster satellite images to fine-tune the pre-trained network.
Remote Sens. 2019, 10, x FOR PEER REVIEW 8 of 20 buildings (in the event time case) and demolished buildings (in the recovery case) were detected based on changes in the mean V-HOG and EDI.

Step 4: Updating the Building Database
The output of step 3 was affected by three main problems: 1) buildings that were present in the images but were missing in the OSM building map data could not be detected by the procedure that was implemented in step 3; 2) inaccuracies may occur especially due to mismatches and inaccuracies of OSM building map data.This, for example, will classify an intact building as damaged due to connected and adjacent buildings; and 3) it only gives changes of buildings existing before the disaster, therefore missing the capability to extract newly constructed buildings.To overcome these relevant issues, step 3 was followed by step 4, which is based on an automated deep learning-based network, as we will detail later.Furthermore, since the method is a pixel-level classification, it can extract the shape and size of the buildings and, thus, their changes.The method is primarily based on the adapted deep residual U-net [54] to automatically train and detect buildings for post-disaster situations.U-net has been shown to be reliable for image segmentation tasks [69][70][71] and residual connection has also been demonstrated as one of the effective network designs to detect building damages [41].Figure 4 shows the DeepResUnet-CRF design that was used in our study.We only used historical OSM data for the initial training of the network from the pre-disaster image, in which there were inaccuracies even after refinements of the OSM building maps.Therefore, only suitable ones were selected to train the network.In addition, transfer learning has been shown to be an effective method to accelerate the training process and increase the accuracy performance of the network in several computer vision [72][73][74] and remote sensing applications [71,75,76].Hence, the Resnet34 trained network from ImageNet was considered as a pre-trained network.Given that every satellite image may differ from the other ones in terms of image characteristics (e.g., radiometric range values) and changes in building properties (e.g., colors) after a disaster, the network that was trained on the pre-disaster situation/images may not provide accurate results for the post-disaster situations/images.Hence, the results of step 3 were used to generate new samples from the associated post-disaster satellite images to fine-tune the pre-trained network.The fully connected networks and U-net have a common limitation in image segmentation tasks, which is the smoothing of edges.In addition, since the OSM building map did not provide a precise building mask, particularly for the building boundaries, we verified inaccuracies in some parts of the images.This problem was alleviated by implementing a Conditional Random Field method (CRF), which has been primarily investigated in the literature as a refinement over the U-net or Fully Connected Networks (FCNs) results [56,77,78].Accordingly, a fully/dense CRF model developed by Krähenbühl and Koltun [79] was employed to optimize the ResUnet results.
Labels for each pixel can be considered as random variables and their relations in the image can be considered as edges in a graph-based theory.These two factors constitute a conditional random field.In dense CRF, two main factors in its energy function are the unary and pairwise potentials.
Let x be the pixel-level labels for the input image, then the unary potential ϕ i (x i ) represents the probability of each i pixel and the pairwise potential τ i,j x i , x j that represents the cost between labels at i, j pixels is computed as follows: where l i and C i are the position and color vector for pixel i. µ x i , x j is defined based on the Potts model [80] and is equal to one if x i x j , otherwise it is equal to zero.The first Gaussian expression considers both color and location of the pixels, which is an appearance kernel to consider the similarity of the adjacent pixels using the θ α and θ β parameters and the second expression only considers pixel positions and is for smoothness using θ γ as the control parameter.
Then, the energy function can be written as follows: The CRF is an iterative method that evolves and computes the labels and predictions.

Datasets
We tested the proposed post-disaster building database updating framework on satellite images of Tacloban city, the Philippines, which was hit by super Typhoon Haiyan in November 2013, resulting in massive damages and losses (Figure 5).Tacloban is a highly urbanized city that is extensively vegetated due its tropical location.There are several types of built-up regions in the city, including dense urban areas, which are mostly located in the central business district of the city with adjacent buildings, slum areas, a mix of slum and formal buildings, isolated buildings surrounded by dense vegetation/trees, various building shapes and sizes from very small slum dwellings to large factories, and diverse rooftop colors.All these characteristics made the building detection procedure challenging and a suitable test area for our method.The WorldView2 (WV2) pan-sharpened images with 0.5 m spatial resolution and four multispectral bands (Blue, Green, Red, NIR) acquired 8 months before, 3 days after, and 4 years after the Typhoon were used in the work.The selection of four bands (Red, Green, Blue, and NIR) instead of using the entire eight bands that are available in the satellite images is to reduce the computational complexity/time of the processes while using the most informative bands of the satellite images for our goal.In addition, OSM (historical) building data for 2013 that were obtained from the OSM platform were used as the pre-disaster building map data.

Experimental Settings
The proposed method was applied to 10 selected image patches to evaluate its capabilities on urban areas characterized by various building and environmental characteristics.The test images were not included in the training process of the network and were specifically selected to test the performance of the proposed approach in various environmental/data sets-based conditions, as well as different damage and recovery (reconstruction) scenarios (Table 1).
Table 1.The targeted post-disaster building detection scenarios for each selected test images.

Image Targeted Post-Disaster Building Detection Scenarios #1
Buildings that survived the disaster #2 Partially destroyed slums and formal buildings #3 Buildings surrounded by flood water #4 Completely destroyed slums that produced an extensive amount of debris #5 Partially damaged factory buildings #6 Reconstructed and not-reconstructed (completely cleared/removed) buildings after 4 years #7 Reconstruction of the buildings almost to the same amount, shape, and sizes #8 Construction of new buildings and changes in rooftop colors in the recovery phase #9 Clear expansion of the built-up area and construction of new buildings #10 Change in the size of the reconstructed factory building Table 2 presents the parameters and thresholds that were employed in the implementation of the developed method.The results that were obtained by the proposed automatic procedure were compared with reference data that were produced manually by a qualified human operator.The WorldView2 (WV2) pan-sharpened images with 0.5 m spatial resolution and four multispectral bands (Blue, Green, Red, NIR) acquired 8 months before, 3 days after, and 4 years after the Typhoon were used in the work.The selection of four bands (Red, Green, Blue, and NIR) instead of using the entire eight bands that are available in the satellite images is to reduce the computational complexity/time of the processes while using the most informative bands of the satellite images for our goal.In addition, OSM (historical) building data for 2013 that were obtained from the OSM platform were used as the pre-disaster building map data.

Experimental Settings
The proposed method was applied to 10 selected image patches to evaluate its capabilities on urban areas characterized by various building and environmental characteristics.The test images were not included in the training process of the network and were specifically selected to test the performance of the proposed approach in various environmental/data sets-based conditions, as well as different damage and recovery (reconstruction) scenarios (Table 1).
Table 1.The targeted post-disaster building detection scenarios for each selected test images.

Image
Targeted Post-Disaster Building Detection Scenarios #1 Buildings that survived the disaster #2 Partially destroyed slums and formal buildings #3 Buildings surrounded by flood water #4 Completely destroyed slums that produced an extensive amount of debris #5 Partially damaged factory buildings #6 Reconstructed and not-reconstructed (completely cleared/removed) buildings after 4 years #7 Reconstruction of the buildings almost to the same amount, shape, and sizes #8 Construction of new buildings and changes in rooftop colors in the recovery phase #9 Clear expansion of the built-up area and construction of new buildings #10 Change in the size of the reconstructed factory building Table 2 presents the parameters and thresholds that were employed in the implementation of the developed method.The results that were obtained by the proposed automatic procedure were compared with reference data that were produced manually by a qualified human operator.Accuracies were assessed using common precision, recall, F 1 -score, and Intersection over Union (IoU) [3] measurements, all of which were computed at the pixel-level.Therefore, initially all the pixels in the image were sorted into four classes: True Positive (TP), True Negative (TN), False Positive (FP), and False Negative (FN).TP and TN show the correct detections, while FP and FN show incorrect detection results.Accordingly, the abovementioned quality measurements can be computed as follows: where |.| denotes the number of pixels assigned to each distinct class and F 1 -score is the combination of precision and recall into a single score.The accuracy values of the proposed approach in extracting buildings from the selected image patches, which are representative of different scenarios, shows the performance of the method in such challenges and conditions.

Experimental Results and Discussion
The implementation of the Deep ResUnet was carried out on the LISA platform of the SURFSara Dutch supercomputer.This platform is widely available for academic organizations.The pre-processing of the OSM data and the image rectifications were conducted in ArcGIS.
Figure 6 shows the automated post-disaster building detection results for the 10 selected images.From those, five images were selected from the satellite image, which were acquired 3 days after Typhoon Haiyan, and the other five images were acquired 4 years after to test the performance of the proposed method in both the response (damage) and recovery (reconstruction) phases.The TP, FP, and FN pixels are illustrated and overlaid on the original images by assigning green, red, and blue colors, respectively.In addition, the pre-disaster OSM building map overlaid (yellow color) on the pre-disaster satellite images is shown in the first column of Figure 6 to illustrate the changes after the disaster.Qualitative analysis of the results based on the visual interpretation showed the robustness of the proposed method in the extraction of the post-disaster buildings in such a challenging case study by producing more TPs (i.e., green areas), while limiting the FPs (i.e., red areas).Furthermore, the quantitative assessment of the results also supported this statement.The overall F1-score for the event-time and recovery time images was 84.2%, the overall precision was 84.1%, and the overall recall was 84.4% (Table 3).The balance between these accuracy measurements also showed the efficiency of the proposed method.In addition, the overall IoU of 73.1% for such a challenging test area demonstrates the performance of the proposed method in extracting building boundaries and their overlap with actual building footprints.Qualitative analysis of the results based on the visual interpretation showed the robustness of the proposed method in the extraction of the post-disaster buildings in such a challenging case study by producing more TPs (i.e., green areas), while limiting the FPs (i.e., red areas).Furthermore, the quantitative assessment of the results also supported this statement.The overall F 1 -score for the event-time and recovery time images was 84.2%, the overall precision was 84.1%, and the overall recall was 84.4% (Table 3).The balance between these accuracy measurements also showed the efficiency of the proposed method.In addition, the overall IoU of 73.1% for such a challenging test area demonstrates the performance of the proposed method in extracting building boundaries and their overlap with actual building footprints.The main challenges that were experienced in this case study were: (i) different textures of the building rooftops and in some cases, their similarity with other land covers (e.g., bare soil), which made the change detection step challenging; (ii) inaccuracies in the OSM map data, which influenced the change detection and extraction procedures (e.g., mismatches of the OSM building map with actual buildings in the satellite images and missing boundaries for some buildings); (iii) the complexity of the scene to perform the building extraction task (e.g., mixture of slums and formal buildings, even in the business district of the city, and buildings with various colors, shapes, and other building characteristics).The proposed method produced 81.0% and 68.4% mean F 1 -score and IoU accuracies, respectively, for the images belonging to the event time and 87.4% and 77.7% mean F 1 -score and IoU, respectively, for the recovery images.The lower accuracy for the event time images was due to the presence of a large amount of debris in the areas just after the disaster.From the event time images, test image #5 produced the best accuracy values with 88.3% F 1 -score (Table 3), while the lowest accuracy belonged to image #3, with a 73.7% F 1 -score.The most important reason for the low performance was the similarity of the texture and color of the buildings to the damaged ones, which resulted in more false positives (i.e., red areas) in the results.In addition, image #4 produced a 76.4% F 1 -score due to the presence of large amounts of debris around the intact buildings, which also led to more false positives.Also, the similarity of some intact building rooftop colors and texture to debris resulted in more false negatives (i.e., blue colored areas).Moreover, images #1 and #2 demonstrated the efficiency of the proposed method in extracting buildings in dense urban areas, as well as a mixture of formal building and slum areas, by producing a 82.1% and 84.5% F 1 -score, respectively.However, one large dark-green colored building was not detected in this image when using the proposed method, which could be down to two reasons: the lack of dark-green colored buildings in the training samples and/or the building was only detected partially, which was later removed during post-processing, particularly during the implementation of the CRF method (Figure 7).Although CRF removed some pixels that were correctly classified as building in the previous step, it led to an overall increase in the F 1 -score from 73.7% to 82.1% (Figure 7).Furthermore, image #5 showed the robustness of the method in extracting even partially damaged building.In addition, the low F 1 -score (54.09%) and IoU accuracy (37.1%) values that were produced by the network that was initially trained without fine-tuning for image #1 shows the significance of this step in improving the performance of the final building database updating results (Figure 7).The areas within the yellow boundaries denote the dark green buildings to stress the effect of the CRF in the final result.The areas within the purple boundaries denote the inaccuracies in the results, which were extracted using the network that was initially trained without fine-tuning.Green, red, and blue pixels represent TP, FP, and FN, respectively.

Conclusions
In this paper, we proposed a novel framework to update the post-disaster building database from VHR satellite imagery coupled with a leading online collaborative and open access map database, OSM.The approach was based on the automated training area generation for the proposed ResUnet-CRF network for building extraction.In addition, the proposed EDI and V-HOG indices confirmed a reliable performance in detecting changes of built-up areas from multi-temporal satellite imagery to distinguish between damaged and intact buildings.This was then used as a preprocessing step in the automatic training area selection for the post-disaster building extraction (both for immediately after the disaster and the recovery phase).
Experiments that were performed on 10 test images that were selected from the study area (VHR images) demonstrated that the proposed approach produced robust results in updating the building database in different post-disaster scenarios, such as damaged, collapsed, reconstructed, newly built, and demolished buildings, using diverse building characteristics such as color, shape, and size of the buildings under challenging environmental conditions.Indeed, the efficacy of the proposed method was independent from building characteristics.Although the proposed method performed efficiently in this case study, it would likely produce even higher accuracies when OSM data are more accurate, such as in large cities.Assessing the impact of registering and modifying the OSM map and satellite images before training the network is also of interest for future studies.The inaccuracies in the OSM data (i.e.mismatching of the building footprint with the actual ones in the satellite images) led to smoothness in the edge of the buildings, which was mostly overcome through the CRF method.The CRF method showed a fairly good performance in refining the ResUnet results of the building boundaries; however, it did not perform well in the pixel brightness value-based refinements.This was also expected due to the complexity of the study area, in which the color variation was high and there was a strong similarity between the building and non-building classes in some parts of the images in terms of color and texture.In this case, in future research, spatial context-based approaches The areas within the yellow boundaries denote the dark green buildings to stress the effect of the CRF in the final result.The areas within the purple boundaries denote the inaccuracies in the results, which were extracted using the network that was initially trained without fine-tuning.Green, red, and blue pixels represent TP, FP, and FN, respectively.

Conclusions
In this paper, we proposed a novel framework to update the post-disaster building database from VHR satellite imagery coupled with a leading online collaborative and open access map database, OSM.The approach was based on the automated training area generation for the proposed ResUnet-CRF network for building extraction.In addition, the proposed EDI and V-HOG indices confirmed a reliable performance in detecting changes of built-up areas from multi-temporal satellite imagery to distinguish between damaged and intact buildings.This was then used as a preprocessing step in the automatic training area selection for the post-disaster building extraction (both for immediately after the disaster and the recovery phase).
Experiments that were performed on 10 test images that were selected from the study area (VHR images) demonstrated that the proposed approach produced robust results in updating the building database in different post-disaster scenarios, such as damaged, collapsed, reconstructed, newly built, and demolished buildings, using diverse building characteristics such as color, shape, and size of the buildings under challenging environmental conditions.Indeed, the efficacy of the proposed method was independent from building characteristics.Although the proposed method performed efficiently in this case study, it would likely produce even higher accuracies when OSM data are more accurate, such as in large cities.Assessing the impact of registering and modifying the OSM map and satellite images before training the network is also of interest for future studies.The inaccuracies in the OSM data (i.e., mismatching of the building footprint with the actual ones in the satellite images) led to smoothness in the edge of the buildings, which was mostly overcome through the CRF method.The CRF method showed a fairly good performance in refining the ResUnet results of the building boundaries; however, it did not perform well in the pixel brightness value-based refinements.This was also expected due to the complexity of the study area, in which the color variation was high and there was a strong similarity between the building and non-building classes in some parts of the images in terms of color and texture.In this case, in future research, spatial context-based approaches can be used to overcome these drawbacks.The limitation of the proposed method was mainly associated with difficulties in detecting buildings that rarely occurred in the training set.For example, in a post-disaster scenario, construction materials of the buildings may change to increase the resilience of the buildings, which may result in changes in the rooftop color and texture of the buildings which were not present in the training set that was used to build the network.However, this issue was limited by the proposed method by retraining the network using updated satellite images.Besides, the performance of the network can be improved by adding more image patches for training.In addition, the framework could be used to update building maps in a normal situation by implementing the proposed approach, but excluding the change detection phase.

Figure 1 .
Figure 1.The framework proposed in this paper for post-disaster building database updating.Notes: V-HOG = Variation of Histogram of Oriented Gradients; EDI = Edge Density Index; CRF = Conditional Random Field.

Figure 1 .
Figure 1.The framework proposed in this paper for post-disaster building database updating.Notes: V-HOG = Variation of Histogram of Oriented Gradients; EDI = Edge Density Index; CRF = Conditional Random Field.

Figure 2 .
Figure 2. Example of the co-registration of the OpenStreetMap (OSM) building map and satellite images for Tacloban city, the Philippines.(a) Pre-disaster satellite image, (b) original OSM building map, and (c) modified OSM building map.The areas denoted by red boundaries show the effect of the refinements on the OSM map data.

Figure 2 .
Figure 2. Example of the co-registration of the OpenStreetMap (OSM) building map and satellite images for Tacloban city, the Philippines.(a) Pre-disaster satellite image, (b) original OSM building map, and (c) modified OSM building map.The areas denoted by red boundaries show the effect of the refinements on the OSM map data.

Figure 3 .
Figure 3. (a-c) Pre-disaster data in terms of (a) multispectral image, (b) V-HOG, and (c) edge detection image; OSM building map denoted with red lines.(d-f) Event time data in terms of (d) multispectral image, (e) V-HOG, and (f) edge detection image; the result of Step 3 is denoted with red lines.

Figure 3 .
Figure 3. (a-c) Pre-disaster data in terms of (a) multispectral image, (b) V-HOG, and (c) edge detection image; OSM building map denoted with red lines.(d-f) Event time data in terms of (d) multispectral image, (e) V-HOG, and (f) edge detection image; the result of Step 3 is denoted with red lines.

Figure 5 .
Figure 5.An overview of the Philippines showing the path of Typhoon Haiyan (a) and the location of Tacloban city (b).Pre-disaster image (c) and an image acquired three days after the disaster (d).

Figure 5 .
Figure 5.An overview of the Philippines showing the path of Typhoon Haiyan (a) and the location of Tacloban city (b).Pre-disaster image (c) and an image acquired three days after the disaster (d).

Figure 6 .
Figure 6.The results of the proposed method, test images, and pre-disaster images with OSM building boundaries (yellow).Column A: Pre-disaster (8 months before Haiyan) images with OSM building boundaries in yellow.Column B, Images #1-5 were taken 3 days after Haiyan and Images #6-10 were taken 4 years after Haiyan.Column C: the reference image for buildings, in which white and black colors represent the building and background pixels, respectively.Column D: detected buildings for test images.Green, red, and blue represent TP, FP, and FN, respectively.

Figure 6 .
Figure 6.The results of the proposed method, test images, and pre-disaster images with OSM building boundaries (yellow).Column A: Pre-disaster (8 months before Haiyan) images with OSM building boundaries in yellow.Column B, Images #1-5 were taken 3 days after Haiyan and Images #6-10 were taken 4 years after Haiyan.Column C: the reference image for buildings, in which white and black colors represent the building and background pixels, respectively.Column D: detected buildings for test images.Green, red, and blue represent TP, FP, and FN, respectively.

Figure 7 .
Figure 7. (a) Original test image #1, (b) ResUnet-CRF, (c) ResUnet, and (d) ResUnet without finetuning results.The areas within the yellow boundaries denote the dark green buildings to stress the effect of the CRF in the final result.The areas within the purple boundaries denote the inaccuracies in the results, which were extracted using the network that was initially trained without fine-tuning.Green, red, and blue pixels represent TP, FP, and FN, respectively.

Figure 7 .
Figure 7. (a) Original test image #1, (b) ResUnet-CRF, (c) ResUnet, and (d) ResUnet without fine-tuning results.The areas within the yellow boundaries denote the dark green buildings to stress the effect of the CRF in the final result.The areas within the purple boundaries denote the inaccuracies in the results, which were extracted using the network that was initially trained without fine-tuning.Green, red, and blue pixels represent TP, FP, and FN, respectively.

Table 2 .
The parameters and threshold values used to do the experiments.

Table 3 .
Numerical results of the proposed post-disaster building database update for event and recovery times.