Object-Based Building Change Detection by Fusing Pixel-Level Change Detection Results Generated from Morphological Building Index

Change detection (CD) is an important tool in remote sensing. CD can be categorized into pixel-based change detection (PBCD) and object-based change detection (OBCD). PBCD is traditionally used because of its simple and straightforward algorithms. However, with increasing interest in very-high-resolution (VHR) imagery and determining changes in small and complex objects such as buildings or roads, traditional methods showed limitations, for example, the large number of false alarms or noise in the results. Thus, researchers have focused on extending PBCD to OBCD. In this study, we proposed a method for detecting the newly built-up areas by extending PBCD results into an OBCD result through the Dempster–Shafer (D–S) theory. To this end, the morphological building index (MBI) was used to extract built-up areas in multitemporal VHR imagery. Then, three PBCD algorithms, change vector analysis, principal component analysis, and iteratively reweighted multivariate alteration detection, were applied to the MBI images. For the final CD result, the three binary change images were fused with the segmented image using the D–S theory. The results obtained from the proposed method were compared with those of PBCD, OBCD, and OBCD results generated by fusing the three binary change images using the major voting technique. Based on the accuracy assessment, the proposed method produced the highest F1-score and kappa values compared with other CD results. The proposed method can be used for detecting new buildings in built-up areas as well as changes related to demolished buildings with a low rate of false alarms and missed detections compared with other existing CD methods.


Introduction
The most important social transformation in human history is regarded as urbanization, in which cities play an important role [1]. Although only 3% of the Earth's land is occupied by cities, 3.5 billion people live in cities, and it will rise to over 5 billion by 2030, making it 95% of the urban expansion. This rapid urbanization results in 60-80% of energy consumption and carbon emissions [2,3]. The development of sustainable cities is becoming crucial because of the challenges associated with urban growth and urbanization. The United Nations Sustainable Development Goals include sustainable cities and communities (i.e., Goal 11) that require a focus on safe and sustainable However, fusing CD results often causes uncertainty in results. Researchers addressed this problem using the Dempster-Shafer (D-S) theory with a segmented image to generate OBCD results [30]. It is necessary to assign the certainty weight manually while calculating change, nochange, and uncertainty objects when implementing the D-S theory. Keeping in mind the uncertainty problem while applying the D-S theory and the problem with assigning the certainty weight, we derived a method that automatically calculates and assigns the certainty weight for each binary CD result while calculating the change, nochange, and uncertainty [31].
With the availability and development of VHR imagery, the applications of the CD are increasing, and detecting the changes in complex objects are becoming the main focus, including buildings, which is one of the main applications of CD. For detecting building changes, Liu et al. [32] proposed a method in which shape and spatial features were utilized to amplify the ability of features. Im et al. [33] proposed an OBCD method, which is based on the fact that the brightness value of an object is correlated when there is no change and uncorrelated when there is a change in multitemporal images. This method is based on image segmentation and correlation analysis. Dalla Mura et al. [8] proposed an unsupervised method for CD by utilizing morphological operations and CVA and proved that the proposed method outperforms the pixel-based CVA. In another study, a building CD method based on morphological building index (MBI), spectral, and shape conditions was proposed in [34]. Moreover, an automatic building CD method based on a morphological attribute profile was proposed by Li et al. [35]. Leichtle et al. presented a CD method by using VHR images and focusing on individual buildings [36]. A multi-level 3D building CD method for megacities was proposed in [37]. A 3D building roof reconstruction and building CD method is proposed in [38]. An unsupervised OBCD approach focusing on individual buildings in an urban environment using VHR imagery is proposed in [36]. Zhang et al. [39] proposed an OBCD approach with separate segmentation of multitemporal high-resolution imagery focusing on building CD in an urban area. They individually conducted segmentation for each multitemporal imagery and extracted change features for generating changed objects. In [40], building a CD from VHR imagery by combining MBI and slow feature analysis was proposed. A multi-index-based automatic CD method was proposed for high-resolution imagery [41]. A multi-level approach for building CD was proposed in [42] that utilizes the MBI and mutual information together.
Although many studies have focused on CD for urban development, it is still a challenging issue in the field of remote sensing because of its diverse and complex cases. Moreover, detecting small and complex objects, such as buildings in altered areas, has received insignificant attention from analysts because most studies that extend PBCD to OBCD and use only OBCD or PBCD techniques have focused on detecting all changes. Studies regarding building changes using MBI have generated a binary change image using one PBCD classifier (i.e., CVA or image differencing method) and converted PBCD to OBCD using morphological operations such as morphological closing and opening operations. The problem with morphological closing and opening operations is the selection of the best structural element for the dataset; for instance, a large structural element will extract unnecessary areas such as buildings. However, very small structural elements will miss parts of buildings and result in incomplete building shapes. Consequently, the CD result can be influenced by severe false alarms or missed detections.
In this study, we proposed an OBCD method for VHR imagery, focusing only on newly built buildings in changed areas by fusing the results of different PBCD classifiers and segmented image using the D-S theory. First, we conducted the object segmentation of multitemporal VHR imagery. Next, the building feature maps were generated from VHR multitemporal imagery using MBI [43]. The advantage of using MBI is that the effect of shadows of buildings on CD results can be effectively removed. Then three change intensity images were implemented by using principal component analysis (PCA) [44], CVA [11], and iteratively reweighted multivariate alteration detection (IRMAD) [45]. A threshold is generated and applied to the three change intensity images, and binary change images are conducted. The three binary change images were fused with the segmented image by using the Remote Sens. 2020, 12, 2952 4 of 20 D-S theory to generate the OBCD map. Finally, the accuracy assessment of the proposed CD technique was conducted by using a manually digitized map. To verify the superiority of the proposed method, we applied the proposed method to the multitemporal VHR images from the KOMPSAT-3 satellite with a spatial resolution of 2.8 m [46]. The results from the proposed method were compared with the results from extended PBCD to OBCD of the three classifiers. Moreover, the OBCD result was generated by fusing PBCD results from the three classifiers with the segmented image by using the major voting technique.
The study aims to focus on the United Nations Sustainable Development Goal 11. The proposed method can be used for the planning and development of cities and for updating maps. To detect rapid changes in the urban development areas using remote sensing technology, we concentrated on detecting the changes in building areas using VHR images while minimizing the false alarms that may occur during the application of general CD methods. In particular, for reducing the detection changes regardless of urban development (e.g., seasonal changes, shadows, and vegetation), we used various difference images generated by applying PBCD techniques (i.e., CVA, PCA, and IRMAD) to MBI feature images, instead of using difference images generated using CD techniques applied to original images. To minimize the false alarms of CD results, we used the D-S theory based on the fusion method to extend PBCD to OBCD. Here, we automatically derived threshold for the certainty weight to be applied in any case without manual interruption. A detailed result analysis of the proposed method, including the segmentation effect on the performance, visual inspection, and numerical assessment by comparing with other PBCD and OBCD results, is conducted to verify the effectiveness of the proposed method.

Methodology
The procedure for the proposed method is shown in Figure 1. The primary steps are as follows: image segmentation using an image acquired at time two (image T2) 2.
creation of MBI feature maps using images from both times 3.
implementation of the three PBCD methods (i.e., CVA, PCA, and IRMAD) and application of an appropriate threshold for building detection to obtain binary change results, and 4.
fusion of the three binary CD results with a segmented object map using the D-S theory to obtain a final OBCD result.
Remote Sens. 2020, 12, x FOR PEER REVIEW  4 of 20 proposed method, we applied the proposed method to the multitemporal VHR images from the KOMPSAT-3 satellite with a spatial resolution of 2.8 m [46]. The results from the proposed method were compared with the results from extended PBCD to OBCD of the three classifiers. Moreover, the OBCD result was generated by fusing PBCD results from the three classifiers with the segmented image by using the major voting technique.
The study aims to focus on the United Nations Sustainable Development Goal 11. The proposed method can be used for the planning and development of cities and for updating maps. To detect rapid changes in the urban development areas using remote sensing technology, we concentrated on detecting the changes in building areas using VHR images while minimizing the false alarms that may occur during the application of general CD methods. In particular, for reducing the detection changes regardless of urban development (e.g., seasonal changes, shadows, and vegetation), we used various difference images generated by applying PBCD techniques (i.e., CVA, PCA, and IRMAD) to MBI feature images, instead of using difference images generated using CD techniques applied to original images. To minimize the false alarms of CD results, we used the D-S theory based on the fusion method to extend PBCD to OBCD. Here, we automatically derived threshold for the certainty weight to be applied in any case without manual interruption. A detailed result analysis of the proposed method, including the segmentation effect on the performance, visual inspection, and numerical assessment by comparing with other PBCD and OBCD results, is conducted to verify the effectiveness of the proposed method.

Methodology
The procedure for the proposed method is shown in Figure 1. The primary steps are as follows: 1. image segmentation using an image acquired at time two (image T2) 2. creation of MBI feature maps using images from both times 3. implementation of the three PBCD methods (i.e., CVA, PCA, and IRMAD) and application of an appropriate threshold for building detection to obtain binary change results, and 4. fusion of the three binary CD results with a segmented object map using the D-S theory to obtain a final OBCD result.

Multiresolution Segmentation
The multiresolution segmentation is an iterative process that minimizes the average heterogeneity of objects. The heterogeneity used in this algorithm has spatial and spectral

Multiresolution Segmentation
The multiresolution segmentation is an iterative process that minimizes the average heterogeneity of objects. The heterogeneity used in this algorithm has spatial and spectral components. The spectral component is defined by using the spectral response of the pixels contained in a segment. The spatial component is based on two shape features, which are smoothness and compactness. The compactness is defined as the ratio between the perimeter of the segment and the square root of the number of pixels it contains. The smoothness is defined as the ratio between the perimeter of the object and the perimeter of the minimum boundary rectangle [47].
The eCognition software is used to conduct the multiresolution segmentation of VHR multitemporal imagery [48]. The multiresolution segmentation integrated into the eCognition software uses the bottom-up strategy [49], starting by creating single-pixel objects and grouping those objects until a given threshold is reached [50]. The threshold is given by setting a scale parameter that is weighted with shape and compactness parameters. The scale parameter controls the spectral variations within the object and affects the segmentation size of an image. The shape parameter is a weight between object shape and its spectral color. The smaller value of the shape parameter means that the spectral characteristics are considered more than shape during the segmentation process. The compactness parameter is the ratio of boundary and area of an object. Among the three parameters in eCognition software, scale parameters have a great impact on the CD performance [51]. In this study, for conducting multiresolution segmentation, only the image T2 is used because we focused on buildings only; therefore, we generated the segmentation by using the image, which had newly constructed buildings. For generating segmentation, the scale parameter was set to 60, whereas shape and compactness parameters were left as default values, 0.1 and 0.5, respectively.

Morphological Building Index (MBI)
The primary concept of MBI is to form a relation between the spectral and spatial characteristics of buildings (i.e., brightness, size, and contrast) and morphological transformations (i.e., top-hat transformation (THT), granulometry, and directionality) [34]. In this study, the MBI introduced in [44] was calculated step-by-step as follows.
The calculation of the brightness value is the first step, which reduces the multispectral bands used in the calculation of MBI. In this study, for the calculation of brightness, only visible bands are used since they have the most significant contribution to the spectral information of buildings [52]. The maximum of each pixel in each visible band is recorded as a brightness value by using Equation (1): where B(l) represents the brightness value of pixel l, band m (l) is the spectral value of pixel l at band m, and M is the total number of bands used. The second step is to determine the information regarding the contrast of buildings. THT is a difference in brightness image, and its morphological opening profile which is mostly used because it is capable of highlighting bright structures with a size less than or equal to a predefined value [34].
The third and fourth steps are to determine the THT and its differential morphological profiles (DMPs). Buildings in VHR images indicate complicated spatial arrangements such as different sizes. Multiscale THT based on DMPs [53] is used to generate the building index. Moreover, a linear structural element is used while generating MBI because it is effective in extracting the information regarding the directionality of local structures [54]. Equation (2) is used to calculate the multiscale and multidirectional DMPs of the THT: While THT B(d,s) is calculated by using Equations (3) and (4): and where THT B(d,s) indicates the THT of the brightness image B, γ (d,s) (B) is the opening by reconstruction of the brightness image B and is defined based on two basic morphological operations (erosion ε and dilation δ), d represents the directionality of the structural element, and s (s min ≤ s ≤ s max ) indicates the scale parameter of a structural element and is used to extract objects with different sizes [48], and ∆s is the interval of the profile. The final step is to calculate MBI. The MBI feature image (MBI T ) is determined as the average of the multiscale and multidirectional THT B(d,s) by using Equation (5).
where D and S denote the value of directionality and that of the scale of profiles, respectively. Four directions (D = 4) were considered from 0 to 135 with a delay of 45 because an increase in directionality does not lead to an increase in accuracy [54]. The value of the scale of profile is calculated using the expression: S = ((s max − s min )/∆s) + 1, where s max and s min should be determined according to the spatial characteristics of the buildings and spatial resolution (in this study, s min = 2, s max 52 with ∆s = 5) [55].

Pixel-Based Change Detection (PBCD)
Due to the complexity of the multitemporal VHR images, it is difficult to obtain an accurate CD result from one PBCD method. Therefore, we utilized three independent PBCD methods and fused their CD results with a segmented image by using the D-S theory. Three commonly used and effective unsupervised PBCD methods, including CVA, IRMAD, and PCA, were considered.
CVA is the most commonly used method by researchers for CD purposes. Change vectors are obtained by subtracting the MBI feature images, which are generated from bi-temporal images using MBI. For determining the change intensity using CVA, the Euclidean distance between pixels was used [11]. The change intensity using CVA is generated using Equation (6): where PBCD CVA is the change intensity of CVA and MBI T1 and MBI T2 are the MBI feature images.
In the PCA method, the difference image is generated by the absolute-valued difference image. For extracting the eigenvectors, the difference image is divided into h × h (h = 4 used in this study) non-overlapping blocks. Then the difference image vector set is projected onto the eigenvector space to produce a feature vector space [44]. The change intensity using the PCA method is calculated using Equation (7). with where e T indicates the eigenvector of the covariance matrix, ψ is the average pixel value, and MBI d is the difference image. The IRMAD is based on the principle of canonical correlation analysis. The Chi-squared distance is used to calculate the change intensity of IRMAD [45]. During the iteration process in IRMAD for reducing the negative outcome of the changed pixels, the high weights are assigned to the unchanged pixels. The change intensity using IRMAD is generated by Equation (9). and where σ is the standard deviation of the used band, and a and b are the transformation vectors calculated from the canonical correlation analysis. The three change intensity images (i.e., PBCD CVA , PBCD PCA , and PBCD IRMAD ) were normalized to 0 and 1, and a threshold (T MBI ) is applied to the three change intensity images to generate the three binary change images: where PBCD is the change intensity images generated by k classifiers.

Dempster-Shafer (D-S) Theory
Fusing multiple CD results causes uncertainty in the CD result. In order to solve the problem of uncertainty and for improving accuracy of the CD result and decreasing the false alarms, the D-S theory is employed. The D-S evidence theory measures the probability of an event by fusing the probability of each evidence [56]. In [30], the D-S theory was used to fuse different PBCD results with segmented images and generated an OBCD result. However, in their study, certainty weights while calculating the change, nochange, and uncertainty were manually assigned. In [31], we automatically calculated and assigned the certainty weight for each segmented object. In this study, for assigning the certainty weight while calculating the change, nochange, and uncertainty, the same concept as in [31] is utilized. The PBCD results were combined with the segmented image and the certainty weight is calculated.
A simple technique to generate the certainty weight is to calculate CD accuracies by comparing them with a manually digitized reference map while changing the values of the certainty weight. However, the construction of the reference map in practical CD applications is difficult. The certainty weight (p i j ) of the evidence i (i.e., CVA, PCA, and IRMAD) for each object j is calculated using Equation (12): where σ i j is the standard deviation of the change intensity in object j acquired by the i th binary PBCD result, i.e., the i th evidence. If the change intensity is homogeneous in the object, the certainty weight value will be large, and vice versa. Accordingly, the certainty weight can be automatically calculated while considering the stability of the change intensity of each object.
Then the probability of change (mC i j ), nochange (mNC i j ), and uncertainty (mp i j ) are calculated using Equation (13): where NC i j shows the number of changed pixels in object j of the PBCD result i, NnC i j shows the number of unchanged pixels in object j of the PBCD result i, and N j is the total number of pixels in an object j.
If the changed pixels in an object j are greater than or equal to unchanged and uncertainty, then the object OBCD j will be declared a changed object. This condition can be expressed as follows:

Experimental Results
For the detection of only buildings in the changed area, the experiments were carried out on a pair of KOMPSAT-3 multispectral images of Sejong City, South Korea, with a spatial resolution of 2.8 m acquired in 2013 (image T1) and 2019 (image T2). Both the images had four bands, three visible bands and one near-infrared band. Large-scale developments have taken place in this area since 2013, including high-rise newly built-up buildings. To detect the buildings in the changed area and check the effectiveness of the proposed method, we selected two subsets from the KOMPSAT-3 imagery.
Before conducting the proposed approach, we carried out co-registration between VHR imagery to minimize the geometric misalignment as mentioned in [31]. A phase correlation method was used to detect conjugate points for the transformation model construction. The local templates were constructed over the image T1 for detecting the conjugate points. On the basis of the coordinate information from the metadata, the location of the corresponding templates of image T2 was derived. For determining the similarity peak, the phase correlation was calculated, which can detect the translation difference between the images in x and y directions. The peak location of the phase correlation between the template images was used to extract well-distributed conjugate points. Then, the corresponding conjugate point's position for the template image T2 was determined as a shifted location from the centroid of the template to the amount showing the highest similarity value of the phase correlation. After extracting the conjugate points, and improved piecewise linear transformation [57] was used to warp the image T2 to the coordinates of the image T1.
After carrying out co-registration, the MBI feature images were generated from each image in each subset using only visible bands because they correlate with the spectral property of buildings [54]. Then from the MBI feature images, the three PBCD intensity images were generated using the classifiers CVA, PCA, and IRMAD. Then a threshold (T MBI ) was applied to the three intensity images for generating the binary change images. The three binary change images were then fused with the segmented image using the D-S theory, and the OBCD result was generated. The proposed CD result was compared with the OBCD results generated by fusing the same three binary change maps with the segmented image using the major voting technique. In addition, the proposed CD result was compared with the PBCD results from the three classifiers and their extended OBCD results generated by using major voting techniques and segmented image as used for the proposed method. For the quantitative evaluation of CD performance, the F1-score, kappa, miss rate (MR), and false alarm rate (FAR) were calculated using each CD result and the manually digitized reference map.
F1-score is the harmonic mean of precision and recall. Precision is the correct positive results divided by all positive results, whereas recall is the number of correct positive results divided by all possible positive results. The higher F1-score means the best performance of the classifier. The kappa coefficient can be determined by using the observed accuracy and random accuracy. FAR is calculated using the total number of false alarms divided by the total number of unchanged samples in the reference map, and MR is calculated using the total number of missed detections divided by the total number of changed samples in the reference map.
The segmented image was generated in eCognition software using a scale parameter, shape, and compactness of 60, 0.1, and 0.5, respectively. The same scale parameter for both subsets was selected after generating and evaluating results for both subsets. The detailed analysis of the scale parameters affecting the CD performance will be given in Section 4. Moreover, the running time of the process was measured. It highly depends on the number of objects in the segmented image. It took 132 s and 40 s for experiments 1 and 2 that have 9253 and 4977 objects, respectively.

Experiment 1
The first experiment was conducted on the subset of multitemporal VHR images of the KOMPSAT-3 satellite with a size of 925 × 637 pixels. Figure 2 shows both the multispectral images. Several changes have taken place in the area during the period from 2013 to 2019, but we aimed to detect only the newly built-up buildings. From Figure 2, it can be seen that the study area has roads, trees, soil, and buildings. Therefore, MBI feature images were generated using the above images to highlight the buildings and to filter out other areas such as trees. For generating the MBI feature images, the parameters mentioned before (e.g., d = 0, 45, 90, 135 and s from 2 to 52 with an interval of 5) were used on the basis of the sizes, shapes, and spectral characteristics of buildings. Figure 3 shows the MBI feature images of image T1 and image T2.  Three intensity images were generated by using CVA, PCA, and IRMAD. Then a threshold ( = 0.3) was applied for generating the pixel-based binary change images. Binary change images were fused with the segmented image to generate the OBCD result by applying the D-S theory. The CD results from three PBCD classifiers (a-c), their OBCD results (d-f), OBCD result from fusing these three classifiers using major voting (g), the result generated using the proposed method (h), and CD reference map (i) are shown in Figure 4. It can be seen from Figure 4a-c that PBCD results generated many false alarms and salt-and-pepper noise. However, these aspects were minimized when PBCD results were extended to OBCD (Figure 4d-f). From Figure 4g-h, the noises and false alarms were minimized further when PBCD results were fused by using major voting techniques and the proposed method. However, the shapes of the buildings in the result generated by major voting were not proper because it failed to detect most parts of the buildings. From Figure 2, it can be seen that the study area has roads, trees, soil, and buildings. Therefore, MBI feature images were generated using the above images to highlight the buildings and to filter out other areas such as trees. For generating the MBI feature images, the parameters mentioned before (e.g., d = 0, 45, 90, 135 and s from 2 to 52 with an interval of 5) were used on the basis of the sizes, shapes, and spectral characteristics of buildings. Figure 3 shows the MBI feature images of image T1 and image T2. From Figure 2, it can be seen that the study area has roads, trees, soil, and buildings. Therefore, MBI feature images were generated using the above images to highlight the buildings and to filter out other areas such as trees. For generating the MBI feature images, the parameters mentioned before (e.g., d = 0, 45, 90, 135 and s from 2 to 52 with an interval of 5) were used on the basis of the sizes, shapes, and spectral characteristics of buildings. Figure 3 shows the MBI feature images of image T1 and image T2.  Three intensity images were generated by using CVA, PCA, and IRMAD. Then a threshold ( = 0.3) was applied for generating the pixel-based binary change images. Binary change images were fused with the segmented image to generate the OBCD result by applying the D-S theory. The CD results from three PBCD classifiers (a-c), their OBCD results (d-f), OBCD result from fusing these three classifiers using major voting (g), the result generated using the proposed method (h), and CD reference map (i) are shown in Figure 4. It can be seen from Figure 4a-c that PBCD results generated many false alarms and salt-and-pepper noise. However, these aspects were minimized when PBCD results were extended to OBCD (Figure 4d-f). From Figure 4g-h, the noises and false alarms were minimized further when PBCD results were fused by using major voting techniques and the proposed method. However, the shapes of the buildings in the result generated by major voting were not proper because it failed to detect most parts of the buildings. Three intensity images were generated by using CVA, PCA, and IRMAD. Then a threshold (T MBI = 0.3) was applied for generating the pixel-based binary change images. Binary change images were fused with the segmented image to generate the OBCD result by applying the D-S theory. The CD results from three PBCD classifiers (a-c), their OBCD results (d-f), OBCD result from fusing these three classifiers using major voting (g), the result generated using the proposed method (h), and CD reference map (i) are shown in Figure 4. It can be seen from Figure 4a-c that PBCD results generated many false alarms and salt-and-pepper noise. However, these aspects were minimized when PBCD results were extended to OBCD (Figure 4d-f). From Figure 4g-h, the noises and false alarms were minimized further when PBCD results were fused by using major voting techniques and the proposed method. However, the shapes of the buildings in the result generated by major voting were not proper because it failed to detect most parts of the buildings.
Remote Sens. 2020, 12, x FOR PEER REVIEW 10 of 20 The numerical results of the first experiment are listed in Table 1. From the table, extending the PBCD results to OBCD improved the accuracy of CD. The F1-score of CVA, PCA, and IRMAD increased from 0.1622, 0.6386, and 0.5130 to 0.6247, 0.6526, and 0.5694, respectively. Moreover, kappa improved steadily while extending from PBCD to OBCD, and FAR was smaller for OBCD compared with that for PBCD. However, because of the over-detection in PBCD, MR was smaller for PBCD than that for OBCD. The OBCD result using the major voting technique to fuse the results from the three classifiers showed a lower F1-score than the object-based results of CVA, PCA, and IRMAD. Furthermore, the FAR of the major voting technique is less than all the other results because of the under-extraction of the changed regions. Moreover, the PCA-based PBCD result over-extracted the changed regions, The numerical results of the first experiment are listed in Table 1. From the table, extending the PBCD results to OBCD improved the accuracy of CD. The F1-score of CVA, PCA, and IRMAD increased from 0.1622, 0.6386, and 0.5130 to 0.6247, 0.6526, and 0.5694, respectively. Moreover, kappa improved steadily while extending from PBCD to OBCD, and FAR was smaller for OBCD compared with that for PBCD. However, because of the over-detection in PBCD, MR was smaller for PBCD than that for OBCD. The OBCD result using the major voting technique to fuse the results from the three classifiers showed a lower F1-score than the object-based results of CVA, PCA, and IRMAD. Furthermore, the FAR of the major voting technique is less than all the other results because of the under-extraction of the changed regions. Moreover, the PCA-based PBCD result over-extracted the changed regions, which is confirmed by the fact that it has larger FAR and smaller MR compared with other PBCD results. On the other hand, the FAR and MR of the proposed method are quite reasonable, with the values of 0.0586 and 0.315. Moreover, the proposed method showed the highest F1-score and kappa with 0.6759 and 0.6194, respectively.

Experiment 2
The second experiment was conducted on the subset from KOMPSAT-3 satellite imagery with a size of 477 × 710 pixels, as shown in Figure 5. In this area, several changes have also taken place such as from trees to bare soil and from bare soil to buildings or roads. As we focus on buildings only, the same strategy in experiment 1 was used in this experiment.

Experiment 2
The second experiment was conducted on the subset from KOMPSAT-3 satellite imagery with a size of 477 × 710 pixels, as shown in Figure 5. In this area, several changes have also taken place such as from trees to bare soil and from bare soil to buildings or roads. As we focus on buildings only, the same strategy in experiment 1 was used in this experiment. The MBI feature images were generated to highlight the buildings only and to filter out other areas. The same parameters used in experiment 1 for generating MBI feature images were used. Figure 6 shows the MBI feature images for the second subset. Three change intensity images were generated using the three classifiers (e.g., CVA, PCA, and IRMAD) as in experiment 1. However, for converting the change intensity images to binary change images, the threshold ( = 0.4) was applied. The three binary change images were fused with the The MBI feature images were generated to highlight the buildings only and to filter out other areas. The same parameters used in experiment 1 for generating MBI feature images were used. Figure 6 shows the MBI feature images for the second subset.

Experiment 2
The second experiment was conducted on the subset from KOMPSAT-3 satellite imagery with a size of 477 × 710 pixels, as shown in Figure 5. In this area, several changes have also taken place such as from trees to bare soil and from bare soil to buildings or roads. As we focus on buildings only, the same strategy in experiment 1 was used in this experiment. The MBI feature images were generated to highlight the buildings only and to filter out other areas. The same parameters used in experiment 1 for generating MBI feature images were used. Figure 6 shows the MBI feature images for the second subset. Three change intensity images were generated using the three classifiers (e.g., CVA, PCA, and IRMAD) as in experiment 1. However, for converting the change intensity images to binary change images, the threshold ( = 0.4) was applied. The three binary change images were fused with the Three change intensity images were generated using the three classifiers (e.g., CVA, PCA, and IRMAD) as in experiment 1. However, for converting the change intensity images to binary change images, the threshold (T MBI = 0.4) was applied. The three binary change images were fused with the segmented image (conducted with the same parameters as experiment 1) using the D-S theory, and the OBCD result was generated. Furthermore, the OBCD was generated by fusing three binary change images with the segmented image using the major voting technique. The three binary change images were also extended to OBCD for the comparison with the proposed method. Figure 7 shows the three PBCD results, their extended OBCD results, the result of major voting technique, the result generated by the proposed method, and a change reference map. As similarly shown in experiment 1, the PBCD results on the second subset extracted more false alarms and salt-and-pepper noises compared with OBCD results. The noises are further minimized by the proposed method by reducing the missed detections and false alarms.
Remote Sens. 2020, 12, x FOR PEER REVIEW 12 of 20 segmented image (conducted with the same parameters as experiment 1) using the D-S theory, and the OBCD result was generated. Furthermore, the OBCD was generated by fusing three binary change images with the segmented image using the major voting technique. The three binary change images were also extended to OBCD for the comparison with the proposed method. Figure 7 shows the three PBCD results, their extended OBCD results, the result of major voting technique, the result generated by the proposed method, and a change reference map. As similarly shown in experiment 1, the PBCD results on the second subset extracted more false alarms and salt-and-pepper noises compared with OBCD results. The noises are further minimized by the proposed method by reducing the missed detections and false alarms.   Table 2 shows the quantitative evaluation of the second subset. Overall, similarly to experiment 1, the performance of OBCD was better than the PBCD. The MR of pixel-based PCA was smaller compared with the MR of other results, including both the PBCD and OBCD, whereas the FAR was higher in pixel-based PCA as well as in other PBCD results because of the over-detection of changed areas. The FAR of the major voting technique was lower than those obtained by other results, whereas its MR was the highest, leading to the under-detection of changed areas. However, the proposed method can significantly detect the newly built-up buildings in changed areas, achieving the highest F1-score and kappa values as 0.6905 and 0.6613, respectively.

Discussion
Among the three parameters in eCognition software, scale parameters have a great impact on the CD performance [51]. Therefore, for analyzing the effect of the scale parameter in the segmentation process, the F1-score was calculated for the proposed OBCD method and major voting technique while changing the values of the scale parameter from 20 to 100 with an interval of 10. Figure 8 shows the effect of the scale parameter on the CD performance. From Figure 8a, it can be seen that the proposed method showed a higher F1-score than the major voting technique and remained almost constant during the whole range of the scale parameters. Furthermore, the scale parameter had less effect on the result of the proposed method. A similar pattern was shown in experiment 2 (Figure 8b). The F1-score was highest at scale 60. Based on the sensitivity analysis of the scale parameter, we selected the scale parameter as 60 for both experiments.
Remote Sens. 2020, 12, x FOR PEER REVIEW 13 of 20 Table 2 shows the quantitative evaluation of the second subset. Overall, similarly to experiment 1, the performance of OBCD was better than the PBCD. The MR of pixel-based PCA was smaller compared with the MR of other results, including both the PBCD and OBCD, whereas the FAR was higher in pixel-based PCA as well as in other PBCD results because of the over-detection of changed areas. The FAR of the major voting technique was lower than those obtained by other results, whereas its MR was the highest, leading to the under-detection of changed areas. However, the proposed method can significantly detect the newly built-up buildings in changed areas, achieving the highest F1-score and kappa values as 0.6905 and 0.6613, respectively.

Discussion
Among the three parameters in eCognition software, scale parameters have a great impact on the CD performance [51]. Therefore, for analyzing the effect of the scale parameter in the segmentation process, the F1-score was calculated for the proposed OBCD method and major voting technique while changing the values of the scale parameter from 20 to 100 with an interval of 10. Figure 8 shows the effect of the scale parameter on the CD performance. From Figure 8a, it can be seen that the proposed method showed a higher F1-score than the major voting technique and remained almost constant during the whole range of the scale parameters. Furthermore, the scale parameter had less effect on the result of the proposed method. A similar pattern was shown in experiment 2 (Figure 8b). The F1-score was highest at scale 60. Based on the sensitivity analysis of the scale parameter, we selected the scale parameter as 60 for both experiments.  Figure 9 shows the F1-score of the proposed method and major voting by changing T from 0.1 to 0.9 with an interval of 0.1. Overall, the graph illustrates that the proposed method showed a better result than the major voting technique. In experiment 1, the proposed method yielded the highest F1-score at T = 0.3; in experiment 2 the proposed method yielded the highest F1-score at T = 0.4. According to these results, we selected a threshold with the highest F1-score for experiments 1 (T = 0.3) and 2 (T = 0.4). Moreover, we generated the results for both subsets using the automatic threshold selection method, such as the Otsu method. The F1-sore generated by  Figure 9 shows the F1-score of the proposed method and major voting by changing T MBI from 0.1 to 0.9 with an interval of 0.1. Overall, the graph illustrates that the proposed method showed a better result than the major voting technique. In experiment 1, the proposed method yielded the highest F1-score at T MBI = 0.3; in experiment 2 the proposed method yielded the highest F1-score at T MBI = 0.4. According to these results, we selected a threshold with the highest F1-score for experiments 1 (T MBI = 0.3) and 2 (T MBI = 0.4). Moreover, we generated the results for both subsets using the automatic threshold selection method, such as the Otsu method. The F1-sore generated by applying the Otsu method instead of T MBI in experiments 1 and 2 was 0.6742 and 0.6884, respectively. The F1-scores of experiments 1 and 2 generated using T MBI are 0.6759 and 0.6905, respectively. The thresholds generated by the Ostu method for the three PBCD techniques using both subsets are between 0.3 and 0.45. Hence, we propose a range from 0.3 to 0.45 because a very small threshold will detect unnecessary areas such as changes in trees or soil and result in many false alarms, while a high threshold will result in a large number of missed detections. Furthermore, the proposed method can be applied to different datasets using automatic threshold selection methods as well.
Remote Sens. 2020, 12, x FOR PEER REVIEW 14 of 20 applying the Otsu method instead of T in experiments 1 and 2 was 0.6742 and 0.6884, respectively. The F1-scores of experiments 1 and 2 generated using T are 0.6759 and 0.6905, respectively. The thresholds generated by the Ostu method for the three PBCD techniques using both subsets are between 0.3 and 0.45. Hence, we propose a range from 0.3 to 0.45 because a very small threshold will detect unnecessary areas such as changes in trees or soil and result in many false alarms, while a high threshold will result in a large number of missed detections. Furthermore, the proposed method can be applied to different datasets using automatic threshold selection methods as well. Moreover, the effectiveness of the proposed method can be analyzed by comparing the F1-score and kappa of the proposed method with other CD results. The proposed method can effectively detect the recently constructed high-rise buildings with complete shapes and can minimize the detection of roads or soil having the same spectral characteristics with buildings. Moreover, MBI effectively filtered out building shadows and improved the accuracy of the CD results.
To prove the effectiveness of the proposed method, we conducted experiments using the four bands (red, green, blue, and near infrared) of original images (i.e., images T1 and T2) without considering MBI. The OBCD results for both subsets were generated by CVA, IRMAD, and PCA applied to the original images using the D-S theory method [31] and major voting techniques. The OBCD results are presented in Figure 10. The F1-scores of the first subset using the major voting and D-S theory were 0.2971 and 0.3299, respectively, whereas the F1-scores of the second subset using the major voting and D-S theory were 0.3248 and 0.3647, respectively. The reason behind the low F1score is that both methods detect shadows as the changed objects as well as changes in trees and soil that are unrelated to buildings. Tables 3 and 4 show the accuracy evaluation performance of the first and second subsets, respectively. Moreover, the effectiveness of the proposed method can be analyzed by comparing the F1-score and kappa of the proposed method with other CD results. The proposed method can effectively detect the recently constructed high-rise buildings with complete shapes and can minimize the detection of roads or soil having the same spectral characteristics with buildings. Moreover, MBI effectively filtered out building shadows and improved the accuracy of the CD results.
To prove the effectiveness of the proposed method, we conducted experiments using the four bands (red, green, blue, and near infrared) of original images (i.e., images T1 and T2) without considering MBI. The OBCD results for both subsets were generated by CVA, IRMAD, and PCA applied to the original images using the D-S theory method [31] and major voting techniques. The OBCD results are presented in Figure 10. The F1-scores of the first subset using the major voting and D-S theory were 0.2971 and 0.3299, respectively, whereas the F1-scores of the second subset using the major voting and D-S theory were 0.3248 and 0.3647, respectively. The reason behind the low F1-score is that both methods detect shadows as the changed objects as well as changes in trees and soil that are unrelated to buildings. Tables 3 and 4 show the accuracy evaluation performance of the first and second subsets, respectively.    Furthermore, the FAR is minimized in the CD result along with the acceptable MR. However, in other CD results, if the FAR is lower than the MR is higher and vice versa. The reason for the missed detection in the proposed method is the small buildings having the same spectral characteristics as Furthermore, the FAR is minimized in the CD result along with the acceptable MR. However, in other CD results, if the FAR is lower than the MR is higher and vice versa. The reason for the missed detection in the proposed method is the small buildings having the same spectral characteristics as the background of the high-rise buildings in both subsets and being filtered out by MBI as background but being included in the reference map ( Figure 11). Moreover, the buildings in both images (i.e., image T1 and image T2) had different spectral characteristics as well as different shooting angles. The buildings in the first subset have different shooting angles. Moreover, the buildings in the second subset (image T1) had almost the same spectral characteristics with the background in that area and were recognized as background by the MBI process. Therefore, they were detected as newly built-up buildings by the CD process but were not included in the CD reference map. As a result, FAR is improved in the CD results (Figure 12), and the accuracy of the CD result is reduced.
background but being included in the reference map ( Figure 11). Moreover, the buildings in both images (i.e., image T1 and image T2) had different spectral characteristics as well as different shooting angles. The buildings in the first subset have different shooting angles. Moreover, the buildings in the second subset (image T1) had almost the same spectral characteristics with the background in that area and were recognized as background by the MBI process. Therefore, they were detected as newly built-up buildings by the CD process but were not included in the CD reference map. As a result, FAR is improved in the CD results (Figure 12), and the accuracy of the CD result is reduced.  images (i.e., image T1 and image T2) had different spectral characteristics as well as different shooting angles. The buildings in the first subset have different shooting angles. Moreover, the buildings in the second subset (image T1) had almost the same spectral characteristics with the background in that area and were recognized as background by the MBI process. Therefore, they were detected as newly built-up buildings by the CD process but were not included in the CD reference map. As a result, FAR is improved in the CD results (Figure 12), and the accuracy of the CD result is reduced.

Conclusions
We proposed a method for object-based building CD by using D-S theory to fuse multiple PBCD results with the segmented image. MBI feature images were generated from the multitemporal VHR imagery. Then, the three PBCD results were generated from MBI feature images. In the D-S theory, certainty weight is automatically calculated and assigned while calculating change, nochange, and uncertainty. The proposed method can detect new buildings and, with the use of MBI, the CD results were improved by eliminating shadows effects or other similar objects such as roads. The impacts of false and missed detections of changes irrespective to buildings can be effectively minimized using MBI. Moreover, the proposed method can achieve reliable object-based CD results, irrespective of the scale parameter in the segmented image. For developing the proposed method, two subsets from VHR multitemporal images were used. The comparison of the proposed method with existing PBCD and OBCD methods proved the superiority of the proposed method over existing CD methods by achieving the highest F1-score and kappa value.
The proposed method can detect changes related to newly built-up regions as well as partially and totally demolished buildings in the suburban and urban areas using the VHR imagery. Therefore, the proposed method can contribute to the United Nations Sustainable Development Goal 11 by detecting the changes related to modified buildings as well as new buildings in urban areas, which can help plan the development of sustainable cities in the future. However, proper parameter selection is vital depending on the sizes and spectral characteristics of the buildings in the study areas. Furthermore, if the proposed method is applied to datasets with different acquisition angles, the CD result will contain falsely detected changed buildings.
In our future work, we will consider avoiding the shortcomings which affect the CD result of the VHR multitemporal dataset, such as the missed detection of buildings because of their having the same spectral characteristics as the background or falsely detecting buildings as changed buildings because of the different acquisition angles. Furthermore, we will apply the proposed method to datasets related to building changes acquired by different satellite sensors.