Building Change Detection Using a Shape Context Similarity Model for LiDAR Data

: In this paper, a novel building change detection approach is proposed using statistical region merging (SRM) and a shape context similarity model for Light Detection and Ranging (LiDAR) data. First, digital surface models (DSMs) are generated from LiDAR acquired at two di ﬀ erent epochs, and the di ﬀ erence data D-DSM is created by di ﬀ erence processing. Second, to reduce the noise and registration error of the pixel-based method, the SRM algorithm is applied to segment the D-DSM, and multi-scale segmentation results are obtained under di ﬀ erent scale values. Then, the shape context similarity model is used to calculate the shape similarity between the segmented objects and the buildings. Finally, the reﬁned building change map is produced by the k-means clustering method based on shape context similarity and area-to-length ratio. The experimental results indicated that the proposed method could e ﬀ ectively improve the accuracy of building change detection compared with some popular change detection methods.


Introduction
Building change detection from remote sensing data is an important technology in land cover change, disaster assessment, and city monitoring [1][2][3]. Buildings constitute the main component of a city, and the building extraction and building change detection remain challenging tasks due to the complexity of urban scenes [4]. Since traditional manual visual interpretation and vectorization methods are time-consuming and expensive, automatic and robust building change detection methods have become a research hotspot in the field of photogrammetry and remote sensing [5].
Over the past few years, a series of automatic building change detection methods have been developed from remote sensing data. The most commonly used data sources for building change detection are images and LiDAR data. In the image-based method, the spectral, contextual, geometrical features of images, and the morphological building index (MBI) are used to extract buildings [4]. However, the state-of-the-art methods based on images still face the following major challenges: (1) occlusion and shadows between buildings; (2) being easily affected by factors such as seasons and registration accuracy; (3) difficulty to distinguish buildings from other man-made constructions, such as roads; (4) lack of height information, which limits the development of three-dimensional building change detection [6].
With the increasing acquisition of LiDAR data, many building change detection studies have been conducted based on the LiDAR point cloud [7]. The LiDAR-based methods include a DSM-based, point The study site was located in Lianyungang City, Jiangsu Province, China. The center of the study area was located near 118°56′20"E and 34°43′08"N. The experimental data sets were two DSMs generated from the LiDAR data captured in 2017 and 2018, with 1,122,573 and 1,062,337 points, respectively. The LiDAR point data were saved in LAS format 1.2 and the LiDAR point density was between 2-3 points/m 2 (ppm). The LiDAR data were interpolated into 0.5m resolution DSM using Terrascan TM . The DSMs (2126 × 1426) are shown in Figures 1a and b, and the difference data D-DSM generated by difference processing is shown in Figure 1c. The ground truth created through the visual analysis is shown in Figure 1d. The land cover of this study area includes buildings (mostly rectangular), roads, bare soil, and vegetation (grass and trees).

Methodology
The schedule of the proposed building change detection method is summarized in Figure 2. The approach consists of three main steps: SRM segmentation, shape similarity calculation, and building change detection. The details of each step are described in the following sections.

Methodology
The schedule of the proposed building change detection method is summarized in Figure 2. The approach consists of three main steps: SRM segmentation, shape similarity calculation, and building change detection. The details of each step are described in the following sections.

Segmentation of D-DSM Using SRM
Difference processing is performed on the DSM data at two time instances to generate the difference data D-DSM. Image segmentation is then implemented [25]. First, SRM is conducted on D-DSMs to generate objects, and then the object-oriented building change detection is carried out. In SRM, the observed image X, contains × pixels with L bands, and each pixel value belongs to the set { } 0,1, 2, , g  (in practice, we would have Assuming that X* is the optimal segmentation result, each band value of the pixel in observed image X is obtained by Q-level resampling of each pixel in X*. The pixel value of X* ranges from 0 to g/Q. Q is an independent random variable that tunes the segmentation scale. More objects are segmented as the Q value increases. SRM mainly achieves image segmentation by iterating between the merging predicate and merging order. The merging predicate ( ) , P R R′ can be defined as: where R and R′ denote regions in image X, a R and a R′ denote the average values of band a in R and R′ , respectively. ( ) , b R R′ is formulated as:

Segmentation of D-DSM Using SRM
Difference processing is performed on the DSM data at two time instances to generate the difference data D-DSM. Image segmentation is then implemented [25]. First, SRM is conducted on D-DSMs to generate objects, and then the object-oriented building change detection is carried out. In SRM, the observed image X, contains m × n pixels with L bands, and each pixel value belongs to the set 0, 1, 2, · · · , g (in practice, we would have g = 255). Assuming that X* is the optimal segmentation result, each band value of the pixel in observed image X is obtained by Q-level resampling of each pixel in X*. The pixel value of X* ranges from 0 to g/Q. Q is an independent random variable that tunes the segmentation scale. More objects are segmented as the Q value increases.
SRM mainly achieves image segmentation by iterating between the merging predicate and merging order. The merging predicate P(R, R ) can be defined as: where R and R denote regions in image X, R a and R a denote the average values of band a in R and R , respectively. b(R, R ) is formulated as: where δ is a constant, |R| and |R | denote the pixel numbers of |R| and |R |, respectively. We set δ = 1/ 6|I| 2 , and |I| denotes the pixel number of image X.
If P(R, R ) = true, R and R are merged. For the merging order, the function f can be used to sort the pixels in X: where p and p are the pixels in X, p a and p a are the gray values of pixels p and p in band a respectively. Image segmentation can be achieved by simple iteration. The average values of segmented objects are calculated based on D-DSM and a threshold is set for initial change detection. In this experiment, the threshold value was set to 41 (the mean value plus the standard deviation).

Shape Similarity Calculation Using Shape Context Model
Building shapes are usually regular and mostly rectangular. Therefore, the shape context proposed by Ling and Jacobs [26] is introduced to calculate the shape similarity between the changed objects and buildings. It is worth noting that only buildings of rectangular shapes are used to calculate the similarity for changed objects in this test site. If the changed buildings have other shapes, more prior building shapes should be used to generate more accurate detection results. The specific steps of shape context similarity are as follows: (1) Remove isolated points in changed objects generated by SRM segmentation, and generate n sample points in the boundary of objects to represent the shape of the changed objects.
(2) Establish the shape context for all sample points. As shown in Figure 3a, the shortest path between the edge point p and another edge point q is defined as the inner distance, which is denoted by d(p, q); the smaller angle between the tangent at p and the inner distance d(p, q) is defined as the inner-angle, which is denoted by θ(p, q). Consider d(p, q) and θ(p, q) as a vector, and the point p can generate n − 1 vectors for the remaining n − 1 points. According to the maximum distance and angle of the vector at point p, the inner distance and the inner angle can be divided into n d and n θ intervals, respectively. For shape point p, the histogram h generated by its n − 1 vectors can be calculated as follows: where k ∈ { 1, 2, · · · , K} , K = n d n θ , and bin(k) denotes the number of vectors that fall within the interval. Figure 3b shows the shape of an extracted building, and Figure 3c,d show the shape context histogram of the sample point Z 1 and sample point Z 2 , respectively. Finally, n histograms generated by n points are used to calculate the similarity of the two shapes.
(3) The similarity of the two points can be obtained by calculating the minimum cost of matching the sample point p i and the sample point q j . The shape context distribution of the sample points is shown in Figure 3c,d. Let C i j denote the cost of matching these two points. The χ 2 test statistic can be used to calculate the match cost [27]: where h i (k) and h j (k) denote the histograms at p i and q j , respectively, and K is the number of histogram bins. The permutation π should minimize the total match cost H(π) based on n sample points: The dynamic programming algorithm is used to solve the matching problem, and the match cost H(π) is used to describe the shape similarity between the shapes of changed objects and the shapes of buildings. For details, please refer to Ling and Jacobs [26]. In general, the larger the number of the shape sample points, the higher the accuracy of the matching results. In this paper, we set n = 50, n d = 5, n θ = 12.
, and bin( ) k denotes the number of vectors that fall within the interval. Figure where ( ) i h k and ( ) j h k denote the histograms at i p and j q , respectively, and K is the number of histogram bins. The permutation π should minimize the total match cost ( ) H π based on n sample points: The dynamic programming algorithm is used to solve the matching problem, and the match cost ( ) H π is used to describe the shape similarity between the shapes of changed objects and the shapes of buildings. For details, please refer to Ling and Jacobs [26]. In general, the larger the number of the shape sample points, the higher the accuracy of the matching results. In this paper, we set 50, 5, 12 d n n n θ = = = .

Building Change Detection Using the K-Means Algorithm
When performing building change detection based on the shape similarity results, the shape of the false detection caused by the miscoregistration of high-rise buildings is very similar to the changed buildings. It is usually distributed around the edge of buildings, which are characterized by a small area and a slender shape, while the changed buildings usually have large areas. Thus, the area-to-length ratio is calculated. The area-to-length ratios are relatively small for registration errors, while the area-to-length ratios of the changed buildings are relatively large.
The k-means algorithm is used to cluster the shape similarity and area-to-length ratio of the changed objects in the initial change detection results. The changed buildings have high shape similarity with building shape and relatively large values of area-to-length ratio, and other changed

Building Change Detection Using the K-Means Algorithm
When performing building change detection based on the shape similarity results, the shape of the false detection caused by the miscoregistration of high-rise buildings is very similar to the changed buildings. It is usually distributed around the edge of buildings, which are characterized by a small area and a slender shape, while the changed buildings usually have large areas. Thus, the area-to-length ratio is calculated. The area-to-length ratios are relatively small for registration errors, while the area-to-length ratios of the changed buildings are relatively large.
The k-means algorithm is used to cluster the shape similarity and area-to-length ratio of the changed objects in the initial change detection results. The changed buildings have high shape similarity with building shape and relatively large values of area-to-length ratio, and other changed features have low shape similarity with building shape and relatively small values of area-to-length ratio. The segmented objects are clustered into two types: changed buildings and other changes.
To quantitatively evaluate the performance of the proposed approach for building change detection, three indexes were used to assess the results: (1) Missed detections (MD): the number of unchanged pixels in the change detection map incorrectly classified when compared to the ground reference map. The missed detection rate P m is calculated by the ratio P m = MD/N 0 × 100%, where N 0 is the total number of changed pixels counted in the ground reference map. (2) False alarms (FA): the number of changed pixels in the change detection map incorrectly classified when compared to the ground reference. The false detection rate P f is calculated by the ratio P f = FA/N 1 × 100%, where N 1 is the total number of unchanged pixels counted in the ground reference map. (3) Total errors (TE): the total number of detection errors including both miss and false detections, which is the sum of the FA and the MD. The total error rate P t is described by the ratio

Results and Discussion
The experiments were conducted on a PC with 3.6 GHz clock speed and 16.00 GB RAM. MATLAB 2019 was used to program the proposed method.
In the SRM algorithm, the parameter Q controls the scale of the image segmentation. When the value of Q is larger, there will be more detailed patches in the segmentation results. When the value of Q is smaller, only the objects with a large area can be segmented. We set Q ∈ {16, 32, 64, 128, 256} to analyze the effect of the segmentation scale on the accuracy of the change detection result. Figure 4 shows the segmentation results generated by different Q values. As shown in Figure 4a, when Q = 16, the objects with a large area and a large change value were segmented into independent objects. With the increase of Q, such as Q = 256, more detailed objects were segmented into independent objects. When Q was too small, only the basic objects in the difference DSM data could be segmented into complete parts, and the generated segmentation result was too coarse, which resulted in more missed detection errors. However, it is worth noting that the complete, major and substantially changed objects could still be segmented into independent objects. When Q is too large, there will be more detailed patches in the segmentation result and more false detection errors. For example, for high-rise buildings, the changes caused by registration errors can be detected, which reduces the accuracy of the building change detection results. Therefore, determining the appropriate value of the segmentation parameter Q is essential for generating high-precision change detection results.
On this basis, the shape context was used to calculate the similarity between the building shape and the segmented object. As shown in Figure 5a, when the shape difference between the segmented object and the building was large, the calculated shape context histogram of the corresponding feature points varied greatly. When the shape of the segmented object was very similar to the building, the shape context histogram of the corresponding feature points after calculation was relatively similar, as shown in Figure 5b. Therefore, the more similar the segmented object is to the building shape, the higher the calculated similarity and the smaller the final matching loss will be.
The shape similarity between the segmented object and the building calculated by the shape context is shown in Figure 6. When the shape similarity between the segmented object and the building is greater, it appears brighter in the image, and vice versa. It can be seen from Figure 6 that for segmented objects at all scales, the shape similarity between the major changed buildings and the reference building was high, and the calculated similarity of other changed classes was low. Therefore, when the segmentation scale Q = 32, the shape similarity results could already describe the building changes well. As the segmentation scale increased, more detailed changes were segmented into independent objects, as shown in Figure 6b-e. Especially for high-rise buildings, the changes caused by registration errors were segmented into independent objects, and their shapes were similar to the building shape, as shown in Figure 6 d and e. Thus, it is difficult to automatically extract the changed buildings directly based on shape similarity, especially for the area with high-rise building changes.
Then, the area-to-length ratios were calculated. The area-to-length ratios of the registration errors were usually less than 3, and the area to length ratios of small buildings were usually greater than 6. Therefore, the shape similarity and the ratio of area to length were jointly used to detect building changes.
The k-means algorithm was used to cluster the shape similarity and area-to-length ratio of the segmented objects, and the segmented objects were clustered into two types: changed buildings and other changes. The building change detection results are shown in Figure 7. It can be seen from the figure that when Q = 16, only the buildings with a large area or large changes could be detected. The main reason is that the Q value was too small and the generated segmentation map was rough, resulting in the small buildings not being detected. As the Q value increased, more changed buildings with smaller areas could be detected because the larger Q value generated a finer segmentation map. When Q = 32 and 64, the obtained building change detection result was very similar to the ground reference data, and it could effectively remove the false detected buildings caused by registration.
When Q = 256, due to the overly fine segmentation results, there were many false detected changes in the detection results.
With the increase of Q, such as Q = 256, more detailed objects were segmented into independent objects. When Q was too small, only the basic objects in the difference DSM data could be segmented into complete parts, and the generated segmentation result was too coarse, which resulted in more missed detection errors. However, it is worth noting that the complete, major and substantially changed objects could still be segmented into independent objects. When Q is too large, there will be more detailed patches in the segmentation result and more false detection errors. For example, for high-rise buildings, the changes caused by registration errors can be detected, which reduces the accuracy of the building change detection results. Therefore, determining the appropriate value of the segmentation parameter Q is essential for generating high-precision change detection results.  On this basis, the shape context was used to calculate the similarity between the building shape and the segmented object. As shown in Figure 5a, when the shape difference between the segmented object and the building was large, the calculated shape context histogram of the corresponding feature points varied greatly. When the shape of the segmented object was very similar to the building, the shape context histogram of the corresponding feature points after calculation was relatively similar, as shown in Figure 5b. Therefore, the more similar the segmented object is to the building shape, the higher the calculated similarity and the smaller the final matching loss will be. object and the building was large, the calculated shape context histogram of the corresponding feature points varied greatly. When the shape of the segmented object was very similar to the building, the shape context histogram of the corresponding feature points after calculation was relatively similar, as shown in Figure 5b. Therefore, the more similar the segmented object is to the building shape, the higher the calculated similarity and the smaller the final matching loss will be. The shape similarity between the segmented object and the building calculated by the shape context is shown in Figure 6. When the shape similarity between the segmented object and the building is greater, it appears brighter in the image, and vice versa. It can be seen from Figure 6 that for segmented objects at all scales, the shape similarity between the major changed buildings and the Figure 5. Shape similarity calculated using shape context: (a) low shape similarity with the building, and (b) high shape similarity with the building. the building changes well. As the segmentation scale increased, more detailed changes were segmented into independent objects, as shown in Figure 6b-e. Especially for high-rise buildings, the changes caused by registration errors were segmented into independent objects, and their shapes were similar to the building shape, as shown in Figure 6 d and e. Thus, it is difficult to automatically extract the changed buildings directly based on shape similarity, especially for the area with highrise building changes. Then, the area-to-length ratios were calculated. The area-to-length ratios of the registration errors were usually less than 3, and the area to length ratios of small buildings were usually greater than 6. Therefore, the shape similarity and the ratio of area to length were jointly used to detect building changes.
The k-means algorithm was used to cluster the shape similarity and area-to-length ratio of the segmented objects, and the segmented objects were clustered into two types: changed buildings and other changes. The building change detection results are shown in Figure 7. It can be seen from the figure that when Q = 16, only the buildings with a large area or large changes could be detected. The main reason is that the Q value was too small and the generated segmentation map was rough, resulting in the small buildings not being detected. As the Q value increased, more changed buildings with smaller areas could be detected because the larger Q value generated a finer segmentation map. When Q = 32 and 64, the obtained building change detection result was very similar to the ground reference data, and it could effectively remove the false detected buildings caused by registration. When Q = 256, due to the overly fine segmentation results, there were many false detected changes in the detection results. Then, the area-to-length ratios were calculated. The area-to-length ratios of the registration errors were usually less than 3, and the area to length ratios of small buildings were usually greater than 6. Therefore, the shape similarity and the ratio of area to length were jointly used to detect building changes.
The k-means algorithm was used to cluster the shape similarity and area-to-length ratio of the segmented objects, and the segmented objects were clustered into two types: changed buildings and other changes. The building change detection results are shown in Figure 7. It can be seen from the figure that when Q = 16, only the buildings with a large area or large changes could be detected. The main reason is that the Q value was too small and the generated segmentation map was rough, resulting in the small buildings not being detected. As the Q value increased, more changed buildings with smaller areas could be detected because the larger Q value generated a finer segmentation map. When Q = 32 and 64, the obtained building change detection result was very similar to the ground reference data, and it could effectively remove the false detected buildings caused by registration. When Q = 256, due to the overly fine segmentation results, there were many false detected changes in the detection results.  Table 1 shows the accuracy of the building change detection results under different Q values. It can be seen from the table that the accuracy of change detection results increased with the increase of the Q value at the beginning. When Q = 32 and 64, the generated building change detection results had a similar total error rate. After that, as the Q value increased, the total error rate also increased. When Q = 64, the kappa coefficient was the highest. Overall, the accuracy of the building change detection results generated at Q = 32 and 64 was generally consistent and more similar to the ground reference data.  [28], multi-scale level set (MLS) [29], MLS+MRF (Markov Random Filed), and fuzzy local information c-means (FLICM) [30], and a building change detection method based on CNN (Convolutional Neural Networks) classification. The results are shown in Figure 8. All the values of the parameter μ, which tunes the length of the active contour, in CV, MLS, and MLS+MRF were set to 0.1, and the value of parameter in MRF, which tunes the balance between spectrum and spatial information was set to 0.5. Figure 8a-d presents the change maps generated by CV, MLS, MLS+MRF, and FLICM, respectively; all of these maps have a large amount of "salt-and-pepper" noise. Figure 8e presents the building change map  Table 1 shows the accuracy of the building change detection results under different Q values. It can be seen from the table that the accuracy of change detection results increased with the increase of the Q value at the beginning. When Q = 32 and 64, the generated building change detection results had a similar total error rate. After that, as the Q value increased, the total error rate also increased. When Q = 64, the kappa coefficient was the highest. Overall, the accuracy of the building change detection results generated at Q = 32 and 64 was generally consistent and more similar to the ground reference data. To prove the effectiveness of the proposed method, it was compared with some existing change detection (CD) methods, such as active contour model (CV) [28], multi-scale level set (MLS) [29], MLS+MRF (Markov Random Filed), and fuzzy local information c-means (FLICM) [30], and a building change detection method based on CNN (Convolutional Neural Networks) classification. The results are shown in Figure 8. All the values of the parameter µ, which tunes the length of the active contour, in CV, MLS, and MLS+MRF were set to 0.1, and the value of parameter α in MRF, which tunes the balance between spectrum and spatial information was set to 0.5. Figure 8a-d presents the change maps generated by CV, MLS, MLS+MRF, and FLICM, respectively; all of these maps have a large amount of "salt-and-pepper" noise. Figure 8e presents the building change map generated by the CNN classification method, which can effectively suppress noise but have some false alarms caused by the registration errors of high-rise buildings. Figure 8f presents the building change map produced by the proposed method, which is the closest to the ground reference map. The reason is that segmented objects were used instead of individual pixels, and the shape similarity and area-to-length ratio can remove false detection effectively.
ISPRS Int. J. Geo-Inf. 2020, 9,678 12 of 15 generated by the CNN classification method, which can effectively suppress noise but have some false alarms caused by the registration errors of high-rise buildings. Figure 8f presents the building change map produced by the proposed method, which is the closest to the ground reference map. The reason is that segmented objects were used instead of individual pixels, and the shape similarity and area-to-length ratio can remove false detection effectively. Five indices, including missed detection, false alarms, total errors, kappa coefficient, and calculation time were used to quantitatively assess the effectiveness of the proposed method. The quantitative experimental results are shown in Table 2. The missed detection rate P m , false detection rate P f , total error rate P t , kappa coefficient, and calculation time of the proposed method were 16.05%, 0.71%, 1.14%, 0.8, and 29 seconds, respectively. Unlike the other methods used in this study, the proposed method produced the least false alarms and total errors and missed detection levels and calculation time were satisfactory compared with the other methods. Additionally, the proposed method produced the highest kappa coefficient compared with the other methods, which indicates that the building change map has better consistency with the ground reference map. The CNN classification method can achieve satisfactory detection accuracy, but the CNN model training takes a long time.

Conclusions
A method for building change detection based on a shape context similarity model was proposed in this paper. First, the difference data D-DSM was generated by the difference processing from the DSM data at two time instances. The difference data D-DSM is then segmented using the SRM algorithm with different scales, and an empirical threshold was set to produce an initial change detection. The shape similarity between the changed objects of the initial change detection result and the buildings was calculated using a shape context similarity model, and the area-to-length ratio for the changed objects was also generated to remove false alarms caused by misregistration. Finally, the k-means clustering method was implemented based on the similarity index and area-to-length ratio to produce a building change map. In the process of segmentation by the SRM algorithm, the change detection results were affected by the scale control parameter Q within a certain range. When the value of Q was 32 or 64, the accuracy of the generated building change detection was almost the same. The SRM segmentation can reduce the "salt-and-pepper" noise caused by the pixel-based method.
The experimental results showed that the proposed method can effectively detect building changes based on DSM data compared with some other popular CD methods. The similarity between the segmented objects and the building calculated by the shape context can effectively extract building changes, and the area-to-length ratio can further remove the false detection changes caused by the registration errors of high-rise buildings and improve the accuracy of building change detection.
However, only rectangular buildings were used to calculate the similarity for changed objects, which may result in missed detections for changed buildings of other shapes. More prior building shapes will be used in future work to generate more accurate detection results.
Author Contributions: Xuzhe Lyu prepared the data and wrote the paper; Ming Hao involved in experiment designing and analyzed the data; Wenzhong Shi reviewed the manuscript and provided comments. All authors have read and agreed to the published version of the manuscript.

Conflicts of Interest:
The authors declare no conflict of interest.