Unsupervised Multitemporal Building Change Detection Framework Based on Cosegmentation Using Time-Series SAR

: Building change detection using remote sensing images is essential for various applications such as urban management and marketing planning. However, most change detection approaches can only detect the intensity or type of change. The aim of this study is to dig for more change information from time-series synthetic aperture radar (SAR) images, such as the change frequency and the change moments. This paper proposes a novel multitemporal building change detection framework that can generate change frequency map (CFM) and change moment maps (CMMs) from multitemporal SAR images. We ﬁrst give deﬁnitions of CFM and CMMs. Then we generate change feature using four proposed generators. After that, a new cosegmentation method combining raw images and change feature is proposed to divide time-series images into changed and unchanged areas separately. Secondly, the proposed cosegmentation and the morphological building index (MBI) are combined to extract changed building objects. Then, the logical conjunction between the cosegmentation results and the binarized MBI is performed to recognize every moment of change. In the post-processing step, we use fragment removal to increase accuracy. Finally, we propose a novel accuracy assessment index for CFM. We call this index average change difference (ACD). Compared to the traditional multitemporal change detection methods, our method outperforms other approaches in terms of both qualitative results and quantitative indices of ACD using two TerraSAR-X datasets. The experiments show that the proposed method is effective in generating CFM and CMMs.


Introduction
Change detection is a process of automatically analyzing and identifying the variation of Earth's surface objects based on multitemporal remote sensing images acquired in the same region at different times [1,2]. As a significant application of remote sensing image, change detection analysis provides an effective technological significance for land use and land cover monitoring [3,4], urban planning and management [5,6], natural disaster assessment and monitoring [7][8][9], etc.
Optical remote sensing systems require good sunlight and weather conditions to acquire high-quality optical images. In contrast, synthetic aperture radar (SAR) can acquire data all day without relying on a light source, which is suitable for some emergencies such as natural disaster survey.
According to the number of images, change detection can be divided into binary-date change detection and multi-temporal change detection. Meanwhile, change detection methods can be divided into deep learning (DL) methods and traditional methods depending on whether deep neural networks (DNN) are utilized. Recently, deep learning (DL) methods have made great breakthroughs in binary-date change detection. However, in multi-temporal change detection, DL methods are infrequent due to the large difficulty of acquiring high-quality labels. For example, Su et al. [10] and Yuan et al. [2] used traditional methods to classify change behavior into four types: step change, impulse change, cycle change, and complex change. These labeling tasks are difficult for a human. In this paper, we mainly focus on using traditional computer vision techniques to dig up more information from time-series SAR images.
Beyond change analysis between two dates, multi-temporal change analysis (more than two dates) mainly focuses on the long-term change information [10]. In general, most existing methods for change detection in multitemporal SAR images can be grouped into the following two types according to the different ways of using time-series images: (i) simultaneous comparison; (ii) pairwise traversal comparison. These two approaches extend binary-date change detection analysis from different points of view. The second approach compares each pair of adjacent images, and the first approach, which we mainly focus on, compares pixels at all times in the same position. Yuan et al. [2] applied the density-based temporal clustering method to extract change classification. Su et al. [10] presented a likelihood ratio test based method of change detection and classification for SAR time series. Le et al. [11] used Kullback-Leibler divergence as a similarity measure to generate a change criterion matrix (CCM). Likelihood ratio tests and information-theoretic measures approach also play an important role in change detection. Conradsen et al. [12] presented the likelihood ratio test statistic for the homogeneity of several complex variancecovariance matrices that may be used in order to assess whether at least one change has taken place in a time series of SAR data. Nascimento et al. [13] proposed a comparison between a classical change detection method based on the likelihood ratio and three statistical methods that depend on information-theoretic measures: the Kullback-Leibler (KL) distance and two entropies. Mian et al. [14] proposed new robust statistics for time series SAR change detection.
Furthermore, deformation monitoring, which is a subfield of the change detection domain, has also made great progress in recent years. Cavalagli et al. [15] presented an overview of the results of diagnostic and monitoring activities carried out in the last years through satellite radar interferometry and in situ measurements in the historical city of Gubbio, Italy. Soldato et al. [16] conducted analysis of remote-sensing SAR data and landslide-induced damage coupling interferometric synthetic aperture radar (InSAR) and field survey data.
The building is an ordinary man-made object and an essential factor of urbanization. Some scholars have developed traditional methods of building change detection. Xiao et al. [17] used optical images to conduct binary-date change detection based on cosegmentation. Huang et al. [18] proposed a morphological building index (MBI) to recognize building objects. Saha et al. [19] performed building change detection in veryhigh-resolution (VHR) SAR images using a cycle-consistent generative adversarial network and deep change vector analysis (DCVA) and fuzzy rules.
However, most multitemporal change detection approaches can only detect the change dynamics [11] or change classification [2,10,19]. Change classification has been explained above. Change dynamics in their paper is the probability of "changed" responses. The typologies of changes in our study include constructions, demolition and major displacements of buildings. Concretely, the constructions of buildings include analysis of the evolution of urbanization, monitoring for illegal construction, etc. The demolition and major displacements of buildings mainly involve urban redevelopments. In this paper, we propose a multitemporal change detection method that can acquire change frequency map and change moment maps. This proposed method is based on two hypotheses. (i) We assume that all the images have sufficient registration accuracy. (less than 0.5 pixel) (ii) The view angles of the sensors are almost the same. In other words, we do not consider changes caused by different view angles of the sensors. The details of our approach are as follows. Firstly, the change feature is generated using the proposed index. Meanwhile, MBI is calculated and binarized for each image. Secondly, the cosegmentation [17] is presented on each image and generate segmentation results for each date. Finally, CFM and CMMs are derived using cosegmentation results and binarized MBI.
For ease of expression, we first give explanations of CFM and CMMs.

Change Frequency Map (CFM)
An r × c matrix is said to be CFM if CFM ∈ {A ∈ Z r×c |a ij ≥ 0} and the change frequency CF in CFM illustrate the number of changes at the corresponding pixel. Where Z is the set of integers and r × c is the size of input image. Let N be the number of input images, and it is obvious that 0 ≤ CF ≤ N − 1.

Change Moment Maps (CMMs)
We first define the maximum number of changes: CMMs contain ∑ K k=1 k = K(K + 1)/2 maps. The element of a change moment map shows the moment of change. For example, if some objects whose CF equals three, then we have three change moment maps exhibiting the first, the second, and the third moment of change, respectively.
The rest of this paper is organized as follows: in Section 2, the principle and algorithm of the proposed method are explained in detail. Section 3 exhibits the experimental results and discussion using realistic TerraSAR-X time-series images to demonstrate the effectiveness of the proposed method. Finally, conclusions are drawn in Section 5.

Materials and Methods
In this section, we will introduce our novel building change detection framework in detail. The input of our algorithm is N registered SAR images. The basic idea of our method is to combine the changed area and the building area. In order to acquire the changed area, we proposed the change feature generator. However, the change feature generator is a pixellevel detector, which does not take into account neighborhood information. Accordingly, we use cosegmentation method proposed in [17] to make use of both change feature and neighborhood information. The cosegmentation method was firstly applied in binary-date change detection based on optical images. We modify the cosegmentation method and apply it in time-series SAR change detection. In the meanwhile, we extract building area using the MBI method. We finally combine the changed area and building area using a simple AND (logical conjunction) operation.
The workflow of our method is depicted in Figure 1.

Change Feature Generator
The change feature generator is used to generate difference map at pixel level. Change feature I C represents the severity of the change at pixel level. Let x = [x 1 , x 2 , ..., x N ] represent one pixel values of N times. Therefore, the generator is to find a suitable index to measure the variation of x. In this paper, we give four indices, range R, variance σ 2 , omnibus test Q [12] and max ratio L.  The range of x is the difference between the largest and smallest values: For binary-date change detection, i.e., N = 2, R = |x 2 − x 1 |, which is equivalent to the classical difference detector.
The sample variance is a classic unbiased estimator to measure the variation of a set of values. The sample standard variance is defined as follows: Q is derived from the omnibus test for equality of several complex covariance matrices [12]. In this paper, our data is single-band data and the number of looks equals one. In this case, Q is defined using the following equation: If all the elements of x are equal, then Q = 1. We use I C3 = 1 − Q as change feature generator in this paper.
Similar to R, L is a simple extension of the classic ratio detector: Let T be the threshold of the change feature, and it can be calculated by expectation maximization(EM) algorithm [20].
As we can see from Equation (1) to Equation (4), I C reaches a minimum of zero if there is absolutely no change, then the normalization of I C equals to αI C (α is a constant). Therefore, according to Equation (6), it does not matter whether I C is normalized because if T is the threshold of I C , αT must be the threshold of αI C .

Cosegmentation
The workflow of cosegmentation strategy is depicted in Figure 2. The function of cosegmentation method is to segment an image into two parts using change feature based on graph theory. This segmentation result depends on change information and neighborhood information. Cosegmentation contains three following steps.

Graph Establishment
Graph establishment depends on the number of pixels of multitemporal images. For each image, we build a graph G = (V, E), as shown in Figure 3. The graph contains three different kinds of nodes, which are source s, sink t, and normal nodes P, defined as follows: Node s and t represent background and foreground respectively. Every node in P represents a pixel of each image I i . Every pixel node links with source s and sink t, which is called t-link, and they link their neighborhood pixels simultaneously, which is called n-link. An eight-neighborhood link is used in this experiment.

Edge Weight Setting
The edge weight of the graph depends on a manually set parameter and the pixel value of the change feature and raw image. In this paper, we adopt the weight setting method in [17] according to the edge weight setting principle, as Table 1 shown. Coefficient λ, which balances the relative importance between change feature information and neighborhood information, is set manually. The principle of weight set consists of three items.

1.
Edges that link s and pixels that are more likely to change should be given a lower weight. Similarly, edges linking t and pixels that are more likely to change should be given higher weight. The other pixels do the opposite.

2.
For n-link, those edges linking similar pixels should be given higher weight. Edges linking significantly different pixels should be given a lower weight.

3.
For edges linking t and changed pixels, the weights should be significant enough to permit the flow to pass.

Edge Type Value Condition
In Table 1, I p C denotes the pixel value of I C . The larger I p C , the more likely this pixel is to change. According to principle 1, we define D f I p C and D b I p C : According to principle 2, we define V pq .
where I p and I q represent pixel values (panchromatic) or the spectral feature vectors (multispectral) of pixel p and pixel q. r p and c p represent the row and column number of pixel p. E[·] means the average value of all neighborhood pixel pairs over the whole image. Finally, we can define W according to principle 3:

Graph Cut
Graph cut aims to find a cut of this graph that serves: C is a cut of this graph, and e is an edge that belongs to C. Ω is the set of all cuts. w e is the weight of e. The energy minimization method [21] is adopted in this study to solve the graph cut problem.

Morphological Building Index
In this subsection, we give a review of the MBI method. MBI can extract buildings based on morphological processing, which was first used in optical images and achieved good results [18]. The basic idea of MBI is to build the relationship between the spectralstructural characteristics of buildings and the morphological operators. These spectralstructural characteristics of buildings are represented using the opening by reconstruction with a series of linear structural elements (SE). Let l and d, which are hyper-parameters set by traversing different values, denote length(pixel) and direction(degree) of linear SE, respectively. The d is generally constant and the l depends on the resolution of the image. The higher the resolution, the more pixels the building occupies, and l should be larger.
Buildings generally appear as high-brightness square objects in SAR images. The intuitive idea is to enhance these high-brightness square objects and in SAR images and suppress other parts of the image. Based on this idea, four characteristics are considered in this research: brightness, local contrast, size and directionality.
MBI calculation can be implemented by the following steps: 1.
Calculation of brightness of every pixel x.
White top-hat by reconstruction (WTH): where represents the opening-by-reconstruction [22] of the brightness image, and and l and d are two series that indicate the length and direction of a linear structural element, respectively. The parameters l and d in this research are displayed in Table 2.
We use d i and l j to represent the ith(1 ≤ i ≤ i m ) and jth(1 ≤ j ≤ j m ) number of d and l. As shown in Table 2, i m equals 5 and j m equals 4.

3.
Morphological profiles (MP): Differential morphological profiles (DMP): The DMP of the white top-hat is defined as: Finally, the morphological building index (MBI) is defined as the average of DMP:

Extract Change Information
Cosegmentation can not detect when one object exists or disappears. Assume a building existed on T 1 , demolished on T 2 , and rebuilt at the same place on T 3 . Under this circumstance, the results of the cosegmentation method are almost the same on each time, as shown in Figure 4g-i.
In order to solve this problem, we propose an extraction method, corresponding to C.1 to C.5 in Figure 1.
Precisely, we first calculate the MBI map and binarize the MBI map using OTSU [23], which are depicted in Figure 4d-f. Finally, we make logical conjunction between binMBI and cosegmentation results. The logical conjunction results illustrate when the objects exist or disappear, as shown in Figure 4j-l. With this logical conjunction results, we can easily detect change frequency and change moments for each pixel using a discrete difference strategy. Furthermore, the post-processing is to remove small fragments that are less than 100 m 2 .  for j = 1 : i do 14: Recognizing the moment of jth change for each object that changed i times in total; 15: end for 16: end for 17: return CMM; In step 5, We first adopt a morphological closing operation with a square structural element of 3 × 3 pixels to fill gaps and then adopt a morphological opening operation with the same structural element for smoothing.
In step 8, fragments remove procedure is removing small objects whose area is less than 100 m 2 , which may be caused by noise.
The output of Algorithm 1 consists of a change times map CFM and K(K + 1) 2 change moment maps CMM.

Accuracy Assessment
In binary-date change detection, the image is divided into changed and unchanged areas, which can be seen as a binary classification problem. However, in time-series change detection, the change frequency classification is not equivalent to common multiclassification problems. Assume that the real change frequency f = 5, and the outputs of the two detectors equal 6 and 8 respectively. It is obvious that the first detector is superior, but the assessment index of the common classification problem cannot distinguish the two detectors.
In order to quantitatively verify the performance and effectiveness of our method, we propose a novel assessment index average change difference(ACD) for CFM accuracy assessment. Let Y denote the real CFM andŶ denote the output CFM of the detector. The ACD is defined as follows: Furthermore, we could only consider changed areas, then we define a mask M that denotes real changed areas: More generally, we define mask M k that denotes areas where f ≥ k: Then the mask ACD can be defined as follows: The ACD denotes the expectation of the absolute error of a change detector. It is obvious that the smaller ACD is, the better detector. ACD equals zero if and only ifŶ = Y.

Dataset
In this paper, the study area is located in Tongzhou District of Beijing. Tongzhou has been undergoing rapid changes over the past ten years. Two TerraSAR-X datasets with 0.9 m range resolution and 1.8 m azimuth resolution are used in our experiment. The first dataset, obtained from Jan. 2012 to May 2013, contains eight images with 400 × 500 size, and the second dataset contains 16 images with 501 × 451 size acquired from Jan. 2012 to Mar. 2016. The construction and demolition of buildings in this area shown in data 1 are conducive to validating the effectiveness of different change detection algorithms. In the area shown in data 2, many small houses were gradually built from north to south from 2012 to 2016. This region is a typical example of the process of urbanization, which can be clearly illustrated in CMM 11 in Figure 7.
Raw data should be preprocessed before change detection. We first implement radiometric calibration and co-registration. Then we adopt MSAR-BM3D [24] for denoising. MSAR-BM3D method obtains an excellent despeckling capability in the spatial dimension, which is effective for dealing with multitemporal data. As shown in Figure A1 The sample denoised images and the corresponding optical images are displayed in Figure 5.

Parameter Setting
To our knowledge, our framework is the first simultaneous comparison method to generate CFM and CMMs. However, we can simply modify some other similar methods to make a contrast experiment. The statistical features and temporal clustering (SFTC) [2] and NORCAMA [10] are designed for change classification, which can be modified to generate CFM and CMMs. In the experiments, our method was compared with these two multitemporal change detection methods to demonstrate its superiority. There are two steps requiring manual parameter setting, the MBI calculation step and cosegmentation step. The parameter setting is shown in Table 2. In the cosegmentation step, λ is set by 0.25, which is a great balance between changing feature information versus neighborhood information.

Results
In this section, we present the accuracies for different methods using two datasets. We neglect ACD k for k ≥ 3 because objects whose CF ≥ 3 play a small part (0.94% in data 1 and 5.7% in data 2) and we do not have to consider it in isolation. The ACD of the two datasets is depicted in Figure 8.
The CFM and part of CMMs of the two datasets are illustrated in Figures 6 and 7. For the sake of argument, we define the following symbol: where CMM ij indicates the jth change moment for those objects whose CF = i. For example, CMM 11 means the first moment of change for objects whole change frequency equals 1. (e) CFM using the propoed method.
Range R is applied as the change feature generator To conserve space, we only illustrate CMMs of objects whose CF ≤ 2 because very few objects change more than three times.
Some sample images of the first dataset are shown in Figure 6. In Figure 6b, the objects in area 1 were constructed in T 6 , and were demolished in T 8 so that the CF of these objects equals 2. The CF of other objects in area 2, also equals to 2, but they were constructed in T 7 and demolished in T 8 . These two types of objects were built at different times but dismantled at the same time. Therefore, the CF and CMM 22 of these two types of objects are the same while the CMM 21 is different. The experiment of dataset 1 demonstrates the effectiveness of CMMs for detecting when the objects are built and when they are demolished.
In the first line of Figure 7, we select four representative images from the second dataset. From 2012 to 2016, these small bungalows began to be built gradually from north to south, which is a typical change detection task. The CFM indicates the change frequency of the area of interest. What we found from CFM is that most objects changed only once and a small part of objects changed twice or more. CMM 11 exhibits the construction process of these bungalows in this area. In CMM 11 , the different change moments are represented in different colors, where we can see that these bungalows are built from north to south gradually.
The different values K generated by different methods are listed in Table 3. Proposed-R 5 5 Proposed-σ 2 5 8 Proposed-Q 5 6 Proposed-L 5 6 Ground truth 4 5 We only illustrate CMMs of objects whose CF ≤ 2 in Figures 6 and 7 because very few objects change more than three times. The CMM 31 , CMM 32 and CMM 33 of these two datasets are illustrated in the appendix, as shown in Figures A2 and A3. Experimental results contain plenty of figures so that we put them in the appendix, which is also available on https://github.com/txdtplus/zhang2020_appendix. This website includes CFM, CMMs, and illustrations of two datasets.

Data Applicability
This method is specially designed for single polarized SAR images. Changes in vegetation are not evident in SAR images, which is very helpful for building change detection. However, in optical images, we have to extract the building areas using NDVI or other indices before detecting building changes. For multispectral optical images or multi-polarization SAR images, we need to find new ways to generate change feature.

Accuracy
Compared to other methods, our framework is designed especially for multitemporal building change detection and generating CFM and CMMs. The core of our framework is the contact between MBI calculation and cosegmentation. As Figure 8 shows, the change feature generator selection has little effect on accuracy. Figures 9 and 10 show different CFMs generated by various methods, which illustrate qualitative results comparing with other methods. The quantitative results of the various methods are list in Tables 4 and 5, as well as in Figure 8.
The detection errors are mainly caused by building recognition step and despeckling step. As shown in Figure A1, the boundaries of the adjacent small buildings become more blurred after despeckling, which poses a greater challenge to building recognition.

Algorithm Complexity
MBI calculation takes up most of the time of the algorithm. We implement experiments on a computer with i5 CPU at 3.00 GHz (6 cores) and 16 GB RAM. The runtimes of the three methods are listed in Tables 4 and 5.

Conclusions
We propose a novel multitemporal building change detection framework that can generate change frequency map(CFM) and change moment maps(CMMs) from multitemporal SAR images. We first give definitions of CFM and CMMs, then a new cosegmentation based on multitemporal images and change feature generator is proposed to divide timeseries images into changed and unchanged areas separately. The proposed cosegmentation and the morphological building index(MBI) are combined to extract changed building objects. The logical conjunction between the cosegmentation results and the binarized MBI is performed to recognize every moment of change. In the post-processing step, we use fragment removal to increase accuracy. Finally, we propose a novel accuracy assessment index for CFM.
The proposed method can acquire both CFM and CMMs while most multitemporal change detection approaches only capture the intensity of change. CFM and CMMs are of great significance to the field of building change monitoring. The experiment of dataset 1 demonstrates the effectiveness of CMMs for detecting when the objects are built and when they are demolished. The experiment on the second dataset illustrates that the CMMs can clearly state the process of urban building expansion.
The accuracy of our method is superior to other methods under the ACD index. We first introduce the cosegmentation method into an unsupervised multi-temporal SAR image change detection field and acquire excellent results. Both synthetic data and real TerraSAR-X data demonstrate the effectiveness of our method.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding author upon reasonable request.