K-Matrix: A Novel Change-Pattern Mining Method for SAR Image Time Series

: In this paper, we present a novel method for change-pattern mining in Synthetic Aperture Radar (SAR) image time series based on a distance matrix clustering algorithm, called K-Matrix. As it is different from the state-of-the-art methods, which analyze the SAR image time series based on the change detection matrix (CDM), here, we directly use the distance matrix to determine changed pixels and extract change patterns. The proposed scheme involves two steps: change detection in SAR image time series and change-pattern discovery. First, these distance matrices are constructed for each spatial position over the time series by a dissimilarity measurement. The changed pixels are detected by using a thresholding algorithm on the energy feature map of all distance matrices. Then, according to the change detection results in SAR image time series, the changed areas for pattern mining are determined. Finally, the proposed K-Matrix algorithm which clusters distance matrices by the matrix cross-correlation similarity is used to group all changed pixels into different change patterns. Experimental results on two datasets of TerraSAR-X image time series illustrate the effectiveness of the proposed method.


Introduction
The success of launching Earth Observation satellites has provided a powerful tool to map Earth's surface and acquire information about targets on the ground.Change detection (CD) has become an important application of remote sensing images [1], such as environmental monitoring [2][3][4], the observation of natural disasters [5], risk management [6], and the change analysis of human activity [7,8].Compared with optical sensors, synthetic aperture radar (SAR) sensors have many advantages.Crucially, SAR is an active microwave imaging sensor and it can image in all-time and any weather conditions [9,10].After years of development, SAR sensors can acquire many high-quality image data of the same target area.So that remotely sensed multitemporal images are usually used to analyze the changes taking place between two or more images acquired over the same area at different time [11].
In the literature, a great number of change detection methods have been proposed for SAR image time series.Generally, these change detection methods can be grouped into two main categories: (i) analysis methods based on bi-date change detection; (ii) time series analysis methods.For bi-date change detection task, existing methods can be either supervised or unsupervised.However, it is difficult for supervised methods to identify classes in SAR images on account of unavailable ground truth and unsupervised change detection is more popular because no ground truth is needed [12].Usually, these change detection methods consist of three steps [13]: (1) preprocessing of SAR images; (2) comparison between two images; (3) thresholding of the image change indicator.
The preprocessing step includes geometric correction, image registration, and SAR image denoising.In this paper, we study the change detection method under the condition that the SAR images have been corrected and co-registered.For SAR images, regions with homogeneous radar properties display strong fluctuations due to speckle.The common model of speckle is multiplicative noise model and the corresponding speckle statistics departs from the additive Gaussian noise model widely used for optical images.Many speckle reduction methods have been proposed in the recent reviews [14,15].To reduce the influence of speckle and improve the efficiency of using SAR images, we apply multi-channel logarithm with Gaussian denoising (MuLoG) to process all SAR images [16].In the comparison step, a suitable dissimilarity measure is used to compute the distance of two images.The most intuitive operator is the difference operator [17].However, the difference operator is susceptible to noise, then the classical Log-Ratio operator is proposed to detect changed pixels between two co-registered SAR images of the same scene [18].The Generalized Likelihood Ratio Test (GLRT) is proposed as the extension of the Log-Ratio [19].According to information theory, the Kullback-Leibler (KL) divergence, a statistical dissimilarity measure is often used to measure the distance in the task of SAR image change detection [20].Based on local distribution model, the Jensen-Shannon (JS) divergence is proposed to measure the dissimilarity of two probability density functions (PDFs) [21].In the last step, a thresholding algorithm is used to obtain the results of change detection.There are already many algorithms proposed to segment the image into binary image.For instance, the Otsu is most commonly used binarization algorithm [22].The Kittler-Illingworth (KI) [23] is another popular thresholding algorithm by using the generalized Gaussian to model the statistical distributions of changed and unchanged classes [24].The generalized Kittler and Illingworth minimum-error thresholding algorithm (GKI) is proposed to take into account the non-Gaussian distribution of the amplitude values of SAR images [25].After above three steps, the change results, named change map, are obtained and the change map is usually a binary matrix containing only 0 and 1 where 0 represents unchanged pixel and 1 represents changed pixel.Generally, all change maps are integrated to analyze the change information in time series.For example, the activity areas are mapped by summing all binary change maps, called IndexMap.The IndexMap contains the frequency of each change area over the time series [26].
Direct conversion of SAR image time series into time series is another effective analysis method [27].That is to say that each time series is constructed by extracting the feature of pixel at the same spatial position over all SAR images.Then the time series is analyzed one by one.In [28], a spatio-temporal change detection framework is proposed, and the results are obtained by using a matrix of cross-dissimilarities.The method for generalized means ordered series analysis (MIMOSA) compares only two different means between the amplitude SAR images in time series to analyze the change patterns [29].Another most common method is change analysis of SAR image time series based on the change detection matrix (CDM).These matrices are constructed for each spatial position over the time series by implementing similarity cross tests [30].The CDM is a binary matrix containing 0 and 1 where 0 represents for no change and 1 represents for change in certain characteristics of the pixel.A likelihood ratio test-based method of change detection and classification, namely NORmalized Cut on chAnge criterion MAtrix (NORCAMA), is proposed to extract different change patterns such as step change, impulse change, and cycle change in [31].This method classifies change types by analyzing all clustering results after a normalized cut algorithm being used on change criterion matrix (CCM).However, the change detection matrix ignores a lot of information when CDM can index the change information in pixel time series.In this paper, the distance matrix (DM) is constructed directly to detect changes in SAR image time series.Then, inspired by K-Shape algorithm [32], a novel clustering algorithm that measures the distance between two DMs by matrix cross-correlation is proposed to cluster distance matrices and discover the change patterns by clustering results.
The main contributions of the proposed method are as follows: (1) A novel two-step framework to mine the change patterns of SAR image time series.(2) An efficient change detection approach based on the distance matrix for SAR image time series.(3) An unsupervised clustering algorithm, called K-Matrix, for the distance matrix to extract the change patterns.
The method has been tested on two real data sets of SAR image time series acquired by the TerraSAR-X over two different areas in Shanghai city, China.Experimental results confirm the effectiveness of the proposed method.The remainder of this paper is organized as follows.Section 2 describes in detail the proposed change detection method for SAR time series and the proposed K-Matrix clustering algorithm in turn.Next, experimental results are given in Section 3, and capabilities and limitations are discussed in Section 4. Finally, the conclusions are drawn in Section 5.

Change Detection in SAR Image Time Series
To detect the pixels that have been changed in SAR image time series, we construct a distance matrix containing of the information on responses of distance between each pair of pixels from different dates.Considering an N-length co-registered SAR image time series I = {I t } N t=1 , where I t is the SAR image acquired at time t.At the position (i, j), we obtain a pixel set {I t (i, j)} N t=1 , named pixel time series, and I t (i, j) represents the amplitude value of SAR image I t located in (i, j).As shown in Figure 2, the distance matrix is computed by: where the dissimilarity is computed between each two dates using small sliding window to suppress the influence of noise in SAR images.For example, the dissimilarity measurement can be obtained by the Log-Ratio (LR) operator or other measure that can be used in SAR images.This matrix contains all information of dissimilarity which is relevant to the degree of change corresponding to each reference date.After that, our purpose is to analyze each pixel time series using the corresponding distance matrix.For a pixel time series, we assume that these pixels form a physical system.In physics, it is well known that the lower the energy the more stable for a system.Here, we define similarly the stability of the pixel system, i.e., it is stable if there is no change in terms of feature value of pixels while the degree of instability is corresponding to the changing degree of the pixel.Moreover, the distance matrix records the dissimilarity between any two pixels in time series, so that we redefine the energy value of the pixel system just like the definition of energy in gray-level co-occurrence matrix (GLCM): Theoretically, only when e = 0, the system is completely stable.However, in fact, even after radiation correction, the amplitude value of the pixels in SAR image time series still have some fluctuations.
To decide whether there is the stable pixel system or not, the value of energy e is compared to a threshold λ.The stability information, i.e., change information, is then defined as: where R is a binary matrix mask with the size of N × N containing 0 and 1 values, in which 0 represents "unchanged" and 1 represents "changed".In this paper, we consider the changed region as region of interest (RoI), where this region is the main study area in our change-pattern mining experiments.

K-Matrix Clustering Algorithm
In this section, we propose a novel centroid-based clustering algorithm, called K-Matrix that can preserve the feature shapes of time series using distance matrix.Specifically, we first present our distance measure, called the matrix cross-correlation, which is inspired by the cross-correlation measure for time-lagged signal in [32].Based on this distance measure, we propose a method to compute the barycenter, i.e., the centroid of matrix clusters.Finally, we describe our K-Matrix clustering algorithm, which relies on an iterative refinement procedure.
Originally, cross-correlation is a statistical measure with which we can determine the similarity of two time series.Here, this measurement is used to measure the similarity between two distance matrices.For each distance matrix, there is a corresponding graph in the feature space.As shown in Figure 3, in the graph (V, E), the vertex υ t ∈ V represents the feature point extracted from I t (i, j).We define the pixel time series at the spatial position (i, j) as a pixel system and define the corresponding Graph (V, E) in its feature space, i.e., vertex v t represents the feature of the pixel I t (i, j) in the feature space and the distance between features of pixel I t (i, j) and pixel I k (i, j) is defined as edge e tk .All elements of their distance matrix are set to 0 for the diagonal and the time flow information is hidden in the diagonal direction as shown in Figure 3. Considering two distance matrices, A and B, to achieve shift-invariance, we keep B static and slide diagonally A over B to compute their inner product for each shift matrix of A. We denote the time shift of a matrix by a shift variable s: Then, we need to determine the position q at which MCC(A, B) is maximized.In addition, based on this value of q, we can find the optimal time shift Ã(q) of matrix A corresponding to the matrix B. The computational procedure of the matrix cross-correlation sequences is illustrated in Figure 4. Specifically, we define the maximum of the matrix cross-correlation as the similarity between two distance matrices, named matrix cross-correlation similarity (MCS): For many clustering tasks, it is a challenging step to extract the meaningful centroid according to the proposed distance measurement.Although there are many clustering criteria that have been proposed to capture homogeneity and separation [33], the minimum within-class sum of squared distances is the most commonly used.For example, there is a set including n samples X = { x 1 , x 2 , ..., x n }, where x i ∈ R m , then the goal is to group X into k clusters P = {p 1 , ..., p k }, and the objective loss function is: However, if we replace the distance with the similarity measurement, the above formulas can be rewritten as: In this task of distance matrix clustering, assuming that there is a matrix set M set = {M 1 , M 2 , ..., M n } in which all elements belong to the same group p k , where n is the number of elements in M set , and p k is one of the cluster center set P. Now we have: It is difficult to solve this optimization problem directly because it is nonlinear.As mentioned earlier, we can obtain the aligned matrix when the matrix cross-correlation is the maximum.So that we can use these aligned matrix set Mset = { M1 , M2 , ..., Mn } to substitute M set .For simplicity, we will express this equation with vector.The distance matrix is symmetrical matrix and it can be written as: where U and L are an upper triangular matrix and a lower triangular matrix, respectively.
Here, the non-zero elements of the matrix U are sequentially taken to compose a vector x = (U 12 , ..., U 1N , ..., U (N−1)N ) T .Similarly, vector µ k is corresponding to the center matrix p k .Then Equation ( 9) is rewritten as (please see Appendix A for more details): Now, this problem has been converted to a well-known problem named maximization of Rayleigh Quotient [34].The solution is the eigenvector corresponding to the largest eigenvalue of matrix M.
After that, we can restore the center matrix p k by vector µ k .
It should be noted that the K-Matrix clustering algorithm is based on the iterative refinement procedure just like the one used in K-means.At first, we choose randomly K matrices as centroid to initialize the algorithm.Then, in every iteration, K-Matrix includes two steps: assignment and refinement.In the assignment step, we update the cluster memberships by assigning each matrix to the closest cluster (Equation ( 6)).In the refinement step, we use the cluster memberships assigned in previous step to update the centroid (Equation (A3)).This algorithm repeats above two steps until the assignment is unchanged or the default maximum number of iterations is reached.

Description of Datasets
There are two real datasets of SAR image time series used to test the proposed method.These images are acquired with TerraSAR-X sensor over Shanghai, China.The product type is single look complex (SLC) and the imaging mode is StripMap (SM).Both dataset I and dataset I I contain the co-registered 12 SAR images.They have the same spatial resolution of 3 m × 3 m.The size of each image in dataset I is 300 × 300 pixels and the size of each image in dataset I I is 800 × 800 pixels.These SAR images are obtained from 28 September 2013 to 18 October 2014.There are more details shown in Figures 5 and 6.All SAR images have been denoised by MuLoG + BM3D before change detection and pattern mining [16].The two ground truth maps of change detection are shown in Figure 7, i.e., as long as the change occurs in pixel time series, the corresponding spatial position is set to 1.

Change Detection in SAR Image Time Series
In this section, our goal is to find the area where the change has occurred.At first, the distance matrix set DM set must be constructed effectively.In this experiment, Log-Ratio (LR) operator and GLRT are selected as the dissimilarity measurements to compute the distance between two pixels for simplicity.Those distance measures based on statistical distribution usually have high computational complexity, such as KL divergence.Moreover, a sliding window is used to suppress the influence of noise when the distance is calculated.Here, the size of the sliding window is set as 5 × 5.The Log-Ratio operator is defined as follows: where • is the mean operator, (i, j) indexes the spatial position of the pixel in the image.According to [35], the similarity of the generalized likelihood ratio S GLR is written as: where y k is the pixel in patch y and L k is the (equivalent) number of looks of y k .Then, the distance can be denoted as: The distance matrix D ij = {d ij (t 1 , t 2 )} N×N is obtained after computing distance of all pixel pairs.In this way, we have the distance matrix set DM set = {D ij } m×n .Here, we chose a commonly used time series change detection framework for comparison [36].As shown in Figure 8, the results are obtained by summing the ChangeMaps of all adjacent two dates.For our change detection experiments, we use the following four indicators to evaluate these methods: Overall Accuracy (OA), the detection precision of changed pixels (Pc), the recall of changed pixels (Rc) and Kappa coefficient.They are defined as follows: where TP means the number of true detected changed pixels, TN means the number of true detected no changed pixels, FP means the number of false detected changed pixels and FN means the number of false detected no changed pixels.Moreover, Pe is defined as: For the dataset I, the value of N is 12.The values of m and n both are 300.According to the Formula (2), the energy value e can be computed and the feature maps are shown in Figure 9a,b.The feature map can are considered to be the activity map that the greater the value, the bigger the change.As shown in Figure 9, the Log-Ratio operator is sensitive to the change of the water area.Furthermore, to determine the changed area, i.e., region of interest (RoI), the thresholding algorithm is used to generate the binary map that the zeros represent the unchanged pixels and the ones represent the changed pixels.Here, the GKI algorithm is applied to separate the changed pixels from the unchanged pixels.The change detection results are shown in Figure 10.In Table 1, the performance of the proposed method is clearly better than the traditional one.The GLRT method has the highest overall accuracy (OA = 96.27%),detection precision of changed pixels (Pc = 91.94%),recall (Rc = 95.03%) and the best classification result (Kappa = 0.91).For the dataset I I, the value of N is 12.The values of m and n both are 800.The distance matrix set DM set = {D ij } 800,800 i=1,j=1 is obtained by calculating the each distance matrix D ij at the spatial position (i, j).Based the LR operator and GLRT, the distance matrix sets are obtained.Then, the feature maps are computed and shown in Figure 9c,d.Figure 11 shows the change detection results obtained by using the GKI algorithm.It is obvious that the LR operator method and GLRT method in our scheme have better performance of change detection.The quantitative analysis results are listed in Table 2. Please note that all the evaluation indicators of change detection are lower than these in dataset I because there are more complex scene in the dataset I I.However, our scheme with GLRT method still has the satisfactory performance with Pc 75.89%,Rc 73.78%, OA 92.84% and Kappa 0.70.

Change-Pattern Mining
After detecting changes in SAR image time series, our next goal is to discover these different change patterns.In Section 3.2, the distance matrix set has been obtained and results with the best performance are defined as change map.In this section, the above change detection results are considered to be the study area.The change map is an m-by-n matrix containing only 0 and 1.Then, we denote a new distance matrix set M set : Each element in M set is a distance matrix of the changed pixel.We apply the proposed K-Matrix clustering algorithm to two time series datasets.In this experiment, we only use the amplitude information of SAR images.For dataset I, we randomly choose three distance matrices to initialize K-Matrix algorithm from a practical point of view.The clustering results of change patterns are shown in Figure 12.Three areas with different change patterns are marked in red, green, and blue, respectively.In Figure 12b, three typical change-pattern areas are marked with three different color rectangles.We crop these areas and show them in Figure 13.It is obvious that the area marked with green represents the change of buildings (Figure 13b).Figure 13a,c both represent the change in the water areas, but they belong to different change patterns.Figure 14 shows a better visualization.For dataset I I, these SAR images have larger sizes and more complex change patterns.The clustering algorithm starts with the number of patterns being set to 4. The clustering results shown in Figure 15a are painted in four colors: red, green, blue, and yellow.In Figure 16c, the area marked by red rectangle is mainly the change of buildings.Due to the overlap and multiple scattering of high buildings, the clustering results have multiple change patterns mixed together.In Figure 16b, the area marked by green rectangle is likely to be a periodic wetland change pattern.The regions marked by blue and yellow rectangles represent two water areas with different change periods such as the areas shown in Figure 16a,d.More results are displayed in Figure 17. Figure 17a-d

Discussion
Change detection and change-pattern analysis of SAR image time series play a great role in Earth monitoring.It is a huge challenge to handle large amounts of remote sensing data while the area that changes is usually only a small part.The proposed change detection method is only used to detect whether a time series has changed and cannot detect where the change occurred.For a distance matrix, if the pixels of corresponding time series have not changed, the value of each element in the distance matrix is very small.if the pixel in the ith position changes, the values of the elements of the ith row and the ith column are larger than the values of other elements.Each time series with a different change pattern has a distance matrix corresponding to it.
To discover change patterns from the changed area, we propose a clustering algorithm based on distance matrices, called K-Matrix.Inspired by cross-correlation, the matrix cross-correlation is defined, and we expect that this method can cluster pixels with the same change trend into a group.As shown in Figures 14c and 17c, the solid line and dotted line are grouped the same cluster.In our experiment, only the amplitude characteristics of the pixels are used.From a theoretical point of view, the K-Matrix can be used to cluster the change patterns in multidimensional feature space.Figures 18  and 19 show the possible results of this method in 2-dimensional feature space of pixel time series.However, when we use multidimensional features, the final clustering results are difficult to visualize and whether the discovered patterns have obvious physical meaning is not clear.As an exploration in change-pattern mining of SAR time series, although our method achieves desirable results on two real datasets, there still exist certain limitations.For example, because the two-step scheme relies on the distance matrix, how to compute the distance between two pixels effectively is a problem worth thinking about.Moreover, the numerical complexity of distance matrix construction and the K-Matrix algorithm are O(mnN 2 ) and O(mnN 3 ), respectively, where m and n are the sizes of SAR images, N is the number of images.With the length of SAR image time series increasing, the computation cost and storage space of distance matrices will be very high.In the future, we will study the correspondence between the distance matrices of changed pixels and different change patterns and some more efficient algorithms to discover patterns with special physical meanings will be investigated.

Conclusions
In this paper, a new framework for the change-pattern mining of SAR image time series has been proposed.The method is based on the distance matrix which records the dissimilarity of two pixels at the same spatial position in time series.A novel method is proposed to detect the changes in the SAR image time series.The proposed K-Matrix is used to cluster the distance matrix into different groups.Based on our framework, the Log-Ratio operator and GLRT are used in change detection of SAR image time series.The accuracy and Kappa are used to evaluate the performance of the proposed method.Finally, we use the proposed K-Matrix to discover the change patterns based on the change detection results.Experimental results on two datasets of TerraSAR-X image time series illustrate the effectiveness of the proposed method.
where S = ( x 1 , ..., x n ) • ( x T 1 , ..., x T n ) T .Moreover, we set µ k = (I − O m ) • µ k = Q • µ k to obtain the z-normalized µ k , where I is the identify matrix, and O is a matrix with all ones [32].Finally, by dividing µ T k • µ k to make µ k unit norm, we obtain: where

Figure 1
Figure 1 shows the scheme of the proposed method that consists of two main steps: (1) change detection in SAR image time series; (2) change-pattern clustering.First, the detected region of interest (RoI) separates changed from unchanged pixels in SAR image time series.Afterward, the method focuses on the changed pixels only.A novel clustering algorithm is used to mine different change patterns on the RoI in SAR image time series.Finally, the different kinds of change patterns are grouped and visualized by analyzing clustering results.

Figure 1 .
Figure 1.Framework of the proposed method.

Figure 2 .
Figure 2. The strategy to build a distance matrix.

Figure 4 .
Figure 4.The cross-correlation between two matrices.

Figure 5 . 28 Figure 6 .
Figure 5. Dataset I of SAR image time series.

Figure 7 .
Figure 7. Ground truth for SAR image time series.

Figure 8 .
Figure 8.The baseline framework of change detection in SAR image time series.

Figure 12 .Figure 13 .
Figure 12.Clustering results, dataset I. (a) Change-pattern mining results, (b) Three typical change-pattern areas marked with three different color rectangles.

Figure 15 .
Figure 15.Clustering results, dataset II.(a) Change-pattern mining results, (b) Four typical change-pattern areas marked with four different color rectangles.

Figure 18 .
Figure 18.The 2-dimensional feature in time series.

Figure 19 .
Figure 19.The different changes of the same change-pattern.Left: the 2-dimensional feature space.Right: (a)-(h) different changes with different orders and scales.

Table 1 .
Performance analysis of results on dataset I.

Table 2 .
Performance analysis of results on dataset I I.