Multitemporal SAR Image Despeckling Based on a Scattering Covariance Matrix of Image Patch

This paper presents a despeckling method for multitemporal images acquired by synthetic aperture radar (SAR) sensors. The proposed method uses a scattering covariance matrix of each image patch as the basic processing unit, which can exploit both the amplitude information of each pixel and the phase difference between any two pixels in a patch. The proposed filtering framework consists of four main steps: (1) a prefiltering result of each image is obtained by a nonlocal weighted average using only the information of the corresponding time phase; (2) an adaptively temporal linear filter is employed to further suppress the speckle; (3) the final output of each patch is obtained by a guided filter using both the original speckled data and the filtering result of step 3; and (4) an aggregation step is used to tackle the multiple estimations problem for each pixel. The despeckling experiments conducted on both simulated and real multitemporal SAR datasets reveal the pleasing performance of the proposed method in both suppressing speckle and retaining details, when compared with both advanced single-temporal and multitemporal SAR despeckling techniques.


Introduction
The synthetic aperture radar (SAR) remote sensing technique has attracted wide attention since the launch of the advanced spaceborne sensors. These SAR satellites or sensors can provide images of the observed areas in different time phases in all day time and all weather conditions, and work in different imaging modes. The above advantages make the SAR a good tool to observe the Earth. However, speckle noise is inherent in SAR systems, as with most of the active imaging systems, which degrades the quality of the SAR data and hinders most applications using SAR data. In the imaging step of certain SAR systems, a multilook procedure is deployed to suppress the amount of speckle noise, which can, however, result in severe spatial resolution loss. Therefore, the application of a speckle reduction technique is often essential before using SAR data, especially for most satellite radar systems which provide single-look data. This paper focuses on the despeckling of single-polarization SAR data; fully polarimetric SAR or interferometric SAR despeckling is outside the scope of this paper.
During the past four decades, a number of single-temporal SAR despeckling methods were proposed. In the early years, many SAR filters were developed based on minimum mean-square error theory [1], such as the Lee filter [2] and the Kuan filter [3]. Local filters that work in the wavelet domain [4,5] and homomorphic approaches [6,7] that take the log of the data before applying the denoising process were also widely studied. However, most of these traditional despeckling techniques can hardly obtain a good balance between preserving the fine image structure and effectively reducing speckle, although they generally have high processing efficiency. In the last ten years, the nonlocal means (NLM) scheme [8] has attracted the interest of researchers when developing SAR speckle reduction methods [9][10][11]. The main idea of the NLM filter is to search similar image patches and to use a weighted average to estimate the true value of the reference pixel based on the similarity between the reference and the sample. Recently, inspired by NLM, Parrilli et al. [12] proposed a SAR block-matching denoising method and 3D filtering method (SAR-BM3D). This method achieves a remarkable despeckling performance by employing a collaborative filtering strategy conducted in the wavelet domain, using the similarity and redundancy information of nonlocal patches. It needs to be pointed out that the deep learning theory has been a hot topic in remote sensing applications in recent years [13][14][15], and has also been applied to filter SAR data [16].
Thanks to the newly launched satellite radar systems, multitemporal SAR images of the same area are now available in many cases. Theoretically, when dealing with the despeckling issue of multitemporal images, exploiting both the temporal and spatial information can provide better results than the single-temporal filters. In recent years, some multitemporal SAR filters have been investigated. In 2014, Su et al. [17] proposed a two-step multitemporal NLM filter, which conducts a nonlocal filtering step driven by the redundancy information in the temporal domain, and then employs nonlocal estimation in the spatial domain. Chierchia et al. [18] extended the SAR-BM3D method to filter multitemporal images by exploiting more temporal information to guide both the similar patch search and image filtering steps. More recently, Zhao et al. [19] proposed a unified framework which can extend some well-known single-image despeckling techniques to filter multitemporal images, based on the ratio image between the noisy image and the temporally averaged image. For further details of the multitemporal despeckling methods presented in the early years, we refer the reader to a review paper written by Trouvé et al. [20].
For most single-polarization SAR filters, to estimate the true values of the intensity or amplitude data, only the statistical information of the intensity or amplitude is considered. However, for single-look complex (SLC) SAR data, the phase information is available and can also help to characterize the scattering traits of the targets. Consider the phase information when despeckling the amplitude data can provide better processing results. Inspired by this idea, this paper presents a multitemporal SAR filter for SLC data, which is developed under the NLM framework. One of the main innovations of this work is that the filtering framework is based on a scattering covariance matrix (SCM) constructed by the complex data of the pixels in each image patch. By doing so, both the statistical information of the amplitude of each pixel and the phase difference between any two pixels in a patch are utilized to guide the filtering process. In addition, differing from most multitemporal SAR filters that use a change detection approach to exploit the temporal redundancy, the proposed method utilizes an adaptive temporal filter based on the stationarity of the patches, which can avoid the issue of determining the change detection threshold.
In the first step of the proposed method, the basic estimations of the patches of each image in the different time phases are independently estimated by a nonlocal weighted average. Then, by searching and grouping the similar patches in time and obtaining the stationarity information of the patch groups, a simple but effective temporal linear filter is used to further suppress the speckle level. In the final step, another weighted average is employed to obtain the final estimation of the patches, and an aggregation process is used to tackle the multiple estimations issue for each pixel.
The rest of this paper is organized as follows. The idea of the scattering covariance matrix of patch is introduced in Section 2. In Section 3, the multitemporal SAR despeckling framework is presented. Then, the good filtering performances of the proposed method are validated by the experiments conducted on both simulated and real multitemporal SAR datasets in Section 4, followed by conclusions in Section 5.

The Covariance Matrix of SAR Data
It needs to be noted that, although this paper focuses on the despeckling issue of single-polarization SAR data, it is still necessary to recall the statistical trait of fully polarimetric SAR (PolSAR) data, since the idea of SCM employed in the proposed method stems from it.
For single-look PolSAR data, the polarimetric information of a pixel can be characterized by a polarimetric covariance matrix (PCM), which is constructed by the outer product of the scattering vector α of this pixel with its conjugate transpose α K [21]: where |S HV | and φ HV denote, respectively, the amplitude and the phase of the vertical transmitting (subscript "V") and horizontal receiving (subscript "H") polarization channel; j is the imaginary part; and the superscripts T and * are, respectively, the transpose operator and the conjugate operator. Clearly, the off-diagonal terms in the PCM can reflect the phase differences between the different polarizations, and thus, the PCM can better characterize the scattering traits of the targets, compared with the polarimetric scattering vector α.
To compare the structure similarity between two image patches, only intensity or amplitude information is considered in most NLM-based SAR filters. A direct problem is that, as mentioned in the introduction, the phase information in SLC SAR data can also help to characterize the scattering traits of the targets and considering such information when comparing patches can provide better results. As revealed in [22], considering the homogeneity similarity between patches as well as the structure similarity is useful in improving the performances of the NLM-based filters.
Based on the above analyses and inspired by the PCM of PolSAR data, we propose to use the SCM to represent the complex scattering information of all the pixels in an image patch. Assuming that all the pixels in a patch centered at pixel x are statistically independent, we introduce a scattering vector for this patch by jointing the back-scattering signals of all the pixels: where S n = |S n | exp(j φ n ) denotes the complex data of the nth pixel. In this paper, we fix the size of all the patches as 3 × 3 (i.e., n = 9). The reason is that, the computational time of the filtering procedure could increase more than 10 times if we enlarge the patch size from 3 × 3 to 5 × 5, and we found small patches are helpful to retain point targets which is meaningful for many target detection applications. It can be seen that, this joint vector has a similar form with the PolSAR scattering vector α, with the difference that it is constructed by the scattering information of different pixels rather than by the scattering information of different polarimetric channels of a certain pixel. Therefore, intuitively, we can construct the SCM for this patch as the way of constructing PCM for PolSAR data, which is: ). Clearly, each diagonal term of M is the intensity of a pixel, while each off-diagonal term contains the phase difference between two pixels in the same patch, as well as the amplitude information of each pixel. By doing so, the issue of comparing the similarity of two patches can be converted into comparing the similarity between two SCMs, so that more information can be exploited rather than only using the amplitude information.
It is well known that the PCM of PolSAR data follows a complex Wishart distribution [23]. As we can see, since each element of the joint scattering vector v follows a complex Gaussian distribution as in the PolSAR scattering vector α, the SCM of a patch can also be modeled by the complex Wishart distribution given the clean matrix z, as follows: where Tr(·) denotes the trace of the matrix, Γ(·) denotes the Gamma function, L is the number of looks, and q = n is the dimension of the SCM M.

Similarity Measures Based on the SCM
Under the NLM framework, the proposed method is developed using the SCM of an image patch as the basic processing unit. Hence, the first question is how to measure the similarity between two patches based on their SCMs. To handle this issue, we employ a likelihood-ratio test method designed for complex Wishart-distributed data [24]. If we assume that two SCMs, M x and M y , follow the same complex Wishart distribution (i.e., x and y have the same true SCM), then the test statistic for the single-look data can be derived as: Taking the logarithm of Equation (7), we have: where G 1 (M x , M y ) is non-positive and is equal to zero when two patches have the same true values. The idea of the likelihood-ratio test for two patches can be extended to measure the stationarity of a patch group (i.e., the equality of multiple variables). With the hypothesis that all the patches of a group are equal, the test statistic for t variables can be derived as [25,26]: with the temporal average The value of this multivariable test statistic is between zero and one, and the more similar the patches are, the closer the value is to 1.

The Proposed Multitemporal SAR Despeckling Framework
The schematic map of the proposed SCM-based multitemporal SAR (SCM-MSAR) filter is shown in Figure 1. In Step I, we construct the SCMs for all patches of all time sequences (see Equations (3) and (4)), by doing so, new images with each unit represented as a matrix is obtained, which is termed as "matrix image" in the rest of this paper. Then, in Step II, each matrix image is independently filtered to obtain its basic estimation. In Step III, to utilize redundancy information in time, similar patches in different times are grouped and the stationarity information of each group is obtained (see Equations (9) and (10)). After this, a temporal linear filter is employed to further reduce the speckle level of the target matrix image in Step IV. In Step V, the final estimation of the target matrix image is obtained by a guided filter, using both the information of the speckled data and the temporally filtered data. Lastly, since each pixel is included in several patches, and an aggregation step is taken to tackle the multiple estimations problem. More details of the proposed SCM-MSAR filter are provided in the following.
The value of this multivariable test statistic is between zero and one, and the more similar the patches are, the closer the value is to 1.

The Proposed Multitemporal SAR Despeckling Framework
The schematic map of the proposed SCM-based multitemporal SAR (SCM-MSAR) filter is shown in Figure 1. In Step Ⅰ, we construct the SCMs for all patches of all time sequences (see Equations (3) and (4)), by doing so, new images with each unit represented as a matrix is obtained, which is termed as "matrix image" in the rest of this paper. Then, in Step Ⅱ, each matrix image is independently filtered to obtain its basic estimation. In Step Ⅲ, to utilize redundancy information in time, similar patches in different times are grouped and the stationarity information of each group is obtained (see Equations (9) and (10)). After this, a temporal linear filter is employed to further reduce the speckle level of the target matrix image in Step Ⅳ. In Step Ⅴ, the final estimation of the target matrix image is obtained by a guided filter, using both the information of the speckled data and the temporally filtered data. Lastly, since each pixel is included in several patches, and an aggregation step is taken to tackle the multiple estimations problem. More details of the proposed SCM-MSAR filter are provided in the following.

The First Spatial Filtering Approach
The SCM-MSAR filter includes two spatial filtering steps (Step Ⅱ and Step Ⅴ in Figure 1), which are weighted average approaches conducted on the speckled and the temporally filtered matrix images, respectively. In the first spatial filtering step, we use the following linear formulation to filter patch x in each time sequence independently: (11) with the weight: denotes the size of the large filtering window centered at x; 1 h is the weight normalization parameter. As shown in Figure 1, the different superscripts of M represent the

The First Spatial Filtering Approach
The SCM-MSAR filter includes two spatial filtering steps (Step II and Step V in Figure 1), which are weighted average approaches conducted on the speckled and the temporally filtered matrix images, respectively. In the first spatial filtering step, we use the following linear formulation to filter patch x in each time sequence independently: with the weight: where w 1 × w 1 denotes the size of the large filtering window centered at x; h 1 is the weight normalization parameter. As shown in Figure 1, the different superscripts of M represent the estimation results of the matrix image in the different filtering steps, which is consistent in the whole paper. G 1 (·) is the similarity function between two speckled SCMs calculated by Equation (8). In fact, we can observe that, this is a standard NLM filter, with the exception that the intensities of all the pixels in a patch (i.e., the diagonal terms of the SCM) are filtered in each weighted average step.

Temporally Similar Patch Grouping
In some cases, due to the limited precision of the georeference or the geocoding process, the multitemporal SAR images will not be exactly registered, which leads to a slight offset in the space between SAR images. In such cases, if we take a temporal filter simply using the pixels located in the same image position of different times, the image could be badly blurred. In this paper, to facilitate the temporal filtering procedure, we propose a method to search and group similar patches in different times. It should be pointed out that the similar patch grouping approach is employed to utilize redundancy information in time, rather than performing accurate registration.
For a given patch x in the target time, and in its local window with the size of w × w, we can compare the similarity between x and each patch in the window of any complementary time by Equation (8). By doing so, the most similar pixel with respect to x in all the time sequences is obtained. However, we found that the patch grouping process is not good enough due to the influence of speckle, and considering the information of the neighbors of x can refine the estimation process (i.e., we conduct the above grouping process for all the patches inside a 3 × 3 local window of x), and find the patch in each time which is considered as the similar patch most frequently. Figure 2 shows the temporal average of a multitemporal SAR dataset with and without similar patch grouping, and we also compared the temporal average results by using the proposed similar patch grouping approach and the one presented in [17]. To reveal the feasibility of different approaches, we display the ratio images between the original speckled image and the temporal averaged images. Due to the multiplicative trait of speckle, the content residing in the ratio image should be very like pure speckle, if the temporal averaged approach reduces the speckle level without blurring the details. Clearly, the original result is badly blurred, while the results with the similar patch grouping approaches, especially with the proposed one, better retain the details. ) ( 1 ⋅ G is the similarity function between two speckled SCMs calculated by Equation (8). In fact, we can observe that, this is a standard NLM filter, with the exception that the intensities of all the pixels in a patch (i.e., the diagonal terms of the SCM) are filtered in each weighted average step.

Temporally Similar Patch Grouping
In some cases, due to the limited precision of the georeference or the geocoding process, the multitemporal SAR images will not be exactly registered, which leads to a slight offset in the space between SAR images. In such cases, if we take a temporal filter simply using the pixels located in the same image position of different times, the image could be badly blurred. In this paper, to facilitate the temporal filtering procedure, we propose a method to search and group similar patches in different times. It should be pointed out that the similar patch grouping approach is employed to utilize redundancy information in time, rather than performing accurate registration.
For a given patch x in the target time, and in its local window with the size of w w × , we can compare the similarity between x and each patch in the window of any complementary time by Equation (8). By doing so, the most similar pixel with respect to x in all the time sequences is obtained. However, we found that the patch grouping process is not good enough due to the influence of speckle, and considering the information of the neighbors of x can refine the estimation process (i.e., we conduct the above grouping process for all the patches inside a 3 3 × local window of x), and find the patch in each time which is considered as the similar patch most frequently. Figure 2 shows the temporal average of a multitemporal SAR dataset with and without similar patch grouping, and we also compared the temporal average results by using the proposed similar patch grouping approach and the one presented in [17]. To reveal the feasibility of different approaches, we display the ratio images between the original speckled image and the temporal averaged images. Due to the multiplicative trait of speckle, the content residing in the ratio image should be very like pure speckle, if the temporal averaged approach reduces the speckle level without blurring the details. Clearly, the original result is badly blurred, while the results with the similar patch grouping approaches, especially with the proposed one, better retain the details.

Temporal Filtering Approach
Taking into account the redundant information in the temporal domain when designing, multitemporal SAR filters can achieve better results than single-temporal filters. To achieve this goal, most of the current multitemporal SAR filters use a change detection step, often followed by a simple temporal averaging using the unchanged pixels, to further suppress the speckle or to compare the similarity between pixels more precisely. As we stated before, the phase information is not considered in these methods. Furthermore, determination of a proper threshold for change detection is difficult in many cases.
In this paper, we propose a simple but effective temporal filter based on the stationarity of a similar patch group in time (Step Ⅳ in Figure 1), which is depicted as: where x J denotes the stationarity calculated by Equation (9), and R is the temporal average of the basic estimations of the patches. As we can see, when 1 ≈ . This implies that the proposed temporal filtering step is adaptive, which is more applicable and reasonable than using a change detection step based on a threshold, and simply taking a temporal average.

The Second Spatial Filtering Approach
In the second spatial filtering step (Step Ⅴ in Figure 1), we employ a similar framework used in the SAR guided filters [27,28] to reduce the noise residing in the temporally filtered image of the target time sequence. The main idea is to guide the filtering process with the help of the original data. In other words, we consider the information of both the temporally filtered patches and the speckled patches in the target time, which is formulated as: with the weight where 2 h is another weight normalization parameter, and ) , ( denotes the similarity between two patches in the temporally filtered image. It needs to be noted that, since the assumption of a complex Wishart distribution does not still hold in the filtered image, this similarity measure term cannot be calculated by Equation (8). Considering that the SCMs are Hermitian positive definite

Temporal Filtering Approach
Taking into account the redundant information in the temporal domain when designing, multitemporal SAR filters can achieve better results than single-temporal filters. To achieve this goal, most of the current multitemporal SAR filters use a change detection step, often followed by a simple temporal averaging using the unchanged pixels, to further suppress the speckle or to compare the similarity between pixels more precisely. As we stated before, the phase information is not considered in these methods. Furthermore, determination of a proper threshold for change detection is difficult in many cases.
In this paper, we propose a simple but effective temporal filter based on the stationarity of a similar patch group in time (Step IV in Figure 1), which is depicted as: where J x denotes the stationarity calculated by Equation (9), and R is the temporal average of the basic estimations of the patches. As we can see, when J x ≈ 1 (i.e., high stationarity),M 2 x ≈ R; and when J x ≈ 0 (i.e., high nonstationary),M 2 x ≈M 1 x . This implies that the proposed temporal filtering step is adaptive, which is more applicable and reasonable than using a change detection step based on a threshold, and simply taking a temporal average.

The Second Spatial Filtering Approach
In the second spatial filtering step (Step V in Figure 1), we employ a similar framework used in the SAR guided filters [27,28] to reduce the noise residing in the temporally filtered image of the target time sequence. The main idea is to guide the filtering process with the help of the original data. In other words, we consider the information of both the temporally filtered patches and the speckled patches in the target time, which is formulated as: with the weight where h 2 is another weight normalization parameter, and G 2 (M 2 x ,M 2 i ) denotes the similarity between two patches in the temporally filtered image. It needs to be noted that, since the assumption of a complex Wishart distribution does not still hold in the filtered image, this similarity measure term cannot be calculated by Equation (8). Considering that the SCMs are Hermitian positive definite matrices, we use the affine-invariant metric proposed by D'Hondt et al. [29] to measure their similarity: For the setting of the weight normalization parameters h 1 and h 2 in the spatial filtering approaches, we deploy a strategy based on the complexity of the image scene [30,31]: before each spatial filtering approach, the parameter is chosen as the 0.8-quantile of the distribution of the similarity between any two neighboring pixels computed throughout the image. By doing so, the parameters can be set lower to better preserve the image details, but when the image scene is complex, the parameter can be set larger, to better suppress the speckle.
After the second spatial filtering approach, the final estimations of all the pixels in each patch are obtained. Since each pixel is included in n patches when constructing the matrix images, we finally average the multiple estimations to obtain more robust results.

Experimental Part
In this section, to illustrate the effectiveness of the proposed SCM-MSAR filter, we describe the experiments carried out on simulated images and two real multitemporal SAR datasets, and make comparisons between the proposed filter and three state-of-the-art filters: the single-temporal SAR-BM3D filter [12], the multitemporal SAR-BM3D (MSAR-BM3D) filter [18], and the ratio-based multitemporal SAR (RB-MSAR) filter [19], which were all accomplished by the source codes offered by the authors of the respective papers. In the experiments, the sizes of the searching windows of all the methods were fixed as 35 × 35. For reproducibility, the SLC SAR datasets used in the experiments and the despeckling results of the SCM-MSAR filter can be downloaded from the link (http://sendimage.whu.edu.cn/en/resources/).
Generally speaking, an outstanding SAR filter should have the following characteristics [32,33]: (1) speckle is reduced to a large degree in homogeneous areas; (2) image features and radiometric information are well preserved; and (3) the result should be without artifacts.
To quantify the performances of the filters in the above aspects, we use the following four indicators: (1) Equivalent number of looks (ENL): the ENL indicator is widely used in the literature to evaluate the level of speckle, and the higher the ENL value, the better the filter reduces the speckle.  [34] is used to assess the edge-preservation capability of despeckling methods. It was designed for experiments with simulated images, since a reference is needed to calculate this index. A higher FOM value indicates a better edge-preservation result. (3) Edge-preservation degree based on the ratio of average (EPD-ROA): for the real SAR datasets, EPD-ROA [35] is one of the indices used to assess the edge-preservation capability of filters which does not need a reference. This index should be close to one if the filter shows a good edge-preservation performance. (4) Mean of ratio (MOR): due to the statistical trait of speckle, a MOR value between the speckled and the despeckled data should be very close to one if the filter can effectively preserve the radiometric information.

Experiments with Simulated Datasets
In this part, we use a framework proposed by Di Martino et al. [32] to simulate the multitemporal SAR datasets. This framework was developed based on a SAR raw signal simulator (SARAS) [36], which simulates the SAR images relevant to canonical scenes. In SARAS, to evaluate a SAR signal, sound geometrical and electromagnetic models for the evaluation of the reflectivity function of the scene and a model for the transfer function of the system are employed. By doing so, proper physical models for the sensed surface, the scattering, and the radar operational mode are considered. Through simulating multiple SAR images as different instances relevant to the same scene, SAR images with an arbitrary number of looks can be obtained. Hence, to simulate a "clean" reference image, one can generate enough images of the same scene by this framework and average them. First, we simulated three datasets which all have four time sequences. To inspect the influence of the changes between the images in the target time and the other times (i.e., the complementary times) on the despeckling results, we randomly chose some pixels with a certain ratio and changed their values by a mean filter before exerting speckle for each complementary time. The change ratios were set as 0.4, 0.2, and 0.1 for the three simulated datasets, respectively. The simulated images and the filtered results of the dataset with r = 0.4 are shown in Figure 3. At first sight, both the single-temporal and multitemporal SAR-BM3D filters show some artifacts which blurs the image to some degree. Visually, the multitemporal SAR-BM3D filter performs slightly better than the single-temporal filter both in suppressing speckle and retaining edges. Compared with the BM3D filters, the RB-MSAR filter smooths the homogeneous areas to a greater degree, but some strong speckle still resides on the image. The superiority of the SCM-MSAR filter is distinguished: the speckle is reduced by a large degree and the boundaries are effectively preserved, and moreover, with much less artifacts. To visualize the capability of the despeckling methods in preserving edges, we plotted the edge profiles for each filtered image (Figure 3). The conclusions we can draw from the edge profiles are generally in line with the above observation.
Sensors 2019, 9 of 16 Through simulating multiple SAR images as different instances relevant to the same scene, SAR images with an arbitrary number of looks can be obtained. Hence, to simulate a "clean" reference image, one can generate enough images of the same scene by this framework and average them. First, we simulated three datasets which all have four time sequences. To inspect the influence of the changes between the images in the target time and the other times (i.e., the complementary times) on the despeckling results, we randomly chose some pixels with a certain ratio and changed their values by a mean filter before exerting speckle for each complementary time. The change ratios were set as 0.4, 0.2, and 0.1 for the three simulated datasets, respectively. The simulated images and the filtered results of the dataset with 4 . 0 = r are shown in Figure 3. At first sight, both the singletemporal and multitemporal SAR-BM3D filters show some artifacts which blurs the image to some degree. Visually, the multitemporal SAR-BM3D filter performs slightly better than the singletemporal filter both in suppressing speckle and retaining edges. Compared with the BM3D filters, the RB-MSAR filter smooths the homogeneous areas to a greater degree, but some strong speckle still resides on the image. The superiority of the SCM-MSAR filter is distinguished: the speckle is reduced by a large degree and the boundaries are effectively preserved, and moreover, with much less artifacts. To visualize the capability of the despeckling methods in preserving edges, we plotted the edge profiles for each filtered image (Figure 3). The conclusions we can draw from the edge profiles are generally in line with the above observation. The above analyses are backed up by the quantitative assessment results listed in Table 1. The SCM-MSAR filtered image has the highest FOM and ENL values among all the despeckled images. Furthermore, the MOR values reveal that the proposed method shows the best performance in retaining radiometric information, while the single-temporal SAR-BM3D filter distorts the radiometric information to a much larger degree. An interesting phenomenon, which should be pointed out, is that as the change ratio decreases, the superiority of the SCM-MSAR filter with regard to the other two multitemporal filters becomes less significant. This reveals that, for a certain dataset, the superiority of the proposed method will be more significant if the image scene is highly dynamic in time, benefiting from the adaptive temporal filtering approach. We also simulated a dataset having eight time sequences with r = 0.1. Clearly, as the time sequences increase, the performances of all the multitemporal filters have been notably improved since more redundant temporal information is exploited. Thanks to the adaptive temporal filtering approach, the superiority of the proposed filter compared with other filters becomes more significant as the time sequences increase.

Experiments with Two Real Multitemporal Datasets
Experiments were also carried out on two real multitemporal datasets, to verify the effectiveness of the proposed method. The first dataset has two time sequences which were acquired by the TerraSAR-X system in Ruhr, Germany, on 20 February 2008 and 4 March 2008, respectively. The imaging mode is strip mode with a spatial resolution of 3 m × 3 m working on X band, and the polarization type is HH. From Figure 4, visually, the two SAR-BM3D methods and the proposed method show quite comparable results in preserving image details, especially preserving the high returns from the urban area. Compared with the other three methods, the RB-MSAR filter shows a slight oversmoothing problem. The above observations are supported by the ratio images and the EPD-ROA values shown in Table 2. The single-temporal SAR-BM3D still does not effectively suppress the speckle, although it has the highest EPD-ROA value. The proposed method again shows the best result in preserving radiometric information.    The second real SAR dataset shown in Figure 5 was acquired by the Sentinel-1 system and has three images which were acquired on 31 July 2017, 17 September 2017 and 11 October 2017, respectively. The imaging mode is interferometric wide swath mode with a spatial resolution of 5 m × 20 m working on C band, and the polarization type is HV. Since the original images are not well-registered, the offset estimation step presented in Section 3 was deployed before applying each of the multitemporal filters.
Once again, we can observe that RB-MSAR has an oversmoothing problem, and some artifacts are exhibited in the despeckled image, which degrades the resolution of the image to some degree. From the ENL and EPD-ROA values, the proposed method obtains a good balance between reducing speckle and retaining image features, and the MOR values again indicate the outstanding performance of the proposed method in preserving radiometric information. The second real SAR dataset shown in Figure 5 was acquired by the Sentinel-1 system and has three images which were acquired on 31 July 2017, 17 September 2017 and 11 October 2017, respectively. The imaging mode is interferometric wide swath mode with a spatial resolution of 5 m × 20 m working on C band, and the polarization type is HV. Since the original images are not wellregistered, the offset estimation step presented in Section 3 was deployed before applying each of the multitemporal filters. Once again, we can observe that RB-MSAR has an oversmoothing problem, and some artifacts are exhibited in the despeckled image, which degrades the resolution of the image to some degree. From the ENL and EPD-ROA values, the proposed method obtains a good balance between reducing speckle and retaining image features, and the MOR values again indicate the outstanding performance of the proposed method in preserving radiometric information. Preservation of signals from strong scatterers is essential for target and man-made structure detection. To show the capability of the filters in retaining the information of strong scatterers, two lines were taken across two high-return targets, respectively (marked by the red rectangle in Figure  5b), and the signal intensity of the lines after despeckling by the different methods are plotted in Figure 6. Here, it can be observed that the data filtered by the proposed method are the most in line with the original data, which indicates the proposed method's ability to retain strong targets. The other methods smear the signatures to a greater or lesser degree. The proposed SCM-MSAR filter consists of three filtering steps and a multiple estimations aggregation step. To reveal the influence of the different steps on the filtering result, we display the filtered image of each step in Figure 7. As we can see, after the first spatial filtering step, the speckle is suppressed to some degree. Then, by using the adaptive temporal filter, the speckle is notably reduced, but some details are slightly blurred. In the second spatial filtering step, since the information of the original speckled image is used to guide the despeckling process, the oversmoothing problem is alleviated and finally, via the multiple estimations aggregation step, the resolution of the image is further improved. Preservation of signals from strong scatterers is essential for target and man-made structure detection. To show the capability of the filters in retaining the information of strong scatterers, two lines were taken across two high-return targets, respectively (marked by the red rectangle in Figure 5b), and the signal intensity of the lines after despeckling by the different methods are plotted in Figure 6. Here, it can be observed that the data filtered by the proposed method are the most in line with the original data, which indicates the proposed method's ability to retain strong targets. The other methods smear the signatures to a greater or lesser degree. Preservation of signals from strong scatterers is essential for target and man-made structure detection. To show the capability of the filters in retaining the information of strong scatterers, two lines were taken across two high-return targets, respectively (marked by the red rectangle in Figure  5b), and the signal intensity of the lines after despeckling by the different methods are plotted in Figure 6. Here, it can be observed that the data filtered by the proposed method are the most in line with the original data, which indicates the proposed method's ability to retain strong targets. The other methods smear the signatures to a greater or lesser degree. The proposed SCM-MSAR filter consists of three filtering steps and a multiple estimations aggregation step. To reveal the influence of the different steps on the filtering result, we display the filtered image of each step in Figure 7. As we can see, after the first spatial filtering step, the speckle is suppressed to some degree. Then, by using the adaptive temporal filter, the speckle is notably reduced, but some details are slightly blurred. In the second spatial filtering step, since the information of the original speckled image is used to guide the despeckling process, the oversmoothing problem is alleviated and finally, via the multiple estimations aggregation step, the resolution of the image is further improved. The proposed SCM-MSAR filter consists of three filtering steps and a multiple estimations aggregation step. To reveal the influence of the different steps on the filtering result, we display the filtered image of each step in Figure 7. As we can see, after the first spatial filtering step, the speckle is suppressed to some degree. Then, by using the adaptive temporal filter, the speckle is notably reduced, but some details are slightly blurred. In the second spatial filtering step, since the information of the original speckled image is used to guide the despeckling process, the oversmoothing problem is alleviated and finally, via the multiple estimations aggregation step, the resolution of the image is further improved. The despeckled images despeckled by the first spatial filtering step, the temporal filtering step, the second spatial filtering step, and the aggregation step, respectively.

Conclusions
In this paper, a multitemporal SAR filter was proposed for the despeckling of SLC SAR data, using a SCM of each patch as the basic processing unit, which can better take advantage of the phase difference between pixels. The proposed method includes two spatial filtering approaches. One is a nonlocal filtering scheme, and the other is a guided filtering scheme. In addition, an adaptive temporal filter is employed in the proposed method based on the temporal stationarity of patches, to better exploit the redundant temporal information. We compared the proposed method with some state-of-the-art single-temporal and multitemporal SAR filters, in experiments conducted on both simulated and real datasets. The experimental results revealed that the proposed method can not only obtain a good balance between reducing speckle and retaining image structure, but also show an outstanding performance in preserving radiometric information.
It must be pointed out that, since the output of each spatial and temporal filtering step of the proposed method is the matrix image, the phase of each pixel in a patch cannot be directly retrieved using the filtered SCM; that is to say, the proposed filter only tackles with the despeckling issue of amplitude or intensity. Our future work will focus on the despeckling problem of both amplitude and phase for complex SAR data.
Author Contributions: All the authors designed the study and discussed the basic structure of the manuscript. X.M. carried out the experiments and finished the first version of this manuscript. P.W. reviewed and edited this paper.  The despeckled images despeckled by the first spatial filtering step, the temporal filtering step, the second spatial filtering step, and the aggregation step, respectively.

Conclusions
In this paper, a multitemporal SAR filter was proposed for the despeckling of SLC SAR data, using a SCM of each patch as the basic processing unit, which can better take advantage of the phase difference between pixels. The proposed method includes two spatial filtering approaches. One is a nonlocal filtering scheme, and the other is a guided filtering scheme. In addition, an adaptive temporal filter is employed in the proposed method based on the temporal stationarity of patches, to better exploit the redundant temporal information. We compared the proposed method with some state-of-the-art single-temporal and multitemporal SAR filters, in experiments conducted on both simulated and real datasets. The experimental results revealed that the proposed method can not only obtain a good balance between reducing speckle and retaining image structure, but also show an outstanding performance in preserving radiometric information.
It must be pointed out that, since the output of each spatial and temporal filtering step of the proposed method is the matrix image, the phase of each pixel in a patch cannot be directly retrieved using the filtered SCM; that is to say, the proposed filter only tackles with the despeckling issue of amplitude or intensity. Our future work will focus on the despeckling problem of both amplitude and phase for complex SAR data.