Change Detection in UWB SAR Images Based on Robust Principal Component Analysis

: This paper addresses the use of a data analysis tool, known as robust principal component analysis (RPCA), in the context of change detection (CD) in ultrawideband (UWB) very high-frequency (VHF) synthetic aperture radar (SAR) images. The method considers image pairs of the same scene acquired at different time instants. The CD method aims to maximize the probability of detection (PD) and minimize the false alarm rate (FAR). Such aim ﬁts into a multiobjective optimization problem, since maximizing the probability of detection generally implies an increase in the number of false alarms. In that sense, varying the RPCA regularization parameter leads to PD variation with respect to FAR, which is known as receiver operating characteristic (ROC) curve. To evaluate the proposed method, the CARABAS-II data set was considered. The experimental results show that RPCA via principal component pursuit (PCP) can provide a good trade-off between PD and FAR. A comparison between the results obtained with the proposed method and a classical CD algorithm based on the likelihood ratio test provides the pros and cons of the proposed method.


Introduction
Among the different approaches in unsupervised separation, robust principal component analysis (RPCA) methods can deal with additive mixing models and are commonly used for foreground detection in video surveillance systems to detect moving objects [1]. In synthetic aperture radar (SAR) applications, detecting and removing undesired content is an essential issue in practice [2]. The problem of detecting and removing content in SAR data can be treated as an unsupervised signal separation problem, usually referred to as blind source separation (BSS) [3].
Currently, the use of RPCA in SAR applications can already be found in the literature. In [4], RPCA solved via principal component pursuit (PCP) is used in a pre-processing step to decompose the SAR data into two parts, one related to stationary objects and the other to moving targets. The authors achieved separation employing a proper windowing scheme based on the data set of the X-band SAR system, namely GOTCHA Volumetric SAR. By still considering the GOTCHA data set and the RPCA via PCP, [5] presents preliminary results for multi-pass SAR change detection (CD) in X-band. In the context of ground moving target indication (GMTI), [6] proposes the use of RPCA via a relaxed version of PCP (Relaxed PCP) in order to achieve moving target detection in multichannel surveillance radar systems (MC-SAR), presenting computational results using real data from a Chinese X-band SAR system. Then, [7] extends the method presented in [6] aiming at clutter suppression. GMTI problem is again addressed in [8], in which a fast interferometry RPCA method is proposed considering multiple-input multiple-output (MIMO) SAR systems, where gains in terms of moving target detection and computational running time complexity are observed. Moreover, sparse representations have also been considered in other types of radars, such as in [9,10]. These works provided a framework based on l q -norm constraints by means of an approach known as sparse learning via iterative minimization (SLIM).
The goal in CD applications is detecting changes in a scene that occurs between different measurement campaigns [11]. The changes can be a consequence of natural disasters or human activities (military and civilian), which result in the appearance or disappearance of targets on the ground. The challenge associated with the design of an automated CD algorithm goes beyond performing the target detection. It is also related to the clutter suppression aiming at a low false alarm rate (FAR) to provide useful information for decision making [11]. For instance, in wavelength-resolution SAR systems, such as the Coherent All Radio Band System (CARABAS) [12], false alarms are often caused by elongated structures commonly sensitive to flight path (e.g., power lines) [13].
Over the last years, CD algorithms with different approaches have been proposed aiming at higher values of probability of detection (PD) while keeping or reducing the FAR. A method similar to space-time adaptive processing (STAP), which exploits differences between target, clutter, and interference signals, was presented in [11]. STAP theory is usually applied in GMTI to combine data from different receiving channels in order to perform stationary clutter suppression and moving target detection [2]. However, in [11], the challenge is to enhance changes between two single-channel SAR images acquired at different times (image pairs). In that sense, in [11], a hypothesis test is used-which relies on a likelihood ratio test-followed by a constant false alarm rate (CFAR) normalization, thresholding, and morphological operations to reduce false detection. Testing statistics such as amplitude ratio and generalized likelihood ratio tests have been used in wavelength-resolution SAR change detection, which considers a pair of images by using the one-look data statistics [14,15].
It is worth noticing that CARABAS is one of the few available UWB wavelength-resolution SAR systems. The system is operating with ultrawideband (UWB) in the VHF range , which results in low influence of speckle and high stability between measurements. That behavior allows multiple images of the same scene to have a high correlation between measurements, which, in turn, allows one to detect targets through CD algorithms [12]. Knowing that objects responsible for clutter can be considered stable for different image acquisitions once VHF UWB SAR might have only one scatter in the resolution cell for high resolution, the combination of different images taken from different flights (or from different antennas on the same device) can be used to improve the CD performance. Such a feature paves the way for the use of statistical tools such as RPCA via PCP.
This paper aims to explore the use of the signal separation tool known as RPCA via PCP [16] as a CD algorithm where the inputs are two UWB wavelength-resolution SAR images. The issue addressed in this paper differs from the references above-mentioned in important aspects. First, the focus herein is in UWB wavelength-resolution SAR images. It can be considered as an important issue since the RPCA performance relies on the statistical behavior of the backscattered signal. In addition, detecting changes between two single-channel (SC) SAR images may present several differences in terms of signal correlation when compared to separate sparse content between MC-SAR channels for GMTI application. Thus, this paper shows that the classical model of RPCA via PCP may achieve relevant results for the class of images analyzed since the targets (or changes) are sufficiently uncorrelated from the background content. Currently, no other results are found for methods based on RPCA via PCP for this type of radar system, which has unique statistical properties. The evaluation is performed considering a methodology designed aiming at pairs of images to attend the UWB VHF SAR application which accounts for details like detections tangent between images and other measurement criteria to better accomplish the application. Moreover, the results presented in this paper could be considered as a baseline for other futures CD algorithms proposals based on RPCA, aiming at VHF UWB SAR systems.
The rest of this paper is organized as follows. Section 2 introduces the principle of RPCA via PCP. Then, Section 3 details the materials and methods herein used. Section 4 details the proposed method to achieve RPCA via PCP change detection. Computational results are presented in Section 5. Finally, concluding remarks are given in Section 6.

Robust PCA
Principal component analysis (PCA) has been a ubiquitous tool in data analysis [17] for several decades. In order to accomplish tasks such as denoising and dimensionality reduction, PCA relies on the observation that the covariance matrix of high-dimensional tabular data sets (such as images) exhibits, very often in practice, particular structures, which means that the intrinsic dimension of a data set is usually lower than its observed dimension. As a consequence, it becomes possible to approximate high-dimension data sets by simpler structures.
In PCA, the search for a low-dimension approximation is carried out by means of the following optimization problem: where X ∈ R N×M is the observed data matrix, L ∈ R N×M is the low-rank approximation of X, and || · || F denote the Frobenius norm. The rank of L is (at most) l, which is (often) lower than rank(X) = min(N, M). In fact, the optimization problem expressed in (1) can also be seen as the problem of minimizing the Frobenius norm of a noise term N in the following approximation of the observed data: X = L + N.
In the past decade, much effort has been put in an extension of (1) known as robust principal component analysis (RPCA) [16], mainly aiming at giving robustness to the PCA against outliers. In this case, the observed data, X, is decomposed as follows where L corresponds to the low-rank term and S is a sparse matrix. Such a representation finds application in several domains, from video surveillance [1] to geophysics [18]. It is possible to perform RPCA by means of a tractable convex optimization formulation. This can be achieved by considering the nuclear norm of L (i.e., the sum of the singular values of L), expressed by ||L|| * , as an approximation of rank(L), and the l 1 norm of S, expressed by ||S|| 1 , as a measure of sparsity. Therefore, the following optimization problem can be considered where λ > 0 is a regularization parameter that balances the two terms. An essential aspect of this formulation, which is also known as principal component pursuit (PCP), is that it can recover, under some mild conditions [16], the original terms L and S that generated the observed data according to (2). There are several strategies to optimize (3). A very popular one is based on the alternating direction method of multipliers (ADMM) [19]. This approach can overcome a critical issue of the augmented Lagrangian method (ALM). The ALM approach is based on a single step, which means that optimization is performed by jointly considering the variables L and S. On the other hand, in ADMM, the variables L and S are updated in two separate steps. The nice aspect of this approach is that each step in ADMM can be addressed by simpler methods. It becomes possible to set up a strategy based on alternating minimization steps [19]. In the case of RPCA, ADMM decomposes the problem as follows: where ρ represent the augmented Lagrangian parameter. The alternating steps can be solved by simply defining the shrinkage operator and shrinkage operator for singular values [19]. Due to its simplicity, the ADMM method is herein considered.

Materials and Methods
This paper proposes a CD method for UWB VHF SAR images based on RPCA via PCP. The materials and tools used for the method development and evaluation can be summarized as follows: (a) the CARABAS-II data set of images, (b) an RPCA via PCP algorithm implementation, (c) a classical CD algorithm based on the likelihood ratio test.
CARABAS-II is a VHF low-frequency SAR UWB device and represent the second generation of Swedish CARABAS radars, developed by the company Saab AB. A data set composed by 24 images achieved using the CARABAS-II system is documented in [11], and made available as an open data set by the U.S. Air Force Research Lab (AFRL) at the sensor data management system (SDMS) website [20].
More precisely, the SAR raw data were recorded into onboard hard drives leaving all signal processing steps to be performed offline, such as image formation, radiometric calibration, geocoding, equalization, and others [12]. The 24 images are georeferenced according to the Swedish reference system RR92 [21], and show the same ground area in a restricted military zone, differing from each other in target positions (missions) and flight-heading of its acquisition (passes). The final images can be considered as the results of contributions from different sources (e.g., targets, background, and additive noise). In fact, this is a superposition process in the complex domain, where the final SAR images contain amplitude and phase information. However, in the data set provided at [20], the phase information has been removed by taking the absolute value of the data. Thus, all the formulation presented in this paper is considering the SAR magnitude images (real positives).
Each magnitude SAR image covers an area of 2 km per 3 km. For the CARABAS-II, the spatial resolutions are 2.5 m in azimuth and also in range, as reported in [22]. With the image pixel of the SAR images of 1 m in azimuth and range [13]. Thus, each image is composed by 6 million pixels, organized in a matrix of 3000 rows per 2000 columns. More information about the CARABAS-II system parameters can be found in [11].
For wavelength-resolution SAR systems such as CARABAS, the wavelength is larger than the resolution cell [11,15]. Since there is only a single scatterer per resolution cell, and the wavelength at the center frequency is in the order of a few meters, small objects are poorly reflected. This property makes the system callousness to soil roughness and forest canopy, as well as the presence of shrubs, leaves, small branches, among other objects of similar nature and scale. Distinct reflections come from objects that are placed some meters above the ground and have dimensions of meters. Hence, reflections regarding elongated structures and large dimensions objects are very sensitive to the flight path, as well as their sidelobes. Additionally, scatters of a few meters or higher do not change between measurements, which makes the noise resulting from speckles low enough, even for time intervals of several hours, days, or months.
For instance, Figure 1  Following with the description of the materials and methods, the CD method herein proposed makes use of a MATLAB implementation of the RPCA via PCP available at [23]. Indeed, [23] provides a MATLAB implementation of the RPCA via PCP algorithm, whose formulation was presented in the Section 2.
The results achieved with the proposed method-which will be detailed in Section 4-are compared to the ones obtained with the classical CD algorithm based on the likelihood ratio test [11]. The algorithm described in [11] was implemented with MATLAB, and the runtime analysis was made using the same processing platform for both CD algorithms. The processing platform is composed by a Intel Core i7-3610QM processor, operating with 8Gb of RAM, using a solid state hard drive.

RPCA Evaluation Method
Let I m be an image of a scene measured at a given measurement m, formed by an array of magnitudes of R rows and C columns. A separation model can be considering I m as the sum of three matrices, given by where B m denotes the background content, T m a matrix containing the targets, and N m a matrix of additive noise. Generally, the CD problem between two SC SAR images is based on an image-difference formulation. This is especially true for wavelength-resolution SAR systems like CARABAS, due to its backscatter characteristics [14,15]. Thus, multiplicative noise can be ignored and, as a consequence, the CD does not need to be based on rationing. Considering that issue, detecting changes between measurements 1 and 2 can be performed by I 1 − I 2 . Then, if the background is unchanged during the data collections (i.e., B 1 = B 2 ), which is true for wavelength-resolution SAR, it is possible to retrieve the positive differences, T 1 , and the negative differences, T 2 , by where, N 1−2 is a Gaussian noise term resulting from N 1 − N 2 [15].
In this paper, that issue is addressed in a different manner. Let V 1 and V 2 be the images I 1 and I 2 transformed into the vector form, respectively. Similarly to the RPCA video surveillance application, V 1 and V 2 can be organized in the observed data matrix, X, as where p 1 , . . . , p R×C andp 1 , . . . ,p R×C represent the magnitude values of image pixels on vectors V 1 and V 2 , respectively. Thus, for the CARABAS-II image set, X has two rows and 6 × 10 6 columns, that is, Considering the RPCA statements presented in Section 2, it is expected to decompose the observed matrix X, into a sparse matrix S, and a low-rank matrix L, keeping into the matrix S the content that most diverges between the two rows of matrix X, that is, the changes between measurements 1 and 2. This expected result is supported by the relative background stability presented by wavelength-resolution SAR system. Due to N = 2 and N << M, the rank of L may remain equal to the rank of X for several values of λ. The separation of content from L to S is performed during the iterations of the RPCA algorithm until achieving a given value of sparsity to S. This target value of sparsity relies on the λ value. Lower values of λ result in more information in S (non-zero values), which increases the PD and FAR. The content that will be priorly transferred to S depends on the correlation structure of the input matrix. Hence, it is possible to balance the content of the matrix S such that the targets are contained in S, avoiding the noise and false alarms contents whenever possible. In that sense, sparse content with respect to the images I 1 and I 2 will be distributed, respectively, in the first and second rows of matrix S.

Measurement Criteria
Each non-zero value in S is considered a detection. Similarly to [11], the targets are considered detected if detections exist within a radius of 10 m (i.e., 10 pixels) from the ground truth position. Defining PD as the number of detected targets divided by the number of known targets, it is possible to calculate an average PD for a given amount of CD tasks. Other detections that could not be related to any target is counted as false alarms. After counting, the FAR is calculated as the number of false alarms per square kilometer.
In this paper, the false alarms are counted by using a window of 10 × 10 pixels. For instance, ten consecutive pixels (in one or both directions) are considered with respect to the same object and counted as one false alarm. Ten distant pixels are considered distinct objects and counted as ten false alarms. Following that rule, for example, 100 consecutive pixels will be counted as ten false alarms and 100 separate pixels as 100 false alarms. This metric allows a better ranging of the RPCA λ parameter by dealing with lower values, which increases the number of false alarms that are related to the same objects.
Also, the proposed method allows decision making in the following singular situation. When detections related to false alarms exist within the window of 10 × 10 pixels on both rows of matrix S. This may be the case if an object moves a few meters within the window or due to scattering variations between measurements. Since variations within the window of 10 × 10 pixels can be related to the same object, such cases will not be considered changes and will be counted separately as detections tangent between rows (DTBR). More details about DTBR is given in Section 5.
It is worth mentioning that CD algorithms which make use of morphological operations naturally associate adjacent pixels to the same object. The proposed method does not use CFAR normalization, threshold, or any morphological operations (e.g., erode, dilate). Therefore, CD performance depends only on the chosen λ value.

Results
The default formulation presented in [16] to calculate the λ value is given by λ = ( √ R × C) −1 for N < M and M = R × C. This results in λ = 4.082 × 10 −4 for the CARABAS-II data set. However, the choice of a suitable λ value relies on the data set statistical behavior in addition to the input matrix dimensions. Therefore, a workable operating region was found empirically by ranging the λ value. More specifically, an average PD, FAR, and DTBR, were calculated for each evaluated λ value, considering the 24 image pairs defined in [11]. Then, a workable operating region was observed within the range of 6 to 13 times the value of 4.082 × 10 −4 . Figure 2a shows the results for the following λ values: 0.0024, 0.0029, 0.0033, 0.0037, 0.0041, 0.0045, 0.0049, and 0.0053. It is possible to observe that as the λ value increases within the range, the PD and FAR are affected differently. More precisely, for λ = 0.0024, an average PD of 0.985 can be achieved while an undesirable FAR of 6.583 is obtained. However, if a PD above 0.95 is considered acceptable, better relationships between PD and FAR may be achieved for λ values of 0.0029 to 0.0033. Moreover, λ = 0.0037 can be an option to keep the FAR lower than 1, aiming at a reasonable PD of 0.8850. The other λ values may be considered in circumstances in which lower values of FAR is needed. It is worth observing that FAR related to DTBR is accounted for separately. As mentioned in Section 4, DTBR can be considered null but used for decision making on stringent applications. One case of DTBR is illustrated in Figure 3a at azimuth 1000-1500, range 0-500, marked with two overlapping circles in red and blue colors, respectively.   Besides the DTBR, Figure 3a shows the CD results for λ = 0.0037 using an image pair formed with the amplitude SAR images of Mission 2, Pass 3, as I 1 , and Mission 5, Pass 3, as I 2 . Each detection is automatically marked with circles. Detections related to I 1 are observed between azimuth 500-1000 and around range 500. Similarly, the circles related to I 2 are observed around azimuth 2500, between range 1000-1500. Note that I 1 is also used as a background-image in Figure 3a for guidance purposes. Thus, the circles related to I 2 are not accompanied by the array of bright dots. The introduced method detected 43 of 50 targets, specifically, 25 targets related to I 1 and 18 targets related to I 2 . Since the PD is calculated by the ratio of detected vehicles to the total number of vehicles, then PD = 43/50 = 0.86. Once no false alarms are observed, FAR = 0 considering the DTBR null. Figure 3b shows the results for λ = 0.0033 to illustrate the pros and cons in decreasing the λ value. It is possible to observe an increase in detected changes related to I 2 , while two false alarms appeared.
Another approach to calculate the PD and FAR can be considering only the detections related to one of two images. This allows comparing the results with CD algorithms that play the role of detecting targets only in the surveillance-image. In that sense, if I 1 is the surveillance-image, and I 2 is considered only as the reference-image, then detections associated with the second row of matrix S might be discarded, except for the DTBR analyzes. Therefore, PD can be calculated as PD = 25/25 = 1, keeping no false alarms. Thus, the obtained results can be better compared, for instance, to the ones obtained by the classical CD algorithm described in [11], which reached in PD = 25/25 = 1, and FAR = 3/6 = 0.5, using I 1 and I 2 as surveillance-image and reference-image, respectively.
General comparison considering the 24 image pairs can be made by observing Figure 2b. Actually, Figure 2b shows the receiver operating characteristic (ROC) curve of the RPCA via PCP method. Additionally, Figure 2b presents the ROC curve considering only the detections related to the first row of matrix S. The point indicating the average performance of the CD algorithm in [11] is also provided for comparison, which is given by PD = 0.965 and FAR = 0.667. It is possible to observe that the proposed method obtained similar performance for λ = 0.0033, when compared to the CD algorithm in [11]. In fact, PD = 0.952 and FAR = 0.763, which means a slightly lower performance if operating in PD = 0.965 is required. Alternatively, PD = 0.976 can be reached if an increase in FAR of 0.36 is allowed. Reduce the FAR to 0.444 is also an option if PD = 0.875 is sufficient.

Discussion
This paper proposed a method based on RPCA via PCP to perform CD in wavelength-resolution SAR images. The method considers the use of image pairs provided by a VHF UWB SAR system, and are evaluated using the CARABAS-II challenged data set, available at U.S. Air Force SDMS website.
The results indicate that RPCA via PCP can be used for UWB VHF SAR change detection applications. More precisely, original results illustrate the merits of the RPCA via PCP as a CD method for the wavelength-resolution SAR system. Moreover, contributions regarding the method, which account details like the DTBR and other measurement criteria, can also be observed. Thus, the results presented in this paper could be considered for future solutions that can be developed from the proposed method.
Despite the similar performance for λ = 0.0033, the proposed method presents several gains in terms of runtime processing. Specifically, the proposed method spent an average time of 54 s to each CD task, while the CD algorithm in [11] (also implemented in MATLAB to comparison) required a runtime about 10 times longer for the same processing platform. Both implementations considering single-thread execution. This behavior can be explained by the differences in the mathematics behind both CD strategies. In fact, a matrix solution may present several advantages in terms of runtime processing when compared to a pixel by pixel solution. The RPCA MATLAB implementation available in [23] uses the singular value decomposition (SVD) in economy size mode. This mode computes only the first N columns of the singular vectors when the number of rows is lower than the number of columns. In this case, singular values, returned as a diagonal matrix, is an N by N matrix. This condition shown to be sufficient, since no gains in terms of detection were observed in using the transpose of X as input.
It is worth mentioning that many efforts have been made in developing algorithms to solve RPCA via PCP aiming at computational reductions. In [1], different algorithms are listed and classified as: basics, linearized, and fast algorithms. Moreover, [1] also describes algorithms that require computing the SVD results in O(M × N × min(M, N)) complexity. That issue motivates researches focusing on the reduction of the complexity by avoiding the computation of SVD. Thus, it is expected that the use of fast algorithms will present even more reductions in runtime. More information about such algorithms can be found in [1].
Furthermore, some individual comparison can be made. Using λ = 0.0037 as an example, the proposed method presents better results for six image pairs. FAR reductions are also observed to other seven image pairs but with lower PD values. Equal and worse results are observed for two and nine image pairs, respectively. This behavior can be explained by the differences in the mathematics behind both CD strategies. This issue can be considered on developing algorithms combining different mathematical strategies to better explore the images features.
The results achieved in this study motivate the development of other methodologies for UWB VHF SAR change detection, based in other sparse representations. For instance, an approach based on a convex solution would be very welcome [24]. Thus, such challenges can be addressed in future studies.