Anomaly Detection in Hyperspectral Imagery Based on Low-Rank Representation Incorporating a Spatial Constraint

Abstract: Hyperspectral imagery contains abundant spectral information. Each band contains some specific characteristics closely related to target objects. Therefore, using these characteristics, hyperspectral imagery can be used for anomaly detection. Recently, with the development of compressed sensing, low-rank-representation-based methods have been applied to hyperspectral anomaly detection. In this study, novel low-rank representation methods were developed for anomaly detection from hyperspectral images based on the assumption that hyperspectral pixels can be effectively decomposed into a low-rank component (for background) and a sparse component (for anomalies). In order to improve detection performance, we imposed a spatial constraint on the low-rank representation coefficients, and single or multiple local window strategies were applied to smooth the coefficients. Experiments on both simulated and real hyperspectral datasets demonstrated that the proposed approaches can effectively improve hyperspectral


Introduction
Due to the high spectral resolution and abundant spectral information in a large number of spectral bands, hyperspectral image (HSI) data have been widely used to distinguish different targets on the ground. According to the availability of target information, target detection algorithms can be divided into two categories: with or without a priori target knowledge [1]. Since target information is often difficult to obtain in practice [2], anomaly detection, which does not require any a priori information about the targets, is of great importance in real applications [3,4].
Anomaly detection involves modeling the background and using the difference between the pixels and the background to detect anomalous pixels. Many different anomaly detection algorithms have been proposed. The well-known Reed-Xiaoli (RX) algorithm, considered to be a classical detection algorithm [5], is a second-order matched filtering algorithm in which the similarity between the test pixel and the background is calculated by the Mahalanobis distance. When the entire image is considered for background modeling, it is known as the global RX detector (GRXD). If the RX detector estimates the background using local statistics, it is referred to as the local RX detector (LRXD) [6]. However, neither of these two algorithms can exclude the influence of anomalous features on the covariance. More recently, many researchers have contributed to anomaly detection. Billor et al. proposed the blocked adaptive computationally efficient outlier nominator (BACON) algorithm [7], which tries to select a growing subset of outlier-free points and obtain robust background statistics to weaken the contamination by anomalies. Imani proposed the median-mean line RX detector (MML-RXD) [8], in which the benefits of the median-mean metric are used to provide more reliable background samples. Yao et al., inspired by the bilateral filter in digital image processing, proposed the bilateral-filter-based anomaly detector (BFDA) [9], which substitutes the weighted surrounding neighbors for centric pixels to remove anomalous points. Some enhanced RX-based algorithms, such as the weighted-RXD (WRXD) [10] and the linear filter-based RXD [10], were also proposed. They aim at increasing the probability of anomaly detection by improving the estimation of background statistics so as to exclude the influence of anomalous features on the covariance.
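As an illustration of the global RX idea described above, the following minimal sketch scores every pixel by its squared Mahalanobis distance to the global background statistics. The function name `global_rx`, the `(rows, cols, bands)` cube layout, and the use of a pseudo-inverse for numerical stability are assumptions made for this example; they are not taken from the cited RX papers.

```python
import numpy as np

def global_rx(cube):
    """Squared Mahalanobis distance of every pixel to the global mean.

    cube: array of shape (rows, cols, bands); layout is an assumption.
    """
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands).astype(float)   # one pixel spectrum per row
    mu = X.mean(axis=0)                         # global background mean
    Xc = X - mu
    cov = Xc.T @ Xc / (X.shape[0] - 1)          # global sample covariance
    cov_inv = np.linalg.pinv(cov)               # pseudo-inverse (assumption, for stability)
    scores = np.einsum("ij,jk,ik->i", Xc, cov_inv, Xc)
    return scores.reshape(rows, cols)
```

Because the mean and covariance are estimated from all pixels, anomalies themselves contaminate the statistics, which is the weakness of GRXD and LRXD noted in the text.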
Generally speaking, in real hyperspectral imagery, the background information is very complicated and cannot be described with just the covariance matrix and mean vector. To avoid having to estimate an accurate covariance matrix of the background, some linear-representation-based methods have been successfully applied to hyperspectral anomaly detection. These include the unsupervised nearest regularized subspace (UNRS) detector [11] proposed by Li et al. and the collaborative-representation-based detector (CRD) [12], both of which are based on the assumption that the spectrum of the central pixel is similar to that of the surrounding pixels and can be linearly represented by them. To make full use of the spatial information of neighboring pixels, the two detectors adopt a dual window strategy, which makes the detection accuracy dependent on the sizes of the inner and outer windows.
Unlike UNRS and CRD, with the development of compressed sensing, sparse-representation-based methods have been extensively researched. By using a small number of similar signals to represent signals, noise and anomalous signal information can be effectively suppressed. Some sparse-representation-based methods have been applied to hyperspectral image anomaly detection. For instance, the sparse representation detector (SRD) was proposed based on the assumption that hyperspectral pixels can be well represented by only a few pixels [13,14]. Sparse representation theory has a good application in the processing and representation of one-dimensional information, but for two-dimensional matrix data, sparsity can only represent the structural information of the dataset.
Due to the redundancy and diversity of the observed data, two-dimensional matrix data are presented with low-rank characteristics. By performing a low-rank constraint on the matrix, the global structure information of the two-dimensional matrix can be better described. Low-rank representation (LRR) can be considered as a special case of sparse representation, extending from one-dimensional sparsity to two-dimensional low rank. Some low-rank representation models have also been applied to hyperspectral anomaly detection, such as the classical robust principal component analysis (RPCA) [15], which decomposes the matrix into low-rank and sparse matrices. Chen et al. [16] proposed the RPCA-RX anomaly detection method, with the classic RX detector being applied to the sparse matrix. Unlike RPCA, the LRR [1,17] model assumes that the data are drawn from multiple subspaces, which is better suited for hyperspectral images due to the complex background features of real data. Because of the correlation between the representation coefficients, the low-rank constraint of the matrix can effectively describe the global structure of the dataset. Therefore, Xu et al. [18] proposed the low-rank representation sum-to-one (LRRSTO) anomaly detection model. In order to make the representation coefficients robust, a sum-to-one constraint was added.
In recent years, sparse subspace clustering (SSC) theory has attracted considerable attention and become a research hotspot. Zhai et al. first introduced the SSC [19] model to HSIs based on the assumption that pixels belonging to the same land-cover class approximately lie in the same subspace. The SSC algorithm generally calculates the affinity matrix by solving for the sparse representation coefficients and uses the spectral clustering algorithm to segment the data into different subspaces.
However, directly applying the SSC algorithm to HSIs fails to take advantage of the rich spectral and spatial information of HSIs. In light of this, the SSC incorporating spatial information (SSC-S) [20] model was proposed. Based on the hypothesis that the pixels in the neighborhood window have the same background as the central testing pixel, Zhang et al. hypothesized that the representation coefficients of the neighborhood pixels in the local window should also be similar to those of the central testing pixel. In the SSC-S model, they reduced the influence of noise and other errors by imposing a spatial constraint on the representation coefficients to enhance the clustering effect.
In this study, in order to make full use of spectral similarity in a local spatial neighborhood of the HSI, we were inspired by the SSC-S algorithm to introduce a spatial constraint term to the LRRSTO model. We then developed two novel anomaly detection methods based on single or multiple local windows and the low-rank representation sum-to-one model (SLW_LRRSTO/MLW_LRRSTO).
Unlike traditional LRRSTO models, we added a spatial constraint to the LRRSTO model in our SLW_LRRSTO methods. Based on the hypothesis that the background pixels in the neighborhood window share the same background as the central testing pixel, we used a single local window to smooth the LRR coefficient matrix. For the MLW_LRRSTO methods, we adopted multiple local windows to enhance the utilization of spatial information. However, in the LRRSTO model, the overcomplete dictionary composed of the data itself contains a large number of redundant dictionary atoms, which greatly slows down the algorithm. In order to accelerate the operation, in the proposed SLW_LRRSTO/MLW_LRRSTO models, we adopted the random selection method to construct the initial dictionary.
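A random dictionary of this kind can be sketched as follows. The function name `random_dictionary`, the number of atoms `d`, the fixed seed, and sampling without replacement are illustrative assumptions; the text only states that atoms are selected at random from the data.

```python
import numpy as np

def random_dictionary(X, d, seed=0):
    """Pick d distinct pixel spectra (columns of the p x MN matrix X) as atoms.

    Sampling without replacement and the fixed seed are assumptions.
    """
    rng = np.random.default_rng(seed)
    idx = rng.choice(X.shape[1], size=d, replace=False)  # column indices of the atoms
    return X[:, idx], idx
```

Using d much smaller than MN shrinks the coefficient matrix Z from MN × MN to d × MN, which is where the speed-up comes from.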

LRRSTO Detection Algorithm
LRR can effectively describe the global correlation of the observed HSI. Based on the LRR, the HSI can be decomposed into a background part and a sparse part. The anomaly information can then be detected in the sparse part. Xu et al. [18] applied the sum-to-one constraint to enhance the robustness of the representation coefficients in the LRRSTO method, which is formulated as

$$\min_{Z,E} \|Z\|_{*} + \lambda \|E\|_{2,1} \quad \text{s.t.} \quad X = DZ + E,\; Z^{T}\mathbf{1}_{d\times 1} = \mathbf{1}_{MN\times 1} \qquad (1)$$

where $X \in \mathbb{R}^{p\times MN}$ is the 2D HSI data matrix with p bands and MN pixels, $D \in \mathbb{R}^{p\times d}$ is the dictionary matrix, $Z \in \mathbb{R}^{d\times MN}$ is the LRR coefficient matrix, $Z^{T}\mathbf{1}_{d\times 1} = \mathbf{1}_{MN\times 1}$ is the sum-to-one constraint, E is the sparse matrix, and λ is the tradeoff parameter.

SSC-S Clustering Algorithm
The SSC model [21] seeks a sparse self-representation of the data, $X = XZ + E$, where $Z \in \mathbb{R}^{MN\times MN}$ is the representation coefficient matrix, $E \in \mathbb{R}^{p\times MN}$ is the representation error term, and λ is the tradeoff parameter. Here, the constraint $\mathrm{diag}(Z) = 0$ is used to eliminate trivial solutions.
According to Tobler's first law, neighborhood pixels within a local window usually belong to the same class as the central pixel under test. They also have similar sparse coefficients to the central pixel [22]. If the coefficient vector of the central pixel differs from the mean representation coefficient vector of its neighborhood pixels, the central pixel under test may be noise. Zhang et al. [20] proposed a novel SSC model incorporating spatial neighborhood information by utilizing a mean constraint for the sparse representation coefficients.
In this model, the neighborhood mean $\bar{Z}$ is utilized to regularize $Z$, and $\frac{\alpha}{2}\|Z - \bar{Z}\|_F^2$ measures the difference between the test pixel and the mean of its neighborhood pixels. The representation coefficient matrix can then be obtained by solving the SSC problem with this additional term, where $\frac{\alpha}{2}\|Z - \bar{Z}\|_F^2 < \varepsilon$ is the spatial constraint, and α and λ are the regularization parameters.

The Local Summation Anomaly Detection (LSAD) Algorithm
A circular window is utilized to calculate the covariance matrix and the mean vector for the pixels that are between the outer and inner windows in the LRXD [23]. However, if there are anomalous pixels in the frame, the local background distribution is not the best representation for the testing pixel. Du et al. [24] proposed a novel local summation anomaly detection (LSAD) method which exploits a second-order Mahalanobis distance statistical feature and a multiple local window filter to establish a local summation anomaly detection strategy.

Single Local Window
The LRRSTO model does not consider the spatial similarity between adjacent background pixels, that is, the fact that the neighborhood pixels in a local window usually belong to the same category as the central pixel under test. In order to make full use of the correlation information between local background pixels, we constrained the representation coefficients of the central pixels by using mean filtering. Based on the LRRSTO model, the spatial constraint $\|Z - \bar{Z}\|_F^2 < \varepsilon$ on the low-rank representation coefficients was used. The local distribution of the single window is obtained by the single local window filter, which does not include the pixel under test in the detection statistics. Therefore, we proposed a novel single local window anomaly detection method based on the low-rank representation sum-to-one model (SLW_LRRSTO):

$$\min_{Z,E} \|Z\|_{*} + \eta \|Z - \bar{Z}\|_F^2 + \lambda \|E\|_{2,1} \quad \text{s.t.} \quad X = DZ + E,\; Z^{T}\mathbf{1} = \mathbf{1} \qquad (4)$$

where $Z^{T}\mathbf{1} = \mathbf{1}$ is the sum-to-one constraint, $\|Z - \bar{Z}\|_F^2$ is the spatial constraint in the single local window, $Z$ is the representation coefficient matrix, $\bar{Z}$ is the smoothing filtering result of $Z$, and η and λ are the tradeoff parameters.
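The smoothing step can be illustrated with a small sketch that mean-filters each coefficient map over a w × w window while leaving out the pixel under test. The function names, the row-major pixel ordering of the columns of Z, and the edge-replication padding at the image border are assumptions made for this example and are not specified in the text.

```python
import numpy as np

def neighborhood_mean(Z, rows, cols, w=3):
    """Mean of each coefficient map over a w x w window, excluding the centre.

    Z has shape (d, rows*cols); columns are assumed to be in row-major order.
    """
    d = Z.shape[0]
    maps = Z.reshape(d, rows, cols)
    r = w // 2
    # Border handling by edge replication is an assumption.
    padded = np.pad(maps, ((0, 0), (r, r), (r, r)), mode="edge")
    total = np.zeros(maps.shape, dtype=float)
    for dy in range(w):
        for dx in range(w):
            total += padded[:, dy:dy + rows, dx:dx + cols]
    total -= maps                       # leave out the pixel under test
    return (total / (w * w - 1)).reshape(d, rows * cols)

def spatial_penalty(Z, Z_bar, eta=1.0):
    """Value of the spatial term eta * ||Z - Z_bar||_F^2."""
    return float(eta * np.sum((Z - Z_bar) ** 2))
```

A coefficient vector that differs strongly from its neighborhood mean produces a large penalty, so the optimization is pushed toward spatially smooth background coefficients.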

Multiple Local Background Statistics
The abovementioned SLW_LRRSTO algorithm only utilizes a single window filter of a specified size. Ideally, the local neighborhood pixels of the single local window do not contain anomalous pixels. However, when the local neighborhood contains anomalous pixels, the detector cannot obtain ideal detection performance. To solve this problem, we proposed using multiple local windows.
In Figure 1, the purple squares represent anomalous pixels, and the white squares represent background pixels of the same category. Figure 1a is the ideal distribution of neighboring pixels, where the center pixel is the anomaly and the neighborhood pixels are the background. The mean value of the neighborhood pixels differs from the value of the center test pixel, which can improve the detection accuracy. Figure 1b shows the case in which the center pixel is the anomaly and the neighborhood also contains anomalous pixels. With the mean filter window (red box), the mean value of the neighborhood cannot detect the center pixel as the anomaly. However, with the bottom-right local window filter (green box), the neighborhood pixels do not contain the anomalous pixel, and the center pixel can be detected as an anomaly. Figure 1c shows the case in which the center pixel is a background pixel and the neighborhood pixels include an anomalous pixel in the single local window. For the single mean filter window (red box), the mean value of the neighborhood indicates that the center pixel is an anomaly. Using the bottom-right local window mean filter (green box), the neighborhood pixels do not contain the anomaly, which can prevent the center pixel from being identified as an abnormal target.
Based on the SLW_LRRSTO model, the representation coefficient spatial constraint of the single window is extended to the spatial constraint of multiple local windows. As shown in Figure 2, the size of the red single window is N × N (where N is odd). We used a small green window with side length L = (N + 1)/2 to process the data in the red single window from left to right and top to bottom, which gives (N − L + 1) × (N − L + 1) local windows.
Figure 2a shows a single window of size 5 × 5 which is expanded to nine local windows. The corresponding filter is shown in Figure 2b, where four of the nine local windows contain anomalous pixels. This differs from the single local window, in which the neighborhood pixels include two anomalous pixels. On the one hand, the multiple local window strategy can effectively reduce the number of abnormal pixels in the neighborhood. On the other hand, the situation of a local window containing abnormal pixels rarely occurs.
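The window arithmetic above can be checked with a short sketch that enumerates the small windows sliding over an N × N window from left to right and top to bottom. The function name `local_windows` and the use of top-left offsets to identify each sub-window are illustrative choices, not notation from the paper.

```python
def local_windows(n):
    """Side length L and top-left offsets of the L x L sub-windows of an n x n window."""
    if n % 2 == 0:
        raise ValueError("the single window size N must be odd")
    size = (n + 1) // 2                      # L = (N + 1) / 2
    steps = n - size + 1                     # positions per axis: N - L + 1
    offsets = [(r, c) for r in range(steps) for c in range(steps)]
    return size, offsets

size, offsets = local_windows(5)
print(size, len(offsets))   # 3 9
```

For N = 5 this gives L = 3 and nine sub-windows, matching Figure 2a. With this choice of L, every sub-window still contains the center pixel of the N × N window.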
In Formula (4), the spatial constraint of the single local window, $\eta\|Z - \bar{Z}\|_F^2$, is extended to the multiple local windows. We used the inexact augmented Lagrange multiplier (IALM) algorithm [1,17,25] to solve the MLW_LRRSTO model, introducing an auxiliary variable J. In the resulting augmented Lagrange function, $Y_1$, $Y_2$, and $Y_3$ are the Lagrange multipliers and $\mu > 0$ is the penalty parameter.
The IALM [1,17,25] leads to a multiple-variable optimization problem, which can be solved by alternately minimizing over one variable with the other variables fixed. The problem can be divided into the following subproblems:
Step 1: Fix Z and E and update J.
Step 2: Update $J_i$ in each local window and use these auxiliary variables to combine J.
Step 3: Fix J and E and update Z.
Step 4: Fix J and Z and update E.
The procedure of the proposed method is summarized in Algorithm 1, in which Θ and Ω denote the singular value thresholding operator and the $\ell_{2,1}$ minimization operator, respectively.
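The two operators named above have well-known closed forms, sketched below for reference. This is a generic illustration of singular value thresholding and column-wise $\ell_{2,1}$ shrinkage; the function names `svt` and `l21_shrink` and the threshold argument `tau` are assumptions, and how the thresholds relate to η, λ, and µ in the MLW_LRRSTO updates is not shown here.

```python
import numpy as np

def svt(A, tau):
    """Singular value thresholding: argmin_J tau*||J||_* + 0.5*||J - A||_F^2."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U * np.maximum(s - tau, 0.0)) @ Vt   # shrink singular values by tau

def l21_shrink(Q, tau):
    """Column-wise shrinkage: argmin_E tau*||E||_{2,1} + 0.5*||E - Q||_F^2."""
    norms = np.linalg.norm(Q, axis=0)            # l2 norm of each column
    scale = np.maximum(norms - tau, 0.0) / np.maximum(norms, 1e-12)
    return Q * scale                             # columns with norm <= tau become zero
```

Columns of E that survive the shrinkage correspond to pixels that the low-rank background cannot explain, which is why the sparse part carries the anomaly information.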

Algorithm 1. Inexact ALM algorithm for MLW_LRRSTO
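Input: data matrix X, size N of the single local window, parameters β > 0 and λ > 0
While not converged do
  Update J, Z, and E in turn (Steps 1-4), then update the Lagrange multipliers and the penalty parameter
End while
Output: the optimal solution (J, Z, E)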

Experiments and Analysis
To evaluate the effectiveness of the proposed methods, we conducted experiments on four hyperspectral images. The first (simulated) experiment was used to analyze the properties of the proposed methods, and the other three experiments were used to demonstrate their effectiveness. It is worth noting that all the datasets were obtained after preprocessing, including atmospheric correction.

Simulated Hyperspectral Image
The hyperspectral data used in the simulated experiment [26,27] were collected by the airborne visible/infrared imaging spectrometer (AVIRIS) over Salinas Valley, CA, USA, and can be downloaded from the GIC website [28]. The original image, comprising vegetables, bare soil, and vineyard fields, had a spatial size of 512 × 217 pixels and was made up of 224 spectral bands in the wavelength range of 370-2510 nm. In total, 202 bands were used in the experiment after the noise and water absorption bands had been removed. The image scene in pseudocolor is shown in Figure 3a. A region with a size of 120 × 120 was selected to generate the simulated data and is shown in Figure 3b. The 25 specific anomalous pixels were randomly selected from the whole image, with spectra different from those of the subimage, and are shown in Figure 3c. Figure 3d is the corresponding anomaly location map. The anomalous pixels were simulated by the target implantation method [29]. A simple linear mixture model was adopted for the implanted pixels as

$$z = f \cdot t + (1 - f) \cdot b$$

where z is the synthetic pixel spectrum, f is the abundance fraction (ranging from 0.04 to 1), t is the specific anomaly spectrum, and b is the background spectrum.
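A linear mixture of this kind is commonly written as z = f·t + (1 − f)·b. The sketch below implants a target spectrum into a background spectrum under that assumption; the function name `implant` and the example spectra are hypothetical and are not the data used in the experiment.

```python
import numpy as np

def implant(background, target, f):
    """Synthetic pixel z = f * t + (1 - f) * b for abundance fraction f in [0, 1]."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("abundance fraction must lie in [0, 1]")
    background = np.asarray(background, dtype=float)
    target = np.asarray(target, dtype=float)
    return f * target + (1.0 - f) * background

b = np.array([0.2, 0.4, 0.6])     # hypothetical background spectrum
t = np.array([1.0, 0.0, 1.0])     # hypothetical anomaly spectrum
z = implant(b, t, 0.04)           # weakest subpixel target in the stated range
```

At f = 1 the implanted pixel is a pure anomaly, while at f = 0.04 it differs only slightly from the background, which is what makes the subpixel targets hard to detect.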

Parameters Analysis
The initial choices of the parameters were important for the proposed methods, which involve two regularization parameters: η and λ. For this simulated dataset, we adopted the target embedding method to fully control the generating environment of subpixel anomalous targets. Therefore, to fully exploit the potential of the proposed methods, we found the optimal parameters by a trial-and-error method. Figure 4 shows how the detection performance changed as the parameters were changed. The range of parameters η and λ was set to 0-2, and the specific data results are shown in Table 1. The data results in Figure 4 and Table 1 show that MLW_LRRSTO was sensitive to λ. For the simulated dataset, it achieved low area under the curve (AUC) values when λ was larger than 0.5. In this range, the average AUC value for the AVIRIS_Salinas data was 0.9815. Therefore, we empirically set η = 1 and λ = 0.1 for all the datasets in our experiments.
Remote Sens. 2019, 11, 1578
The first real experimental image [1,4,30] was obtained by the HYDICE sensor over an urban area and can be downloaded from the website [31]. The image had a spectral resolution of 10 nm and a spatial resolution of 1 m. The whole image had a spatial size of 307 × 307 pixels. It comprised 210 spectral bands, and 162 bands were used in the experiment after the noise and water absorption bands had been removed. The image scene in pseudocolor is shown in Figure 5a. A region with a size of 80 × 100 pixels in the upper-rightmost area of the scene was selected as the test data and is shown in Figure 5b. The ground-truth map is shown in Figure 5c. Twenty-one pixels were anomalies, which were cars and roofs, because they had spectra that differed from the background.
The second real dataset [32][33][34] was acquired by the AVIRIS sensor over San Diego, CA, USA. This image had a spatial resolution of 3.5 m, and the whole image had a spatial size of 400 × 400 pixels. It comprised 224 spectral channels, and 186 channels were used in the experiment after the noise and water absorption bands had been removed. The image scene in pseudocolor is shown in Figure 6a. The top-left 100 × 100 region of the scene was chosen as the test image, as shown in Figure 6b. The ground-truth map is shown in Figure 6c. Fifty-eight pixels in the subimage were anomalies. The airplanes were considered as anomalies; their main bodies were pure pixels and their edges were mixed pixels.
The third real dataset was acquired during the "Viareggio 2013 trial" hyperspectral data collection campaign [35] and can be downloaded from the website [36]. Datasets in this campaign were acquired by a pushbroom hyperspectral "Sistema Iperspettrale Modulare Galileo Avionica" sensor mounted on an ultralight aircraft. In this paper, the subset of D1F12H1 was utilized. The image had a spatial size of 375 × 450 pixels and comprised 511 spectral bands ranging from 400 to 1000 nm. The spatial resolution and spectral resolution were 0.6 m and 2.3 nm, respectively. The image scene in true color is shown in Figure 7a. It covers a parking lot, several sports facility buildings, and a football field in a suburban vegetated area in Viareggio, Italy. In the ground-truth map (Figure 7b), there are three vehicles, four panels, and two reference calibration tarps with 135 anomalous pixels to be detected in the scene.

Detection Performance
The proposed methods (MLW_LRRSTO and SLW_LRRSTO) were compared to the GRXD, LRXD, WRXD, RPCA-RX, and LRRSTO. In different datasets, the window size parameters were set differently for LRXD, and the parameters of other methods remained the same. The regularization parameters for LRXD, RPCA-RX, and LRRSTO were empirical parameters in the experiments. In the LRXD, the covariance matrix size depends on the number of bands, and in order to have an accurate sample covariance matrix, the number of captured pixels should be no less than the number of bands plus one. Therefore, in the AVIRIS_Salinas, HYDICE_Urban, AVIRIS_SanDiego, and Viareggio datasets, the outer and inner window parameters $(W_{out}, W_{in})$ were set to (17, 9), (19, 5), (19, 9), and (25, 7), respectively, after extensive searching. Parameter λ of RPCA was set to 0.01 and parameter λ of LRRSTO was set to 0.1. The size of the single local window was set to 3 for MLW_LRRSTO and SLW_LRRSTO, and the tradeoff parameters were set as η = 1 and λ = 0.1 for MLW_LRRSTO and SLW_LRRSTO. It is worth noting that the detection results of all the methods in this paper are the average value of 10 repeated experiments. The parameters of each detection method were set as shown in Table 2.
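The sample-count requirement for the LRXD can be checked against the stated window sizes with a few lines of arithmetic. The sketch assumes square windows, so that the number of background pixels between the outer and inner windows is W_out² − W_in²; that counting rule and the helper name `background_samples` are assumptions, while the window sizes and band counts are the ones quoted in the text.

```python
def background_samples(w_out, w_in):
    """Pixels between a square outer window and a square inner window (assumption)."""
    return w_out ** 2 - w_in ** 2

# (W_out, W_in, number of bands used) as stated in the text.
settings = {
    "AVIRIS_Salinas": (17, 9, 202),
    "HYDICE_Urban": (19, 5, 162),
    "AVIRIS_SanDiego": (19, 9, 186),
    "Viareggio": (25, 7, 511),
}

for name, (w_out, w_in, bands) in settings.items():
    n = background_samples(w_out, w_in)
    print(name, n, n >= bands + 1)
```

Under this counting rule the four settings give 208, 336, 280, and 576 background pixels, each of which is at least the number of bands plus one.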
Table 2. Parameters of the different detection methods.

Method
Parameters

AVIRIS_Salinas Experiment
Figure 8 shows the receiver operating characteristic (ROC) curves [37]. An important observation from Figure 8 is that the performance of RPCA was clearly better than that of GRXD, LRXD, and WRXD. Compared with LRRSTO, the proposed SLW_LRRSTO and MLW_LRRSTO had a higher probability of detection at low false alarm rates. MLW_LRRSTO was the best method in terms of the overall detection performance. Furthermore, we also computed the AUC of the ROC [37] to evaluate the performance of these methods. The results are shown in Table 3, where the proposed MLW_LRRSTO achieved the highest score, followed by SLW_LRRSTO with the second-highest score. Table 3 also lists the execution time of the different methods. The GRXD and WRXD had shorter computational times, but their detection rates were lower than those of RPCA-RX and LRRSTO. RPCA-RX and LRRSTO took less time than MLW_LRRSTO, but SLW_LRRSTO and MLW_LRRSTO had higher detection rates. SLW_LRRSTO and MLW_LRRSTO required less computational time than LRXD, in which the inversion for the local background requires heavy computation.
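For reference, an ROC curve and its AUC can be computed from a detection map and a ground-truth mask as in the following sketch. The function names, the treatment of every pixel score as its own threshold (tied scores are not merged), and the trapezoidal integration are assumptions for illustration; the paper does not describe how its ROC curves were computed.

```python
import numpy as np

def roc_curve(scores, truth):
    """Probability of detection vs. false alarm rate over all score thresholds."""
    scores = np.asarray(scores, dtype=float).ravel()
    truth = np.asarray(truth, dtype=bool).ravel()
    order = np.argsort(-scores, kind="stable")    # highest score first
    hits = truth[order]
    tp = np.cumsum(hits)                          # detected anomalies so far
    fp = np.cumsum(~hits)                         # false alarms so far
    pd = np.concatenate(([0.0], tp / truth.sum()))
    pfa = np.concatenate(([0.0], fp / (~truth).sum()))
    return pfa, pd

def auc(pfa, pd):
    """Area under the ROC curve by the trapezoidal rule."""
    return float(np.sum(np.diff(pfa) * (pd[1:] + pd[:-1]) / 2.0))

pfa, pd = roc_curve([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0])
print(auc(pfa, pd))   # 1.0
```

An AUC of 1 means every anomalous pixel scores higher than every background pixel, and 0.5 corresponds to a detector that ranks pixels at random.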
The ROC curves and AUC values are shown in Figure 9 and Table 4, respectively. The proposed MLW_LRRSTO and SLW_LRRSTO generated a higher probability of detection when the false alarm rate was low, and they had similar ROC curves. SLW_LRRSTO achieved the highest AUC score among all the detectors, and MLW_LRRSTO obtained the second-highest score. Therefore, it can be concluded that MLW_LRRSTO and SLW_LRRSTO were more effective methods than GRXD, LRXD, WRXD, RPCA, and LRRSTO for the detection of anomalous pixels in the HYDICE_Urban dataset. As shown in Table 3, according to the computational time of the different methods, GRXD, WRXD, and RPCA-RX were relatively fast and LRXD took longer to calculate. MLW_LRRSTO took the longest time, because the use of multiple local windows requires heavy computation, but the proposed MLW_LRRSTO and SLW_LRRSTO generated higher detection rates than the others.
The ROC curves of all the methods are shown in Figure 10. Compared with LRXD, the proposed MLW_LRRSTO exhibited a slightly lower probability of detection for a low false alarm rate. Compared to SLW_LRRSTO, MLW_LRRSTO exhibited a higher probability of detection in terms of the overall detection performance, and it achieved the highest probability of detection for all the false alarm rate values. The AUC values and computational time of the different methods are shown in Table 5, where the GRXD, WRXD, and RPCA-RX had shorter computational times than the others, and the MLW_LRRSTO took the longest time to calculate. However, the proposed MLW_LRRSTO yielded a score that was higher than that of the others. Compared with LRRSTO, the SLW_LRRSTO had a shorter computational time and a higher detection rate, which is a significant improvement.
The ROC curves of all the methods are shown in Figure 11. Compared with LRRSTO, the proposed SLW_LRRSTO and MLW_LRRSTO exhibited a slightly lower probability of detection for a low false alarm rate. From the AUC values in Table 6, we can conclude that the GRXD and RPCA-RX had lower detection accuracy. Compared with the others, SLW_LRRSTO and MLW_LRRSTO yielded higher scores, which indicates that the two proposed methods had a significant improvement. From the computational time in Table 6, we can see that the GRXD and WRXD had shorter computational times than the others, and the LRXD had the longest time, which may have been caused by the large window size or the inversion for the local background. The computational time of SLW_LRRSTO and MLW_LRRSTO was relatively longer, but the detection accuracy was improved significantly compared with the other methods.
methods had a significant improvement.From the computational time in Table 6, we can see that the GRXD and WRXD had shorter computational time than the others, and the LRXD had the longest time, which may have been caused by the large window size or the inverse function of local background.The computational time of SLW_LRRSTO and MLW_LRRSTO was relatively longer, but the detection accuracy was improved significantly compared with the other methods.The ROC curves of all the methods are shown in Figure 11.Compared with LRRSTO, the proposed SLW_LRRSTO and MLW_LRRSTO exhibited a slightly lower probability of detection for a low false alarm rate.From the AUC values in Table 6, we can conclude that the GRXD and RPCA_RX had lower detection accuracy.Compared with the others, SLW_LRRSTO and MLW_LRRSTO yielded higher scores, which indicates that the two proposed methods had a significant improvement.From the computational time in Table 6, we can see that the GRXD and WRXD had shorter computational time than the others, and the LRXD had the longest time, which may have been caused by the large window size or the inverse function of local background.The computational time of SLW_LRRSTO and MLW_LRRSTO was relatively longer, but the detection accuracy was improved significantly compared with the other methods.

Discussion
To verify the effectiveness of the proposed methods, we conducted our experiments on three hyperspectral images which are widely used in the field of anomaly detection and compared the results with five other methods: GRXD, LRXD, WRXD, RPCA_RX, and LRRSTO.
In the proposed SLW_LRRSTO method, to impose a spatial constraint, we smoothed the LRR coefficients of the central testing pixel using a single local window. However, the spatial constraint obtained with a single local window was strongly affected by anomalous pixels in the neighborhood. Therefore, to reduce the impact of anomalous pixels in the neighborhood, we proposed the MLW_LRRSTO algorithm, which adopts multiple local windows to constrain the coefficients spatially.
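The single-local-window smoothing described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that the smoothing is a plain average of the coefficient vectors inside an N x N window; the actual filter weights used by SLW_LRRSTO are not restated in this section, and the function name `smooth_coefficients` and the edge-replication padding are hypothetical choices.

```python
import numpy as np


def smooth_coefficients(coeffs, window=5):
    """Average each pixel's coefficient vector over a window x window neighborhood.

    coeffs: array of shape (rows, cols, k), one length-k coefficient vector per pixel.
    window: odd side length of the local window (N = 5 in Figure 2).
    NOTE: uniform averaging is an assumption made for this sketch.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd so that it has a center pixel")
    half = window // 2
    rows, cols, _ = coeffs.shape

    # Replicate border pixels so that every window is fully populated.
    padded = np.pad(coeffs, ((half, half), (half, half), (0, 0)), mode="edge")

    # Sum the window x window shifted copies of the coefficient map.
    smoothed = np.zeros(coeffs.shape, dtype=float)
    for dr in range(window):
        for dc in range(window):
            smoothed += padded[dr:dr + rows, dc:dc + cols, :]
    return smoothed / (window * window)
```

With a uniform window, a single anomalous coefficient vector contributes to the average of every pixel whose window contains it, which is the contamination effect that motivates the multiple-local-window strategy.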
For a numerical comparison, we used the ROC curves and AUC values as the main criteria to evaluate the detection results. From the ROC curves shown in Figures 8-11 and the AUC values shown in Tables 3-6, it can be seen that the detection accuracy of the two proposed methods was clearly better than that of the other methods. Based on these experimental results, we can conclude that the two proposed methods showed superior detection performance compared with the others.
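As a minimal sketch of how these two criteria are typically computed, the following code builds an ROC curve by sweeping a threshold over the detector output and integrates it with the trapezoidal rule. The function names `roc_curve` and `auc` are illustrative, and the sketch assumes that a higher score means a more anomalous pixel; it is not the evaluation code used in the experiments.

```python
def roc_curve(scores, labels):
    """Return (false alarm rates, detection probabilities) over all thresholds.

    scores: detector output per pixel (higher = more anomalous, by assumption).
    labels: ground-truth flags per pixel (1 = anomaly, 0 = background).
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one anomaly and one background pixel")

    # Sort pixel indices from most to least anomalous.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    far, pd = [0.0], [0.0]
    tp = fp = 0
    for rank, i in enumerate(order):
        if labels[i]:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last pixel in a group of tied scores.
        is_last = rank + 1 == len(order)
        if is_last or scores[order[rank + 1]] != scores[i]:
            far.append(fp / n_neg)
            pd.append(tp / n_pos)
    return far, pd


def auc(far, pd):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((far[k] - far[k - 1]) * (pd[k] + pd[k - 1]) / 2.0
               for k in range(1, len(far)))
```

For a detection map and a ground-truth map, both would be flattened to one value per pixel before calling `roc_curve`; an AUC of 1 corresponds to a perfect ranking of anomalies above background, and 0.5 to a ranking no better than chance.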
Although the proposed methods yielded strong detection performance, some shortcomings remain. The proposed methods involve two regularization parameters, η and λ, which have an important influence on the detection results. Although the optimal parameters can be found by trial and error in synthetic data experiments, in practical applications we can neither obtain a priori knowledge of the anomalous objects nor find the optimal parameters by trial and error. Therefore, only empirical parameters can be used. Moreover, the matrix decomposition is very time consuming, and it is important to speed up the algorithm.
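The trial-and-error search on synthetic data can be sketched as an exhaustive grid search that keeps the (η, λ) pair with the highest AUC. Everything named here is hypothetical: `detector` stands for any callable that returns per-pixel anomaly scores for given parameters, and `pairwise_auc` is a simple quadratic-time AUC suitable only for small examples. The sketch needs ground-truth labels, which is why it does not carry over to practical applications.

```python
from itertools import product


def pairwise_auc(scores, labels):
    """AUC as the fraction of (anomaly, background) pairs ranked correctly.

    Ties count as half a correct pair. Cost is quadratic in the pixel count.
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def grid_search(detector, data, labels, etas, lams):
    """Try every (eta, lam) pair and return (best_auc, best_eta, best_lam).

    detector: hypothetical callable detector(data, eta, lam) -> per-pixel scores.
    """
    best = None
    for eta, lam in product(etas, lams):
        score = pairwise_auc(detector(data, eta, lam), labels)
        if best is None or score > best[0]:
            best = (score, eta, lam)
    return best
```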

Conclusions
In this paper, two novel anomaly detection methods for HSIs have been proposed. The first approach is the SLW_LRRSTO anomaly detection method, which is based on LRRSTO and combines spectral and spatial information. SLW_LRRSTO adds a spatial constraint to the low-rank representation coefficients and smooths the coefficients of the central testing pixel using a single local window. The second approach is the MLW_LRRSTO method, which uses the same model as the SLW_LRRSTO method but employs a multiple local window smoothing filter strategy. To confirm the effectiveness of the proposed methods, experiments were conducted on both simulated and real hyperspectral data in comparison with other detection methods. The experimental results confirmed that SLW_LRRSTO and MLW_LRRSTO can effectively separate anomalous targets from the background, and their detection performance was clearly better than that of the other detection methods. In particular, MLW_LRRSTO outperformed SLW_LRRSTO and offered the highest detection accuracy.
However, some aspects can still be improved. Future research should focus on how to automatically tune the parameters of the proposed SLW_LRRSTO and MLW_LRRSTO methods and on how to further improve the speed of the algorithms.

Figure 1. Analysis between the center test pixel and neighboring pixels in a single local window. The purple squares represent anomalous pixels, and the white squares represent the same category background pixels: (a) ideal distribution of neighboring pixels; (b) the neighborhood contains anomalous pixels; (c) the center pixel is a background pixel.

Figure 2. Schematic diagram of a single local window and multiple local windows and the corresponding filters (N = 5): (a) single local window; (b) multiple local windows.

where M is the number of multiple local windows and I is the identity matrix. We defined $I = [I, I, \cdots, I]^{T}$ and

Figure 4. Results with different parameters for AVIRIS_Salinas dataset.

Figure 5. HYDICE_Urban dataset: (a) false-color image of the whole scene; (b) false-color image of the detection area; (c) the ground-truth map.

Figure 6. AVIRIS_SanDiego dataset: (a) false-color image of the whole scene; (b) false-color image of the detection area; (c) the ground-truth map.

Figure 8. Receiver operating characteristic (ROC) evaluation of the different methods for the AVIRIS_Salinas data.

Figure 9. ROC evaluation of the different methods for the HYDICE_urban dataset.

Figure 10. ROC evaluation of the different methods for the AVIRIS_SanDiego dataset.

Figure 11. ROC evaluation of the different methods for the Viareggio dataset.

Table 1. Results with different parameters for the AVIRIS_Salinas data.

Table 3. Area under the curve (AUC) values for the different detectors with the AVIRIS_Salinas dataset.

Table 4. AUC values for the different detectors with the HYDICE_urban dataset.

Table 5. AUC values for the different detectors with the AVIRIS_SanDiego dataset.

Table 6. AUC values for the different detectors with the Viareggio dataset.
