Recursive Local Summation of RX Detection for Hyperspectral Image Using Sliding Windows

: Anomaly detection has received considerable interest for hyperspectral data exploitation due to its high spectral resolution. Fast processing and good detection performance are practically signiﬁcant in real world problems. Aiming at these requirements, this paper develops a recursive local summation RX anomaly detection approach by virtue of sliding windows. This paper develops a recursive local summation RX anomaly detection approach by virtue of sliding windows. A causal sample covariance/correlation matrix is derived for local window background. As for the real-time sliding windows, the Woodbury identity is used in recursive update equations, which could avoid the calculation of historical information and thus speed up the processing. Furthermore, a background suppression algorithm is also proposed in this paper, which removes the current under test pixel from the recursively update processing. Experiments are implemented on a real hyperspectral image. The experiment results demonstrate that the proposed anomaly detector outperforms the traditional real-time local background detector and has a signiﬁcant speed-up effect on calculation time compared with the traditional detectors.


Introduction
Attributed to the high spectral resolution, hyperspectral images are now capable of uncovering many subtle signal sources that cannot be known by prior knowledge or be visually inspected by image analysts [1,2].Signal sources appear as anomalies in the data, such as unexpected presence, low probability of occurrence, small sample population whose signature is spectrally distinct from spectral signatures of its surrounding data samples.As a result, anomaly detection has received considerable interest in hyperspectral imaging in the last twenty years [3][4][5][6].
The RX detector developed by Reed and Yu [3] is acknowledged to be the most widely used anomaly detector.The classic RX algorithm is based on the global sample covariance matrix K, and is referred to as K-RXD.Since then, many RX-like anomaly detectors have been proposed [7][8][9][10][11][12][13].
Of particular interest are RXD using global sample correlation matrix R (R-RXD) [7,8], and RXD based on local background covariance matrix (L-RXD) [9].The L-RXD uses not only spectral information but also spatial information to bring benefit for detection performance [10].However, it may fail to obtain the best detection performance due to the penuriousness and unicity of local background distribution in every local window.A local summation anomaly detection (LSAD) is proposed in [13] by combining multiple local neighboring distributions of the pixel under test to get better performance.LSAD can be considered as a local summation RXD (LS-RXD) using subspace feature projection for the stable local covariance estimation.
The hyperspectral remote sensing has developed rapidly in recent years, but as the satellite relocation cycle becomes shorter, some new problems come out.For instance, the massive data has brought some challenges to the data transmission and storage.Moreover, for the anomaly detection problem, the anomalies such as moving targets may show up for a short time and disappear quickly.In this case, timely detection is necessary.However, data transmission is quite time-consuming, to achieve timely detection, developing the recursive anomaly detection algorithms is important and necessary.Recently, several real-time anomaly detection methods [14][15][16][17][18][19] have been proposed.Specifically, real-time causal process of K-RXD and R-RXD detector (called as RT-CK-RXD, RT-CR-RXD) were developed in [14].The real-time R-RXD and constrained energy minimization (CEM) are optimized and integrated in a dual-mode parallel Field-Programmable Gate Array (FPGA) based hardware platform in [16].Unlike the RT-CR-RXD, in the FPGA-based implementations, each pixel under test is located in the middle region of the background, which can improve the performance of target detection.The computational performance of real-time causal linewise progressive anomaly detection (RCLPAD) based on Cholesky decomposition along with linear system solving were developed in [17].An advanced anomaly detector using causal sliding array windows to capture local autocorrelation matrix statistics in the sense of causality was developed (CSA-RXD) [18], by virtue of causal sliding windows, a causal sample correlation matrix can be derived for causal anomaly detection.Recursive update equations are also derived and thus speed up real-time processing.A real-time L-RXD using the local casual square window is proposed in [19].However, the method proposed in [19] still needs to calculate the inverse of a matrix to detect each pixel.Compared with sliding array window, setting a sliding square window usually contains much more spectral-spatial integration information.This paper addresses this issue and further develops the recursive processing for LS-RXD based on sliding square window.The contribution of this work is based on two points: a recursive version of LS-RXD according to a causal relation from the Woodbury identity, which reduce the runtime; and a background suppression algorithm integrated with the recursive procedure, which improves the detection accuracy.
The rest of the paper is organized as follows.In Section 2, several related RX anomaly detectors are briefly covered.Section 3 provides the design of recursive sliding window detector.Section 4 demonstrates the experiments of the proposed algorithm compared with some traditional anomaly detection algorithms.Finally, Section 5 draws our conclusions.

Related Anomaly Detectors
In this section, we provide a short overview of K-RXD, L-RXD and LS-RXD.Assume that {r i } N i=1 is a set of data sample vectors, and r i = (r i1 , r i2 , .., r iL ) T is the ith data sample vector, where L is the total number of spectral bands.

K-RXD
The K-RXD, denoted by δ K−RXD (r), is specified as follows: where µ = (1/N) ∑ N i=1 r i is the global sample mean and T is the sample data covariance.The form of δ K−RXD in (1) is actually the well know Mahalanobis distance between the data sample being detected and global sample mean.It should be pointed out that the model assumes the data arise from two normal probability density functions with the same covariance matrix but different means.

L-RXD
Local anomaly detection is very important since the global RX anomaly detector fails to work when the anomalies are relatively small or only distinct from the local surroundings, but buried in the global background.The most widely used local anomaly detection algorithm is derived from the commonly used RXD, named as local-RX detector (L-RXD).The L-RXD, denoted by δ L−RXD (r), is specified by: where µ W is the local sample mean of a square window of size ω × ω pixels, centered at pixel r and Σ W is the background data sample covariance matrix of the local window W.
For L-RXD, a window of the selected size should be chosen firstly.The window size should not be too large or too small to obtain considerable background estimation.

LS-RXD
The traditional L-RXD exploits only one sliding window to estimate the neighborhood background statistic for the pixel under test.It is difficult to detect a multi-pixels anomaly target by L-RXD if the local distributions of some windows are mostly occupied by anomaly pixels because the background statistic will be contaminated seriously by anomaly pixels.In order to solve this problem, a local summation RX detector is proposed in [13].As illustrated in Figure 1, nine local windows will be taken for the pixel under test, represented by a yellow pixel if the local window is chosen to be 3 × 3 size.For an ω × ω size local window, the sliding filter contains ω × ω local windows for summation.The summation detector result for the pixel under test r is specified by where W i is the local pixel samples dataset from window i, µ W i and Σ W i are the mean vector and covariance matrix of W i , respectively.Suppose that the pixel samples dataset in the local window is denoted as W = {r p ij }, where i = 1, 2, .., ω, j = 1, 2, .., ω and p ij is the global location of r p ij in the whole data set {r i } N i=1 .As a matter of fact, the LS-RXD specified by (3) can be implemented by recursively updating the detection result of each pixel in W as the window is sliding, that is where µ W and Σ W are the mean vector and covariance matrix of W, δ t LS−RXD and δ t+1 LS−RXD are the t, t + 1 times updated detection result, respectively.
In doing so, the only difference between L-RXD and LS-RXD is that as the local window is sliding; only the detection result of the centered pixel in the local window is calculated by L-RXD, while the detection results of all ω × ω pixels in the local window are updated by LS-RXD.
It is worth noting that the local summation RX detector in [13] is called as LSAD for short.Subspace feature projection is used in LSAD to approximately calculate the Σ W i −1 in Equation 3 to enable LSAD with robust background feature statistics.However, it is difficult to realize a timely process due to the subspace feature projection in practice.Band selection onboard before data transmission is feasible to avert the singularity of an inversed local covariance.Therefore, we only focus on the recursive process of L-RXD and LS-RXD in the following.

Recursive LS-RXD
In the aforementioned local summation detection algorithms, a new local covariance matrix inversion is repeatedly calculated as the local window slides.The key issue of the recursive process of L-RXD and LS-RXD is how to perform a recursive computation for every independent covariance matrix inversion.
In what follows, we describe how to calculate the covariance matrix inversion of a casual sliding array window recursively.

The Covariance Matrix Inversion of Causal Sliding Array Window
Figure 2 shows the causal sliding array window at r n−1 depicted by dotted lines and the causal sliding array window at r n depicted by dashed lines, where a is the array window size.The farthest pixel r n−a from r n in the causal sliding array window at r n is removed from the causal sliding array window at r n , while the most recent data sample vector r n is added to the causal sliding array window at r n+1 .Defining R a (n) = (1/a) ∑ r i ∈W r i r T i , where W = {r i } n i=n−a+1 .R a (n) is called the "causal" sample auto correction matrix correlation matrix, and is formed by data sample vectors in the causal sliding array window.Then R a (n) can further be expressed as By repeatedly use of the following Woodbury matrix identity [20] twice: the inverse of R a (n) can be updated recursively via R −1 a (n) by virtue of ( 7) and ( 8) The "causal" covariance matrix formed by all the data sample vectors in the sliding array window can be specified by where is the "causal" sample mean of sliding array window.Using Woodbury matrix identity again, by letting By virtue of ( 7), ( 8), ( 10) and ( 11), K −1 a (n) can be updated recursively by R −1 a (n − 1) and µ a (n − 1), via deleting the pixel r n−a and adding the current pixel r n .

Recursive Processing of the Covariance Matrix Inversion of Sliding Window
Figure 3 illustrates two continually sliding windows with size of ω × ω depicted by black dashed lines and orange dashed lines, respectively, where r p ωω +1 denotes the most recent received data sample vector.The sample data vectors update process in sliding windows can be implemented in ω steps by removing one pixel and adding one pixel each step.Suppose that p ωω = n − 1, the inverses of correlation matrices of the local window at r p ωω and r p ωω +1 are denoted as R −1 ω 2 (n − 1) and R −1 ω 2 (n) respectively, and inverses of the covariance matrices of the local window at r p ωω and r p ωω +1 are denoted as In analogy with ( 7), ( 8), (10), and (11), K −1 ω 2 (n) can be updated recursively as follows.

Recursive Processing of LS-RXD
Except for the recursive processing of the covariance matrix inversion of the sliding window, some other issues should also be considered.
The first issue is the edge expansion.To ensure that there is no absence of detection on the edge of an image, the edge expansion is usually operated as a preprocessing for a local window detector.Due to the low probability of anomaly targets appearance in hyperspectral images, enplaned layers can be randomly chosen from the whole image [13].With this consideration in mind, take the window with size of 3 × 3 as an example.We design the sliding window strategy, depicted in Figure 4, where the yellow, blue and purple grids, respectively, denote the latest pixel received, the processed data and the pixels to be processed.As Figure 4 shows, when the sliding window meets the right board of the hyperspectral image, the next several sliding windows are across the border by moving down one line and adding new data one by one.The last sliding window moves to the right-bottom until the last sample data r Row×Col is received.This design enables the recursive processing of LS-RXD more conveniently.The second issue is how to keep track of which data sample vector should be removed and which data sample vector should be added as a matrix window moves on.Let W = {r p ij } ω i,j=1 denote the local sliding window in an image with size Row × Col, where p ij is the global location of r p ij in the whole data set {r i } N i=1 .For the first local window, p ij can be expressed as p ij = x (i−1)×Col+j .Using the strategy of Figure 4, it is very easy to update the global location of pixels in the follow-up window as p ij = p ij+1 successively.
After the aforementioned issues are solved, the recursive LS-RXD, called as R-LS-RXD can be obtained by Three comments are worthwhile: 1.It is important to note that, using the strategy of Figure 4, the updating counts of the detection value for the pixel located in several top and bottom lines of the image are less than ω.This will result in the whole detection result being inconsistent.To solve this problem, the updating number of each pixel is counted, which is denoted as N p ij , and finally the detection result is obtained as To avoid the singularity problem of calculating the inverse of the sample correlation and covariance matrix used by anomaly detectors, ω × ω must at least equal to or greater than the total number of spectral bands [13,18].3. The whole design procedure is also suitable for recursive L-RXD which is not included here.

Background Suppression of Sliding Windows
This section mainly discusses the background suppression sliding window furthermore.It is not convenient to set the current under test pixel to conclude in the local window background with other data samples, because it will reduce the separation between background information and anomaly information separation while the current under test pixel is anomaly [21].In order to suppress the background information and improve the detection performance, we need to remove the current under test pixel (r k ) from the recursive update processing.
Assume that R k is the correlation matrix removed r k , and R k is specified by Once using Woodbury matrix identity, letting Assume that µ k is the mean vector of background sample data removed r k , the inverse of covariance matrix could be specified by As a result, the background suppression recursive R-BS-LS-RXD can be specified by

Computational Complexity Analysis
This section provides a detailed analysis on the computational complexity of calculating recursive update Equations ( 12)- (15).The advantage of using causal sliding windows over local windows is the use of recursive Equations ( 12) and (13), where the Woodbury identity is implemented twice, instead of recalculating each time as long as a new data sample vector comes in.Table 1 shows the computation complexity of matrix algebra.Based on the information in Table 1, the matrix inversion computation complexity is higher than the matrix multiplication computation.The usage frequency of the Woodbury matrix identity is determined by the size of the sliding window.Local background information is updated by calculating ω times of Equations ( 12) and ( 13), regardless of the number of pixels in the local background.Such a significant benefit arises from the recursive specialty in ( 12) and ( 13).Hence, the computational complexity of processing a single local window specified by its window size ω × ω requires ω times calculations of matrix multiplication.In addition, it only needs to calculate the inverse of the covariance matrix once.
Table 2 tabulates the number of floating operations (flops) required for LS-RXD and R-LS-RXD, which update K −1 a (n) in different method, where the bands number is specified by L, local window size is specified by a = ω × ω, and the pixels number to be processed is specified by N.These parameters determine the number of flops in the algorithm.Figure 5 plots the number of floating operations required for every algorithm versus L, a and N. The configurations of parameters are shown in Table 3.

LS-RXD R-LS-RXD
Operator Initialization Input r n Input r n µ a K K −1 µ a Equation( 7) Equation( 8) Table 3. Configuration of The Parameters.As shown in Figure 5, the comparison of different anomaly detectors depends on the specific configuration of the parameters.Generally speaking, R-LS-RXD is faster than LS-RXD.The number of floating operations The number of floating operations

Results and Discussion
To demonstrate the performance of anomaly detection using recursive local summation RXD, two real hyperspectral image scenes were conducted for experiments.The first image data set is the Airborne Visible/Infrared Imaging Spectrometer (AVIRIS) image scene of Sandiego airport area which is located in California.A sub-image with size 100 × 100 along with its ground truth are shown in Figure 6a,b, respectively.It was acquired through 224 spectral bands with a spectral coverage from 0.4 to 2.5 µm where the spatial resolution is 3 m and spectral resolution is 20 nm.After removing low signal-to-noise ratio (SNR) and water absorption bands, a total of 126 spectral bands were used for experiments.In order to quantitatively evaluate detection performance, receiver operating characteristic (ROC) curves are used to compare the different algorithms.Based on the provided ground truth, we can perform an analysis via ROC curves of the false alarm ratio (P f ) versus the detection ratio (Pd) by taking all the possible thresholds (τ).We can further calculate the area under the ROC curve (AUC) for a quantitative performance analysis.The algorithm with a larger AUC value is regarded as a better performance.
Traditional ROC curves is a 2D plot represented by values of P f and Pd.Furthermore, we can plot another 2D ROC curve of P f and τ, which provides crucial information of progressive background suppression as the threshold τ varies.when it comes to the interpretation of anomaly detection by visual inspection with no availability of ground truth or AUC values with similar performance.
Three experiments are conducted with the purpose of: (1) evaluating the influence of window size on the detection performance of R-LS-RXD; (2) comparing the detection performance of different algorithms; and (3) comparing computing times of different algorithms, respectively.

Optimum Size of Sliding Window
Band selection is very practical in anomaly detection [22,23]; nine bands are selected by signal-to-noise ratio estimation and maximal information (SNRE-MI) [23] in the experiment to obtain better result.To investigate the influence of window size on detection performance of R-LS-RXD, two hyperspectral images of different sensors (AVIRIS and HYDICE, respectively) in the previous section are used for experiments, the size of sliding window varies from 5 × 5 up to 17 × 17 with steps of two pixels side width.Figure 8a-g and Figure 9a-g show their detection abundance fractional maps with their detected abundance fractions in gray scale of AVIRIS and HYDICE hyperspectral image, respectively.According to the experiment, the detection result is poor with a window size of 5 × 5, where the background and anomaly are difficult to separate for both sensors.Additionally, the performance begins to improve as the window size increases.When the window size is greater than or equal to 11 × 11, detection performances are similar by visual inspection as shown in Figures 8 and 9. Figures 8h and 9h show the results of global background K-RXD detector for comparison.In order for a further quantitative evaluation of detection performance with different window sizes, the ROC curves are implemented.To simplify our study, ROC curves for HYDICE data are not given, since the results are similar for both data sources.Figure 10 shows the ROC curves for AVIRIS data with different window sizes, with a traditional (Pd, P f ) ROC in (a) and a (P f , τ) curve analysis in (b), respectively.Additionally, the AUC values, denoted by A z , are calculated for each (Pd, P f ) curves and (P f , τ) curves.In general, the higher the value of A z (Pd, P f ) and the lower the value of A z (P f , τ), the better the detection performance is.Results are tableted in Table 4, with the best results highlighted.Based on the result of Figure 10 and Table 4, as the window size goes up, the values of A z (Pd, P f ) are increased while the values of A z (P f , τ) are decreased.The detector reaches the best detection power in size 13 × 13, and the best background suppression performs the best with size 17 × 17.However, the trend of A z (P f , τ) decreasing obviously slows down when the window size increases from 13 × 13 to 15 × 15.When it comes to the global size background, the value of A z (Pd, P f ) decreases to an untrustworthy value and is difficult to be distinguished by visual inspection in Figures 8h and 9h.
The conclusions for the experiment are as follows.As with the size of window increases, the sliding window RXD window RXD detector obtains better detection performances.However, 13 × 13 is the optimum size for a Sandiego hyperspectral image.As an alternative interpretation, although a larger window size results in better background suppression, the detection performance is much more important in the detector evaluation.

Performance Evaluation for Different Algorithms
In this section, we compare the detection performance of the LS-RXD, causal sliding array window (CSA-RXD) [18], proposed R-LS-RXD and R-BS-LS-RXD.In order to obtain the best detection results, the sliding window is implemented with size of 13 × 13 for both AVIRIS and HYDICE hyperspectral images.
Detection results of the four detectors using AVIRIS and HYDICE data are shown in Figures 11 and 12, respectively.The first line shows the gray scale results with detected abundance fractions, and the second line demonstrates the binary detection maps separated in an appropriate threshold, which was calculated by Otsu algorithm [24].Both hyperspectral images of different sensors came to the same conclusions as follows, showing the adaptation of proposed algorithms for different sensors.It can be found obviously from the detection results that CSA-RXD, which merely take partial advantage of spectral-spatial integration information, omit number targets by visual inspection as shown in Figures 11d and 12d.On the contrary, other anomaly detectors, which are implemented with spectral-spatial integrated information can acquire excellent detection performance.The maximum detection of ground target shows in Figures 6b and 7b can be detected by LS-RXD, R-LS-RXD and R-BS-LS-RXD by visual inspection in Figures 11a-c and 12a-c.As also shown in the figure, R-BS-LS-RXD gets better background suppression compared with R-LS-RXD and LS-RXD.This indicates that R-BS-LS-RXD can not only correctly detect anomaly target pixels as R-LS-RXD performs, but also acquires excellent background suppression as CSA-RXD performs.
Similarly, to simplify our study, a quantitative evaluation with traditional ROC curves, (P f , τ) curves, is demonstrated in Figure 13.AUC values are listed in Table 5, only for AVIRIS Sandiego data, since the results are similar for both data sources.It is interesting to note that the ROC curves of LS-RXD, R-LS-RXD and R-BS-LS-RXD are overlapped completely.This indicates that these algorithms get similar detection power from the traditional ROC curve analysis.Meanwhile, R-BS-LS-RXD gets a better performance in background suppression as the (P f , τ) curve shows.It is clearly shown that anomaly detectors with spectral-spatial integration have better performance, where the ROC curves of LS-RXD, R-LS-RXD and R-BS-LS-RXD are much closer to the upper left corner than CSA-RXD.
AUC values tablet in Table 5 prove that the proposed R-LS-RXD and R-BS-LS-RXD get a similar detection performance with LS-RXD.In addition, A z (Pd, P f ) of LS-RXD, R-LS-RXD and R-BS-LS-RXD is greater than CSA-RXD.By contrast, R-BS-LS-RXD produced lowest value of A z (P f , τ).In general, R-BS-LS-RXD can suppress the background information and improve the detection performance.

Computing Time Comparison for Different Algorithms
In order to verify the computing effectiveness of recursive LS-RXD, we design a comprehensive comparative analysis on the computer processing time (CPT) of R-LS-RXD and LS-RXD.The computer environments used for the experiments are 64-bit operating systems with Intel i5-4590, a central processing unit (CPU) of 3.3 GHz, and 8 GB of random access memory (RAM).In order to remove the pulse error caused by the computer itself, the following data on complexity analyses are averaged after five experiments.Based on the results in Table 6, the computing time of R-LS-RXD is significantly less than LS-RXD in every window size.In addition, the acceleration is particularly noticeable when the window is in a small-scale.In the experiment, the speedup ratio is up to three when the window size is chosen as 7 × 7.As the window size grows up, the speedup ratio remains, at least, two.
To further evaluate computational complexity, Figure 14 plots the computing time versus the number of processed pixels for both R-LS-RXD and LS-RXD on the Sandiego hyperspectral image.Each algorithm was run and executed five times to produce an average computing time.As we can see, R-LS-RXD requires less time than LS-RXD does due to the fact that the former implements a recursive process, while the latter implements a nonrecursive process.As also shown in the figure, the computing time increases linearly as new pixels are added.

Conclusions
This paper proposes a recursive local summation RX algorithm for hyperspectral anomaly detection based on sliding window processing.In order for a fast implementation of a sliding window detector, a recursive update equation for the inversion of local background covariance matrices is developed.In addition, a background suppression R-BS-LS-RXD detector is also proposed in this paper, which removes the current under test pixel from the recursively update processing.This method exploits a local summation strategy in a sliding window, which could sum multiple correlated local background statistics to suppress the major background.The real hyperspectral image experiments have proven that the R-LS-RXD and LS-RXD obtain similar detection performances, which can be competitive with that of CSA-RXD based on sliding array window background.To investigate the computational complexity issue, a comprehensive comparative analysis on the CPT of running recursive updating sliding window detector and un-recursive updating method is conducted in theory and experiments.The result shows R-LS-RXD has a significant acceleration effect for calculation.Our future work mainly focuses on deriving real-time progressive processing of anomaly detection for hyperspectral imagery that was acquired by other data formats.
Figure 1 takes 3 × 3 size multiple local windows to demonstrate the implementation of the local summation strategy.

Figure 2 .
Figure 2. Casual sliding array window at r n with width specified by a.

Figure 10 .
Figure 10.Receiver operating characteristic (ROC) curves analysis for AVIRIS data with different window size: (a) curve of (Pd, P f ); (b) curve of (P f , τ)

Table 1 .
Computation Complexity of Matrix Algebra.

Table 4 .
Area under the ROC curve (AUC) values of (Pd, P f ) and (P f , τ) with different window sizes

Table 5 .
AUC values with different detectors.
AlgorithmLS-RXD R-LS-RXD R-BS-LS-RXD CSA-RXDA z (Pd, P f ) Table 6 tablets the computing time of algorithms with different window sizes in San Diego hyperspectral image.