Change Detection in Multispectral Remote Sensing Images with Leader Intelligence PSO and NSCT Feature Fusion

: Change detection (CD) using Remote sensing images have been a challenging problem over the years. Particularly in the unsupervised domain it is even more difﬁcult. A novel automatic change detection technique in the unsupervised framework is proposed to address the real challenges involved in remote sensing change detection. As the accuracy of change map is highly dependent on quality of difference image (DI), a set of Normalized difference images and a complementary set of Normalized Ratio images are fused in the Nonsubsampled Contourlet Transform (NSCT) domain to generate high quality difference images. The NSCT is chosen as it is efﬁcient in suppressing noise by utilizing its unique characteristics such as multidirectionality and shift-invariance that are suitable for change detection. The low frequency sub bands are fused by averaging to combine the complementary information in the two DIs, and, the higher frequency sub bands are merged by minimum energy rule, for preserving the edges and salient features in the image. By employing a novel Particle Swarm Optimization algorithm with Leader Intelligence (LIPSO), change maps are generated from fused sub bands in two different ways: (i) single spectral band, and (ii) combination of spectral bands. In LIPSO, the concept of leader and followers has been modiﬁed with intelligent particles performing Lévy ﬂight randomly for better exploration, to achieve global optima. The proposed method achieved an overall accuracy of 99.64%, 98.49% and 97.66% on the three datasets considered, which is very high. The results have been compared with relevant algorithms. The quantitative metrics demonstrate the superiority of the proposed techniques over the other methods and are found to be statistically signiﬁcant with McNemar’s test. Visual quality of the results also corroborate the superiority of the proposed method.


Introduction
Change detection (CD) is the process of identifying changes occurred at a scene between two time points, using multi-temporal images. It is an active research area due to its numerous applications in real-life [1][2][3][4]. In this paper, the focus is on remote sensing change detection [2,5], owing to the following aspects. Firstly, monitoring large area is possible with satellite imageries, which otherwise, is very time consuming and expensive or even not possible due to remote areas [3,6]. Secondly, quick and automated systems to monitor changes are extremely necessary for deploying timely actions in various realistic problems such as disaster management/assessment, monitoring land use/land cover changes or forest cover changes due to fire/infestations, urbanization and environmental monitoring, and oceanography [4,[6][7][8][9]. Though we find a profusion of CD techniques in the literature, most of these methods are application specific and highly depend on the spectral, spatial and sensor characteristics of the imageries used and cannot be considered as optimal [1,6,7,[10][11][12][13][14][15][16][17][18][19][20]. In addition to that, they often fail to address the real challenges while the dynamics occurred is complex or heterogeneous in nature. As multispectral remote sensing images depict the signature of various land cover types better in one band than other, powerful techniques to derive CD map by drawing information from multiple spectral bands are the need of the hour.
For the remote sensing problems, getting a proper ground truth is difficult in many situations. Therefore, unsupervised methods are preferred to supervised methods, as this does not require learning with training patterns [3,21,22]. Generally, the change detection in unsupervised framework consists of three steps: preprocessing, image comparison and image analysis [1,3]. In the preprocessing step, anomalies due to radiometric and geometric factors [23,24] are removed by correction algorithms and georeferencing, respectively. Noise trapped in the images is also filtered out to some extent in this stage for improving the strength of the signals. And the two images (captured at different dates) are co-registered by providing the pixel-to-pixel alignment, before computing the difference images.
From the rectified images, difference image (DI) is generated in the second step, usually by subtraction or ratioing [1,22,24,25]. While the former calculates the difference in intensity values of pixels between the images at two dates, the latter computes the ratio of corresponding pixels [24,26]. In the ratio image, the changed pixels are aligned away from "1" unsymmetrically on both sides, where the value "1" corresponds to the unchanged pixel, as equal values produce 1 on ratioing [1,2]. By the hypothesis of few changes between the images of two times, changes can be detected in the tails of the statistical distribution of the pixel values in the difference image [22]. Another widely used method is change vector analysis (CVA) computed from multiple spectral bands [3,6,23]. The CVA and subtraction method have been used in change detection of various geographical environments [1,6,25] but researchers argue that it is too simple to deal with all complexities involved in detecting the real changes on various land covers [1,6,23,25]. Moreover, they are not robust against noise and effects due to external factors [6]. Whereas ratioing is efficient in inhibiting the effects of slope, sun angle and illumination variations and has been employed in a wide variety of CD problems [25], but its nonlinear nature is much criticized [22,25]. The two methods, subtraction and ratioing, are simple and easy to implement, have demonstrated exclusive behaviour on different land cover types [1,24,25]; that is, ratio method is good for detecting changes in water, soil and urban scene; difference method is good for detecting vegetation and water [24,25]; hence, they can be complementary to some extent.
As the accuracy of change map has a strong bias on the quality of the difference images (DI), fusion techniques [7,13,27] are often adopted to enhance the quality of DI, at pixel level or in transform domain [7,13,28]. Therefore, in the present study, we propose fusion of a normalized difference image (NDI) and a normalized ratio image (NRI) to produce a better quality difference image, in the Nonsubsampled Contourlet Transform (NSCT) [29] domain by exploiting the distinct properties of NSCT such as shift and scale invariance and multidirectional frequency localizations [30].
In the final step, the DI is partitioned into two groups, changed or unchanged, using a suitable technique, such as thresholding or segmentation [7,10,24,31]. The choice of proper threshold has always been challenging [23,26,32,33]; hence, the use of classification/clustering algorithms became popular [6,7,26]. Acquiring accurate labeled data is crucial for getting good classification results which is very difficult or even impossible in the case of remote sensed data [3,26]. So, clustering algorithms received much attention from researchers in recent years as these algorithms can derive clusters independently [3,24,26]. Hard clustering algorithms like K-means and ISODATA that produce crisp clusters have been applied in many CD studies [3,26,34]. However, for remote sensing images (where pixels belong to overlapping clusters), fuzzy clustering techniques that generate clusters based on the degree of membership of pixels are found to be more appropraite [3]. Several variants of fuzzy algorithms that incorporate spatial information to increase the accuracy of change detection have been reported in the literature [3,7,13,14,35].Techniques such as multi-resolution Markov random field models [15], random walks [10] and hybrid methods like hybrid conditional random field [12], Markov-CA Model [18,36], Kmeans clustering with Adaptive majority voting [37] are also into great interest now. When employing optimization techniques on clustering algorithms, the accuracy of change detection is found to be improved further [35,38]. Nature inspired algorithms such as Particle Swarm Optimization (PSO) [39], Firefly Algorithm [40], Ant Colony Optimization (ACO) [41] and so forth are powerful techniques that can be used for solving complex problems by utilizing the metaheuristics underlying the nature, for example, swarm intelligence [42,43]. Due to its simplicity and nominal parameters, PSO was investigated thoroughly and has been applied in several real world problems including change detection [42][43][44][45]. However, PSO has two major shortcomings: early convergence or no convergence, and menace of local optima [42,43] due to loss of diversity of particles after a few iterations. We address these issues in the proposed PSO variant, called Leader Intelligence PSO (LIPSO). In LIPSO, an intelligent Leader [46] and the followers take Lévy flight [43] occasionally to overcome the local optima problem and thereby the convergence is ensured.
The contributions can be summarized as follows: • A modified PSO called LIPSO, is proposed to detect the changes in the remote sensing images. • For the better estimations of the changed regions, three strategies are used: (a) fusing Normalized difference image and Normalized ratio image in the NCST domain, band-wise to generate an improved difference image (b) NSCT decomposition helped in enhancing the information and removing the noise, and (c) tested with different combinations of band-wise difference images with LIPSO to find the best result, and finally, it has been found that all the four bands are equally useful for the detection of the changes. • An improved version of Change Vector Analysis (CVA) [3], named Transform CVA, is proposed. • The proposed method is compared with Fuzzy C means (FCM) [3], Standard PSO (SPSO) [47], PSO with Lévy flight (PSOLF) [42,43] and Transform CVA are compared.
The rest of the paper is organized into following sections. In Section 2, the proposed methodology is described in details. In Section 3, the data sets and accuracy metrics are provided. The experimental setup and findings are presented in the Section 4 with detailed analysis of the results. Finally, the conclusion is outlined in Section 5.

Materials and Methods
Our objective is to detect the changes occurred from the temporal images by delineating the changed pixels from the unchanged ones. Therefore, the problem is a clustering problem that groups the image pixels into changed and unchanged clusters. Consider two co-registered remote sensing images of intensity values, with B bands ..B} acquired at two dates T 1 and T 2 respectively, of the same geographical area. We process the images to produce a binary change map G (x) : R → [0, 1] by segmentation, according to the following rule: 0 if x belongs to changed region, 1 if x belongs to unchanged region.
For this, we propose a novel change detection technique which focuses on two aspects: the quality of the difference image and the robustness of the segmenting algorithm, which are directly related to the performance of any CD technique. Apparently, it consists of two parts, (i) Generate good quality difference image by NSCT feature extraction and intra-scale fusion, and (ii) Generate binary change map using Leader Intelligence PSO (LIPSO). The workflow of the procedure is shown in the Figure 1.

Generate Good Quality Difference Image by NSCT Feature Extraction and Intra-Scale Fusion
As mentioned in the introduction, fusion of images can produce better quality images, by combining information from multiple source images. Image fusion in transform domains [7,13,28] produces better results as the wavelet/contourlet transforms can represent frequencies in time and space. Nonsubsampled Contourlet Transforms (NSCT) possess advantages such as shift and scale invariance [29], a property more suitable for the change detection studies. Added to that, being multi directional, the NSCT is efficient in capturing 2D singularities [30] to derive edge and line features of image, thus geographical structures in them, while wavelets can capture only point singularities. This transform contains a pair of filter banks, of which, the nonsubsampled pyramidal filter bank (NSPFB) generates sub band pyramids with j + 1 redundancy at jth scale. The nonsubsampled directional filter bank (NSDFB) produces 2 l j directional subbands at l j level and jth scale [29] resulting into sub bands of multiple coarseness and directions. The structure of the NSCT is illustrated in the Figure 2. A large number of coefficients are zeroes [30] and hence, nonsubsampled contourlets are efficient in suppressing background and various noise models, and can project changed pixels more prominently. In the present study, a NDI and a NRI having complementary information are fused at NSCT feature level, due to their unique properties described earlier and produces fused NSCT coefficients that contain information from the two DI's mentioned. The steps of the method are summarized in Algorithm 1.

Algorithm 1 NSCT Fusion
Input: Source images I 1 at time T 1 and I 2 at time T 2 Output: Fused NSCT coefficients 1: Generate NDI using Equations (1) and (2). 2: Generate NRI using Equation (3). 3: Apply NSCT on the NDI and NRI up to sufficient scales J and obtain approximation coefficients and directional features. 4: Denoise features using appropriate threshold (λ). 5: Fuse the approximation coefficients by averaging, and high frequency coefficients on minimum energy rule. The DI by subtraction from spectral bands are generated as and normalized, called Normalized Difference Image (NDI), as where DI b,max is the maximum pixel value and DI b,min , the minimum value of pixel (where b = 1, 2, . . . , B). The second DI, Ratio image is generated as In the ratio images, as the unchanged pixel will have a value closer to the "1", we will subtract the values from "1" and take the absolute value to map the values in the range [0, 1] (to match with the NDI), we call this as NRI (normalized ratio image). NSCT Features are extracted from these DIs (NDI and NRI) by applying J-scale decompositions (where J-scale is chosen using entropy of the decomposed images). The subband coefficients are denoised using a threshold defined by Donoho and Johnstone equation λ = σ 2log(MN) (here, σ is estimated suitably and MN is the size of the image) [48]. A large number of background pixels are removed on thresholding and hence, the foreground containg the changed pixels gets enhanced. The resultant sub band contains different land cover information from both NDI and NRI features. The denoised approximation coefficients of NDI and NRI are fused by averaging to produce a resultant sub band comprising of the details of both the DI's equally [7]. The minimum local energy fusion rule is applied on higher frequency sub bands [7] so as to preserve the edge & salient features, using where DH q, f j,b , DH q,nor j,b , DH q,r j,b are high frequency coefficients in direction q at jth scale of fused image, NDI and NRI, and E q,nor j,b , E q,r j,b represents the local energy of the coefficient in the direction q of band b at jth scale around 3 × 3 window of NDI and NRI, respectively.
From each spectral band, 'S' number of NSCT features (subband images) of size M × N each, are obtained after intra-scale fusion, which are mapped to S dimensional vector with respect to each pixel, and segmented by employing the proposed LIPSO algorithm.

Generate Binary Change Map Using Leader Intelligence PSO
As our task is to produce two mutually exclusive clusters of changed and unchanged pixels, the inter-distance between the clusters is to be maximized and the intra-distance of the pixels in a group is to be minimized. Therefore, this problem can be formulated as an optimization problem trying to minimize the following error function: where K is the number of clusters(regions), x j is the data point belongs to kth cluster(C k ), v k is the cluster centre of kth cluster, d(x j − v k ) is the Euclidean norm and n k is the number of data points belonging to the kth cluster(region). Particle Swarm Optimization algorithm (PSO) proposed by Kennedy and Eberhart [39,49] is a population based algorithm that imitates the best characteristics of social insects, in solving complex optimization problems [42,43,50]. It is a typical optimization algorithm suitable for realistic problems like clustering and we choose this due to its many virtues: it is a widely accepted optimization method with a well-defined mathematical foundation, especially, when it is combined with Lévy flights [42,43]. Interested readers may please refer to Reference [50] for the theoretical details of PSO and Lévy flight in References [42,43] & in the Appendix A. The metaheuristic underlying in nature such as swarm intelligence is used in PSO for fast convergence of the problem, which is exploited in generating changed and unchanged clusters in our experiment. It is rather easy to understand its logic, require a few parameter settings only and computational overhead is low as compared to many other algorithms. Despite several variants of PSO are reported in the literature, the local optima problem could not be eliminated completely. Moreover, they often fail in arriving at consistent solution. Of the several improved versions of PSO, the Lévy flight version is found promising. We consider two variants of that, the LFPSO [42] in which the Lévy flight is performed only when the particle exceeds a limit value and the PSOLF [43], where the particles fly according to a probability, p > 0.5. Although they perform well with benchmark problems [43], they fail to achieve good results in real-life problems like image segmentation resulting into bad clusters or not even generating clusters. It is probably due to the inability of the particles to attain sufficient velocity or fail to maintain a velocity within the search space. A second reason may be due to loss of diversity that leads to stagnation of particles. Therefore, a modified PSO with the concept of Leader and followers [46] performing Lévy flight intelligently (LIPSO), is proposed. One can easily understand the steps involved in LIPSO from Algorithm 2 and the flowchart in Appendix B.

Algorithm 2 Leader Intelligence Particle Swarm Optimization
Input: Fused coefficients Output: Cluster centres 1: Initialize particles randomly selected from the search space, as X i = {X i1 , X i2 , . . . , X iD }. 2: Initialize various parameters: c 1 and c 2 , Xmax, Xmin, Vmax and Vmin. 3: Evaluate fitness by Equation (5). 4: Assign pBest i,D = initial positions of particles, gBest D = best particle position of swarm, pBest f it i = the fitness value of ith particle, and gBest f it = the best fitness of the swarm. 5: Set the particle with the highest fitness value as the Leader. 6: If rand > 0.5 then update position of Leader by Lévy flight using Equation (9), else update velocity and position of Leader by Equations (6) and (7) respectively. 7: Update followers as: if the distance between leader and followers crosses the threshold, update the followers with Equation (13); otherwise update velocity and position of particles by Equations (6) and (7) respectively. 8: Evaluate fitness function Equation (5)  The proposed algorithm, LIPSO, is a variant of PSOLF [50] in the following aspects: (i) an intelligent parameter that produces a reasonable step size replaces inertia weight term in the equation of PSO with Lévy flight, (ii) the concept of Leader-follower in Reference [46] is modified by performing Lévy flight intelligently, and (iii) Modification in the structure of the PSO algorithm by incorporating co-operative competition of the individual swarms, that is, the follower updates its position w.r.t the current position of the leader.
Initially, as in the standard PSO, the particles are chosen randomly from the search space-the vector of NSCT coefficients generated in the first part. Each particle represents the cluster centers of changed and unchanged pixels and hence, the dimension of particle is 2d = D(say). The fitness of the particles is evaluated using Equation (5) and the one with the minimum value is designated as the best particle. The pBest i,D , gBest D , pBest f it i , gBest f it are assigned with the initial particle position, best position among the particles, fitness value of the ith particle and fitness of the best particle in the swarm, respectively. The best particle of the swarm is designated as the leader and the rest is termed as followers. Over the iterations, the concept of Lévy flight is implemented with the assumption that the leader of the swarm occasionally takes Lévy walk in reality while foraging and that takes place only after the local search is finished and when the food (here, solution of the problem) is no more available in the locality. Therefore, the Lévy flight is first employed on the leader, on random probability basis, for example, half of the time or two third of time; otherwise, the position is updated using Equations (6) and (7) as in SPSO [50].
where V t+1 i and V t i are the velocity of ith particle at iteration t + 1 and t respectively, c 1 and c 2 are acceleration parameters often called cognitive weighting factor and social weighting factor, ω is the inertia weight, ⊕ is the element level multiplication of matrix and X t+1 i and X t i are the position of ith particle at iteration t + 1 and t respectively. In the present work, ω is calculated as where itr is the current iteration and Maxitr is the maximum iteration. With Lévy flight, the position of the leader is updated using the following equation where X t+1 L is the position of the leader at current iteration t + 1, and X t L is the position of leader at iteration t and V t+1 L is the velocity of particle at current iteration, calculated as where ILevy(X t+1 L ) is the Lévy flight of intelligent leader at current iteration, given as, where s is the step size calculated using Equation (A2), η is the Intelligent parameter calculated as where A is the intelligent constant, an integer value chosen based on the scale of the problem. η can be fine-tuned by changing the value of "A" according to the problem in our hand. In this experiment the value of A was best performed in the range [5,10]. The intelligent parameter η controls the step length during Lévy flight with an assumption that the intelligence of the leader will decide how fast the swarm should move towards the best position. With this Lévy equation, a larger step size is obtained for the leader. Once the position of Leader is updated, the followers are updated by the Equation (13) if their distance from leader crosses a threshold value; otherwise by Equation (7).
where f Levy(X t+1 i ) is the Lévy flight of the ith follower at the current iteration, which is the velocity of follower, calculated as With Lévy flight, particles explore the search space for better position; otherwise, the local search is continued for better solution. Thus, the algorithm maintains a trade-off between exploration and exploitation. In the present work, the threshold value lies in the range [0, 1], which is chosen randomly. It can be set to a fixed value as well, depending on the problem. After updating the followers the fitness is calculated and updated the pBest and gBest, pBest f it i and gBest f it, if better values are found. The iterations are repeated until convergence. The convergence criteria could be set in two different ways-(1) till the difference between current gBest and that in the memory is less than a very small epsilon value; (2) until the maximum iteration is completed. As there are chances to occur no improvement in the fitness value for a few iterations in sequence, the first method is not suitable for segmentation problems. Therefore, the second method was adopted in this experiment. A suitable number of iterations (Maxitr) that generate robust clusters could be fixed empirically.
Once the algorithm is converged, the gBest D contains the cluster centers of changed and unchanged pixels using which the clusters are generated and from the cluster labels (0 = changed, 1 = unchanged), the change map is prepared.

NSCT Feature Fusion (Spectral Fusion)
In order to enhance the accuracy of change map further, various combinations of NSCT features of multiple spectral bands that merge information from different spectra are investigated [3]. Since the reflectance properties of different geographical cover types vary between spectral bands (e.g., blue band is suitable for bathymetric studies, Green for robust vegetation detection etc.) [6,11,51], spectral combination is more suitable for detecting changes. The fusion of spectral information is done as follows: from each spectral band, we get S dimensional vector of NSCT features having length M × N. When B spectral bands are chosen from an image, the dimension increases to S × B and this vector is used for segmenting by LIPSO algorithm. Compared to CVA computed by Equation (15), this method is more robust as it calculates the Euclidean distance of each pixel from their cluster centres, thus resulting into more tight and compact clusters. Where as in CVA, the Euclidean distance between pixels of spectral bands is considered and hence, the chance of outliers will be more that leads to less compact clusters. Combining more spectral bands will result into more accurate change map as it joins information of various land cover types.

Transform CVA
Change vector analysis or CVA [3] technique is used to compute difference image from two temporal images of the same scene, by taking multiple spectral bands into consideration. CVA can reflect the magnitude as well as the direction of change occurred. The magnitude is computed from suitable number of spectral bands B of input images, say I 1,b and I 2,b , of two dates as In order to enhance the accuracy of CVA method, NSCT decomposition on the CVA magnitude is done. Extract NSCT features of CVA, up to desirable levels and directions. By thresholding the features, suppress the noise, followed by segmentation using LIPSO, and the change map of better accuracy is generated. Since NSCT is efficient in filtering noise, the changed pixels can be separated easily from unchanged pixels. Though the computational cost increases, Transform CVA can be adopted wherever accuracy is important and the bands are noisy.

Accuracy Metrics
The performance of the technique was evaluated with the most popular accuracy metrics, the Overall Accuracy (OA) [3] and Kappa statistic [6]. OA is calculated as the percentage of pixels classified correctly. Mathematically, overall accuracy is computed as OA = (TP + TN)/(TP + FP + TN + FN) * 100, where TP = True Positive, TN = True Negative, FP = False Positive and FN = False Negative. Another metric used is Kappa statistic, generally called Cohen's Kappa. It is a measure of the agreement between two sets of classifications of a dataset taking the random chance agreements into consideration [6]. If the resultant CD map and the ground truth are in perfect agreement, Kappa value will be 1 and on perfect disagreement Kappa will be 0. The statistical significance of improvement in the accuracy is verified with McNemar's test, a non-parametric test used for assessing the statistical significance of changes occurred on the proportions of a dichotomous trait at two time points [52]. This test does not tell which method is superior, hence it should be used with other accuracy metrics for verifying effectiveness of the algorithm [6].

Data Set 1
The

Data Set 2
A set of images of Malambuzha Reservoir, Kerala, India acquired by Landsat 7 ETM+ at two dates are used in this experiment. The geographical location of the images lies between longitude 76.2814 • E and 77.1813 • E and latitude 10.0517 • N and 11.3540 • N. Figure 4 shows the two images acquired on 14 January 2001 and 18 February 2002 and the reference map of the changes occurred respectively. The size of the image is 7781 × 6961 pixels, from which, a portion of 200 × 320 pixels was selected for the study. From the images, it is well understood that the area covered by water in the image of year 2002 is less than that seen in the image of year 2001. As a result, the shoreline is exposed and a clear change in the boundary of the water is seen in the difference image. Four bands at spatial resolution 30 m, in the wavelength range (0.45-0.52 µm), (0.52-0.60 µm), (0.63-0.69 µm) and (0.76-0.90 µm) are selected for this study as they are suitable for the land cover and shoreline studies [4,25].

Results and Discussion
The experiments were aimed at finding the best changed map using the unsupervised method. The selected images were pre-processed and georeferenced using ArcGIS software with reference to the selected co-ordinates from the Google Earth map. All the images except the image of year 2001 of Dubai City were found to be radiometrically consistent. So, brightness correction was done on this image by histogram matching [2]. As the temporal images are necessarily co-registered for change detection, all the three image pairs were co-registered semi-automatically, by selecting control points belong to geographical structures like intersections, roads and so forth interactively coupled with MatLab commands. Ground truth was prepared semi-automatically. First a reference map was prepared from difference image programmatically for every pair of images and manually labeled the pixels into changed or unchanged, based on the available information and expert opinions.
After pre-processing the bitemporal images, a normalized DI and a normalized ratio image were generated from each spectral band and NSCT features were extracted from them. In this study, we have considered NSCT decomposition up to scale 2, as the entropy of the sub bands were found decreasing beyond that. Thus, we have chosen one low level approximation (LL) and four directional sub bands for each spectral band. The sub bands were fused as detailed in Section 2.1 to enhance the information content. Experiments were conducted on four spectral bands taken individually and in combination on each data set. The results of various experiments are consolidated in the Tables 1-6 as given below. From the change map obtained, we can also assess the robustness of the technique visually. The resultant change maps are given in the Figures 6-8 for reference.
The Table 1 describes the overall accuracy obtained for various methods against the proposed method for the Itezhi dataset. The results of band wise DI without NSCT decomposition is given at the left half of the top section. On the right side, the OA of the intra-scale fusion of NDI and NRI are displayed. In the middle portion, the results of CVA and Transform CVA with combinations of 2 bands, 3 bands and 4 bands are presented. At the bottom section, the description of the combination of NSCT coefficients are given, with 2 bands, 3 bands and 4 bands combination and the respective accuracy values on the right side. Similarly, the  Tables 5 and 6 respectively. From the tables, it is evident that the NSCT fusion on single bands with LIPSO produces higher accuracies as compared to other methods-FCM, SPSO and PSOLF. Results of multiple spectral fusion, the proposed method, demonstrate the highest accuracy values for all the datasets. Table 1. Overall Accuracy (OA) obtained for Fuzzy C means (FCM) [3], Standard Particle Swarm Optimization (SPSO) [47], PSO with Lévy flight (PSOLF) [43], Leader Intelligence PSO (LIPSO) (proposed) on Itezhi, Tezhi Reservoir, Zambia data set.    Table 4. Kappa statistic obtained for FCM [3], SPSO [47], PSOLF [43], LIPSO (proposed) of Malambuzha Dam, Kerala data set for with and without NSCT decomposition.  Table 5. OA obtained for FCM [3], SPSO [47], PSOLF [43], LIPSO (proposed) of Dubai City data set; OA = Overall accuracy.  Table 6. Kappa statistic obtained for FCM [3], SPSO [47], PSOLF [43], LIPSO (proposed) of Dubai City dataset. The proposed technique was compared with popular algorithms, the FCM [53] and two PSO variants, the SPSO [47] and PSOLF [43] qualitatively and quantitatively. The resultant change maps for the three datasets are shown in Figures 6-8, from which we can evaluate their visual quality. The change map obtained for NSCT fusion with FCM in Figure 6c shows less number of noisy spots while the change maps of NDI and NRI are affected badly by background pixels. The cluttering effect of black spots on changed boundary in NRI and the background pixels in NDI seems to be subsided considerably on the change map of NSCT fusion, as these images behave complementary to each other. From this, one can deduce that the NSCT is good in preserving the foreground information while suppressing the background and noisy pixels largely. It can also be noted from the Figure 6g-i that on combining NSCT features derived from multiple spectral bands, the change maps become clearer by detecting more number of changed pixels. Similar results can be noted for the Malambuzha dataset as shown in Figure 7.

DI
In the case of Dubai city dataset, the land cover types are different from the dataset 1 and 2. It can be noted large built ups and structures in the second image, which is absent in image 1. In the change maps, changes occurred in large sized geometrical shapes and structures are detected correctly. The area with small structures causes a few misalarms. The bands are less noisy but, due to the heterogeneous nature, it shows a few mishits on smaller objects. However, the proposed NSCT feature fusion of bands produced good change maps compared to other methods. In order to test the performance of the proposed technique, we analyse the results for each data set quantitatively too.  From Tables 1 and 2, it can be noted that the overall accuracy and Kappa statistic obtained for the band wise NSCT feature fusion with LIPSO are higher (e.g., 99.38%, 0.95424 for band3) as compared to the values obtained for FCM, SPSO and PSOLF, for every band of Itezhi Tezhi dataset. From the bottom portion of the table, we can compare the performance of LIPSO against other algorithms on multiple band-combination of the fused NSCT features. For all combinations experimented, it was found that the LIPSO outperformed the SPSO and PSOLF and the proposed method (NSCT feature fusion+LIPSO) achieved the highest accuracy value 99.64%. Secondly, for the combination of the NSCT feature from multiple bands, from the Table 1, it is evident that as the number of bands increases, the accuracy also is getting increased. Table 2 shows the corresponding Kappa statistic obtained for the OA values in Table 1, from which we can find that the LIPSO has got the highest accuracy values against other methods on all cases. The Kappa coefficient 0.97249 for the proposed method with 4 band NSCT feature fusion shows that the accuracy is increased nearly to 1. The change map has almost perfect agreement with the ground truth. From the above, we can deduce that the proposed technique (NSCT feature fusion + LIPSO) out performs any other method we considered, with the highest OA and Kappa statistic. The NSCT helps remove the background noise and the meta heuristic approach of LIPSO help achieve better optima. At the same time, information from various spectra is merged on spectral fusion and as a result, higher accuracy is obtained. Next we compare the Transform CVA against CVA method. On analysing the results for various combinations of bands (2 bands, 3 bands & 4 bands), it is noted that the Transform CVA shows better performance with significant improvement in accuracy values. This is because, the NSCT decomposition employed on CVA extracts the singularities and remove the noisy pixels on thresholding. The computational overhead of Transform CVA is balanced, as the inversion of NSCT is not necessary in the change detection framework. Similar trends can be noticed for the values of OA and Kappa obtained for the Malambuzha dataset from Tables 3 and 4 and for the Dubai City dataset from Tables 5  and 6 respectively. The tables are self-explanatory. From the tables, one can confirm that the proposed method outperforms all other methods we compared.
The overall accuracy obtained for four methods-CVA with FCM, Transform CVA with LIPSO, NSCT feature fusion with FCM and the proposed method (NSCT feature fusion with LIPSO) for Itezhi dataset is represented graphically in Figure 9a. From the graph, we can clearly understand that the proposed NSCT feature fusion combined with LIPSO has outperformed all other methods we considered. It is also clear that Transform CVA is superior to the conventional CVA method. From the Figure 9b it is obvious that the NSCT fusion of NDI and NRI gives commendable improvement in accuracy compared to individual methods. Kappa statistics for various methods and techniques with Malambuzha dataset are plotted in the Figure 10a for comparing the performance of the proposed LIPSO against other PSO variants, while employing on NSCT features of different bands. For all the bands, NSCT fusion with LIPSO performs better than other PSO methods. As seen in Figure 10b, on increasing number of bands, the accuracy of NSCT feature fusion gets improved and obviously it can observe that the proposed NSCT feature fusion method outperforms traditional CVA and Transform CVA. In order to test the significance of improvement in accuracy metric OA of the proposed LIPSO method against other algorithms, we have applied the McNemar's test and the result obtained for Itezhi dataset is shown in Table 7. The test was conducted with SPSO and PSOLF by taking the reference image obtained for LIPSO. A confusion matrix of correctly classified pixels and misclassified pixels were prepared and calculated the χ 2 values. From the table, it is evident that the test confirms the improvement in OA for the proposed method against the SPSO and PSOLF is statistically significant.  Table 8. The Chi-square values were computed for the proposed method against SPSO and PSOLF, with one degree of freedom and 95% confidence. The result confirms the improvement in OA for the proposed method is statistically significant. The statistical significance of the results of Dubai City dataset can be verified from Table 9 as well.

Conclusions
A novel unsupervised change detection method based on NSCT feature fusion and LIPSO that exploits information from multiple spectral bands is proposed. The multi-directional, multiscale characteristics of NSCT suppress noise and by fusing two complementary DI (NDI and NRI) in NSCT domain, generates a better qualiy DI. While segmenting with LIPSO, the intelligent leader and followers performing Lévy flight help overcome local optima and form better clusters of changed and unchanged pixels. The algorithm was tested on representative datasets from diverse land cover types and spectral signatures. The NSCT feature fusion of multiple bands demonstrate improvement in accuracy on different datasets, compared to single band NSCT features, as it merges the spatial and spectral information. This method also outperforms the well-known CVA method. From the results that are found statistically significant with the McNemar's hypothesis test, the following inferences are drawn. (i) the quality of difference image and thus the accuracy of change map can be improved by fusing NDI and NRI in NSCT domain, (ii) NSCT feature fusion of bands produce promising results than any single band, which also demonstrates superiority over CVA method, (iii) on producing the change detection maps, the proposed LIPSO, outperforms FCM and other variants of PSO, the SPSO and PSOLF and finally, (iv) the Transform CVA can bring better accuracy than the CVA method but still not better than the proposed one.
The experiment was conducted on Landsat imageries. In order to test the robustness of the algorithm with cross platform data, our next study is aimed at extending the proposed methodology to imageries acquired from various satellites sensors. where Γ is the standard gamma function.