Ensemble Empirical Mode Decomposition Parameters Optimization for Spectral Distance Measurement in Hyperspectral Remote Sensing Data

This study proposed a new approach to measure the similarity between spectra to discriminate materials and evaluate the performance of parameter-selection procedures. Many pure pixel vector-based similarity measurements have been developed in the past to calculate the distance between two pixel vectors. However, those methods may not be effective since they do not take full advantage of the spectral correlation. In this study, we adopt Ensemble Empirical Mode Decomposition (EEMD) to decompose the spectrum into serial components and employ these components to improve the performance of spectral discrimination. Performance evaluation was conducted with several commonly used measurements, and the spectral samples used for experimentation were provided by the spectral library of United States Geological Survey (USGS). The experimental results have demonstrated that EEMD can extract the spectral features more effectively than common spectral similarity measurements, and it better characterizes spectral properties. Our experimental results also suggest general rules for selecting noise standard deviation, the number of iterations for EEMD and the collection of Intrinsic Mode Functions (IMFs) for classification. Finally, since EEMD is a time-consuming algorithm, we also OPEN ACCESS Remote Sens. 2014, 6 2070 implement parallel processing with a Graphics Processing Unit (GPU) to increase the processing speed.


Introduction
Hyperspectral image classification is very important for endmember discrimination in various applications.In the past, many pure pixel vector-based similarity measurements have been proposed to evaluate the similarity between two pixel vectors.Several popular methods, including the Euclidean Distance (ED), Mahanalobis Distance (MD) [1][2][3][4] and Spectral Angle Mapper (SAM) [1][2][3][4][5][6], are widely used to measure the spectral distance and provide acceptable results for pure pixel classification.However, they also have some drawbacks, because those distance measurements do not fully utilize the correlation between bands [4].
In this study, we adopted a signal-analysis method to analyze spectral data by Empirical Mode Decomposition (EMD), which will generate a collection of Intrinsic Mode Functions (IMF) [7].The decomposition procedure of EMD depends on the magnitude of the original signal with various intrinsic time scales, i.e., it can decompose the signal into different frequency components.The EMD has been widely used in the past for time-domain signal processing, and was also employed to decompose the time-sequence signal to determine intrinsic information [8,9].For EMD to be effective, the differences in both frequencies and amplitude must be sufficient for decomposition analysis.If the physical criteria for the differences between two signals are not met, the sifting process derives an IMF with single tone modulated in amplitude instead of a superposition of two unimodular tones [9].Thus, the modulated signal would no longer encompass the characteristics of the original signals.
To overcome the problem of mode mixing, Wu and Huang proposed Ensemble Empirical Mode Decomposition (EEMD) [10,11].
In various signal processing applications, both EMD and EEMD have been implemented for feature extraction and noise reduction [12,13].Especially for remote sensing images, 2D-EMD [14][15][16] and MEEMD [11] have been proposed recently for the decomposition of hyperspectral image into IMFs, but they apply to pre-selected two-dimensional image band instead of one-dimensional spectral information.The aim of this research is to discriminate materials by extracting the unique absorption features from the spectrum of each pixel.In this study, we propose a two-stage process for spectral similarity measurement.It first adopts EEMD to generate a series of IMFs and accumulate a set of IMFs for enhancing absorption features, followed by SAM as a common technique for hyperspectral image classification.
Furthermore, due to the large amount of large-dimension data processing required, it is not efficient to process hyperspectral images with EEMD [17].Therefore, parallel processing with a Graphics Processing Unit (GPU) is implemented for EEMD [18,19].The performance analysis shows that GPU can significantly reduce the computing time for EEMD.

Methodology
The proposed method is a two-stage process to measure the spectral similarity of two pixel vectors.In the first stage, EEMD is adopted to decompose a series of IMFs, and a set of IMFs is accumulated to enhance absorption features.Secondly, SAM is utilized as the distance measure for spectral similarity.Because of the computational complexity, parallel processing architecture is also implemented.

Ensemble Empirical Mode Decomposition (EEMD)
EEMD is a self-adaptive algorithm.In comparison, the traditional Fourier transform needs to convert the signal by frequency-domain integral analysis, but EMD can be directly performed for decomposition on a time-domain signal.After a special sifting process, a signal x(t) can be decomposed to n units of h j representing IMFs, and an item r n as its trend.
  All IMFs are orthogonal to each other, and each IMF represents a unique range of energy and frequency.The sum of all IMFs is equal to the original data.The IMF must satisfy the following two conditions [9]: 1.The numbers of extrema and zero-crossings of IMFs must be either equal or differ at most by one.2. At any point, the mean of local maxima and local minima envelopes is zero.
In reality, the nature of a signal x(t) does not satisfy the definition of IMF.That is to say, a large part of the data consists of various frequencies.To satisfy the definition of IMF, the use of EMD incorporates the sifting process [7].This process serves two purposes: (1) to eliminate ride waves; and (2) make the IMF wave profiles more symmetric.
By using EMD for signal decomposition, the input signal must satisfy the following three conditions: 1.The signal has at least two extrema; one is the maximum and the other the minimum.2. The time-period scale is defined by the time lapse between two extrema.3.If the data have no extrema, only the inflection point is recorded, and the extrema can then be estimated by differentiation.
Finally, the results can be calculated by integration of these components.The algorithm is summarized as follows: (1) Identify all extrema of x(t) (2) Interpolate between minima (resp.maxima) with "envelopes" e min (t) (resp.e max (t)) (3) Compute the mean envelope , where k is the iteration number.
(4) Extract the detail h j = x(t)−m k (t).( 5) Repeat ( 1)-( 4) until h j (t) meets the definition of IMF, and IMF converges.( 6) Repeat ( 1)-( 5) to generate a residual r n (t), r n (t) = x(t)−h n (t) In practice, the above procedure has to be refined by a sifting process which repeats steps (1)-( 4) on the signal r(t), until it can be considered as having zero-mean according to the stopping criteria.Once this is achieved, the result is considered as the effective IMF.Then step ( 6) is applied to generate the corresponding residual r n (t).
To make sure the EMD decomposition process generates IMFs that meet its conditions, Huang et al. [7] proposed that a stopping criterion in the sifting process is needed for the EMD process.The criterion can be implemented by limiting the size of the standard deviation (SD) by twice sifting the results as defined below: A typical value is between 0.2 and 0.3 [7].When the computed SD value lies in the specified range, the sifting process is automatically stopped.Figure 1 shows the operating procedures of EMD.First, in Figure 1(a), the signal x(t) is input and decomposed to n IMFs.Each IMF is calculated by the "k times" sifting process until the SD is less than 0.3 as shown in Figure 1(b).The sifting process (see Figure 1(c)) computes the difference between the signal x(t) and the mean of the maxima and minima envelopes.Although the use of EMD has made significant contributions in many applications, its ability to handle signal-processing problems is still insufficient.Rilling and Flandrin [8,9] stated that EMD decomposition capability strongly depends on the frequencies and amplitudes, and the differences in both frequencies and amplitudes of two signals must be sufficient for EMD decomposition analysis.If the criteria for the differences between two signals are not met, the sifting process derives an IMF with a single tone modulated in amplitude instead of a superposition of two unimodular tones.This phenomenon is called the beat effect.To overcome the problem of mode mixing, Wu and Huang [10] proposed EEMD.A uniform distribution of white noise is added to signals before decomposition to reduce the effect of the mode mixing in the EMD process [20].As a result, the EEMD method is capable of resolving both the issues of mode mixing and multi-dimensional computation [11].
For EEMD, the ratio of the added white noise and the number of signals in the ensemble must be predetermined.According to the number in the ensemble, different white noise w i (t) with the same amplitude is added N times to an original signal x(t) to generate N modified signals Next, the EMD decomposition is performed on each modified signal x i (t).Assume the signal is decomposed into n units of IMF and one residue as a trend.Further, by the EEMD method, it will get N × n IMF signals and n trends r in (t).Then, x i (t) can be rewritten as: To reduce the mode mixing, the EEMD method averages the result of the IMF set H j (t) and the trend R(t) derived from EMD.
The error in the decomposition caused by the added white noise is given by the following empirical formula of Wu et al. [10] for large amounts of data: where N is the number of ensembles, ε is the amplitude of the added noise, and ε n is the final standard deviation.According to this empirical formula, w i (t) can be obtained, The EEMD process is shown in Figure 2. Comparing EEMD and EMD (Figure 1), the only difference is that EEMD needs to average N h j (t) to get each IMF, but EMD does not.

Spectral Angle Mapper
A measurement of the similarity of pixels is normally needed for spectral mapping, and the Spectral Angle Mapper (SAM) is a widely used as a spectral similarity metric in remote sensing [1][2][3][4][5][6].In a scatter plot of pixel values from two bands of a spectral image, pixel spectra and target spectra will plot as points as shown in Figure 3.If a vector is drawn from the origin through each point, the angle between any two vectors defines the spectral angle between those two points.The SAM computes a spectral angle between the closest set of pixel spectra and the target spectra, s i and s j .

Parallel Computing Implementations
In this research, the experiment adopts parallel-computing technology [18,19] to speed up the EEMD.The EEMD method can be performed by a GPU developed by NVIDIA.The experiment divides EEMD into two sections.The first is to assign threads for computing each individual i th x(t), and to record an entire result of IMF h j (t) by iterative computation.For each input spectrum x(t), N additive Gaussian noises are randomly generated.Each thread processes one noisy spectrum and decomposes it into IMFs.The second is to compute in a parallel manner for a vector ensemble mean of the j th IMF from all threads (see Figure 4).

Experimental Results
The experimental data were provided by the United States Geological Survey (USGS) spectra library, where five minerals were chosen: actinolite, andradite, goethite, hematite and illite.Each material has four to 10 spectra (Figure 5).For each mineral, at least one spectrum is quite different from the others, which reduces the classification accuracy.The EEMD can extract the absorption feature to improve the accuracy.
To demonstrate the effectiveness of EEMD, the SAM was applied to the original and decomposed spectra by EMD and EEMD.Furthermore, a comparison of parameter settings for EEMD was also conducted, including noise standard deviation, number of signals in each ensemble average and number of accumulated IMFs.

SAM for Original Data
To assess the accuracy of discrimination between minerals, 2000 spectra of each material were simulated with additive white Gaussian noise having signal-to-noise ratio (SNR) 40, and spectral similarity was measured by SAM.Table 1 shows the similarity discrimination of these five minerals in contrast to the original spectra.The SAM values between pairs of five original spectra are 75.8%,80%, 75%, 70.2% and 80%.The experiment examined the degree of accuracy by kappa coefficient [21].The test result is reliable because the kappa coefficient is 0.7025.

SAM for EMD Decomposed Data
The EMD algorithm was applied to decompose the spectra into seven IMFs (Figure 6), and spectral similarity was measured by SAM for each IMF.From Table 2, the largest kappa coefficient occurs for the first IMF (0.6564) which is slightly less than that for the original data, while the worst kappa coefficient occurs for the sixth IMF (0.0079), which did not provide good discrimination.Each IMF is ordered sequentially from higher frequencies to lower frequencies.Summation of all IMFs yields the original data.Because we compared the absorption feature of each IMF with the original spectra, the wavelength of the absorption feature can be estimated.Since this feature is distributed through several IMFs, combining a set of IMFs should enhance the absorption features.The accumulation of IMFs is shown in Figure 7, and the absorption features are clearly observed.

SAM for EEMD Decomposed Data
EEMD was employed to overcome the drawback of mixing modes in EMD.Several parameters have to be determined to initialize EEMD, including noise standard deviation (Nstd), number of signals in each ensemble average (N) and number of accumulated IMFs.
First, the accuracy of performance for each number of signals in the ensemble average was analyzed by kappa coefficient (Table 3).The noise standard deviation was selected from 0.1-0.9, and the numbers, N, for the ensemble averages were 10, 50, 80, 100, 500 and 1,000.When N = 1, EEMD is identical to EMD.The experimental results show a stable kappa value when N exceeds 100 (Figure 8).Therefore, considering the efficiency of the algorithm, N was set to be 100 for EEMD.Secondly, the noise standard deviation needs to be determined.From Table 4, N for the ensemble average is 100, the noise standard deviation (Nstd) is from 0.1-0.9.The simulated data contain additive white Gaussian noise with SNR = 20, 30 and 40.The results show that the kappa coefficients are over 0.7511 and 0.8688 for the third and fourth IMF, respectively, for all Nstd from 0.1-0.9 and SNR 40.Significant improvements are obtained with the third or fourth IMF with EEMD alone compared with the accuracy of original data (0.7025).The maximum kappa value is 0.9771 when Nstd = 0.2, SNR = 40 for the fourth IMF.Therefore, if the spectral estimate has high SNR, with noise standard deviation less than 0.2, EEMD performs well.Finally, the number of accumulated IMFs was analyzed.Several IMFs contain absorption features, so that the accumulation of a set of IMFs should enhance the classification accuracy (Figure 6).To accumulate IMFs, the absorption features must be clearly identified.In this experiment, the number of signals in the ensemble average was set as 100 to optimize, as closely as possible, the tradeoff between noise reduction and efficiency.EEMD was applied to various SNRs and noise standard deviations.Table 5 shows the classification accuracy using SAM values of the accumulated IMFs: IMF 1-2, IMF 1-3, IMF 1-4, IMF 1-5, IMF 1-6.The results show that the kappa coefficients are over 0.7556 and 0.9224 for IMF 1-3 and IMF 1-4, respectively, for all Nstd and SNR 40.They also indicate that EEMD provides further improvements for the fourth IMF compared with the third IMF.The highest kappa value is 0.9909 when Nstd = 0.2, SNR = 40 for IMF 1-4.Furthermore, the kappa coefficient decreased when Nstd > 0.5, indicating that the classification accuracy was also reduced.If the spectra have low SNR, both the number of IMFs and the Nstd need to be increased to achieve an acceptable performance.On the other hand, with high SNR, no matter how many IMFs are used, the classification performs well when Nstd < 0.5.The stable performance appears with IMF 1-5 and Nstd = 0.5, the kappa coefficient is 0.8051, 0.9455, and 0.9830.For higher classification accuracy and computational efficiency, it suggests EEMD with N = 100, Nstd = 0.2-0.5 and accumulation of IMF 1-4 or IMF 1-5.

EEMD Speedup by GPU
The computation environments are shown in Table 6.The proposed algorithm was developed to run on NVIDIA Tesla C1060 GPU via CUDA, and was compared with its CPU serial code on Intel Xeon 5504 with Linux, and Intel i5-2400 with Windows 7. CUDA (Computer Unified Device Architecture) is a parallel-computing platform and programming model created by NVIDIA and implemented by the GPUs that they produce.Table 7 shows the performance of EEMD in four different processing environments and computer language.For performance comparison, the numbers of test samples (N) are chosen from 500 to 3,000.In PC environment, the computing time is approximately proportional to the number of samples and C/C++ language is about five times faster than Matlab.Comparing the environments using C language, the cluster architectures can further improve the performance.The computational performances have 15 and 60 times improvement with quad-core CPU and GPU, respectively, when N exceeds 2,000.It is worth noting that the 240-core GPU is not efficiently utilized with a small sample size-when N = 500-with only a 30 time improvement for GPU compared to a PC environment.

Conclusions
Empirical mode decomposition (EMD), a fully data-driven method for decomposing signals (Huang et al. [7]), is excellent for extracting nonlinear characteristics of signals.Additionally, EEMD outperforms EMD by accommodating noise and avoiding the beat effect.We proposed a two-stage process for spectral similarity measurement; first, adopt EEMD to generate a series of IMFs and then accumulate a set of IMFs for enhancing absorption features; secondly, use the SAM technique for hyperspectral image classification.The experimental results show that EEMD-decomposed hyperspectral signals can enhance discrimination ability.The IMFs also indicate the absorption features of spectra, and the accumulated IMFs can improve absorption characteristics.Our study also overcame two drawbacks of EEMD; the algorithm is time-consuming and several parameters have to be determined before processing.To overcome the first drawback, we propose parallel processing with GPU architecture to decompose spectral data.The performance analysis shows that GPU can significantly reduce the computing time for EEMD.Our insights into selecting three key parameters (noise standard deviation, number of signals in ensemble averages, and the number of accumulated IMFs for EEMD) assist in overcoming the second drawback.

Figure 5 .
Figure 5. Spectral reflectance results for five minerals.

Figure 6 .
Figure 6.The spectra of Intrinsic Mode Functions (IMFs) by EMD for Actinolite.

Table 1 .
Similarity discrimination rates on the experimental samples of five minerals vs. USGS spectral library.

Table 2 .
The similarity discrimination rate for the EMD process on the samples of five minerals.

Table 7 .
The performance of EEMD in various computational environments (values are in seconds).