Noise Removal Based on Tensor Modelling for Hyperspectral Image Classification

Salah Bourennane; Caroline Fossati; Tao Lin

doi:10.3390/rs10091330

Centrale Marseille, Institut Fresnel, Aix Marseille Université, CNRS, 13013 Marseille, France

* Author to whom correspondence should be addressed.

Remote Sens. 2018, 10(9), 1330; https://doi.org/10.3390/rs10091330

This article belongs to the Section Remote Sensing Image Processing

Abstract

With current state-of-the-art computer-aided manufacturing tools, the spatial resolution of hyperspectral sensors keeps increasing, making it easier to obtain much more detailed information about the captured scene. However, the improvement in spatial resolution also brings new challenges, signal-dependent photon noise being one of them. Unlike signal-independent thermal noise, the variance of photon noise depends on the signal; therefore, many denoising methods developed for stationary noise cannot be applied directly to photon noise. To make things worse, photon and thermal noise coexist in the captured hyperspectral image (HSI), which makes it more difficult to whiten the noise. In this paper, we propose a new denoising framework to cope with signal-dependent nonwhite (SDNW) noise, the Pre-estimate, Whitening, Post-estimate (PWP) loop, to reduce both photon and thermal noise in HSIs. Previously, we proposed a method based on the multidimensional wavelet packet transform and the multiway Wiener filter, referred to as MWPT-MWF, which performs both white noise reduction and spectral dimensionality reduction but is restricted to white noise. Inspired by MWPT-MWF, we develop a new iterative method for reducing photon and thermal noise. First, the hyperspectral noise parameter estimation (HYNPE) algorithm is used to estimate the noise parameters; the signal-dependent (SD) noise is then converted to additive white Gaussian noise by a pre-whitening procedure, and the whitened HSI is denoised by the proposed method, SDNW-MWPT-MWF. For comparison, the Multiple Linear Regression (MLR) based denoising method and the tensor-based Multiway Wiener Filter (MWF) are also used within the denoising framework. An HSI captured by the Reflective Optics System Imaging Spectrometer (ROSIS) is used in the experiments, and the denoising performance is assessed from several aspects: the noise whitening performance, the Signal-to-Noise Ratio (SNR), and the classification performance. Results on the real-world airborne hyperspectral image HYDICE (Hyperspectral Digital Imagery Collection Experiment) are also presented and analyzed. These experiments show that it is worth taking the noise signal-dependency hypothesis into account when processing HYDICE and ROSIS HSIs.

Keywords:

hyperspectral image; signal-dependent noise; multiway Wiener filtering; denoising; classification; wavelet packet transform

1. Introduction

Hyperspectral images consist of a considerably large number of narrow spectral bands which are uniformly distributed over a wide spectral range. For each pixel, the almost continuous spectral signature adds the third orthogonal spectral dimension to the two-dimensional spatial domain and constitutes the well-known 3D data cube. Thus, hyperspectral signatures offer the capability to identify and discriminate ground cover types. To benefit from the additional spectral dimension, specific processing methods of hyperspectral images, for instance spectral signatures unmixing, target detection and classification, etc., have been developed. For these methods to operate well, the reduction of the noise affecting hyperspectral images (HSIs) should be incorporated into them.

The noise in HSIs can be divided into two classes [1]: random noise and fixed pattern noise. Photon and thermal noise are examples of random noise in HSIs, while striping, periodic and interference noise are examples of fixed pattern noise, which is generated by errors in the calibration process and can be removed from the HSI by suitable procedures [2]. However, random noise, due to its stochastic nature, cannot be removed by those procedures and influences the performance of the algorithms adopted in hyperspectral data exploitation. Note that, in this paper, we focus only on the random noise and will refer to it simply as noise. For sensors used in hyperspectral imagery, theory predicts that random noise mainly comes from two sources: signal-independent (SI) electronic noise and signal-dependent (SD) photon noise [3]. The widely accepted SI noise model is a white Gaussian one [1,3]. In [4,5], it was shown that the SI noise in some HSIs is colored, i.e., spectrally non-white. With the improved sensitivity of the electronic components [6], the resolution of the charge-coupled device (CCD) camera has improved significantly, so that the photon noise has become as dominant as the signal-independent electronic noise in HSI data collected by new-generation hyperspectral sensors [7,8,9,10]. In this case, the assumption of an additive and stationary noise model is not appropriate, although this hypothesis is plausible for HSIs where the SI noise is dominant while the SD noise, which depends on the useful signal level, is negligible. Therefore, in this paper, we use the widely accepted noise model in [7,8,9,10], which includes both signal-dependent and signal-independent noise. There are few denoising algorithms based on the photon noise model, and the tensor-based denoising methods were proposed only for the white noise situation. Exploiting the different statistical properties of SI and SD noise, we propose in this paper a method to remove the SD photon noise as well as the SI thermal noise in the HSI.

In the literature, two widely used models for the random noise in HSIs are additive white noise along both the spectral and spatial dimensions [8,11,12,13,14], and Additive White Gaussian Noise (AWGN) along the spatial dimensions that is non-stationary in the spectral dimension [15,16]; these are reasonable assumptions when the thermal noise is dominant [8]. However, when the SD photon noise is taken into account, these two noise models become inappropriate. The SD photon noise in digital images was discussed in [17,18], where the noise parameters were estimated by a scatter plot-based estimation procedure. Nonetheless, the discussion in these two papers was limited to pure SD noise, which is not suitable for the noise in HSIs. A more accurate generalized SD noise model for digital images was introduced in [19], where the noise parameters were estimated by the Linear Minimum Mean Square Error (LMMSE) in the wavelet domain. In addition, the same noise model was employed in [20] under the name Poissonian-Gaussian noise model, where the noise was modeled as two parts: the Poissonian part for modeling the photon noise and the Gaussian part for modeling the remaining stationary distribution in the output data. Images were first transformed into the wavelet domain; then, the expectation/standard-deviation pairs were estimated by a local estimation method. Finally, the Maximum Likelihood (ML) method was applied to the locally estimated expectation/standard-deviation pairs to estimate the global parameters. Based on the generalized SD noise model, a scatter plot-based estimation method was proposed in [7] to estimate the parameters of the noise generated by new-generation imaging spectrometers.

The generalized SD noise model was then introduced in the HSI framework in [3]. In that paper, the 2D fractal Brownian motion (fBm) model was employed as a statistical model for intra-band image texture correlation, which made it possible to estimate the additive noise variance locally from both homogeneous and textural image Scanning Windows (SWs). Then, the noise parameters were estimated by a linear fit in each band. The spectral textural correlation was also considered, and each SW in a Multicomponent Scanning Window (MSW) was treated as a mixture of fBm samples and noise. Finally, the global noise parameters were estimated by ML estimation.

The generalized SD noise model was also used in [8], where the HYperspectral Noise Parameter Estimation (HYNPE) algorithm was proposed. Unlike [3], HYNPE relies on the assumption that the noise in each band, after being whitened, follows a standard Gaussian distribution. The joint probability density function (PDF) of the whitened noise values was used to form the ML criterion. Nonetheless, the signal and noise values were assumed to be known in the ML criterion, which is not true in practical situations. Hence, an MLR-theory based approach was exploited to estimate them. However, MLR calculates the parameters by minimizing the Least Square Error (LSE), which is a biased estimator when the noise is not white. Thus, the MLR estimates are not accurate in HYNPE, which leads to inaccurate final parameter estimates. In this paper, we investigate the relevance, for classification [21], of a new denoising framework that simultaneously reduces the SD photon noise and the SI thermal noise in HSIs. Considering that the noise variance is entangled with the signal, a denoising loop is proposed to remove the noise from HSIs. Each iteration of the loop consists of three steps: first, we compute the pre-estimate, i.e., use MWPT-MWF to directly denoise the HSI containing the photon and thermal noise. Second, we use the pre-estimate to estimate the noise parameters by employing the ML criterion, and then whiten the noise. Finally, we apply MWPT-MWF to the whitened HSI to obtain the post-estimate. In the next iteration, the post-estimate of the previous iteration is used as the pre-estimate of the current iteration. After several iterations, the HSI is well denoised and the resulting classification results are improved.

The remainder of this paper is organized as follows: Section 2 gives the signal model used in this paper. Appendix A presents the Multilinear Algebra Tools. Appendix B overviews the HYNPE algorithm. Appendix C presents the MWPT-MWF method. Section 3 presents the proposed iterative denoising method: considering that the noise variance is dependent on the signal, a denoising loop is proposed to remove the noise from HSIs. Each iteration of the loop consists of three steps: (i) we perform a pre-estimation, i.e., use MWPT-MWF [22,23] to directly denoise the HSI containing the photon and thermal noise, (ii) we use the pre-estimation process to estimate the noise parameters by employing the ML criterion, and then whiten the noise and (iii) we apply MWPT-MWF to the whitened HSIs to obtain the post-estimate results. Then in the next iteration, we use the post-estimate of the last iteration as the pre-estimate of current iteration. After several iterations, the HSIs will be well denoised. Section 4 gives some comparative experimental results. The real-world HSI reflective optics system imaging spectrometer (ROSIS) is used in the experiments to evaluate the performances of denoising and classification processes. Finally, Section 6 provides details of the conclusions of the research undertaken.

In this paper, we denote by

$x \in R$	a scalar
$x \in R^{I_{1}}$	a vector
$X \in R^{I_{1} \times I_{2}}$	a matrix
$X \in R^{I_{1} \times I_{2} \times \dots \times I_{N}}$	an N-order tensor
$X_{n}$	n-mode unfolding matrix
$X_{X T}$	a tensor obtained by calculating the square root of each element of $X$
$I_{n}$	n-mode dimension
$∥ X ∥$	the Frobenius norm of $X$
∘	the vector outer product
⊙	the Khatri-Rao product
⊛	the Hadamard product
$< X, Y >$	the inner product between $X$ and $Y$
$E [\cdot]$	the mathematical expectation
${(.)}^{T}$	the transposition.

2. Signal Modeling with Thermal and Photon Noise

An HSI is a three-dimensional data cube and can be modeled as a tensor $R \in R^{I_{1} \times I_{2} \times I_{3}}$, with the first two dimensions being the spatial domain and the third dimension being the spectral domain. In fact, a hyperspectral sensor captures an HSI as a series of two-dimensional images of the spatial domain by a sensor array. By extending the data model in [7] to the 3D representation, a noisy HSI can be expressed as a third-order tensor $R \in R^{I_{1} \times I_{2} \times I_{3}}$ composed of a multidimensional signal $X \in R^{I_{1} \times I_{2} \times I_{3}}$ impaired by an additive random noise $N \in R^{I_{1} \times I_{2} \times I_{3}}$:

R = X + N,

(1)

where $N$ accounts for both thermal and photon noise and its variance depends on the pixel $x_{i_{1}, i_{2}, i_{3}}$ in the useful signal $X$. The photon noise is caused by the random fluctuation of the photon flux arriving at the CCD sensor, and it follows a Poisson model [24]. As the pixel size becomes smaller in new-generation hyperspectral sensors, the number of photons that reach a pixel per unit time becomes smaller as well. Hence, the photon noise cannot be neglected anymore [20]. For a given entry $x_{i_{1}, i_{2}, i_{3}}$ of the pure signal HSI tensor $X \in R^{I_{1} \times I_{2} \times I_{3}}$, the corresponding photon noise element $p_{i_{1} i_{2} i_{3}}$ of tensor $P \in R^{I_{1} \times I_{2} \times I_{3}}$ can be expressed as [25]:

p_{i_{1} i_{2} i_{3}} = \sqrt{x_{i_{1} i_{2} i_{3}}} u_{i_{1} i_{2} i_{3}},

(2)

where $u_{i_{1}, i_{2}, i_{3}}$ is a stationary, zero-mean, uncorrelated random process, independent of $x_{i_{1}, i_{2}, i_{3}}$, with variance $σ_{u, i_{3}}^{2}$. The thermal noise component in each sensor is electronic noise, denoted by $t_{i_{1}, i_{2}, i_{3}}$, which can be modeled as additive zero-mean white Gaussian noise in each band with variance $σ_{t, i_{3}}^{2}$; the noise variance changes from sensor to sensor due to the different states of the electronic components in the sensors. Elementwise, the data model is [7]:

r_{i_{1}, i_{2}, i_{3}} = x_{i_{1}, i_{2}, i_{3}} + \sqrt{x_{i_{1}, i_{2}, i_{3}}} \cdot u_{i_{1}, i_{2}, i_{3}} + t_{i_{1}, i_{2}, i_{3}} .

(3)

Then, we can define $N = P + T$, and Equation (1) can be correspondingly rewritten as

R = X + P + T .

(4)
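To make the data model concrete, the following minimal NumPy sketch simulates Equations (2) to (4) on a toy cube. The function name `add_sd_noise`, the toy cube and the parameter values are illustrative assumptions, and $u$ is drawn from a Gaussian distribution here, which is a modeling choice of the sketch.

```python
import numpy as np

def add_sd_noise(x, sigma_u2, sigma_t2, rng):
    """Simulate r = x + sqrt(x) * u + t elementwise (Equation (3)).

    x        : clean cube of shape (I1, I2, I3) with non-negative entries
    sigma_u2 : variance of u in each band, shape (I3,)
    sigma_t2 : variance of t in each band, shape (I3,)
    Gaussian draws for u are an assumption of this sketch.
    """
    # The (I3,) arrays broadcast along the last (spectral) axis.
    u = rng.standard_normal(x.shape) * np.sqrt(sigma_u2)
    t = rng.standard_normal(x.shape) * np.sqrt(sigma_t2)
    p = np.sqrt(x) * u          # photon noise, Equation (2)
    return x + p + t, p, t      # R = X + P + T, Equation (4)

rng = np.random.default_rng(0)
x = rng.uniform(50.0, 200.0, size=(32, 32, 4))   # toy stand-in for an HSI
sigma_u2 = np.full(4, 0.5)                       # illustrative values
sigma_t2 = np.full(4, 4.0)
r, p, t = add_sd_noise(x, sigma_u2, sigma_t2, rng)
```

In this toy run, the empirical variance of `p` is close to the mean of `x` times `sigma_u2`, which is the signal dependency that the rest of the paper has to undo.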

The unfolding matrix $R_{3} \in R^{I_{3} \times M_{3}}$ of the HSI data tensor $R \in R^{I_{1} \times I_{2} \times I_{3}}$ (with $M_{3} = I_{1} I_{2}$) can be expressed as:

R_{3} = X_{3} + N_{3},

(5)

where $X_{3}$ is the mode-3 unfolding matrix of the multidimensional signal tensor $X$ and

N_{3} = P_{3} + T_{3},

(6)

with $P_{3}$ and $T_{3}$ being the mode-3 unfolding matrices of $P$ and $T$, respectively.
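As an illustration of Equation (5), a mode-3 unfolding can be sketched with a reshape. The helper names `unfold_mode3` and `fold_mode3` are hypothetical, and the column ordering (row-major over the two spatial indices) is one possible convention; other orderings only permute the columns.

```python
import numpy as np

def unfold_mode3(tensor):
    """Return the I3 x (I1*I2) matrix whose k-th row is spectral band k."""
    i1, i2, i3 = tensor.shape
    return tensor.reshape(i1 * i2, i3).T

def fold_mode3(matrix, shape):
    """Inverse of unfold_mode3 for a tensor of the given (I1, I2, I3) shape."""
    i1, i2, i3 = shape
    return matrix.T.reshape(i1, i2, i3)

cube = np.arange(24.0).reshape(2, 3, 4)   # toy tensor with I1=2, I2=3, I3=4
r3 = unfold_mode3(cube)                   # shape (4, 6), i.e., I3 x M3
```

Because unfolding is linear, `unfold_mode3(x + n)` equals `unfold_mode3(x) + unfold_mode3(n)`, which is exactly what Equations (5) and (6) state.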

3. Proposed Method

The aim of this paper is to obtain the pure signal estimate $\hat{X}$, which is also necessary to determine the noise variance. Nonetheless, since the noise is signal-dependent, the noise variances of the entries of $R$ differ from each other and are related to the signal entries $x_{i_{1} i_{2} i_{3}}$. It is worth noting that, for a given entry $x_{i_{1} i_{2} i_{3}}$, the noise $n_{i_{1} i_{2} i_{3}}$ is a weighted sum of two Gaussian-distributed variables $u_{i_{1} i_{2} i_{3}}$ and $t_{i_{1} i_{2} i_{3}}$. Hence, $n_{i_{1} i_{2} i_{3}} = \sqrt{x_{i_{1} i_{2} i_{3}}} u_{i_{1} i_{2} i_{3}} + t_{i_{1} i_{2} i_{3}}$ is a conditional zero-mean Gaussian-distributed random variable, i.e.,

n_{i_{1} i_{2} i_{3}} \sim N (0, σ_{n_{i_{1} i_{2} i_{3}}}^{2}),

(7)

where $N (\cdot)$ denotes the normal distribution, and $σ_{n_{i_{1} i_{2} i_{3}}}^{2}$ is the noise variance, which can be expressed as [3,7,8]:

\begin{matrix} σ_{n_{i_{1} i_{2} i_{3}}}^{2} & = E [{(\sqrt{x_{i_{1} i_{2} i_{3}}} u_{i_{1} i_{2} i_{3}} + t_{i_{1} i_{2} i_{3}})}^{2} | x_{i_{1} i_{2} i_{3}}] \\ & = x_{i_{1} i_{2} i_{3}} σ_{u, i_{3}}^{2} + σ_{t, i_{3}}^{2} . \end{matrix}

(8)
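The step between the two lines of Equation (8) can be spelled out as follows. Indices are dropped for readability, and the derivation assumes, in line with the model of Section 2, that $u$ and $t$ are zero-mean, independent of $x$, and mutually uncorrelated (the last point is an assumption made explicit here).

```latex
\begin{aligned}
E\left[(\sqrt{x}\,u + t)^{2} \mid x\right]
  &= x\,E[u^{2}] + 2\sqrt{x}\,E[u\,t] + E[t^{2}] \\
  &= x\,\sigma_{u}^{2} + 0 + \sigma_{t}^{2}.
\end{aligned}
```

The cross term vanishes because $E[u\,t] = E[u]\,E[t] = 0$ under these assumptions, so the variance is affine in the signal value $x$.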

That is to say, a precise noise variance estimate ${\hat{σ}}_{n_{i_{1} i_{2} i_{3}}}^{2}$ requires a precise signal estimate ${\hat{x}}_{i_{1} i_{2} i_{3}}$. Hence, the signal estimation and noise variance estimation problems are inter-related, which makes the signal estimation problem difficult to solve. In HYNPE, the noise parameters are estimated by using the signal estimate generated by the MLR-theory based method, which estimates the signal by minimizing the LSE. However, the LSE estimator requires the signal and noise to be statistically independent, which is not satisfied in the SD photon noise situation. Thus, the estimates of the signal and noise are not precise, which makes the parameter estimates unreliable as well. Moreover, since there is only one step for estimating the signal, the imprecise estimate degrades the performance of the noise parameter estimation.

To use the classical parameter estimation algorithms, such as LSE and LMMSE, it is necessary to make the noise “independent of” the signal.

From Equation (8), it is evident that the noise variance $σ_{n_{i_{1} i_{2} i_{3}}}^{2}$ depends on the signal $x_{i_{1} i_{2} i_{3}}$. To break this dependency, we need to whiten the noise:

\underline{n}_{i_{1} i_{2} i_{3}} = \frac{n_{i_{1} i_{2} i_{3}}}{σ_{n_{i_{1} i_{2} i_{3}}}} \sim N (0, 1),

(9)

where the underline is used to distinguish the whitened data from the original data. After the whitening operation, we can consider that the noise $\underline{n}_{i_{1} i_{2} i_{3}}$ is independent of the whitened signal $\underline{x}_{i_{1} i_{2} i_{3}} = \frac{x_{i_{1} i_{2} i_{3}}}{σ_{n_{i_{1} i_{2} i_{3}}}}$.

It is worth noting that, in the likelihood function Equation (A13), the signal value $x_{i_{1} i_{2} i_{3}}$ is assumed to be known. However, we cannot get this prior information in realistic situations; therefore, the signal value $x_{i_{1} i_{2} i_{3}}$ should be replaced by its estimate ${\tilde{x}}_{i_{1} i_{2} i_{3}}$, as is presented in Appendix B.

According to [22], the wavelet-tensor-based algorithm MWPT-MWF yields the most accurate signal estimate ${\tilde{x}}_{i_{1} i_{2} i_{3}}$. In addition, the well-known HYNPE algorithm provides the ML estimates ${\hat{σ}}_{u, i_{3}}^{2}$ and ${\hat{σ}}_{t, i_{3}}^{2}$. Hence, in this paper, we choose to combine these two methods to get a more accurate estimate of the noise variance for each element of $N$, which can be calculated by:

{\hat{σ}}_{n_{i_{1} i_{2} i_{3}}}^{2} = {\tilde{x}}_{i_{1} i_{2} i_{3}} {\hat{σ}}_{u, i_{3}}^{2} + {\hat{σ}}_{t, i_{3}}^{2} .

(10)

When the noise variance of each entry of $R$ is obtained, the noise can be whitened by:

\underline{r}_{i_{1} i_{2} i_{3}} = \underline{x}_{i_{1} i_{2} i_{3}} + \underline{n}_{i_{1} i_{2} i_{3}} = \frac{x_{i_{1} i_{2} i_{3}}}{{\hat{σ}}_{n_{i_{1} i_{2} i_{3}}}} + \frac{n_{i_{1} i_{2} i_{3}}}{{\hat{σ}}_{n_{i_{1} i_{2} i_{3}}}}

(11)

and the whitened hyperspectral image can be written as

\underline{R} = \underline{X} + \underline{N} .

(12)
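The whitening of Equations (10) and (11) amounts to an elementwise division by the estimated noise standard deviation. The sketch below is a minimal NumPy illustration; the helper names, the clipping of negative signal estimates and the variance floor are assumptions added for numerical safety, and the toy parameters are not taken from the experiments.

```python
import numpy as np

def noise_variance(x_tilde, sigma_u2_hat, sigma_t2_hat, floor=1e-12):
    """Per-element noise variance following Equation (10)."""
    # Clipping negative signal estimates is an assumption of this sketch.
    var = np.clip(x_tilde, 0.0, None) * sigma_u2_hat + sigma_t2_hat
    return np.maximum(var, floor)

def whiten(r, var):
    """Elementwise whitening following Equation (11)."""
    return r / np.sqrt(var)

def unwhiten(x_white, var):
    """Inverse whitening, used to map an estimate back to the data scale."""
    return x_white * np.sqrt(var)

# Toy check with known signal and parameters (illustrative values only).
rng = np.random.default_rng(0)
x = rng.uniform(50.0, 200.0, size=(64, 64, 3))
sigma_u2 = np.array([0.2, 0.5, 1.0])
sigma_t2 = np.array([1.0, 4.0, 9.0])
n = np.sqrt(x * sigma_u2) * rng.standard_normal(x.shape) \
    + np.sqrt(sigma_t2) * rng.standard_normal(x.shape)
var = noise_variance(x, sigma_u2, sigma_t2)
band_var = whiten(n, var).var(axis=(0, 1))   # per-band variance after whitening
```

With the true signal and parameters, `band_var` stays close to 1 in every band; in practice the quality of the whitening depends on how good the pre-estimate and the parameter estimates are.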

However, MWPT-MWF was proposed for the white noise situation; therefore, when we use it to estimate ${\tilde{x}}_{i_{1} i_{2} i_{3}}$ directly without noise whitening, the estimate is not accurate. To distinguish it from the signal estimate after noise whitening, ${\tilde{x}}_{i_{1} i_{2} i_{3}}$ is called the pre-estimate. Correspondingly, the estimate ${\hat{x}}_{i_{1} i_{2} i_{3}}$ obtained after the noise whitening procedure is more accurate than ${\tilde{x}}_{i_{1} i_{2} i_{3}}$, so we call it the post-estimate. The PWP process needs to be repeated several times to improve the estimation performance. In fact, HYNPE only takes the first pre-estimation step of the PWP procedure, which degrades its performance when estimating the parameters. In contrast, we use the more accurate post-estimate of the current PWP iteration as the pre-estimate of the next PWP iteration. Therefore, the estimation accuracy can be improved in the PWP loop.

To adaptively stop the PWP loop according to the processed HSI, we need a stop criterion. The RMSE between the pre-estimate $\tilde{X}$ and the post-estimate $\hat{X}$ is given as follows:

\mathrm{RMSE}_{X} = \frac{∥ \hat{X} - \tilde{X} ∥^{2}}{I_{1} I_{2} I_{3} {∥ \hat{X} ∥}^{2}},

(13)

where $\tilde{X}$ and $\hat{X}$ are the tensor forms of the pre-estimate and post-estimate, respectively. As the number of iterations increases, $\mathrm{RMSE}_{X}$ becomes asymptotically stable. Hence, we can use the relative change of $\mathrm{RMSE}_{X}$ between two adjacent iterations as the stop criterion:

e = \frac{| \mathrm{RMSE}_{X} - \mathrm{RMSE}_{X}^{0} |}{\mathrm{RMSE}_{X}^{0}},

(14)

where $\mathrm{RMSE}_{X}^{0}$ is the RMSE of the previous iteration. If $e$ is less than a given value $ϵ$, the loop is terminated. This newly proposed method is called Signal-Dependent-Noise-Whitening MWPT-MWF (SDNW-MWPT-MWF), and, to make it easy to understand, its pseudo-code and flowchart are supplied in Algorithm 1 and Figure 1, respectively.
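The two quantities of the stop criterion translate directly into code. The following sketch is a literal transcription of Equations (13) and (14) in NumPy; the function names are hypothetical.

```python
import numpy as np

def rmse_x(x_hat, x_tilde):
    """Equation (13): squared difference between post- and pre-estimate,
    normalized by the number of entries and the squared norm of x_hat."""
    return np.sum((x_hat - x_tilde) ** 2) / (x_hat.size * np.sum(x_hat ** 2))

def relative_change(rmse_now, rmse_prev):
    """Equation (14): relative change between two adjacent iterations."""
    return abs(rmse_now - rmse_prev) / rmse_prev

# Tiny worked example: 8 entries, all ones versus all zeros.
example = rmse_x(np.ones((2, 2, 2)), np.zeros((2, 2, 2)))   # 8 / (8 * 8)
```

Note that Equation (13) as written contains no square root, so the sketch does not take one either.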

Algorithm 1: SDNW-MWPT-MWF algorithm

procedure SDNW-MWPT-MWF(Tensor $R$)
    Set the maximum number of iterations J.
    Set $\mathrm{RMSE}_{X}^{0} = 1$.
    Compute the signal pre-estimate $\tilde{X}$ by applying MWPT-MWF to the data tensor $R$.
    for j = 1; j <= J; j++ do
        Compute the SD and SI noise variance estimates ${\hat{σ}}_{u, i_{3}}^{2}$ and ${\hat{σ}}_{t, i_{3}}^{2}$ by using Equation (A12).
        Compute the noise variance ${\hat{σ}}_{n_{i_{1} i_{2} i_{3}}}^{2}$ of each element of $R$: ${\hat{σ}}_{n_{i_{1} i_{2} i_{3}}}^{2} = {\tilde{x}}_{i_{1} i_{2} i_{3}} {\hat{σ}}_{u, i_{3}}^{2} + {\hat{σ}}_{t, i_{3}}^{2}$.
        Compute the whitened tensor $\underline{R}$ by whitening each element $r_{i_{1} i_{2} i_{3}}$ of $R$: $\underline{r}_{i_{1} i_{2} i_{3}} = \frac{r_{i_{1} i_{2} i_{3}}}{{\hat{σ}}_{n_{i_{1} i_{2} i_{3}}}}$.
        Compute the whitened signal post-estimate $\underline{\hat{X}}$ by applying MWPT-MWF to the whitened tensor $\underline{R}$.
        Compute the signal post-estimate $\hat{X}$ by applying the inverse whitening operation to $\underline{\hat{X}}$: ${\hat{x}}_{i_{1} i_{2} i_{3}} = \underline{\hat{x}}_{i_{1} i_{2} i_{3}} \times {\hat{σ}}_{n_{i_{1} i_{2} i_{3}}}$.
        Compute $\mathrm{RMSE}_{X}$ by using Equation (13).
        Compute e by using Equation (14).
        if $e < ϵ$ then
            Break.
        end if
        Use the post-estimate $\hat{X}$ of this iteration as the pre-estimate $\tilde{X}$ of the next iteration: $\tilde{X} \leftarrow \hat{X}$.
        Refresh the value of $\mathrm{RMSE}_{X}^{0}$: $\mathrm{RMSE}_{X}^{0} \leftarrow \mathrm{RMSE}_{X}$.
    end for
    return tensor $\hat{X}$.
end procedure
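The control flow of Algorithm 1 can be sketched in NumPy with pluggable components. This is a structural illustration only: `box_filter` (a 3 × 3 spatial mean) stands in for MWPT-MWF and `ls_noise_params` (a per-band line fit of squared residuals) stands in for the ML estimation of Equation (A12); both are hypothetical placeholders chosen to keep the example self-contained, as are the variance floor and the guard against division by zero.

```python
import numpy as np

def box_filter(cube, k=3):
    """Stand-in denoiser: k x k spatial mean in every band (NOT MWPT-MWF)."""
    pad = k // 2
    padded = np.pad(cube, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    out = np.zeros(cube.shape, dtype=float)
    for di in range(k):
        for dj in range(k):
            out += padded[di:di + cube.shape[0], dj:dj + cube.shape[1], :]
    return out / (k * k)

def ls_noise_params(r, x_tilde):
    """Stand-in estimator: per-band line fit of squared residuals against
    the signal estimate (NOT the ML criterion used by HYNPE)."""
    n_bands = r.shape[2]
    sigma_u2, sigma_t2 = np.empty(n_bands), np.empty(n_bands)
    for k in range(n_bands):
        s = x_tilde[:, :, k].ravel()
        resid2 = (r[:, :, k].ravel() - s) ** 2
        slope, intercept = np.polyfit(s, resid2, 1)
        sigma_u2[k], sigma_t2[k] = max(slope, 0.0), max(intercept, 0.0)
    return sigma_u2, sigma_t2

def pwp_loop(r, denoise, estimate_params, max_iter=10, eps=1e-3):
    """Pre-estimate, whitening, post-estimate loop following Algorithm 1."""
    x_tilde = denoise(r)                          # initial pre-estimate
    x_hat, rmse_prev = x_tilde, 1.0
    for _ in range(max_iter):
        sigma_u2, sigma_t2 = estimate_params(r, x_tilde)
        var = np.clip(x_tilde, 0.0, None) * sigma_u2 + sigma_t2   # Eq. (10)
        std = np.sqrt(np.maximum(var, 1e-12))     # floor is an assumption
        x_hat = denoise(r / std) * std            # whiten, denoise, un-whiten
        rmse = np.sum((x_hat - x_tilde) ** 2) / (x_hat.size * np.sum(x_hat ** 2))
        e = abs(rmse - rmse_prev) / max(rmse_prev, np.finfo(float).tiny)
        if e < eps:
            break
        x_tilde, rmse_prev = x_hat, rmse          # post-estimate becomes pre-estimate
    return x_hat

# Toy run on a smooth synthetic cube (illustrative values only).
rng = np.random.default_rng(1)
x = np.linspace(50.0, 150.0, 32)[:, None, None] + 20.0 * np.arange(4) + np.zeros((32, 32, 4))
r = x + np.sqrt(x) * rng.normal(0.0, np.sqrt(0.5), x.shape) + rng.normal(0.0, 2.0, x.shape)
x_hat = pwp_loop(r, box_filter, ls_noise_params)
```

Passing the denoiser and the parameter estimator as arguments mirrors the way the experiments swap MWPT-MWF for MWF or MLR inside the same loop.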

Figure 1. Flowchart of the SDNW-MWPT-MWF algorithm.

4. Experimental Results

The data set used in the experiments is the HSI captured by the ROSIS during a flight campaign over Pavia University, Northern Italy.

The ROSIS image has 103 spectral bands and $610 \times 340$ pixels, with a geometric resolution of 1.3 m. In this paper, only a part of $250 \times 250$ pixels of this image is used. Hence, it is modeled as a $250 \times 250 \times 103$ tensor in the experiments. The SD photon noise $P$ and the SI thermal noise $T$ are both taken into account. In order to reproduce different noise scenarios, the SNR ranges from 20 dB to 40 dB with a step of 5 dB. As the power of the SD photon noise and that of the SI thermal noise are of the same level, in this paper, only the case

E [∥ X_{X T} ⊛ P ∥^{2}] = E [{∥ T ∥}^{2}]

is taken into account. The random noise is generated with a variance depending on the value of the useful signal according to Equation (8) and added to the signal $X$ as in Equation (3) to create the noisy HSI data $R$. The raw HSI has an SNR between 35 and 40 dB [26,27]. This high-SNR HSI can be viewed as a noise-free data cube, so, in this experiment, the raw image is taken as the reference data cube $X$.
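One way to generate noise with equal SD and SI power at a prescribed input SNR is sketched below. It assumes the SNR is defined as $10 \log_{10} (∥ X ∥^{2} / E [∥ N ∥^{2}])$ and, for simplicity, that $σ_{u}^{2}$ and $σ_{t}^{2}$ are constant over the bands; neither assumption is stated in the text, and the function name is hypothetical.

```python
import numpy as np

def noise_params_for_snr(x, snr_db):
    """Return (sigma_u2, sigma_t2), constant over bands, such that
    E||P||^2 = E||T||^2 and 10*log10(||X||^2 / E||N||^2) = snr_db."""
    noise_power = np.sum(x ** 2) / 10.0 ** (snr_db / 10.0)
    half = noise_power / 2.0
    sigma_u2 = half / np.sum(x)    # E||P||^2 = sigma_u2 * sum of x
    sigma_t2 = half / x.size       # E||T||^2 = sigma_t2 * number of entries
    return sigma_u2, sigma_t2

rng = np.random.default_rng(0)
x = rng.uniform(50.0, 200.0, size=(32, 32, 8))   # toy stand-in for the reference cube
sigma_u2, sigma_t2 = noise_params_for_snr(x, snr_db=20.0)
p = np.sqrt(x * sigma_u2) * rng.standard_normal(x.shape)   # SD photon noise
t = np.sqrt(sigma_t2) * rng.standard_normal(x.shape)       # SI thermal noise
snr_emp = 10.0 * np.log10(np.sum(x ** 2) / np.sum((p + t) ** 2))
```

The empirical SNR of one realization, `snr_emp`, fluctuates slightly around the target because the calibration is done on expected powers.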

The RGB composites of $X$ and $R$ are shown in Figure 2. Correspondingly, Figure 3 presents the curve of the mean noise variance versus the band number.

Figure 2. RGB composites of $X$ and $R$ (bands 20, 35 and 45 for red, green and blue): (a) RGB composite of $X$; (b) RGB composite of $R$.

Figure 3. Mean noise variance in each band ($\mathrm{SNR}_{\mathrm{INPUT}} = 20$ dB).

In the experiments, the wavelet db3 and transform level [1 1 0] are employed in SDNW-MWPT-MWF. It is worth noting that various denoising methods can be used to replace MWPT-MWF in the proposed SDNW-MWPT-MWF method. MWF is a classical tensor-based denoising method and MLR was used in the HYNPE method. Therefore, MWF and MLR have been considered as replacements for MWPT-MWF in comparative experiments, and the resulting methods are named SDNW-MWF and SDNW-MLR, respectively. In this paper, the value of $ϵ$ is set to $10^{-3}$ for these three methods.

4.1. Noise-Whitening Performance Evaluation and Comparison

According to the relationship given in Equation (10), the noise variance estimate ${\hat{σ}}_{n_{i_{1} i_{2} i_{3}}}^{2}$ relies on the SD noise variance estimate ${\hat{σ}}_{u, i_{3}}^{2}$ and the SI noise variance estimate ${\hat{σ}}_{t, i_{3}}^{2}$. Hence, the accuracy of ${\hat{σ}}_{u, i_{3}}^{2}$ and ${\hat{σ}}_{t, i_{3}}^{2}$ directly influences the noise variance estimation result. Thus, we first consider the evolution of ${\hat{σ}}_{u, i_{3}}^{2}$ and ${\hat{σ}}_{t, i_{3}}^{2}$ in the estimation loop. The RMSE is employed to analyze the accuracy of ${\hat{σ}}_{u, i_{3}}^{2}$ and ${\hat{σ}}_{t, i_{3}}^{2}$. The RMSEs of the SD photon noise variance and the SI thermal noise variance are calculated by

\mathrm{RMSE}_{\mathrm{SD}} = \frac{1}{I_{3}} \sum_{i_{3} = 1}^{I_{3}} {(\frac{{\hat{σ}}_{u, i_{3}}^{2} - σ_{u, i_{3}}^{2}}{σ_{u, i_{3}}^{2}})}^{2},

(15)

\mathrm{RMSE}_{\mathrm{SI}} = \frac{1}{I_{3}} \sum_{i_{3} = 1}^{I_{3}} {(\frac{{\hat{σ}}_{t, i_{3}}^{2} - σ_{t, i_{3}}^{2}}{σ_{t, i_{3}}^{2}})}^{2} .

(16)
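Equations (15) and (16) share the same form, so a single helper suffices. The sketch below is a direct NumPy transcription; the function name and the toy values are illustrative.

```python
import numpy as np

def relative_variance_error(estimated, true):
    """Mean over the bands of the squared relative error, as in Equations
    (15) and (16). Both inputs are per-band variances of shape (I3,)."""
    estimated = np.asarray(estimated, dtype=float)
    true = np.asarray(true, dtype=float)
    return np.mean(((estimated - true) / true) ** 2)

# Two bands, each estimate 10% too large: the result is 0.1**2 = 0.01.
rmse_sd = relative_variance_error([1.1, 2.2], [1.0, 2.0])
```

Because each term is normalized by the true variance, bands with very different noise levels contribute on the same scale.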

Notice that low values of $\mathrm{RMSE}_{\mathrm{SD}}$ and $\mathrm{RMSE}_{\mathrm{SI}}$ denote good estimation accuracy.

A comparative analysis of SDNW-MWF, SDNW-MLR and SDNW-MWPT-MWF has been carried out by analyzing $\mathrm{RMSE}_{\mathrm{SD}}$ and $\mathrm{RMSE}_{\mathrm{SI}}$, with the maximum number of iterations set to 10. Figure 4 and Figure 5 present the evolution of $\mathrm{RMSE}_{\mathrm{SD}}$ and $\mathrm{RMSE}_{\mathrm{SI}}$ (in logarithmic scale) with the number of iterations. From these two figures, it can be seen that, in the case where $\mathrm{SNR}_{\mathrm{INPUT}} = 20$ dB, SDNW-MLR performs better than SDNW-MWF in estimating both $σ_{u, i_{3}}^{2}$ and $σ_{t, i_{3}}^{2}$. However, in the case where $\mathrm{SNR}_{\mathrm{INPUT}} = 40$ dB, SDNW-MWF outperforms SDNW-MLR. Nonetheless, in both cases, the proposed SDNW-MWPT-MWF improves the estimation performance significantly, as shown by the lowest $\mathrm{RMSE}_{\mathrm{SD}}$ and $\mathrm{RMSE}_{\mathrm{SI}}$ it obtains. Moreover, $\mathrm{RMSE}_{\mathrm{SD}}$ and $\mathrm{RMSE}_{\mathrm{SI}}$ become greater than the initial error in SDNW-MLR and SDNW-MWF, whereas they are well constrained in SDNW-MWPT-MWF.

Figure 4. Evolution of $\mathrm{RMSE}_{\mathrm{SD}}$ with the number of iterations for different values of $\mathrm{SNR}_{\mathrm{INPUT}}$: (a) 20 dB; (b) 40 dB.

Figure 5. Evolution of $\mathrm{RMSE}_{\mathrm{SI}}$ with the number of iterations for different values of $\mathrm{SNR}_{\mathrm{INPUT}}$: (a) 20 dB; (b) 40 dB.

Apart from $σ_{u, i_{3}}^{2}$ and $σ_{t, i_{3}}^{2}$, the estimate of the signal also influences the accuracy of the noise variance estimate of a pixel (see Equation (10)). Since the post-estimate ${\hat{x}}_{i_{1} i_{2} i_{3}}$ of the current PWP iteration is used as the pre-estimate ${\tilde{x}}_{i_{1} i_{2} i_{3}}$ of the next PWP iteration, we only analyze the estimation performance of the post-estimate ${\hat{x}}_{i_{1} i_{2} i_{3}}$. To assess the performance of the signal estimator ${\hat{x}}_{i_{1} i_{2} i_{3}}$, we resort to the $\mathrm{SNR}_{\mathrm{OUTPUT}}$, which will be presented in Section 4.2.

To present the noise whitening results intuitively, Figure 6 shows the mean noise variance of each band after the noise whitening operation. The mean noise variance generated by SDNW-MWPT-MWF fluctuates only slightly around 1 and is nearly constant with respect to the band number. However, the mean noise variance generated by SDNW-MLR and SDNW-MWF is not satisfactory. In Figure 6a, it can be seen that the trend of the mean noise variances in the lower bands is similar to that in Figure 3, which implies that the noise in the lower bands (from 1 to 20) is not well whitened. On the other hand, in the bands from 20 to 100, although the noise variances are relatively constant, the mean noise variance value is not 1. In Figure 6b, the noise whitening results of SDNW-MLR and SDNW-MWF are worse than those in Figure 6a. Hence, the whitening results in Figure 6 are strongly in favor of the proposed method SDNW-MWPT-MWF.

Figure 6. Curves of noise variance versus band number for different values of $\mathrm{SNR}_{\mathrm{INPUT}}$: (a) 20 dB; (b) 40 dB.

The normal probability plot is usually used to visually ascertain whether or not a dataset is approximately normally distributed, which here means that the values in the dataset have the same noise variance. Hence, we supply the normal probability plots of the noise before and after whitening in Figure 7. The noise before whitening (Figure 7a) is clearly not normally distributed. After whitening by SDNW-MWF (Figure 7b), the noise approaches the normal distribution, though some values are still not well whitened, as can be observed in the interval [−5, 0]. Nonetheless, the noise values after whitening by SDNW-MLR (Figure 7c) and SDNW-MWPT-MWF (Figure 7d) form a straight line and as such can be considered normally distributed. Correspondingly, Figure 8a presents the noise distribution before whitening in the noise environment where $\mathrm{SNR}_{\mathrm{INPUT}} = 40$ dB. It can be seen that the noise values after whitening by SDNW-MLR (Figure 8b) and SDNW-MWF (Figure 8c) are still not normally distributed, while the noise values after whitening by SDNW-MWPT-MWF (Figure 8d) are well normally distributed. The results in Figure 7 and Figure 8 validate once again that the proposed SDNW-MWPT-MWF performs well in both lower ($\mathrm{SNR}_{\mathrm{INPUT}} = 20$ dB) and higher ($\mathrm{SNR}_{\mathrm{INPUT}} = 40$ dB) SNR noise environments.
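The idea behind the normal probability plot can be reproduced numerically: sorted samples are paired with standard normal quantiles, and the closer the pairs lie to a straight line, the closer the correlation coefficient is to 1. The sketch below uses the plotting positions $(i - 0.5)/n$, which is one common convention and an assumption here; the toy noise is synthetic and unrelated to the ROSIS data.

```python
import numpy as np
from statistics import NormalDist

def normal_probability_plot(samples):
    """Return theoretical normal quantiles, ordered samples, and their
    correlation coefficient (close to 1 for normally distributed data)."""
    ordered = np.sort(np.ravel(samples))
    n = ordered.size
    probs = (np.arange(1, n + 1) - 0.5) / n        # plotting positions
    std_normal = NormalDist()
    theoretical = np.array([std_normal.inv_cdf(p) for p in probs])
    corr = np.corrcoef(theoretical, ordered)[0, 1]
    return theoretical, ordered, corr

# Signal-dependent toy noise: the variance changes with the signal level.
rng = np.random.default_rng(0)
x = 10.0 ** rng.uniform(0.0, 3.0, size=4000)       # signal spanning 1 to 1000
var = x * 1.0 + 0.1                                # sigma_u2 = 1, sigma_t2 = 0.1
n = np.sqrt(var) * rng.standard_normal(x.size)
_, _, corr_raw = normal_probability_plot(n)                    # before whitening
_, _, corr_white = normal_probability_plot(n / np.sqrt(var))   # after whitening
```

Pooling noise samples with different variances gives a heavy-tailed mixture, so the plot bends away from a straight line before whitening and straightens afterwards.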

Figure 7. Normal probability plot for the noise: (a) before whitening, and after whitening using (b) SDNW-MWF; (c) SDNW-MLR; (d) SDNW-MWPT-MWF, with $\mathrm{SNR}_{\mathrm{INPUT}} = 20$ dB.

Figure 8. Normal probability plot for the noise: (a) before whitening, and after whitening using (b) SDNW-MLR; (c) SDNW-MWF; (d) SDNW-MWPT-MWF, with $\mathrm{SNR}_{\mathrm{INPUT}} = 40$ dB.

4.2. Denoising Performance

In Section 4.1, we discussed in detail the estimation errors $\mathrm{RMSE}_{\mathrm{SD}}$ and $\mathrm{RMSE}_{\mathrm{SI}}$ of the SD and SI noise variances. In this subsection, we show some results about the denoising performance. Figure 9 presents the noise removed in band 10 from the ROSIS HSI by SDNW-MLR, SDNW-MWF and SDNW-MWPT-MWF in the noise environment where $\mathrm{SNR}_{\mathrm{INPUT}} = 20$ dB. Comparing Figure 9 visually with the original image in Figure 2a shows that the removed noise is signal-dependent. The experimental results in Figure 9 imply that the proposed denoising framework is efficient in removing the SD photon noise in the HSI.

Figure 9. Noise removal using: (a) SDNW-MLR; (b) SDNW-MWF; (c) SDNW-MWPT-MWF, band 10, $\mathrm{SNR}_{\mathrm{INPUT}} = 20$ dB.

Additionally, we have assessed the denoising performance of various methods by analyzing the criterion

S N R_{O U T P U T}

. Figure 10 presents the evolution of the

S N R_{O U T P U T}

in various noisy environments. It is obvious that the

S N R_{O U T P U T}

generated by SDNW-MWPT-MWF reaches a higher stable value after several iterations. Conversely, the

S N R_{O U T P U T}

of SDNW-MLR and SDNW-MWF are relatively lower than that of SDNW-MWPT-MWF due to the highest estimation errors of

R M S E_{S D}

and

R M S E_{S I}

as shown in Figure 4 and Figure 5, respectively. When

S N R_{I N P U T} = 20

dB, the

S N R_{O U T P U T}

generated by SDNW-MWF is only improved marginally compared to the

S N R_{I N P U T}

. On the other hand, when

S N R_{I N P U T} = 40

dB SDNW-MWF improves the SNR significantly but the

S N R_{O U T P U T}

is not stable in the evolution when the iterations are increased. The SDNW-MLR performs well when

S N R_{I N P U T} = 20

dB, but when

S N R_{I N P U T} = 40

dB the denoising performance of SDNW-MLR is worse than that of SDNW-MWF and SDNW-MWPT-MWF. In fact, the

S N R_{O U T P U T}

of SDNW-MLR is lower than the

S N R_{I N P U T}

. Moreover, Figure 11 compares the

S N R_{O U T P U T}

of each method from 20 dB to 40 dB with a step of 5 dB. It shows that SDNW-MLR improves the SNR from 20 to 30 dB but degrades it from 35 to 40 dB. SDNW-MWF is even worse: there is only a marginal improvement of the SNR at 20 dB, and from 25 to 40 dB it degrades the SNR. By contrast, the results presented in Figure 10 and Figure 11 show that SDNW-MWPT-MWF is a stable and reliable denoising method in various noisy environments.
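As a point of reference, an output SNR of this kind can be computed as sketched below. This is a minimal sketch that assumes the common definition SNR = 10 log10(‖X‖² / ‖X − X̂‖²), with X the noise-free HSI and X̂ its estimate; the function name `snr_db` and the toy data are illustrative only, and the definition given earlier in the paper takes precedence if it differs.

```python
import numpy as np

def snr_db(clean, estimate):
    """SNR in dB, assuming SNR = 10*log10(||X||^2 / ||X - X_hat||^2)."""
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    signal_power = np.sum(clean ** 2)
    error_power = np.sum((clean - estimate) ** 2)
    return 10.0 * np.log10(signal_power / error_power)

# Toy data (not from the paper): a constant cube and an estimate
# that is off by 0.1 everywhere, which gives about 20 dB.
x = np.ones((5, 5, 4))
x_hat = x + 0.1
print(round(float(snr_db(x, x_hat)), 2))
```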

Figure 10. Evolution of

S N R_{O U T P U T}

with iteration times according to different values of

S N R_{I N P U T}

: (a) 20 dB; (b) 40 dB.

Figure 11. Comparison of denoising results based on

S N R_{O U T P U T}

versus

S N R_{I N P U T}

.

4.3. Classification after Denoising

In Section 4.2, we mainly compared the various methods in terms of their capability of improving the SNR. However, some methods might also severely modify the useful signal in the denoising process, which is not reflected by the SNR. Classification is employed to distinguish different materials in HSIs [21], and it is sensitive to signal distortion. Hence, in this paper, we also take into account the ability of the various methods to improve classification.

Two real-world images are considered for this investigation. The first one, referred to as ROSIS HSI, is described at the beginning of this section.

In the ROSIS HSI, there are nine classes: bitumen, self-blocking bricks, trees, shadows, gravel, bare soil, asphalt, painted metal sheets, and meadows, which are shown in Figure 2b and in the ground truth in Figure 12 with different colors. A proportion of

10 %

of the reference data of each class is randomly selected as the training samples. The numbers of training and testing samples are shown in Table 1. An SVM classifier with an RBF kernel is used for the classification, with

γ = 1

and the penalty parameter

C = 100

.
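The per-class random selection of training samples described above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `stratified_split`, the fixed seed, and the synthetic label vector are hypothetical and are not taken from the experiments.

```python
import numpy as np

def stratified_split(labels, train_fraction, seed=0):
    """Randomly pick `train_fraction` of the samples of each class for training.

    Returns (train_idx, test_idx) as sorted index arrays into `labels`.
    """
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels)
    train_parts = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)   # all samples of class c
        rng.shuffle(idx)                    # random order within the class
        n_train = max(1, int(round(train_fraction * idx.size)))
        train_parts.append(idx[:n_train])
    train_idx = np.sort(np.concatenate(train_parts))
    test_idx = np.setdiff1d(np.arange(labels.size), train_idx)
    return train_idx, test_idx

# Synthetic example: 9 classes with 200 labelled pixels each, 10 % for training.
labels = np.repeat(np.arange(9), 200)
train_idx, test_idx = stratified_split(labels, 0.10)
print(train_idx.size, test_idx.size)
```

The selected pixels could then be passed to any RBF-kernel SVM implementation with γ = 1 and C = 100, for instance scikit-learn's `SVC(kernel="rbf", gamma=1, C=100)`; the paper does not state which implementation was used.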

Figure 12. Classification reference data: (a) ground truth of the area; (b) nine classes in ROSIS HSI.

Table 1. Training and testing samples used in the classification.

The second one, referred to as HYDICE HSI, was acquired by the HYperspectral Digital Imagery Collection Experiment (HYDICE) and has 148 spectral bands (from 435 to 2326 nm), 310 rows, and 220 columns. The scene is shown in Figure 13a. This HSI is modeled as a tensor

R \in R^{310 \times 220 \times 148}

and its ground truth is shown in Figure 13b. According to the ground truth, there are seven land cover classes in HYDICE HSI: field, trees, road, shadow and three different targets. A proportion of

30 %

of the reference data of each class is randomly selected as the training samples. The numbers of training and testing samples are shown in Table 2.

Figure 13. HSI images: (a) ground truth; (b) classes in HYDICE HSI.

Table 2. Training and testing samples used in the classification.

The classification is applied to the denoised HSIs using various methods in order to compare their abilities of improving the classification performance. Figure 14 presents the classification results obtained in the noisy environment where

S N R_{I N P U T} = 30

dB. In the “bare soil” class, there are fewer misclassified pixels in the results obtained by SDNW-MWPT-MWF than in the other results. To ease the comparison, Table 3 gives the Overall Accuracy (OA) and Kappa coefficient (K) for the various methods. If the classification is applied directly to the noisy HSI, the OA is only

91.33 %

and K =

0.71

. After denoising by SDNW-MLR, there is a marginal improvement of OA resulting in

91.88 %

and K =

0.81

. SDNW-MWF performs better than SDNW-MLR, and its OA is increased to

94.20 %

and K to

0.94

. The proposed SDNW-MWPT-MWF makes the most significant improvement among the denoising methods. Indeed, its OA is

98.51 %

and its K is

0.97

. The classification results clearly show that the proposed SDNW-MWPT-MWF is effective in improving the classification performance.
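For readers who want to reproduce the two criteria, the sketch below computes the OA and Cohen's Kappa coefficient from a confusion matrix. The helper names and the four-sample toy labels are illustrative assumptions; only the standard definitions of OA and Kappa are used.

```python
import numpy as np

def confusion_matrix(y_true, y_pred, n_classes):
    """cm[i, j] = number of samples of true class i predicted as class j."""
    cm = np.zeros((n_classes, n_classes), dtype=np.int64)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm

def overall_accuracy(cm):
    """Fraction of correctly classified samples (diagonal over total)."""
    return np.trace(cm) / cm.sum()

def cohen_kappa(cm):
    """Kappa = (p_o - p_e) / (1 - p_e), with p_e the chance agreement."""
    n = cm.sum()
    p_o = np.trace(cm) / n
    p_e = np.sum(cm.sum(axis=0) * cm.sum(axis=1)) / float(n) ** 2
    return (p_o - p_e) / (1.0 - p_e)

# Toy labels (not from the paper): one of four samples is misclassified.
cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], n_classes=2)
print(overall_accuracy(cm), cohen_kappa(cm))
```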

Figure 14. Classification results of the ROSIS HSI after denoising by: (a) SDNW-MLR; (b) SDNW-MWF; and (c) SDNW-MWPT-MWF. The classification result without denoising (d) is supplied as a benchmark. (

S N R_{I N P U T} = 30

dB).

Table 3. OA (%) and Kappa of the classification of the denoised ROSIS HSI,

S N R_{I N P U T} = 30

dB.

To investigate how the number of training samples affects the performance of our method and of the comparison methods, we also randomly selected a proportion of

50 %

of the reference data of each class as the training samples. Table 4 shows the OA and Kappa coefficient results. As shown in Table 3 and Table 4, our proposed method outperforms all other methods. In particular, when the number of training samples is reduced (

10 %

), our method obtains larger accuracy gains than the other comparison methods.

Table 4. OA (%) and Kappa of the classification of the denoised ROSIS HSI,

S N R_{I N P U T} = 30

dB.

Figure 15 shows the classification results obtained from the denoised HYDICE HSI and indicates that the SDNW-MWPT-MWF method reduces the noise, which benefits the SVM classifier. The comparison of the OA and Kappa values (see Table 5) computed after each denoising preprocessing shows that the multilinear algebra-based method SDNW-MWPT-MWF leads to better classification results than the other methods considered in this experiment.

Figure 15. Classification results of the HYDICE HSI after denoising by: (a) SDNW-MLR; (b) SDNW-MWF; and (c) SDNW-MWPT-MWF. The classification result without denoising (d) is supplied as a benchmark. (

S N R_{I N P U T} = 30

dB).

Table 5. OA (%) and Kappa of the classification of the denoised HYDICE HSI,

S N R_{I N P U T} = 30

dB.

To make the experimental results more convincing, we have compared the classification improvement performance of SDNW-MLR, SDNW-MWF and SDNW-MWPT-MWF in varying noisy environments, i.e.,

S N R_{I N P U T}

varies from 20 dB to 40 dB with a step of 5 dB. The classification result of the noisy HSI without denoising is also supplied as a benchmark. Figure 16 presents the curves of OA versus

S N R_{I N P U T}

. In the cases where the HSI is impaired severely (

S N R_{I N P U T} = 20 and 25

dB), the classification result can be improved significantly after denoising, which implies that denoising is a necessary preprocessing step prior to classification. Above 30 dB, SDNW-MLR improves the classification result only marginally, and the same holds for SDNW-MWF above 35 dB. Nonetheless, SDNW-MWPT-MWF performs better than SDNW-MLR and SDNW-MWF, and it improves the OA significantly for

S N R_{I N P U T}

values from 20 dB to 40 dB.

Figure 16. Classification OA with respect to

S N R_{I N P U T}

.

5. Discussion

Hyperspectral sensors collect data in hundreds of narrow contiguous spectral bands, providing a powerful means to discriminate different materials by their spectral features. In the last decade, several denoising algorithms for hyperspectral images have been proposed to improve classification. Most were derived assuming the spatial stationarity of the noise that affects hyperspectral images, meaning that the noise characteristics are assumed to be the same in every region of the image. These algorithms therefore remove only the signal-independent noise. In this study, we have shown that this assumption is not valid for new-generation hyperspectral sensors, where photon noise, which depends on the spatially varying signal level, is not negligible. We thus study the possible impact of signal-dependent noise on the performance of existing classification algorithms.

By assuming a signal-dependent noise model, we have proposed a novel multilinear algebra-based algorithm that simultaneously removes the signal-independent and signal-dependent noise. This has led to a new iterative noise-whitening procedure (SDNW-MWPT-MWF), which improves the SVM’s robustness in the presence of signal-dependent noise. The noise whitening is achieved by exploiting estimates of the noise variance for each band and for each pixel. The performance of the proposed SDNW-MWPT-MWF method is validated on simulated HSIs disturbed by both SD and SI noise and on the real-world ROSIS and HYDICE HSIs. From the analysis and the comparative study against similar methods, it can be concluded that the SDNW-MWPT-MWF method effectively reduces both SD and SI noise in HSIs. It is also necessary to take the signal-dependent noise into account when denoising HSIs collected by new-generation airborne hyperspectral sensors. Indeed, this study showed that signal-dependent noise may affect the behavior of existing classification algorithms, which encourages future work on the impact of SD noise. Our ongoing work aims at demonstrating the benefits of using the SDNW-MWPT-MWF algorithm to mitigate the impact of SD noise in different algorithms for hyperspectral data exploitation.

Since the optimal parameter combination is found in this study by a time-consuming brute-force search, future work will focus on reducing the computational load. A heuristic algorithm, such as a genetic algorithm, can be used to search for the optimal (or sub-optimal) parameter combination.

6. Conclusions

In this paper, a denoising method, SDNW-MWPT-MWF, has been proposed under the PWP denoising framework for the purpose of reducing the SD photon noise and SI thermal noise in HSIs. Two other adapted denoising methods, SDNW-MLR and SDNW-MWF, are used for comparison. The performances of these methods are compared in terms of photon noise whitening, denoising and improvement of classification. The photon noise whitening experiment, the denoising experiment and the classification experiment are designed to assess the parameter estimation performance, the improvement of the SNR and the distortion of the spectra in the HSI, respectively. From the experimental results obtained, it can be concluded that SDNW-MLR performs well in removing the noise, though it also changes the signal severely in the denoising process, as shown by the lower classification OA. In contrast, SDNW-MWF performs well in preserving the signal, though it cannot remove the noise well from the images. The proposed SDNW-MWPT-MWF, however, is able to obtain a high

S N R_{O U T P U T}

as well as a high OA, which implies that it removes noise well while still preserving the signal.

These promising results encourage us to extend our experiments to other hyperspectral data such as the Indian Pines and Salinas HSIs.

In this study, we employed the HYNPE algorithm to estimate the noise parameters, and then filtered the noise with the proposed method SDNW-MWPT-MWF. However, SDNW-MWPT-MWF first converts the SD noise to additive white Gaussian noise by a pre-whitening procedure and then applies MWPT-MWF. Because of the parameter estimation errors, the whitened noise is only approximately white, so the performance of filters developed for white noise might degrade. It would thus be interesting to develop a filter that can process the SD noise directly, without the noise-whitening procedure. For example, a tensor-based filtering method using a PARAFAC tensor decomposition could be investigated in the future.

Author Contributions

All authors contributed to the conception and design of the methodology.

Funding

This research received no external funding.

Acknowledgments

The authors would like to thank the reviewers for their careful reading and helpful comments, which improved the quality of this paper.

Conflicts of Interest

The authors declare no conflict of interest.

Appendix A. Multilinear Algebra Tools

Since multilinear algebra is used in this paper, we present some basic definitions in this appendix to make the discussions in the paper easier to follow. More details about multilinear algebra can be found in [28,29,30,31].

Appendix A.1. Definition of a Tensor

An N-th order tensor is an N-dimensional array,

X \in R^{I_{1} \times \dots \times I_{N}}

, in which

R

denotes the set of real numbers, and N is the number of dimensions. The elements in this tensor can be expressed as

x_{i_{1} \dots i_{N}}

, with

i_{1} = 1, \dots, I_{1}; \dots; i_{N} = 1, \dots, I_{N}

. The nth dimension of this tensor is called mode-n.

Appendix A.2. Unfolding

Unfolding is also known as matricization or flattening. It converts a tensor into matrix form so that matrix-based data analysis methods can be used. A tensor can be unfolded in different ways according to the mode in which the unfolding is performed. The unfolding in mode-n is called mode-n unfolding.

Let

X_{n} \in R^{I_{n} \times M_{n}}

denote the mode-n unfolding matrix of a tensor

X \in R^{I_{1} \times \dots \times I_{N}}

, where

M_{n} = I_{n + 1} I_{n + 2} \dots I_{N} I_{1} \dots I_{n - 1}

. The columns of

X_{n}

are the

I_{n}

-dimensional vectors obtained from

X

by varying index

i_{n}

while keeping the other indices fixed.
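A mode-n unfolding of this kind can be sketched in a few lines of NumPy. The sketch below uses 0-based modes and a simple row-major ordering of the remaining indices, which is an assumption: the columns are the same mode-n vectors as in the definition above, but they may appear in a different order than the cyclic ordering implied by M_n. The helper names `unfold` and `fold` are illustrative.

```python
import numpy as np

def unfold(tensor, mode):
    """Mode-n unfolding (0-based `mode`): result has shape (I_n, M_n)."""
    # Bring the chosen mode to the front, then flatten the remaining modes.
    return np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)

def fold(matrix, mode, shape):
    """Inverse of `unfold` for a tensor of the given original `shape`."""
    rest = tuple(s for i, s in enumerate(shape) if i != mode)
    return np.moveaxis(matrix.reshape((shape[mode],) + rest), 0, mode)

X = np.arange(24).reshape(2, 3, 4)
X1 = unfold(X, 1)                       # shape (3, 8)
print(X1.shape)
print(np.array_equal(fold(X1, 1, X.shape), X))
```

Because only the column order differs between conventions, quantities such as the column space and the rank of the unfolding are unaffected by this choice.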

Appendix A.3. Mode-n Tensor Product \times_{n}

The mode-n product is defined as the product between a data tensor

X \in R^{I_{1} \times \dots \times I_{N}}

and a matrix

B \in R^{J \times I_{n}}

in mode n. This mode-n product is denoted by

C = X \times_{n} B

, whose entries are given by

c_{i_{1} \dots i_{n - 1} j i_{n + 1} \dots i_{N}} ≜ \sum_{i_{n} = 1}^{I_{n}} x_{i_{1} \dots i_{n - 1} i_{n} i_{n + 1} \dots i_{N}} b_{j i_{n}},

(A1)

where

C \in R^{I_{1} \times \dots \times I_{n - 1} \times J \times I_{n + 1} \times \dots \times I_{N}}

.
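Equation (A1) maps directly onto a tensor contraction. The following sketch is one possible NumPy implementation, assuming 0-based modes; the function name `mode_n_product` is illustrative.

```python
import numpy as np

def mode_n_product(tensor, matrix, mode):
    """C = X x_n B for B of shape (J, I_n); `mode` is 0-based."""
    # tensordot sums over B's second axis and the tensor's `mode` axis and
    # places the new axis of length J first; move it back to position `mode`.
    out = np.tensordot(matrix, tensor, axes=(1, mode))
    return np.moveaxis(out, 0, mode)

rng = np.random.default_rng(0)
X = rng.normal(size=(2, 3, 4))
B = rng.normal(size=(5, 3))
C = mode_n_product(X, B, mode=1)
print(C.shape)   # the mode-2 size I_2 = 3 is replaced by J = 5
```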

Appendix A.4. Element Extraction

For a given tensor

X \in R^{I_{1} \times I_{2} \times \dots \times I_{N}}

, we define the following element extraction operation:

X (i_{1}, i_{2}, \dots, i_{N}) ≜ x_{i_{1} i_{2} \dots i_{N}} .

(A2)

Appendix A.5. Hadamard Product ⊛

The Hadamard product is defined as the product of two equal-size tensors

X \in R^{I_{1} \times I_{2} \times \dots \times I_{N}}

and

Y \in R^{I_{1} \times I_{2} \times \dots \times I_{N}}

. The entries of the Hadamard product

Z = X ⊛ Y

can be computed as:

z_{i_{1} \dots i_{n - 1} i_{n} i_{n + 1} \dots i_{N}} = x_{i_{1} \dots i_{n - 1} i_{n} i_{n + 1} \dots i_{N}} y_{i_{1} \dots i_{n - 1} i_{n} i_{n + 1} \dots i_{N}},

(A3)

where

Z \in R^{I_{1} \times I_{2} \times \dots \times I_{N}}

.

Appendix A.6. Mode-n Rank of a Tensor

The term mode-n rank

K_{n}

of a tensor

X \in R^{I_{1} \times I_{2} \times \dots \times I_{N}}

, denoted by

K_{n} = {rank}_{n} (X)

, is the dimension of the column space of the mode-n unfolding matrix

X_{n}

[29], i.e.,

K_{n} = {rank}_{n} (X) = rank (X_{n}) .

(A4)

The mode-n rank is one way to extend the notion of matrix rank to the tensor case. The mode-n rank is actually the rank of a matrix, so it can be analyzed using matrix-based techniques. However, for a matrix

X

, the column rank and the row rank are equal,

rank (X) = rank (X^{T})

, but, for the tensor

X

, the different mode-n ranks are not necessarily the same. This means that the mode-n rank of tensor

X

is not a unique value but a set of N values

\{K_{1}, \dots, K_{N}\}

.
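The fact that the mode-n ranks can differ is easy to check numerically. The sketch below computes the set of mode-n ranks as in Equation (A4) for a small hand-built tensor; the function name and the example tensor are illustrative assumptions.

```python
import numpy as np

def mode_n_ranks(tensor):
    """Return (K_1, ..., K_N), the ranks of all mode-n unfoldings."""
    ranks = []
    for mode in range(tensor.ndim):
        unfolding = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
        ranks.append(int(np.linalg.matrix_rank(unfolding)))
    return tuple(ranks)

# 2 x 2 x 2 tensor whose first frontal slice is the identity and whose
# second frontal slice is zero.
X = np.zeros((2, 2, 2))
X[:, :, 0] = np.eye(2)
print(mode_n_ranks(X))   # → (2, 2, 1)
```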

Appendix B. Overview of HYNPE

HYNPE was proposed in [8] to estimate the parameters of the photon noise, which has recently attracted much research interest in connection with the new generation of hyperspectral sensors. HYNPE assumes that the noise sources are independent of one another and spectrally uncorrelated, which is the same noise model as the one introduced in Section 2. HYNPE estimates the noise parameters in two main steps.

The first step is to estimate the noise and signal by employing the MLR theory. For a given noisy HSI

R

, we consider its mode-3 unfolding matrix

R_{3} \in R^{I_{3} \times I_{1} I_{2}}

, which consists of

I_{3}

row vectors (see Appendix A):

R_{3} = {[r_{1}^{T} r_{2}^{T} \dots r_{I_{3}}^{T}]}^{T} .

(A5)

Denote the estimate of pure HSI

X

by

\hat{X}

and noise

N

by

\hat{N}

. Then, the mode-3 unfolding matrices of

\hat{X}

and

\hat{N}

are

{\hat{X}}_{3}

and

{\hat{N}}_{3}

, respectively, and their row-vector forms can be expressed as:

\begin{matrix} {\hat{X}}_{3} = {[{\hat{x}}_{1}^{T} {\hat{x}}_{2}^{T} \dots {\hat{x}}_{I_{3}}^{T}]}^{T}, & {\hat{N}}_{3} = {[{\hat{n}}_{1}^{T} {\hat{n}}_{2}^{T} \dots {\hat{n}}_{I_{3}}^{T}]}^{T} . \end{matrix}

(A6)

The MLR theory estimates the signal by exploiting the strong spectral correlation of the useful signal and the weak between-band correlation of the random noise in the HSI. It assumes that the signal estimate row vector

{\hat{x}}_{i_{3}}^{T}

in band

i_{3}

can be expressed as a linear combination of the noisy data row vectors

r_{j_{3}}^{T}

(

j_{3} = 1, \dots, i_{3} - 1, i_{3} + 1, \dots, I_{3}

) in the other

I_{3} - 1

bands:

{\hat{x}}_{i_{3}} = Θ_{i_{3}} w_{i_{3}},

(A7)

where

Θ_{i_{3}} = [r_{1}, \dots, r_{i_{3} - 1}, r_{i_{3} + 1}, \dots, r_{I_{3}}]

and

w_{i_{3}} \in R^{(I_{3} - 1)}

is the combination weight vector. Then, the optimal weight vector can be estimated by minimizing the least-squares error (LSE):

{\hat{w}}_{i_{3}} = arg min_{w_{i_{3}}} {∥ r_{i_{3}} - {\hat{x}}_{i_{3}} ∥}^{2} .

(A8)

The solution of the LSE problem is well known and can be expressed as:

{\hat{w}}_{i_{3}} = {(Θ_{i_{3}}^{T} Θ_{i_{3}})}^{- 1} Θ_{i_{3}}^{T} r_{i_{3}},

(A9)

and the corresponding signal estimate

{\hat{x}}_{i_{3}}

and noise estimate

{\hat{n}}_{i_{3}}

can be computed by:

\begin{matrix} {\hat{x}}_{i_{3}} & = Θ_{i_{3}} {\hat{w}}_{i_{3}}, \end{matrix}

(A10)

\begin{matrix} {\hat{n}}_{i_{3}} & = r_{i_{3}} - {\hat{x}}_{i_{3}} . \end{matrix}

(A11)
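The MLR step of Equations (A7) to (A11) can be sketched as follows. This is a minimal illustration, not the reference implementation of HYNPE: the function name `mlr_estimate` and the synthetic data are assumptions, and `np.linalg.lstsq` is used in place of the explicit inverse of Equation (A9), which gives the same solution whenever Θᵀ Θ is invertible.

```python
import numpy as np

def mlr_estimate(R3):
    """Band-wise multiple linear regression on a mode-3 unfolding.

    R3 has shape (I3, I1*I2), one row per spectral band.
    Returns (X3_hat, N3_hat), the signal and noise estimates per band.
    """
    R3 = np.asarray(R3, dtype=float)
    n_bands = R3.shape[0]
    X3_hat = np.empty_like(R3)
    for b in range(n_bands):
        others = np.delete(np.arange(n_bands), b)
        theta = R3[others].T                       # (pixels, I3 - 1)
        # Least-squares weights, Equations (A8)-(A9).
        w, *_ = np.linalg.lstsq(theta, R3[b], rcond=None)
        X3_hat[b] = theta @ w                      # Equation (A10)
    N3_hat = R3 - X3_hat                           # Equation (A11)
    return X3_hat, N3_hat

# Synthetic data (not from the paper): spectrally correlated signal
# of rank 2 in 6 bands, plus white noise.
rng = np.random.default_rng(0)
signal = rng.normal(size=(6, 2)) @ rng.normal(size=(2, 400))
R3 = signal + 0.05 * rng.normal(size=signal.shape)
X3_hat, N3_hat = mlr_estimate(R3)
print(X3_hat.shape, float(np.std(N3_hat)))
```

The residual of each regression is orthogonal to the other bands by construction, which is why spectrally uncorrelated noise ends up mostly in the noise estimate while the spectrally correlated signal is retained.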

After the estimation of the signal and noise, the noise variances

{\hat{σ}}_{u, i_{3}}^{2}

and

{\hat{σ}}_{t, i_{3}}^{2}

are estimated by maximizing the likelihood function [8]:

\{{\hat{σ}}_{u, i_{3}}^{2}, {\hat{σ}}_{t, i_{3}}^{2}\} = \underset{σ_{u, i_{3}} > 0, \; σ_{t, i_{3}} > 0}{arg max} \; ln L (σ_{u, i_{3}}, σ_{t, i_{3}}),

(A12)

with

\begin{matrix} ln L (σ_{u, i_{3}}, σ_{t, i_{3}}) = & - \frac{M}{2} ln (2 π) \\ & - \frac{1}{2} \sum_{i_{1} = 1}^{I_{1}} \sum_{i_{2} = 1}^{I_{2}} ln [σ_{u, i_{3}}^{2} \cdot x_{i_{1} i_{2} i_{3}} + σ_{t, i_{3}}^{2}] \\ & - \frac{1}{2} \sum_{i_{1} = 1}^{I_{1}} \sum_{i_{2} = 1}^{I_{2}} \frac{n_{i_{1} i_{2} i_{3}}^{2}}{σ_{u, i_{3}}^{2} \cdot x_{i_{1} i_{2} i_{3}} + σ_{t, i_{3}}^{2}} . \end{matrix}

(A13)

Since

x_{i_{1} i_{2} i_{3}}

and

n_{i_{1} i_{2} i_{3}}

are unknown in a realistic scenario, they are replaced by the estimates computed in Equations (A10) and (A11), respectively.
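The log-likelihood of Equation (A13) is straightforward to evaluate once the estimates are available. The sketch below evaluates it for one band and maximizes it over a coarse grid; the grid search is a simple stand-in chosen for illustration, not the optimizer used in [8], and it is assumed here that M is the number of pixels in the band. The function names and the toy data are hypothetical.

```python
import numpy as np

def log_likelihood(sigma_u2, sigma_t2, x_hat, n_hat):
    """Gaussian log-likelihood of Equation (A13) for one band.

    sigma_u2, sigma_t2: candidate variances (sigma_u^2, sigma_t^2).
    x_hat, n_hat: signal and noise estimates of the band (same shape).
    """
    var = sigma_u2 * x_hat + sigma_t2          # per-pixel noise variance
    m = x_hat.size                             # assumption: M = pixel count
    return (-0.5 * m * np.log(2.0 * np.pi)
            - 0.5 * np.sum(np.log(var))
            - 0.5 * np.sum(n_hat ** 2 / var))

def grid_search_ml(x_hat, n_hat, grid_u2, grid_t2):
    """Return the (sigma_u^2, sigma_t^2) pair on the grid with the
    largest log-likelihood (illustrative brute-force maximization)."""
    best_ll, best_pair = -np.inf, None
    for su2 in grid_u2:
        for st2 in grid_t2:
            ll = log_likelihood(su2, st2, x_hat, n_hat)
            if ll > best_ll:
                best_ll, best_pair = ll, (su2, st2)
    return best_pair

# Toy band (not from the paper): simulated SD + SI noise.
rng = np.random.default_rng(0)
x = rng.uniform(1.0, 10.0, size=(64, 64))
n = rng.normal(size=x.shape) * np.sqrt(0.5 * x + 2.0)
grid = np.linspace(0.1, 4.0, 40)
print(grid_search_ml(x, n, grid, grid))
```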

Appendix C. Multiway Wiener Filter in Multidimensional Wavelet Packet Domain

In this section, we recall the main principles of the multidimensional wavelet packet transform (MWPT) and of the MWF in the multidimensional wavelet packet domain [22].

Appendix C.1. MWPT

MWPT can be computed by performing 1D wavelet packet transform in each mode [32]. Therefore, the wavelet packet coefficient tensor

C_{l}^{R}

can be computed as

C_{l}^{R} = R \times_{1} {\tilde{W}}_{1}^{l_{1}} \times_{2} {\tilde{W}}_{2}^{l_{2}} \times_{3} {\tilde{W}}_{3}^{l_{3}},

(A14)

and the reconstruction can be written as

R = C_{l}^{R} \times_{1} {({\tilde{W}}_{1}^{l_{1}})}^{T} \times_{2} {({\tilde{W}}_{2}^{l_{2}})}^{T} \times_{3} {({\tilde{W}}_{3}^{l_{3}})}^{T},

(A15)

where

l = {[l_{1}, l_{2}, l_{3}]}^{T}

, and

l_{1}, l_{2}, l_{3} \geq 0

. In particular, when

l_{1}, l_{2}, l_{3} > 0

, MWPT indicates the 3D wavelet packet transform.

{\tilde{W}}_{k}^{l_{k}}

denotes the

l_{k}

level wavelet packet transform to kth mode of

R

.
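To make Equations (A14) and (A15) concrete, the sketch below builds an orthonormal wavelet packet matrix for each mode and applies it through mode-n products. The Haar wavelet is an assumption made for brevity (the paper selects the basis from a set of wavelets), sub-bands are kept in their natural order, and each dimension is assumed to be divisible by 2 to the power of the chosen level. All function names are illustrative.

```python
import numpy as np

def haar_step(n):
    """One-level orthonormal Haar analysis matrix of size n x n (n even)."""
    if n % 2:
        raise ValueError("length must be even")
    h = np.zeros((n, n))
    s = 1.0 / np.sqrt(2.0)
    for k in range(n // 2):
        h[k, 2 * k] = h[k, 2 * k + 1] = s      # low-pass (average) rows
        h[n // 2 + k, 2 * k] = s               # high-pass (difference) rows
        h[n // 2 + k, 2 * k + 1] = -s
    return h

def haar_packet_matrix(n, level):
    """Wavelet packet matrix: at every level, all sub-bands are split again.
    Level 0 returns the identity, i.e., no transform in that mode."""
    w = np.eye(n)
    for j in range(level):
        w = np.kron(np.eye(2 ** j), haar_step(n // 2 ** j)) @ w
    return w

def mode_n_product(tensor, matrix, mode):
    return np.moveaxis(np.tensordot(matrix, tensor, axes=(1, mode)), 0, mode)

def mwpt(R, levels):
    """Equation (A14): apply the packet matrix of each mode in turn."""
    mats = [haar_packet_matrix(R.shape[k], levels[k]) for k in range(R.ndim)]
    C = R
    for k, W in enumerate(mats):
        C = mode_n_product(C, W, k)
    return C, mats

def inverse_mwpt(C, mats):
    """Equation (A15): the matrices are orthogonal, so the inverse is W^T."""
    R = C
    for k, W in enumerate(mats):
        R = mode_n_product(R, W.T, k)
    return R

rng = np.random.default_rng(0)
R = rng.normal(size=(8, 8, 4))
C, mats = mwpt(R, levels=(2, 1, 0))
print(np.allclose(inverse_mwpt(C, mats), R))
```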

C_{l, m}^{R}

is defined as the coefficient subtensor of

C_{l}^{R}

, where

m = {[m_{1}, m_{2}, m_{3}]}^{T}

is the index vector, and

0 \leq m_{k} \leq 2^{l_{k}} - 1

,

k = 1, 2, 3

. Then, for each element of

C_{l, m}^{R}

, we can define:

C_{l, m}^{R} (j_{1}, j_{2}, j_{3}) ≜ C_{l}^{R} (J_{1} (j_{1}), J_{2} (j_{2}), J_{3} (j_{3})),

(A16)

where

\{J_{n} = {[\frac{m_{n} I_{n}}{2^{l}}, \dots, \frac{(m_{n} + 1) I_{n}}{2^{l}} - 1]}^{T}, n = 1, 2, 3\}

(A17)

and

j_{n} \in \{1, \dots, \frac{I_{n}}{2^{l}}\}, n = 1, 2, 3 .

(A18)

The notation

C_{l, m}^{R} (j_{1}, j_{2}, j_{3})

indicates the element of tensor

C_{l, m}^{R}

in position

(j_{1}, j_{2}, j_{3})

as defined in Equation (A2). From the properties of the wavelet packet transform, we know that

m_{n}

indicates the “frequency” of mode n. Thus,

m

is the frequency index of coefficient block

C_{l, m}^{R}

. For convenience, a component tensor of

R

is referred to as

C_{l, m}^{R}

in this paper.
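The component extraction of Equations (A16) to (A18) amounts to slicing a block out of the coefficient tensor. The sketch below uses 0-based indices and assumes that the block size in mode n is I_n divided by 2 to the power l_n, i.e., that the level may differ per mode; the function name `extract_component` is illustrative.

```python
import itertools
import numpy as np

def extract_component(C, levels, m):
    """Return the sub-tensor C_{l,m} of the coefficient tensor C.

    levels: transform level l_n per mode; m: frequency index m_n per mode,
    with 0 <= m_n <= 2**l_n - 1 (0-based block index).
    """
    slices = []
    for n in range(C.ndim):
        size = C.shape[n] // 2 ** levels[n]          # block length in mode n
        slices.append(slice(m[n] * size, (m[n] + 1) * size))
    return C[tuple(slices)]

C = np.arange(8 * 8 * 4, dtype=float).reshape(8, 8, 4)
levels = (1, 1, 0)
blocks = [extract_component(C, levels, m)
          for m in itertools.product(*(range(2 ** l) for l in levels))]
print(len(blocks), blocks[0].shape)
# The blocks partition C, so their squared norms add up to that of C.
print(np.isclose(sum(np.sum(b ** 2) for b in blocks), np.sum(C ** 2)))
```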

Appendix C.2. Multiway Wiener Filter in Multidimensional Wavelet Packet Domain

In the existing MWF algorithm, the filter is applied to the whole hyperspectral image

R

. As the calculation of the filters requires estimating the signal subspace, or rank, in each mode in order to suppress the smallest eigenvalues [33], some weak signal might be removed in this procedure. The SNR is therefore an important factor influencing the rank. When the SNR is higher, the estimated rank is greater, so more signal is preserved in the filtering process. Conversely, when the SNR is lower, more signal is lost. When the noise is white, the power of noise in each component

C_{l, m}^{R}

is the same, whereas the signal concentrates in the lower frequency components. In other words, the SNR differs from one component to another. When MWF is applied to each component separately, more signal can be preserved. Applying MWPT to the tensors

R

,

X

and

N

in Equation (1), we obtain:

\begin{matrix} R \times_{1} {\tilde{W}}_{1}^{l_{1}} \times_{2} {\tilde{W}}_{2}^{l_{2}} \times_{3} {\tilde{W}}_{3}^{l_{3}} \\ = (X + N) \times_{1} {\tilde{W}}_{1}^{l_{1}} \times_{2} {\tilde{W}}_{2}^{l_{2}} \times_{3} {\tilde{W}}_{3}^{l_{3}} \\ = X \times_{1} {\tilde{W}}_{1}^{l_{1}} \times_{2} {\tilde{W}}_{2}^{l_{2}} \times_{3} {\tilde{W}}_{3}^{l_{3}} + N \times_{1} {\tilde{W}}_{1}^{l_{1}} \times_{2} {\tilde{W}}_{2}^{l_{2}} \times_{3} {\tilde{W}}_{3}^{l_{3}} . \end{matrix}

(A19)

The coefficient tensors of each part are

\begin{matrix} C_{l}^{R} & = R \times_{1} {\tilde{W}}_{1}^{l_{1}} \times_{2} {\tilde{W}}_{2}^{l_{2}} \times_{3} {\tilde{W}}_{3}^{l_{3}}, \end{matrix}

(A20)

\begin{matrix} C_{l}^{X} & = X \times_{1} {\tilde{W}}_{1}^{l_{1}} \times_{2} {\tilde{W}}_{2}^{l_{2}} \times_{3} {\tilde{W}}_{3}^{l_{3}}, \end{matrix}

(A21)

\begin{matrix} C_{l}^{N} & = N \times_{1} {\tilde{W}}_{1}^{l_{1}} \times_{2} {\tilde{W}}_{2}^{l_{2}} \times_{3} {\tilde{W}}_{3}^{l_{3}}, \end{matrix}

(A22)

and the coefficient tensor of the estimate

\hat{X}

:

{\hat{C}}_{l}^{X} = \hat{X} \times_{1} {\tilde{W}}_{1}^{l_{1}} \times_{2} {\tilde{W}}_{2}^{l_{2}} \times_{3} {\tilde{W}}_{3}^{l_{3}} .

(A23)

Extracting the components of each frequency

C_{l, m}^{R}

,

C_{l, m}^{X}

and

C_{l, m}^{N}

from

C_{l}^{R}

,

C_{l}^{X}

and

C_{l}^{N}

respectively by using Equation (A16), we obtain:

C_{l, m}^{R} = C_{l, m}^{X} + C_{l, m}^{N} .

(A24)

From Parseval’s theorem, the following expression can be obtained:

\begin{matrix} ∥ X - \hat{X} ∥^{2} = ∥ C_{l}^{X} - {\hat{C}}_{l}^{X} ∥^{2} = \sum_{m} {∥ C_{l, m}^{X} - {\hat{C}}_{l, m}^{X} ∥}^{2}, \end{matrix}

(A25)

which means that minimizing the MSE between

X

and its estimate

\hat{X}

is equivalent to minimizing the MSE between

C_{l, m}^{X}

and

{\hat{C}}_{l, m}^{X}

for each

m

. If

{\hat{C}}_{l, m}^{X}

is estimated by Tucker3 decomposition of

C_{l, m}^{R}

:

{\hat{C}}_{l, m}^{X} = C_{l, m}^{R} \times_{1} H_{1, m} \times_{2} H_{2, m} \times_{3} H_{3, m},

(A26)

then

H_{1, m}, H_{2, m}, H_{3, m}

are the mode-n filters of the multiway Wiener filter [33]. After estimating

{\hat{C}}_{l, m}^{X}

for each

m

, we obtain

{\hat{C}}_{l}^{X}

by concatenating

{\hat{C}}_{l, m}^{X}

. Furthermore, the estimate

\hat{X}

can be obtained by inverse MWPT, i.e.,

\hat{X} = {\hat{C}}_{l}^{X} \times_{1} {({\tilde{W}}_{1}^{l_{1}})}^{T} \times_{2} {({\tilde{W}}_{2}^{l_{2}})}^{T} \times_{3} {({\tilde{W}}_{3}^{l_{3}})}^{T} .

(A27)

Appendix C.3. Best Transform Level and Basis Selection

In the MWPT-MWF algorithm, several parameters should be determined:

Level of transform: the performance of the algorithm is affected by the level of transform, which depends on the size of tensor $R$ . The maximum level can be calculated by:

$N_{L_{k}} = ⌈ {log}_{2} I_{k} ⌉ - 5, k = 1, 2, 3,$

(A28)

where $⌈ \cdot ⌉$ rounds a number upward to its nearest integer, and the constant 5 is subtracted from $⌈ {log}_{2} I_{k} ⌉$ to make sure there are enough elements in each mode so that the transform is meaningful. Then, the set of possible transform levels can be expressed as:

$L_{k} = \{0, 1, \dots, N_{L_{k}}\}, k = 1, 2, 3,$

(A29)

where $\{\cdot\}$ denotes a set.
Basis of transform: there are many wavelet bases designed for different cases. For the simplicity of expression, we define:

$W = \{w_{1}, w_{2}, \dots, w_{N_{W}}\}$

(A30)

to denote the set of possible wavelet bases, where $N_{W}$ is the number of wavelets in this set.
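As a worked example of Equations (A28) and (A29), the sketch below computes the maximum transform level and the candidate level sets for the HYDICE tensor size (310 × 220 × 148) given in Section 4.3. The function names are illustrative, and clamping negative values to zero for very small dimensions is an assumption that is not stated in the paper.

```python
import math

def max_levels(shape):
    """Equation (A28): N_{L_k} = ceil(log2(I_k)) - 5 for each mode."""
    return tuple(math.ceil(math.log2(i)) - 5 for i in shape)

def level_sets(shape):
    """Equation (A29): L_k = {0, 1, ..., N_{L_k}} for each mode.
    Assumption: a negative N_{L_k} is clamped to 0 (no transform)."""
    return [list(range(max(n, 0) + 1)) for n in max_levels(shape)]

print(max_levels((310, 220, 148)))   # → (4, 3, 3)
print(level_sets((310, 220, 148)))
```

For this size, the brute-force search therefore covers 5 × 4 × 4 = 80 level combinations per wavelet basis.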

The best transform level and basis should minimize the MSE or risk

R_{c} (X, \hat{X}) = E [∥ X - \hat{X} ∥^{2}]

[34], whose equivalent form using the coefficients can be expressed as

R_{c} (X, \hat{X}) = \sum_{m} E [∥ C_{l, m}^{X} - {\hat{C}}_{l, m}^{X} ∥^{2}] .

(A31)

Then, the best transform level and basis can be selected by

l, w = \underset{l_{k} \in L_{k}, w \in W}{arg min} \sum_{m} E [∥ C_{l, m}^{X} - {\hat{C}}_{l, m}^{X} ∥^{2}], k = 1, 2, 3 .

(A32)

As the selection of the optimal

l, w

depends on

X,

which is generally unknown, an alternative criterion is needed. Let

{\hat{C}}_{l, m}^{X} [d]

denote the estimate of

C_{l, m}^{X}

at the dth ALS loop. When

{∥ {\hat{C}}_{l, m}^{X} [d] - {\hat{C}}_{l, m}^{X} [d - 1] ∥}^{2}

is minimized,

{\hat{C}}_{l, m}^{X} ≜ {\hat{C}}_{l, m}^{X} [d]

is the optimal estimate of

C_{l, m}^{X}

obtained by MWF, and at the same time

E [∥ C_{l, m}^{X} - {\hat{C}}_{l, m}^{X} ∥^{2}]

is minimized. Therefore, Equation (A32) can be replaced by:

l, w = \underset{l_{k} \in L_{k}, w \in W}{arg min} \hat{R_{c}}, k = 1, 2, 3,

(A33)

where

\hat{R_{c}} = \sum_{m} {∥ {\hat{C}}_{l, m}^{X} [d] - {\hat{C}}_{l, m}^{X} [d - 1] ∥}^{2} .

(A34)

Appendix C.4. Summary of the MWPT-MWF Method

The MWPT-MWF algorithm can be summarized as follows.

1. Find the optimal $l_{k} \in L_{k}$, $k = 1, 2, 3$, and $w \in W$ by looping over $l_{1}, l_{2}, l_{3}$ and $w$:
(a) Decompose the data $R$ by MWPT: $C_{l}^{R} = R \times_{1} {\tilde{W}}_{1}^{l_{1}} \times_{2} {\tilde{W}}_{2}^{l_{2}} \times_{3} {\tilde{W}}_{3}^{l_{3}}$ .
(b) Extract component $C_{l, m}^{R}$ from $C_{l}^{R}$ using Equation (A16), for $m = {[m_{1}, m_{2}, m_{3}]}^{T}$ , where $0 \leq m_{k} \leq 2^{l_{k}} - 1$ , $k = 1, 2, 3$ .
(c) Filter component $C_{l, m}^{R}$ by MWF: ${\hat{C}}_{l, m}^{X} = C_{l, m}^{R} \times_{1} H_{1, m} \times_{2} H_{2, m} \times_{3} H_{3, m}$ .
(d) Calculate the risk $\hat{R_{c}}$ using Equation (A34). If $\hat{R_{c}}$ reaches a fixed threshold, return the optimal $l_{1}, l_{2}, l_{3}, w$ and ${\hat{C}}_{l, m}^{X}$ .
2. Concatenate ${\hat{C}}_{l, m}^{X}$ to obtain ${\hat{C}}_{l}^{X}$ and perform inverse MWPT: $\hat{X} = {\hat{C}}_{l}^{X} \times_{1} {({\tilde{W}}_{1}^{l_{1}})}^{T} \times_{2} {({\tilde{W}}_{2}^{l_{2}})}^{T} \times_{3} {({\tilde{W}}_{3}^{l_{3}})}^{T}$ .
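The outer search over transform levels and wavelet bases in the summary above can be sketched as a plain exhaustive loop. This is only a skeleton under stated assumptions: `filter_fn` is a hypothetical callable standing for steps (a) to (d), which is expected to return the filtered estimate together with its estimated risk from Equation (A34); the toy `filter_fn` used here is a placeholder and does not perform any filtering.

```python
import itertools

def select_level_and_basis(R, level_sets, wavelets, filter_fn, threshold=None):
    """Brute-force search over (l1, l2, l3) and w.

    filter_fn(R, levels, w) -> (estimate, risk) is a hypothetical callable.
    Returns (risk, levels, w, estimate) for the smallest risk found, or
    stops early once the risk falls to `threshold` or below.
    """
    best = None
    for levels in itertools.product(*level_sets):
        for w in wavelets:
            estimate, risk = filter_fn(R, levels, w)
            if best is None or risk < best[0]:
                best = (risk, levels, w, estimate)
            if threshold is not None and risk <= threshold:
                return best
    return best

# Placeholder standing in for MWPT + MWF; its "risk" is an arbitrary
# function of the parameters, chosen so that the minimum is known.
def toy_filter(R, levels, w):
    risk = abs(levels[0] - 1) + abs(levels[1] - 2) + levels[2]
    risk += 0 if w == "b" else 1
    return R, risk

risk, levels, w, _ = select_level_and_basis(
    None, [range(3), range(3), range(2)], ["a", "b"], toy_filter)
print(risk, levels, w)
```

The cost of this loop grows with the product of the sizes of the level sets and the number of wavelets, which is the computational load that a heuristic search would aim to reduce.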

References

Acito, N.; Diani, M.; Corsini, G. Signal-dependent noise modeling and model parameter estimation in hyperspectral images. IEEE Trans. Geosci. Remote Sens. 2011, 8, 2957–2971. [Google Scholar] [CrossRef]
Rakwatin, P.; Takeuchi, W.; Yasuoka, Y. Stripe noise reduction in modis data by combining histogram matching with facet filter. IEEE Trans. Geosci. Remote Sens. 2007, 45, 1844–1856. [Google Scholar] [CrossRef]
Uss, M.L.; Vozel, B.; Lukin, V.V.; Chehdi, K. Local signal-dependent noise variance estimation from hyperspectral textural images. IEEE J. Sel. Top. Signal Process. 2011, 3, 469–486. [Google Scholar] [CrossRef]
Roger, R.E. Principal components transform with simple, automatic noise ajustment. INT J. Remote Sens. 1996, 17, 2719–2727. [Google Scholar] [CrossRef]
Chang, C.-I.; Du, Q. Interference and noise adjusted principal components analysis. IEEE Trans. Geosci. Remote Sens. 1999, 37, 2387–2396. [Google Scholar] [CrossRef]
Rhodes, H.; Agranov, G.; Hong, C.; Boettiger, U.; Mauritzson, R.; Ladd, J.; Karasev, I.; McKee, J.; Jenkins, E.; Quinlin, W.; et al. Cmos imager technology shrinks and image performance. In Proceedings of the 2004 IEEE Workshop on Microelectronics and Electron Devices, Boise, ID, USA, 16 April 2004; pp. 7–18. [Google Scholar]
Alparone, L.; Selva, M.; Aiazzi, B.; Baronti, S.; Butera, F.; Chiarantini, L. Signal-dependent noise modelling and estimation of new-generation imaging spectrometers. In Proceedings of the Workshop on Hyperspectral Image and Signal Processing (WHISPERS), Grenoble, France, 26–28 August 2009; pp. 1–4. [Google Scholar]
Liu, X.; Bourennane, S.; Fossati, C. Denoising of hyperspectral images using the parafac model and statistical performance analysis. IEEE Trans. Geosci. Remote Sens. 2012, 50, 3717–3724. [Google Scholar] [CrossRef]
Gao, L.; Yao, D.; Li, Q.; Zhuang, L.; Zhang, B.; Bioucas-Dias, J.M. A New Low-Rank Representation based hyperspectral image denoising method for mineral mapping. Remote Sens. 2017, 9, 1145. [Google Scholar] [CrossRef]
Fan, Y.R.; Huang, T.Z.; Zhao, X.L.; Deng, L.J.; Fan, S. Multispectral Image Denoising via Nonlocal Multitask Sparse Learning. Remote Sens. 2018, 10, 116. [Google Scholar] [CrossRef]
Atkinson, I.; Kamalabadi, F.; Mohan, S.; Jones, D. Wavelet-based 2D multichannel signal estimation. In Proceedings of the IEEE International Conference on Image Processing, Barcelona, Spain, 14–17 September 2003; pp. 743–745. [Google Scholar]
Pizurica, A.; Philips, W. Estimating the probability of the presence of a signal of interest in multiresolution single-and multiband image denoising. IEEE Trans. Image Process. 2006, 3, 654–665. [Google Scholar] [CrossRef]
Letexier, D.; Bourennane, S. Noise removal from hyperspectral images by multidimensional filtering. IEEE Trans. Geosci. Remote Sens. 2008, 7, 2061–2069. [Google Scholar] [CrossRef]
Renard, N.; Bourennane, S. Improvement of target detection methods by multiway filtering. IEEE Trans. Geosci. Remote Sens. 2008, 8, 2407–2417. [Google Scholar] [CrossRef]
Aiazzi, B.; Alparone, L.; Barducci, A.; Baronti, S.; Marcoinni, P.; Pippi, I.; Selva, M. Noise modelling and estimation of hyperspectral data from airborne imaging spectrometers. Ann. Geophys. 2006, 1, 1–9. [Google Scholar]
Gao, L.R.; Zhang, B.; Zhang, X.; Zhang, W.-J.; Tong, Q.-X. A new operational method for estimating noise in hyperspectral images. IEEE Geosci. Remote Sens. Lett. 2008, 1, 83–87. [Google Scholar] [CrossRef]
Aiazzi, B.; Alparone, L.; Baronti, S. A robust method for parameter estimation of signal-dependent noise models in digital images. In Proceedings of the IEEE International Conference on Digital Signal Processin, Santorini, Greece, 2–4 July 1997; pp. 601–604. [Google Scholar]
Aiazzi, B.; Alparone, L.; Baronti, S.; Garzelli, A. Coherence estimation from multilook incoherent sar imagery. IEEE Trans. Geosci. Remote Sens. 2003, 11, 2531–2539. [Google Scholar] [CrossRef]
Argenti, F.; Torricelli, G.; Alparone, L. Mmse filtering of generalised signal-dependent noise in spatial and shift-invariant wavelet domains. Signal Process. 2006, 8, 2056–2066. [Google Scholar] [CrossRef]
Foi, A.; Trimeche, M.; Katkovnik, V.; Egiazarian, K. Practical Poissonian-Gaussian noise modeling and fitting for single-image raw-data. IEEE Trans. Image Process. 2008, 10, 1737–1754. [Google Scholar] [CrossRef] [PubMed]
Cheng, G.; Li, Z.; Han, J.; Yao, X.; Guo, L. Exploring Hierarchical Convolutional Features for Hyperspectral Image Classification. IEEE Trans. Geosci. Remote Sens. 2018, 99, 1–11. [Google Scholar] [CrossRef]
Lin, T.; Bourennane, S. Hyperspectral image processing by jointly filtering wavelet component tensor. IEEE Trans. Geosci. Remote Sens. 2013, 6, 3529–3541. [Google Scholar] [CrossRef]
Lin, T.; Bourennane, S. Hyperspectral image denoising with rare signal preserving by jointly filtering image component. In Proceedings of the 2013 4th European Workshop on Visual Information Processing (EUVIP), Paris, France, 10–12 June 2013; pp. 265–269. [Google Scholar]
Faraji, H.; MacLean, W.J. CCD noise removal in digital images. IEEE Trans. Image Process. 2006, 9, 2676–2685. [Google Scholar] [CrossRef]
Aiazzi, B.; Alparone, L.; Baronti, S.; Butera, F.; Chiarantini, L.; Selva, M. Benefits of signal-dependent noise reduction for spectral analysis of data from advanced imaging spectrometers. In Proceedings of the Workshop on Hyperspectral Image and Signal Processing: Evolution in Remote Sensing (WHISPERS), Lisbon, Portugal, 6–9 June 2011; pp. 1–4. [Google Scholar]
Othman, H.; Qian, S.-E. Noise reduction of hyperspectral imagery using hybrid spatial-spectral derivative-domain wavelet shrinkage. IEEE Trans. Geosci. Remote Sens. 2006, 2, 397–408. [Google Scholar] [CrossRef]
Chen, G.; Qian, S.-E. Denoising of hyperspectral imagery using principal component analysis and wavelet shrinkage. IEEE Trans. Geosci. Remote Sens. 2011, 3, 973–980. [Google Scholar] [CrossRef]
Kolda, T.G.; Bader, B.W. Tensor decompositions and applications. SIAM Rev. 2009, 3, 455–500. [Google Scholar] [CrossRef]
De Lathauwer, L.; De Moor, B.; Vandewalle, J. A multilinear singular value decomposition. SIAM J. Matrix Anal. Appl. 2000, 4, 1253–1278. [Google Scholar] [CrossRef]
De Lathauwer, L.; De Moor, B.; Vandewalle, J. On the best rank-1 and rank-(r1, r2, ..., rn) approximation of higher-order tensors. SIAM J. Matrix Anal. Appl. 2000, 4, 1324–1342. [Google Scholar] [CrossRef]
Cichocki, A.; Zdunek, R.; Phan, A.; Amari, S. Nonnegative Matrix and Tensor Factorizations: Applications to Exploratory Multi-Way Data Analysis and Blind Source Separation; Wiley: Hoboken, NJ, USA, 2009. [Google Scholar]
Mallat, S. A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans. Pattern Anal. Mach. Intell. 1989, 7, 674–693. [Google Scholar] [CrossRef]
Muti, D.; Bourennane, S. Multidimensional filtering based on a tensor approach. Signal Process. 2005, 12, 2338–2353. [Google Scholar] [CrossRef]
Donoho, D.; Johnstone, I. Ideal denoising in an orthonormal basis chosen from a library of bases. Comptes Rendus de l’Academie des Sciences-Serie I-Mathematique 1994, 12, 1317–1322. [Google Scholar]

Figure 1. Flowchart of the SDNW-MWPT-MWF algorithm.

Figure 2. RGB composites of $X$ and $R$ (bands 20, 35 and 45 for red, green and blue): (a) RGB composite of $X$; (b) RGB composite of $R$.

Figure 3. Mean noise variance in each band ($SNR_{INPUT} = 20$ dB).

Figure 4. Evolution of $RMSE_{SD}$ with the number of iterations for different values of $SNR_{INPUT}$: (a) 20 dB; (b) 40 dB.

Figure 5. Evolution of $RMSE_{SI}$ with the number of iterations for different values of $SNR_{INPUT}$: (a) 20 dB; (b) 40 dB.

Figure 6. Curves of noise variance versus band number with $SNR_{INPUT}$: (a) 20 dB; (b) 40 dB.

Figure 7. Normal probability plot for the noise: (a) before whitening, and after whitening using (b) SDNW-MWF; (c) SDNW-MLR; (d) SDNW-MWPT-MWF, with $SNR_{INPUT} = 20$ dB.

Figure 8. Normal probability plot for the noise: (a) before whitening, and after whitening using (b) SDNW-MLR; (c) SDNW-MWF; (d) SDNW-MWPT-MWF, with $SNR_{INPUT} = 40$ dB.

Figure 9. Noise removal using: (a) SDNW-MLR; (b) SDNW-MWF; (c) SDNW-MWPT-MWF, band 10, $SNR_{INPUT} = 20$ dB.

Figure 10. Evolution of $SNR_{OUTPUT}$ with the number of iterations for different values of $SNR_{INPUT}$: (a) 20 dB; (b) 40 dB.

Figure 11. Comparison of denoising results based on $SNR_{OUTPUT}$ versus $SNR_{INPUT}$.

Figure 12. Classification reference data: (a) ground truth of the area; (b) nine classes in ROSIS HSI.

Figure 13. HSI images: (a) ground truth; (b) classes in HYDICE HSI.

Figure 14. Classification results of the ROSIS HSI after denoising by: (a) SDNW-MLR; (b) SDNW-MWF; and (c) SDNW-MWPT-MWF. The classification result without denoising (d) is supplied as a benchmark ($SNR_{INPUT} = 30$ dB).

Figure 15. Classification results of the HYDICE HSI after denoising by: (a) SDNW-MLR; (b) SDNW-MWF; and (c) SDNW-MWPT-MWF. The classification result without denoising (d) is supplied as a benchmark ($SNR_{INPUT} = 30$ dB).

Figure 16. Classification OA with respect to $SNR_{INPUT}$.

Table 1. Training and testing samples used in the classification.

ID	Class	Training Samples	Testing Samples
1	Bitumen	133	1330
2	Self-Blocking Bricks	171	1709
3	Trees	70	697
4	Shadows	49	486
5	Gravel	93	929
6	Bare Soil	503	5029
7	Asphalt	187	1868
8	Painted metal sheets	135	1345
9	Meadows	201	2005
	Total	1542	15,398

Table 2. Training and testing samples used in the classification.

ID	Class	Training Samples	Testing Samples	Color
1	Field	12,174	40,811	Green
2	Trees	1361	5537	Sea green
3	Road	1146	3226	White
4	Shadow	1363	5036	Maroon
5	Target 1	138	519	Red
6	Target 2	78	285	Blue
7	Target 3	67	223	Yellow
	Total	16,327	55,637

Table 3. OA (%) and Kappa of the classification of the denoised ROSIS HSI, $SNR_{INPUT} = 30$ dB.

Methods	Without Denoising	SDNW-MLR	SDNW-MWF	SDNW-MWPT-MWF
OA	91.33	91.88	94.20	98.51
Improvement	0	0.55	2.87	7.18
Kappa	0.79	0.81	0.91	0.96
Improvement	0	0.02	0.12	0.17
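
The OA and Kappa rows in these tables are summary statistics of a classification confusion matrix. As a reading aid, the following is a minimal sketch of how the two quantities are conventionally computed (overall accuracy and Cohen's kappa). The function name and the 2-class confusion matrix are hypothetical and are not taken from the experiments reported here.

```python
def overall_accuracy_and_kappa(confusion):
    """Return (OA, kappa) for a square confusion matrix.

    confusion[i][j] is the number of test samples whose true class is i
    and whose predicted class is j.
    """
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)

    # Overall accuracy: fraction of samples on the diagonal.
    correct = sum(confusion[i][i] for i in range(n_classes))
    oa = correct / total

    # Chance agreement from the row (true) and column (predicted) marginals.
    row_sums = [sum(row) for row in confusion]
    col_sums = [sum(confusion[i][j] for i in range(n_classes))
                for j in range(n_classes)]
    expected = sum(r * c for r, c in zip(row_sums, col_sums)) / total ** 2

    # Cohen's kappa: agreement beyond chance, normalised by its maximum.
    kappa = (oa - expected) / (1 - expected)
    return oa, kappa


# Hypothetical 2-class example (illustrative numbers only).
cm = [[45, 5],
      [10, 40]]
oa, kappa = overall_accuracy_and_kappa(cm)
print(f"OA = {100 * oa:.2f} %, Kappa = {kappa:.3f}")
```

The "Improvement" rows are then the difference between each method's value and the "Without Denoising" value in the same row.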

Table 4. OA (%) and Kappa of the classification of the denoised ROSIS HSI, $SNR_{INPUT} = 30$ dB.

Methods	Without Denoising	SDNW-MLR	SDNW-MWF	SDNW-MWPT-MWF
OA	91.98	92.73	95.92	99.06
Improvement	0	0.75	3.94	7.08
Kappa	0.81	0.84	0.94	0.98
Improvement	0	0.03	0.13	0.16

Table 5. OA (%) and Kappa of the classification of the denoised HYDICE HSI, $SNR_{INPUT} = 30$ dB.

Methods	Without Denoising	SDNW-MLR	SDNW-MWF	SDNW-MWPT-MWF
OA	95.60	95.66	97.22	99.94
Improvement	0	0.06	1.62	4.34
Kappa	0.894	0.895	0.945	0.989
Improvement	0	0.001	0.051	0.095

© 2018 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
