Satellite Multispectral and Hyperspectral Image De-Noising with Enhanced Adaptive Generalized Gaussian Distribution Threshold in the Wavelet Domain

Abstract: The presence of noise in remote sensing satellite images can limit analysis and object recognition. Noise suppression methods based on thresholding neural networks (TNN) and optimization algorithms perform well in de-noising, but several problems remain. Finding the optimal threshold value is a challenging task for learning algorithms, and optimization-based noise removal requires running an optimization algorithm to obtain it. These methods are effective at reducing noise but may blur parts of an image, and they are time-consuming. These flaws motivated the authors to develop an efficient de-noising method to discard unwanted noise from these images. This study presents a new enhanced adaptive generalized Gaussian distribution (AGGD) threshold for satellite and hyperspectral image (HSI) de-noising. The function is data-driven and non-linear, and it can be fitted to any image. Applying it yields an optimum threshold value without any least mean square (LMS) learning or optimization algorithm, which also saves processing time. The proposed function contains two main parts: an AGGD threshold inside the interval [−t, t], where t is the threshold value, and a new non-linear function outside the interval. Together, these functions tune the wavelet coefficients properly. We applied the proposed technique to various satellite remote sensing images. We also used hyperspectral remote sensing images from the AVIRIS, HYDICE, and ROSIS sensors for experimental analysis and validation. We used the peak signal-to-noise ratio (PSNR) and the Mean Structural Similarity Index (MSSIM) to evaluate the performance of the different de-noising techniques. Finally, this study shows the superiority of the developed method over previous TNN and optimization-based noise suppression methods. Moreover, as the results indicate, the proposed method significantly improves PSNR values and visual quality when compared with various image de-noising methods.


Introduction
Satellite and hyperspectral image (HSI) de-noising has recently become very popular among remote sensing researchers. Noise removal is one of the critical and challenging tasks in image and signal processing. Various types of unwanted noise can degrade the quality and resolution of an image during acquisition and transmission, causing it to deviate from the original. These artifacts therefore affect the characteristics and attributes of the image. Noise removal plays an important role as a preprocessing step in applications such as satellite and remote sensing image processing [1], before any further study or processing of the image.
Different noise types corrupt and distort the original image in different ways. Suitable models may therefore be required to represent each noise type according to its characteristics, after which an appropriate noise reduction method can be applied. This process must be done carefully so that the most important attributes of an image are kept and the unimportant ones are discarded [1], which in turn allows further analysis to be carried out efficiently. A wide range of techniques has been proposed for noise removal.
For example, Chen et al. (2015) [2] presented a weighted couple of sparse representations. Chan et al. (2005) [3] introduced median-type noise detectors and detail-preserving regularization-based noise removal. A universal noise reduction algorithm, combined with an impulse detector, was proposed in 2005 [4]. Yin et al. [5] proposed a highly accurate image reconstruction process for multimodal noise removal. Decker et al. in 2011 [6] introduced mode estimation in high-dimensional spaces using flat-top kernels. Portilla et al. in 2003 [7] presented noise removal with a scale mixture of Gaussians. Additionally, Miura et al. in 2013 [8] proposed randomly valued impulse noise reduction with Gaussian curvature. Lu et al. (2013) [9] proposed sparse coding for noise removal utilizing a spike and slab prior. A new technique for image restoration in the presence of impulse noise was proposed by Yuan and Ghanem in 2015 [10]. Lin et al. (2010) [11] introduced a switching bilateral filter combined with a noise detector for universal noise reduction. Awad et al. in 2011 [12] used the standard deviation to find the optimal direction for impulse noise reduction. Random-valued impulse noise removal utilizing a new directionally weighted median filter was proposed by Dong and Xu in 2007 [13]. Impulse noise removal with a new adaptive center weighted median (ACWM) filter was studied by Lin (2007) [14]. Green et al. (1988) [15] introduced noise reduction based on a transformation for ordering multispectral data in terms of image quality. Hyperspectral noise removal using a cubic total variation model was presented in 2012 [16].
Noise removal based on the wavelet transform has become very popular in image processing, and various noise reduction approaches have been developed in the wavelet domain. Adapting to unknown smoothness via wavelet shrinkage was introduced in 1995 [17]. Chang et al. in 2000 [18] proposed spatially adaptive wavelet thresholding-based noise removal with context modeling. Figueiredo and Nowak (2001) utilized Jeffreys' noninformative prior, combined with an empirical Bayes approach, for wavelet-based image estimation [19]. A novel Bayesian multiscale method for speckle noise suppression in medical ultrasound images was introduced in 2001 [20]. Image de-noising based on the curvelet transform was proposed by Starck et al. (2002) [21]. The bivariate shrinkage function for noise removal in the wavelet domain was proposed by Sendur and Selesnick (2002) [22]. Ref. [23] presented image de-noising and compression using adaptive wavelet thresholding. Sveinsson and Benediktsson in 2003 [24] introduced speckle noise removal for SAR images using almost translation-invariant wavelet transformations. Achim et al. (2003) proposed SAR image de-noising via Bayesian wavelet shrinkage based on heavy-tailed modeling [25]. Hyperspectral remote sensing noise reduction with the 3D un-decimated wavelet transform (UWT) and a new, improved soft thresholding function was presented in 2018 [26]. Pižurica et al. (2002) proposed Bayesian wavelet-based noise removal using a joint inter- and intrascale statistical model [27]. Translation-invariant wavelet-based noise reduction with a smooth sigmoid-based shrinkage (SSBS) function, and UWT-based noise reduction utilizing a soft thresholding function, were proposed in Refs. [28] and [29], respectively. Li et al., in Ref. [30], presented an approach for hyperspectral image de-noising using non-local low-rank and TV regularization.
Additionally, low-rank tensor approximation combined with robust noise modeling was proposed in Ref. [31] for remote sensing image de-noising. Elad et al. (2006) [32] introduced image de-noising using sparse and redundant representations over learned dictionaries. Translation-invariant de-noising was proposed by Coifman and Donoho in 1995 [33]. Thresholding neural network-based noise reduction with a smooth sigmoid-based shrinkage function was introduced in 2017 [34]. Dan et al. (2011) proposed a wavelet image de-noising algorithm based on local adaptive Wiener filtering [35]. Thresholding neural networks with an improved threshold function were proposed in 2017 [36]. Zhang (2001) presented thresholding neural networks (TNN) for adaptive noise reduction [37]. Noise removal with enhanced thresholding and a median filter was introduced in 2018 [38]. Sahraeian et al. (2007) [39] proposed image de-noising in the wavelet domain based on an improved TNN and cycle spinning. Nasri and Nezamabadi-pour in 2009 introduced a new adaptive function with three shape-tuning parameters for image de-noising in the wavelet domain [40].
Thresholding neural networks use steepest-descent, gradient-based least mean square (LMS) learning to obtain the optimum threshold value. This process is time-consuming, which is one of the disadvantages of the thresholding neural network. To overcome this problem, optimization-based noise removal was proposed by Bhandari et al. (2016) [41]. In their technique, the authors combined the adaptive threshold function with three shape-tuning parameters [40] with the JADE (adaptive differential evolution) optimization algorithm [42]. In 2019, Golilarz et al. [1] applied the Harris Hawks optimization (HHO) algorithm [43], which improved on the qualitative and quantitative results of the JADE algorithm. Subsequently, the authors proposed an improved version of the previous adaptive generalized Gaussian distribution (AGGD) threshold function [44] to improve the efficiency of both TNN and optimization-based image de-noising. This approach also reduces the processing time remarkably.
In this study, we propose a new AGGD-based function to improve the results and quality of the former method. In the proposed enhanced AGGD-based noise reduction, the wavelet coefficients inside the interval [−t, t] are tuned using the AGGD function, and outside the interval a new non-linear data-driven function is applied to tune the wavelet coefficients. The proposed method differs from the former improved AGGD in two respects: it uses t = σ as the threshold value, where σ is the robust median estimator, and it applies a new non-linear function outside the interval. We compared the proposed technique with other methods available in the literature. The results show the superiority of the proposed method over former studies in terms of higher peak signal-to-noise ratio (PSNR) values.
The rest of the paper is organized as follows. Section 2 describes noise, the wavelet transform, and the procedure of image de-noising in the wavelet domain. It also explains TNN and optimization-based noise removal. Section 3 presents the proposed method and gives an overview of AGGD and improved AGGD. Section 4 describes the experimental analysis. Section 5 delivers the results and discussion. The conclusion of the paper is presented in Section 6.

Noise
Consider the noisy vector $y = [y_1, y_2, \ldots, y_n]^T$, corrupted by additive white Gaussian noise (AWGN):
$$y_i = x_i + \varepsilon_i, \quad i = 1, 2, \ldots, n,$$
where $x_i$ denotes the input wavelet coefficients and $\varepsilon_i$ is the iid (independent and identically distributed) Gaussian noise.
Consider the noise-free data as $x = [x_1, x_2, \ldots, x_n]^T$ and the thresholded output vector as $\hat{x} = [\hat{x}_1, \hat{x}_2, \ldots, \hat{x}_n]^T$. In wavelet-based noise suppression, it is critical to use a suitable threshold value and threshold function. The universal threshold value can be obtained with the VisuShrink technique, as formulated below [45]. VisuShrink applies universal thresholding to the detail coefficients. This threshold removes additive white Gaussian noise with high probability, but it tends to over-smooth the image, since the threshold can become quite large due to its dependence on the number of samples [46].
$$t = \sigma \sqrt{2 \ln n},$$
where n is the sample size and $\sigma$ is the robust median estimator [45], which is given below:
$$\sigma = \frac{\mathrm{median}\left(|\omega_{i,j}|\right)}{0.6745},$$
where $\omega_{i,j}$ are the coefficients in the sub-band [45].
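The universal threshold and the robust median estimator can be sketched in a few lines of Python. This is a minimal illustration, not the authors' MATLAB implementation: the function names are hypothetical, the constant 0.6745 is the usual factor in the median-based noise estimate, and taking n as the number of coefficients passed in is an assumption of this sketch.

```python
import math
import statistics


def robust_sigma(detail_coeffs):
    """Median-based noise estimate: median(|w|) / 0.6745."""
    abs_coeffs = [abs(w) for w in detail_coeffs]
    return statistics.median(abs_coeffs) / 0.6745


def universal_threshold(detail_coeffs):
    """Universal (VisuShrink) threshold t = sigma * sqrt(2 * ln(n)).

    Assumption: n is taken as the number of coefficients passed in.
    """
    n = len(detail_coeffs)
    sigma = robust_sigma(detail_coeffs)
    return sigma * math.sqrt(2.0 * math.log(n))


# Hypothetical detail coefficients from one sub-band.
coeffs = [0.6745, -0.6745, 0.6745]
sigma = robust_sigma(coeffs)
t = universal_threshold(coeffs)
```

Because the threshold grows with ln(n), larger sub-bands receive larger thresholds, which is the source of the over-smoothing mentioned above.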
Applying the discrete wavelet transform to the input data yields the approximation and detail components. The input data are passed through low-pass (LP) and high-pass (HP) filters, and each filter output is then down-sampled. The filtered, down-sampled signal becomes the input to the next stage, where it is again passed through the LP and HP filters [44,47]. This procedure continues until we obtain four sub-bands, namely, LL, LH, HL and HH, as can be seen from Figure 1. H denotes high frequency, and L denotes low frequency.
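The filter-and-down-sample procedure can be illustrated with the simplest wavelet. The sketch below uses the orthonormal Haar filter rather than the Db4 wavelet used in the experiments, purely to keep the example short, and the function names are hypothetical. Which detail band is called LH and which HL varies between toolboxes, so the labels here are one possible convention.

```python
def haar_dwt2(image):
    """One-level 2D Haar DWT of a list-of-lists image with even height and width.

    Returns (LL, LH, HL, HH), each half the size of the input.
    """
    half_rows, half_cols = len(image) // 2, len(image[0]) // 2
    ll = [[0.0] * half_cols for _ in range(half_rows)]
    lh = [[0.0] * half_cols for _ in range(half_rows)]
    hl = [[0.0] * half_cols for _ in range(half_rows)]
    hh = [[0.0] * half_cols for _ in range(half_rows)]
    for i in range(half_rows):
        for j in range(half_cols):
            a, b = image[2 * i][2 * j], image[2 * i][2 * j + 1]
            c, d = image[2 * i + 1][2 * j], image[2 * i + 1][2 * j + 1]
            ll[i][j] = (a + b + c + d) / 2.0  # low-pass in both directions
            lh[i][j] = (a + b - c - d) / 2.0  # difference between the two rows
            hl[i][j] = (a - b + c - d) / 2.0  # difference between the two columns
            hh[i][j] = (a - b - c + d) / 2.0  # diagonal difference
    return ll, lh, hl, hh


def haar_idwt2(ll, lh, hl, hh):
    """Inverse of haar_dwt2: rebuild the image from its four sub-bands."""
    half_rows, half_cols = len(ll), len(ll[0])
    image = [[0.0] * (2 * half_cols) for _ in range(2 * half_rows)]
    for i in range(half_rows):
        for j in range(half_cols):
            s, v, h, g = ll[i][j], lh[i][j], hl[i][j], hh[i][j]
            image[2 * i][2 * j] = (s + v + h + g) / 2.0
            image[2 * i][2 * j + 1] = (s + v - h - g) / 2.0
            image[2 * i + 1][2 * j] = (s - v + h - g) / 2.0
            image[2 * i + 1][2 * j + 1] = (s - v - h + g) / 2.0
    return image


# Hypothetical 4 x 4 test image.
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
ll, lh, hl, hh = haar_dwt2(image)
restored = haar_idwt2(ll, lh, hl, hh)
```

In a de-noising setting, the detail sub-bands would be thresholded between the forward and inverse transforms, while LL is usually left unchanged.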

Explanation of TNN and Optimization-Based Image De-noising
Wavelet-based image de-noising can be performed as follows [1,37]. Applying a wavelet transform yields the wavelet detail (important or significant) components and the noisy (non-important) components. The important features of the image need to be kept, and the non-important characteristics need to be discarded, so a suitable threshold function is applied. The threshold function tunes the wavelet components according to the threshold value, which yields the thresholded wavelet coefficients. To obtain the output de-noised image, the inverse discrete wavelet transform (IDWT) is then applied. This de-noising process is shown in Figure 2.
Zhang first proposed noise removal using a thresholding neural network (TNN) in 2001 [37]. Unlike in a conventional neural network, in a TNN the linear transform is fixed, and the activation function can be adaptive. In this network, the noisy coefficients are obtained by applying the linear transform to the noisy input image. These noisy components are then passed through the non-linear activation function to obtain the thresholded wavelet coefficients. Finally, the inverse linear transform is applied to these thresholded coefficients to obtain the noise-free image. The improved soft and hard functions proposed by Zhang (2001) [37] are two types of non-linear threshold function that can be used in this network.
$$\eta_\lambda(\omega, t) = \omega + \frac{1}{2}\left(\sqrt{(\omega - t)^2 + \lambda} - \sqrt{(\omega + t)^2 + \lambda}\right),$$
where $\eta_\lambda$ is the improved soft threshold function, $\omega$ is the wavelet coefficient, t is the threshold value, and $\lambda > 0$ is a user-defined function parameter [37].
$$\eta_\beta(\omega, t) = \left(\frac{1}{1 + e^{-\beta(\omega - t)}} - \frac{1}{1 + e^{-\beta(\omega + t)}} + 1\right)\omega,$$
where $\eta_\beta$ is the improved hard threshold function, $\omega$ is the wavelet coefficient, t is the threshold value, and $\beta > 0$ is a user-defined function parameter [37].
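The two smooth threshold functions can be sketched as follows. The formulas in the docstrings are the commonly cited forms of Zhang's improved soft and hard functions and should be checked against Ref. [37]. The function names, the default parameter values, and the numerically stable sigmoid helper are assumptions of this sketch.

```python
import math


def _sigmoid(z):
    """Logistic function written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def improved_soft(w, t, lam=0.01):
    """Smooth soft threshold: w + 0.5*(sqrt((w-t)^2+lam) - sqrt((w+t)^2+lam)).

    As lam approaches 0 this tends to the standard soft threshold.
    """
    return w + 0.5 * (math.sqrt((w - t) ** 2 + lam) - math.sqrt((w + t) ** 2 + lam))


def improved_hard(w, t, beta=10.0):
    """Smooth hard threshold: (sigmoid(beta*(w-t)) - sigmoid(beta*(w+t)) + 1) * w.

    As beta grows this tends to the standard hard threshold.
    """
    return (_sigmoid(beta * (w - t)) - _sigmoid(beta * (w + t)) + 1.0) * w


# A large coefficient is shrunk by about t (soft) or kept (hard);
# a small coefficient is pushed towards zero by both.
soft_large = improved_soft(10.0, 2.0, lam=1e-6)
hard_large = improved_hard(10.0, 2.0, beta=50.0)
hard_small = improved_hard(0.5, 2.0, beta=50.0)
```

Both functions are differentiable with respect to t, which is what allows the threshold to be learned by gradient-based methods in a TNN.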
The optimum threshold value at step S + 1 can be obtained as follows [39]:
$$t(S + 1) = t(S) + \Delta t(S),$$
where $\Delta t(S)$ is defined as:
$$\Delta t(S) = -\alpha \left.\frac{\partial J(t)}{\partial t}\right|_{t = t(S)},$$
where $\alpha$ is the learning rate, and $J(t)$ is the mean square error (MSE) risk function.
Nasri and Nezamabadi-pour in 2009 [40] proposed a sub-band adaptive TNN with a new non-linear threshold function. In adaptive wavelet-based noise removal techniques, the threshold functions are chosen to be non-linear and adaptive. This function is given below.
where $\omega$ is the wavelet coefficient, a and b are the shape-tuning parameters, and c determines the asymptote of the function.
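The steepest-descent threshold update described above can be sketched on synthetic data. This is an illustrative assumption-laden example: it uses the smooth soft threshold commonly attributed to Zhang [37] as the activation, and it computes the MSE risk against a known clean signal, which is only available in simulation. All names, the learning rate, and the synthetic signal are hypothetical.

```python
import math
import random


def soft(w, t, lam):
    """Smooth soft threshold (assumed form, see Ref. [37])."""
    return w + 0.5 * (math.sqrt((w - t) ** 2 + lam) - math.sqrt((w + t) ** 2 + lam))


def dsoft_dt(w, t, lam):
    """Partial derivative of soft(w, t, lam) with respect to the threshold t."""
    return 0.5 * (-(w - t) / math.sqrt((w - t) ** 2 + lam)
                  - (w + t) / math.sqrt((w + t) ** 2 + lam))


def mse_risk(noisy, clean, t, lam):
    """J(t): mean squared error between thresholded and clean coefficients."""
    return sum((soft(y, t, lam) - x) ** 2 for y, x in zip(noisy, clean)) / len(noisy)


def risk_gradient(noisy, clean, t, lam):
    """dJ/dt obtained with the chain rule."""
    total = sum((soft(y, t, lam) - x) * dsoft_dt(y, t, lam) for y, x in zip(noisy, clean))
    return 2.0 * total / len(noisy)


def learn_threshold(noisy, clean, t0=0.1, alpha=0.05, lam=0.1, steps=200):
    """Iterate t(S+1) = t(S) - alpha * dJ/dt for a fixed number of steps."""
    t = t0
    for _ in range(steps):
        t -= alpha * risk_gradient(noisy, clean, t, lam)
    return t


# Synthetic sparse coefficients (hypothetical data) with unit-variance Gaussian noise.
random.seed(0)
clean = [5.0 if i % 10 == 0 else 0.0 for i in range(200)]
noisy = [x + random.gauss(0.0, 1.0) for x in clean]

risk_before = mse_risk(noisy, clean, 0.1, 0.1)
t_learned = learn_threshold(noisy, clean)
risk_after = mse_risk(noisy, clean, t_learned, 0.1)
```

Each step requires a full pass over the coefficients, which gives a rough sense of why iterative threshold learning adds processing time compared with a closed-form threshold.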
Noise removal using a TNN with steepest descent learning is time-consuming. In 2016, Bhandari et al. [41] proposed replacing the TNN with a meta-heuristic, nature-inspired optimization algorithm (the JADE algorithm) to improve the qualitative and quantitative results and to increase the speed. In 2019, Golilarz et al. [1] improved the quality of the previous method by utilizing the HHO optimizer. It was shown that using the HHO algorithm can increase the processing speed remarkably. The whole procedure of optimization-based noise reduction is shown in Figure 3.

Explanation of AGGD and Improved AGGD
Although optimization-based noise reduction could solve some drawbacks of utilizing the TNN, the quality of the de-noised image needs to be improved, and the computational time also needs to be decreased. In 2019, an adaptive generalized Gaussian distribution (AGGD)-oriented threshold was proposed for image de-noising [44]. The process of acquiring this data-driven function is formulated below [44]. This procedure is illustrated in Figure 4.
Step 1: Produce a zero-mean Gaussian distribution function as below.
Step 2: Produce B(ω) as follows. B(ω) can then be alternatively formulated as follows.
Step 3: Produce the normalized B(ω).
Step 4: Produce D(ω) as below.
Step 5: Produce the AGGD threshold function E(ω):
where E(ω) is the AGGD threshold function, ω is the wavelet coefficient, σ is the robust median estimator [45], and t is the threshold value, which is the intersection of D(ω) and the identity function.
To improve the quality of the previous research, Golilarz et al. (2019) [1] presented a complete non-linear function as an improved version of the AGGD threshold function. This function could also improve the processing speed of former AGGD, TNN-based, and optimization-based noise reduction techniques. It is not required to apply the steepest descent learning and optimization algorithms to attain the threshold value [1]. The improved AGGD function is formulated below. Figure 5 depicts both the AGGD and the improved AGGD functions together.

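[Figure 4. Flowchart of the procedure for obtaining the AGGD threshold function; surviving box text: "Remove the discontinuity and produce D(ω)".]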

Proposed Enhanced AGGD
The improved AGGD threshold function enhanced the qualitative and quantitative results of TNN and optimization-based noise removal. As mentioned before, the improved AGGD consists of two main parts: inside the interval [−t, t] the function is the adaptive GGD, and outside the interval it is a non-linear function. Its threshold value is t > σ, where σ is the robust median estimator [45]. In this section, an enhanced version of the former improved AGGD threshold is presented. In the proposed technique, t = σ. Like the improved AGGD, the developed function contains two parts: the adaptive GGD inside the interval [−t, t], and a non-linear function outside it. Note that, unlike the improved AGGD function, there is no intersection between the adaptive GGD and the identity function. Based on the experimental results, this function improves the performance of the former techniques and also provides a faster processing time. The proposed function is formulated below. Figure 6 illustrates the enhanced adaptive GGD threshold.

Experimental Analysis
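In this study, to evaluate the de-noising performance of the various methods, we utilized the peak signal-to-noise ratio (PSNR), which is given below:
$$\mathrm{PSNR} = 10 \log_{10}\left(\frac{255^2}{\mathrm{MSE}}\right),$$
where MSE is the mean square error, as follows:
$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \left(I(i, j) - \hat{I}(i, j)\right)^2,$$
where $I$ is the original image, $\hat{I}$ is the de-noised image, and $M \times N$ is the size of the image [41].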
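As a worked example of the PSNR metric, the following sketch computes the MSE and PSNR for two small images stored as lists of lists. The peak value of 255 assumes 8-bit images, and the function names and test images are hypothetical.

```python
import math


def mse(original, denoised):
    """Mean square error between two images of equal size."""
    rows, cols = len(original), len(original[0])
    total = 0.0
    for i in range(rows):
        for j in range(cols):
            diff = original[i][j] - denoised[i][j]
            total += diff * diff
    return total / (rows * cols)


def psnr(original, denoised, peak=255.0):
    """PSNR in dB; returns infinity when the two images are identical."""
    err = mse(original, denoised)
    if err == 0.0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)


# Every pixel is off by 10 grey levels, so MSE = 100.
original = [[100, 100], [100, 100]]
denoised = [[110, 110], [110, 110]]
error = mse(original, denoised)
value = psnr(original, denoised)  # about 28.13 dB
```

A higher PSNR corresponds to a lower MSE, so the two metrics always rank de-noising results for the same image in the same order.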
We performed several experiments to analyze the results of the proposed approach. In all the experiments, the Db4 wavelet with one level of decomposition was used. We used eight original satellite images, as shown in Figure 7. The dataset is available in Ref. [1]. We also utilized four hyperspectral images, namely, Indian Pines (captured by the AVIRIS sensor), Washington DC Mall (captured by the HYDICE sensor), and Pavia Center and Pavia University (captured by the ROSIS sensor). The dataset is available in Ref. [47]. For the optimization algorithms used in this section, the swarm size is set to 30 and the maximum number of iterations to 500 [1]. The parameters of the HHO and JADE optimizers are the same as those used in Refs. [43] and [41], respectively. In these experiments, the noisy images were generated with additive white Gaussian noise (AWGN) with zero mean and different standard deviations. The implementations were written in MATLAB and run on a computer with an Intel Core i7 processor and 16 GB of RAM.

Results and Discussions
In the first experiment, as can be seen from Table 1, the proposed technique was compared with improved AGGD [1], HHO [1], Bayes [23] and Sure [17] in terms of PSNR values for different standard deviations, using five satellite images. The results indicate that the proposed enhanced AGGD performed better than the other techniques. Moreover, as can be seen from Figure 8, the proposed technique was also compared with the other methods visually and qualitatively. In this experiment, Image 4 was used as the test image and contaminated by additive white Gaussian noise (AWGN) with zero mean and a standard deviation of 30 to obtain the noisy image. The de-noised image obtained by the proposed method is visibly better than those obtained by the other approaches.

Figure 8. Visual comparison of different de-noising methods for Image 4 (panels: Noisy, Sure, Bayes, HHO, Improved AGGD, Enhanced AGGD).
Furthermore, in this experiment, we compared the enhanced AGGD-based noise reduction method (proposed method) with the improved AGGD [1], HHO-based noise reduction, Bayes [23] and Sure [17] for larger standard deviations.
Based on the results in Figure 9, the proposed method gives higher PSNR values for larger standard deviations than the other methods. To further evaluate the performance of the developed method, MSSIM is also used, in the same way as in Ref. [41]. Correspondingly, Figure 10 illustrates the quantitative analysis of several noise reduction methods for larger standard deviations in terms of MSSIM values. The developed method performs promisingly even for large standard deviations. Test Image 3 was used here.
Enhanced AGGD-based noise removal is introduced to improve the quality of the de-noised image. In addition to the quality improvement, the computational cost of the function is lower than that of the alternative methods (improved AGGD [1], HHO [1], Bayes [23], and Sure [17]). Table 2 compares the computational time of the different noise reduction methods. Test Image 3 was used in this experiment, with a standard deviation of σ = 10. For all these methods, the reported time is the average of 20 runs. The enhanced AGGD-based noise reduction method (developed method) is the fastest of the compared techniques.
In the second experiment, we compared the proposed enhanced AGGD-based noise reduction with the soft threshold technique, both qualitatively and quantitatively. Figure 11 shows this comparison. The numbers are PSNR values (dB).
Figure 11. Comparison between the proposed method and soft thresholding for Images 6, 7 and 8 at a standard deviation of 20.
In the third experiment, we used test Image 5 to compare the performance of the proposed enhanced AGGD technique with the adaptive soft and adaptive hard [46], and standard soft and standard hard, threshold functions for σ = 10, 15, 20, 25, 30. As can be seen from Figure 12, the proposed method outperforms the other methods in terms of PSNR value.
In the fourth experiment, band 20 of the Washington DC Mall (WDM) hyperspectral image [26] is utilized to show that the proposed technique also performs well in the de-noising of hyperspectral images. For WDM, we considered the same patch that was used in Ref. [26]. Figure 13 shows the comparison between the proposed enhanced AGGD method and the AGGD [44], JADE [41], and Nasri's [40] methods for different standard deviations. The results show the superiority of the proposed technique in terms of PSNR values.
In the last experiment, to further analyze the performance of the developed method, we used the Indian Pines, Pavia Center and Pavia University hyperspectral images (HSI) to compare the proposed method with other well-known approaches. The patches of these datasets are used in the same way as in Ref. [26]. The noisy images were obtained by adding AWGN with zero mean and a standard deviation of 10. In Table 3, we compared the de-noising results of the proposed technique with VisuShrink [45], SureShrink [17], BayesShrink [23], and bivariate shrinkage [22]. The results indicate that the proposed method also performs promisingly in de-noising hyperspectral remote sensing images.

Conclusions
Noise removal utilizing TNN and optimization algorithms can achieve desirable results. However, some existing issues still need to be addressed to improve their qualitative and quantitative results, particularly for light detection and ranging (LiDAR) point clouds [49] and hyperspectral images [50,51]. In wavelet-based image de-noising, the threshold value is very important. Its optimum value can be obtained using least mean square (LMS) learning in a TNN, or using an optimization algorithm in an optimization-based image de-noising approach. These TNN and optimization-based noise reduction methods perform well but still fail to preserve the visual quality of the de-noised image. A new enhanced version of the adaptive generalized Gaussian distribution (AGGD)-oriented threshold function has been introduced in this study to address this drawback. Utilizing this function lowers the computational cost, since no LMS learning or optimization algorithm is needed to obtain the optimum threshold value. The most important attributes of this function are that it is data-driven and non-linear. The function consists of two parts: an adaptive GGD function inside the interval [−t, t], where t is the threshold value, to tune the small noisy coefficients, and another non-linear function outside this interval to tune the larger wavelet coefficients. The results show that the proposed enhanced AGGD-oriented function is a promising method for de-noising satellite remote sensing images [52,53]. In future work, to achieve better results, we will investigate the use of optimization algorithms [54] and will focus on the non-linear part outside the interval, where we will use a new type of non-linear data-driven function connected to the AGGD function inside the interval [−t, t].