Article

D3CNNs: Dual Denoiser Driven Convolutional Neural Networks for Mixed Noise Removal in Remotely Sensed Images

1 School of Information and Artificial Intelligence, Nanchang Institute of Science and Technology, Nanchang 330108, China
2 Artificial Intelligence School, Wuchang University of Technology, Wuhan 430223, China
3 School of Electrical and Information Engineering, Wuhan Institute of Technology, Wuhan 430205, China
4 School of Electronic and Information Engineering, Wuhan Donghu University, Wuhan 430212, China
* Author to whom correspondence should be addressed.
Remote Sens. 2023, 15(2), 443; https://doi.org/10.3390/rs15020443
Submission received: 15 December 2022 / Revised: 31 December 2022 / Accepted: 9 January 2023 / Published: 11 January 2023
(This article belongs to the Special Issue Pattern Recognition and Image Processing for Remote Sensing II)

Abstract:
Mixed (random and stripe) noise causes serious degradation of optical remotely sensed image quality, making image contents hard to analyze. To remove such noise, various inverse problems are usually constructed with different priors and solved by either model-based optimization methods or discriminative learning methods. Each has its own drawbacks: the former are flexible but time-consuming when good performance is pursued, while the latter are fast but limited in their applicability because they are trained for specialized tasks. To quickly obtain pleasing results by combining their merits, in this paper we propose a novel denoising strategy, namely, Dual Denoiser Driven Convolutional Neural Networks (D3CNNs), to remove both random and stripe noise. D3CNNs comprises two key parts. First, two auxiliary variables, one for the denoised image and one for the stripe noise, are introduced to reformulate the inverse problem as a constrained optimization problem, which is solved iteratively with the alternating direction method of multipliers (ADMM). Second, a U-shape network is used for the image auxiliary variable, while a residual CNN (RCNN) is used for the stripe auxiliary variable. Subjective and objective experimental results on both synthetic and real-world remotely sensed images verify that the proposed method is effective and even outperforms the state-of-the-art.

1. Introduction

Remotely sensed images (RSIs) are “photographs” of the Earth’s surface that faithfully reflect the distribution of surface features, the relationships among them, and the changes in their interactions, and from which rich information including vegetation, soil moisture, water quality parameters, and land and sea surface temperature can be obtained [1]. This Earth resource information plays an important role in agriculture, forestry [2], water conservancy, oceanography, and the ecological environment [3], enabling us to remotely survey resources [4], monitor the environment [5], and analyze and predict disasters [6]. However, owing to detector and photon effects, the acquired remotely sensed images may be degraded by both stripe noise (caused by the differing responses of the individual detectors) and random noise (mainly additive Gaussian white noise (AGWN) produced by photon effects) [7], which can be formulated as
$$ y = x + s + n, \tag{1} $$
where y is the observed image, x is the latent clean image, s is the stripe noise, and n is the AGWN. It is impossible to simultaneously recover x and s from Equation (1) alone. A common workaround is a two-step method, such as first denoising [8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38] and then destriping [39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55]. Such a strategy, however, may smooth away many useful structures of image x. An example is shown in Figure 1, in which many rich structures are smoothed out. Because the two-step method reduces the two types of noise entirely independently, it may impair the structures of the stripe or the image and further smooth rich details. Therefore, each type of noise should be taken into account while the other is being processed. To address this issue and preserve as many fine structures as possible, regularizations on both x and s are considered to build a unified restoration model [56], which can be solved by optimization methods such as the alternating direction method of multipliers (ADMM) and split Bregman. The procedure of these methods can be briefly presented as follows:
  • According to Bayes’ theorem, the estimation of x and s from the posterior distribution $P(x,s \mid y)$ can be converted into the following relation
    $$ P(x,s \mid y) \propto P(y \mid x,s)\, P(x)\, P(s), \tag{2} $$
    where $P(y \mid x,s)$ is the likelihood term, which can be written as
    $$ P(y \mid x,s) \propto \exp\!\left(-\tfrac{1}{2}\left\| y - x - s \right\|_2^2\right), \tag{3} $$
    where $\|\cdot\|_2^2$ denotes the squared $\ell_2$-norm. $P(x)$ and $P(s)$ are the prior probabilities of x and s, respectively, which are used to keep the solutions close to the actual values. With proper parameters $\lambda_1$ and $\lambda_2$, the prior probability $P(x)$ is written as
    $$ P(x) \propto \exp\!\left(-\lambda_1 \Phi(x)\right), \tag{4} $$
    while $P(s)$ is defined as
    $$ P(s) \propto \exp\!\left(-\lambda_2 \Psi(s)\right), \tag{5} $$
    where $\Phi(\cdot)$ and $\Psi(\cdot)$ are different regularizations on x and s, respectively. Substituting Equations (3)–(5) into Equation (2), the posterior distribution $P(x,s \mid y)$ becomes
    $$ P(x,s \mid y) \propto P(y \mid x,s)\, P(x)\, P(s) \propto \exp\!\left\{-\left(\tfrac{1}{2}\left\| y - x - s \right\|_2^2 + \lambda_1 \Phi(x) + \lambda_2 \Psi(s)\right)\right\}. \tag{6} $$
  • Using the logarithmic transformation, the optimization is transferred from maximizing the posterior distribution to minimizing the energy function $\mathcal{L}(x,s) = -\log P(x,s \mid y)$, which is
    $$ \mathcal{L}(x,s) = \tfrac{1}{2}\left\| y - x - s \right\|_2^2 + \lambda_1 \Phi(x) + \lambda_2 \Psi(s). \tag{7} $$
  • The optimization problem (7) is then solved with ADMM or split Bregman by introducing auxiliary variables.
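To make the degradation model in Equation (1) concrete, the following minimal NumPy sketch synthesizes a mixed-noise observation; the column-wise constant-offset stripe model and the parameter values are illustrative assumptions, loosely mirroring the synthetic setup described later in Section 4.

```python
import numpy as np

def add_mixed_noise(x, sigma=25.0, stripe_intensity=30.0,
                    stripe_proportion=0.4, seed=0):
    """Synthesize y = x + s + n following Equation (1).

    x: clean image as a 2-D float array (e.g., in [0, 255]).
    s: stripe noise, modeled here as constant offsets on a random
       subset of columns (an illustrative assumption).
    n: additive Gaussian white noise with standard deviation sigma.
    """
    rng = np.random.default_rng(seed)
    h, w = x.shape

    # Stripe component: a given proportion of columns, each with a
    # constant offset drawn uniformly from [-intensity, intensity].
    s = np.zeros_like(x, dtype=float)
    cols = rng.choice(w, size=int(stripe_proportion * w), replace=False)
    s[:, cols] = rng.uniform(-stripe_intensity, stripe_intensity, size=cols.size)

    # Random component: AGWN.
    n = rng.normal(0.0, sigma, size=x.shape)

    return x + s + n, s, n
```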

2. Related Works

Because stripe noise coexists with random noise, the mixed noise is difficult to formulate with an explicit expression, since it is neither independent nor identically distributed. Usually, a unified energy model L is constructed with different penalty priors Φ and Ψ on the image x and the stripe s, respectively, to separate them from an image decomposition perspective [43], in which the stripe is treated as an image component in its own right. According to the categories of priors, these methods can be divided into the following three groups:
  • Sparsity-based priors: These methods assume that the image, and especially the stripe, is sparse, so different priors, such as gradient-based variation, dictionary-based learning, and low-rank recovery, are combined to constrain the models in pursuit of the optimal approximate solution. For instance, Huang et al. [56] proposed a unified mixed noise removal model employing joint analysis and weighted synthesis sparsity priors (JAWS). Chang et al. [57] employed unidirectional total variation and sparse representation (UTVSR) to simultaneously destripe and denoise remote sensing images. Xiong et al. [58] proposed a spectral-spatial $L_0$ gradient regularized low-rank tensor factorization method for hyperspectral denoising. Zheng et al. [59] removed mixed noise in hyperspectral images via low-fibered-rank regularization. Liu et al. [60] used global and local sparsity constraints to construct a unified model that simultaneously estimates the intensity bias and removes stripe noise in noisy infrared images. Zeng et al. [61] proposed a hyperspectral image restoration model with global $L_{1-2}$ spatial-spectral total variation regularized local low-rank tensor recovery. Xie et al. [62] denoised hyperspectral images via non-convex regularized low-rank and sparse matrix decomposition. Hu et al. [63] proposed a restoration method that can simultaneously remove Gaussian noise and stripes using adaptive anisotropy total variation and nuclear norms. Wang et al. [64] presented an $l_0$–$l_1$ hybrid total variation model for hyperspectral image mixed noise removal and compressed sensing. Wang et al. [65] exploited a nonconvex logarithmic penalty for hyperspectral image denoising. These methods achieve impressive denoising performance at the cost of high computational complexity.
  • Sparsity-based priors combined with a deep CNN denoiser prior: Recently, the deep convolutional neural network (CNN), as a prior for a specialized task, has been popularly applied in various fields, especially image restoration, owing to its fast speed and large modeling capacity. This capability has been exploited as an image prior to solve the inverse problem of image restoration [66,67,68], with considerable advantage. Inspired by its encouraging performance, Huang et al. [69] combined a deep CNN prior with a unidirectional variation prior (UV-DCNN) to simultaneously destripe and denoise optical remotely sensed images. Zeng et al. [70] used a CNN denoiser prior regularized low-rank tensor recovery for hyperspectral image restoration. These unfolding image denoising methods interpret a truncated unfolding optimization as an end-to-end trainable deep network and thus usually produce pleasing results within fewer iterations, at the cost of additional training for each task [68].
  • Discriminative learning priors: As Gaussian white noise and stripe noise are both additive, various CNN-based denoising methods have also been proposed to estimate both the image and the stripe. For example, He et al. [71] proposed a deep-learning approach to correct single-image-based nonuniformity in uncooled long-wave infrared detectors. Chang et al. [72] introduced a deep convolutional neural network (DCNN), named HSI-DeNet, for HSI noise removal. Zhang et al. [73] employed a spatial-spectral gradient network to remove hybrid noise in hyperspectral images. Luo et al. [74] suggested a spatial–spectral constrained deep image prior (S2DIP), which capitalizes on the high representation ability of the CNN in an unsupervised manner and does not need any extra training data. Despite the effectiveness of these methods, the CNN models are pretrained and cannot be jointly optimized with the other parameters.
Inspired by the ideas in Refs. [66,67,68,69], in this paper we propose a unified mixed noise removal framework, named Dual Denoiser Driven Convolutional Neural Networks (D3CNNs), to take advantage of both the optimization-based and the discriminative-learning-based methods. The flowchart of the proposed D3CNNs approach is shown in Figure 2. The main contributions of this paper are as follows:
  • A unified mixed noise removal (MNR) framework, named Dual Denoiser Driven Convolutional Neural Networks (D3CNNs), is proposed using CNN-based denoiser and striper priors.
  • Two deep denoiser/striper priors, trained respectively with a highly flexible U-shape denoiser and an effective residual learning strategy, are plugged as two modular parts into a half quadratic splitting based iterative algorithm to solve the inverse problem.
  • Quantitative and qualitative results of experiments on both synthetic and real-world images validate the effectiveness of the proposed mixed noise removal scheme, which even outperforms other advanced denoising approaches.

3. Dual Denoiser Driven Convolutional Neural Networks

Although various variable splitting algorithms can be employed to solve model (7), in this paper we adopt the half quadratic splitting (HQS) method due to its simplicity and fast convergence [68].

3.1. Half Quadratic Splitting (HQS) Algorithm

In order to plug the denoiser prior as well as the striper prior into the optimization procedure of Equation (7), two auxiliary variables z 1 and z 2 are introduced in HQS to decouple the data term and prior terms of Equation (7) and to reformulate it as a constrained optimization problem given by
$$ \mathcal{L}(x,s) = \tfrac{1}{2}\left\| y - x - s \right\|^2 + \lambda_1 \Phi(z_1) + \lambda_2 \Psi(z_2), \quad \text{s.t.} \ z_1 = x, \ z_2 = s. \tag{8} $$
Then, Equation (8) is solved by minimizing the following cost function
$$ \mathcal{L}_{\mu_1,\mu_2}(x,s) = \tfrac{1}{2}\left\| y - x - s \right\|^2 + \lambda_1 \Phi(z_1) + \tfrac{\mu_1}{2}\left\| x - z_1 \right\|^2 + \lambda_2 \Psi(z_2) + \tfrac{\mu_2}{2}\left\| s - z_2 \right\|^2, \tag{9} $$
where $\mu_1$ and $\mu_2$ are penalty parameters that vary iteratively in a non-descending order. Problem (9) can be addressed by the alternating direction method of multipliers (ADMM), which iteratively solves the following subproblems for each variable while keeping the remaining variables fixed:
$$ \min_{x,s} \ \tfrac{1}{2}\left\| y - x - s \right\|^2 + \tfrac{\mu_1}{2}\left\| z_1 - x \right\|^2 + \tfrac{\mu_2}{2}\left\| z_2 - s \right\|^2, \tag{10} $$
$$ \min_{z_1} \ \lambda_1 \Phi(z_1) + \tfrac{\mu_1}{2}\left\| z_1 - x \right\|^2, \tag{11} $$
$$ \min_{z_2} \ \lambda_2 \Psi(z_2) + \tfrac{\mu_2}{2}\left\| z_2 - s \right\|^2. \tag{12} $$
The subproblem (10) can be further separated into two subproblems, and then Equation (9) can be iteratively solved by the following subproblems:
$$ x^{k+1} = \arg\min_{x} \ \tfrac{1}{2}\left\| y - x - s^{k} \right\|^2 + \tfrac{\mu_1}{2}\left\| z_1^{k} - x \right\|^2, \tag{13} $$
$$ s^{k+1} = \arg\min_{s} \ \tfrac{1}{2}\left\| y - x^{k+1} - s \right\|^2 + \tfrac{\mu_2}{2}\left\| z_2^{k} - s \right\|_2^2, \tag{14} $$
$$ z_1^{k+1} = \arg\min_{z_1} \ \tfrac{1}{2\left(\sqrt{\lambda_1/\mu_1}\right)^2}\left\| z_1 - x^{k+1} \right\|^2 + \Phi(z_1), \tag{15} $$
$$ z_2^{k+1} = \arg\min_{z_2} \ \tfrac{1}{2\left(\sqrt{\lambda_2/\mu_2}\right)^2}\left\| z_2 - s^{k+1} \right\|^2 + \Psi(z_2). \tag{16} $$
As we can see, the data term and regularization term are separated into four individual subproblems. To be specific, Equations (13) and (14) are both quadratic regularized least-squares problems which have fast closed-form solutions:
$$ x^{k+1} = \frac{y + \mu_1 z_1^{k} - s^{k}}{1 + \mu_1}, \tag{17} $$
$$ s^{k+1} = \frac{y + \mu_2 z_2^{k} - x^{k+1}}{1 + \mu_2}. \tag{18} $$
From a Bayesian perspective [75], subproblem (15) corresponds to Gaussian denoising of the image $x^{k+1}$ by a Gaussian denoiser with noise level $\sqrt{\lambda_1/\mu_1}$, and subproblem (16) corresponds to restoring the stripe image $s^{k+1}$ by a stripe restorer with noise level $\sqrt{\lambda_2/\mu_2}$. Consequently, the denoiser and the stripe restorer act as two modular parts plugged into the alternating iterations to solve Equation (8). Accordingly, Equations (15) and (16) can be rewritten as follows:
$$ z_1^{k+1} = \mathrm{Denoiser}\!\left( x^{k+1}, \sqrt{\lambda_1/\mu_1} \right), \tag{19} $$
$$ z_2^{k+1} = \mathrm{Striper}\!\left( s^{k+1}, \sqrt{\lambda_2/\mu_2} \right). \tag{20} $$
From Equations (19) and (20), two benefits can be observed. First, the priors $\Phi(\cdot)$ and $\Psi(\cdot)$ are implicitly replaced by a denoiser and a stripe restorer, respectively; such a promising property allows the denoising and stripe-restoration subproblems to be solved jointly. Second, it is appealing to learn a DCNN denoiser and a DCNN stripe restorer for Equations (19) and (20), respectively, so as to exploit the advantages of DCNNs, such as high flexibility, efficiency, and powerful modeling capacity.
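As a summary of Equations (17)–(20), the following Python sketch performs one iteration of the alternating scheme; `denoiser` and `striper` are placeholder callables standing in for the pretrained USD-Net and SE-Net described below, so any Gaussian denoiser and stripe estimator with this interface could be plugged in.

```python
def hqs_step(y, s, z1, z2, mu1, mu2, denoiser, striper, sigma1, sigma2):
    """One plug-and-play iteration of Equations (17)-(20).

    y is the observation; s, z1, z2 are the current estimates of the stripe
    and the two auxiliary variables; denoiser(img, sigma) and
    striper(img, sigma) are the two learned prior modules.
    """
    x = (y + mu1 * z1 - s) / (1.0 + mu1)   # closed-form x-update, Equation (17)
    s = (y + mu2 * z2 - x) / (1.0 + mu2)   # closed-form s-update, Equation (18)
    z1 = denoiser(x, sigma1)               # deep denoiser prior, Equation (19)
    z2 = striper(s, sigma2)                # deep striper prior, Equation (20)
    return x, s, z1, z2
```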

3.2. U-Shape Denoiser Network

U-Net, known as an effective and efficient tool for image-to-image translation, fuses multiscale features by concatenating the feature maps of the downsampling layers with those of the corresponding upsampling layers [68,76,77]. However, it has two drawbacks. One is that information may be lost by the strided convolution operations; the other is that its modeling capacity is limited. To capture as much information as possible for reconstructing a corrupted pixel, the receptive field usually needs to be successively enlarged, which can be achieved by increasing either the filter size or the depth. The currently popular way is to use 3 × 3 filters with a large depth; however, this may cause a heavy computational burden. Therefore, we replace the traditional convolution (Conv) with dilated convolution (DConv), which enlarges the receptive field while inheriting the advantages of the 3 × 3 Conv. To address the second issue, residual learning is employed, owing to the superior modeling capacity obtained by stacking multiple residual blocks [68]. By introducing DConv and integrating residual learning modules (RLM) into U-Net, the proposed denoiser prior network, named the U-shape denoiser network (USD-Net), is constructed; its flowchart is shown in Figure 3.
Note that: (1) In the RLM, batch normalization (BN) and the rectified linear unit (ReLU) are replaced by momentum batch normalization (MBN) [78] and the parametric rectified linear unit (PReLU), respectively, as MBN alleviates the underfitting caused by the small batch size of BN, while PReLU provides the nonlinearity needed to generate high-quality estimates with fewer filters. The dilation factors from the first to the last layer are set to 1, 2, 3, 4, 3, 2, and 1, respectively, which aggregates multiscale contextual information without losing resolution or increasing the depth of the network; the equivalent receptive field of each layer is 3, 5, 7, 9, 7, 5, and 3. In the RLM of each scale layer, five successive “MBN + PReLU” blocks are adopted. (2) The USD-Net has four scale layers; 2 × 2 strided dilated convolution (SDConv) is employed between downscaling layers, while 2 × 2 transposed dilated convolution (TDConv) is exploited between upscaling layers. (3) The same scale between SDConv and TDConv has an identity skip connection. (4) The channels in the scale layers gradually increase from 64 to 128 to 256 to 512.
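The following PyTorch sketch shows one residual learning module (RLM) built from dilated 3 × 3 convolutions as described above; standard `BatchNorm2d` stands in for momentum batch normalization (MBN), which is not available in stock PyTorch, so the block is only an approximation of the actual USD-Net component.

```python
import torch.nn as nn

class DilatedResBlock(nn.Module):
    """Residual learning module (RLM) sketch with dilated 3x3 convolutions."""

    def __init__(self, channels=64, dilation=2, n_blocks=5):
        super().__init__()
        layers = []
        for _ in range(n_blocks):
            layers += [
                nn.Conv2d(channels, channels, kernel_size=3,
                          padding=dilation, dilation=dilation, bias=False),
                nn.BatchNorm2d(channels),   # stand-in for MBN [78]
                nn.PReLU(channels),
            ]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        # Identity skip connection around the stacked "Conv + BN + PReLU" blocks.
        return x + self.body(x)
```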

3.3. Stripe Estimation Network

For stripe estimation, a deep residual convolutional network (SE-Net), of a type that has been proven efficient and effective in various fields [33,34,79], is employed. Similar to USD-Net, the BN and ReLU components of the traditional blocks are replaced by MBN and PReLU, respectively. The architecture of SE-Net is shown in Figure 4; its depth is 16, and such a deep stack enlarges the receptive field to obtain more contextual information for reconstructing the stripe image precisely. According to the differences in the components and channels of the blocks, the whole network is decomposed into six sublayers: (1) the first sublayer is a “Conv + PReLU” block with 64 filters of size 3 × 3 × 64, and each sublayer from the second to the sixth contains three blocks; (2) the sublayers from the second to the sixth, named CMP 1, CMP 2, CMP 3, CMP 2, and CMP 1, are symmetrical in their channels, and each block contains three components: Conv, MBN, and PReLU; (3) the channels from the first to the sixth sublayer are 64, 64, 128, 256, 128, and 64, respectively, with dilation 1.
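A rough PyTorch sketch of SE-Net following the sublayer/channel layout above is given below; `BatchNorm2d` again stands in for MBN, and the final convolution that maps back to a single-channel stripe map is an assumption, since the output layer is not detailed in the text.

```python
import torch.nn as nn

def make_se_net():
    """Sketch of the 16-layer residual stripe-estimation network (SE-Net)."""
    # Channels of sublayers 2-6 (CMP 1, CMP 2, CMP 3, CMP 2, CMP 1).
    widths = [64, 128, 256, 128, 64]
    layers = [nn.Conv2d(1, 64, kernel_size=3, padding=1), nn.PReLU(64)]  # sublayer 1
    c_in = 64
    for c_out in widths:
        for _ in range(3):  # three "Conv + MBN + PReLU" blocks per sublayer
            layers += [nn.Conv2d(c_in, c_out, kernel_size=3, padding=1),
                       nn.BatchNorm2d(c_out),  # stand-in for MBN
                       nn.PReLU(c_out)]
            c_in = c_out
    layers += [nn.Conv2d(c_in, 1, kernel_size=3, padding=1)]  # assumed output layer
    return nn.Sequential(*layers)
```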

3.4. Loss Function

As the two networks serve different tasks, they are pretrained with loss functions that guarantee stable convergence toward favorable results. For accurately estimating the clean image and the stripe image, both global and local information are important. To this end, the most widely used loss function (mean squared error, MSE)
$$ \mathrm{LF}_{MSE} = \frac{1}{N}\sum_{i=1}^{N}\left\| \Phi(x_i) - g_i \right\|_2^2 \tag{21} $$
is exploited as a global loss to impose global constraints, where $\Phi(\cdot)$ denotes USD-Net, N represents the total number of training image pairs $\{x_i, g_i\}_{i=1}^{N}$, and $x_i$ and $g_i$ respectively denote the ith input image in the training database and its corresponding ground-truth image. Meanwhile, a local loss $\mathrm{LF}_{G}$ is formed by the $\ell_1$-norm of the gradient information to prevent artifacts and preserve image structures in the estimated result; it is formulated as
$$ \mathrm{LF}_{G} = \frac{1}{N}\sum_{i=1}^{N}\left\| \nabla\Phi(x_i) - \nabla g_i \right\|_1, \tag{22} $$
where ∇ denotes the difference in the horizontal and vertical directions. Finally, the whole loss function of USD-Net is defined as
$$ \mathrm{LF}_{USD\text{-}Net} = \mathrm{LF}_{MSE} + \beta\, \mathrm{LF}_{G} \tag{23} $$
to be minimized for training the USD-Net $\Phi(\cdot)$, where β is a weighting parameter that balances the two losses. The loss function (23) is also used to train the SE-Net $\Psi(\cdot)$, in which the USD-Net mapping $\Phi$, the input image x, and the ground-truth image g in the global and local losses are replaced by the SE-Net mapping $\Psi$, the contaminated stripe s, and the ground-truth stripe T, respectively.
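The combined loss of Equations (21)–(23) can be sketched in PyTorch as follows; note that `F.mse_loss` averages over all pixels rather than summing over image pairs, so this is a scaled but functionally equivalent approximation of the global term.

```python
import torch.nn.functional as F

def usd_net_loss(pred, target, beta=0.5):
    """Global MSE term plus a gradient-domain l1 term, cf. Equations (21)-(23).

    pred = Phi(x_i) is the network output and target = g_i the ground truth,
    both (N, C, H, W) tensors; beta weights the local gradient loss.
    """
    mse = F.mse_loss(pred, target)  # global constraint, Equation (21)

    # First-order differences in the horizontal and vertical directions.
    def gradients(t):
        return t[..., :, 1:] - t[..., :, :-1], t[..., 1:, :] - t[..., :-1, :]

    pdx, pdy = gradients(pred)
    tdx, tdy = gradients(target)
    grad_l1 = (pdx - tdx).abs().mean() + (pdy - tdy).abs().mean()  # Equation (22)

    return mse + beta * grad_l1  # Equation (23)
```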
Taking all of the above procedures into account, the optimization of the proposed dual denoiser driven convolutional neural networks for remotely sensed image restoration is summarized in Algorithm 1.
Algorithm 1 The Optimization of Dual Denoiser Driven Convolutional Neural Networks for Remotely Sensed Image Restoration
Initial setting: observed degraded image y; parameters $\lambda_1$ and $\lambda_2$; iteration number K; initial noise levels $\sigma_1^0$ and $\sigma_2^0$; $z_1^0 = z_2^0 = 0$; and the two pretrained networks (denoiser in Equation (19) and striper in Equation (20)).
while the convergence criteria of Equations (24) and (25) are not satisfied and k ≤ K do
  1: Compute $x^k$ using Equation (17);
  2: Compute $s^k$ using Equation (18);
  3: Compute $z_1^k$ using Equation (19);
  4: Compute $z_2^k$ using Equation (20);
  5: Update k: k = k + 1.
end while
Output: latent clean image $x^k$ and stripe $s^k$.
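A NumPy driver for Algorithm 1 might look as follows; it reuses the `hqs_step` sketch from Section 3.1, sets the penalty parameters from $\mu = \lambda/\sigma^2$, and uses the parameter values reported later in Section 4.1.2. The final noise levels `sigma1_K` and `sigma2_K` are placeholders, since in practice they depend on the image noise level and the stripe intensity.

```python
import numpy as np

def d3cnns_restore(y, denoiser, striper, lam1=0.25, lam2=0.21,
                   sigma1_0=29.0, sigma1_K=5.0, sigma2_0=49.0, sigma2_K=10.0,
                   K=29, eta1=3e-4, eta2=2e-4):
    """Sketch of Algorithm 1: alternate Equations (17)-(20) until convergence."""
    # Noise levels decay from the initial to the final value in log space,
    # which makes mu1 = lam1 / sigma1**2 and mu2 = lam2 / sigma2**2 non-descending.
    sigmas1 = np.logspace(np.log10(sigma1_0), np.log10(sigma1_K), K)
    sigmas2 = np.logspace(np.log10(sigma2_0), np.log10(sigma2_K), K)

    x, s = y.copy(), np.zeros_like(y)
    z1, z2 = np.zeros_like(y), np.zeros_like(y)

    for k in range(K):
        mu1 = lam1 / sigmas1[k] ** 2
        mu2 = lam2 / sigmas2[k] ** 2
        x_new, s_new, z1, z2 = hqs_step(y, s, z1, z2, mu1, mu2,
                                        denoiser, striper,
                                        sigmas1[k], sigmas2[k])

        # Relative-change stopping rules, Equations (24) and (25).
        dx = np.sum((x_new - x) ** 2) / max(np.sum(x ** 2), 1e-12)
        ds = np.sum((s_new - s) ** 2) / max(np.sum(s ** 2), 1e-12)
        x, s = x_new, s_new
        if dx < eta1 and ds < eta2:
            break

    return x, s
```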

4. Experimental Results and Discussion

4.1. Experimental Preparation

4.1.1. Experimental Environment and Data

The denoiser and stripe estimation models are trained in the MATLAB (R2015b) environment with the MatConvNet package [80] on a PC with an Intel Core i7-5960X CPU at 3.0 GHz, 16.0 GB of memory, and an Nvidia GTX 1080Ti GPU. The test data are downloaded from [81,82,83]. The MODIS level 1B data from Terra and Aqua in [81] contain 36 spectral bands, in which the 36th band is degraded by both stripe and AGWN noise. Hyperspectral data from [82] and multispectral data from [83] are selected to test the structure preservation ability. The selected images are cropped into 162,752 patches of size 35 × 35 and randomly separated into training and test sets with a ratio of 8:2.

4.1.2. Experimental Parameters Setting

The denoiser network is trained using stochastic gradient descent (SGD) [84] with a learning rate of $10^{-4}$ for 60 epochs, while the stripe estimation network is trained with the ADAM solver [85] with a learning rate of $10^{-5}$ for 300 epochs. The parameter β in Equation (23) is empirically set to 0.5. From Equations (13)–(16), five parameters are involved: the two regularization parameters $\lambda_1$ and $\lambda_2$, the two penalty parameters $\mu_1$ and $\mu_2$, and the iteration number K. Generally, the regularization parameters $\lambda_1$ and $\lambda_2$ come from the prior terms and are kept fixed during the iterations; they are usually chosen from an empirical range for favorable performance, such as $\lambda_1 \in [0.21, 0.53]$ and $\lambda_2 \in [0.19, 0.57]$. In our experiments, we fix them to 0.25 and 0.21, respectively. Theoretically, the noise level ($\sigma_1 = \sqrt{\lambda_1/\mu_1}$ or $\sigma_2 = \sqrt{\lambda_2/\mu_2}$) gradually decreases during the iterations, so the penalty parameters $\mu_1$ and $\mu_2$ continuously increase. In this paper, the initial $\sigma_1^0$ is fixed to 29 and the final $\sigma_1^K$ is determined by the image noise level (which is usually less than 29), while the initial $\sigma_2^0$ is set to 49 and the final $\sigma_2^K$ is determined by the stripe's intensity. Both $\sigma_1$ and $\sigma_2$ are uniformly sampled from the initial to the final noise level in log space. The convergence criteria are
$$ \frac{\left\| x^{k+1} - x^{k} \right\|_2^2}{\left\| x^{k} \right\|_2^2} < \eta_1 \tag{24} $$
and
$$ \frac{\left\| s^{k+1} - s^{k} \right\|_2^2}{\left\| s^{k} \right\|_2^2} < \eta_2, \tag{25} $$
where $\eta_1 = 3\times10^{-4}$ and $\eta_2 = 2\times10^{-4}$ in the following experiments. When the convergence criteria are not satisfied, the total iteration number K is set to 29, which is large enough to obtain satisfactory performance.

4.1.3. Compared Methods and Evaluation Indexes Selection

To verify the efficiency of the proposed D3CNNs, several state-of-the-art methods from different categories are selected for comparison, including a two-stage mixed noise removal method (first denoising with Weighted Nuclear Norm Minimization, WNNM [26], then destriping with Weighted Double-Sparsity Unidirectional Variation, WDSUV [49]) (WNNM-WDSUV), model-based methods (UTVSR [57] and JAWS [56]), a semi-discriminative learning method (UV-DCNN [69]), and a fully discriminative learning method (HSI-DeNet [72]). In the synthetic experiments, the RSIs are corrupted by AGWN with noise levels ranging from 15 to 30 in steps of 5 and by stripes with three intensities (10, 30, and 50) and three proportions (0.1, 0.4, and 0.6), similar to those in Ref. [43]. In addition to the visual comparisons, objective indices are selected to quantitatively assess the ability to remove AGWN and stripe noise. Two full-reference indices, the peak signal-to-noise ratio (PSNR) and the structural similarity index (SSIM) [34], are employed to assess noise reduction and structure preservation in the synthetic experiments. Four reference-free metrics are employed for quantitative evaluation in the real-world experiments: the mean of the mean relative deviation (MMRD), which evaluates the performance in retaining fine details of noise-free sharp regions [49]; the Q-Metric (QM), which evaluates the denoising performance [59]; the mean of the inverse coefficient of variation (MICV), which reflects the level of the remaining stripe noise in homogeneous regions [86]; and the natural image quality evaluator (NIQE), which evaluates the quality of the improved results [87]. Among these indices, smaller MMRD and NIQE values indicate better estimated image quality, while larger values of the other metrics indicate better results.
In the following experiments, eight synthetic RSIs (shown in Figure 5) and seven real-life degraded RSIs (shown in Figure 6) are used to verify the efficiency of the proposed D3CNNs strategy.

4.2. Discussion of Intermediate Results

Figure 7c–e and Figure 8h–j respectively provide the visual results of $x^k$ and $s^k$ at different iterations on the testing images from Figure 5, while Figure 7f and Figure 8f show the PSNR convergence curves of $x^k$ and $s^k$. From the figures, several observations can be made. First, the deep denoiser and deep striper priors play important roles in noise removal and stripe estimation, leading to a noise-free image x and a clean stripe s. Second, compared with the intermediate results, the final x and s contain more fine details and are visually closer to the ground truths, meaning that Equations (13) and (14) can iteratively recover the details with the help of the two deep priors. Third, according to Figure 7f and Figure 8f, $x^k$ and $s^k$ converge quickly to a fixed point.

4.3. Experiments on Synthetic RSIs

4.3.1. Qualitative Evaluation

To subjectively assess the efficiency of the D3CNNs approach, we present visual comparisons on five synthetic RSIs degraded by AGWN with different noise levels and by stripes of different types (including period, proportion, and intensity), as shown in Figure 9, Figure 10, Figure 11 and Figure 12. From the figures, we can see that all of the state-of-the-art methods have a strong ability to denoise and remove stripes. However, their performance can be discriminated in the enlarged regions, from which the following observations can be made: First, many fine details, especially the structures along the stripes, in the results yielded by the UTVSR (Figure 9a, Figure 10a and Figure 11) and WNNM-WDSUV (Figure 9b, Figure 10b and Figure 11) methods are over-smoothed. Second, the HSI-DeNet method estimates the latent clean images from the view of image decomposition and preserves more details than the two previous approaches, but image details still leak into the stripe maps, resulting in the loss of many rich details, as shown in Figure 9c, Figure 10c, Figure 11 and Figure 12. Third, the UV-DCNN method plugs the DCNN denoiser prior into the UV model, which greatly reduces the interference of AGWN on the recovery of the stripe. As shown in Figure 9e, Figure 10e, Figure 11 and Figure 12, fewer image details appear in the stripe maps while the details in the estimated images are richer. Fourth, with a deep feature analysis of image and stripe, the JAWS method uses the characteristics of both as constrained priors to construct a unified model that restores the image and the stripe. Compared with the previous four methods, JAWS produces pleasing visual results with more abundant details and more regular stripes, as shown in Figure 9f, Figure 10f, Figure 11 and Figure 12. Finally, the proposed D3CNNs method yields the most promising image and stripe results (as shown in Figure 9g, Figure 10g, Figure 11 and Figure 12) in terms of detail preservation in the image and stripe regularity, illustrating that our method outperforms the others in image restoration.

4.3.2. Quantitative Assessment

Since the noise levels and especially the stripe types differ, we employ the mean PSNR (MPSNR) and mean SSIM (MSSIM) to objectively evaluate each method. These indices are defined as follows:
$$ \mathrm{MPSNR} = \frac{1}{N}\sum_{i_1=1}^{2}\sum_{i_2=1}^{3}\sum_{i_3=1}^{3} \mathrm{PSNR}(i_1, i_2, i_3), \tag{26} $$
and
$$ \mathrm{MSSIM} = \frac{1}{N}\sum_{i_1=1}^{2}\sum_{i_2=1}^{3}\sum_{i_3=1}^{3} \mathrm{SSIM}(i_1, i_2, i_3), \tag{27} $$
for each image at one noise level, where N = 2 × 3 × 3 = 18; $i_1 = 1$ represents the periodic stripe while $i_1 = 2$ represents the non-periodic stripe; $i_2 = 1$, 2, and 3 denote stripe intensities of 10, 30, and 50, respectively; and $i_3 = 1$, 2, and 3 represent stripe proportions of 0.1, 0.4, and 0.6, respectively. Quantitative MPSNR and MSSIM results are compared in Table 1 and Table 2, respectively, from which two conclusions can be drawn: (1) for each image, both MPSNR and MSSIM decrease as the noise level increases, since a larger noise level contaminates the image more seriously and makes it more difficult for the estimated result to stay close to the ground truth; (2) the proposed D3CNNs approach generates the highest MPSNR and MSSIM values for each image at the same noise level, illustrating that it is effective and even better than the state-of-the-art, which is consistent with the visual comparisons.
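For reference, the averaging in Equations (26) and (27) can be written as the following Python sketch; it assumes a user-supplied `restored_fn` that returns the restored image for one stripe configuration, and it uses scikit-image for PSNR/SSIM in place of the original MATLAB implementation.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def mean_psnr_ssim(gt, restored_fn, intensities=(10, 30, 50),
                   proportions=(0.1, 0.4, 0.6)):
    """MPSNR/MSSIM over the 2 x 3 x 3 = 18 stripe configurations, Eqs. (26)-(27).

    restored_fn(periodic, intensity, proportion) is assumed to return the
    restored image for one degraded realization of the ground truth gt.
    """
    psnrs, ssims = [], []
    for periodic in (True, False):          # i1: periodic / non-periodic stripe
        for intensity in intensities:       # i2: stripe intensity
            for proportion in proportions:  # i3: stripe proportion
                est = restored_fn(periodic, intensity, proportion)
                psnrs.append(peak_signal_noise_ratio(gt, est, data_range=255))
                ssims.append(structural_similarity(gt, est, data_range=255))
    return float(np.mean(psnrs)), float(np.mean(ssims))
```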
Run time, an important index for assessing the efficiency of algorithms, is tested on images of different sizes, and the results are shown in Table 3. From the table, several observations can be made: First, the run time of the discriminative learning methods (marked with colored font), even in the CPU version, is shorter than that of the model-based optimization approaches. Second, the pure discriminative learning method (HSI-DeNet, marked with red font) is the fastest on either the CPU or the GPU version for the same image size, which is reasonable; such a network can be used to learn a specialized prior (such as the denoiser prior in UV-DCNN, marked with green font) to be plugged into model-based optimization methods to reduce the computation time and boost the modeling ability. Third, with the help of the two deep priors, the proposed D3CNNs achieves the second fastest run time, slightly slower than HSI-DeNet, but with much better MPSNR (Table 1) and MSSIM (Table 2).
Taking these comparative results together, we conclude that D3CNNs is a flexible and faithful method for image restoration.

4.4. Applications to the Real-World Degraded RSIs

To further verify the efficiency of the proposed D3CNNs scheme, we apply it to the real-world degraded RSIs shown in Figure 6 and compare it with the state-of-the-art methods both qualitatively and quantitatively. The visual comparisons of the estimated images and the calculated stripe maps are shown in Figure 13 and Figure 14, respectively, from which we can find that the details of the images produced by our method are richer while the stripes are cleaner and more regular than those produced by the others. These observations indicate that the components decomposed by our method are more faithful and closer to the original clean image and stripe maps.
Table 4 shows the comparisons of the quantitative results generated by the state-of-the-art methods, from which several observations can be made: First, all of them yield considerable QM values, illustrating that they perform well in denoising; our method produces the highest QM value for each RSI, showing the strongest ability to preserve rich details. Second, the small differences in the MICV values on the same RSI indicate that all methods remove stripes well in homogeneous regions. Meanwhile, for the same RSI, our method produces the smallest MMRD value, reflecting that it can generate pleasing results with more fine details in noise-free sharp regions. Finally, the smallest NIQE value, obtained by our method on each RSI, demonstrates that it improves image quality better than the others.
In sum, the quantitative comparisons together with the qualitative results validate the effectiveness of the proposed D3CNNs method.

5. Conclusions

Random noise (additive Gaussian white noise, AGWN) and stripe noise always coexist in remotely sensed images, increasing the difficulty of constructing the inverse problem. To cope with this problem and preserve as many details as possible in the estimated remotely sensed image, this paper has proposed novel dual denoiser driven convolutional neural networks (D3CNNs) with the following key points: (1) two deep networks are trained for different specialized tasks, namely denoising and stripe estimation; (2) the prelearned modules are employed as denoiser and striper priors, respectively, plugged into the model-based HQS optimization to solve the image restoration problem. Experimental results have validated that the two powerful deep priors improve the effectiveness of model-based methods, with which the proposed D3CNNs strategy yields quite competitive visual and quantitative results compared with the state-of-the-art. The satisfactory run time makes it suitable for further applications. Although the proposed method performs well on mixed noise removal (mainly for the two additive noise types), it remains to be extended to other noise, such as impulse noise and Poisson noise.

Author Contributions

Z.H.: Investigation, Writing—original draft. Z.Z.: Software. Z.W. and X.L.: Visualization, Investigation. B.X. and Y.Z.: Writing—review and editing. Z.Z.: formal analysis; H.F.: Conceptualization, Methodology. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the National Natural Science Foundation of China under Grant No. 61901309.

Data Availability Statement

Not applicable.

Acknowledgments

We appreciate the critical and constructive comments and suggestions from the reviewers that helped improve the quality of this manuscript.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Li, Y.; Chen, W.; Zhang, Y.; Tao, C.; Xiao, R.; Tan, Y. Accurate cloud detection in high-resolution remote sensing imagery by weakly supervised deep learning. Remote Sens. Environ. 2020, 250, 112045. [Google Scholar] [CrossRef]
  2. Li, Y.; Zhang, Y.; Huang, X.; Zhu, H.; Ma, J. Large-scale remote sensing image retrieval by deep hashing neural networks. IEEE Trans. Geosci. Remote Sens. 2018, 56, 950–965. [Google Scholar] [CrossRef]
  3. Li, Y.; Kong, D.; Zhang, Y.; Tan, Y.; Chen, L. Robust deep alignment network with remote sensing knowledge graph for zero-shot and generalized zero-shot remote sensing image scene classification. ISPRS JPRS 2021, 179, 145–158. [Google Scholar] [CrossRef]
  4. Li, Y.; Zhou, Y.; Zhang, Y.; Zhong, L.; Wang, J.; Chen, J. DKDFN: Domain knowledge-guided deep collaborative fusion network for multimodal unitemporal remote sensing land cover classification. ISPRS JPRS 2022, 186, 170–189. [Google Scholar] [CrossRef]
  5. Li, Y.; Chen, W.; Huang, X.; Gao, Z.; Li, S.; He, T.; Zhang, Y. Mfvnet: Deep adaptive fusion network with multiple field-of-views for remote sensing image semantic segmentation. Sci. China Inf. Sci. 2022. [Google Scholar] [CrossRef]
  6. Huang, Z.; Zhang, Y.; Li, Q.; Zhang, T.; Sang, N.; Hong, H. Progressive dual-domain filter for enhancing and denoising optical remote sensing images. IEEE Geosci. Remote Sens. Lett. 2018, 15, 759–763. [Google Scholar] [CrossRef]
  7. Chang, Y.; Chen, M.; Yan, L.; Zhao, X.; Li, Y.; Zhong, S. Toward universal stripe removal via wavelet-based deep convolutional neural network. IEEE Trans. Geosci. Remote Sens. 2020, 58, 2880–2897. [Google Scholar] [CrossRef]
  8. Rudin, L.I.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Physica D 1992, 60, 259–268. [Google Scholar] [CrossRef]
  9. Han, L.; Zhao, Y.; Lv, H.; Zhang, Y.; Liu, H.; Bi, G. Remote Sensing Image Denoising Based on Deep and Shallow Feature Fusion and Attention Mechanism. Remote Sens. 2022, 14, 1243. [Google Scholar] [CrossRef]
  10. Zhang, B.; Aziz, Y.; Wang, Z.; Zhuang, L.; Michael, K.N.; Gao, L. Hyperspectral Image Stripe Detection and Correction Using Gabor Filters and Subspace Representation. IEEE Geosci. Remote Sens. Lett. 2022, 19, 1–5. [Google Scholar] [CrossRef]
  11. Sun, L.; He, C.; Zheng, Y.; Wu, Z.; Jeon, B. Tensor Cascaded-Rank Minimization in Subspace: A Unified Regime for Hyperspectral Image Low-Level Vision. IEEE Trans. Image Process. 2023, 32, 100–115. [Google Scholar] [CrossRef]
  12. Yu, Y.; Samaki, B.; Rashidi, M.; Mohammadi, M.; Nguyen, T.; Zhang, G. Vision-based concrete crack detection using a hybrid framework considering noise effect. J. Build. Eng. 2022, 61, 105246. [Google Scholar] [CrossRef]
  13. Syam, T.; Muthalif, A. Magnetorheological Elastomer based torsional vibration isolator for application in a prototype drilling shaft. J. Low Freq. Noise Vib. Act. Control 2022, 41, 676–700. [Google Scholar] [CrossRef]
  14. Chambolle, A. An algorithm for total variation minimization and applications. J. Math. Imag. Vis. 2004, 20, 89–97. [Google Scholar]
  15. Chan, T.F.; Esedoglu, S. Aspects of total variation regularized L1 function approximation. SIAM J. Appl. Math. 2005, 65, 1817–1837. [Google Scholar] [CrossRef] [Green Version]
  16. Osher, S.; Burger, M.; Goldfarb, D.; Xu, J.; Yin, W. An iterative regularization method for total variation-based image restoration. Multiscale Model. Simul. 2005, 4, 460–489. [Google Scholar] [CrossRef]
  17. Peyre, G.; Bougleux, S.; Cohen, L.D. Non-local regularization of inverse problems. Inverse Probl. Imaging 2011, 5, 511–530. [Google Scholar] [CrossRef]
  18. Condat, L. Semi-local total variation for regularization of inverse problems. In Proceedings of the 2014 22nd European Signal Processing Conference (EUSIPCO), Lisbon, Portugal, 1–5 September 2014; pp. 1806–1810. [Google Scholar]
  19. Jidesh, P.; Shivarama, H.K. Non-local total variation regularization models for image restoration. Comput. Electr. Eng. 2018, 67, 114–133. [Google Scholar] [CrossRef]
  20. Elad, M.; Aharon, M. Image denoising via sparse and redundant representations over learned dictionaries. IEEE Trans. Image Process. 2006, 15, 3736–3745. [Google Scholar] [CrossRef]
  21. Mairal, J.; Elad, M.; Sapiro, G. Sparse representation for color image restoration. IEEE Trans. Image Process. 2008, 17, 53–69. [Google Scholar] [CrossRef] [Green Version]
  22. Mairal, J.; Bach, F.; Ponce, J.; Sapiro, G.; Zisserman, A. Non-local sparse models for image restoration. In Proceedings of the IEEE 12th International Conference on Computer Vision (ICCV), Kyoto, Japan, 29 September–2 October 2009; pp. 2272–2279. [Google Scholar]
  23. Dong, W.; Zhang, L.; Shi, G.; Li, X. Nonlocally centralized sparse representation for image restoration. IEEE Trans. Image Process. 2013, 22, 1620–1630. [Google Scholar] [CrossRef] [PubMed]
  24. Dong, W.; Shi, G.; Ma, Y.; Li, X. Image restoration via simultaneous sparse coding: Where structured sparsity meets Gaussian scale mixture. Int. J. Comput. Vis. (IJCV) 2015, 114, 217–232. [Google Scholar] [CrossRef]
  25. Huang, Z.; Li, Q.; Zhang, T.; Sang, N.; Hong, H. Iterative weighted sparse representation for X-ray cardiovascular angiogram image denoising over learned dictionary. IET Image Process. 2018, 12, 254–261. [Google Scholar] [CrossRef]
  26. Gu, S.; Zhang, L.; Zuo, W.; Feng, X. Weighted nuclear norm minimization with application to image denoising. In Proceedings of the 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 2862–2869. [Google Scholar]
  27. Xie, Y.; Gu, S.; Liu, Y.; Zuo, W.; Zhang, W.; Zhang, L. Weighted schatten p-norm minimization for image denoising and background subtraction. IEEE Trans. Image Process. 2016, 25, 4842–4857. [Google Scholar] [CrossRef] [Green Version]
  28. Wu, Z.; Wang, Q.; Jin, J.; Shen, Y. Structure tensor total variation-regularized weighted nuclear norm minimization for hyperspectral image mixed denoising. Signal Process. 2017, 131, 202–219. [Google Scholar] [CrossRef]
  29. Huang, Z.; Li, Q.; Fang, H.; Zhang, T.; Sang, N. Iterative weighted nuclear norm for X-ray cardiovascular angiogram image denoising. Signal Image Video Process. 2017, 11, 1445–1452. [Google Scholar] [CrossRef]
  30. Huang, T.; Dong, W.; Xie, X.; Shi, G.; Bai, X. Mixed Noise Removal via Laplacian Scale Mixture Modeling and Nonlocal Low-Rank Approximation. IEEE Trans. Image Process. 2017, 26, 3171–3186. [Google Scholar] [CrossRef] [PubMed]
  31. Yao, S.; Chang, Y.; Qin, X.; Zhang, Y.; Zhang, T. Principal component dictionary-based patch grouping for image denoising. J. Vis. Commun. Image Represent. 2018, 50, 111–122. [Google Scholar] [CrossRef]
  32. Chen, Y.; Pock, T. Trainable nonlinear reaction diffusion: A flexible framework for fast and effective image restoration. IEEE Trans. Pattern Anal. Mach. Intell. 2017, 39, 1256–1272. [Google Scholar] [CrossRef] [Green Version]
  33. Zhang, K.; Zuo, W.; Chen, Y.; Meng, D.; Zhang, L. Beyond a Gaussian denoiser: Residual learning of deep CNN for image denoising. IEEE Trans. Image Process. 2017, 26, 3142–3155. [Google Scholar] [CrossRef] [Green Version]
  34. Zhang, K.; Zuo, W.; Gu, S.; Zhang, L. Learning deep CNN denoiser prior for image restoration. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 2808–2817. [Google Scholar]
  35. Zhang, K.; Zuo, W.; Zhang, L. FFDNet: Toward a fast and flexible solution for CNN-based image denoising. IEEE Trans. Image Process. 2018, 27, 4608–4622. [Google Scholar] [CrossRef] [PubMed]
  36. Tai, Y.; Yang, J.; Liu, X.; Xu, C. MemNet: A persistent memory network for image restoration. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4549–4557. [Google Scholar]
  37. Scetbon, M.; Elad, M.; Milanfar, P. Deep K-SVD Denoising. IEEE Trans. Image Process. 2021, 30, 5944–5955. [Google Scholar] [CrossRef] [PubMed]
  38. Zhang, H.; Yong, H.; Zhang, L. Deep convolutional dictionary learning for image denoising. In Proceedings of the Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 630–641. [Google Scholar]
  39. Carfantan, H.; Idier, J. Statistical linear destriping of satellite-based pushbroom-type images. IEEE Trans. Geosci. Remote Sens. 2010, 48, 1860–1872. [Google Scholar] [CrossRef]
  40. Fehrenbach, J.; Weiss, P.; Lorenzo, C. Variational algorithms to remove stationary noise: Applications to microscopy imaging. IEEE Trans. Image Process. 2012, 21, 860–1871. [Google Scholar] [CrossRef] [Green Version]
  41. Liu, X.; Lu, X.; Shen, H.; Yuan, Q.; Jiao, Y.; Zhang, L. Stripe noise separation and removal in remote sensing images by consideration of the global sparsity and local variational properties. IEEE Trans. Image Process. 2016, 54, 3049–3060. [Google Scholar] [CrossRef]
  42. Liu, X.; Lu, X.; Shen, H.; Yuan, Q.; Zhang, L. Oblique stripe removal in remote sensing images via oriented variation. arXiv 2018, arXiv:1809.02043. [Google Scholar]
  43. Chang, Y.; Yan, L.; Wu, T.; Zhong, S. Remote sensing image stripe noise removal: From image decomposition perspective. IEEE Trans. Geosci. Remote Sens. 2016, 54, 7018–7031. [Google Scholar] [CrossRef]
  44. Chang, Y.; Yan, L.; Zhong, S. Transformed low-rank model for line pattern noise removal. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 1726–1734. [Google Scholar]
  45. Chang, Y.; Yan, L.; Chen, B.; Zhong, S.; Tian, Y. Hyperspectral image restoration: Where does the low-rank property exist. IEEE Trans. Geosci. Remote Sens. 2021, 59, 6869–6884. [Google Scholar] [CrossRef]
  46. Chen, Y.; Huang, T.-Z.; Zhao, X.L.; Deng, L.J.; Huang, J. Stripe noise removal of remote sensing images by total variation regularization and group sparsity constraint. Remote Sens. 2017, 9, 559. [Google Scholar] [CrossRef] [Green Version]
  47. Chen, Y.; Huang, T.-Z.; Deng, L.J.; Zhao, X.-L.; Wang, M. Group sparsity based regularization model for remote sensing image stripe noise removal. Neurocomputing 2017, 267, 95–106. [Google Scholar] [CrossRef]
  48. Cao, W.; Chang, Y.; Han, G.; Li, J. Destriping remote sensing image via low-rank approximation and nonlocal total variation. IEEE Geosci. Remote Sens. Lett. 2018, 15, 848–852. [Google Scholar] [CrossRef]
  49. Song, Q.; Wang, Y.; Yan, X.; Gu, H. Remote sensing images stripe noise removal by double sparse regulation and region separation. Remote Sens. 2018, 10, 998. [Google Scholar] [CrossRef]
  50. Dhivya, R.; Prakash, R. Stripe noise separation and removal in remote sensing images. J. Comput. Theor. Nanosci. 2018, 15, 2724–2728. [Google Scholar] [CrossRef]
  51. Cui, H.; Jia, P.; Zhang, G.; Jiang, Y.; Li, L.; Wang, J.; Hao, X. Multiscale intensity propagation to remove multiplicative stripe noise from remote sensing images. IEEE Trans. Geosci. Remote Sens. 2020, 58, 2308–2323. [Google Scholar] [CrossRef]
  52. Kuang, X.; Sui, X.; Chen, Q.; Gu, G. Single infrared image stripe noise removal using deep convolutional networks. IEEE Photonics J. 2017, 9, 1–13. [Google Scholar] [CrossRef]
  53. Xiao, P.; Guo, Y.; Zhuang, P. Removing stripe noise from infrared cloud images via deep convolutional networks. IEEE Photonics J. 2018, 10, 1–14. [Google Scholar] [CrossRef]
  54. Zhong, Y.; Li, W.; Wang, X.; Jin, S.; Zhang, L. Satellite-ground intergraded destriping network: A new perspective for eo-1 hyperion and chinese hyperspectral satellite datasets. Remote Sens. Environ. 2020, 237, 111416. [Google Scholar] [CrossRef]
  55. Song, J.; Jeong, H.-H.; Park, D.-S.; Kim, H.-H.; Seo, D.-C.; Ye, J.C. Unsupervised denoising for satellite imagery using wavelet directional cyclegan. IEEE Trans. Geosci. Remote Sens. 2020, 59, 6823–6839. [Google Scholar] [CrossRef]
  56. Huang, Z.; Zhang, Y.; Li, Q.; Li, X.; Zhang, T.; Sang, N.; Hong, H. Joint analysis and weighted synthesis sparsity priors for simultaneous denoising and destriping optical remote sensing images. IEEE Trans. Geosci. Remote Sens. 2020, 58, 6958–6982. [Google Scholar] [CrossRef]
  57. Chang, Y.; Yan, L.; Fang, H.; Liu, H. Simultaneous destriping and denoising for remote sensing images with unidirectional total variation and sparse representation. IEEE Geosci. Remote Sens. Lett. 2014, 11, 1051–1055. [Google Scholar] [CrossRef]
  58. Xiong, F.; Zhou, J.; Qian, Y. Hyperspectral restoration via L0 gradient regularized low-rank tensor factorization. IEEE Geosci. Remote Sens. Lett. 2019, 57, 10410–10425. [Google Scholar] [CrossRef]
  59. Zheng, Y.; Huang, T.; Zhao, X.; Jiang, T.; Ma, T.; Ji, T. Mixed noise removal in hyperspectral image via low-fibered-rank regularization. IEEE Trans. Geosci. Remote Sens. 2020, 58, 734–749. [Google Scholar] [CrossRef]
  60. Liu, L.; Xu, L.; Fang, H. Simultaneous intensity bias estimation and stripe noise removal in infrared images using the global and local sparsity constraints. IEEE Trans. Geosci. Remote Sens. 2020, 58, 1777–1789. [Google Scholar] [CrossRef]
  61. Zeng, H.; Xie, X.; Cui, H.; Yin, H.; Ning, J. Hyperspectral image restoration via global L1−2 spatial-spectral total variation regularized local low-rank tensor recovery. IEEE Trans. Geosci. Remote Sens. 2020, 59, 3309–3325. [Google Scholar] [CrossRef]
  62. Xie, T.; Li, S.; Sun, B. Hyperspectral images denoising via nonconvex regularized low-rank and sparse matrix decomposition. IEEE Trans. Image Process. 2020, 29, 44–56. [Google Scholar] [CrossRef]
  63. Hu, T.; Li, W.; Liu, N.; Tao, R.; Zhang, F.; Scheunders, P. Hyperspectral image restoration using adaptive anisotropy total variation and nuclear norms. IEEE Trans. Geosci. Remote Sens. 2021, 59, 1516–1533. [Google Scholar] [CrossRef]
  64. Wang, M.; Wang, Q.; Chanussot, J.; Hong, D. l0l1 Hybrid total variation regularization and its applications on hyperspectral image mixed noise removal and compressed sensing. IEEE Trans. Geosci. Remote Sens. 2021, 59, 7695–7710. [Google Scholar] [CrossRef]
  65. Wang, S.; Zhu, Z.; Zhao, R.; Zhang, B. Hyperspectral image denoising via nonconvex logarithmic penalty. Math. Probl. Eng. 2021, 2021, 5535169. [Google Scholar] [CrossRef]
  66. Dong, W.; Zuo, W.; Zhang, D.; Zhang, L.; Yang, M.H. Simultaneous fidelity and regularization learning for image restoration. arXiv 2019, arXiv:1804.04522v4. [Google Scholar]
  67. Dong, W.; Wang, P.; Yin, W.; Shi, G.; Wu, F.; Lu, X. Denoising prior driven deep neural network for image restoration. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 2019, 41, 2305–2318. [Google Scholar] [CrossRef] [Green Version]
  68. Zhang, K.; Li, Y.; Zuo, W.; Zhang, L.; Timofte, R. Plug-and-play image restoration with deep denoiser prior. IEEE Trans. Pattern Anal. Mach. Intell. (TPAMI) 2021, 44, 6360–6376. [Google Scholar] [CrossRef] [PubMed]
  69. Huang, Z.; Zhang, Y.; Li, Q.; Li, Z.; Zhang, T.; Sang, N.; Xiong, S. Unidirectional variation and deep CNN denoiser priors for simultaneously destriping and denoising optical remote sensing images. Int. J. Remote Sens. 2019, 40, 5737–5748. [Google Scholar] [CrossRef]
  70. Zeng, H.; Xie, X.; Cui, H.; Yin, H.; Zhao, Y.; Ning, J. Hyperspectral image restoration via CNN denoiser prior regularized low-rank tensor recovery. Comput. Vis. Image Unders. 2020, 197–198, 103004. [Google Scholar] [CrossRef]
  71. He, Z.; Cao, Y.; Dong, Y.; Yang, J.; Cao, Y.; Tisse, C.-L. Single-image-based non-uniformity correction of uncooled long-wave infrared detectors: A deep-learning approach. Appl. Opt. 2018, 57, D155–D164. [Google Scholar] [CrossRef] [PubMed]
  72. Chang, Y.; Yan, L.; Fang, H.; Zhong, S.; Liao, W. HSI-DeNet: Hyperspectral image restoration via convolutional neural network. IEEE Trans. Geosci. Remote Sens. 2018, 57, 667–682. [Google Scholar] [CrossRef]
  73. Zhang, Q.; Yuan, Q.; Li, J.; Liu, X.; Shen, H.; Zhang, L. Hybrid noise removal in hyperspectral imagery with a spatial-spectral gradient network. IEEE Trans. Geosci. Remote Sens. 2019, 57, 7317–7329. [Google Scholar] [CrossRef]
  74. Luo, Y.; Zhao, X.; Jiang, T.; Zheng, Y.; Chang, Y. Unsupervised hyperspectral mixed noise removal via spatial-spectral constrained deep image prior. arXiv 2021, arXiv:2008.09753. [Google Scholar] [CrossRef]
  75. Liu, H.; Fang, S.; Zhang, Z.; Li, D.; Lin, K.; Wang, J. MFDNet: Collaborative poses perception and matrix fisher distribution for head pose estimation. IEEE Trans. Multimed. 2021, 24, 2449–2460. [Google Scholar] [CrossRef]
  76. Ronneberger, O.; Fischer, P.; Brox, T. U-net: Convolutional networks for biomedical image segmentation. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Munich, Germany, 5–9 October 2015; pp. 234–241. [Google Scholar]
  77. Zhang, Y.; Yuan, L.; Wang, Y.; Zhang, J. SAU-Net: Efficient 3D spine MRI segmentation using inter-slice attention. Proc. Mach. Learn. Res. 2020, 121, 903–913. [Google Scholar]
  78. Yong, H.; Huang, J.; Meng, D.; Hua, X.; Zhang, L. Momentum batch normalization for deep learning with small batch size. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 224–240. [Google Scholar]
  79. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  80. Vedaldi, A.; Lenc, K. MatConvNet: Convolutional neural networks for matlab. In Proceedings of the 23rd ACM Conference on Multimedia Conference, Brisbane, Australia, 26–30 October 2015; pp. 689–692. [Google Scholar]
  81. MODIS Data. Available online: https://modis.gsfc.nasa.gov/data/ (accessed on 30 January 2018).
  82. A Freeware Multispectral Image Data Analysis System. Available online: https://engineering.purdue.edu/~biehl/MultiSpec/hyperspectral.html (accessed on 30 January 2018).
  83. Global Digital Product Sample. Available online: http://www.digitalglobe.com/product-samples (accessed on 30 January 2018).
  84. Tsuruoka, Y.; Tsujii, J.; Ananiadou, S. Stochastic gradient descent training for L1-regularized log-linear models with cumulative penalty. In Proceedings of the ACL 2009 the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the AFNLP, Singapore, 2–7 August 2009; pp. 477–485. [Google Scholar]
  85. Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2015, arXiv:1412.6980. [Google Scholar]
  86. Nichol, J.E.; Vohora, V. Noise over water surfaces in Landsat TM images. Int. J. Remote Sens. 2010, 25, 2087–2093. [Google Scholar] [CrossRef]
  87. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a ‘completely blind’ image quality analyzer. IEEE Signal Process. Lett. 2013, 20, 209–212. [Google Scholar] [CrossRef]
Figure 1. A visual comparison example between a combination method and our unified scheme.
Figure 2. Flowchart of the proposed D3CNNs method.
Figure 3. Flowchart of the proposed U-shape denoiser network. “SDConv” represents strided DConv while “TDConv” is transposed DConv, “⊕” is an adding operation.
Figure 4. Flowchart of the proposed stripe estimation network.
Figure 5. Eight testing images respectively from Terra MODIS data ((ac) named as STM1, STM2, and STM3), Aqua MODIS data ((df) named as SAM1, SAM2, and SAM3), Hyperspectral data ((g) Washington DC Mall, SWDCM), and Multispectral data ((h) Washington DC, SWDC).
Figure 6. Seven real-life images respectively from Aqua MODIS data ((a–c) named as RAM1, RAM2, and RAM3), Terra MODIS data ((d–f) named as RTM1, RTM2, and RTM3), and Hyperspectral ((g) urban) data.
Figure 7. The intermediate results of $x^k$ and $s^k$ on the SWDCM image in Figure 5g at different iterations, where the noise level is σ = 25, the proportion P and the intensity I of the periodic stripe are 0.1 and 50, respectively, and the width of each stripe is 20 pixels. (a) Ground-truth. (b) Degraded SWDCM image. (c–e) Visual images at the 5th, 13th, and 19th iterations, respectively. (f) Convergence curves of x and s. (g) Ground-truth stripe. (h–j) Visual stripes at the 5th, 13th, and 19th iterations, respectively.
Figure 8. The intermediate results of x^k and s^k on the STM1 image in Figure 5a at different iterations, where the noise level is σ = 30, and the proportion P and the intensity I of the non-periodical stripe are 0.4 and 30, respectively. (a) Ground-truth. (b) Degraded STM1 image. (c–e) Visual images at the 3rd, 11th, and 21st iterations, respectively. (f) Convergence curves of x and s. (g) Ground-truth stripe. (h–j) Visual stripes at the 3rd, 11th, and 21st iterations, respectively.
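As a rough guide to how alternating intermediate estimates such as the x^k and s^k in Figures 7 and 8 can be produced under the additive degradation model y = x + s + n, the sketch below shows a generic plug-and-play ADMM splitting in which an image denoiser and a stripe estimator act as plug-in priors. The function name, penalty ρ, initialization, and closed-form sub-problem below are illustrative assumptions and are not claimed to reproduce the authors' exact update rules.

```python
import torch

def dual_denoiser_admm(y, image_denoiser, stripe_estimator, iters=20, rho=1.0):
    zx, zs = y.clone(), torch.zeros_like(y)            # auxiliary (prior) variables
    ux, us = torch.zeros_like(y), torch.zeros_like(y)  # scaled dual variables
    x, s = y.clone(), torch.zeros_like(y)
    for _ in range(iters):
        a, b = zx - ux, zs - us
        # (x, s) data-fidelity step: closed-form minimizer of
        # 0.5*||y - x - s||^2 + (rho/2)*||x - a||^2 + (rho/2)*||s - b||^2
        total = (2 * y + rho * (a + b)) / (2 + rho)    # x + s
        diff = a - b                                   # x - s
        x, s = 0.5 * (total + diff), 0.5 * (total - diff)
        # prior steps: the two networks refine the auxiliary variables
        with torch.no_grad():
            zx = image_denoiser(x + ux)                # e.g., a U-shape denoiser
            zs = stripe_estimator(s + us)              # e.g., a stripe-estimation CNN
        ux, us = ux + x - zx, us + s - zs              # dual (multiplier) updates
    return x, s
```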
Figure 9. Visual comparisons on the SWDCM image in Figure 5g, where the noise level is σ = 25, and the proportion P and the intensity I of the periodical stripe are 0.1 and 50, respectively. (a) Top: degraded SWDCM image; bottom: ground-truth stripe. (b–g) Estimated images and stripes produced by the UTVSR, WNNM-WDSUV, HSI-DeNet, UV-DCNN, JAWS, and proposed methods, respectively.
Figure 10. Visual comparisons on the STM1 image in Figure 5a, where the noise level is σ = 30, and the proportion P and the intensity I of the non-periodical stripe are 0.4 and 30, respectively. (a) Top: degraded STM1 image; bottom: ground-truth stripe. (b–g) Estimated images and stripes produced by the UTVSR, WNNM-WDSUV, HSI-DeNet, UV-DCNN, JAWS, and proposed methods, respectively.
Figure 11. Visual comparisons of the estimated results on the synthetic SAM1, STM2, and SWDC images. For SAM1: σ = 15, P = 0.4, and I = 10. For STM2: σ = 20, P = 0.4, and I = 30. For SWDC: σ = 25, P = 0.6, and I = 30. The stripes on SAM1 and SWDC are non-periodical, while the stripe on STM2 is periodical.
Figure 12. Visual comparisons of the estimated stripes corresponding to the SAM1, STM2, and SWDC images in Figure 11.
Figure 13. Visual comparisons of the estimated results on the real-world degraded RAM1, RTM1, and Hyperspectral urban images in Figure 6.
Figure 14. Visual comparisons of the estimated stripes corresponding to the RAM1, RTM1, and Hyperspectral urban images in Figure 13.
Table 1. Quantitative MPSNR comparisons of the state-of-the-art methods on the synthetic optical RSIs shown in Figure 5. Values are listed for σ = 15 / 20 / 25 / 30.

Methods | STM1 | STM2 | STM3
UTVSR | 28.14 / 26.86 / 25.90 / 25.22 | 29.96 / 28.83 / 27.58 / 27.10 | 33.12 / 32.04 / 31.05 / 30.25
WNNM-WDSUV | 28.82 / 27.52 / 26.62 / 25.94 | 30.91 / 29.63 / 28.75 / 28.09 | 34.55 / 33.42 / 32.61 / 31.98
HSI-DeNet | 29.02 / 27.71 / 26.77 / 26.04 | 31.02 / 29.72 / 28.78 / 28.10 | 34.71 / 33.56 / 32.72 / 32.08
UV-DCNN | 29.05 / 27.77 / 26.89 / 26.24 | 31.13 / 29.87 / 29.00 / 28.34 | 34.78 / 33.60 / 32.84 / 32.11
JAWS | 29.24 / 27.86 / 26.93 / 26.25 | 31.21 / 29.93 / 29.03 / 28.36 | 34.86 / 33.71 / 32.93 / 32.28
Proposed | 29.68 / 28.15 / 27.46 / 26.76 | 31.59 / 30.47 / 29.73 / 28.69 | 35.18 / 34.06 / 33.38 / 32.79

Methods | SAM1 | SAM2 | SAM3
UTVSR | 28.67 / 27.14 / 25.98 / 25.21 | 28.46 / 27.13 / 26.02 / 25.29 | 28.12 / 26.92 / 26.11 / 25.27
WNNM-WDSUV | 29.31 / 27.75 / 26.62 / 25.74 | 29.25 / 27.81 / 26.84 / 26.09 | 29.01 / 27.74 / 26.88 / 26.25
HSI-DeNet | 29.24 / 27.77 / 26.68 / 25.82 | 29.36 / 27.98 / 26.95 / 26.14 | 29.17 / 27.88 / 26.98 / 26.29
UV-DCNN | 29.47 / 28.01 / 26.89 / 26.00 | 29.45 / 28.11 / 27.14 / 26.40 | 29.31 / 27.97 / 27.13 / 26.50
JAWS | 29.52 / 27.99 / 26.82 / 25.97 | 29.58 / 28.18 / 27.20 / 26.44 | 29.32 / 28.05 / 27.17 / 26.52
Proposed | 29.75 / 28.64 / 27.33 / 26.47 | 30.02 / 28.83 / 27.91 / 27.08 | 29.84 / 28.68 / 27.79 / 27.01

Methods | SWDCM | SWDC
UTVSR | 28.53 / 27.00 / 25.68 / 24.80 | 31.03 / 29.71 / 28.85 / 28.33
WNNM-WDSUV | 29.44 / 27.76 / 26.51 / 25.53 | 32.41 / 30.96 / 29.87 / 29.03
HSI-DeNet | 29.50 / 27.91 / 26.72 / 25.75 | 32.61 / 31.20 / 30.06 / 29.16
UV-DCNN | 29.52 / 27.87 / 26.57 / 25.69 | 32.69 / 31.27 / 30.26 / 29.27
JAWS | 29.62 / 28.02 / 26.77 / 25.84 | 32.83 / 31.41 / 30.33 / 29.48
Proposed | 29.84 / 28.33 / 27.06 / 26.37 | 33.17 / 31.76 / 30.80 / 29.97
Table 2. Quantitative MSSIM comparisons of the state-of-the-art methods on the synthetic optical RSIs shown in Figure 5. Values are listed for σ = 15 / 20 / 25 / 30.

Methods | STM1 | STM2 | STM3
UTVSR | 0.7872 / 0.7251 / 0.6626 / 0.6205 | 0.8150 / 0.7592 / 0.7118 / 0.6825 | 0.8766 / 0.8587 / 0.8482 / 0.8342
WNNM-WDSUV | 0.8196 / 0.7748 / 0.7209 / 0.6842 | 0.8237 / 0.7734 / 0.7476 / 0.7149 | 0.8821 / 0.8720 / 0.8567 / 0.8413
HSI-DeNet | 0.8216 / 0.7843 / 0.7257 / 0.6861 | 0.8320 / 0.7771 / 0.7471 / 0.7157 | 0.8826 / 0.8715 / 0.8559 / 0.8424
UV-DCNN | 0.8466 / 0.7944 / 0.7318 / 0.6898 | 0.8433 / 0.7820 / 0.7482 / 0.7179 | 0.8895 / 0.8728 / 0.8577 / 0.8426
JAWS | 0.8472 / 0.8003 / 0.7356 / 0.6917 | 0.8482 / 0.7867 / 0.7488 / 0.7187 | 0.8919 / 0.8724 / 0.8585 / 0.8462
Proposed | 0.8518 / 0.8126 / 0.7415 / 0.7172 | 0.8527 / 0.8037 / 0.7524 / 0.7318 | 0.9011 / 0.8829 / 0.8617 / 0.8534

Methods | SAM1 | SAM2 | SAM3
UTVSR | 0.8908 / 0.8524 / 0.8216 / 0.7913 | 0.8050 / 0.7249 / 0.6452 / 0.5886 | 0.7480 / 0.6772 / 0.5981 / 0.5562
WNNM-WDSUV | 0.8998 / 0.8662 / 0.8312 / 0.8060 | 0.8127 / 0.7615 / 0.7094 / 0.6753 | 0.7846 / 0.7179 / 0.6661 / 0.6229
HSI-DeNet | 0.9050 / 0.8646 / 0.8335 / 0.8054 | 0.8225 / 0.7648 / 0.7137 / 0.6783 | 0.7763 / 0.7178 / 0.6673 / 0.6243
UV-DCNN | 0.8996 / 0.8676 / 0.8309 / 0.8096 | 0.8266 / 0.7737 / 0.7168 / 0.6802 | 0.7940 / 0.7196 / 0.6718 / 0.6277
JAWS | 0.9064 / 0.8698 / 0.8334 / 0.8073 | 0.8272 / 0.7755 / 0.7155 / 0.6793 | 0.7992 / 0.7192 / 0.6728 / 0.6283
Proposed | 0.9172 / 0.8813 / 0.8594 / 0.8216 | 0.8353 / 0.7962 / 0.7367 / 0.7015 | 0.8127 / 0.7533 / 0.7119 / 0.6527

Methods | SWDCM | SWDC
UTVSR | 0.8912 / 0.8241 / 0.7802 / 0.7384 | 0.8610 / 0.8287 / 0.7807 / 0.7477
WNNM-WDSUV | 0.8937 / 0.8486 / 0.8094 / 0.7808 | 0.8645 / 0.8300 / 0.7971 / 0.7725
HSI-DeNet | 0.9003 / 0.8525 / 0.8123 / 0.7832 | 0.8619 / 0.8319 / 0.7976 / 0.7743
UV-DCNN | 0.8978 / 0.8539 / 0.8139 / 0.7858 | 0.8651 / 0.8468 / 0.8066 / 0.7805
JAWS | 0.9014 / 0.8549 / 0.8145 / 0.7833 | 0.8647 / 0.8476 / 0.8079 / 0.7817
Proposed | 0.9153 / 0.8792 / 0.8386 / 0.8124 | 0.8878 / 0.8629 / 0.8237 / 0.8019
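The MPSNR and MSSIM reported in Tables 1 and 2 are assumed here to follow the usual convention for multi-band RSIs, namely PSNR/SSIM computed per spectral band and then averaged; the sketch below shows this computation with scikit-image, and the function name is illustrative rather than taken from the authors' evaluation code.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def mpsnr_mssim(clean: np.ndarray, restored: np.ndarray, data_range: float = 255.0):
    """clean, restored: (H, W, B) arrays with B spectral bands; returns (MPSNR, MSSIM)."""
    psnrs, ssims = [], []
    for b in range(clean.shape[-1]):
        psnrs.append(peak_signal_noise_ratio(clean[..., b], restored[..., b],
                                             data_range=data_range))
        ssims.append(structural_similarity(clean[..., b], restored[..., b],
                                           data_range=data_range))
    return float(np.mean(psnrs)), float(np.mean(ssims))
```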
Table 3. Average computation time (unit: seconds) comparison on different image sizes. For the learning-based methods, the computation times on CPU/GPU are both presented.

Image Size | UTVSR | WNNM-WDSUV | HSI-DeNet | UV-DCNN | JAWS | Proposed
256 × 256 | 653.923 | 108.714 | 1.048/0.016 | 1.077/0.035 | 874.675 | 1.068/0.024
512 × 512 | 2674.641 | 440.283 | 5.869/0.027 | 7.953/0.142 | 2937.424 | 6.667/0.073
Table 4. Quantitative comparisons of the state-of-the-art methods on the real-life optical RSIs shown in Figure 6.

Indexes | Methods | RAM1 | RAM2 | RAM3 | RTM1 | RTM2 | RTM3 | Urban
QM | UTVSR | 25.18 | 32.37 | 11.82 | 12.57 | 12.83 | 23.78 | 27.58
QM | WNNM-WDSUV | 25.47 | 32.69 | 12.24 | 13.08 | 13.26 | 24.33 | 28.81
QM | HSI-DeNet | 26.39 | 33.97 | 12.89 | 13.67 | 13.81 | 24.85 | 30.49
QM | UV-DCNN | 26.62 | 34.57 | 13.48 | 14.13 | 14.37 | 25.61 | 31.14
QM | JAWS | 26.79 | 34.78 | 14.17 | 14.52 | 14.88 | 25.94 | 31.63
QM | Proposed | 27.11 | 35.27 | 14.72 | 15.18 | 15.49 | 26.37 | 32.16
MICV | UTVSR | 35.72 | 33.26 | 38.19 | 37.54 | 36.47 | 36.79 | 29.46
MICV | WNNM-WDSUV | 35.91 | 33.67 | 38.42 | 37.76 | 36.65 | 36.92 | 29.58
MICV | HSI-DeNet | 36.15 | 33.83 | 38.79 | 37.93 | 36.89 | 37.27 | 29.84
MICV | UV-DCNN | 36.29 | 34.08 | 38.94 | 38.17 | 37.09 | 37.55 | 30.09
MICV | JAWS | 36.67 | 34.41 | 39.18 | 38.54 | 37.51 | 37.86 | 30.57
MICV | Proposed | 36.86 | 34.74 | 39.49 | 38.82 | 37.74 | 38.28 | 31.07
MMRD | UTVSR | 0.45 | 0.67 | 0.051 | 0.058 | 0.062 | 0.36 | 0.57
MMRD | WNNM-WDSUV | 0.39 | 0.52 | 0.046 | 0.051 | 0.055 | 0.32 | 0.53
MMRD | HSI-DeNet | 0.37 | 0.049 | 0.041 | 0.047 | 0.048 | 0.29 | 0.48
MMRD | UV-DCNN | 0.33 | 0.041 | 0.037 | 0.042 | 0.043 | 0.27 | 0.44
MMRD | JAWS | 0.29 | 0.036 | 0.031 | 0.036 | 0.038 | 0.22 | 0.35
MMRD | Proposed | 0.21 | 0.026 | 0.022 | 0.027 | 0.029 | 0.19 | 0.27
NIQE | UTVSR | 7.03 | 7.37 | 4.18 | 4.22 | 4.26 | 6.73 | 7.19
NIQE | WNNM-WDSUV | 6.94 | 7.18 | 4.05 | 4.11 | 4.17 | 6.67 | 7.07
NIQE | HSI-DeNet | 6.81 | 7.06 | 3.94 | 4.08 | 4.09 | 6.51 | 6.91
NIQE | UV-DCNN | 6.67 | 6.84 | 3.78 | 3.83 | 3.85 | 6.17 | 6.39
NIQE | JAWS | 6.46 | 6.61 | 3.62 | 3.67 | 3.71 | 5.95 | 6.08
NIQE | Proposed | 6.23 | 6.33 | 3.28 | 3.36 | 3.39 | 5.28 | 5.62
For each image, the best result is marked with bold font.
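For orientation, the sketch below implements two of the no-reference indexes in Table 4 under their common definitions in the destriping literature, which is an assumption here since the paper's formulas are not reproduced in this excerpt: ICV is the mean-to-standard-deviation ratio over a visually homogeneous window (higher is better), and MRD is the mean relative deviation between the destriped and original images over stripe-free pixels (lower is better); the "M" prefix in MICV/MMRD presumably denotes averaging over several windows or bands. The QM index and NIQE (typically obtained from MATLAB's niqe() or a third-party port) are not shown.

```python
import numpy as np

def icv(destriped: np.ndarray, rows: slice, cols: slice) -> float:
    """Inverse coefficient of variation over a homogeneous window (rows, cols)."""
    patch = destriped[rows, cols].astype(np.float64)
    return float(patch.mean() / (patch.std() + 1e-12))

def mrd(original: np.ndarray, destriped: np.ndarray, stripe_free: np.ndarray) -> float:
    """Mean relative deviation (%) over pixels unaffected by stripes (boolean mask)."""
    o = original[stripe_free].astype(np.float64)
    d = destriped[stripe_free].astype(np.float64)
    return float(100.0 * np.mean(np.abs(d - o) / (np.abs(o) + 1e-12)))
```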