Convolutional Neural Network and Guided Filtering for SAR Image Denoising

Coherent noise often interferes with synthetic aperture radar (SAR) images, which has a large impact on subsequent processing and analysis. This paper puts forward a novel algorithm involving a convolutional neural network (CNN) and guided filtering for SAR image denoising, which combines the advantages of model-based optimization and discriminative learning and considers how to best preserve image information while improving the resolution of the images. The proposed method works as follows: first, an SAR image is filtered by five denoisers at different noise levels, each employing an efficient and effective CNN denoiser prior, to obtain five denoised images. Then, a guided filtering-based fusion algorithm is used to integrate the five denoised images into a final denoised image. The experimental results indicate that the algorithm eliminates noise effectively and improves the visual effect of the image significantly, allowing it to outperform some recent denoising methods in this field.


Introduction
Synthetic aperture radar (SAR) is a significant coherent imaging system that generates high-resolution images of terrain and targets. Since SAR possesses inherent all-time and all-weather capabilities that overcome the shortcomings of optical and infrared systems, it is widely used in ocean monitoring, resource exploration, and military applications. Multiplicative noise, called speckle, often interferes with SAR images. Speckle is formed by the interference of the echoes from each resolving unit, and it complicates analysis and processing in computer vision systems. Therefore, removing this coherent noise is very important for applications in the SAR image field.
Over the past few decades, scholars have proposed many methods for SAR image denoising. Some denoising methods are based on spatial filtering, for example, Lee filtering [1], Kuan filtering [2], Frost filtering [3], Gamma maximum a posteriori (MAP) filtering [4], and non-local means (NLM) denoising [5]. Since spatial filtering tends to darken the denoised SAR images, denoising algorithms based on the transform domain have been developed and have achieved remarkable results in recent years. These transform-domain filters are mainly based on wavelet transforms and multi-scale geometric transforms, such as wavelet-domain Bayesian denoising [6], contourlet-domain SAR image denoising [7,8], Shearlet-domain SAR image denoising [9][10][11], and so on.
Recently, scholars have also proposed many image fusion algorithms. Data-driven image fusion methods and multi-scale image fusion methods are the two most popular [27]. However, these methods do not fully consider spatial consistency and therefore tend to produce brightness and color distortion. A variety of optimization-based image fusion methods were then introduced, for example, image fusion based on generalized random walks [28], which can fully utilize the spatial information of an image. These methods estimate the weights of pixels in different source images via energy functions that act on the same positions in the different source images, and the source images are then fused into one image through a weighted average of the pixel values. Nevertheless, optimization-based methods suffer from computational complexity, because several iterations are needed to find a global optimal solution. Another disadvantage is that methods based on global optimization tend to over-smooth the weights, which is harmful to image fusion [29,30]. To overcome the above-mentioned problems, Li et al. [30] proposed a method called fusion based on guided filtering (GFF), which combines pixel saliency with the spatial information of the image to produce an image fusion without relying on a specific image decomposition, achieving rapid fusion. Thus, the GFF algorithm is employed in our paper to fuse the denoised images produced by the CNN denoisers.
Traditionally, most noise suppression algorithms need to know the noise variance. It is normally difficult to estimate the noise level of SAR images, yet the noise level has a great influence on denoising. For example, the performance of the denoising algorithm proposed in [12] relies on the noise level of the image. Generally, when the estimated noise level is larger than the ground truth noise level of the image, the final image produced by a model-based denoising algorithm tends to be over-smoothed [31], and when the estimated noise level is smaller than the ground truth noise level, the final denoised image contains more noise and artificial textures [31,32]. That is to say, for SAR image denoising, many speckles would remain after processing through denoisers at a low noise level, while the final denoised image obtained via a denoiser at a high noise level would appear over-smoothed. A feasible idea [32] is therefore to fuse the denoised images obtained from different denoising algorithms through a fusion algorithm to achieve superior performance. CNN prior denoising algorithms trained at different noise levels can serve as the different denoising algorithms, while the guided filtering-based fusion algorithm is a fast and advanced fusion method. Therefore, following this idea, this paper proposes a new SAR image denoising algorithm based on convolutional neural networks and guided filtering. The algorithm first applies CNN prior denoisers at five noise levels to the SAR image and then fuses the five denoised images through the GFF fusion algorithm to obtain the final denoised image. Compared with traditional despeckling methods and CNN-based despeckling methods, the most obvious advantage of the proposed algorithm is the combination of a model-based optimization method and a discriminative learning method. The discriminative denoisers, which are obtained by CNN training, are plugged into the model-based optimization method to solve the speckle suppression problem. The algorithm can not only suppress speckle like a model-based optimization method, but also has the speed advantage of a discriminative learning method. The experimental results show that the algorithm removes noise effectively and retains the detailed texture in the final images.

Image Denoising Model
In general, the purpose of image denoising is to recover an underlying clean image x from a degraded observation y = x + v, where y represents the observed image and v is additive white Gaussian noise with standard deviation σ. The denoising problem can therefore be converted into the following energy minimization problem [12]:

x̂ = arg min_x (1/2)‖y − x‖² + λΦ(x), (1)

where (1/2)‖y − x‖² is the fidelity term, which ensures the similarity between the denoised image and the source image, and Φ(x) is a regularization term that suppresses noise and contains image prior information. That is to say, the fidelity term ensures that the solution conforms to the degradation process, while the regularization term enforces the desired properties of the output. λ is a trade-off parameter that balances the fidelity term and the regularization term.
Generally, the algorithms for solving Equation (1) can be divided into two categories: discriminative learning algorithms and model-based optimization algorithms. Model-based optimization methods aim to solve Equation (1) directly with some optimization algorithm, which usually involves time-consuming iterative inference. In contrast, discriminative learning methods try to learn the prior parameters Θ and a compact inference through the optimization of a loss function on a training set of degraded-clean image pairs [12]. The objective is generally given by

min_Θ ℓ(x̂(y; Θ), x), (2)

where ℓ is the loss between the network output x̂ and the ground truth x. Model-based optimization algorithms can handle noise suppression flexibly through a specific degradation matrix, but they tend to be time-consuming. Discriminative learning algorithms, at the cost of flexibility, achieve not only relatively faster speeds but also a superior denoising effect thanks to their joint optimization with end-to-end training [12]. It is therefore an intuitive idea to take advantage of both categories for denoising. The half quadratic splitting algorithm [33] is used to combine the two methods to solve the image inverse problem. Based on this framework, we describe only the denoising model that is based on a CNN prior.
In order to plug the CNN denoisers into the optimization procedure of Equation (1), we insert the denoiser prior into an iterative scheme that separates the fidelity term and the regularization term according to the half quadratic splitting algorithm. Equation (1) is thereby transformed into a sub-problem related to the fidelity term and a denoising sub-problem. By introducing an auxiliary variable z, Equation (1) can be redefined as the constrained optimization problem

x̂ = arg min_x (1/2)‖y − x‖² + λΦ(z), s.t. z = x. (3)

Equation (3) can then be solved by the half quadratic splitting algorithm. We first construct the following cost function:

L_µ(x, z) = (1/2)‖y − x‖² + λΦ(z) + (µ/2)‖z − x‖², (4)

where the penalty parameter µ is iteratively adjusted in a non-descending order [12,31]. That is, the solution of Equation (3) can be obtained by minimizing Equation (4). Since Equation (4) is an unconstrained problem, its solution can be characterized by the Karush-Kuhn-Tucker (KKT) conditions. The most direct algorithm for such splittings is the alternating direction method of multipliers (ADMM), which solves convex optimization problems by breaking them into smaller pieces that are each easier to handle; it has recently found application in areas such as image recovery and autoregressive identification in neuroimaging time series [33][34][35][36]. Fixing z = z_k, so that λΦ(z_k) is constant, Equation (4) reduces to

x_{k+1} = arg min_x ‖y − x‖² + µ‖x − z_k‖². (5)

Fixing x = x_{k+1}, so that (1/2)‖y − x_{k+1}‖² is constant, minimizing L_µ(x, z) reduces to

z_{k+1} = arg min_z (µ/2)‖z − x_{k+1}‖² + λΦ(z). (6)

The fidelity term and the regularization term are thus divided into two separate sub-problems. Since z_k is a constant in Equation (5), the solution of Equation (5) is the same as the minimizer of f(x) = ‖y − x‖² + µ‖x − z_k‖². Taking the derivative of f(x) and setting it to zero,

(x − y) + µ(x − z_k) = 0, (7)

gives the closed-form solution

x_{k+1} = (y + µ z_k) / (1 + µ). (8)

Dividing the objective of Equation (6) by λ, which has no effect on the minimizer z_{k+1}, Equation (6) can be rewritten as

z_{k+1} = arg min_z 1/(2(√(λ/µ))²) ‖z − x_{k+1}‖² + Φ(z). (9)

From the perspective of Bayesian maximum a posteriori probability, Equation (9) states that z_{k+1} can be obtained by applying a Gaussian denoiser with noise level √(λ/µ) to the image x_{k+1} [12]. Thus, any Gaussian denoiser can serve as a modular part in solving Equation (1); the denoised image z_{k+1} can be obtained from x_{k+1} with any denoiser. Writing Denoiser(·) for the denoiser function, Equation (9) can be rewritten as

z_{k+1} = Denoiser(x_{k+1}, √(λ/µ)). (10)

From Equations (9) and (10), we find that the image prior Φ(x) in the regularization term can be implicitly replaced by a denoiser; even when Φ(x) is unknown, Equation (6) can still be solved by denoisers containing complementary image priors. In this paper, the CNN denoisers trained in [12] are employed, and they are introduced in the next section.
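The iteration of the fidelity step and the denoising step can be sketched in a few lines of NumPy. The sketch below is illustrative only: `box_denoiser` is a hypothetical moving-average stand-in for the trained CNN denoiser prior, and the parameter values (`lam`, `mu`, `rho`) are assumptions, not the paper's settings.

```python
import numpy as np

def box_denoiser(x, strength):
    """Stand-in for the CNN denoiser prior: a moving-average smoother
    whose window grows with the requested noise level (the paper plugs
    in trained CNN denoisers here)."""
    r = max(1, int(strength))
    pad = np.pad(x, r, mode='edge')
    k = 2 * r + 1
    out = np.zeros_like(x, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + x.shape[0], dx:dx + x.shape[1]]
    return out / (k * k)

def hqs_plug_and_play(y, denoiser, lam=0.5, mu=0.5, rho=1.2, iters=8):
    """Half quadratic splitting with a plug-in denoiser prior.
    x-step: closed form of min_x (1/2)||y-x||^2 + (mu/2)||x-z||^2.
    z-step: a Gaussian denoiser applied at noise level sqrt(lam/mu)."""
    z = y.astype(float).copy()
    for _ in range(iters):
        x = (y + mu * z) / (1.0 + mu)        # fidelity sub-problem
        z = denoiser(x, np.sqrt(lam / mu))   # denoising sub-problem
        mu *= rho                            # non-descending penalty schedule
    return x

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))   # smooth test image
noisy = clean + 0.1 * rng.standard_normal(clean.shape)
restored = hqs_plug_and_play(noisy, box_denoiser)
```

With a real CNN denoiser substituted for `box_denoiser`, the same loop implements the denoiser-prior scheme described above.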

CNN Denoiser
The architecture of the CNN in this paper is the same as that in [12], as shown in Figure 1. It is a seven-layer network built from three different modules. In Figure 1, "DCon-s" indicates a dilated convolution with dilation factor s = 1, 2, 3, or 4, "BN" denotes batch normalization, and "ReLU" represents rectified linear units. The first layer is "Dilated Convolution + ReLU". The dilated convolution operation works as follows: each parameter in the convolution kernel is spread apart according to the dilation factor in the four directions of up, down, left, and right, so the number of kernel parameters does not change while the receptive field becomes larger. An example of dilated convolution is given in Figure 2. Figure 2a is a normal convolution; after 1-dilated convolution, we get a receptive field of 3 × 3. Figure 2b is obtained from Figure 2a through 2-dilated convolution, giving a receptive field of 7 × 7, and Figure 2c continues the dilation one step further.
Firstly, the network model uses dilated convolution to balance the size of the receptive field and the depth of the network. Dilated convolution is attractive because of its ability to expand the receptive field while retaining the advantages of a traditional 3 × 3 convolution. A dilated filter with dilation factor s can be viewed as a (2s + 1) × (2s + 1) sparse filter in which only nine fixed positions can be non-zero. Therefore, the equivalent receptive fields of the seven layers are 3, 5, 7, 9, 7, 5, and 3, respectively, and the overall receptive field of the network is 33 × 33.
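The receptive-field arithmetic above can be checked mechanically. The short sketch below assumes dilation factors of 1, 2, 3, 4, 3, 2, and 1 for the seven layers, as in [12]:

```python
def equivalent_size(s):
    """A 3x3 filter with dilation factor s acts as a (2s+1)x(2s+1)
    sparse filter with only nine non-zero positions."""
    return 2 * s + 1

def receptive_field(dilations):
    """For stacked stride-1 convolutions, each layer enlarges the
    receptive field by (equivalent filter size - 1)."""
    rf = 1
    for s in dilations:
        rf += equivalent_size(s) - 1
    return rf

dilations = [1, 2, 3, 4, 3, 2, 1]                  # the seven layers
print([equivalent_size(s) for s in dilations])     # [3, 5, 7, 9, 7, 5, 3]
print(receptive_field(dilations))                  # 33
```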
Secondly, residual learning and batch normalization are used in the network model to speed up training. They are two of the most influential techniques in CNN architecture design. Their combination not only stabilizes and accelerates training but also tends to produce a better denoising performance [12].
Thirdly, the network model uses small training samples to prevent boundary artifacts. Owing to the characteristics of convolution operations, boundary artifacts may appear in the denoised images of a CNN when the image boundary is not properly handled. Zhang et al. [12] found that a zero-padding boundary expansion strategy combined with a small training sample size helps to prevent boundary artifacts, since small blocks allow the CNN to use more boundary information. Therefore, each image is cropped into 35 × 35 small non-overlapping blocks in the network model to strengthen the boundary information of the image.
For training the CNN, we used a dataset consisting of 400 images from the Berkeley segmentation dataset of size 180 × 180, as mentioned in [12]. For convenience, we converted the images to grayscale, cropped them into small patches of size 35 × 35, and selected 12,000 patches for training. The corresponding noisy patches were generated by adding additive Gaussian noise to the clean patches during training. The loss function of the CNN was the same as the loss function in [12].
Finally, the network model trains specific denoisers at closely spaced noise levels to handle noisy images of different noise levels. Ideally, the denoiser in Equation (9) should be trained on data at the current noise level. Zhang et al. [12] trained a series of denoisers over the noise level range [0, 50], with a step size of 2 between models, resulting in a total of 25 denoisers. Since small fluctuations in the SAR image noise level have little effect on the denoising result, we chose CNN denoisers with noise levels of 5, 10, 15, 20, and 25 to remove the speckle and then fused the five denoised images to obtain the final denoised image. The GFF algorithm employed in this paper is briefly introduced below.

Image Fusion-Based Guided Filtering
Derived from a local linear model, the guided filter computes the filtering output by considering the content of a guidance image, which can be the input image itself or a different image. The guided filter can be used as an edge-preserving smoothing operator, like the popular bilateral filter, but it behaves better near edges. The guided filter is also a more generic concept beyond smoothing: it can transfer the structures of the guidance image to the filtering output. Moreover, the guided filter naturally has a fast, non-approximate, linear-time algorithm, regardless of the kernel size and the intensity range. Thus, guided filtering is widely applied in the image processing field [37]. Suppose that the guidance image is I, the input image is p (i.e., the image to be filtered), and the output image is q. The vital assumption of guided filtering is a local linear model between the guidance image and the output image:

q_i = a_k I_i + b_k, ∀ i ∈ ω_k, (11)

where a_k and b_k are linear coefficients, i is a pixel index, and ω_k is the local window centered on point k in the guidance image I; it is a square window of size (2r + 1) × (2r + 1). The edge-preserving filtering problem is transformed into an optimization problem: minimizing the difference between p and q while maintaining the linear relationship in Equation (11). That is, we solve the minimization problem

E(a_k, b_k) = Σ_{i∈ω_k} ((a_k I_i + b_k − p_i)² + ε a_k²), (12)

where ε is a regularization parameter. We can use linear regression [37] to solve Equation (12):

a_k = ((1/N_ω) Σ_{i∈ω_k} I_i p_i − µ_k p̄_k) / (σ_k² + ε),  b_k = p̄_k − a_k µ_k, (13)

where µ_k and σ_k² represent the mean and variance of I in the local window ω_k, N_ω represents the number of pixels in the window, and p̄_k is the mean of p in the window ω_k. Since a pixel i is covered by many windows ω_k, the value of q_i in Equation (11) differs from window to window, so the final output averages the coefficients of all windows covering i: q_i = ā_i I_i + b̄_i.

An image fusion algorithm with guided filtering is given in Figure 3 [30]. First of all, each source image I_n is decomposed into two scales by mean filtering, namely the base layer B_n and the detail layer D_n. Then, we apply Laplacian filtering to obtain the high-pass portion H_n of each source image I_n. The saliency map S_n is constructed from the local averages of the absolute values of H_n. The weight map P_n marks the pixels at which the source image I_n has the largest saliency. Next, we use the corresponding source image I_n as the guidance image for performing guided filtering on each weight map P_n:

W^B_n = G_{r1,ε1}(P_n, I_n),  W^D_n = G_{r2,ε2}(P_n, I_n), (14)

where r_1, ε_1, r_2, and ε_2 are the parameters of the filter, and W^B_n and W^D_n are the final weight maps of the base layer and the detail layer.
Then, the base layers and the detail layers of the different source images are merged by weighted averaging:

B̄ = Σ_n W^B_n B_n,  D̄ = Σ_n W^D_n D_n. (15)

Finally, the fused image F is acquired by

F = B̄ + D̄. (16)

In GFF, the size of ω_k should be decided experimentally. To fuse the base layers, the size of ω_k is (2r_1 + 1) × (2r_1 + 1), and a large filter size r_1 is preferred. To fuse the detail layers, the size of ω_k is (2r_2 + 1) × (2r_2 + 1), and the fusion performance worsens when the filter size r_2 is either too big or too small. In this paper, the value of r_1 was set to 45 and the value of r_2 to 7 based on experiments. The flow diagram of GFF is shown in Figure 4.
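A compact NumPy sketch of the guided filter and the two-scale fusion is given below. It is a simplified illustration, not the reference GFF implementation: the `eps` values, the base-layer mean-filter radius (15), and the saliency averaging radius (2) are assumptions, and the fused weights are normalized per pixel.

```python
import numpy as np

def box_mean(img, r):
    """Mean over a (2r+1)x(2r+1) window with replicated borders."""
    pad = np.pad(img, r, mode='edge')
    k = 2 * r + 1
    out = np.zeros_like(img, dtype=float)
    for dy in range(k):
        for dx in range(k):
            out += pad[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def guided_filter(I, p, r, eps):
    """Guided filter: q = a*I + b, with the linear coefficients a, b
    averaged over all windows covering each pixel."""
    mean_I, mean_p = box_mean(I, r), box_mean(p, r)
    cov_Ip = box_mean(I * p, r) - mean_I * mean_p
    var_I = box_mean(I * I, r) - mean_I ** 2
    a = cov_Ip / (var_I + eps)
    b = mean_p - a * mean_I
    return box_mean(a, r) * I + box_mean(b, r)

def gff_fuse(sources, r1=45, eps1=0.3, r2=7, eps2=1e-6):
    """Two-scale guided-filtering fusion of N grayscale sources.
    r1 = 45 and r2 = 7 follow the paper; other values are assumptions."""
    lap = np.array([[0, 1, 0], [1, -4, 1], [0, 1, 0]], dtype=float)

    def conv3(img, k):
        pad = np.pad(img, 1, mode='edge')
        out = np.zeros_like(img, dtype=float)
        for dy in range(3):
            for dx in range(3):
                out += k[dy, dx] * pad[dy:dy + img.shape[0],
                                       dx:dx + img.shape[1]]
        return out

    bases = [box_mean(I, 15) for I in sources]          # base layers B_n
    details = [I - B for I, B in zip(sources, bases)]   # detail layers D_n
    sal = [box_mean(np.abs(conv3(I, lap)), 2) for I in sources]  # S_n
    winner = np.argmax(np.stack(sal), axis=0)           # most salient source
    P = [(winner == n).astype(float) for n in range(len(sources))]
    WB = [guided_filter(I, Pn, r1, eps1) for I, Pn in zip(sources, P)]
    WD = [guided_filter(I, Pn, r2, eps2) for I, Pn in zip(sources, P)]
    wb_sum = np.sum(WB, axis=0) + 1e-12                 # normalize weights
    wd_sum = np.sum(WD, axis=0) + 1e-12
    fused_B = np.sum([w * B for w, B in zip(WB, bases)], axis=0) / wb_sum
    fused_D = np.sum([w * D for w, D in zip(WD, details)], axis=0) / wd_sum
    return fused_B + fused_D

rng = np.random.default_rng(1)
a = rng.random((40, 40))
b = box_mean(a, 3)          # a blurrier copy of the same scene
fused = gff_fuse([a, b])
```

A useful sanity check of the design: fusing identical sources returns the source unchanged, since the winning weight map is constant and guided filtering of a constant image reproduces that constant.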

CNN Denoiser Prior and Guided Filtering for SAR Image Denoising
In an SAR image, the relative phase between the scattering points in each resolution unit is closely related to the radar azimuth. The speckle is considered to be produced by the coherent superposition of the echoes of many scattering points randomly distributed within the same resolution cell of the scene. Goodman [9] proved that fully developed speckle is multiplicative noise, with the multiplicative model

G = F · N, (17)

where G denotes the SAR image contaminated by speckle; F indicates the radar scattering characteristic of the ground target (i.e., the clean image); and N denotes the speckle due to fading. The random processes N and F are independent. N conforms to a Gamma distribution with mean one and variance 1/L:

p(N) = (L^L N^{L−1} / Γ(L)) e^{−LN},  L ≥ 1, N ≥ 0, (18)

where Γ is the Gamma function, and L is the equivalent number of looks (ENL).
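The multiplicative speckle model can be simulated directly with NumPy's Gamma generator: a Gamma distribution with shape L and scale 1/L has mean one and variance 1/L, as required. The flat 100-valued patch below is an illustrative test image, not data from the paper.

```python
import numpy as np

def simulate_speckle(F, L, rng):
    """Multiplicative speckle model G = F * N, with N ~ Gamma(L, 1/L):
    mean(N) = 1, var(N) = 1/L (L = equivalent number of looks)."""
    N = rng.gamma(shape=L, scale=1.0 / L, size=F.shape)
    return F * N

rng = np.random.default_rng(0)
F = np.full((256, 256), 100.0)        # flat-reflectivity patch
G = simulate_speckle(F, L=4, rng=rng)
# Over a homogeneous region, mean(G) stays near F, and the squared
# coefficient of variation var(G)/mean(G)^2 approaches 1/L = 0.25.
```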

The main purpose of denoising is to eliminate N and to recover F from G. To facilitate the denoising process, homomorphic filtering is usually applied to Equation (17), replacing the multiplicative model with an additive model:

ln G = ln F + ln N. (19)

After this transform, the noise can be assumed to approximately obey a Gaussian distribution [38]. Thus, Equation (19) can be rewritten as y = x + v, where y represents the observed image, v is the additive noise, and x is the clean image. Therefore, the SAR image can be denoised by the above denoising algorithm.
Figure 5 gives the flow chart of SAR image denoising based on CNN denoiser priors and the guided filtering fusion algorithm. The specific algorithm workflow of this paper is as follows:
Step 1: Equation (19) is used to process the original SAR image by homomorphic filtering, yielding the image y;
Step 2: Train the CNN prior denoisers;
Step 3: Initialize x_k = y;
Step 4: The CNN denoisers with noise levels of 5, 10, 15, 20, and 25 are applied to the image x_k to obtain the denoised images z_k1, z_k2, z_k3, z_k4, and z_k5 by Equation (10);
Step 5: The denoised images z_k1, z_k2, z_k3, z_k4, and z_k5 are fused into the denoised image z_f by the GFF fusion algorithm with Equations (15) and (16); here, five images are fused instead of two through the process in Figure 3;
Step 6: Assign the value of z_f to z_k; from Equation (8), we can get x_{k+1};
Step 7: Let k = k + 1, and repeat Steps 4-6 until the norm of x_{k+1} − z_k is less than 0.01;
Step 8: The image x_{k+1} is exponentially transformed to obtain the final denoised image y_f.

Experimental Results
The training sets and training process of the CNN denoisers used in this paper are the same as those described in [12]. The denoiser models were trained in Matlab R2014b (MathWorks, Natick, MA, USA) with the MatConvNet toolbox (MatConvNet-1.0-beta24), and the GPU platform was an Nvidia Titan X Quadro K6000 (NVIDIA Corporation, Santa Clara, CA, USA). The parameters of the GFF algorithm used in our algorithm were the same as those described in [30].
In order to verify the reliability and effectiveness of the proposed algorithm, it was tested on a simulated SAR image. The specific steps of the experiment were as follows. In the first step, the clean SAR image was converted into the logarithmic domain by the logarithmic function to obtain the logarithmic SAR image.
In the second step, a random Gaussian noise matrix of the same size as the logarithmic SAR image was generated with a given noise variance; this paper uses variances of 0.04, 0.05, and 0.06. The noise matrix was then added to the logarithmic SAR image to obtain the simulated noisy SAR images.
In the third step, with the simulated noisy SAR image as input, the proposed algorithm was used to obtain the denoised image.
In the fourth step, the denoised image was exponentially transformed to obtain the final denoised image.
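The four simulation steps can be sketched as follows; the flat 50-valued test image and the identity placeholder for the denoising step are assumptions for illustration.

```python
import numpy as np

def add_log_domain_noise(clean, variance, rng):
    """Steps 1-2: log-transform the clean SAR image and add Gaussian
    noise of the given variance (0.04, 0.05, or 0.06 in the paper)."""
    return np.log(clean) + np.sqrt(variance) * rng.standard_normal(clean.shape)

def back_to_intensity(denoised_log):
    """Step 4: exponential transform back to the intensity domain."""
    return np.exp(denoised_log)

rng = np.random.default_rng(0)
clean = np.full((64, 64), 50.0)
noisy_log = add_log_domain_noise(clean, 0.05, rng)
# Step 3 would run the proposed denoiser on noisy_log; the identity is
# used here as a placeholder.
simulated = back_to_intensity(noisy_log)
```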
Figure 6a,b show the original image and the noisy image, respectively; the six images in (c)-(h) are the five denoised images produced by denoisers with noise levels of 5, 10, 15, 20, and 25, plus the final denoised image produced by the proposed method. Figure 6c indicates that, when the selected denoiser level is smaller than the ground truth noise level, the denoised image still contains a lot of noise, while (f) and (g) illustrate that, when the selected denoiser level is larger than the ground truth noise level, the denoised image appears over-smoothed. Thus, we fused all the denoised images using the GFF algorithm in order to obtain better denoised results, as shown in Figure 6h. The denoised image obtained by the proposed algorithm had less noise while retaining the detailed texture and producing a promising visual effect.
Figure 7 shows a comparison between the proposed algorithm and other denoising algorithms. For all algorithms, Gaussian noise with a variance of 0.05 was added to the images. The denoising algorithms compared were the Lee filter [1]; the sparse representation-based Bayesian threshold shrinkage denoising algorithm in the Shearlet domain (BSS-SR), as described in [39]; the local linear minimum-mean-square-error (LLMMSE) wavelet shrinkage-based nonlocal denoising algorithm for SAR images (SAR-BM3D), as described in [40]; SAR image denoising based on continuous cycle spinning via sparse representation in the Shearlet domain (CS-BSR), as described in [41]; probabilistic patch-based weight iteration weighted maximum likelihood denoising (PPB), as described in [42]; the use of texture strength and weighted nuclear norm minimization for SAR image denoising (BWNNM), as described in [43]; deep CNN based on residual learning for image denoising (DnCNN), as described in [44]; and the proposed algorithm.
image.Finally, the simulated noise SAR images are obtained.
In the third step, with the simulated noise SAR image as input, the proposed algorithm was used to obtain the denoised image.
In the fourth step, the denoised image was exponentially transformed to obtain the final denoised image.
Figure 6a,b show the original image and the noisy image, respectively; the images in Figure 6c–g are the denoised images produced by denoisers with noise levels of 5, 10, 15, 20, and 25, and Figure 6h is the final denoised image produced by the proposed method. Figure 6c indicates that, when the selected denoiser level is lower than the ground-truth noise level, the denoised image still contains a great deal of noise, while Figure 6f,g illustrate that, when the selected denoiser level is higher than the ground-truth noise level, the denoised image appears over-smoothed. Thus, we fused all the denoised images using the GFF algorithm in order to obtain a better result, as shown in Figure 6h. The denoised image obtained by the proposed algorithm had less noise while retaining the detailed texture, with a promising visual effect.
Figure 7 shows a comparison between the proposed algorithm and other denoising algorithms. For all algorithms, Gaussian noise with a variance of 0.05 was added to the images in the figure. The denoising algorithms compared were the Lee filter [1]; the sparse representation-based Bayesian threshold shrinkage denoising algorithm in the Shearlet domain (BSS-SR) [39]; the local linear minimum-mean-square-error (LLMMSE) wavelet shrinkage-based nonlocal denoising algorithm for SAR images (SAR-BM3D) [40]; SAR image denoising based on continuous cycle spinning via sparse representation in the Shearlet domain (CS-BSR) [41]; probabilistic patch-based weighted maximum likelihood denoising (PPB) [42]; SAR image denoising using texture strength and weighted nuclear norm minimization (BWNNM) [43]; the deep CNN based on residual learning for image denoising (DnCNN) [44]; and the proposed algorithm. Observing the experimental results, we found that the image in Figure 7a still retains much noise after Lee filtering, and the edges of the denoised images in Figure 7b,d are somewhat blurred after the BSS-SR and CS-BSR methods. Although the SAR-BM3D and PPB methods effectively suppress the speckle, they lose much detail and appear over-smoothed, as shown in Figure 7c,e. The BWNNM and DnCNN algorithms suppress noise well and better preserve the edges, but some residual noise remains, as shown in Figure 7f,g. The proposed algorithm achieved the best balance between noise suppression and detail preservation.
Meanwhile, our method also achieved the best results on the three evaluation indexes PSNR, EPI, and SSIM when noise with a variance of 0.05 was added to the images. Table 1 shows that the SSIM of BSS-SR, SAR-BM3D, and other algorithms decreased slightly, which means that these algorithms cannot retain detail and reduce distortion simultaneously, whereas our method, as well as DnCNN and the Lee filter, showed satisfactory results. Moreover, the ENL value of our method remained stable, which is better than that of DnCNN. When noise with a variance of 0.06 was added to the image, the experimental results were the same as those analyzed above.
Generally speaking, whatever the noise level is, our proposed algorithm can preserve the structural information of the image, suppress the noise effectively, and retain the edge details to some extent.
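The denoise-then-fuse procedure evaluated above can be sketched as follows. This is a toy illustration under stated assumptions: `toy_denoiser` is a hypothetical stand-in for the CNN denoiser prior, and a plain average stands in for the guided-filtering fusion (GFF) step:

```python
import numpy as np

def multilevel_denoise(noisy, denoiser, levels=(5, 10, 15, 20, 25)):
    """Run one denoiser at several assumed noise levels and fuse the
    candidates (here by plain averaging, as a stand-in for GFF)."""
    candidates = [denoiser(noisy, s) for s in levels]
    return np.mean(candidates, axis=0)

def toy_denoiser(img, sigma):
    """Hypothetical denoiser: shrink toward the global mean, with
    stronger smoothing at higher assumed noise levels."""
    k = sigma / (sigma + 10.0)
    return (1 - k) * img + k * img.mean()

noisy = np.random.rand(16, 16)
fused = multilevel_denoise(noisy, toy_denoiser)
```

Fusing candidates from under- and over-estimated noise levels is what lets the method avoid committing to a single, possibly wrong, noise estimate.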
Moreover, our proposed algorithm was tested on actual SAR images. The test images were TerraSAR-X SAR images, which can be downloaded from the website of Federico II University in Naples, Italy; they are shown in Figure 8. Figure 8a shows an SAR image of trees, Figure 8b an SAR image of a city area, and Figure 8c an SAR image of a lake. They were denoised using the denoising algorithms described above.
Figure 9 shows the denoised images of Figure 8a. The red boxes in Figures 9–11 mark the region used for the objective evaluation parameter UM; the specific values are given in the objective evaluation index section. From Figure 9, we can see that the Lee filter performs worst. BSS-SR and CS-BSR blurred some edge texture, while SAR-BM3D and PPB introduced a little artificial texture. BWNNM and DnCNN produced over-smoothing. Our algorithm not only preserved the texture and edge information well but also suppressed the generation of artificial texture.
Figure 10 shows the denoised images of the city area in Figure 8b produced by the different denoising algorithms, and Figure 11 shows the denoised images of the lake SAR image in Figure 8c. As shown in Figures 10 and 11, the performances of the eight denoising algorithms are similar to the results for Figure 8a, and the denoising effect of our proposed algorithm is the most promising. However, since we have neither the clean image nor an expert interpreter, it is difficult to determine whether such artifacts imply any loss of detail. Some help comes from the analysis of ratio images, obtained, as mentioned in [40], as the pointwise ratio between the original SAR image and the denoised SAR images. Given perfect denoising, the ratio image should contain only speckle. On the contrary, the presence of structures or details related to the original image shows that the algorithm has removed not only noise but also some useful information. To highlight the visual effect of our method, we give the ratio images in Figures 12–14.
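The ratio-image check described above is simple to compute; a minimal sketch for intensity images, with a small constant assumed to avoid division by zero:

```python
import numpy as np

def ratio_image(original, denoised, eps=1e-6):
    """Pointwise ratio of the original SAR image to the denoised one;
    under ideal despeckling it should contain only speckle."""
    return original / (denoised + eps)

orig = np.full((4, 4), 2.0)
den = np.full((4, 4), 1.0)
r = ratio_image(orig, den)
```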
From Figure 12, we can see that the ratio image of our algorithm is closer to speckle. Figure 13 gives the ratio images from Figure 10.
Figure 14 shows the ratio images from Figure 11.
From the ratio images in Figures 12–14, we can see that the ratio images of our method have no obvious pattern and contain the least signal information. From this point of view, our method attains a better visual effect.
To show the superiority of our algorithm, we used several common objective evaluation parameters for the denoising algorithms, including UM, ENL, EPI, and SSIM. Tables 2–4 give the objective evaluation results for the denoised images above. Table 2 presents the evaluation indexes of the tree SAR image denoised by the eight algorithms. First of all, the UM value of our proposed algorithm was 25.4, the smallest and best of the eight algorithms, which shows that the proposed algorithm has excellent comprehensive performance in terms of noise suppression. The ENL value of our method was not ideal, but it was still larger than that of DnCNN. The reason for this is not only the complex inherent denoising structure of the CNN but also the texture, light, and shade of the SAR images. Finally, the EPI and SSIM values of our method were the largest, which shows that our proposed method has the strongest edge-preservation ability and best maintains the integrity of the image structure.
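For reference, two of the indexes used above can be computed as follows. ENL is the standard mean²/variance over a homogeneous region; the EPI shown here uses a simple gradient-ratio formulation, which is one common variant rather than necessarily the exact definition used in this paper:

```python
import numpy as np

def enl(region):
    """Equivalent number of looks over a homogeneous region:
    higher values indicate stronger speckle suppression."""
    return region.mean() ** 2 / region.var()

def epi(denoised, reference):
    """Edge preservation index as a ratio of summed absolute
    horizontal gradients; values near 1 mean edges are preserved."""
    gd = np.abs(np.diff(denoised, axis=1)).sum()
    gr = np.abs(np.diff(reference, axis=1)).sum()
    return gd / gr

flat = np.random.normal(10.0, 1.0, size=(64, 64))
```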
As shown in Tables 3 and 4, the performances of all the algorithms basically showed a similar trend to those in Table 2. Compared with other methods, we found that our method significantly improved UM, EPI, and SSIM.In summary, our algorithm possesses the best denoising ability, the strongest edge and detail preservation ability, and the most promising visual effects.
Admittedly, the abilities to preserve detailed information and to smooth noise are contradictory in our method. Although our method is better than the Lee filter in terms of ENL, it is not as good as PPB or SAR-BM3D, because the selection of the CNN model and fusion algorithm is purely empirical. With more suitable models and fusion methods, the performance of our method could be improved further.

Conclusions
In this paper, a novel SAR image denoising algorithm based on CNN and the guided filtering fusion algorithm was proposed.First, five different noise level denoisers from the prior set of CNN denoisers were used to obtain five denoised SAR images.Then, the five denoised images were fused with guided filtering to obtain the final denoised image.The experimental results indicate that our proposed algorithm can significantly increase the PSNR after image denoising, effectively suppress the speckle noise, maintain the edge and detail information, and obtain promising visual effects.However, due to the limitation of the CNN structure, our proposed algorithm cannot obtain the highest ENL and EPI values at the same time, which could be a future task for this field.

Figure 2. An example of the dilated convolution process.
Since the filtered output at a pixel should not change with the choice of local window, we apply mean filtering to a_k and b_k after calculating them in each local window. For simplicity, we adopt G_{r,ε}(p, I) to represent guided filtering, where r is the size of the filtering kernel, and ε is the regularization factor.
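A minimal guided-filter sketch following this description: per-window linear coefficients a_k and b_k are computed from local means and variances, then mean-filtered before forming the output. The naive loop-based box filter is for clarity only, not efficiency:

```python
import numpy as np

def box(img, r):
    """Mean over a (2r+1)x(2r+1) window, edge-padded (naive version)."""
    p = np.pad(img, r, mode='edge')
    out = np.empty_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            out[i, j] = p[i:i + 2 * r + 1, j:j + 2 * r + 1].mean()
    return out

def guided_filter(I, p, r=2, eps=1e-2):
    """G_{r,eps}(p, I): filter input p using guide I."""
    mI, mp = box(I, r), box(p, r)
    a = (box(I * p, r) - mI * mp) / (box(I * I, r) - mI * mI + eps)
    b = mp - a * mI
    return box(a, r) * I + box(b, r)   # mean-filtered a_k and b_k

# Smooth a noisy step edge while using the clean step as the guide.
guide = np.repeat([[0.0] * 4 + [1.0] * 4], 8, axis=0)
noisy = guide + 0.1 * np.random.randn(8, 8)
smoothed = guided_filter(guide, noisy, r=2, eps=1e-2)
```

Because the guide has a sharp edge, the filter smooths the flat regions while keeping the step, which is exactly the edge-preserving behavior the fusion scheme relies on.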

Figure 3. Illustration of the guided filtering process. An image fusion algorithm with guided filtering is given in Figure 4 [30]. First of all, the source image I_n is decomposed into two scales by mean filtering, namely the base layer B_n and the detail layer D_n.
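The two-scale decomposition just described can be sketched as follows: mean filtering yields the base layer, and the detail layer is the residual, so the two layers reconstruct the source exactly. The function and parameter names are illustrative:

```python
import numpy as np

def two_scale_decompose(img, r=3):
    """Split an image into a base layer (local means) and a detail
    layer (residual), the first step of the GFF fusion scheme."""
    p = np.pad(img, r, mode='edge')
    base = np.empty_like(img, dtype=float)
    h, w = img.shape
    for i in range(h):
        for j in range(w):
            base[i, j] = p[i:i + 2 * r + 1, j:j + 2 * r + 1].mean()
    detail = img - base
    return base, detail

img = np.random.rand(12, 12)
B, D = two_scale_decompose(img)
```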

Figure 4. The flow diagram of the fusion algorithm.


Figure 5. The flow diagram of the proposed algorithm.



Figure 12. The ratio images produced by all denoising methods for Figure 8: (a) ratio image using the Lee filter; (b) BSS-SR; (c) SAR-BM3D; (d) CS-BSR; (e) PPB; (f) BWNNM; (g) DnCNN; (h) our method.

Figure 13. The ratio images produced by all denoising methods for Figure 9: (a) ratio image using the Lee filter; (b) BSS-SR; (c) SAR-BM3D; (d) CS-BSR; (e) PPB; (f) BWNNM; (g) DnCNN; (h) our method.




Figure 14. The ratio images produced by all denoising methods for Figure 10: (a) ratio image using the Lee filter; (b) BSS-SR; (c) SAR-BM3D; (d) CS-BSR; (e) PPB; (f) BWNNM; (g) DnCNN; (h) our method.


Table 2. The evaluation parameter values of all denoising methods for the tree SAR image.


Table 3. The evaluation parameter values of all denoising methods for the city SAR image.

Table 4. The evaluation parameter values of all denoising methods for the lake SAR image.