Based on the results presented in the previous section, we now consider the merits and drawbacks of the proposed edge detection method.
4.1. Images Affected by Synthesized Speckle
Table 1 shows that, for the Lena image perturbed by AWGN, the proposed method outperforms Canny’s edge detector in terms of the edges’ MSE for all values of the noise standard deviation. Figure 2 shows that, for the same image, Canny’s edge detector is not robust against noise, despite the Gaussian filter used in the first step of the algorithm: it introduces numerous false edges when applied directly to the noisy image. The proposed method misses some edges because of the smoothing effect of the selected denoising method. Visual inspection of Figure 3 reveals some false edges for the method proposed in [14], which do not appear for the proposed edge detection method. In this respect, the proposed method performs better than the edge detection method proposed in [14]. Similar remarks on the superiority of the proposed method over the direct application of Canny’s edge detector to natural images perturbed by AWGN follow from Table 2 and Figure 4 for the Boats image and from Table 3 and Figure 5 for the Barbara image.
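As a rough illustration of how an edges' MSE of the kind discussed above can be evaluated, the following Python sketch compares binary edge maps of a clean and a noisy synthetic step image. It assumes that the edges' MSE is the mean squared difference between the edge map of the noise-free image and that of the processed image. The gradient-threshold detector is only a simple stand-in (it is neither Canny's detector nor the proposed method), and the function names, the threshold and the noise level are hypothetical choices made for illustration.

```python
import numpy as np

def gradient_edges(img, thresh):
    """Stand-in edge detector (NOT Canny): threshold the gradient magnitude."""
    gy, gx = np.gradient(img.astype(float))
    return (np.hypot(gx, gy) > thresh).astype(float)

def edge_mse(edges_ref, edges_test):
    """Mean squared error between two binary edge maps of equal shape."""
    return float(np.mean((edges_ref - edges_test) ** 2))

rng = np.random.default_rng(0)

# Synthetic test image: a single vertical step edge (illustrative only).
clean = np.zeros((64, 64))
clean[:, 32:] = 100.0

# AWGN with an illustrative standard deviation of 30.
noisy = clean + rng.normal(0.0, 30.0, clean.shape)

ref = gradient_edges(clean, thresh=25.0)        # reference edge map
det = gradient_edges(noisy, thresh=25.0)        # detector applied to noisy input
mse_noisy = edge_mse(ref, det)
print("edges' MSE on the noisy image:", mse_noisy)
```

Applying the detector directly to the noisy image produces many false edges, which raises the MSE; a denoising step placed before the detector would be evaluated in the same way.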
We have additionally studied the behavior of the proposed edge detection method in the case of synthesized speckle noise. Figure 6 shows that the classical despeckling filters (Lee, Kuan and Frost) are not able to eliminate all the noise; this effect is more visible in homogeneous regions. Moreover, the classical despeckling filters blur textures and edges, erasing weak contours. Visually, the Frost filter gives the best results among the classical despeckling filters. The MBD algorithm performs better than the classical despeckling filters on textures and contours, but some noisy pixels remain in the homogeneous regions. The filter proposed here has the best performance, eliminating practically all the noise from the homogeneous regions while preserving textures and edges. The results shown in the last row of Figure 6 show the anisotropy of the proposed denoising method, which is a consequence of the improved directional selectivity of the HWT. As can be observed in the right column of Figure 6, the edge map obtained with the proposed despeckling method is the best. We have compared the proposed despeckling method quantitatively with state-of-the-art methods, in the case of synthesized speckle noise, in Table 4 and Table 5. The H-BM3D algorithm has the best PSNR results, followed closely by the SAR-BM3D algorithm (Table 5). The results obtained with the denoising method proposed here are good, outperforming those obtained with the other wavelet-based denoising methods: PPB, MAP-S, SA-WBMMAE, HWT-Bishrink, HWT-marginal ASTF, and the method in [48] (Table 4 and Table 5). Comparing the results of the HWT-Bishrink association with those of the proposed denoising method suggests that the two-stage idea and the proposed improvements of the bishrink filter are effective. Furthermore, our algorithm is faster than the SAR-BM3D and H-BM3D algorithms.
4.2. Real Remote Sensing Images
We continued by applying the proposed edge detection method to three real remote sensing images. In the case of the aerial image shown in Figure 7, speckle strongly affects the original image (L = 2). The first stage of the proposed denoising system is not able to remove the speckle completely, whereas the second stage does remove it completely. As in the case of synthesized images, Canny’s edge detector is not robust against speckle noise. Even after the application of the first stage of the proposed denoising system, the number of false edges remains high, whereas the complete proposed edge detection method is robust against noise: we do not observe false edges in the last image in Figure 7. To better appreciate the effects of the proposed despeckling method visually, Figure 8 presents zooms of the images in Figure 7. Indeed, the first denoising stage does not reject all the speckle noise, and the result of the complete denoising system is slightly oversmoothed. To assess the proposed despeckling system quantitatively, we computed the ENLs in Table 6. The first stage of our denoising system increases the ENL but is not able to reject the speckle noise entirely. The complete denoising system rejects the speckle completely and increases the ENL further. Hence, the proposed despeckling system performs well on aerial SAR images degraded by speckle, despite the small number of looks, but slightly oversmooths the input image. This over-smoothing prevents the generation of false edges but erases some weak edges.
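For readers unfamiliar with the ENL, the following Python sketch shows how it is commonly estimated on a homogeneous region, assuming the usual definition for intensity images (squared mean divided by variance). The simulated gamma-distributed speckle, the region size and the 2x2 averaging used as a crude smoothing stand-in are illustrative assumptions; they are not the data or the filter used in this paper.

```python
import numpy as np

def enl(region):
    """Equivalent number of looks of a homogeneous intensity region."""
    region = np.asarray(region, dtype=float)
    return float(region.mean() ** 2 / region.var())

rng = np.random.default_rng(0)

# Simulated L-look intensity speckle: gamma distributed with unit mean.
L = 2
speckle = rng.gamma(shape=L, scale=1.0 / L, size=(256, 256))
homogeneous = 50.0 * speckle            # constant reflectivity times speckle

# Crude smoothing stand-in: average non-overlapping 2x2 blocks.
averaged = homogeneous.reshape(128, 2, 128, 2).mean(axis=(1, 3))

enl_in = enl(homogeneous)               # close to L for simulated speckle
enl_avg = enl(averaged)                 # larger, since 4 samples are averaged
print(enl_in, enl_avg)
```

A higher ENL after filtering indicates stronger speckle suppression in homogeneous areas, but it says nothing about edge preservation, which is why the ENL is read here together with the edge maps.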
The multi-look SAR image shown in Figure 9 is the second example of a real remote sensing image treated with the proposed edge detection method. It is obtained by applying a multilooking procedure to Single Look Complex (SLC) SAR images acquired by satellites. The multilooking procedure reduces the speckle, but also reduces the spatial resolution of the SLC SAR image. An alternative to the multilooking procedure, which does not reduce the spatial resolution, is the despecklization of the SLC SAR image. A very interesting procedure for the despecklization of SLC SAR images is based on the estimation of the rugosity of the scene; this estimation can be performed by evaluating the Hurst exponent of the scene locally. Different semi-local Hurst exponent estimations can be done using the DTCWT [60,61,62]. The corresponding despeckling methods are reported in [63]. The authors of [63] estimated the Hurst parameter using low-resolution coefficients, which are relatively free of noise; the energy of the high-resolution coefficients, which are most affected by noise, is then corrected according to the estimated parameter. The performance of the estimators in [63] is analyzed in [64].
The third example of applying the proposed edge detection procedure to remote sensing images refers to SONAR images, more precisely to the SONAR image in Figure 9. The proposed edge detection method erases some weak edges as a consequence of the smoothing effect of the proposed denoising method, but the form of the wreck is easy to perceive in the last image in Figure 10, despite the reduced number of looks, L = 1, of the original image. Conversely, we cannot perceive the form of the wreck in the first two images in the right column of Figure 10. This shows the superiority of the proposed denoising method in comparison with the denoising method based on the association HWT-marginal ASTF.
4.3. Comparison with Modern Despeckling Methods
Deep learning has become a modern direction of research in remote sensing, but it is mostly limited to the evaluation of optical data [65]. The development of more powerful computing devices and the increase in data availability have led to substantial advances in machine learning (ML) methods. ML methods allow remote sensing systems to reach high performance in many complex tasks, e.g., despecklization [66,67,68,69,70,71,72,73,74,75,76,77], object detection, semantic segmentation or image classification. These advancements are due to the capability of Deep Neural Networks to learn suitable features from images automatically, in a data-driven approach, without manually setting the parameters of specific algorithms. Deep Neural Networks act as universal function approximators, using training data to learn a mapping between an input and the corresponding desired output.
In the following, we present some of the ML methods for despeckling discussed in two very recent references, starting with [65]. Inspired by the success of image denoising with a residual learning network architecture in the computer vision community [66], a residual learning Convolutional Neural Network (CNN) for SAR image despeckling, named SAR-CNN, was introduced in [67]. It is a 17-layer CNN that learns to subtract speckle components from noisy images in a homomorphic framework. As in the despeckling method proposed in this paper, the homomorphic approach is applied before and after feeding images to a denoising kernel, which in the case of [67] is the neural network itself. In this case, the multiplicative speckle noise is transformed into an additive form and can be recovered by residual learning, where the log-speckle noise is regarded as the residual. The input log-noisy image is mapped identically to a fusion layer via a shortcut connection and then added element-wise to the learned residual image to produce a log-clean image. The denoised image is then obtained by inverting the logarithm. The SAR-CNN is trained on simulated single-look SAR images. However, to ensure better fidelity to the actual statistics of the SAR signal and speckle, it is retrained on real SAR data using multi-looked images as approximate clean references.
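The homomorphic scheme described above (logarithm, additive-noise denoising kernel, logarithm inversion) can be sketched as follows in Python. This is a minimal illustration under stated assumptions: the 3x3 mean filter is a placeholder kernel standing in for either a neural network or a wavelet-domain filter, the function names and the small constant eps are hypothetical, and the correction of the non-zero mean of log-speckle is deliberately omitted.

```python
import numpy as np

def box_filter3(img):
    """Placeholder denoising kernel: 3x3 mean filter with edge replication."""
    p = np.pad(img, 1, mode="edge")
    h, w = img.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0

def homomorphic_despeckle(intensity, denoiser, eps=1e-6):
    """Log transform, additive-noise denoising, then logarithm inversion."""
    log_img = np.log(intensity + eps)   # multiplicative noise becomes additive
    log_clean = denoiser(log_img)       # any additive-noise denoiser fits here
    return np.exp(log_clean)            # back to the intensity domain (no bias correction)

rng = np.random.default_rng(0)
scene = np.full((64, 64), 10.0)                               # constant scene
noisy = scene * rng.gamma(shape=1.0, scale=1.0, size=scene.shape)  # single-look speckle
despeckled = homomorphic_despeckle(noisy, box_filter3)
print(noisy.var(), despeckled.var())
```

Passing the denoiser as an argument reflects the point made in the text: the homomorphic wrapper is the same whether the kernel is a CNN, as in [67], or a wavelet-domain filter.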
Wang et al. [68] proposed the ID-CNN for SAR image despeckling, which can directly learn denoised images via a component-wise division-residual layer with skip connections. Homomorphic processing is therefore avoided, but at the final stage the noisy image is divided by the learned noise to yield the clean image. This approach makes sense, considering the multiplicative nature of the noise. Of course, a pointwise ratio of images may easily produce outliers in the presence of estimated noise values close to zero. However, a tanh nonlinearity layer placed right before the output performs a soft thresholding, thus avoiding serious shortcomings.
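The effect of the tanh layer on the division residual can be seen in a small numerical sketch. The Python code below is not the ID-CNN; it only mimics its last step, assuming that the image is scaled to [0, 1] and that noise_est is the (hypothetical) noise map produced by the network. The guard eps against division by zero is an added assumption.

```python
import numpy as np

def division_residual_output(noisy, noise_est, eps=1e-12):
    """Divide the noisy image by the estimated noise, then squash with tanh."""
    ratio = noisy / np.maximum(noise_est, eps)   # may explode where noise_est ~ 0
    return np.tanh(ratio)                        # bounded output

noisy = np.array([0.5, 0.5])
noise_est = np.array([1.0, 1e-9])    # second value is a near-zero noise estimate

raw_ratio = noisy / noise_est        # the second entry is a huge outlier
out = division_residual_output(noisy, noise_est)
print(raw_ratio, out)
```

The raw ratio grows without bound as the estimated noise approaches zero, whereas the tanh output saturates at 1, so a single bad noise estimate cannot produce an arbitrarily large pixel value.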
In [69], Yue et al. proposed a novel deep neural network architecture specifically designed for SAR despeckling. It models both the speckle and the signal itself as random processes, to better account for the homogeneous/heterogeneous nature of the observed cell. Working in the log-domain, the pdf of the observed signal can be regarded as the result of a convolution between the pdfs of the clean signal (unknown) and the speckle. The authors of [69] used a CNN to extract image features and reconstruct a discrete RADAR Cross Section (RCS) pdf. The network is trained with a hybrid loss function, which measures the distance between the actual SAR image intensity pdf and the estimated one, derived from the convolution between the reconstructed RCS pdf and the prior speckle pdf.
The unique distribution of SAR intensity images was considered in [70]. A different loss function, which contains three terms computed between the true and the reconstructed image, is proposed in [70]. These terms are the common L2 loss, the L2 difference between the gradients of the two images, and the Kullback–Leibler divergence between the distributions of the two images. The three terms are designed to emphasize the spatial details, the identification of strong scatterers, and the speckle statistics, respectively.
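To make the structure of such a three-term loss concrete, the following Python sketch evaluates an L2 term, a gradient-difference term and a histogram-based Kullback-Leibler term for a pair of images. It is only an evaluation sketch under assumptions that are not taken from [70]: the weights w_grad and w_kl, the number of histogram bins, the smoothing constant eps and the direction of the divergence (true versus reconstructed) are all hypothetical, and a histogram is not differentiable, so actual training would need a differentiable surrogate.

```python
import numpy as np

def l2_term(x, y):
    """Mean squared difference between the two images."""
    return float(np.mean((x - y) ** 2))

def gradient_term(x, y):
    """Mean squared difference between the image gradients, summed over both axes."""
    return float(sum(np.mean((gx - gy) ** 2)
                     for gx, gy in zip(np.gradient(x), np.gradient(y))))

def kl_term(x, y, bins=32, eps=1e-8):
    """KL divergence between the intensity histograms of x (true) and y (reconstructed)."""
    lo, hi = min(x.min(), y.min()), max(x.max(), y.max())
    p, _ = np.histogram(x, bins=bins, range=(lo, hi))
    q, _ = np.histogram(y, bins=bins, range=(lo, hi))
    p = (p + eps) / np.sum(p + eps)      # smoothed, normalized histograms
    q = (q + eps) / np.sum(q + eps)
    return float(np.sum(p * np.log(p / q)))

def combined_loss(x, y, w_grad=1.0, w_kl=1.0):
    """Weighted sum of the three terms (weights are illustrative)."""
    return l2_term(x, y) + w_grad * gradient_term(x, y) + w_kl * kl_term(x, y)

rng = np.random.default_rng(0)
true_img = rng.gamma(shape=2.0, scale=1.0, size=(32, 32))
recon = true_img + rng.normal(0.0, 0.3, true_img.shape)
loss_same = combined_loss(true_img, true_img)
loss_diff = combined_loss(true_img, recon)
print(loss_same, loss_diff)
```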
In [71], the problem of despeckling was tackled using a time series of images. The authors utilized a multi-layer perceptron with several hidden layers to learn the non-linear intensity characteristics of training image patches.
Returning to single images instead of time series, the authors of [72] proposed a deep encoder–decoder CNN architecture that focuses on a weakness of CNNs, namely feature preservation. They modified U-Net [73] to accommodate the statistical features of speckle.
Another notable CNN approach was introduced in [74], where the authors used an NLM algorithm in which the weights for the pixel-wise similarity measures were assigned by a CNN. The network takes as input a patch extracted from the original-domain image and outputs a set of filter weights adapted to the local image content. Two types of CNN were conceived to implement this task: the first is a standard CNN with 12 convolutional layers, while the second is a 20-layer CNN that also includes two N3 layers to exploit image self-similarities. These layers associate with each input feature the set of its K nearest neighbors, which can be exploited in subsequent nonlocal processing steps. Both types of CNN are trained on synthetic data and on real multi-looked SAR images. The results of this approach, called CNN-NLM, are impressive, with feature preservation and speckle reduction being clearly observable.
One of the drawbacks of the aforementioned algorithms is the requirement of pairs of noise-free and noisy images for training. Often, these training data are simulated using optical images corrupted by multiplicative noise, which is of course not ideal for real SAR images. One elegant solution is the noise2noise framework [75], where the network only requires two noisy images of the same area. The authors of [75] proved that the network is able to learn a clean representation of the image, provided that the noise distributions of the two noisy images are independent and identical. This idea has been employed in SAR despeckling in [76]. The authors use multitemporal SAR images of the same area as the input to the noise2noise network. To mitigate the effect of the temporal change between the input SAR image pairs, the authors multiply the original loss function by a patch similarity term.
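The principle behind training on noisy targets can be checked numerically: under an L2 loss, the best constant prediction for a set of noisy targets is their mean, which approaches the clean value when the noise does not bias the mean. The Python sketch below illustrates only this statistical principle for a single pixel, not the network of [75] or [76]; the unit-mean gamma speckle, the sample size and the candidate grid are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

clean = 3.0                                  # unknown clean intensity of one pixel
# Independent single-look speckled observations with unit-mean gamma noise.
targets = clean * rng.gamma(shape=1.0, scale=1.0, size=200_000)

# Try constant predictions on a grid (step 0.1) and keep the one with lowest L2 loss.
candidates = np.linspace(0.5, 6.0, 56)
losses = np.array([np.mean((c - targets) ** 2) for c in candidates])
best = float(candidates[int(np.argmin(losses))])

print("L2-optimal prediction from noisy targets:", best)
print("mean of noisy targets:", targets.mean())
```

Although no clean target is ever used, the minimizer lands on the clean value, which is why independent noisy acquisitions of the same area can replace noise-free references.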
Some of the ML methods for SAR image despeckling mentioned above are also discussed in the second very recent publication, [77]. We find Table 2 in [77] very useful: it presents 31 relevant deep learning-based despeckling methods with their main features. Among these methods, we find the SAR-CNN, the ID-CNN and the CNN-NLM methods.
Based on [65] and [77], it can be observed that most ML-based despeckling methods employ CNN-based architectures with single images of the scene for training; they either output the clean image in an end-to-end fashion or propose residual-based techniques to learn the underlying noise model.
The despeckling method proposed in this paper is faster and requires fewer computational resources than the ML-based methods, due to the sparsity of the HWT (in conformity with property (A) mentioned at the end of Section 1.1). Moreover, it does not use any training and does not require any learning methodology.
With the availability of large archives of time series thanks to the Sentinel-1 mission, an interesting direction for ML-based despeckling is to exploit the temporal correlation of speckle characteristics. Acting in the space domain, these methods require the statistical model of the speckle. In contrast, the proposed despeckling method acts in the Hyperanalytic Wavelet Transform domain. Due to the statistical properties (D) and (C) of the wavelet coefficients, mentioned at the end of Section 1.1, the proposed method does not require knowledge of the statistical model of the speckle. Due to the statistical property (B) of the wavelet coefficients, mentioned at the end of Section 1.1, the proposed despeckling method already exploits the spatial correlation of the wavelet detail coefficients. We consider it a good idea to also exploit the temporal correlation of the wavelet detail coefficients, using the available archives of time series.
One critical issue of both the ML-based and the proposed despeckling methods is over-smoothing. Many of the CNN-based methods and the proposed despeckling method perform well in terms of speckle removal but are not able to preserve weak edges. This is quite problematic in the despeckling of high-resolution SAR images of urban areas, and for robust edge detection in particular.
Another problem in supervised deep learning-based despeckling techniques is the lack of ground truth data. This problem affects the validation of the proposed despeckling method as well, because some objective quality indicators, such as the PSNR or the SSIM, cannot be computed without a reference. A solution could be the use of optical images of the same scenes, but frequently such images are not accessible or have different parameters, such as size or contrast. In many studies on ML-based despecklization, this problem is more acute, because the training data set is built by corrupting optical images with multiplicative noise. This is far from realistic for despeckling applied to real SAR data. Therefore, despeckling in an unsupervised manner would be highly desirable and worthy of attention.