Speckle Noise Filtering in Side-Scan Sonar Images Based on the Tucker Tensor Decomposition

Real signals are usually contaminated with various types of noise. This phenomenon has a negative impact on the operation of systems that rely on signals processing. In this paper, we propose a tensor-based method for speckle noise reduction in the side-scan sonar images. The method is based on the Tucker decomposition with automatically determined ranks of factoring tensors. As verified experimentally, the proposed method shows very good results, outperforming other types of speckle-noise filters.


Introduction
Side-scan sonar allows for underwater object detection thanks to the measurement of the sent and reflected low-frequency waves. This way, obtained sonar images can be processed for object recognition, obstacle avoidance, and drone manoeuvring, to name a few. However, sonar images are heavily contaminated with unwanted signals with the most prominent being speckle noise. This type of noise arises from the coherent interference of backscattered waves due to the physical properties of the object surfaces as well as wave propagation. In this paper, we tackle the problem of sonar image filtering, which allows for noise removal while preserving useful information stored in the original signals. For this purpose, the tensor-based filtering method is proposed, which accounts well for the multi-dimensionality of the sonar images. To the best of our knowledge, this is the first work in which the Tucker-based tensor decomposition with automatic rank determination is proposed for filtering of the side-scan sonar images. The obtained results show that the proposed method outperforms many other filtering methods [1][2][3].
The rest of the paper is organized as follows. Section 2 describes the related works. Section 3 briefly highlights the physical properties of the speckle noise, its model, as well as various filter types for its reduction. Our proposed tensor-based denoising method is explained in Section 4. Experimental results with a discussion on results are described in Section 5. Finally, Section 6 concludes the paper.

Related Works
Speckle noise is an unwanted phenomenon encountered in many electronic systems. In this section, we provide an overview of the known methods that were proposed for filtering of this type of signal distortion. Many papers address the speckle noise problem encountered in the domains of medical ultrasound, radars and optical coherence tomography (OCT), often proposing state-of-the-art solutions to the problem [2][3][4][5].
The speckle noise reduction is a well-defined problem, and many methods have been developed to achieve a stable balance between the level of filtering and detail preservation. Apart from the well-known simple filters, the hybrid and wavelet filters have gained attention from the researchers.
In this respect, the system proposed by Yu and Acton employs a nonlinear anisotropic diffusion technique (SRAD) for removing multiplicative noise in imagery [4]. They used a partial differential equation approach which, due to the filter size and shape, allows for the generation of an image scale space without bias. SRAD, as an adaptive technique, not only preserves edges but also enhances them by inhibiting diffusion across edges.
Karthikeyan and Chandrasekar proposed a method combining the SRAD filter with the wavelet-based BayesShrink technique [5]. An input image is decomposed using the Discrete Wavelet Transform, followed by a nonlinear thresholding step. The data are then reconstructed by the Inverse Discrete Wavelet Transform. Their method achieved a PSNR metric value of over 70 dB, as measured on a test dataset.
On the other hand, in the paper by Vanithamani et al. [1], the Modified Hybrid Median Filter is presented. The method is a three-step ranking operation using a non-standard window. The system uses neighbors forming "X" and "+" shapes, overcoming the problem of edge degradation.
A system for speckle reduction for the OCT images was proposed by Adabi et al. [2]. It is based on a multi-layer perceptron neural network, which estimates parameters for the filtering module of the system. As reported, the achieved average parameter estimation is above 99%. The denoising, based on the Rayleigh distribution model, is conducted independently on the small blocks generated from the image. The final output is combined into one image, with simultaneous block artifacts reduction.
However, neither of the above solutions accounts for the multi-dimensionality of the input signal. A method proposed in this paper, which relies on signal representation and processing with tensors, accounts for this effect what leads to superior results, as discussed below.

Characteristics of the Speckle Noise and Its Filtering Methods
Speckle noise is one of the primary sources of visual noise in sonar, ultrasound or radar images [6,7]. It is mainly caused by the returning wave interference inside the transducer due to the roughness of the material surface in the wavelength scale. The scattered signal adds coherently producing patterns of constructive and destructive interference, visible as brighter or darker dots in an image. Speckle noise can be modeled as follows.
where g(m, n) denotes corrupted image matrix at spatial position (m, n); u(m, n) and η(m, n) stand for the multiplicative and additive component of of the noise, respectively; and f is the original image. Speckle noise filtration is mainly based on assumptions that the signal and the noise are statistically independent, and the sample mean and variance of a single pixel are equal to the mean and variance of the whole local area. The different methods developed to eliminate speckle noise can be divided into adaptive and non-adaptive ones. In theory, they are low pass filters removing high-frequency noise. However, the negative side effect of this process is degradation of useful image features such as edges, corners, and other high-frequency patterns [8]. The common filtering methods, as well as their properties, are outlined in the following sections.

Average Filter
The average filter calculates the grey level from the mean of all pixels in a kernel surrounding the center of the window and returns a value of this central position. The mean for m × n window region is determined as follows: where I denotes pixel intensity of an m × n patch, g is intensity of a noisy pixel, at grid coordinates {m} × {n} around the central pixel position (x, y).

Median Filter
In the median filter, the grey level for actual window position is calculated from the median value of all pixels in the surrounding kernel of m × n. Similar to the average filter, median filtering smooths the image reducing also noise. However, due to its nonlinearity, the median filter has better performance in edge preservation and impulse noise removal than the average filter. On the other hand, the median filter has higher computational cost as a result of a need for sorting values of at least half of the pixels in the kernel. The median filtering algorithm can be outlined as follows [9]: 1. Take an m × n kernel centered around a pixel (x, y). 2. Sort the intensity values of the pixels in the kernel into ascending order. 3. Select the middle value as the new value for the pixel (x, y).

Frost Filter
The Frost filter calculates the grey level for each pixel using an exponentially dumped convolution kernel that can adapt its parameters [4]. This is kind of a guided filter in which its local adjustment level is controlled by the adaptive filter coefficient calculated for each window.
The filter smooths image data without removing edges or sharp features in the images while minimizing the loss of valuable information. In homogeneous areas, speckles are removed using a low-pass filter. On the other hand, in the areas containing isolated point targets, the filter preserves the observed value. More precisely, the Frost filter can be defined as follows where K denotes dumping factor and I s is an intensity value of the center pixel in the kernel. Values (x, y) and (x p , y p ) indicate grid coordinates of the centre of the window and the pixel p, respectively. C s is a local statistic value used for adaptive computation of the filter coefficients.

Lee Filter
The Lee filter is another adaptive method based on calculation of a local statistics [10]. A low value of the variance of the kernel causes the algorithm to have a lower impact on the image as a low pass filter. This allows for detail preservation in both low and high contrast images. Formally, the Lee filter is defined as follows where I m and I cp denote mean intensity value of the kernel window and a value for the center pixel, respectively. The size of its filter window W is given as follows where σ 2 and ρ 2 are variances computed for an image and estimated for the additive noise, respectively.

Kuan Filter
The Kuan filter transforms the multiplicative noise model into an additive noise model and then applies the local linear minimum mean square error criteria [11]. This method is similar to the Lee filter except that the weighting function is defined as follows where C u denotes an estimated noise variance and C i is a variance of an image.

Enhanced Lee Filter
The Enhanced Lee method calculates the grey level for each pixel similar to the already described Lee filter but additionally using two threshold values [12]. These thresholds control the allowable local variance and determine filter performance. Namely, low pass filtering is active when the local variance coefficient is below a lower threshold. On the other hand, the all pass filter is applied when the coefficient is above the higher threshold. For the coefficient value between the thresholds, the output is calculated with a weighting function balancing between averaging and the identity operations, respectively.

Tensor-Based Speckle Noise Filtering
Tensors are mathematical objects which can be regarded as multidimensional arrays of data, in which each separate dimension corresponds to a different degree of freedom of a measurement. Such an approach provides tools which extend the classical matrix analysis, and which can take into account correlations hidden in data, to yield better results in various applications, such as filtering [9].

Multi-Dimensional Signals Filtering in the Tensor Framework
Tensors extend the notion of vectors and matrices into higher-dimensional objects [9,[13][14][15]. As discussed below, they allow for better representation and processing of the multidimensional signals and, in effect, also for better filtering.
A multidimensional measurement T can be expressed as a sum of pure signal tensor S and noise N caused by imperfections introduced during the measurement process as well as by other physical phenomena [15]. Hence, where T , S, N ∈ N 1 ×N 2 ×...×N P are Pth order tensors. The multidimensional filtering can based on the following tensor productT where filtered version of noisy tensor T is denoted asT , whereas F i is the ith mode filtering matrix.
In the above equation, the kth modal product T × k M of a tensor T ∈ N 1 ×N 2 ×...×N P and a matrix M ∈ Q×N k is used. The result is also a tensor S ∈ N 1 ×N 2 ×...N k−1 ×Q×N k+1 ×...×N P , whose elements are expressed as follows: As shown below, the filter matrices F i , called factors, can be obtained using the Tucker decomposition of tensors [16]. Thanks to the proper selection of the ranks of the tensor decomposition factors, decomposition usually well separates useful signal from noise, at the same time taking multidimensional characteristics of the signal into account. The decomposition procedure of the tensor T is done by calculation of an approximating tensorT that is close to the input tensor in terms of the Frobenius norm. Hence, a minimization function is defined as follows The concept of the Tucker decomposition of a 3D tensor is presented in Figure 1. Assuming that the approximating tensorT contains the same amount of useful information as the original tensor T , it can be expressed as followsT where Z ∈ R 1 ×R 2 ×...×R P is a core tensor and S i ∈ N i ×R i are the so-called mode matrices. Using algebraic operations, from Equation (13), the formula for the core tensor is obtained: Combining Equation (14) with Equations (12) and (13) yields The Tucker decomposition in Equation (15) reads that a tensor T is approximated by its projection onto space spanned by the matrices S k . To compute the series of S k matrices, the alternating method can be used [9,[17][18][19]. Finally, it can be shown that the factor matrices F i can be obtained from the tensor decomposition matrices S i , as follows [15] F The approximation in Equation (13) includes only the components conveying the majority of energy available in the signal. However, to ensure the best quality, the estimation of the proper ranks R 1 , R 2 , and R 3 of the mode matrices S i is necessary. Although fixed values can be used as a first approximation, in real dynamic systems with unpredictable noise value, the proper ranks need to be based on the signal content. To solve this problem, the method presented by Muti and Bourennane [14] is used. For this purpose, the minimum description length parameter (MDL) is computed for each dimension separately. For an argument i, 1 ≤ i ≤ N k , a value of MDL is given as follows where λ j denotes jth eigenvalue in the eigendecomposition and c k is a number of observations. Such an approach is also used in the presented filtering system.

The Tensor Filtering Algorithm
The Tucker tensor decomposition algorithm consecutively computes the dominating eigenspace spanned by the flattened tensor at each tensor dimension [9]. In each such a step, a minimum i min is sought as a last component number from eigenvalues with significant decomposition impact.
The proposed method utilizes a moving window pattern, which is common in image processing [6]. The kernel of size of w s × w s traverses image I from the upper-left to the bottom-right corner, respectively, with the step fixed as w s /4 to ensure area overlapping for better representation. Such movement of the kernel results in a number of patches that are collected into the input tensor T . Depending on the Close Neighbor Distance (CND) parameter, more or fewer windows from the close neighborhood are included. For CND = 1, which is the lowest possible value, there are nine windows combined into the input tensor. This procedure is outlined in Algorithm 1.
This way, the prepared tensor is Tucker decomposed with the help of the Higher Order Orthogonal Iteration (HOOI) algorithm [16]. In the result, the core tensor and the set of mode matrices are obtained. These, in the reverse signal synthesis process, are then used for signal filtering, as described in the previous section. This way approximated signal contains most important information with noise significantly attenuated due to properly adjusted rank of the tensor decomposition. In the next step, this way filtered signal patch is inserted to the output image X into a position corresponding to the actual position of the kernel in I. The process repeats itself until the end position is reached. Then each pixel in X is scaled by dividing its value by a number of its occurrences during the filtering process. The output image is then returned. The routine is implemented as shown in Algorithm 2. As alluded to previously, to determine proper ranks of the tensor, the MDL calculation process, as in Equation (17), is performed for all tensor dimensions. generate filtering tensor 6: if t r = ∅ then 7: for k = 1 : 3 do 8: calculate t r [k] for k-axis using MDL 9: end for 10: end if 11: reconstruct window Y with Tucker reconstruction and the ranks t r 12: add window Y to X 13: end for 14: end for 15: rescale window X 16: return X 17: end procedure

Experimental Results
The presented method was implemented in Python, using the numpy, scipy and skimage libraries. Additionally, for tensor decomposition, the TensorLy library was used [20]. The benchmark methods were implemented using the PyRadar library with small changes, allowing them to be run in the test environment. Experiments presented in this section were performed on a laptop computer equipped with 16 GM of RAM, 6-core processor i7-8750H with the 2.2 GHz base clock, and 64-bit Ubuntu 18.04 OS.
However, finding proper test sonar images is difficult. Therefore, for the quantitative evaluation, the synthetic sonar images were generated, which contain different types of objects. During the experiment, the set of five images was contaminated by the addition of speckle noise generated as follows where n is a uniform noise with the meanx = 0 and variance σ 2 = 0.0001, 0.0005, 0.001, 0.005, 0.01, 0.05, and 0.1. Matrices g and f denote corrupted image and original image, respectively. For better demonstration of filtering capabilities of all methods, the real sonar images were also used. Examples are presented in Figures 2 and 3 for the synthetic and and real side scan sonar images, respectively. The full code, as well as all generated images and plots, are available in an online repository [21]. The quantitative results were measured in terms of the Peak Signal to Noise Ratio (PSNR), Structural Similarity (SSIM), and Mean Squared Error (MSE) parameters calculated for each picture, respectively [22,23]. The MSE for an n × m image I and its noised approximation K is defined as follows: On the other hand, PSNR is a term for the ratio between a maximum possible pixel value I max and the value MSE of a noise corrupted signal. It is defined as: Finally, the SSIM denotes a way for a quality assessment based on the degradation of the structural information between two signals x and y, respectively, and is defined as follows: where µ represents an average and σ 2 the variance of a given image. σ xy is the covariance between x and y, and C 1 and C 2 are constants stabilising the division with small factors in the denominator.
(a) (b) Figure 3. Exemplary real side scan sonar images used as an input. (a) Sample image containing man-made circular objects (tyres); (b) Sample image containing man-made circular and box-shaped objects.
Edges of the objects were visually inspected to determine the impact of a filtering algorithm on their sharpness. To provide a benchmark for the proposed algorithm, the filters such as Mean, Median, Frost, Lee, Lee enhanced and Kuan were used. The window size and other input parameters for the benchmark filters were set to achieve a balance between speckle reduction and edge preservation.
In the presented experiments, the noise reduction, as well as the influence of various parameters on this process, were measured. This way, the obtained performance of the different denoising methods is presented in Table 1. P1 and P2 denote results for the proposed method with different values of the input parameters. Their exact values, for each P1 and P2, respectively, are presented in Table 2. The next proposed method, called PAuto, is the one endowed with the MDL algorithm for the automatic rank calculation. In this case, for every assembled tensor, a new set of ranks is estimated. The average execution time for each algorithm is presented in Table 3. On the other hand, the comparative results for all tested algorithms are collected in Figure 4. For each filtering method, a figure consisting of original, noised, filtered and difference image was prepared. The difference image was calculated as an absolute difference between the original and filtered data and visualizes advantages and disadvantages of each denoising method.
There are few visible trends between changes in the input parameters and achieved values of the MSE, PSNR, and SSIM, as well as in respect to the execution time. The resulting figures with plotted metrics against noise variance are presented in Figure 5.
It can be observed that increasing the CND parameter improves the SSIM metrics due to better averaging capabilities of the proposed filter with a larger dataset. The improvement is more significant for higher noise variance values. At the same time, a bigger value of this parameter increases the execution time and slightly worsens the PSNR. On the other hand, decreasing rank values of the Tucker decomposition reduces execution time and does not affect significantly achieved metrics values. As found experimentally, the filters achieve their best SSIM performance with window size = 32, 64 and window size = 8 for the PSNR, respectively. Performance of the methods for which their maximal value in the Tucker rank is lower than the window size, is correlated with the execution time and the noise variance level. Such correlation is not observed for filters with maximal value in the Tucker rank greater than the window size, however. The achieved values of the SSIM metrics for the proposed method usually are lower in the case of the selected benchmark methods. The difference is more significant for higher noise variance, which leads to a conclusion that the proposed method has lower filtering capabilities of structural distortions. The greater impact of structural component is visible in Figures 6 and 7. However, Figures 8-13 show higher edge degradation in the benchmark methods. 6.521 × 10 −2 6.557 × 10 −2 6.529 × 10 −2 6.579 × 10 −2 6.520 × 10 −2 6.571 × 10 −2 6.496 × 10 −2 6.561 × 10 −2 6.520 × 10 −2  Hence, the proposed method has better edge preservation capabilities than benchmark methods, at least as measured in the available test signals. Comparison between Figures 14 and 15 shows significantly less edge erosion for presented method. The comparison of results in real side-scan sonar images are presented in Figure 16.

Conclusions
In this paper, a novel filtering method for speckle noise removal in sonar images is presented. The method is based on the Tucker decomposition of tensors composed of local patches of the sonar images. The presented algorithm performs well against images corrupted with speckle noise characteristic of the broad variance range. It outperformed all used benchmark methods in terms of the PSNR and MSE measures. As experimentally observed, the second advantage of the proposed method is less degradation of the important features, such as edges. However, a disadvantage is the high computational cost, especially observed for smaller sized patches. On the other hand, the Python implementation leaves space for further improvements in this respect. Although the method was developed mainly for sonar image enhancement, it can be useful for other signals contaminated with speckle noise such as ultrasound, radar or the OCT images.