Abstract
Wavelet decomposition is pivotal for underwater image processing, known for its ability to analyse multi-scale image features in the frequency and spatial domains. In this paper, we propose a new biorthogonal cubic special spline wavelet (BCS-SW), based on the Cohen–Daubechies–Feauveau (CDF) wavelet construction method and the cubic special spline algorithm. BCS-SW offers better compact support, symmetry, and frequency domain characteristics. In addition, we propose a K-layer network (KLN) based on the BCS-SW for underwater image enhancement. The KLN performs a K-layer wavelet decomposition on underwater images to extract frequency domain features at multiple frequencies, and each decomposition layer has a convolution layer matched to its spatial size. This design ensures that the KLN can understand the spatial and frequency domain features of the image at the same time, providing richer features for reconstructing the enhanced image. The experimental results show that the proposed BCS-SW and KLN achieve better image enhancement than several existing algorithms.
MSC:
94A12; 68T07; 42C10
1. Introduction
Underwater imaging introduces unique challenges that necessitate the exploration of more advanced wavelet-based solutions. Underwater images are often compromised by complex environmental factors, including lighting conditions, water quality, and scattering, leading to blurred images and color distortion. Notably, there are significant differences in the attenuation rates of different wavelengths of light in water: longer wavelengths, such as red light, attenuate more rapidly than shorter wavelengths, like blue or green light. This discrepancy results in a pronounced color cast in underwater images, which typically exhibit a blue or green hue.
The underwater environment’s lighting, combined with the scattering effects of ambient light on plankton and suspended particles, exacerbates the blurring of underwater images. Hence, there is a pressing need for techniques capable of analysing the depth, details, and texture information of images to discern their detailed features accurately. Moreover, underwater images often feature complex backgrounds and biological elements. Adjacent regions within an image may exhibit significantly different feature information due to variations in the structure and physical location, while non-adjacent regions might share similar features [1,2]. This complexity demands a nuanced approach to underwater image analysis, highlighting the critical role of sophisticated wavelet-based methodologies in addressing these challenges.
Physics-based methods construct models using the physical and optical characteristics of images captured underwater [3]. These approaches examine the physical processes responsible for degradation, such as color distortion or scattering, and aim to correct them to enhance underwater images. However, a singular physics-based model may not encompass the diverse array of intricate physical and optical factors inherent in underwater environments. This limitation results in inadequate generalization and may yield outcomes characterized by either excessive or insufficient enhancement.
Wavelet analysis is a potent tool for signal processing, playing an instrumental role in underwater image enhancement. Wavelet decomposition is known for its ability to process image signals across multiple scales and to analyse signals in the frequency domain [4,5]. Traditional wavelets, such as Haar, have been applied across a variety of vision tasks for decomposing visual signals into single or multiple layers, aiming to reduce noise and enhance the contrast of these signals. However, when faced with the complexities inherent to underwater images, particularly in terms of multi-level decomposition and the capture of high-frequency details, the limitations of these traditional wavelets become apparent. This has led to a reevaluation of the trajectory of wavelet theory, with particular emphasis on the contributions of Mallat’s multi-resolution analysis [6]. This reassessment serves as a catalyst for rethinking the design of wavelet functions, thereby more effectively addressing the specific enhancement needs of underwater images. By refining wavelet functions to cater to the unique challenges posed by underwater environments, such as blurring, color cast, and detail loss, this approach seeks to develop more sophisticated and effective methods for underwater image enhancement.
Additionally, spline wavelets usually exhibit less of a ringing effect, which helps to prevent the introduction of unnecessary artifacts in underwater image enhancement tasks. This provides important inspiration for us to explore the combination of wavelet decomposition with deep learning-based image enhancement tasks. Figure 1 shows our decomposition of underwater images based on BCS-SW. After the underwater image is decomposed by BCS-SW, four sub-band images, ca, cd, ch, and cv, are obtained by conducting convolution and downsampling. Most deep neural network methods that incorporate wavelet analysis tend to introduce Haar wavelets into the network; we aim to explore the integration of more complex wavelets with deep neural networks to better combine the advantages of deep neural networks with wavelet analysis. Our contributions to this work can be summarized as follows:
1. We propose a new BCS-SW based on the CDF wavelet construction method and the cubic special spline algorithm. BCS-SW is a compactly supported, symmetric spline wavelet. Both BCS-SW and its corresponding dual wavelet are constructed for image decomposition and reconstruction, respectively, based on a multi-resolution analysis and two-scale equations. BCS-SW demonstrates superior performance in the frequency domain signal extraction, particularly for complex signals, providing a more versatile and efficient approach for interpreting local features and color degeneration in underwater images;
2. We propose a K-layer network (KLN) based on the BCS-SW for underwater image enhancement. Specifically, the KLN utilizes K-layer wavelet decomposition on underwater images to extract features across multiple frequency domains. Each layer of decomposition is paired with convolution-based layers, each tuned to a distinct scale. These convolution-based layers enhance the network’s comprehension of the images’ spatial characteristics, facilitating more stable frequency domain signals;
3. We devise qualitative and quantitative experiments on multiple underwater image datasets, showcasing the effectiveness of the BCS-SW and KLN in underwater image enhancement. Our work serves as an important reference for the integration of complex wavelet transforms into deep neural networks, demonstrating the potential of this approach.
Figure 1.
Visualization of the decomposed underwater original and ground truth images by the biorthogonal cubic special spline wavelet (BCS-SW). After the underwater image is decomposed by BCS-SW, ca is the approximate information of the original image, cd is the diagonal information of the original image, ch is the horizontal information of the original image, and cv is the vertical information of the original image.
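The sub-band split shown in Figure 1 amounts to separable filtering and downsampling along rows and then columns. The sketch below illustrates one decomposition level in NumPy; the Haar filter pair is used purely as an illustrative stand-in, since the actual BCS-SW filter coefficients are derived in Section 3.

```python
import numpy as np

def filter_down(x, f):
    """Circularly filter a 1-D signal with f, then keep every other sample."""
    y = np.zeros(len(x))
    for k, c in enumerate(f):
        y += c * np.roll(x, -k)
    return y[::2]

def dwt2_level(img, lo, hi):
    """One separable 2-D decomposition level: rows first, then columns."""
    L = np.apply_along_axis(filter_down, 1, img, lo)
    H = np.apply_along_axis(filter_down, 1, img, hi)
    ca = np.apply_along_axis(filter_down, 0, L, lo)  # approximate information
    ch = np.apply_along_axis(filter_down, 0, L, hi)  # horizontal information
    cv = np.apply_along_axis(filter_down, 0, H, lo)  # vertical information
    cd = np.apply_along_axis(filter_down, 0, H, hi)  # diagonal information
    return ca, ch, cv, cd

# Haar filters as stand-ins for the BCS-SW decomposition pair.
lo = np.array([1.0, 1.0]) / np.sqrt(2)
hi = np.array([1.0, -1.0]) / np.sqrt(2)
```

On a constant image, all detail sub-bands vanish and only ca carries energy, which matches the role of ca as the approximation band.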
2. Related Work
For spline wavelet algorithms, Khan et al. [7] pioneered the formulation of B-spline wavelet packets, alongside the creation of their corresponding dual wavelet packets, delving into the exploration of the inherent properties of B-spline wavelet packets. Cohen et al. [8] proposed the Cohen–Daubechies–Feauveau (CDF) wavelet family; this wavelet family uses the compactly supported spline function as the parent wavelet and has the properties of compact support and biorthogonality, which became an important milestone in the study of spline wavelets. Olkkonen et al. [9] introduced the shift-invariant gamma spline wavelet transform for the tree-structured subscale analysis of asymmetric signal waveforms and systems with asymmetric impulse responses. Building on this work, Tavakoli and Esmaeili [10] subsequently orchestrated the development of biorthogonal multiple knot B-spline (MKBS) scaling functions, along with the inception of multiple knot B-spline wavelet (MKBSW) basis functions, thereby enriching the repository of the tools available in wavelet analysis. Compared to Haar wavelets and Daubechies wavelets, MKBSW offers superior smoothness and continuity, which make them perform better in accurately approximating and analysing continuous signals.
For wavelet analysis applied to image processing, the advantages include its ability to provide multi-resolution analysis, efficient compression, and effective denoising capabilities. However, the disadvantages include higher computational complexity, issues with boundary effects, and difficulties in choosing the appropriate wavelet basis. The spline wavelet transform has been applied to various image processing tasks, such as image super-resolution and denoising, as proposed by Huang et al. [11] and Kang et al. [12]. A new spatially adaptive method for recovering noisy blurred images was proposed by Banham and Katsaggelos [13], which is particularly effective in producing crisp deconvolution while suppressing noise in the flat regions of the image. In the realm of underwater image enhancement, several physically based methods utilize the discrete wavelet transform (DWT) to decompose the image and process it in the frequency domain. For instance, Sree Sharmila et al. [14] proposed a novel image resolution enhancement method based on the combination of DWT and stabilized wavelet transform (SWT), employing the histogram shift shaping method for wavelet decomposition to enhance the contrast and resolution of the image. Singh et al. [15] employed a discrete wavelet transform-based interpolation technique for resolution enhancement. Ma and Oh [16] introduced a dual-stream network based on Haar wavelets [17], which effectively tackles color cast and enhances blurry details in underwater images. Huang et al. [11] introduced a convolutional neural network (CNN) approach based on wavelets, capable of ultra-resolving very low-resolution face images, as small as 16 × 16 pixels, to larger versions at multiple scaling factors (2×, 4×, 8×, and even 16×) within a unified framework.
For deep neural networks applied to underwater image processing, their capability primarily depends on the quality of the training dataset and the capacity of the network. Introducing wavelet analysis into deep neural networks can effectively accelerate the network’s processing of image features and enhance the network’s understanding of images. Due to the robustness and generalization capabilities of deep neural networks in underwater image enhancement tasks, a significant body of work has emerged focusing on leveraging these networks. Perez et al. [18] proposed employing CNNs for underwater image enhancement. They trained the CNN using image restoration techniques to achieve an end-to-end transformation model between a hazy image and its corresponding clear image. Wang et al. [19] introduced a CNN-based method named UIE-Net, which is trained on two tasks: color correction and haze removal. This unified training approach enables the network to learn powerful feature representations for both tasks simultaneously. To enhance the extraction of intrinsic features within local blocks, their learning framework incorporates a pixel perturbation strategy, significantly improving convergence speed and accuracy. Goodfellow et al. [20] introduced generative adversarial nets (GANs), presenting a novel methodology for training generative models. As research into GANs deepens, underwater image enhancement increasingly becomes a task of transforming between the underwater domain and the enhancement domain. Li et al. [21] proposed Water-GAN to generate a large training dataset comprising corresponding depth, in-air color images, and realistic underwater images. Water-GAN’s generator is responsible for synthesizing real and depth images into an underwater image, while Water-GAN’s discriminator classifies the real images from the synthesized ones. Moreover, Fabbri et al. [22] proposed UGAN, a GAN specifically designed to enhance the quality of underwater images.
The model’s design objective is to improve the visibility and clarity of underwater images through adversarial training, addressing the challenges posed by the absorption and scattering of light in underwater environments.
Although deep neural networks (DNNs) have demonstrated strong capabilities in underwater image processing, especially in image enhancement, object detection, and classification, there are some obvious limitations in their application. Underwater images are often heavily affected by noise and perturbations, and deep neural networks may exhibit some vulnerability when processing such data, especially if the network has not been trained on data containing high noise. We propose a new deep neural network, the K-layer network, for underwater image processing based on BCS-SW. In the experimental part, we also compare it with other cutting-edge underwater image processing techniques to show the advantage of the KLN over other methods.
3. Methodology
Most current research combining wavelets with deep neural networks uses Haar wavelets because of their orthogonality. Although Haar wavelets have the advantages of simple computation and easy implementation, they also have limitations, such as a lack of smoothness, which results in a loss of low-frequency information, and sensitivity to noise; they may not be as good as the spline wavelet transform in processing complex images and continuous signals. Hence, we propose a novel spline wavelet combined with deep neural networks to process underwater images.
Inspired by compactly supported spline wavelets [23], which provide a better approximation of various images and whose basis functions can be flexibly adjusted according to the needs of applications, we propose a new biorthogonal cubic special spline wavelet (BCS-SW) for underwater image processing in this work. Furthermore, we also propose a new deep neural network, the K-layer network (KLN), for underwater image enhancement based on BCS-SW.
3.1. A New Biorthogonal Cubic Special Spline Wavelet (BCS-SW)
3.1.1. Cubic Special Spline Algorithm
In our work, BCS-SW is based on the cubic special spline algorithm proposed by Chen and Cai [24]. They proposed a novel spline algorithm and provided various representations of cubic splines with different compact supports. In this paper, we selected one of the cubic splines with the smallest compact support, as shown in Equation (1), and on the basis of this spline, we derived a new class of spline wavelet algorithms following the CDF method of wavelet construction.
and is the cubic B-spline:
where is the unit step function
is obtained as a linear combination of normalized, shifted B-splines of the same order. Consequently, it can inherit nearly all the favourable properties of the B-spline, including analyticity, central symmetry, local support, and high-order smoothness. Moreover, it can directly interpolate the provided data without the need to solve coefficient equations, a capability that the B-spline lacks.
The Fourier transform expressions of :
The spline and the Fourier transform are separately plotted in Figure 2.
Figure 2.
Analysis of the cubic special spline . (a) The graph of the . (b) The graph of the Fourier transform .
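As a concrete reference for the building block used above, the centred cubic B-spline has a well-known piecewise-cubic closed form on its support [−2, 2]; the special spline of Equation (1) is a linear combination of shifted copies of it. A minimal NumPy sketch of this standard closed form:

```python
import numpy as np

def cubic_bspline(x):
    """Centred cubic B-spline: piecewise cubic, supported on [-2, 2]."""
    x = np.abs(np.asarray(x, dtype=float))
    inner = (4.0 - 6.0 * x ** 2 + 3.0 * x ** 3) / 6.0   # |x| <= 1
    outer = (2.0 - x) ** 3 / 6.0                        # 1 < |x| < 2
    return np.where(x <= 1.0, inner, np.where(x < 2.0, outer, 0.0))
```

The two pieces meet continuously at |x| = 1 with value 1/6, and shifted copies sum to one at every point (partition of unity), which is what makes B-splines convenient scaling functions.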
3.1.2. Constructing Biorthogonal Cubic Special Spline Wavelet (BCS-SW)
Chui [25] and Graps [26] have proved that the B-spline is the scaling function of the corresponding multi-resolution analysis. is formed by a linear combination of translations and dilations of the B-spline. Therefore, we can naturally deduce the following conclusion.
The subspaces are generated by binary dilation and integer translation, as follows:
where forms a general multi-resolution analysis (GMRA) in , called spline multi-resolution analysis. is the corresponding scaling function. According to the theory of wavelet construction, , as a scale function, can construct a new wavelet . Let be the dual scaling function of and be the dual wavelet of , then their corresponding low-pass filters are:
And high-pass filters are:
where are all integers, and are the lengths of and , respectively, and , . All coefficients are real coefficients.
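The two-scale relation underlying these filters can be checked numerically. As a hedged sketch: the cubic B-spline satisfies a refinement equation whose coefficients are 2^(−3) C(4, k), and iterating this refinement (the cascade algorithm) reproduces samples of the scaling function on a dyadic grid:

```python
import numpy as np
from math import comb

# Two-scale coefficients of the cubic B-spline: h_k = 2^{-3} * C(4, k),
# i.e. [1, 4, 6, 4, 1] / 8, which sum to 2 as a valid low-pass filter must.
h = np.array([comb(4, k) for k in range(5)]) / 8.0

def cascade(h, iters=6):
    """Iterate the refinement phi <- conv(upsample(phi), h) to approximate
    the scaling function on a dyadic grid of spacing 2**-iters."""
    phi = np.array([1.0])
    for _ in range(iters):
        up = np.zeros(2 * len(phi) - 1)
        up[::2] = phi           # dyadic upsampling: insert zeros
        phi = np.convolve(up, h)
    return phi
```

Because the B-spline integrates to one and is symmetric, the cascade output keeps a unit Riemann sum and remains symmetric at every iteration, which provides a quick sanity check on any candidate filter.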
We also construct a new class of compactly supported wavelets based on CDF. We are aware that wavelets with compact support exist as long as the two-scale sequence of the related scaling function is finite. In this paper, we set and as odd-length, with the support set symmetric about 0. The vanishing moment orders of and are N and , respectively, and they can also have the following representation:
where , are the polynomials of .
Let
and when , we also have:
where .
From the time domain expression of the two-scale equation corresponding to , the low-pass filter of in the frequency domain can be obtained as follows:
When , L takes different values; for example, when , we can obtain multiple corresponding . Substituting these values into Equations (11), (9), and (14) and taking the inverse Fourier transform as in Algorithm 1, we obtain multiple groups of the corresponding low-pass filter coefficients of the new biorthogonal spline wavelet, listed in Table 1. Considering the symmetry of the coefficients, we only give , as shown in Table 1. In practical applications, the corresponding odd coefficients can be selected symmetrically for image processing.
| Algorithm 1 BCS-SW Filter Algorithm |
| Input: L: sum of the vanishing moment orders of and by Equations (8) and (9), and ; : defined by Equation (11); : axis angle; n: integer, the subscript of the low-pass filter coefficient; Output: : the set of corresponding low-pass filter coefficients of ; : the set of corresponding low-pass filter coefficients of ;
|
Table 1.
The low-pass filter coefficients , of the BCS-SW and its dual wavelet.
Due to , from the data in Table 1, we can calculate the corresponding , the high-pass filter coefficients of and . The filter bank in the frequency domain is ; the decomposition and reconstruction processes use two different sets of filters. Decomposition uses and , while reconstruction uses a different pair of filters, and . Accordingly, we take and as the wavelet decomposition filters, and and as the wavelet synthesis filters.
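The defining property of such an analysis/synthesis pair is biorthogonality: the analysis lowpass filter and its dual are orthogonal across all even shifts. Since Table 1 is not reproduced here, the sketch below checks this condition for the classical CDF 5/3 spline pair, used purely as a stand-in for the BCS-SW coefficients:

```python
import numpy as np

# CDF 5/3 (LeGall) biorthogonal spline pair, as an illustrative stand-in.
h  = np.array([-1.0, 2.0, 6.0, 2.0, -1.0]) / 8.0  # analysis lowpass, centre index 2
ht = np.array([1.0, 2.0, 1.0]) / 2.0              # dual (synthesis) lowpass, centre index 1

def biorth_inner(k):
    """Compute sum_n h(n) * ht(n - 2k), treating both filters as centred at 0.

    Biorthogonality (under this normalization) requires the result to be 1
    at k = 0 and 0 at every other even shift."""
    total = 0.0
    for p in range(-2, 3):          # positions in the support of h
        q = p - 2 * k               # corresponding position inside ht
        if -1 <= q <= 1:
            total += h[p + 2] * ht[q + 1]
    return total
```

Running the same check on the Table 1 coefficients (with their own centring and normalization) is a direct way to validate a truncated filter pair before using it for decomposition and reconstruction.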
3.2. K-Layer Network
Compared to traditional wavelets, the BCS-SW combines the smoothness of the spline function with the localized accuracy of wavelet analysis [27]. In underwater image processing, owing to the influence of water and the scattering of light, images are often affected by noise and blur. BCS-SW better captures the smooth parts of an image, helping to reduce noise and blur, and it provides better frequency localization, which means it can better capture the local details and structural information of the image. BCS-SW also has good time–frequency localization characteristics and can adapt to the different scale and frequency features present in underwater images. In addition, its gradients are more accurate and less prone to vanishing, which is beneficial for deep neural networks.
In this section, we propose the K-layer network (KLN) based on BCS-SW. BCS-SW decomposition offers significant advantages for underwater image enhancement tasks: it separates high- and low-frequency information in images and can effectively remove or reduce noise caused by suspended particles in underwater images. In the KLN, the input image undergoes a K-layer decomposition, and the encoder has two branches: one branch uses wavelets for feature decomposition, and the other branch uses convolution for feature extraction. In the decoder, intermediate tensors of the same shape are concatenated and upsampled, and the output of each upsampling step is used as the input of the next, which speeds up the network’s processing of features.
Specifically, the decomposition and reconstruction filters of BCS-SW are truncated: letting , we select 7 coefficients symmetric about 0, and in the same way the length of is also taken to be 7; likewise, selects 21 coefficients symmetric about 0, and the length of is also 21.
As shown in Figure 3, the KLN decomposes the underwater image X based on BCS-SW to obtain four wavelet coefficients in each layer; for example, in the first layer, , , , and are obtained, where is the approximate information of the original image, is the diagonal information, is the horizontal information, and is the vertical information. In subsequent layers, the low-frequency coefficient of the previous layer is decomposed to obtain new wavelet coefficients following Equation (15):
where represents the low-frequency coefficient obtained in the jth layer. The th wavelet decomposition is carried out on the jth low-frequency coefficient. Specifically, the low-pass filter lod and the high-pass filter hid filter each row of and downsample at intervals, and then filter each column of the results and downsample at intervals, respectively, to obtain the th layer wavelet coefficients: , , , . We also use Equation (16) to represent this process.
where represents the wavelet decomposition based on BCS-SW. At the same time, the KLN will also perform K convolutions on the input underwater image X.
where , represents the features obtained in the kth convolution layer. Equation (17) states that is obtained by convolving , the convolution result of the previous layer.
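The recursion of Equations (15) and (16) — repeatedly decomposing only the low-frequency band — can be sketched as follows; Haar filters stand in for the BCS-SW pair to keep the example short:

```python
import numpy as np

def haar_dwt2(x):
    """One orthonormal Haar decomposition level (stand-in for BCS-SW)."""
    s2 = np.sqrt(2.0)
    a = (x[:, ::2] + x[:, 1::2]) / s2   # row lowpass + downsample
    d = (x[:, ::2] - x[:, 1::2]) / s2   # row highpass + downsample
    ca = (a[::2] + a[1::2]) / s2        # approximate information
    ch = (a[::2] - a[1::2]) / s2        # horizontal information
    cv = (d[::2] + d[1::2]) / s2        # vertical information
    cd = (d[::2] - d[1::2]) / s2        # diagonal information
    return ca, ch, cv, cd

def k_layer_decompose(x, K):
    """Decompose K times, always recursing on the low-frequency band."""
    details = []
    ca = x
    for _ in range(K):
        ca, ch, cv, cd = haar_dwt2(ca)
        details.append((ch, cv, cd))
    return ca, details
```

Each level halves the spatial size, which is why each decomposition layer in the KLN is paired with a convolution layer of matching spatial size; with an orthonormal stand-in, the total energy of the pyramid equals that of the input.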
Figure 3.
The structure of the K-layer network (KLN) based on BCS-SW. Our method overview involves training on a synthetic dataset. We obtain sub-band images with multiple frequency bands through the discrete wavelet transform (DWT). This process effectively decouples color cast and blurry details in underwater images.
The KLN reconstructs the enhanced underwater images as shown in Equation (18):
where represents reconstructing the enhanced image based on upsampling and convolution, and is the enhanced underwater image, i.e., the output of the KLN. We use the L1 loss, L2 loss, SSIM loss, and perceptual loss for reconstructing enhanced images. Although using more loss functions requires longer training, it helps to obtain good experimental results, as shown:
where is the ground truth enhanced image.
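The total objective is a weighted sum of the individual terms. A minimal NumPy sketch of the two pixel-wise terms is given below; the weights are illustrative assumptions, and the SSIM and perceptual terms (computed on image windows and pretrained features, respectively) would be added analogously in the full PyTorch training loop:

```python
import numpy as np

def l1_loss(pred, gt):
    """Mean absolute error between prediction and ground truth."""
    return float(np.mean(np.abs(pred - gt)))

def l2_loss(pred, gt):
    """Mean squared error between prediction and ground truth."""
    return float(np.mean((pred - gt) ** 2))

def combined_loss(pred, gt, w1=1.0, w2=1.0):
    """Weighted sum of the pixel losses; SSIM and perceptual terms would be
    added as further weighted summands in the actual objective."""
    return w1 * l1_loss(pred, gt) + w2 * l2_loss(pred, gt)
```

The loss is zero exactly when the enhanced image matches the ground truth, and each weight trades off sharp pixel fidelity (L1) against smooth penalization of large errors (L2).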
The KLN leverages the Adam optimization method, which is renowned for its efficiency and effectiveness in handling large datasets with high-dimensional parameter spaces. This method significantly enhances the convergence rates by adapting the learning rates based on the estimations of the first and second moments of the gradients, making it ideal for complex models like the KLN. In our implementation, the ‘concat’ command plays a crucial role, as it is employed to perform the concatenation step during the model’s layer fusion process. Specifically, this command facilitates the merging of features from different layers, which is vital for preserving and integrating diverse spatial and contextual information across the network. This technique not only enriches the model’s feature representation but also boosts its overall performance by enabling more comprehensive learning from multiple perspectives within the data.
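The concatenation step described above fuses features from the two encoder branches along the channel axis. A minimal sketch with assumed (N, C, H, W) tensor shapes (the channel counts are hypothetical, chosen only for illustration):

```python
import numpy as np

# Hypothetical feature maps from the wavelet and convolution branches,
# in the usual (batch, channels, height, width) layout.
wavelet_feats = np.zeros((1, 32, 16, 16))
conv_feats = np.zeros((1, 64, 16, 16))

# Channel-wise concatenation, analogous to torch.cat([...], dim=1):
# spatial dimensions must match; channel counts simply add.
fused = np.concatenate([wavelet_feats, conv_feats], axis=1)
```

Because only the channel axis grows, the fused tensor preserves spatial alignment between the two branches, which is what lets subsequent convolutions integrate frequency-domain and spatial features at the same locations.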
4. Experiments
4.1. Implementation Details
In this section, we will introduce the quantitative experiments and qualitative experiments of the BCS-SW and KLN. We verify the effectiveness of the BCS-SW and KLN on various underwater datasets. In particular, the KLN is implemented in PyTorch. Training is performed on an NVIDIA A100-PCIE-40GB GPU; we use the Adam optimizer with a batch size of 25 and a learning rate of . The training set, which includes 5000 images, is the combination of 800 images from UIEBD proposed by Li et al. [28], 3200 images from LSUI proposed by Peng et al. [29], and 1000 images from UIQS proposed by Liu et al. [30]. The test set includes UIEBD90 and UIQS. To evaluate the ability of the BCS-SW and KLN, we introduce the no-reference underwater image metrics UCIQE, proposed by Yang and Sowmya [31], and UIQM, proposed by Panetta et al. [32], and the full-reference metric PCQI, proposed by Wang et al. [33]. UCIQE evaluates the quality of underwater images by calculating their color intensity, saturation, and contrast; the larger the value, the better the quality of the underwater image. UIQM evaluates the quality of underwater images by calculating their brightness, contrast, and saturation; the larger the value, the better the quality of the underwater image. PCQI evaluates the quality of the enhanced image by calculating the contrast difference between the enhanced image and the ground truth image in a localized area; the larger the value, the more similar the enhanced image is to the ground truth image.
We first compared the performance of the BCS-SW, Haar, Bior3.5, and DB2 wavelets in underwater image denoising and enhancement tasks on the UIEBD and LSUI underwater image datasets, as shown in Section 4.2. Then, we compare the KLN with other underwater image enhancement methods, including UDCP proposed by Drews et al. [34], GDCP proposed by Peng et al. [35], Ucolor proposed by Li et al. [36], MLLE proposed by Zhang et al. [37], TOPAL proposed by Jiang et al. [38], WWPF proposed by Zhang et al. [39], U-shape proposed by Peng et al. [29], and UIEBD and UIQS proposed by Liu et al. [30], as shown in Section 4.3.
4.2. BCS-SW vs. Other Wavelets in Underwater Image Related Tasks
Existing wavelet-based deep learning network models predominantly utilize Haar wavelets. The newly proposed BCS-SW has better local properties in the frequency domain than Haar and can better capture local features in images. It can represent complex image structures and features more accurately and maintain image details and contours better during image reconstruction, with continuity and smoothness. This means that more continuous and smoother results can be produced during image reconstruction, in contrast to the abrupt transitions of the Haar transform, which can produce image reconstructions with jagged edges.
The process of reconstructing the image approximation information after a one-layer decomposition based on the BCS-SW is as follows: the low-frequency information, horizontal high-frequency information, vertical high-frequency information, and diagonal high-frequency information are obtained after the image wavelet decomposition. We decompose the first layer of the image with each wavelet, reconstruct the low-frequency information of the first-layer decomposition to obtain the general appearance of the original image, and compare these reconstructed appearances. As shown in Figure 4 and Table 2, the experimental outcomes demonstrate that the BCS-SW surpasses the Haar wavelet in terms of PSNR, proposed by Korhonen and You [40], and SSIM, proposed by Hore and Ziou [41], indicating a better performance in preserving the original image’s quality and structural integrity.
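PSNR, one of the two reference metrics used in this comparison, follows directly from the mean squared error. A minimal sketch, assuming 8-bit images with a peak value of 255:

```python
import numpy as np

def psnr(ref, test, peak=255.0):
    """Peak signal-to-noise ratio in dB for images with the given peak value."""
    mse = np.mean((np.asarray(ref, dtype=float) - np.asarray(test, dtype=float)) ** 2)
    if mse == 0:
        return float("inf")   # identical images: PSNR is unbounded
    return 10.0 * np.log10(peak ** 2 / mse)
```

For an MSE of 1 on 8-bit images, the value is 20·log10(255) ≈ 48.13 dB; higher values indicate reconstructions closer to the original, which is how Table 2 should be read.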
Figure 4.
The reconstructed images of the approximation information after a 1-layer decomposition using the BCS-SW transform and the Haar wavelet transform. (a–d) are the four original underwater images processed by the wavelet transforms. The first column is the original underwater images; the second column is the reconstructed images after the Haar wavelet transform; the third column is the reconstructed images after the BCS-SW transform.
Table 2.
The PSNR and SSIM of approximation information reconstruction after 1 layer decomposition using Haar and BCS-SW. The 1st best results are in bold. ↑: The higher, the better.
As shown in Figure 5 and Table 3, we add Gaussian noise to the underwater images and adopt the wavelet threshold denoising method. Firstly, the noisy image is decomposed into two layers by the wavelet, and the wavelet coefficients are thresholded; that is, coefficients greater than (or less than) a certain threshold are modified, and the original image is reconstructed from the processed coefficients using Haar, Bior3.5, DB2, and BCS-SW, respectively. Because of the BCS-SW’s representation capabilities, it usually produces better denoising results. The coefficients of the BCS-SW transform are generally easier to separate into signal and noise, and are therefore more suitable for removing noise from images. In Table 3, the comparison of PSNR and SSIM indicates that, compared to other wavelets, the images processed by the BCS-SW are the closest to their corresponding unaltered images captured on land.
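The thresholding step in this experiment can be sketched as standard soft thresholding of the detail coefficients; the threshold value itself is an assumption here, since in practice it is chosen from the estimated noise level:

```python
import numpy as np

def soft_threshold(coeffs, t):
    """Shrink wavelet coefficients toward zero by t; values whose magnitude
    is at most t (presumed noise) become exactly 0."""
    return np.sign(coeffs) * np.maximum(np.abs(coeffs) - t, 0.0)
```

After shrinking the detail coefficients of each decomposition layer this way, the inverse transform yields the denoised image; the better a wavelet separates signal from noise in its coefficients, the less signal this shrinkage removes.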
Figure 5.
The figures of four underwater images denoised by the BCS-SW, Haar, Bior3.5, and DB2 wavelets. (a–d) are the four underwater images: original, with Gaussian noise added, and denoised by the various wavelets. The first column is the original images, the second column is the images with Gaussian noise, and columns 3–6 are the denoised images using Haar, Bior3.5, DB2, and BCS-SW.
Table 3.
The PSNR and SSIM of the denoising results of four underwater images with a Gaussian noise intensity of 0.02 using different wavelets. The 1st best results are in bold. ↑: The higher, the better.
In this section, we also show that when the underwater image is decomposed over multiple wavelet layers, the signal decomposed with the spline wavelet has lower noise, and the partially reconstructed image is closer to the original underwater image.
4.3. KLN vs. Other Underwater Image Enhancement Algorithms
As shown in Figure 6, the test results on UIEBD90 demonstrate significant improvements in the background quality of the underwater images processed by the KLN. Particularly in the images in the first and third rows, a clearly visible change in background color can be observed when compared with other processing methods. More importantly, this enhancement in background color not only allows for the revelation of more details in the background but also makes the subjects in the images more prominent, with color restoration becoming more natural and realistic. The second row of images showcases scenes abundant with fish. After the background improvement by the KLN, the contrast between the background and the fish becomes more pronounced, making the target objects within the images more discernible. Additionally, the texture information of the fish is more clearly displayed in the images. As shown in Table 4, although the scores in PCQI, UIQM, and UCIQE are not the highest, the gap from the highest value is minimal. Other methods might perform optimally in certain metrics, for example, the Ucolor method achieves the highest score in UIQM, but its scores in UCIQE and PCQI are lower than our method’s scores. Furthermore, compared with other methods, the images processed by our method achieve the highest PSNR and SSIM values. Considering all factors, our model presents the best overall performance.
Figure 6.
The KLN vs. other underwater image enhancement methods on UIEBD. The PCQI of the KLN is the best, meaning that the processing method has achieved good results in improving the perceived color and quality of the underwater images.
Table 4.
The mean UIQM, UCIQE, PCQI, PSNR, and SSIM scores of different methods on UIEBD90. The best results are in bold. ↑: The higher, the better.
As shown in Figure 7, a detailed examination of the images in the first and second rows of the collection, especially the starfish at their centres, reveals a significant color difference. These starfish should display a full orange-red hue in their natural state, but after being processed by different methods, their color rendition varies. Compared to the more subdued and greyish-orange appearance of the starfish processed by other methods, those processed by the KLN exhibit a more vivid, rich, and natural orange-red color. This stark contrast not only illustrates the KLN’s remarkable capability in restoring and enhancing the red spectrum colors of underwater images but also highlights its superiority in color authenticity. Further analysis of the data in Table 5 shows that the KLN ranks just below WWPF in the UCIQE metric, demonstrating its strong performance in improving the overall color quality of the underwater images. Notably, the KLN’s performance in UIQM and PCQI is also very close to the best, validating its efficacy in maintaining image color saturation, contrast, and brightness, while also emphasizing its exceptional ability to preserve image details and texture. These comprehensive performances make the KLN one of the best choices among various underwater image enhancement technologies for overall effectiveness.
Figure 7.
The KLN vs. other underwater image enhancement methods on UIQS. The UCIQE of the KLN is the 2nd best, showing that the method has a good effect on improving the color quality of the underwater images.
Table 5.
The mean UIQM, UCIQE, and PCQI scores of different methods on UIQS. The 1st- and 2nd-best results are in bold and underlined, respectively. ↑: The higher, the better.
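For reference, UCIQE is defined by Yang and Sowmya as a weighted sum of the standard deviation of chroma, the contrast of luminance, and the mean saturation, with weights 0.4680, 0.2745, and 0.2576. The sketch below is a simplified, numpy-only approximation: the original metric works in CIELab, whereas here a luma/chroma proxy is used to avoid a color-space dependency, so the scores are only indicative.

```python
import numpy as np

# Coefficients from Yang and Sowmya's UCIQE definition.
C1, C2, C3 = 0.4680, 0.2745, 0.2576

def uciqe_sketch(img: np.ndarray) -> float:
    """Simplified UCIQE: img is an H x W x 3 RGB array with values in [0, 1]."""
    r, g, b = img[..., 0], img[..., 1], img[..., 2]
    luma = 0.299 * r + 0.587 * g + 0.114 * b              # luminance proxy
    chroma = np.sqrt((r - luma) ** 2 + (b - luma) ** 2)   # chroma proxy
    contrast = np.quantile(luma, 0.99) - np.quantile(luma, 0.01)
    maxc, minc = img.max(axis=-1), img.min(axis=-1)
    saturation = np.where(maxc > 0, (maxc - minc) / np.maximum(maxc, 1e-8), 0.0)
    return float(C1 * chroma.std() + C2 * contrast + C3 * saturation.mean())

# A uniform grey image has no chroma, contrast, or saturation, so it scores 0;
# a colorful image scores higher.
grey = np.full((16, 16, 3), 0.5)
colorful = np.random.default_rng(0).random((16, 16, 3))
print(uciqe_sketch(grey), uciqe_sketch(colorful))
```

This structure explains why the metric rewards exactly the qualities discussed above: vivid color (chroma spread and saturation) and a wide luminance range (contrast).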
When comparing our model with two other deep learning models, we observed significant differences in the number of parameters and in computational complexity (FLOPs), as shown in Table 6. Our model has the highest number of parameters (57.24 M) and FLOPs (207.8 G), indicating that its network structure is the most complex and suited to tasks that require synthesizing a large amount of global and local information. In contrast, the U-shaped model performs its task with fewer parameters (31.6 M) and FLOPs (26.11 G), which may be more effective in simple or real-time applications. The UIR model operates with an extremely low number of parameters (1.68 M) but relatively high FLOPs (36.44 G), reflecting its efficient use of parameters and making it suitable for environments where model size is constrained but processing capability is available.
Table 6.
The FLOPs (G) and total parameters of our method compared with other methods.
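The complexity figures in Table 6 can be understood from per-layer accounting: a convolution's parameter count depends only on its filter shape, while its FLOPs also scale with the output resolution. The sketch below shows the standard formulas; the layer shapes are illustrative, not the KLN's actual configuration.

```python
def conv2d_cost(c_in, c_out, k, h_out, w_out, bias=True):
    """Parameters and FLOPs of one k x k Conv2d producing a c_out x h_out x w_out output."""
    params = c_out * (c_in * k * k + (1 if bias else 0))
    flops = 2 * c_in * k * k * c_out * h_out * w_out  # x2: one multiply + one add per tap
    return params, flops

# Example: a 3x3 convolution from 64 to 128 channels on a 128x128 feature map.
p, f = conv2d_cost(64, 128, 3, 128, 128)
print(f"params: {p / 1e6:.2f} M, FLOPs: {f / 1e9:.2f} G")  # → params: 0.07 M, FLOPs: 2.42 G
```

In practice, whole-model totals like those in Table 6 are obtained by summing over all layers (in PyTorch, `sum(p.numel() for p in model.parameters())` for parameters) and by profilers that apply per-layer formulas like the one above; note that some tools report multiply-accumulates, i.e., half this FLOP count.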
5. Conclusions
In this work, we propose the BCS-SW and introduce the KLN model based on it. We substantiate the effectiveness of both the BCS-SW and the KLN through the theoretical analysis in Section 3. Further, in Section 4, we demonstrate that the BCS-SW surpasses other wavelets in image decomposition, denoising, and reconstruction, and we validate the effectiveness of the KLN for underwater image processing tasks. Our approach provides a practical methodology that can serve as a benchmark for integrating wavelet and deep learning techniques in underwater image enhancement and related work. However, our integration of wavelet processing into the network incurs a higher computational cost than other methods, which limits our ability to further expand the depth and breadth of the KLN to improve its performance. In future research, we will focus on incorporating wavelet analysis into deep neural networks more efficiently, and we aim to continue exploring strategies that address image processing challenges through the fusion of wavelet and deep learning technologies.
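The K-layer decomposition at the heart of the KLN can be illustrated with a minimal numpy sketch. It uses the classic Haar filters for brevity rather than the proposed BCS-SW, and the function names are illustrative: each level splits the current approximation into four half-resolution subbands and recurses on the low-pass LL band, which is what supplies the network with frequency features at each spatial scale.

```python
import numpy as np

def haar_dwt2(img: np.ndarray):
    """One level of the 2D Haar transform of an even-sized grayscale image."""
    lo = img[0::2, :] + img[1::2, :]         # row-pair low-pass
    hi = img[0::2, :] - img[1::2, :]         # row-pair high-pass
    ll = (lo[:, 0::2] + lo[:, 1::2]) / 2.0   # approximation
    lh = (lo[:, 0::2] - lo[:, 1::2]) / 2.0   # horizontal detail
    hl = (hi[:, 0::2] + hi[:, 1::2]) / 2.0   # vertical detail
    hh = (hi[:, 0::2] - hi[:, 1::2]) / 2.0   # diagonal detail
    return ll, lh, hl, hh

def k_layer_decompose(img: np.ndarray, k: int):
    """Recurse on the LL band k times, collecting the detail subbands per level."""
    details = []
    ll = img
    for _ in range(k):
        ll, lh, hl, hh = haar_dwt2(ll)
        details.append((lh, hl, hh))
    return ll, details

# A 3-level decomposition of an 8x8 image leaves a 1x1 approximation band.
ll, details = k_layer_decompose(np.ones((8, 8)), 3)
print(ll.shape, len(details))
```

Libraries such as PyWavelets (`pywt.wavedec2`) perform the same multilevel decomposition with arbitrary filter banks and can accept custom biorthogonal wavelets.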
Author Contributions
Conceptualization, D.Z. and Z.C.; methodology, D.Z. and Z.C.; software, D.Z.; validation, D.Z., Z.C., and D.H.; formal analysis, D.Z.; investigation, D.Z. and Z.C.; resources, D.Z.; data curation, D.Z. and D.H.; writing—original draft preparation, D.Z.; writing—review and editing, D.Z. and D.H.; visualization, D.Z. and D.H.; supervision, D.Z.; project administration, Z.C.; funding acquisition, D.Z. All authors have read and agreed to the published version of the manuscript.
Funding
This research was funded in part by the Philosophy and Social Sciences Planning Project, Zhuhai, 2023, grant number 2022ZDZX4061, and in part by the Undergraduate Universities Online Open Course Steering Project, Guangdong, 2022, grant number 2022ZXKC558.
Data Availability Statement
The datasets used in our manuscript can be obtained as follows: UIEB: https://li-chongyi.github.io/proj_benchmark.html (accessed on 28 November 2019); UIQS: https://github.com/dlut-dimt/Realworld-Underwater-Image-Enhancement-RUIE-Benchmark (accessed on 3 January 2020); LSUI: https://github.com/LintaoPeng/U-shape_Transformer_for_Underwater_Image_Enhancement (accessed on 18 May 2023).
Acknowledgments
The authors thank Zhuang Zhou and Fangli Sun for their careful review and advice.
Conflicts of Interest
The authors declare no conflicts of interest.
References
- Song, H.; Wang, R. Underwater Image Enhancement Based on Multi-Scale Fusion and Global Stretching of Dual-Model. Mathematics 2021, 9, 595. [Google Scholar] [CrossRef]
- Zhu, D. Underwater image enhancement based on the improved algorithm of dark channel. Mathematics 2023, 11, 1382. [Google Scholar] [CrossRef]
- Peng, Y.T.; Cosman, P.C. Underwater Image Restoration Based on Image Blurriness and Light Absorption. IEEE Trans. Image Process. 2017, 26, 1579–1594. [Google Scholar] [CrossRef] [PubMed]
- Wang, S.; Zhang, H.; Zhang, X.; Su, Y.; Wang, Z. Voiceprint Recognition under Cross-Scenario Conditions Using Perceptual Wavelet Packet Entropy-Guided Efficient-Channel-Attention–Res2Net–Time-Delay-Neural-Network Model. Mathematics 2023, 11, 4205. [Google Scholar] [CrossRef]
- Garai, S.; Paul, R.K.; Rakshit, D.; Yeasin, M.; Emam, W.; Tashkandy, Y.; Chesneau, C. Wavelets in combination with stochastic and machine learning models to predict agricultural prices. Mathematics 2023, 11, 2896. [Google Scholar] [CrossRef]
- Mallat, S.G. Multiresolution approximations and wavelet orthonormal bases of L2(R). Trans. Am. Math. Soc. 1989, 315, 69–87. [Google Scholar]
- Khan, S.; Ahmad, M.K. A study on B-spline wavelets and wavelet packets. Appl. Math. 2014, 5, 3001. [Google Scholar] [CrossRef]
- Cohen, A.; Daubechies, I.; Feauveau, J.C. Biorthogonal bases of compactly supported wavelets. Commun. Pure Appl. Math. 1992, 45, 485–560. [Google Scholar] [CrossRef]
- Olkkonen, H.; Olkkonen, J.T. Gamma splines and wavelets. J. Eng. 2013, 2013, 625364. [Google Scholar]
- Tavakoli, A.; Esmaeili, M. Construction of Dual Multiple Knot B-Spline Wavelets on Interval. Bull. Iran. Math. Soc. 2019, 45, 843–864. [Google Scholar] [CrossRef]
- Huang, H.; He, R.; Sun, Z.; Tan, T. Wavelet-srnet: A wavelet-based cnn for multi-scale face super resolution. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 1689–1697. [Google Scholar]
- Kang, E.; Chang, W.; Yoo, J.; Ye, J.C. Deep convolutional framelet denosing for low-dose CT via wavelet residual network. IEEE Trans. Med. Imaging 2018, 37, 1358–1369. [Google Scholar] [CrossRef] [PubMed]
- Banham, M.R.; Katsaggelos, A.K. Spatially adaptive wavelet-based multiscale image restoration. IEEE Trans. Image Process. 1996, 5, 619–634. [Google Scholar] [CrossRef] [PubMed]
- Sree Sharmila, T.; Ramar, K.; Sree Renga Raja, T. Impact of applying pre-processing techniques for improving classification accuracy. Signal Image Video Process. 2014, 8, 149–157. [Google Scholar] [CrossRef]
- Singh, S.R. Enhancement of contrast and resolution of gray scale and color images by wavelet decomposition and histogram shaping and shifting. In Proceedings of the 2014 International Conference on Medical Imaging, m-Health and Emerging Communication Systems (MedCom), Greater Noida, India, 7–8 November 2014; pp. 300–305. [Google Scholar]
- Ma, Z.; Oh, C. A Wavelet-Based Dual-Stream Network for Underwater Image Enhancement. In Proceedings of the ICASSP 2022—2022 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Singapore, 22–27 May 2022; pp. 2769–2773. [Google Scholar] [CrossRef]
- Haar, A. Zur Theorie der Orthogonalen Funktionensysteme; Georg-August-Universität: Göttingen, Germany, 1909. [Google Scholar]
- Perez, J.; Attanasio, A.C.; Nechyporenko, N.; Sanz, P.J. A deep learning approach for underwater image enhancement. In Proceedings of the Biomedical Applications Based on Natural and Artificial Computing: International Work-Conference on the Interplay between Natural and Artificial Computation, IWINAC 2017, Corunna, Spain, 19–23 June 2017; pp. 183–192. [Google Scholar]
- Wang, Y.; Zhang, J.; Cao, Y.; Wang, Z. A deep CNN method for underwater image enhancement. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 1382–1386. [Google Scholar]
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27, 2672–2680. [Google Scholar]
- Li, J.; Skinner, K.A.; Eustice, R.M.; Johnson-Roberson, M. WaterGAN: Unsupervised generative network to enable real-time color correction of monocular underwater images. IEEE Robot. Autom. Lett. 2017, 3, 387–394. [Google Scholar] [CrossRef]
- Fabbri, C.; Islam, M.J.; Sattar, J. Enhancing underwater imagery using generative adversarial networks. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; pp. 7159–7165. [Google Scholar]
- Chui, C.K.; Wang, J.Z. On compactly supported spline wavelets and a duality principle. Trans. Am. Math. Soc. 1992, 330, 903–915. [Google Scholar] [CrossRef]
- Chen, J.; Cai, Z. A New Class of Explicit Interpolatory Splines and Related Measurement Estimation. IEEE Trans. Signal Process. 2020, 68, 2799–2813. [Google Scholar] [CrossRef]
- Chui, C.K. An Introduction to Wavelets; Academic Press: Cambridge, MA, USA, 1992; Volume 1. [Google Scholar]
- Graps, A. An introduction to wavelets. IEEE Comput. Sci. Eng. 1995, 2, 50–61. [Google Scholar]
- Lamnii, A.; Nour, M.Y.; Zidna, A. A reverse non-stationary generalized B-splines subdivision scheme. Mathematics 2021, 9, 2628. [Google Scholar]
- Li, C.; Guo, C.; Ren, W.; Cong, R.; Hou, J.; Kwong, S.; Tao, D. An underwater image enhancement benchmark dataset and beyond. IEEE Trans. Image Process. 2019, 29, 4376–4389. [Google Scholar]
- Peng, L.; Zhu, C.; Bian, L. U-shape transformer for underwater image enhancement. IEEE Trans. Image Process. 2023, 32, 3066–3079. [Google Scholar] [CrossRef] [PubMed]
- Liu, R.; Fan, X.; Zhu, M.; Hou, M.; Luo, Z. Real-world underwater enhancement: Challenges, benchmarks, and solutions under natural light. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 4861–4875. [Google Scholar] [CrossRef]
- Yang, M.; Sowmya, A. An underwater color image quality evaluation metric. IEEE Trans. Image Process. 2015, 24, 6062–6071. [Google Scholar] [CrossRef] [PubMed]
- Panetta, K.; Gao, C.; Agaian, S. Human-visual-system-inspired underwater image quality measures. IEEE J. Ocean. Eng. 2015, 41, 541–551. [Google Scholar] [CrossRef]
- Wang, S.; Ma, K.; Yeganeh, H.; Wang, Z.; Lin, W. A patch-structure representation method for quality assessment of contrast changed images. IEEE Signal Process. Lett. 2015, 22, 2387–2390. [Google Scholar]
- Drews, P.; Nascimento, E.; Moraes, F.; Botelho, S.; Campos, M. Transmission estimation in underwater single images. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Sydney, Australia, 1–8 December 2013; pp. 825–830. [Google Scholar]
- Peng, Y.T.; Cao, K.; Cosman, P.C. Generalization of the dark channel prior for single image restoration. IEEE Trans. Image Process. 2018, 27, 2856–2868. [Google Scholar] [CrossRef] [PubMed]
- Li, C.; Anwar, S.; Hou, J.; Cong, R.; Guo, C.; Ren, W. Underwater image enhancement via medium transmission-guided multi-color space embedding. IEEE Trans. Image Process. 2021, 30, 4985–5000. [Google Scholar] [CrossRef] [PubMed]
- Zhang, W.; Zhuang, P.; Sun, H.H.; Li, G.; Kwong, S.; Li, C. Underwater image enhancement via minimal color loss and locally adaptive contrast enhancement. IEEE Trans. Image Process. 2022, 31, 3997–4010. [Google Scholar] [CrossRef] [PubMed]
- Jiang, Z.; Li, Z.; Yang, S.; Fan, X.; Liu, R. Target oriented perceptual adversarial fusion network for underwater image enhancement. IEEE Trans. Circuits Syst. Video Technol. 2022, 32, 6584–6598. [Google Scholar] [CrossRef]
- Zhang, W.; Zhou, L.; Zhuang, P.; Li, G.; Pan, X.; Zhao, W.; Li, C. Underwater image enhancement via weighted wavelet visual perception fusion. IEEE Trans. Circuits Syst. Video Technol. 2023, 34, 2469–2483. [Google Scholar] [CrossRef]
- Korhonen, J.; You, J. Peak signal-to-noise ratio revisited: Is simple beautiful? In Proceedings of the 2012 Fourth International Workshop on Quality of Multimedia Experience, Melbourne, Australia, 5–7 July 2012; pp. 37–38. [Google Scholar]
- Hore, A.; Ziou, D. Image quality metrics: PSNR vs. SSIM. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).