Article

LEPF-Net: Light Enhancement Pixel Fusion Network for Underwater Image Enhancement

1 Fujian Provincial Key Laboratory of Information Processing and Intelligent Control, College of Computer and Control Engineering, Minjiang University, Fuzhou 350121, China
2 College of Mathematics and Data Science, Minjiang University, Fuzhou 350108, China
3 School of Computer and Artificial Intelligence, Zhengzhou University, Zhengzhou 450001, China
4 New Engineering Industry College, Putian University, Putian 351100, China
5 Department of Automatic Control Technical, Polytechnic University of Catalonia, 08034 Barcelona, Spain
* Author to whom correspondence should be addressed.
J. Mar. Sci. Eng. 2023, 11(6), 1195; https://doi.org/10.3390/jmse11061195
Submission received: 20 April 2023 / Revised: 25 May 2023 / Accepted: 6 June 2023 / Published: 8 June 2023
(This article belongs to the Special Issue Underwater Acoustic Communication and Network)

Abstract

Underwater images often suffer from degradation due to scattering and absorption. With the development of artificial intelligence, fully supervised learning-based models have been widely adopted to solve this problem. However, the enhancement performance is susceptible to the quality of the reference images, which is especially pronounced in underwater image enhancement tasks because true ground truths are not available. In this paper, we propose a Light Enhancement Pixel Fusion Network (LEPF-Net) to solve this problem. Specifically, we first introduce a novel light enhancement block (LEB), based on the residual block (RB) and the light enhancement curve (LE-Curve), to correct the color cast of the images. The RB is adopted to learn and obtain the feature maps from an original input image, and the LE-Curve is used to correct the color cast of the learned images. To recover image details beyond the quality of the reference images, we develop a pixel fusion subnetwork (PF-SubNet) that adopts a pixel attention mechanism (PAM) to eliminate noise from the underwater image. The PAM adaptively allocates weights to feature maps from different levels, which enhances the visibility of severely degraded areas. The experimental results show that the proposed LEPF-Net outperforms most existing underwater image enhancement methods. Furthermore, on five classic no-reference image quality assessment (NRIQA) indicators, the enhanced images obtained by LEPF-Net are of higher quality than the ground truths of the UIEB dataset.

1. Introduction

Underwater image processing technology plays an important role in undersea operations [1,2], such as underwater docking and submarine cable inspection [3], which require clear underwater images to provide accurate visual information [4]. However, the imaging process of an underwater image is complicated due to the different decay rates of the various primary colors in water [5]. Generally, clear underwater images can be obtained by optical cameras, laser scanning, distance selectivity, and polarized light. However, due to technical and equipment limitations, existing approaches usually employ underwater image enhancement (UIE) algorithms to improve the visual quality of low-quality underwater images [6,7]. Therefore, studies on UIE technologies have far-reaching significance and broad application prospects for marine exploration.
In order to better promote the development of submarine engineering, Jaffe [8] and McGlamery [9] proposed the well-known Jaffe–McGlamery model to describe the underwater optical imaging process. In their model, the total radiant energy $E_T$ captured by the camera is the linear superposition of the direct transmission $E_d$, background-scattering $E_b$, and forward-scattering $E_f$ components. Its formula is as follows:
$$E_T = E_d + E_b + E_f \quad (1)$$
where $E_d$ and $E_b$ are further expressed as follows:
$$E_d = J_\lambda(x)\, t_\lambda(x) \quad (2)$$
$$E_b = B_\lambda \left(1 - t_\lambda(x)\right) \quad (3)$$
where $x$ is a pixel coordinate, $\lambda \in \{R, G, B\}$ denotes a color channel, $J_\lambda(x)$ is the uncontaminated image, $t_\lambda(x)$ is the transmission map of $J_\lambda(x)$, and $B_\lambda$ denotes the background light. Since the distance between the camera and the object is relatively short when shooting underwater scenes [10], $E_f$ can be neglected in Equation (1). Therefore, the contaminated image $I$ captured by the camera can be derived from Equations (1)–(3) and defined as follows:
$$I_\lambda(x) = J_\lambda(x)\, t_\lambda(x) + B_\lambda \left(1 - t_\lambda(x)\right) \quad (4)$$
Considering the existence of multiple unknown parameters in the Jaffe–McGlamery model [8,9], physical model-based methods usually exploit various prior parameters as constraints to determine the optimal solution of $J$. Since Equation (4) describes the imaging process of underwater images reasonably well, many researchers have conducted underwater image enhancement research based on it [11]. However, although image blurring can be alleviated to a certain extent based on this model, problems such as high noise, low definition, and color distortion remain. Therefore, further enhancement and refinement are necessary.
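To make the degradation process of Equation (4) concrete, the following minimal NumPy sketch applies it as a forward model to a clean image; the per-channel transmission values and background light used in the example are illustrative assumptions, not values from this paper.

```python
import numpy as np

def synthesize_underwater(J, t, B):
    # Simplified Jaffe-McGlamery model of Eq. (4):
    # I_lambda(x) = J_lambda(x) * t_lambda(x) + B_lambda * (1 - t_lambda(x))
    # J: (H, W, 3) clean image in [0, 1]
    # t: (H, W, 3) per-channel transmission map in [0, 1]
    # B: length-3 per-channel background light in [0, 1]
    B = np.asarray(B, dtype=np.float64).reshape(1, 1, 3)
    return J * t + B * (1.0 - t)

# Illustrative example: red attenuates fastest, producing the typical bluish cast.
H, W = 64, 64
J = np.random.rand(H, W, 3)
t = np.stack([np.full((H, W), 0.3),   # red
              np.full((H, W), 0.7),   # green
              np.full((H, W), 0.8)],  # blue
             axis=-1)
I = synthesize_underwater(J, t, B=[0.1, 0.5, 0.6])
```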
Recently, with the development of artificial intelligence, many researchers have adopted deep learning-based methods to perform underwater image enhancement [12,13]. Traditional end-to-end CNNs usually learn the mapping between the input and the reference, attempting to make the output as close to the reference as possible. Nevertheless, the quality of the reference images severely limits the performance of traditional CNN-based UIE models. Additionally, many existing UIE algorithms are trained on synthetic underwater image datasets, which differ significantly from real underwater images in visual characteristics. Therefore, the references limit the performance of UIE models. In the quantitative evaluation of image enhancement performance, no-reference image quality assessment methods do not require a reference image and can therefore give a more objective assessment [14,15].
To overcome these limitations, we propose an LEB that combines an RB [16] and the LE-Curve [17]. The LEB can be understood as a novel block structure that enables the simultaneous learning of the texture features and luminance information of an image. Texture features are acquired by the residual block, while luminance information is acquired by the LE-Curve [17]. The RB [16] typically consists of two convolutional layers. First, a new feature map is obtained by performing two convolutional operations on the input feature map. Then, the new feature map is fused with the input feature map, and the resulting fused feature map is used as the final output. This process enables better feature extraction from the input. On the other hand, the LE-Curve [17] is a pixel-level mapping method that can automatically map low-light images to their enhanced versions. This process can capture the color information of the input image in a reference-free manner. To ensure that the enhanced image details are superior to those of the reference image, we design the PF-SubNet. The main idea behind the PF-SubNet is to first apply adaptive weightings to different feature maps using the PAM [18]. Subsequently, the optimal feature fusion effect is obtained through linear superposition, thereby enlarging the model’s receptive field. The PAM [18] is an attention mechanism that focuses on densely shaded pixels and high-frequency regions of an image. As depicted in Figure 1, LEPF-Net outperforms other UIE methods.
The main contributions of this paper can be summarized as follows:
  • To overcome the limitation of traditional end-to-end image color restoration models, we propose an LEB that combines the LE-Curve with an RB. The LEB can not only learn the feature map but also obtain illumination information without a reference. Moreover, with sufficient training data, the recovery effect can even exceed that of the reference image. To the best of our knowledge, this is the first time that the LE-Curve has been utilized for underwater image processing tasks.
  • To achieve an image detail recovery capability beyond the reference image, we propose a PF-SubNet based on a PAM and a feature fusion strategy. The PF-SubNet first adaptively assigns corresponding weights to feature maps from different levels and then fuses them. This strategy not only makes the model pay more attention to hazy and locally high-frequency regions of the image, but also compensates for the information loss from shallow to deep layers.
  • Experiments on UIEB and NYU-v2 show that LEPF-Net outperforms most existing UIE methods. Additionally, the enhanced images obtained by LEPF-Net outperform the corresponding UIEB reference images on five different no-reference image quality assessment indicators. In addition, we performed comprehensive ablation experiments to validate the importance of each module designed in this method.
The rest of this paper is structured as follows: The related works are summarized in Section 2. The methodologies of LEPF-Net are described in Section 3. Experimental results and analysis are discussed in Section 4. The paper is concluded in Section 5.

2. Related Work

Generally, UIE algorithms can be broadly classified into three categories: physical, non-physical, and deep learning methods. Most physics-based models obtain a clear image by estimating the transmission map and background illumination. Xiao et al. presented a turbid UIE method based on parameter-tuned stochastic resonance (SR), along with the creation of a synthetic turbid underwater image dataset (UWCHIC) [19]. Lu et al. proposed an underwater image color restoration network (UICRN) based on the estimation of the main parameters of the underwater imaging model, as well as an underwater image generation method that combines inherent optical properties and apparent optical properties [20]. Zhuang et al. introduced a novel edge-preserving filtering retinex algorithm that relies on gradient-domain guided image filtering (GGF) priors and a retinex-based variational framework [21]. Zhuang et al. also proposed hyper-Laplacian reflectance priors based on a retinex variational model and developed an alternating minimization algorithm [22]. Akkaynak et al. proposed the Sea-thru method, which uses RGBD images, the darkest pixels in the image, and known range information to enhance the color of the image [23]. One of the most important problems in physics-based models is the mathematical description of the objects in the image [24]. Although these methods are effective to some extent, their performance depends on the estimation accuracy of the prior parameters.
Non-physical models directly adjust pixel intensities to obtain better image quality, e.g., Retinex [25] and histogram equalization [26]. Fu et al. proposed a real-time method for enhancing underwater images based on piece-wise linear transformation, which does not rely on prior knowledge of the imaging conditions [27]. Zhang et al. proposed a UIE algorithm utilizing extended multi-scale Retinex (Lab-MSR), which can substantially suppress the image halo phenomenon [28]. Singh et al. put forward a new UIE approach adopting exposure recursive histogram equalization (ERHE) for low-light underwater conditions [29]. Dai et al. proposed a dual-purpose image enhancement (DPIE) method based on Retinex, which was applied to both underwater and low-light images [30]. However, the above methods ignore the imaging properties of complex underwater environments, which causes their performance to degrade sharply.
Recently, data-driven deep learning methods have become a research hotspot in the computer vision community, as they can avoid the selection of various prior parameters. Convolutional Neural Networks (CNNs) are extensively applied to UIE tasks due to their powerful feature extraction capabilities. The CNN was first introduced into the UIE domain by Wang et al. [31], who trained an end-to-end transform model between underwater and restored images. Fu et al. proposed a UIE method that incorporates a two-branch network and compressed-histogram equalization; the former significantly reduces the learning difficulty, while the latter complements data-driven deep learning [32]. Zhang et al. proposed a novel WaterFormer network using a soft reconstruction sub-network based on the Jaffe–McGlamery model and a hard enhancement sub-network to restore underwater images [33]. Xue et al. proposed an underwater image enhancement method utilizing the U-Net network structure, full convolution, and a novel confidence estimation algorithm [34]. Tang et al. proposed a conditional generative adversarial network model based on the attention mechanism and the U-Net network structure for underwater image restoration [35]. Yang et al. put forward a lightweight adaptive feature fusion network (LAFFNet), which has a good ability to suppress artifacts and distortions [36].
In this paper, we propose a novel end-to-end network, LEPF-Net, for UIE applications. Different from the above methods, LEPF-Net implements color restoration directly, without reference images, via the LEB module. Meanwhile, we develop a new strategy leveraging the PAM and feature fusion to address the problems of image blur and noise pollution. Compared to the preceding networks, our network structure has a unique design. Extensive experiments in this paper testify that LEPF-Net attains better performance.

3. Proposed Method

Based on classical end-to-end CNNs, this paper proposes a light enhancement pixel fusion network for underwater image enhancement (named LEPF-Net). The LEPF-Net introduces an LE-Curve-based [17] LEB and a PAM-based PF-SubNet to recover the colors and details of underwater images. The structure of the proposed LEPF-Net is shown in Figure 2. In particular, the encoder is adopted to extract the initial feature map from an input raw image. Then, the extracted initial feature map and the raw image are fed into the LEB group to iteratively learn three refined feature maps and the curve parameter map $A$ required by the LE-Curve [17], respectively. It is noted that the curve parameter map $A$ is further fed into the illumination smoothness loss function to ensure that adjacent pixel values do not show significant differences in the enhanced image [17]. Next, the three refined feature maps extracted from different levels (marked as map1, map2, and map3) are adaptively fused by the PF-SubNet to obtain the final feature map. Since the PF-SubNet is based on a PAM, it can adaptively allocate different weights to different feature maps. Finally, the decoder is used to decode the final fused feature map and obtain the enhanced underwater image.

3.1. Light Enhancement Block

The proposed light enhancement block (LEB) is shown in the black dotted box at the top of Figure 2. Specifically, an LEB consists of a residual block (RB) [16] and a light enhancement curve (LE-Curve) [17], which are used to learn the refined feature map and the curve parameter map $A$ [17], respectively. In fact, the feature map captures the mapping relationship between underwater images and the ground truth, while the LE-Curve [17] is used to solve the problem of color cast during the mapping process. It is noted that the inputs of the RB [16] and the LE-Curve are the initial feature map extracted by the encoder and the raw underwater image, respectively. Additionally, the LE-Curve [17] is a single-parameter, differentiable, pixel-level high-order curve, which can be expressed as follows:
$$LE_n(x) = LE_{n-1}(x) + A_n(x)\, LE_{n-1}(x)\left(1 - LE_{n-1}(x)\right) \quad (5)$$
where $x$ denotes a pixel coordinate, $LE_n(x)$ is the high-order expression of the LE-Curve [17] at pixel $x$, and it is applied to every pixel in the image. $n$ is the number of iterations, which controls the curvature of the LE-Curve [17], and $A_n$ is the curve parameter map to be learned, which goes through $n$ iterations. In other words, it is the map of the $n$-th order curve parameters. It is noted that Equation (5) degenerates into the following two-parameter form when $n = 1$:
$$LE(I(x); \alpha) = I(x) + \alpha I(x)\left(1 - I(x)\right) \quad (6)$$
where $LE(I(x); \alpha)$ denotes the low-order expression of the LE-Curve [17] at pixel $x$, which has two parameters: the input image $I(x)$ and the curve parameter $\alpha \in [-1, 1]$. $\alpha$ is the low-order version of $A_n$ with $n = 1$; it controls the curvature of the LE-Curve [17] and the image exposure. Usually, the low-order LE-Curve [17] simply adjusts each pixel within the range [0, 1], but the high-order form can generate better results within the same range.
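For readers who prefer code, a minimal sketch of how Equation (5) is applied iteratively is given below; in LEPF-Net the parameter maps $A_n$ are predicted by the network, whereas here they are random placeholders used only to show the pixel-wise update.

```python
import torch

def le_curve_enhance(image, A_maps):
    # Iteratively apply the pixel-wise LE-Curve of Eq. (5):
    # LE_n(x) = LE_{n-1}(x) + A_n(x) * LE_{n-1}(x) * (1 - LE_{n-1}(x))
    # image : (B, 3, H, W) tensor in [0, 1], used as LE_0
    # A_maps: list of n parameter maps, each (B, 3, H, W) with values in [-1, 1]
    out = image
    for A_n in A_maps:
        out = out + A_n * out * (1.0 - out)
    return out

x = torch.rand(1, 3, 350, 350)                           # raw underwater image
A_maps = [torch.rand_like(x) * 2 - 1 for _ in range(7)]  # 7 iterations, as chosen in Section 4.4
enhanced = le_curve_enhance(x, A_maps)
```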
In order to understand the curve parameter map $A_n$ more intuitively, a visual example is provided in Figure 3. Figure 3b shows the value changes in different image regions within the range [−1, 1]. Figure 3c shows the final enhanced image obtained after pixel-level curve mapping by the proposed LEPF-Net. From Figure 3c, it can be found that the proposed method obviously improves the visual effect of the underwater image. In fact, a curve with a single parameter reduces the computational cost, thus improving the computational speed. Moreover, the curve is differentiable, so a CNN can be used to learn its adjustable parameters while ensuring smooth back-propagation of the network. This is a pixel-level curve, which means that each pixel has its own LE-Curve [17]. The high-order curve can be iterated multiple times for better performance.
Since the LE-Curve [17] has this high-order property, this paper also utilizes it to improve the performance of the LEB. In order to determine a reasonable number of iterations, we performed experiments with different iteration numbers and found that 7 iterations give the best results. The related experiments are explained in detail in the parameter discussion in Section 4. Additionally, considering that too many iterations make model training difficult, we use skip connections to ensure the stability of network training. Moreover, to prove the effectiveness of this strategy, the loss convergence, PSNR, and SSIM curves with/without skip connections are shown in Figure 4. According to Figure 4, the loss curve with residual learning converges faster than that without residual learning, and the PSNR and SSIM curves with residual learning likewise rise faster.
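As a concrete illustration of the residual learning used inside each LEB, a minimal PyTorch sketch of an RB with a skip connection is shown below; the placement of the IN and ReLU layers follows the description in Section 3.3, and the 32-channel width is one of the widths used by the LEBs, so treat this as an assumption rather than the released implementation.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    # Two 3x3 convolutions whose output is added back to the input (skip connection),
    # in the spirit of He et al. [16].
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.InstanceNorm2d(channels),
            nn.ReLU(inplace=True),
        )

    def forward(self, x):
        # The block learns a residual on top of the input feature map.
        return self.body(x) + x

rb = ResidualBlock(32)
y = rb(torch.rand(1, 32, 175, 175))
```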

3.2. Pixel Fusion SubNet

Generally, the fusion of feature maps from different levels can enlarge a model’s receptive field and enhance the performance of deep-learning-based models in different image enhancement tasks. Therefore, we propose a PF-SubNet to expand the model’s receptive field, so that it can perceive the blurry and high-frequency regions of an underwater image and improve the enhancement effect. The PF-SubNet consists of four convolutional layers and four ReLU activation functions, which are detailed in the following subsection.
The processing flow of the PF-SubNet includes two stages. In the first stage, the PAM is used to adaptively assign weights to the three feature maps extracted from different levels. Specifically, the PF-SubNet first concatenates the three different feature maps ($F_l$, $F_h$, $F_m$). This step is performed because many pixels have close weights in the UIE task. Then, the concatenated feature maps are fed into an adaptive average pool (AAP) to obtain a channel tensor. Finally, the channel tensor is fed into the PAM $P$ to obtain the three corresponding weights $M_l$, $M_h$, and $M_m$. The whole process can be simplified as follows:
$$(M_l, M_h, M_m) = P\left(AAP\left(\left[F_l, F_h, F_m\right]\right)\right) \quad (7)$$
In fact, for the high-frequency and dense-pixel regions of underwater images, the PAM assigns larger weights. For a more intuitive understanding, Figure 5 shows an example of a bluish underwater image and its pixel attention map. In particular, Figure 5c gives the corresponding weight change for each pixel of (b), which increases from left (white) to right (black). From Figure 5b, it can be found that the pixel weights of the image background are significantly larger than those of other image regions, such as the fish in the image foreground. In other words, the PAM assigns larger weights to the image background, which is consistent with the above property of the PAM.
In the second stage, the three different feature maps are fused by linear superposition with their corresponding assigned weights. It is worth noting that the PAM includes a channel attention mechanism, which assigns different weights to the different color channels of an image, and the exact assigned weights are determined by the PAM. The specific linear superposition of the three feature maps ($F_l$, $F_m$, and $F_h$) with their corresponding weights ($M_l$, $M_m$, and $M_h$) can be expressed as follows:
$$F_o = M_l F_l + M_m F_m + M_h F_h \quad (8)$$
where $F_o$ denotes the fused feature map; it is fed into the decoder to restore the initial image resolution and obtain the final enhanced image.
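A minimal sketch of the two-stage fusion of Equations (7) and (8) is given below. The paper specifies four Convs with ReLU and a 192-channel input and output for $P$ (see Section 3.3); the hidden widths, the 1 × 1 kernels, pooling the concatenation down to one value per channel, and omitting IN on the pooled tensor are our assumptions for illustration.

```python
import torch
import torch.nn as nn

class PFSubNet(nn.Module):
    # Stage 1 (Eq. 7): concatenate F_l, F_h, F_m, pool with AAP, and pass the channel
    # tensor through the attention network P to obtain the weights M_l, M_h, M_m.
    # Stage 2 (Eq. 8): fuse the maps by weighted linear superposition.
    def __init__(self, channels: int = 64):
        super().__init__()
        self.channels = channels
        self.pool = nn.AdaptiveAvgPool2d(1)          # AAP
        self.P = nn.Sequential(                      # four Convs with ReLU, 192 -> 192
            nn.Conv2d(3 * channels, 96, 1), nn.ReLU(inplace=True),
            nn.Conv2d(96, 96, 1), nn.ReLU(inplace=True),
            nn.Conv2d(96, 96, 1), nn.ReLU(inplace=True),
            nn.Conv2d(96, 3 * channels, 1), nn.ReLU(inplace=True),
        )

    def forward(self, f_l, f_h, f_m):
        cat = torch.cat([f_l, f_h, f_m], dim=1)      # (B, 192, H, W)
        w = self.P(self.pool(cat))                   # (B, 192, 1, 1)
        m_l, m_h, m_m = torch.split(w, self.channels, dim=1)
        return m_l * f_l + m_m * f_m + m_h * f_h     # F_o of Eq. (8)

pf = PFSubNet(64)
maps = [torch.rand(1, 64, 88, 88) for _ in range(3)]
fused = pf(*maps)
```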

3.3. Network Architecture

The overall structure of LEPF-Net can be divided into four parts: the encoder, the Light Enhancement Block Group (LEBG), the PF-SubNet, and the decoder. Their specific architectures are shown in Table 1. Specifically, the encoder consists of three convolutional layers (Convs), and each Conv is followed by an instance normalization (IN) layer and a ReLU activation function (ReLU). The four parameters of each Conv represent the input channels, output channels, convolutional kernel size, and stride, respectively. The three parameters of the output size in Table 1 denote the width, height, and channel number of the output feature map, respectively. The LEBG consists of seven LEBs. Each LEB includes four Convs, where the first two Convs are followed by an IN and ReLU layer, and the last Conv is followed only by a ReLU layer. Additionally, the first two Convs compose an RB and the last two Convs compose an LE-Curve. Since the LEBG uses skip connections, the remaining six LEBs differ only in the input and output channels of their last two Convs. Therefore, we only give the architecture of the first LEB in Table 1. Specifically, the #3 and #4 Convs of the second to fourth LEBs are Conv(32, 32, 3, 1) and Conv(32, 32, 1, 1). The #3 and #4 Convs of the fifth to sixth LEBs are Conv(64, 64, 3, 1) and Conv(64, 32, 1, 1), and the #3 and #4 Convs of the seventh LEB are Conv(64, 64, 3, 1) and Conv(64, 3, 1, 1). The PF-SubNet consists of four Convs. The first Conv uses an adaptive average pool (AAP) followed by a standard convolution, an IN layer, and a ReLU layer. The last three Convs are composed only of a standard convolution followed by a ReLU layer. Since the input of the PF-SubNet is the concatenation of three feature maps with 64 channels, the input channel number of the first Conv is set to 192. Similarly, since the output of the last Conv is used to obtain the fused feature map, the output channel number of the last Conv is also set to 192. The architecture of the decoder is symmetrical to that of the encoder, but its last Conv does not use IN or ReLU layers, because its output is the final enhanced image. Additionally, since the encoder performs down-sampling and the decoder performs up-sampling, the stride of its first Conv is symmetrically set to 2.
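The sketch below illustrates the Conv–IN–ReLU pattern of the encoder and the symmetric decoder described above; the channel widths, kernel sizes, strides, and the use of transposed convolutions for up-sampling are assumptions made only to produce a runnable example, since the exact values live in Table 1.

```python
import torch
import torch.nn as nn

def conv_in_relu(in_ch, out_ch, k, s):
    # One encoder stage: Conv followed by instance normalization and ReLU.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=k, stride=s, padding=k // 2),
        nn.InstanceNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

# Encoder: three Convs; the later ones down-sample (stride 2 is assumed).
encoder = nn.Sequential(
    conv_in_relu(3, 32, 7, 1),
    conv_in_relu(32, 64, 3, 2),
    conv_in_relu(64, 64, 3, 2),
)

# Decoder: symmetric to the encoder, up-sampling with stride-2 layers; the last
# Conv has no IN/ReLU because it outputs the final enhanced RGB image.
decoder = nn.Sequential(
    nn.ConvTranspose2d(64, 64, kernel_size=3, stride=2, padding=1, output_padding=1),
    nn.InstanceNorm2d(64), nn.ReLU(inplace=True),
    nn.ConvTranspose2d(64, 32, kernel_size=3, stride=2, padding=1, output_padding=1),
    nn.InstanceNorm2d(32), nn.ReLU(inplace=True),
    nn.Conv2d(32, 3, kernel_size=7, stride=1, padding=3),
)

x = torch.rand(1, 3, 352, 352)       # spatial size divisible by 4 for this sketch
print(decoder(encoder(x)).shape)     # torch.Size([1, 3, 352, 352])
```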

3.4. Loss Function

The total loss function of LEPF-Net is composed of two terms: the L1 loss ($L_1$) and the illumination smoothness loss ($L_{tv_A}$).
L1 loss ($L_1$): also generally known as the least absolute error, it measures the difference between the enhanced image $\hat{r}_i$ and the ground truth image $r_i$. The specific $L_1$ can be expressed as follows:
$$L_1 = \left\lVert \hat{r}_i - r_i \right\rVert_1 \quad (9)$$
Illumination smoothness loss ($L_{tv_A}$) [17]: in order to ensure that adjacent pixel values do not show significant differences in the enhanced image, we apply the illumination smoothness loss to each curve parameter map $A$, which can be expressed as follows:
$$L_{tv_A} = \frac{1}{N} \sum_{n=1}^{N} \sum_{c \in \xi} \left( \left| \nabla_x A_n^c \right| + \left| \nabla_y A_n^c \right| \right)^2, \quad \xi = \{R, G, B\} \quad (10)$$
where $N$ is the number of iterations, and $\nabla_x$ and $\nabla_y$ represent the horizontal and vertical gradient operations, respectively. The overall loss function is defined as follows:
$$L_{total} = W_1 L_1 + W_{tv_A} L_{tv_A} \quad (11)$$
where the weights $W_1$ and $W_{tv_A}$ measure the importance of the different loss terms. Moreover, according to the related experimental observations, we found that the L1 loss is more suitable for LEPF-Net than the Mean Square Error (MSE) loss. The specific experimental details are presented in Section 4.5.
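A sketch of the loss terms of Equations (9)–(11) is given below; the gradient operators are implemented as finite differences, and the reduction over pixels as well as the placeholder weight values are assumptions, since the paper does not state them here.

```python
import torch

def l1_loss(enhanced, reference):
    # L1 loss of Eq. (9)
    return torch.mean(torch.abs(enhanced - reference))

def illumination_smoothness_loss(A_maps):
    # Illumination smoothness (TV) loss of Eq. (10) over N curve parameter maps,
    # each of shape (B, 3, H, W).
    loss = 0.0
    for A in A_maps:
        grad_x = A[:, :, :, 1:] - A[:, :, :, :-1]   # horizontal gradient
        grad_y = A[:, :, 1:, :] - A[:, :, :-1, :]   # vertical gradient
        loss = loss + (grad_x.abs().mean() + grad_y.abs().mean()) ** 2
    return loss / len(A_maps)

def total_loss(enhanced, reference, A_maps, w1=1.0, wtv=1.0):
    # Weighted sum of Eq. (11); w1 and wtv are placeholder values.
    return w1 * l1_loss(enhanced, reference) + wtv * illumination_smoothness_loss(A_maps)
```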

4. Experiments

4.1. Datasets and Experimental Details

In this paper, we randomly selected 1200 pairs of synthetic underwater images from NYU-v2 [37] and 800 pairs of real underwater images from UIEB [38] for training. At the same time, we designed two datasets (named test900 and test90) for testing, of which test900 consists of 900 synthetic underwater images from NYU-v2 [37], and test90 consists of 90 real underwater images from UIEB [38]. It is worth noting that all training and testing images were resized to 350 × 350 in this paper. The proposed LEPF-Net was trained and tested with the PyTorch framework, and the network parameters were optimized using the ADAM optimizer. In the experiments, the learning rate was set to 10−4, the batch size was set to 4, and the number of iterations was set to 50. The experimental details of other methods are provided in Table 2.
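The training configuration reported above can be summarized in a few lines of PyTorch; the stand-in model and random tensors below merely take the place of LEPF-Net and the UIEB/NYU-v2 loaders, and interpreting the 50 reported iterations as 50 passes over the data is an assumption.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

model = nn.Conv2d(3, 3, kernel_size=3, padding=1)                # stand-in for LEPF-Net
pairs = TensorDataset(torch.rand(8, 3, 350, 350), torch.rand(8, 3, 350, 350))
loader = DataLoader(pairs, batch_size=4, shuffle=True)           # batch size 4, 350x350 pairs
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)        # ADAM, learning rate 1e-4

for epoch in range(50):                                          # 50 passes over the data
    for raw, reference in loader:
        loss = torch.mean(torch.abs(model(raw) - reference))     # L1 term only, for brevity
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```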

4.2. Experiments on Synthetic Images

In this paper, the enhancement performance of LEPF-Net is first evaluated on the synthetic underwater images of test900. The qualitative results are shown in Figure 6, and the quantitative results are provided in Table 3. Figure 6a presents the synthetic underwater images, and Figure 6l denotes the ground truths; Figure 6b–j show the enhancement results of the other algorithms used for comparison. It is worth noting that UWCNN [37] has 10 different pre-trained models, each trained on one of 10 types of images. In this paper, only the type-1 model is selected for the comparative experiments. The models in (b)–(f) achieve good results for one or two classes of images, but in most cases they cannot recover the contrast and content well. DUIENet (g) [38] achieves good performance in restoring color, but not in deblurring. UIEIFM (h) [39] achieves very limited detail reproduction, and, in general, the colors enhanced by this method show a severe color cast. UIE-WD (i) [40] can restore the image details to a certain extent, but its enhancement results are significantly darker. UIEC2-Net (j) [41] can effectively recover the color and details of the image, but its deblurring ability still requires improvement, as shown in the first row of images in Figure 6. In particular, as depicted by the part in the red box in the first row, none of the methods in (b)–(h) can remove the color bias. In comparison, UIE-WD (i) [40] and UIEC2-Net (j) [41] can effectively remove the color bias, but the former has a limited ability to recover gray tones, and the latter cannot provide good defogging performance. In contrast, our method generally provides better image recovery results than the other algorithms used for comparison.
We choose the Peak Signal-to-Noise Ratio (PSNR) and the Structural Similarity index (SSIM) as full-reference quality assessments of the processed synthetic underwater images for all compared methods. In addition to these evaluation metrics, several other reliable metrics exist, such as UCIQE [42]. A higher PSNR indicates that the result images are closer to the reference images in terms of image content, while higher SSIM scores mean that the result images are more similar to the reference images in terms of image structure and texture. Table 3 reports the quantitative results of the different methods on test900. We highlight the best performance in red and the second-best in blue. Notably, our PSNR is far higher than the second-best, and our SSIM is also clearly higher than those of the compared methods. This demonstrates that our proposed method achieves the best performance in terms of full-reference image quality assessment on synthetic underwater images.
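For reference, PSNR can be computed as below; SSIM is usually taken from an existing implementation such as scikit-image, and the argument names shown in the comment may differ across library versions.

```python
import numpy as np

def psnr(result, reference, peak=1.0):
    # Peak Signal-to-Noise Ratio between a result image and its reference (both in [0, peak]).
    mse = np.mean((result.astype(np.float64) - reference.astype(np.float64)) ** 2)
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)

# SSIM via scikit-image (illustrative; check the installed version for the exact signature):
# from skimage.metrics import structural_similarity as ssim
# score = ssim(result, reference, channel_axis=-1, data_range=1.0)
```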

4.3. Experiments on Real Images

We used test90 to evaluate the enhancement performances of the LEPF-Net and the other algorithms for comparison on real underwater images. The qualitative results are shown in Figure 7, Figure 8, Figure 9, Figure 10 and Figure 11, and the quantitative results are shown in Table 4.
As shown in Figure 7 and Figure 8, most underwater images depict a bluish-green color because blue light travels further than other light waves in water. In addition, as shown in Figure 9, Figure 10 and Figure 11, underwater images also suffer from low illuminance, a yellowish tone, and shallow-water effects. From Figure 7 to Figure 11, it can be seen that when UDCP (b) [43] and ULAP (c) [44] are used for image enhancement, the impact of color bias is aggravated. Most of the images enhanced by UWGAN (d) [45] show a bluish color, but these images appear reddish in the greenish underwater environment (Figure 10). The images enhanced by UGAN (e) [46] appear yellow-greenish in almost all underwater conditions, but this model performs well in low-illuminance scenes (Figure 9). UWCNN (f) [37] and UIEIFM (h) [39] have very poor enhancement performance in all color-biased environments. The images enhanced by DUIENet (g) [38] have low luminance, especially in the greenish environment (Figure 10) and the shallow water environment (Figure 11), where the low illuminance is more pronounced. Obvious grid artifacts appear in most results of UIE-WD (i) [40], as shown in Figure 8 and Figure 11. UIEC2-Net (j) [41] can effectively eliminate most of the effects of the underwater environment, but local overexposure occurs in low-illuminance conditions (images in rows 1 and 2 in Figure 9).
As shown in the red box in the second row of Figure 7b–i, none of the algorithms used for comparison can remove the color shift. For example, UIEC2-Net (j) [41] results in a warm tone, which is the opposite of the cool tone of the reference image. According to the red box in the second row of Figure 8, none of (b)–(h) can simultaneously remove the color washout and the color bias. UIE-WD (i) [40] manages to remove the color bias, but clear grid artifacts appear in the enhanced image. As shown in the red box in the second row of Figure 9, none of the methods in (b)–(h) can simultaneously boost luminance and remove the color shift. The results obtained by UIE-WD (i) [40] present significant grid artifacts, and the results of UIEC2-Net (j) [41] show partial overexposure. According to the part of the image in the red box in the second row of Figure 10, only UIEC2-Net (j) [41] and the method proposed in this paper perform relatively well. As shown by the red box in the second row of Figure 11, except for UIEC2-Net (j) [41] and our proposed method, none of the other methods can effectively remove the color bias.
Similarly, we also choose PSNR and SSIM to assess the recovered results on real-world underwater images. We calculate the metrics between the results of each method and the corresponding reference images; the quantitative results of the different methods on test90 are shown in Table 4. Although our method is slightly lower than the best in terms of SSIM, it achieves first place in terms of PSNR, with a clear margin over the second-best.
In summary, the proposed LEPF-Net can not only alleviate the problems of overexposure, image haze, and color cast, but also reduce grid artifacts. This is also consistent with the performance data in Table 4. In addition, we selected entropy (Entropy), the blind image quality index (BIQI) [47], the naturalness image quality evaluator (NIQE) [48], the average gradient (AG), and the NR quality measure for JPEG compressed images (JPEG) [49] as the no-reference quality assessment metrics. The no-reference quantitative results are presented in Table 5. To obtain a more intuitive picture of the differences between the methods in Table 5, we normalize the results of Table 5 to the range 0–5 and present them in a radar plot. The results are shown in Figure 12, where a larger yellow area indicates better enhancement performance. In Figure 12, UDCP (b) [43] achieves a high score only on the BIQI indicator, while it performs poorly on all other measures. ULAP (c) [44] scores poorly only on the NIQE [48], with relatively good performance on the other indicators. UWGAN (d) [45] has poor performance overall, scoring more than 3 points only on the BIQI [47] indicator. UGAN (e) [46] scores over 4 on Entropy, but close to zero on the NIQE [48] metric. UWCNN (f) [37] performs generally well on the Entropy metric, but poorly on the other metrics. The overall performance of DUIENet (g) [38] is poor, exceeding 2 points only on the Entropy metric. UIEIFM (h) [39] has the worst performance among all methods, scoring almost zero on all metrics. UIE-WD (i) [40] performs well overall, even achieving the highest score on the NIQE [48] indicator. The overall performance of UIEC2-Net (j) [41] is good, especially on the Entropy indicator, where it achieves an almost perfect score. Compared with the other methods, the proposed method scores best in terms of Entropy, BIQI [47], AG, and JPEG [49], even scoring higher than the reference images on all metrics, which proves that the proposed method is better at processing details and that our enhancement results outperform the reference images.

4.4. Discussion on Parameters

Since the proposed LEB can achieve better performance through multiple iterations, seven experiments with different iteration numbers were carried out to find the number of iterations leading to the best performance. The dataset and training parameters remained unchanged in these experiments. The initial number of LEBs was 1 and increased gradually with the iteration number. The experimental results are shown in Figure 13. In each experiment, the original image was first input into LEPF-Net to learn the curve parameter map $A$; then, the LE-Curve was used to enhance the image. When the iteration number is 1, the formula degenerates into the two-parameter form. From the results in the first row of Figure 13, it is clear that the images remain yellowish when the iteration number increases from 1 to 5, while the yellowish tone is effectively eliminated with 6 or 7 iterations. However, the enhancement results still show a certain bluish color with 6 iterations, whereas with 7 iterations the image is no longer bluish. Moreover, as shown in the red and blue boxes in the second row of Figure 13, the images obtained after one to six iterations have a more serious color cast and more artifacts, while the images obtained with seven iterations have fewer artifacts. Based on the above analysis of the experimental results, the iteration number is set to 7 in this paper.

4.5. Discussion of Loss Function

To demonstrate that the L1 + TV loss [17] is more suitable for LEPF-Net than the MSE + TV loss [17], we experimented with different combinations of the MSE, L1, and TV losses [17]. It is worth noting that using the TV loss [17] alone can only achieve color enhancement. The quantitative results are shown in Table 6. It can be observed that the scores of LEPF-Net in terms of the various performance metrics are higher with the L1 loss than with the MSE loss. To understand their differences more intuitively, we present the qualitative results and their corresponding histograms in Figure 14. Figure 14a is a yellowish underwater image. By comparing its histogram to that of (f), it can be found that they have the least similarity in the blue and green channels. Among Figure 14b–e, the histogram of (e) is the closest to that of (f). In addition, according to the parts in the red boxes in Figure 14, the yellowish color is effectively eliminated in (b), but it shows obvious overexposure. (c,d) show grid artifacts. In (e), not only is the yellowish color eliminated, but there is also no overexposure or grid artifact. In summary, based on both the quantitative and qualitative experimental results, the configuration of L1 + TV [17] is more suitable for LEPF-Net.

4.6. Ablation Experiment

To demonstrate the importance of the LEBG and PF-SubNet to LEPF-Net, we performed ablation experiments with/without the LEBG and with/without the PF-SubNet. The quantitative comparison results are shown in Table 7, which indicate that the final performance keeps improving as components are added. Remarkably, the most significant performance improvement is achieved by adding the PF-SubNet. To understand the differences more intuitively, we present the qualitative results and their corresponding histograms in Figure 15. Figure 15a is a bluish underwater image. By comparing its histogram to that of (f), it can be found that they show the least similarity in the blue channel. Among Figure 15b–e, the histogram of (e) is the closest to that of (f) in the blue channel. Moreover, from the red boxes in Figure 15, it can be seen that the bluish color is effectively eliminated in (b), but it shows an obvious local color cast and grid artifacts. In contrast, the local color cast and grid artifacts are alleviated to a certain extent in (c), but the improvement is very limited. In (d), the grid artifacts are effectively eliminated, but an obvious overall color cast remains. In (e), the grid artifacts and color cast are addressed in a relatively balanced manner. In summary, both the quantitative and qualitative experimental results verify the importance of the LEBG and PF-SubNet to LEPF-Net.

4.7. Potential Applications

To prove that the proposed model is also applicable to other visual tasks, we consider four applications: object detection, saliency detection, local key-point matching, and edge detection. Figure 16 shows the object detection results of YOLO-v4 [50] on four original underwater images and their corresponding enhanced images. In addition, we randomly selected 40 images from the UIEB test set to obtain mAP metrics. It is noteworthy that the mAP of the original images is 92.20%, while that of the enhanced images is 95.76%. It can be seen that YOLO-v4 [50] is able to detect more people and fish in the enhanced images. Figure 17 compares the SIFT feature matching on two pairs of original underwater images and their corresponding enhanced image pairs. It can be observed that the enhanced image pairs have more feature points and more accurate matching results than the original underwater image pairs. Figure 18 presents the saliency detection results of two original underwater images and their corresponding enhanced images. According to Figure 18, more salient information can be detected from the enhanced images. Figure 19 illustrates the edge detection results of two original underwater images and their corresponding enhanced images. According to Figure 19, the enhanced images have significantly more edge features than the original images. To sum up, the proposed LEPF-Net can enhance underwater images by providing better image quality and richer textures.

5. Conclusions

In this paper, we proposed LEPF-Net with an LEB and a PF-SubNet for UIE. Specifically, LEPF-Net employs the LEB, based on the RB and LE-Curve, to correct the image color cast, and the PF-SubNet with PAM to fuse feature maps extracted from different layers into a final fused feature map with more image details. To the best of our knowledge, LEPF-Net is the first to introduce the LE-Curve into UIE. To demonstrate the effectiveness of the proposed LEPF-Net, we evaluated it on both synthetic and real underwater images. Furthermore, to validate the positive effects of the proposed LEPF-Net on high-level computer vision tasks, we also conducted related experiments on object detection, key-point matching, saliency detection, and edge detection. A series of experimental results demonstrates that, despite its simple structure, LEPF-Net is superior to most existing underwater enhancement algorithms and is of great significance for other computer vision tasks. In the future, we will devote ourselves to exploring more effective strategies to reduce the model size and computational complexity to make it more suitable for practical applications.

Author Contributions

Conceptualization, methodology, software, validation, writing—original draft J.Y.; investigation, methodology, software, writing—original draft, writing—review and editing Y.W. and C.W.; resources, data curation, project administration, H.F.; writing—original draft preparation, writing—review and editing, J.H.; supervision, project administration, A.G. and C.W. All authors have read and agreed to the published version of the manuscript.

Funding

This work is supported by the National Natural Science Foundation of Fujian Province (2021J05202), Research Project of Fashu Foundation (MFK23006), the University-level Research Project of Minjiang University (MYK20023), Key Project of Colleges and Universities of Henan Province (23A52002) and Science and Technology Innovation 2030-“New Generation of Artificial Intelligence” Major Project (2021ZD0111000).

Data Availability Statement

Not applicable.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Mangeruga, M.; Bruno, F.; Cozza, M.; Agrafiotis, P.; Skarlatos, D. Guidelines for underwater image enhancement based on benchmarking of different methods. Remote Sens. 2018, 10, 1652. [Google Scholar] [CrossRef] [Green Version]
  2. Levy, D.; Peleg, A.; Pearl, N.; Rosenbaum, D.; Akkaynak, D.; Korman, S.; Treibitz, T. SeaThru-NeRF: Neural Radiance Fields in Scattering Media. arXiv 2023, arXiv:2304.07743. [Google Scholar]
  3. Bingham, B.; Foley, B.; Singh, H.; Camilli, R.; Delaporta, K.; Eustice, R.; Mallios, A.; Mindell, D.; Roman, C.; Sakellariou, D. Robotic tools for deep water archaeology: Surveying an ancient shipwreck with an autonomous underwater vehicle. J. Field Robot. 2010, 27, 702–717. [Google Scholar] [CrossRef] [Green Version]
  4. Dong, L.; Zhang, W.; Xu, W. Underwater image enhancement via integrated RGB and LAB color models. Signal Process. Image Commun. 2022, 104, 116684. [Google Scholar] [CrossRef]
  5. Liu, Y.; Xu, H.; Zhang, B.; Sun, K.; Yang, J.; Li, B.; Li, C.; Quan, X. Model-Based Underwater Image Simulation and Learning-Based Underwater Image Enhancement Method. Information 2022, 13, 187. [Google Scholar] [CrossRef]
  6. Lin, P.; Wang, Y.; Wang, G.; Yan, X.; Jiang, G.; Fu, X. Conditional generative adversarial network with dual-branch progressive generator for underwater image enhancement. Signal Process. Image Commun. 2022, 108, 116805. [Google Scholar] [CrossRef]
  7. Yan, X.; Wang, G.; Wang, G.; Wang, Y.; Fu, X. A novel biologically-inspired method for underwater image enhancement. Signal Process. Image Commun. 2022, 104, 116670. [Google Scholar] [CrossRef]
  8. Jaffe, J.S. Computer modeling and the design of optimal underwater imaging systems. IEEE J. Ocean. Eng. 1990, 15, 101–111. [Google Scholar] [CrossRef]
  9. McGlamery, B. A computer model for underwater camera systems. In Ocean Optics; SPIE: Bellingham, WA, USA, 1980; Volume 208, pp. 221–231. [Google Scholar]
  10. Uplavikar, P.M.; Wu, Z.; Wang, Z. All-in-One Underwater Image Enhancement Using Domain-Adversarial Learning. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, CVPR Workshops 2019, Long Beach, CA, USA, 16–20 June 2019; Computer Vision Foundation/IEEE: Piscataway, NJ, USA, 2019; pp. 1–8. [Google Scholar]
  11. Hu, K.; Weng, C.; Zhang, Y.; Jin, J.; Xia, Q. An Overview of Underwater Vision Enhancement: From Traditional Methods to Recent Deep Learning. J. Mar. Sci. Eng. 2022, 10, 241. [Google Scholar] [CrossRef]
  12. Wang, W.; Chen, Z.; Yuan, X. Simple low-light image enhancement based on Weber-Fechner law in logarithmic space. Signal Process. Image Commun. 2022, 106, 116742. [Google Scholar] [CrossRef]
  13. Li, X.; Lei, C.; Yu, H.; Feng, Y. Underwater image restoration by color compensation and color-line model. Signal Process. Image Commun. 2022, 101, 116569. [Google Scholar] [CrossRef]
  14. Lin, W.; Wu, Y.; Xu, L.; Chen, W.; Zhao, T.; Wei, H. No-reference quality assessment for low-light image enhancement: Subjective and objective methods. Displays 2023, 78, 102432. [Google Scholar] [CrossRef]
  15. Zheng, Y.; Chen, W.; Lin, R.; Zhao, T.; Le Callet, P. UIF: An objective quality assessment for underwater image enhancement. IEEE Trans. Image Process. 2022, 31, 5456–5468. [Google Scholar] [CrossRef] [PubMed]
  16. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  17. Li, C.; Guo, C.; Chen, C.L. Learning to Enhance Low-Light Image via Zero-Reference Deep Curve Estimation. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 44, 4225–4238. [Google Scholar] [CrossRef]
  18. Qin, X.; Wang, Z.; Bai, Y.; Xie, X.; Jia, H. FFA-Net: Feature fusion attention network for single image dehazing. Proc. AAAI Conf. Artif. Intell. 2020, 34, 11908–11915. [Google Scholar] [CrossRef]
  19. Xiao, F.; Yuan, F.; Huang, Y.; Cheng, E. Turbid Underwater Image Enhancement Based on Parameter-Tuned Stochastic Resonance. IEEE J. Ocean. Eng. 2022, 48, 127–146. [Google Scholar] [CrossRef]
  20. Lu, J.; Yuan, F.; Yang, W.; Cheng, E. An imaging information estimation network for underwater image color restoration. IEEE J. Ocean. Eng. 2021, 46, 1228–1239. [Google Scholar] [CrossRef]
  21. Zhuang, P.; Ding, X. Underwater image enhancement using an edge-preserving filtering retinex algorithm. Multimed. Tools Appl. 2020, 79, 17257–17277. [Google Scholar] [CrossRef]
  22. Zhuang, P.; Wu, J.; Porikli, F.; Li, C. Underwater image enhancement with hyper-laplacian reflectance priors. IEEE Trans. Image Process. 2022, 31, 5442–5455. [Google Scholar] [CrossRef]
  23. Akkaynak, D.; Treibitz, T. Sea-thru: A method for removing water from underwater images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 1682–1691. [Google Scholar]
  24. Andriyanov, N.A.; Gavrilina, Y.N. Image models and segmentation algorithms based on discrete doubly stochastic autoregressions with multiple roots of characteristic equations. In Proceedings of the CEUR Workshop Proceedings, Lausanne, Switzerland, 1–3 June 2018; Volume 2076, pp. 19–29. [Google Scholar]
  25. Land, E.H. The Retinex Theory of Color Vision. Sci. Am. 1977, 237, 108–129. [Google Scholar] [CrossRef]
  26. Zuiderveld, K. Contrast Limited Adaptive Histogram Equalization. Graph. Gems 1994, 474–485. [Google Scholar]
  27. Fu, X.; Fan, Z.; Ling, M.; Huang, Y.; Ding, X. Two-step approach for single underwater image enhancement. In Proceedings of the 2017 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), Xiamen, China, 6–9 November 2017; IEEE: Piscataway, NJ, USA, 2017; pp. 789–794. [Google Scholar]
  28. Zhang, S.; Wang, T.; Dong, J.; Yu, H. Underwater image enhancement via extended multi-scale Retinex. Neurocomputing 2017, 245, 1–9. [Google Scholar] [CrossRef] [Green Version]
  29. Singh, K.; Kapoor, R.; Sinha, S.K. Enhancement of low exposure images via recursive histogram equalization algorithms. Optik 2015, 126, 2619–2625. [Google Scholar] [CrossRef]
  30. Dai, C.; Lin, M.; Wang, J.; Hu, X. Dual-Purpose Method for Underwater and Low-Light Image Enhancement via Image Layer Separation. IEEE Access 2019, 7, 178685–178698. [Google Scholar] [CrossRef]
  31. Wang, Y.; Zhang, J.; Cao, Y.; Wang, Z. A deep CNN method for underwater image enhancement. In Proceedings of the 2017 IEEE International Conference on Image Processing (ICIP), Beijing, China, 17–20 September 2017; pp. 1382–1386. [Google Scholar]
  32. Fu, X.; Cao, X. Underwater image enhancement with global–local networks and compressed-histogram equalization. Signal Process. Image Commun. 2020, 86, 115892. [Google Scholar] [CrossRef]
  33. Zhang, Y.; Chen, D.; Zhang, Y.; Shen, M.; Zhao, W. A Two-Stage Network Based on Transformer and Physical Model for Single Underwater Image Enhancement. J. Mar. Sci. Eng. 2023, 11, 787. [Google Scholar] [CrossRef]
  34. Xue, T.; Zhang, T.; Zhang, J. Research on Underwater Image Restoration Technology Based on Multi-Domain Translation. J. Mar. Sci. Eng. 2023, 11, 674. [Google Scholar] [CrossRef]
  35. Tang, P.; Li, L.; Xue, Y.; Lv, M.; Jia, Z.; Ma, H. Real-World Underwater Image Enhancement Based on Attention U-Net. J. Mar. Sci. Eng. 2023, 11, 662. [Google Scholar] [CrossRef]
  36. Yang, H.H.; Huang, K.C.; Chen, W.T. Laffnet: A lightweight adaptive feature fusion network for underwater image enhancement. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; IEEE: Piscataway, NJ, USA, 2021; pp. 685–692. [Google Scholar]
  37. Anwar, S.; Li, C.; Porikli, F. Deep Underwater Image Enhancement. arXiv 2018, arXiv:1807.03528. [Google Scholar]
  38. Li, C.; Guo, C.; Ren, W.; Cong, R.; Hou, J.; Kwong, S.; Tao, D. An Underwater Image Enhancement Benchmark Dataset and Beyond. IEEE Trans. Image Process. 2019, 29, 4376–4389. [Google Scholar] [CrossRef] [Green Version]
  39. Chen, X.; Zhang, P.; Quan, L.; Yi, C.; Lu, C. Underwater Image Enhancement based on Deep Learning and Image Formation Model. arXiv 2021, arXiv:2101.00991. [Google Scholar]
  40. Ma, Z.; Oh, C. A Wavelet-Based Dual-Stream Network for Underwater Image Enhancement. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022, Singapore, 23–27 May 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 2769–2773. [Google Scholar]
  41. Wang, Y.; Guo, J.; Gao, H.; Yue, H. UIEC2-Net: CNN-based underwater image enhancement using two color space. Signal Process. Image Commun. 2021, 96, 116250. [Google Scholar] [CrossRef]
  42. Yang, M.; Sowmya, A. An underwater color image quality evaluation metric. IEEE Trans. Image Process. 2015, 24, 6062–6071. [Google Scholar] [CrossRef] [PubMed]
  43. Drews, P.L.; Nascimento, E.R.; Botelho, S.S.; Campos, M.F.M. Underwater depth estimation and image restoration based on single images. IEEE Comput. Graph. Appl. 2016, 36, 24–35. [Google Scholar] [CrossRef]
  44. Song, W.; Wang, Y.; Huang, D.; Tjondronegoro, D. A Rapid Scene Depth Estimation Model Based on Underwater Light Attenuation Prior for Underwater Image Restoration. In Proceedings of the Advances in Multimedia Information Processing—PCM 2018—19th Pacific-Rim Conference on Multimedia, Hefei, China, 21–22 September 2018; Proceedings, Part I; Lecture Notes in Computer Science. Hong, R., Cheng, W., Yamasaki, T., Wang, M., Ngo, C., Eds.; Springer: Berlin/Heidelberg, Germany, 2018; Volume 11164, pp. 678–688. [Google Scholar]
  45. Wang, N.; Zhou, Y.; Han, F.; Zhu, H.; Zheng, Y. UWGAN: Underwater GAN for Real-world Underwater Color Restoration and Dehazing. arXiv 2019, arXiv:1912.10269. [Google Scholar]
  46. Fabbri, C.; Jahidul Islam, M.; Sattar, J. Enhancing Underwater Imagery using Generative Adversarial Networks. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018. [Google Scholar]
  47. Moorthy, A.K.; Bovik, A.C. A Two-Step Framework for Constructing Blind Image Quality Indices. IEEE Signal Process. Lett. 2010, 17, 513–516. [Google Scholar] [CrossRef]
  48. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a ‘Completely Blind’ Image Quality Analyzer. IEEE Signal Process. Lett. 2013, 20, 209–212. [Google Scholar] [CrossRef]
  49. Wang, Z.; Sheikh, H.R.; Bovik, A.C. No-reference perceptual quality assessment of JPEG compressed images. In Proceedings of the International Conference on Image Processing, New York, NY, USA, 22–25 September 2002; IEEE: Piscataway, NJ, USA, 2002; Volume 1, p. I. [Google Scholar]
  50. Wang, C.Y.; Bochkovskiy, A.; Liao, H.Y.M. Scaled-YOLOv4: Scaling Cross Stage Partial Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 13029–13038. [Google Scholar]
Figure 1. Comparison of enhanced results obtained by different UIE methods. The proposed LEPF-Net improves brightness, color, contrast, and detail recovery compared to other methods.
Figure 2. The network structure of LEPF-Net.
Figure 3. Visual result of a pixel-level curve parameter map $A_n$. (a) is a raw underwater image, (b) is the corresponding curve parameter map $A_n$, and (c) is the enhanced image produced by the proposed LEPF-Net.
Figure 4. Comparisons of loss function curves, PSNR curves, and SSIM curves with/without using skip connection.
Figure 5. Visual examples of the pixel attention map and channel attention map of a bluish underwater image. (a) is an underwater image with a large and dense background of bluish pixels, (b) is its corresponding pixel attention map, and (c) is the channel attention weight map for each pixel; the larger the pixel value, the greater the attention it receives.
Figure 6. Qualitative comparisons on synthetic underwater images. The last row is the enlarged pictures of the red boxes in the first row.
Figure 7. Qualitative comparisons on bluish underwater images. The last row is the enlarged pictures of the red boxes in the first row.
Figure 8. Qualitative comparisons on greenish underwater images. The last row is the enlarged pictures of the red boxes in the first row.
Figure 9. Qualitative comparisons on low-illuminated underwater images. The last row is the enlarged pictures of the red boxes in the first row.
Figure 10. Qualitative comparisons on yellowish underwater images. The last row is the enlarged pictures of the red boxes in the first row.
Figure 11. Qualitative comparisons on shallow-water underwater images. The last row shows enlarged views of the regions marked by red boxes in the first row.
Figure 12. No-reference quality assessment results for the 10 images in Figures 7–11, presented as radar maps. The indicators are Entropy, BIQI, NIQE, AG, and JPEG, each scored from 0 (worst) to 5 (best); the larger the yellow area, the better.
Figure 13. The framework of the LE-ICurve, which iteratively enhances the input image I. Different curve parameter maps A and iteration numbers n yield different output results. The horizontal axis represents the input pixel value, and the vertical axis represents the output pixel value.
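As a rough illustration of the iteration sketched in Figure 13, the widely used quadratic light-enhancement curve LE(x) = x + A·x·(1 − x) can be applied n times with a pixel-wise parameter map A; the function below is a sketch under that assumed curve form, not necessarily the exact LE-ICurve definition used in the paper.

```python
import torch

def le_icurve(image, A, n=8):
    """Iteratively apply a quadratic light-enhancement curve (assumed form).
    image: tensor in [0, 1], shape (B, 3, H, W)
    A:     pixel-wise curve parameter map, broadcastable to image
    n:     number of iterations; larger n gives a stronger adjustment."""
    x = image
    for _ in range(n):
        x = x + A * x * (1.0 - x)   # stays in [0, 1] when A lies in [-1, 1]
    return x
```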
Figure 14. Comparison results of different loss function configurations. Because the raw image (a) is a yellowish underwater image, its histogram differs most from that of the reference image (f) in the blue and green channels. (b) shows overexposure, and its red and green channels deviate noticeably from those of (f). (c,d) contain artifacts that cause a loss of image detail; the histogram of (c) differs strongly from that of (f) in the blue and green channels, while in (d) all three R, G, and B channels deviate from (f). Among all configurations, only (e) shows relatively good quality, and its histogram is closest to that of (f).
Figure 15. The ablation study. Since the raw image (a) is a bluish underwater image, its histogram differs most from that of the reference image (f) in the blue channel. Both (b,c) show grid artifacts, with (b) being the most severe. (d) suffers from serious color deviation and obvious gridding. Among all configurations, (e) performs best: it shows no color bias or artifacts, and its histogram is relatively close to that of (f).
Figure 16. Object detection results obtained by applying YOLO-v4.
Figure 17. Local keypoint matching results obtained by applying the SIFT operator.
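For reference, keypoint matching as in Figure 17 can be reproduced with OpenCV's SIFT implementation roughly as follows; the image paths and the 0.75 ratio-test threshold are placeholders.

```python
import cv2

# Placeholder paths for two views of the same scene (e.g., raw vs. enhanced).
img1 = cv2.imread("raw_underwater.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("enhanced_underwater.png", cv2.IMREAD_GRAYSCALE)

sift = cv2.SIFT_create()
kp1, des1 = sift.detectAndCompute(img1, None)
kp2, des2 = sift.detectAndCompute(img2, None)

# Brute-force matching with Lowe's ratio test.
matcher = cv2.BFMatcher()
matches = matcher.knnMatch(des1, des2, k=2)
good = [m for m, n in matches if m.distance < 0.75 * n.distance]
print(f"{len(good)} good matches")
```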
Figure 18. Salient object detection results obtained by applying PoolNet+.
Figure 19. Edge detection results obtained by applying the Canny operator.
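Similarly, edge maps such as those in Figure 19 can be generated with OpenCV's Canny operator; the hysteresis thresholds below are arbitrary illustrative values.

```python
import cv2

img = cv2.imread("enhanced_underwater.png", cv2.IMREAD_GRAYSCALE)  # placeholder path
edges = cv2.Canny(img, 100, 200)   # lower and upper hysteresis thresholds
cv2.imwrite("edges.png", edges)
```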
Table 1. The proposed LEPF-Net architecture.

| Number | Layer Description | Output Size |
|--------|-------------------|-------------|
| Encoder | | |
| #1 | Conv(3,64,3,1)+IN+ReLU | 350 × 350 × 64 |
| #2 | Conv(64,64,3,1)+IN+ReLU | 350 × 350 × 64 |
| #3 | Conv(64,64,3,2)+IN+ReLU | 350 × 350 × 64 |
| LEB | | |
| #1 | Conv(64,64,3,1)+IN+ReLU | 350 × 350 × 64 |
| #2 | Conv(64,64,3,1)+IN+ReLU | 350 × 350 × 64 |
| #3 | Conv(3,3,3,1) | 350 × 350 × 3 |
| #4 | Conv(3,32,1,1)+ReLU | 350 × 350 × 32 |
| PF-SubNet | | |
| #1 | AAP+Conv(192,4,1,1)+ReLU | 350 × 350 × 4 |
| #2 | Conv(4,192,1,1)+ReLU | 350 × 350 × 192 |
| #3 | Conv(192,8,1,1)+ReLU | 350 × 350 × 8 |
| #4 | Conv(8,192,1,1)+ReLU | 350 × 350 × 192 |
| Decoder | | |
| #1 | Conv(64,64,4,2)+IN+ReLU | 350 × 350 × 64 |
| #2 | Conv(64,64,3,1)+IN+ReLU | 350 × 350 × 64 |
| #3 | Conv(64,3,1,1) | 350 × 350 × 3 |
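Reading each Conv(in, out, k, s) entry in Table 1 as (input channels, output channels, kernel size, stride), with IN denoting instance normalization, the encoder rows could be written in PyTorch roughly as in the sketch below; the padding choice is an assumption, and this is not the released implementation.

```python
import torch.nn as nn

def conv_in_relu(in_ch, out_ch, k, s):
    """One Conv + IN + ReLU row from Table 1 (padding assumed to preserve size)."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=k, stride=s, padding=k // 2),
        nn.InstanceNorm2d(out_ch),
        nn.ReLU(inplace=True),
    )

encoder = nn.Sequential(
    conv_in_relu(3, 64, 3, 1),    # Encoder #1
    conv_in_relu(64, 64, 3, 1),   # Encoder #2
    conv_in_relu(64, 64, 3, 2),   # Encoder #3
)
```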
Table 2. Training details of different compared algorithms, including the framework, training epochs, learning rate, batch size, and image size.

| Method | Framework | Epochs | Learning Rate | Batch Size | Image Size |
|--------|-----------|--------|---------------|------------|------------|
| UDCP | MatLab | – | – | – | 350 × 350 |
| ULAP | MatLab | – | – | – | 350 × 350 |
| UWGAN | TensorFlow | 50 | 2 × 10⁻⁴ | 8 | 350 × 350 |
| UGAN | TensorFlow | 50 | 10⁻⁴ | 8 | 350 × 350 |
| UWCNN | TensorFlow | 50 | 10⁻⁴ | 8 | 350 × 350 |
| DUIENet | TensorFlow | 50 | 10⁻⁴ | 8 | 350 × 350 |
| UIEIFM | PyTorch | 50 | 10⁻⁴ | 4 | 350 × 350 |
| UIE-WD | PyTorch | 50 | 10⁻⁴ | 4 | 350 × 350 |
| UIEC²-Net | PyTorch | 50 | 10⁻⁴ | 4 | 350 × 350 |
| Ours | PyTorch | 50 | 10⁻⁴ | 4 | 350 × 350 |
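A minimal, runnable PyTorch sketch of the training settings listed for "Ours" in Table 2 (50 epochs, learning rate 10⁻⁴, batch size 4, 350 × 350 inputs) is shown below; the stand-in model, random data, and L1 criterion are placeholders for the actual network, paired dataset, and loss described in the paper.

```python
import torch
import torch.nn as nn

# Stand-in model and data so the loop runs; substitute the real LEPF-Net
# and the UIEB raw/reference pairs in practice.
model = nn.Conv2d(3, 3, 3, padding=1)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)   # lr = 10^-4 (Table 2)
criterion = nn.L1Loss()

for epoch in range(50):                                      # 50 epochs (Table 2)
    for _ in range(4):                                       # a few dummy batches
        raw = torch.rand(4, 3, 350, 350)                     # batch size 4, 350 x 350 inputs
        ref = torch.rand(4, 3, 350, 350)
        loss = criterion(model(raw), ref)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```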
Table 3. Full-reference image quality assessment of the test900. The best and second-best results are marked in red and blue.

| Method | PSNR | SSIM |
|--------|------|------|
| Raw | 14.4978 | 0.6948 |
| UDCP | 13.6778 | 0.6373 |
| ULAP | 14.3739 | 0.6989 |
| UWGAN | 13.3007 | 0.7083 |
| UGAN | 16.2636 | 0.6625 |
| UWCNN | 14.8638 | 0.7419 |
| DUIENet | 16.1073 | 0.7734 |
| UIEIFM | 17.5494 | 0.7123 |
| UIE-WD | 20.4510 | 0.8661 |
| UIEC²-Net | 20.5442 | 0.8486 |
| Ours | 24.1325 | 0.8733 |
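The full-reference scores in Tables 3 and 4 are standard PSNR and SSIM values; for a single image pair they can be computed, for example, with scikit-image (0.19 or later for the channel_axis argument), as in the sketch below with placeholder file paths.

```python
import numpy as np
from skimage import io
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Placeholder paths for an enhanced result and its reference image.
enhanced = io.imread("enhanced.png").astype(np.float64) / 255.0
reference = io.imread("reference.png").astype(np.float64) / 255.0

psnr = peak_signal_noise_ratio(reference, enhanced, data_range=1.0)
ssim = structural_similarity(reference, enhanced, channel_axis=-1, data_range=1.0)
print(f"PSNR = {psnr:.4f} dB, SSIM = {ssim:.4f}")
```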
Table 4. Full-reference image quality assessment of the test90. The best and second-best results are marked in red and blue.

| Method | PSNR | SSIM |
|--------|------|------|
| Raw | 18.2701 | 0.8151 |
| UDCP | 11.1646 | 0.5405 |
| ULAP | 18.6789 | 0.8194 |
| UWGAN | 18.6209 | 0.8454 |
| UGAN | 21.3031 | 0.7691 |
| UWCNN | 18.2851 | 0.8150 |
| DUIENet | 16.2906 | 0.7884 |
| UIEIFM | 17.4574 | 0.6583 |
| UIE-WD | 19.0074 | 0.7872 |
| UIEC²-Net | 24.5663 | 0.9346 |
| Ours | 26.2857 | 0.9000 |
Table 5. No-reference image quality assessment of the 10 images in Figures 7–11. The best and second-best results are marked in red and blue.

| Method | Entropy | NIQE | BIQI | AG | JPEG |
|--------|---------|------|------|----|------|
| Raws | 6.5081 | 39.8008 | 49.0445 | 4.7933 | 11.2021 |
| UDCP | 5.8717 | 39.8005 | 39.3064 | 5.6096 | 10.6871 |
| ULAP | 6.4558 | 39.8006 | 35.9631 | 7.5027 | 10.2069 |
| UWGAN | 6.549 | 39.8009 | 36.9106 | 6.4002 | 10.6459 |
| UGAN | 7.3468 | 39.8005 | 37.5143 | 8.6566 | 10.1087 |
| UWCNN | 6.5177 | 39.8008 | 44.8743 | 4.9923 | 11.0149 |
| DUIENet | 6.6314 | 39.8011 | 43.2616 | 5.9285 | 10.5725 |
| UIEIFM | 5.9095 | 39.8009 | 45.7180 | 4.1659 | 11.2879 |
| UIE-WD | 6.7663 | 39.8026 | 31.8501 | 10.2489 | 9.7099 |
| UIEC²-Net | 7.4661 | 39.8016 | 33.5492 | 9.1011 | 9.7671 |
| Ours | 7.4682 | 39.8022 | 29.9085 | 11.0408 | 9.2780 |
| References | 7.4292 | 39.8014 | 35.9838 | 8.7786 | 9.7458 |
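Among the no-reference indicators in Table 5, Entropy and the Average Gradient (AG) follow simple standard definitions; a sketch of one common way to compute them on an 8-bit grayscale image is given below (exact normalizations vary between implementations, and BIQI, NIQE, and the JPEG quality score come from their respective reference implementations, which are not reproduced here).

```python
import numpy as np

def entropy(gray):
    """Shannon entropy of an 8-bit grayscale image (higher = more information)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def average_gradient(gray):
    """Average gradient (AG): mean magnitude of local gray-level changes."""
    g = gray.astype(np.float64)
    gx = np.diff(g, axis=1)[:-1, :]   # horizontal differences
    gy = np.diff(g, axis=0)[:, :-1]   # vertical differences
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))
```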
Table 6. The quantitative results of different loss function configurations; the best and second-best results are marked in red and blue.

| Loss term | Only MSE | Only L1 | MSE + TV | L1 + TV |
|-----------|----------|---------|----------|---------|
| MSE | ✓ | | ✓ | |
| L1 | | ✓ | | ✓ |
| TV | | | ✓ | ✓ |
| PSNR | 25.3233 | 25.9720 | 25.9342 | 26.2943 |
| SSIM | 0.8852 | 0.8937 | 0.8916 | 0.9025 |
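Table 6 indicates that combining the L1 loss with a total-variation (TV) term performs best. A minimal PyTorch sketch of such an "L1 + TV" objective is shown below; the TV weight is an assumed hyperparameter, not a value reported in the paper.

```python
import torch
import torch.nn.functional as F

def tv_loss(x):
    """Anisotropic total variation: penalizes abrupt pixel-to-pixel jumps."""
    dh = (x[:, :, 1:, :] - x[:, :, :-1, :]).abs().mean()
    dw = (x[:, :, :, 1:] - x[:, :, :, :-1]).abs().mean()
    return dh + dw

def l1_tv_loss(pred, target, tv_weight=1e-4):   # tv_weight is an assumed value
    return F.l1_loss(pred, target) + tv_weight * tv_loss(pred)
```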
Table 7. The quantitative results of the ablation study; the best and second-best results are marked in red and blue.

| Component | No configuration | Only LEBG | Only PF-SubNet | Full configuration |
|-----------|------------------|-----------|----------------|--------------------|
| LEBG | | ✓ | | ✓ |
| PF-SubNet | | | ✓ | ✓ |
| PSNR | 25.4886 | 25.6389 | 26.1559 | 26.2857 |
| SSIM | 0.8752 | 0.8850 | 0.8902 | 0.9000 |
