Article

RGB-Based Triple-Dual-Path Recurrent Network for Underwater Image Dehazing

Department of Electrical Engineering, Faculty of Engineering, Jouf University, Sakakah 72388, Saudi Arabia
Electronics 2022, 11(18), 2894; https://doi.org/10.3390/electronics11182894
Submission received: 8 August 2022 / Revised: 1 September 2022 / Accepted: 2 September 2022 / Published: 13 September 2022
(This article belongs to the Special Issue Artificial Intelligence (AI) for Image Processing)

Abstract

In this paper, we present a powerful underwater image dehazing technique that exploits two image characteristics: RGB color channels and image features. For the RGB color channels, each channel is decomposed into two units based on similarities via k-means clustering. This markedly improves the adaptability and identification of similar pixels and thus reduces pixels with a weak correlation, leaving only pixels with a higher correlation. We use an infinite impulse response (IIR) model in the triple-dual and parallel interaction structure to suppress hazed pixels via pixel comparison and amplification, increasing the visibility of even very minor features. This improves the visual perception of the final image, thus improving the overall usefulness and quality of the image. A softmax-weighted fusion is finally used to fuse the output color channel features to attain the final image. This preserves the color, leaving the proposed method's output very true to the original scene's colors. This is accomplished by taking advantage of adaptive learning based on the confidence levels of the pixel contribution variation in each color channel during the subsequent fuses. The proposed technique both visually and objectively outperforms the existing methods in several rigorous tests.

1. Introduction

Underwater images have tremendous usage in marine engineering. However, poor underwater image quality caused by the presence of wavelength-dependent light absorption and scattering [1] often hinders their use. Underwater image dehazing is an approach to combat this, where underwater images are processed to improve quality, thereby increasing their application in the marine environment. The processing focuses on reducing the effects of wavelength-dependent light absorption and scattering. According to Alenezi et al. [1,2], an underwater dehazing model can be defined as:
$$\Gamma_c(x) = \Lambda_c(x)\,\tau_c(x) + \Lambda_c(x)\,\tau_c(x)\,\eta_c(x) + \Theta_c\bigl(1 - \tau_c(x)\bigr) \quad (1)$$
where $\Gamma_c(x)$ denotes the intensity of the $c \in \{R, G, B\}$ color channel at pixel $x$ in an input underwater image, $\Lambda_c(x)$ denotes the scene radiance image, $\tau_c(x)$ denotes the transmission of the $c$ color channel and $\Theta_c$ denotes the ambient light. $\Lambda_c(x)\tau_c(x)$ is the direct transmission, representing the scene radiance attenuated by the transmission. $\eta_c(x)$ denotes a point spread function at pixel $x$. Similar to in-air dehazing models [3], underwater dehazing models aim to reduce the effect of haze in images. However, unlike the atmospheric model, the underwater model presented in (1) considers the scene radiance, $\Lambda_c(x)$, as a function of the point spread to take into account the effects of wavelength-dependent light absorption. This makes underwater image dehazing a complex phenomenon that requires the continual exploration of effective techniques in order to improve the quality and usability of underwater images.
Recent years have seen underwater image dehazing attract attention, leading to many suggested techniques. Traditional methods, such as image restoration and enhancement, estimate the dehazing models' parameters to reduce the effect of haze. Such models include He et al.'s [4] dark channel prior (DCP), which estimates the transmission map and atmospheric light from a hazy image to recover the underwater image. The simplicity of the DCP has enabled its modification to address the severe attenuation of the red color in water and attain images with improved color. Such techniques include that of Galdran et al. [5], who recovered short-wavelength-based colors via a red channel prior (RCP) to restore image contrast. Peng et al. [6] used a DCP to present an underwater dehazing technique in which they considered the green and blue color channels to restore the underwater image. Chiang and Chen [7] enhanced underwater images via a compensation-based light attenuation dehazing algorithm. Peng and Cosman [8] used depth estimators, image blurriness and light absorption to restore underwater images. These methods have been popular due to their ability to significantly reduce blur and color cast effects in underwater images. However, they also estimate many parameters, making their results inflexible and sometimes inferior compared to more complex methods, such as those of [9,10]. On the other hand, the most recently proposed underwater image dehazing models reduce the effect of haze by disregarding the underwater modeling parameters and improving the image's visual quality by adjusting the image pixel values. Such methods include that of Ancuti et al. [11], where the contrast of the underwater images was improved via a fusion-based method. Fu et al. [12] proposed an effective retinex-based method to enhance underwater images. Though effective in improving underwater image quality in many instances, these methods have one major shortcoming: they fail to consider underwater physical parameters, making them inefficient in recovering high-quality images.
Deep neural network-based methods solved the problems of DCP-related models and image pixel-based models. Such techniques include image segmentation by Zhang et al. [13], pattern recognition by Gedamu et al. [14] and image dehazing by Liang et al. [15]. Some methods used deep neural networks to exploit similarities between clear (ground-truth) and hazy images; such methods used similar network structures to achieve a higher image quality. Their major shortcoming arises when the images come from harsh and complex scenarios where attaining ground-truth images is impossible. In order to solve such problems, researchers have explored pairing in-air images with hazy images and then used the same analogy to restore underwater images from harsh and complex scenarios. Such models include generative adversarial networks (GAN) for underwater images (WaterGAN) and others [16]. The technique corrects the color of underwater images by pairing in-air images and using the attained depth information to simulate underwater images. Fabbri [17] proposed paired training data generated by employing the cycle-consistent GAN (CycleGAN) of [18] to formulate an underwater GAN (UGAN), which simulated the degradation process; they finally used the pix2pix model to reduce the effect of haze in underwater images. The suggested techniques were later exploited by Guo et al. [19] and Fabbri [17] to develop a more sophisticated model based on a dense multiscale GAN, boosting the performance and rendering more details in the final underwater images than the previous methods. Li et al. [20] recently proposed an underwater convolutional neural network (UWCNN), which used an underwater scene prior to generate satisfactorily clear underwater images. The deep neural networks' major shortcoming is their reliance on in-air images to attain clear underwater images. This is not always directly usable in underwater applications and is regarded as an extension of in-air dehazing networks; thus, their results may be misleading.
The general approach of underwater dehazing based on neural networks (NN) has not yielded accurate images due to an over-reliance on scene depth and on atmospheric light based on image pixels. However, if considered in terms of the red, green and blue (RGB) color channels, these pixels can yield images whose visual appearances are closer to those of the raw images. Inspired by this fact, the proposed technique approaches the underwater image dehazing problem based on the difference in pixel arrangements within the RGB color channels. Thus, a novel triple-dual end-to-end NN, in other words, a triple-dual-path recurrent network (TDPRN), is proposed to model scene radiance and ambient light. The proposed TDPRN consists of a feature extraction block, a transmission map estimation block, a TDPRN block with a parallel interaction function, image reconstruction and the softmax function for image fusion. The network information is modified from the existing network of [21]. Given the hazy underwater image, the network decomposes the image into the RGB channels, and then the TDPRN uses the feature extraction block and the transmission map estimation block to extract features from the color channels of the hazed underwater images. These features are then fed into the dual-path block via three parallel branches to restore the image features and improve the color of the dehazed images. Unlike the structure in [21], the proposed structure has three branches of convolutional long short-term memory (ConvLSTM) units, with a convolution layer based on the corresponding color channel's pixels in each branch. The ConvLSTM has the ability to learn and store information on the pixel correlation of the input image and compare it with the output. The communication between the interacting layers enables a comparison of the correlation patterns, thus enhancing the extraction of features in the output images. This communication and comparison help approximate the infinite impulse response (IIR) model already proposed in [21,22]. A parallel interaction function is also proposed to fuse the intermediary features between the branches; thus, the basic features and information of the dehazed image are recovered alternately. The corresponding features based on each color channel are then processed stepwise to obtain the ultimate dehazed image via a series of softmax-weighted fusions, whose details are discussed by Zhao et al. [23].
The proposed technique, presented in Figure 1, can produce an output image with improved visual perception. Figure 1 shows a summary of the visual perception improvement of the proposed method compared to the input images. The top row contains the raw (hazed) images. The bottom (second) row shows the corresponding output of the proposed method. The summary presented in Figure 1 indicates that the proposed technique can learn and reduce the effects of haze in the output images.

Contribution

The proposed paper makes the following significant contributions:
  • The input image is decomposed according to the RGB color channels and the features, with each color channel decomposed into two units based on similarities via k-means clustering, which is described in detail in [24,25] (a minimal sketch of this decomposition follows the list). This eases the adaptability and identification of similar pixels and thus, by extension, removes pixels with a weak correlation, leaving only pixels with a higher correlation.
  • The structure’s triple-dual and parallel interaction allows a comprehensive comparison; hence, even minor features, i.e., pixels with the weakest correlations, are considered. This improves the visual perception of the final image.
  • The use of softmax-weighted fusion in the arrangement of the proposed structure also preserves the color, which explains why the proposed result’s color is very similar to the input color. This is achieved via adaptive learning based on the confidence levels of the pixel contribution variation in each color channel during the subsequent fuses.
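A minimal sketch of the per-channel k-means decomposition described in the first contribution is given below. It assumes scikit-learn's KMeans and clusters on raw pixel intensity only; the actual similarity features used by the proposed method are not specified here, so the sketch is illustrative rather than a definitive implementation.

```python
import numpy as np
from sklearn.cluster import KMeans

def decompose_channel(channel, n_units=2, seed=0):
    """Split one color channel into two units of similar pixels via k-means.
    Clustering on raw intensity alone is an assumption; the exact similarity
    features are not stated in the text."""
    h, w = channel.shape
    pixels = channel.reshape(-1, 1).astype(np.float64)
    labels = KMeans(n_clusters=n_units, n_init=10, random_state=seed).fit_predict(pixels)
    labels = labels.reshape(h, w)
    # One masked copy of the channel per unit; pixels outside the unit are zeroed.
    return [np.where(labels == k, channel, 0.0) for k in range(n_units)]

# Usage: decompose the red channel of an RGB image `img` (H x W x 3, float in [0, 1]).
# red_units = decompose_channel(img[..., 0])
```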

2. Proposed Methodology

2.1. Triple-Dual-Path Recurrent Network

2.1.1. Network Architecture

The proposed TDPRN consists of feature extraction and transmission map estimation blocks. The triple-dual-path block has a series of parallel interaction functions and a softmax-weighted fusion block. In the feature extraction block, the first convolution layer increases the feature width to 16, and the second reduces the resolution of the feature maps in each color channel while increasing the width to 32. A Leaky ReLU with a slope of 0.01 (based on the experimental findings) is added after each convolution layer. The transmission map block uses the RGB color channels of the hazy underwater image to estimate the transmission maps. The respective color channel image features and their corresponding transmission maps are fed into the dual-path blocks. The blocks contain parallel branches for the restoration and dynamic fusion of the basic content of the intermediate image details. The reconstruction block consists of a 9 × 9 convolution layer, a bi-linear up-sampling layer and a 3 × 3 convolution layer. The bi-linear up-sampling layer up-samples the color channel image features to twice the input size, and the 3 × 3 convolution layer reduces the width of the color channels' feature maps. The reconstructed dehazed underwater color channels are then fused via softmax-weighted fusion to attain the final image.
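For concreteness, the following is a minimal sketch of the per-channel feature extraction and reconstruction blocks described above, written with the Keras API of TensorFlow (the framework used for the experiments in Section 3.1). The 3 × 3 kernels of the first two convolutions and the stride-2 down-sampling are assumptions, since the text only fixes the widths (16, then 32), the Leaky ReLU slope of 0.01 and the 9 × 9/3 × 3 reconstruction layers.

```python
import tensorflow as tf
from tensorflow.keras import layers

def feature_extraction_block(x):
    """Two convolutions, each followed by LeakyReLU(0.01). The 3x3 kernels and
    the stride-2 resolution reduction are assumptions; the paper fixes only the
    widths (16, then 32) and the activation slope."""
    x = layers.Conv2D(16, 3, padding="same")(x)
    x = layers.LeakyReLU(0.01)(x)
    x = layers.Conv2D(32, 3, strides=2, padding="same")(x)  # halves the resolution
    x = layers.LeakyReLU(0.01)(x)
    return x

def reconstruction_block(x, out_channels=1):
    """9x9 convolution, bilinear up-sampling to twice the input size, then a
    3x3 convolution that reduces the feature width."""
    x = layers.Conv2D(32, 9, padding="same")(x)
    x = layers.UpSampling2D(size=2, interpolation="bilinear")(x)
    x = layers.Conv2D(out_channels, 3, padding="same")(x)
    return x

# Example: wire the two blocks for one color channel of a 256 x 256 image.
inp = tf.keras.Input(shape=(256, 256, 1))
out = reconstruction_block(feature_extraction_block(inp))
model = tf.keras.Model(inp, out)
```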

2.1.2. Triple-Dual-Path Block

Equation (1) suggests that the underwater dehazed image can be found from
$$\Lambda_c(x) = \frac{1}{1+\eta_c}\left[\frac{\Gamma_c(x)}{\tau_c(x)} + \Theta_c\,\Psi_c\right], \quad (2)$$
where
$$\Psi_c = \frac{\tau_c - 1}{\tau_c}, \quad (3)$$
and
$$\eta_c(x) = \left(e^{-\Omega_c d_c(x)} - e^{-\Pi_c d_c(x)}\right)\mathcal{F}_c^{-1}\!\left[e^{-\delta_c d_c(x)\,\varpi_c}\right]. \quad (4)$$
$\Omega_c$ and $\delta_c$ are the empirical coefficients of the $c$ color channel related to the hazed image scene, such that $\Omega_c < \Pi_c$. $\mathcal{F}_c^{-1}$ denotes the inverse Fourier transform and $\varpi_c$ denotes the radial frequency. The term $\Theta_c\bigl(1 - \tau_c(x)\bigr)$ in (1) is the backward scattering term, with $\Theta_c$ being the background light of the $c$ color channel. The scene depth $d_c(x)$ is modeled as
$$d_c(x) = \phi_{c0} + \phi_{c1}\,\varepsilon_c(x) + \phi_{c2}\,\varrho_c(x), \quad (5)$$
where $d_c(x)$ is the underwater scene depth at pixel $x$ with coordinates $(i, j)$ in the $c$ color channel. $\phi_{ci}$ are color channel-based linear coefficients derived from the pixel difference plots between the highest and lowest pixel values. $\varepsilon_c(x)$, $\varrho_c(x)$ and $\epsilon_c(x)$ are mean intensity functions showing the absolute differences between the pixels in the color channels. As one of the improvements of the proposed method over the existing methods [2,26,27,28,29,30], this use of pixel intensities in the color channels for the scene depth strengthens scene artifacts. We re-write (5) as
$$d_c(x) = \phi_{c0} + \phi_{c1}\left|\operatorname*{arg\,max}_{x(i,j)}\,\min_{(i,j)\in c}\varepsilon_c(x) - \operatorname*{arg\,min}_{x(i,j)}\,\max_{(i,j)\in c}\varepsilon_c(x)\right| + \phi_{c2}\left|\operatorname*{arg\,max}_{x(i,j)}\,\min_{(i,j)\in c}\varrho_c(x) - \operatorname*{arg\,min}_{x(i,j)}\,\max_{(i,j)\in c}\varrho_c(x)\right| + \phi_{c3}\left|\operatorname*{arg\,max}_{x(i,j)}\,\min_{(i,j)\in c}\epsilon_c(x) - \operatorname*{arg\,min}_{x(i,j)}\,\max_{(i,j)\in c}\epsilon_c(x)\right|, \quad (6)$$
Equation (6) suggests that the scene depth value increases with an increase in the pixel difference between the maximum and minimum values. The decomposition of (6) into three different color channels enhances the accuracy of the representation of the original image, because each pixel is a sample of the original image. This further enables accurate estimation of the scene depth and the global background light, which helps improve the accuracy of underwater image dehazing. It also allows the network to concentrate on features per color channel, which helps control and identify features more accurately.
Equation (2) suggests that the dehazed image has two components: the basic content details $\Gamma_c(x)/\tau_c(x)$ and the image details $\Theta_c\Psi_c$. In in-air dehazing models, $\eta_c = 0$; however, in underwater imaging, $\eta_c$ is given by (4). Therefore, the proposed dehazing model is assumed to be composed of two functions:
$$\Lambda_c(x) = \frac{1}{1+\eta_c}\Bigl[\Xi_1(\Gamma_c, \tau_c) + \Xi_2(\Theta_c, \Psi_c)\Bigr] = \frac{1}{1+\eta_c}\bigl[\Lambda_{1c} + \Lambda_{2c}\bigr], \qquad c \in \{r, g, b\} \quad (7)$$
Equation (7) indicates that underwater image dehazing comprises two parts, the basic content details and the image details, both scaled by the point spread term $1/(1+\eta_c)$. The motivation for this approach is that the hazing effect varies throughout the image; thus, assuming fixed parameter values does not give accurate results. Therefore, in order to dehaze the images accurately, the treatment of the image pixels should be homogeneous but non-static. Using $\Lambda_{1c}$ and $\Lambda_{2c}$ alone to approximate the clear image may render the estimation of the transmission map and global light useless. In this paper, we consider $d_c(x)$ as given by (6) to tighten the approximation of the transmission map. This is indicated in Figure 2 and helps increase the color concentration of the output image compared to the input.
The estimates of $\Xi_1(\Gamma_c, \tau_c)$ and $\Xi_2(\Theta_c, \Psi_c)$ are fundamental to meeting the objective of the paper. We employ the infinite impulse response (IIR) model due to its versatility, ease of computation and low cost. Its use in this paper was also guided by the need to amplify the pixels with strong correlations and thereby suppress the hazed pixels. This also explains why the proposed method has a more extensive concentration of color channels than existing methods, as presented in Figure 2. IIR models are often approximated as a cascade of summations of lower-order structures via recurrent neural networks [21,31]. Using the IIR model,
$$\Lambda_{1c} = \frac{\sum_{i=0}^{N-1} \Gamma_c\, x^i}{\sum_{i=0}^{M-1} \tau_c\, x^i}, \quad (8)$$
if M = N , then
$$\Lambda_{1c} = \frac{\Gamma_c}{\tau_c}. \quad (9)$$
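For intuition, a recurrent unit approximating an IIR response behaves like the first-order recursion below, in which the feedback term reinforces strongly correlated (slowly varying) pixel values and damps weakly correlated fluctuations. This is a generic illustration of the IIR idea cited from [21,31], not the exact filter learned by the proposed network; the coefficient value is an assumption.

```python
import numpy as np

def first_order_iir(signal, a=0.8):
    """y[n] = a * y[n-1] + (1 - a) * x[n]: a minimal IIR recursion. Correlated
    neighbouring samples are reinforced through the feedback term, while
    uncorrelated fluctuations are attenuated. The value of `a` is illustrative."""
    y = np.zeros_like(signal, dtype=np.float64)
    for n, x in enumerate(signal):
        y[n] = a * y[n - 1] + (1.0 - a) * x if n else (1.0 - a) * x
    return y

# Example: filter one row of a hazy color channel to see the smoothing effect.
# filtered_row = first_order_iir(hazy_channel[100, :])
```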
Figure 2. A summary of the effectiveness of including the transmission map function to extract more color channels compared to existing techniques. From left to right are the raw underwater images, results from Fus [32], WCID [33], Ts [34], LD [35] and proposed results. Image, R, G, B color channel concentration are shown from top to bottom.
We use a similar approach to estimate $\Lambda_{2c}$. With this, we propose a dual-path block for underwater image dehazing based on the IIR models summarized by (8) and (9). Figure 3 illustrates that the recurrent neural network used to approximate the IIR for the proposed technique consists of five units with three branches, each branch representing a different color channel. Each unit contains a ConvLSTM and a pixel-wise convolution layer. The ConvLSTM decides what information to store in and omit from the network at every step; each branch gradually reduces the effect of haze in the image, as indicated in Figure 4 and Figure 5. Furthermore, the LSTM's proven capability in handling long-range dependencies makes it usable in establishing correlations between local and global pixel neighborhoods. Thus, image features are extracted and preserved throughout the process. The output features from each branch are fused via softmax-weighted fusion to obtain the final output image (see Figure 3 and Figure 4). The details of the softmax-weighted fusion stack are summarized in [23]. The stack is chosen due to its ability to adaptively learn the variation of pixels in the corresponding $\{R, G, B\}$ color channel output images and fuse the images based on each modality's contribution to the final image.
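The sketch below shows how one branch of the block, five units of a ConvLSTM step followed by a pixel-wise convolution, could be assembled in TensorFlow. The filter count, kernel size and the way the ConvLSTM state is carried between units are assumptions, since the text describes the structure only at block level and omits the parallel interaction function.

```python
import tensorflow as tf
from tensorflow.keras import layers

def branch(x, n_units=5, filters=32):
    """One color-channel branch: five units, each a ConvLSTM step followed by a
    pixel-wise (1x1) convolution. The ConvLSTM state is carried from unit to
    unit so earlier correlation information is remembered. Filter count and
    kernel size are illustrative assumptions."""
    cell = layers.ConvLSTM2D(filters, kernel_size=3, padding="same", return_state=True)
    conv1x1 = layers.Conv2D(filters, kernel_size=1, padding="same")
    states = None
    for _ in range(n_units):
        seq = tf.expand_dims(x, axis=1)               # add a length-1 time axis
        out, h, c = cell(seq, initial_state=states)   # one ConvLSTM step
        states = [h, c]
        x = conv1x1(out)                              # pixel-wise convolution
    return x

# Example: run one branch on 32-channel feature maps of a single color channel.
feats = tf.random.normal([1, 128, 128, 32])
restored = branch(feats)
```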

2.1.3. Interaction Functions within the Dual Blocks

Figure 3, Figure 4 and Figure 5 indicate that $\Lambda_{i,j}^c$, $i = 1, \ldots, 5$; $j = 1, 2$, are the image input features restored within the branches of the triple-dual blocks. The image feature and content are complementary and thus inseparable. The parallel interaction in the block integrates the image content and features to produce fine-tuned details in the final images. Thus, every arrow in the block performs a unique function: blue estimates the global atmospheric light, red estimates the transmission maps and yellow transfers features to the next unit. The process is repeated until the last stage. The arrows enable the solution of complex processes that would otherwise require complex algorithms. The network is reactive, emphasizing the visual aspect of the images in terms of features. Therefore, it does not predict the future outcome of the images, such as the final color; hence, the image tends to retain the initial colors of the input image in the final output. This is the weakness of the proposed method compared to existing techniques. In addition, the arrows ensure continuity of the image features, thus restoring, preserving and emphasizing the strong pixel correlations. The dual interaction enables the control and identification of features and content. This is the strength of the proposed technique compared to the existing methods. This ability makes the network focus on suppressing the haze effects while strengthening the image features.
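A minimal sketch of a softmax-weighted fusion step is given below. It assumes that a learnable 1 × 1 convolution produces one confidence map per candidate and that a softmax across the candidates turns those maps into fusion weights; the exact confidence features of the stack in [23] may differ, so this is an illustration of the idea rather than a reproduction of it.

```python
import tensorflow as tf
from tensorflow.keras import layers

class SoftmaxWeightedFusion(layers.Layer):
    """Fuse candidate maps with adaptively learned confidence weights: a 1x1
    convolution scores each candidate per pixel and a softmax across the
    candidates turns the scores into fusion weights (an assumed realisation
    of the stack described in [23])."""
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        self.score = layers.Conv2D(1, kernel_size=1)  # per-pixel confidence score

    def call(self, candidates):
        # candidates: list of tensors, each of shape (batch, H, W, 1)
        scores = tf.concat([self.score(c) for c in candidates], axis=-1)
        weights = tf.nn.softmax(scores, axis=-1)       # confidence per candidate
        stacked = tf.concat(candidates, axis=-1)
        return tf.reduce_sum(weights * stacked, axis=-1, keepdims=True)

# Example: fuse the two path outputs of, e.g., the green channel; repeating this
# per channel and stacking the fused maps yields the final RGB image.
g1, g2 = tf.random.normal([1, 256, 256, 1]), tf.random.normal([1, 256, 256, 1])
g_fused = SoftmaxWeightedFusion()([g1, g2])
```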

3. Experimental Results

3.1. Dataset

In order to test, analyze and compare the proposed algorithm, we performed many tests and simulations. The experiments were conducted on Zorin OS 16 (15 April 2021 build) using the TensorFlow deep learning framework. The computer used was a BIZON X5000 G2 with 16 GB RAM.

3.2. Comparison Methods

In order to compare the visual and perceptual competitiveness of the proposed method, its results are compared with those of common or recently developed techniques. These techniques are listed in Table 1. The comparison techniques were chosen based on the objective (improving the image features, specifically the visual features). In order to show the competitiveness of the proposed method, these techniques were selected across years as follows: 2010 (1), 2011 (1), 2012 (1), 2016 (2), 2017 (5), 2018 (1), 2019 (4), 2020 (4) and 2021 (1). This indicates that, out of the 20 comparison methods, 15 are less than five years old.

3.3. Objective Evaluation of the Proposed Images’ Visual Quality

The proposed technique was evaluated based on an objective evaluation because a subjective evaluation is time-consuming, though the latter is more accurate. In this paper, a mixed objective evaluation method was employed. One objective evaluation relied on a reference-based approach, the mean Average Precision (mAP). Three non-reference approaches were also used: the Naturalness Image Quality Evaluator (NIQE) [43], the normalized Underwater Image Quality Metric (UIQM_norm) [44] and the Underwater Color Image Quality Evaluator (UCIQE) [45]; their values are presented in Table 2, Table 3, Table 4, Table 5 and Table 6. The later sections also present the underwater image sharpness measure (UISM) [44].
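For reference, UCIQE [45] is commonly computed as a weighted sum of the chroma standard deviation, the luminance contrast and the mean saturation in CIELab space. The sketch below uses the coefficient values and the contrast/saturation definitions found in common public implementations; these are assumptions that should be checked against [45] before use.

```python
import numpy as np
from skimage import color

def uciqe(rgb, c=(0.4680, 0.2745, 0.2576)):
    """Rough UCIQE: weighted sum of chroma spread, luminance contrast and mean
    saturation. Coefficients and the exact contrast/saturation definitions
    follow common implementations, not a verified reproduction of [45]."""
    lab = color.rgb2lab(rgb)                                   # rgb: float image in [0, 1]
    L, a, b = lab[..., 0], lab[..., 1], lab[..., 2]
    chroma = np.sqrt(a ** 2 + b ** 2)
    sigma_c = chroma.std()                                     # chroma spread
    con_l = np.quantile(L, 0.99) - np.quantile(L, 0.01)        # luminance contrast
    mu_s = np.mean(chroma / (L + 1e-6))                        # mean saturation proxy
    return c[0] * sigma_c + c[1] * con_l + c[2] * mu_s
```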

3.4. Subjective Assessment

The perceptual quality of the images is evaluated by the presentation of the different categories of the images shown in Figure 6, Figure 7, Figure 8, Figure 9 and Figure 10.
Figure 6 shows that the proposed method tends to retain the original color of the input image when compared to the existing methods. The visual inspection shows that the results of Ancuti [11] are almost identical to the raw images. There is not much change in the final results, which indicates a failure of either the transmission map estimation or the background light estimation. The DCP [4] results exhibit traits similar to those of Ancuti [11] but are even closer to the raw images. The HP [36] results have exaggerated red colors. The Water-Net [26] results have a grayish color in the top image, while the bottom image is almost identical to the input. The proposed results evidently improve the color output in both images and are visually more appealing than those of the existing techniques.
Figure 7 shows a subjective comparison of the proposed output with the existing techniques using different scenes. The Fus [32] results tend to have an exaggerated red color in three out of five image samples, making them less appealing. The WCID [33] results also have exaggerated red colors in four out of five samples, showing a trend of increasing unwanted artifacts in the final results. The Ts [34] results are more appealing than the Fus and WCID results but tend to darken the whale image (third from the top). The LD [35] results are better than the first three; however, the method tends to overexpose surfaces (see the whale image). Finally, the proposed results have more exaggerated colors but are visually appealing. The coral reef (first image from the top) and whale (third image from the top) show the strength of the proposed color balancing. The second and fourth images suggest a weakness of the proposed method, namely the exaggeration of blue colors in overexposed regions.
Figure 8 shows a visual comparison of the performance of the existing and proposed techniques on synthetic underwater images. The synthetic images have ground-truth images (shown in the last column). A visual inspection indicates that the Guo [36] results were closer to the synthetic ground-truth images but failed in the first (top) image, which is darker than the ground truth. The Zhuang [19] and gl [38] results are almost identical, with slight differences in the top and bottom images. However, the proposed results contain more details than those of the existing techniques. This observation is a strength of the proposed method, because one of the main aims of image dehazing is to expose image details in addition to suppressing the haze effects.
Figure 9 shows the comparison of the visual effect of the proposed technique in the case of images without ground-truth references. While the Ancuti [11], Guo [36] and Berman [37] results exhibit reddish regions in the final images, the Cosman [8] results appear overexposed. The Zhuang [19] results have exaggerated colors. The gl [38] results are grayish, such that the algae in the final image, which are known to be greenish, also appear gray; this suggests that the technique in [38] is not versatile. Like the previous examples, the proposed method tends to retain the input image colors while enhancing the image details. The proposed results, compared to their counterparts, are more visually appealing.
Figure 10 shows the subjective evaluation of the proposed technique compared to others on images with rich colors. The aim here is to show that the proposed method can detect color variation. In comparison with the existing methods, CBF [32], ULAP [39], UWCNN [20], MLFcGAN [40], FUnIEGAN [41], waterNet [26] and UICoE-Net [42], the proposed method appears to outperform them on two occasions (images in the second and third rows). For the first and last row images, the proposed method exhibits its weakness: an exaggeration of the green color channel. This trait might be due to the failure of the network to estimate the global background light accurately, which is plausible because the network estimated the global ambient light while the transmission map was partly predetermined.

Analysis of Underwater Image Overall Quality

Figure 6, Figure 7, Figure 8, Figure 9 and Figure 10 present different underwater dehazing techniques applied to many different types of environments. All the examples show that different techniques produce different visual distortions and residual color casts, and the improvements in clarity vary across the examples. However, compared with the existing methods, the proposed algorithm improves sharpness, color and precision in almost all cases.
Table 2, Table 3, Table 4, Table 5 and Table 6 present the objective evaluation of the overall quality indicators for the images presented in Figure 6, Figure 7, Figure 8, Figure 9 and Figure 10. Besides the individual indicators, we report average statistics because the full results could not be presented in the tables. Table 2 indicates that the proposed method has better results in UIQM_norm and UCIQE. Table 3 indicates that the proposed method has better UIQM_norm values. Table 4 indicates that the proposed technique outperforms the existing techniques in all the metrics. Table 5 and Table 6 indicate that the proposed method has better UIQM_norm and UCIQE values. The consistently better UIQM_norm and UCIQE values are due to the ability of the proposed technique to restore image-rich colors. While this could be a weakness, because dehazing methods need to restore the natural colors, it is also an advantage, because the resulting output images tend to be more appealing than those of the existing methods. Figure 11 shows the box plot of the mean Average Precision of the proposed technique compared to the existing methods. The box plot indicates that the proposed method has higher values and a higher mean (red line) than its counterparts. The mAP values of the proposed method are also higher because its box-plot body is shorter and sits higher than the other box plots. The subjective and objective evaluations indicate that the proposed algorithm has the best effect on the overall underwater quality in various scenes compared to the existing techniques.
The proposed technique also focuses on increasing the sharpness of the final image. Figure 12 presents a graphical summary of the underwater image sharpness measure (UISM). The figure indicates that the proposed technique, on average, outperforms the existing techniques.
The effectiveness of the proposed network in improving the pixel correlations is presented in Figure 13. This is one of the main aims of the proposed technique: removing pixels with a weak correlation and leaving only pixels with a higher correlation. The IIR attains this by amplifying the pixels with a strong correlation, thereby suppressing the hazed pixels. This leads to a smoother pixel correlation compared to the original.

4. Conclusions

We presented an underwater image dehazing technique based on two image characteristics: RGB color channels and image features. Using RGB color channels markedly improved the adaptability and identification of similar pixels and effectively removed pixels with a weak correlation, leaving only pixels with a high correlation. The IIR in the triple-dual and parallel interaction structure suppressed hazed pixels, making even minute features, such as pixels with weak correlations, visible. This improved the visual perception of the final image and thus also the overall usefulness and quality of the image. The softmax-weighted fusion used to attain the final image helped preserve the original scene's color. This was accomplished thanks to adaptive learning based on the confidence levels of the pixel contribution variation in each color channel during the subsequent fuses. The proposed technique was compared with existing state-of-the-art algorithms, both visually and objectively, using various metrics: NIQE, mAP, UIQM_norm, UCIQE and UISM. The results indicated that the proposed technique outperforms the existing methods. The one significant weakness of the proposed technique is that it predominantly exaggerates green colors in some environments. Future studies may consider an external control mechanism, such as using ground-truth images, so that the color of the final image can be restored; this would also help address the weakness of the network.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset used during the current study is available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare no conflict of interest.

References

  1. Alenezi, F.; Armghan, A.; Mohanty, S.N.; Jhaveri, R.H.; Tiwari, P. Block-greedy and cnn based underwater image dehazing for novel depth estimation and optimal ambient light. Water 2021, 13, 3470. [Google Scholar] [CrossRef]
  2. Park, E.; Sim, J.Y. Underwater image restoration using geodesic color distance and complete image formation model. IEEE Access 2020, 8, 157918–157930. [Google Scholar] [CrossRef]
  3. Zhu, Z.; Luo, Y.; Wei, H.; Li, Y.; Qi, G.; Mazur, N.; Li, Y.; Li, P. Atmospheric light estimation based remote sensing image dehazing. Remote Sens. 2021, 13, 2432. [Google Scholar] [CrossRef]
  4. He, K.; Sun, J.; Tang, X. Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell. 2010, 33, 2341–2353. [Google Scholar] [PubMed]
  5. Galdran, A.; Pardo, D.; Picón, A.; Alvarez-Gila, A. Automatic red-channel underwater image restoration. J. Vis. Commun. Image Represent. 2015, 26, 132–145. [Google Scholar] [CrossRef]
  6. Drews, P.L.; Nascimento, E.R.; Botelho, S.S.; Campos, M.F.M. Underwater depth estimation and image restoration based on single images. IEEE Comput. Graph. Appl. 2016, 36, 24–35. [Google Scholar] [CrossRef] [PubMed]
  7. Chen, D.; He, M.; Fan, Q.; Liao, J.; Zhang, L.; Hou, D.; Yuan, L.; Hua, G. Gated context aggregation network for image dehazing and deraining. In Proceedings of the 2019 IEEE Winter Conference on Applications of Computer Vision (WACV), Waikoloa Village, HI, USA, 7–11 January 2019; pp. 1375–1383. [Google Scholar]
  8. Peng, Y.T.; Cosman, P.C. Underwater image restoration based on image blurriness and light absorption. IEEE Trans. Image Process. 2017, 26, 1579–1594. [Google Scholar] [CrossRef] [PubMed]
  9. Xiong, J.; Zhuang, P.; Zhang, Y. An Efficient Underwater Image Enhancement Model With Extensive Beer-Lambert Law. In Proceedings of the 2020 IEEE International Conference on Image Processing (ICIP), Online, 25–28 October 2020; pp. 893–897. [Google Scholar]
  10. Wang, Y.; Yu, X.; An, D.; Wei, Y. Underwater image enhancement and marine snow removal for fishery based on integrated dual-channel neural network. Comput. Electron. Agric. 2021, 186, 106182. [Google Scholar] [CrossRef]
  11. Ancuti, C.; Ancuti, C.O.; Haber, T.; Bekaert, P. Enhancing underwater images and videos by fusion. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; pp. 81–88. [Google Scholar]
  12. Fu, X.; Huang, Y.; Zeng, D.; Zhang, X.P.; Ding, X. A fusion-based enhancing approach for single sandstorm image. In Proceedings of the 2014 IEEE 16th International Workshop on Multimedia Signal Processing (MMSP), Jakarta, Indonesia, 22–24 September 2014; pp. 1–5. [Google Scholar]
  13. Zhang, Y.; Sun, X.; Dong, J.; Chen, C.; Lv, Q. GPNet: Gated pyramid network for semantic segmentation. Pattern Recognit. 2021, 115, 107940. [Google Scholar] [CrossRef]
  14. Gedamu, K.; Ji, Y.; Yang, Y.; Gao, L.; Shen, H.T. Arbitrary-view human action recognition via novel-view action generation. Pattern Recognit. 2021, 118, 108043. [Google Scholar] [CrossRef]
  15. Liang, Z.; Wang, Y.; Ding, X.; Mi, Z.; Fu, X. Single underwater image enhancement by attenuation map guided color correction and detail preserved dehazing. Neurocomputing 2021, 425, 160–172. [Google Scholar] [CrossRef]
  16. Li, J.; Skinner, K.A.; Eustice, R.M.; Johnson-Roberson, M. WaterGAN: Unsupervised generative network to enable real-time color correction of monocular underwater images. IEEE Robot. Autom. Lett. 2017, 3, 387–394. [Google Scholar] [CrossRef]
  17. Fabbri, C.; Islam, M.J.; Sattar, J. Enhancing underwater imagery using generative adversarial networks. In Proceedings of the 2018 IEEE International Conference on Robotics and Automation (ICRA), Brisbane, Australia, 21–25 May 2018; pp. 7159–7165. [Google Scholar]
  18. Zhu, J.Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 2223–2232. [Google Scholar]
  19. Guo, Y.; Li, H.; Zhuang, P. Underwater image enhancement using a multiscale dense generative adversarial network. IEEE J. Ocean. Eng. 2019, 45, 862–870. [Google Scholar] [CrossRef]
  20. Li, C.; Anwar, S.; Porikli, F. Underwater scene prior inspired deep underwater image and video enhancement. Pattern Recognit. 2020, 98, 107038. [Google Scholar] [CrossRef]
  21. Zhang, X.; Jiang, R.; Wang, T.; Luo, W. Single image dehazing via dual-path recurrent network. IEEE Trans. Image Process. 2021, 30, 5211–5222. [Google Scholar] [CrossRef]
  22. Liu, S.; Pan, J.; Yang, M.H. Learning recursive filters for low-level vision via a hybrid neural network. In Proceedings of the 2016 European Conference on Computer Vision, Amsterdam, The Netherlands, 11–16 October 2016; pp. 560–576. [Google Scholar]
  23. Zhao, C.; Sun, L.; Purkait, P.; Duckett, T.; Stolkin, R. Dense rgb-d semantic mapping with pixel-voxel neural network. Sensors 2018, 18, 3099. [Google Scholar] [CrossRef]
  24. Burney, S.A.; Tariq, H. K-means cluster analysis for image segmentation. Int. J. Comput. Appl. 2014, 96. [Google Scholar]
  25. Dehariya, V.K.; Shrivastava, S.K.; Jain, R. Clustering of image data set using k-means and fuzzy k-means algorithms. In Proceedings of the 2010 International Conference on Computational Intelligence and Communication Networks, Bhopal, India, 26–28 November 2010; pp. 386–391. [Google Scholar]
  26. Li, C.; Guo, C.; Ren, W.; Cong, R.; Hou, J.; Kwong, S.; Tao, D. An underwater image enhancement benchmark dataset and beyond. IEEE Trans. Image Process. 2019, 29, 4376–4389. [Google Scholar] [CrossRef]
  27. Alenezi, F.; Santosh, K. Geometric Regularized Hopfield Neural Network for Medical Image Enhancement. Int. J. Biomed. Imaging 2021, 2021, 6664569. [Google Scholar] [CrossRef]
  28. Alenezi, F.S.; Ganesan, S. Geometric-Pixel Guided Single-Pass Convolution Neural Network With Graph Cut for Image Dehazing. IEEE Access 2021, 9, 29380–29391. [Google Scholar] [CrossRef]
  29. Deng, X.; Wang, H.; Liu, X. Underwater image enhancement based on removing light source color and dehazing. IEEE Access 2019, 7, 114297–114309. [Google Scholar] [CrossRef]
  30. Berman, D.; Levy, D.; Avidan, S.; Treibitz, T. Underwater single image color restoration using haze-lines and a new quantitative dataset. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 43, 2822–2837. [Google Scholar] [CrossRef] [PubMed]
  31. Siqueira, M.G.; Diniz, P.S. Digital filters. In The Electrical Engineering Handbook; Elsevier: Amsterdam, The Netherlands, 2005; pp. 839–860. [Google Scholar]
  32. Ancuti, C.O.; Ancuti, C.; De Vleeschouwer, C.; Bekaert, P. Color balance and fusion for underwater image enhancement. IEEE Trans. Image Process. 2017, 27, 379–393. [Google Scholar] [CrossRef] [PubMed]
  33. Chiang, J.Y.; Chen, Y.C. Underwater image enhancement by wavelength compensation and dehazing. IEEE Trans. Image Process. 2011, 21, 1756–1769. [Google Scholar] [CrossRef]
  34. Fu, X.; Fan, Z.; Ling, M.; Huang, Y.; Ding, X. Two-step approach for single underwater image enhancement. In Proceedings of the 2017 International Symposium on Intelligent Signal Processing and Communication Systems (ISPACS), Xiamen, China, 6–9 November 2017; pp. 789–794. [Google Scholar]
  35. Iqbal, M.; Riaz, M.M.; Ali, S.S.; Ghafoor, A.; Ahmad, A. Underwater Image Enhancement Using Laplace Decomposition. IEEE Geosci. Remote Sens. Lett. 2020, 19, 1–5. [Google Scholar] [CrossRef]
  36. Li, C.Y.; Guo, J.C.; Cong, R.M.; Pang, Y.W.; Wang, B. Underwater image enhancement by dehazing with minimum information loss and histogram distribution prior. IEEE Trans. Image Process. 2016, 25, 5664–5677. [Google Scholar] [CrossRef]
  37. Berman, D.; Treibitz, T.; Avidan, S. Diving into haze-lines: Color restoration of underwater images. In Proceedings of the 2017 British Machine Vision Conference (BMVC), London, UK, 4–7 September 2017; pp. 1–50. [Google Scholar]
  38. Fu, X.; Cao, X. Underwater image enhancement with global–local networks and compressed-histogram equalization. Signal Process. Image Commun. 2020, 86, 115892. [Google Scholar] [CrossRef]
  39. Song, W.; Wang, Y.; Huang, D.; Tjondronegoro, D. A rapid scene depth estimation model based on underwater light attenuation prior for underwater image restoration. In Proceedings of the 2018 Pacific Rim Conference on Multimedia, Hefei, China, 21–22 September 2018; pp. 678–688. [Google Scholar]
  40. Liu, X.; Gao, Z.; Chen, B.M. MLFcGAN: Multilevel feature fusion-based conditional GAN for underwater image color correction. IEEE Geosci. Remote Sens. Lett. 2019, 17, 1488–1492. [Google Scholar] [CrossRef]
  41. Islam, M.J.; Xia, Y.; Sattar, J. Fast underwater image enhancement for improved visual perception. IEEE Robot. Autom. Lett. 2020, 5, 3227–3234. [Google Scholar] [CrossRef]
  42. Qi, Q.; Zhang, Y.; Tian, F.; Wu, Q.J.; Li, K.; Luan, X.; Song, D. Underwater image co-enhancement with correlation feature matching and joint learning. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 1133–1147. [Google Scholar] [CrossRef]
  43. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “completely blind” image quality analyzer. IEEE Signal Process. Lett. 2012, 20, 209–212. [Google Scholar] [CrossRef]
  44. Panetta, K.; Gao, C.; Agaian, S. Human-visual-system-inspired underwater image quality measures. IEEE J. Ocean. Eng. 2015, 41, 541–551. [Google Scholar] [CrossRef]
  45. Yang, M.; Sowmya, A. An underwater color image quality evaluation metric. IEEE Trans. Image Process. 2015, 24, 6062–6071. [Google Scholar] [CrossRef]
  46. Fu, Z.; Fu, X.; Huang, Y.; Ding, X. Twice mixing: A rank learning based quality assessment approach for underwater image enhancement. Signal Process. Image Commun. 2022, 102, 116622. [Google Scholar] [CrossRef]
Figure 1. These examples show the visual performance of the proposed technique. Top row: input (hazed) images. Bottom row: the corresponding results produced by the proposed technique.
Figure 3. The proposed overall architecture of the TDPRN. Feature maps are extracted via the feature extraction and transmission map estimation blocks. These blocks also estimate the transmission map for the dual-path block. Within the dual-path block, the top sequential units constitute one branch, Path $\Lambda_{1c}$, and the bottom units constitute the other branch, Path $\Lambda_{2c}$. These two paths interact with feature maps through the parallel interaction function until the final stage. The image reconstruction block is used to reconstruct the corresponding color channel's dehazed images. The color channels are then fused via the softmax-weighted stack to obtain the final dehazed image.
Figure 4. A detailed illustration of the dual-path block and the parallel interaction function. Within the dual-path block, the top sequential units constitute one branch, Path $\Lambda_{1c}$, and the bottom units constitute the other branch, Path $\Lambda_{2c}$. These two paths interact with feature maps through the parallel interaction function (four red-colored dotted lines) until the final stage. The image reconstruction block is used to reconstruct the corresponding color channel's dehazed images. The color channels are then fused via a softmax-weighted stack to obtain the dehazed image. At each stage, units $S_{i,j}^c$, $i \in \{1, 2\}$, $j \in \{1, \ldots, 5\}$, consisting of a ConvLSTM block and a pixel-wise convolution, take features from the parallel interaction function and previous units as inputs.
Figure 5. An extract of the detailed illustration of the dual-path block and the parallel interaction function for the green color channels’ RCNN from Figure 4.
Figure 6. Visual comparison of real underwater images sourced from [46]. From left to right are raw underwater images, and the proposed images. The proposed results are compared with the results from Ancuti [11], dark channel prior (DCP) [4], histogram distribution prior (HP) [36], and Water-Net [26].
Figure 7. Comparison of coral reef, fish, whale, ship anchor and stingray images from the top row to the last row. From left to right are raw underwater images, and proposed images. These images are compared with the results from Fus [32], WCID [33], Ts [34], LD [35].
Figure 8. Visual comparison of synthetic underwater images. From left to right are raw underwater images, proposed and clean (ground-truth) images. The proposed and clean images are compared with results from Ancuti [11], Guo [36], Berman [37], Cosman [8], Zhuang [19], and gl [38].
Figure 9. Visual comparison of natural underwater images. From left to right are raw underwater images, and proposed images. The proposed images were compared with the results from Ancuti [11], Guo [36], Berman [37], Cosman [8], Zhuang [19], gl [38]. The images are sourced from [38].
Figure 10. Subjective comparison of underwater images from [42]. From left to right are raw underwater images, results from CBF [32], ULAP [39], UWCNN [20], MLFcGAN [40], FUnIEGAN [41], waterNet [26], UICoE-Net [42] and the proposed.
Figure 11. Box plot showing the performance of the proposed technique compared to the existing methods.
Figure 12. Graphical comparison of UISM.
Figure 13. Pixel-correlation correction.
Table 1. Competitive methods used to compare the proposed algorithm.
Method | Reference | Year | Method | Reference | Year
Ancuti | [11] | 2012 | Guo | [36] | 2016
Dark channel prior (DCP) | [4] | 2010 | Berman | [37] | 2017
Histogram distribution prior (HP) | [36] | 2016 | Cosman | [8] | 2017
Water-Net | [26] | 2019 | Zhuang | [19] | 2019
Fus | [32] | 2017 | gl | [38] | 2020
WCID | [33] | 2011 | CBF | [32] | 2017
Ts | [34] | 2017 | ULAP | [39] | 2018
LD | [35] | 2020 | UWCNN | [20] | 2020
MLFcGAN | [40] | 2019 | FUnIEGAN | [41] | 2020
waterNet | [26] | 2019 | UICoE-Net | [42] | 2021
Table 2. Average NIQE, UIQM_norm and UCIQE comparison of different techniques whose results are partially presented in Figure 6. The best result is bold.
Technique | NIQE | UIQM_norm | UCIQE
Input | 6.3007 | 1.0773 | 30.6619
Ancuti [11] | 4.704 | 1.2584 | 31.4635
DCP [4] | 5.9239 | 1.1049 | 32.0602
HP [36] | 4.2751 | 1.5625 | 35.1167
Water-Net [26] | 6.6270 | 1.0714 | 26.7070
Proposed | 4.0614 | 1.6016 | 35.7874
Table 3. Average NIQE, UIQM_norm and UCIQE comparison of different techniques whose results are partially presented in Figure 7. The best result is bold.
Technique | NIQE | UIQM_norm | UCIQE
Input | 3.4554 | 1.0952 | 26.7702
Fus [32] | 7.5204 | 1.5673 | 34.9629
WCID [33] | 6.7322 | 1.7766 | 29.1488
Ts [34] | 3.5524 | 1.1900 | 26.4551
LD [35] | 3.5483 | 1.4806 | 31.6199
Proposed | 3.9341 | 1.9499 | 32.9188
Table 4. Average mAP, NIQE, UIQM_norm and UCIQE comparison of different techniques whose results are partially presented in Figure 8. The best result is bold.
Technique | mAP | NIQE | UIQM_norm | UCIQE
Input | 0.1795 | 7.9761 | 1.4175 | 32.9212
Ancuti [11] | 0.1891 | 8.4061 | 1.5278 | 33.1459
Guo [36] | 0.2669 | 7.2922 | 1.5350 | 33.3579
Berman [37] | 0.3394 | 7.4781 | 1.5617 | 33.4739
Cosman [8] | 0.4095 | 7.1248 | 1.5371 | 34.1637
Zhuang [19] | 0.4181 | 7.5798 | 1.5399 | 32.9392
gl [38] | 0.4841 | 8.1367 | 1.5220 | 32.6404
Proposed | 0.5017 | 9.2712 | 1.8174 | 35.9634
Table 5. Average NIQE, UIQM_norm and UCIQE comparison of different techniques whose results are partially presented in Figure 9. The best result is bold.
Technique | NIQE | UIQM_norm | UCIQE
Input | 5.4243 | 1.4116 | 32.6663
Ancuti [11] | 5.9633 | 1.4541 | 37.1339
Guo [36] | 5.6586 | 1.5828 | 33.8156
Berman [37] | 5.4780 | 1.5612 | 33.3881
Cosman [8] | 6.0975 | 1.4827 | 34.4037
Zhuang [19] | 6.6109 | 1.4549 | 33.0792
gl [38] | 5.9403 | 1.4536 | 34.5999
Proposed | 6.2905 | 2.0299 | 37.1459
Table 6. Average NIQE, UIQM_norm and UCIQE comparison of different techniques whose results are partially presented in Figure 10. The best result is bold.
Technique | NIQE | UIQM_norm | UCIQE
Input | 5.6203 | 1.6118 | 31.1119
CBF [32] | 7.9360 | 1.6938 | 31.0506
ULAP [39] | 11.8648 | 1.8419 | 33.7829
UWCNN [20] | 6.9041 | 1.6054 | 30.4853
MLFcGAN [40] | 4.9413 | 1.4881 | 33.0000
FUnIEGAN [41] | 7.6138 | 1.7062 | 32.1150
waterNet [26] | 6.9541 | 1.6953 | 31.3519
UICoE-Net [42] | 4.7577 | 1.5692 | 31.2444
Proposed | 7.0530 | 2.1111 | 35.2936
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
