DRGAN: Dense Residual Generative Adversarial Network for Image Enhancement in an Underwater Autonomous Driving Device

Underwater autonomous driving devices, such as autonomous underwater vehicles (AUVs), rely on visual sensors, but the images they capture tend to exhibit color aberrations and high turbidity due to the scattering and absorption of underwater light. To address these issues, we propose the Dense Residual Generative Adversarial Network (DRGAN) for underwater image enhancement. First, we adopt a multi-scale feature extraction module to obtain a range of information and increase the receptive field. Second, a dense residual block is proposed to realize the interaction of image features and ensure stable connections in the feature information; multiple dense residual modules are connected from beginning to end to form a cyclic dense residual network, producing a clear image. Finally, the stability of the network is improved by adjusting the training with multiple loss functions. Experiments were conducted on the RUIE and Underwater ImageNet datasets. The experimental results show that the proposed DRGAN can remove high turbidity from underwater images and achieves better color equalization than other methods.


Introduction
With developments in science and technology, Internet of Things (IoT) technology has been introduced into advanced underwater vision tasks, such as autonomous underwater driving, ocean scene analysis, and fisheries. Underwater data, such as the marine environment, marine density, and seafloor pathways, are monitored to understand the state of the underwater environment and the growth and health of underwater life through the intelligent analysis of real-time photographs taken underwater. However, the accuracy of the intelligent analysis results is greatly influenced by the quality of the underwater images; a complex imaging environment results in color casts and a loss of detail in the images obtained [1,2]. As a result, it is critical to achieve clarity and improve the details in underwater images.
Underwater image enhancement is developing rapidly via both traditional and deep learning methods. Among traditional methods, Drews et al. [3], influenced by the dark channel prior (DCP) [4], proposed transmission estimation in underwater single images (UDCP), which does not take into account the effects of the red channel but is prone to overexposure. Ma et al. [5] proposed the restoration of underwater images using a mix of improved dark channel prior and gray world methods; this new model improved the DCP and gray world theory to restore underwater images. Ancuti et al. [6] obtained a clear image after performing white balance and gamma correction on the damaged image. To improve underwater image quality, Liang et al. [7] combined color correction based on attenuation maps with a detail-retention and haze-removal method based on multi-scale decomposition. Marques et al. [8] derived an effective atmospheric illumination model from local contrast information based on human observations and, from this model, generated one enhanced image to highlight details and another to remove darkness; the underwater image is then enhanced via multi-scale fusion. The interpretability of traditional methods is obvious, but their effect needs further improvement.
In recent years, the application of deep learning methods in underwater image processing has become more and more prominent, especially methods based on generative adversarial networks (GANs). For example, Li et al. [9] proposed Water-GAN to enhance underwater images; synthetic underwater images are used as datasets for training the neural network to perform underwater image color correction. Fabbri et al. [10] suggested the enhancement of underwater imagery using generative adversarial networks. They first applied CycleGAN to paired images to create degraded underwater images; the underwater image pairs were then selected as datasets for further network training. Guo et al. [11] designed a multi-level intensive generative adversarial network containing two multi-scale dense blocks that can correct color differences and enhance image details. Islam et al. [12] suggested fast underwater image enhancement to enhance the perception of images (FUnIE-GAN) based on U-Net, which improves image detail clarity by using residual connections in the generator. GAN-RS, a multi-branch discriminator proposed by Chen et al. [13], was developed to increase the quality of underwater images; however, its numerous training parameters require careful tuning, and training with incorrect parameters produces artifacts in the resulting images. Huang et al. [14] proposed Semi-UIR, a semi-supervised, mean-teacher-based underwater image restoration model that enhances performance by constructing a reliable bank and using contrastive learning. Compared with traditional methods, deep learning methods can better solve the image color distortion problem and have superior portability and learning ability in image processing.
The above methods focus on enhancing underwater images, as shown in Figure 1. However, these algorithms are not well-suited to intricate scenarios because they pay little attention to the color and data loss caused by the imaging environment. In addition, most available methods improve the network by increasing the network depth; however, this results in problems such as vanishing or exploding gradients, training difficulties, and unstable parameters [15].

To solve the problems above, we propose the Dense Residual Generative Adversarial Network (DRGAN). The remainder of this work is structured as follows. Section 2 discusses related work, such as dense residual theory and GANs. Section 3 describes our proposed method in detail, and Section 4 presents and discusses the experimental results and analysis. Finally, Section 5 concludes the paper.

Generative Adversarial Network
Generative adversarial networks are composed of two distinct neural networks: a generator and a discriminator [16]. In this paper, we employ a generator to produce a distinct image from the deteriorated image; the discriminator takes both the clear image and the generated image as input, and it outputs the probability that the generated image is real. The generator and discriminator engage in an adversarial relationship during training, encouraging the discriminator to accurately distinguish between genuine and counterfeit samples. Ultimately, we want the generator to produce high-quality images.
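The adversarial objective can be sketched numerically. The following is a minimal non-saturating GAN loss computed from the discriminator's probability outputs; it is a generic illustration, not the paper's exact objective, and the names `d_real` and `d_fake` are hypothetical:

```python
import numpy as np

def gan_losses(d_real, d_fake, eps=1e-8):
    """Standard non-saturating GAN losses from D's probabilities.
    d_real: D's outputs on undistorted images; d_fake: D's outputs
    on generated images (a generic sketch, not the paper's exact loss)."""
    # D wants to output 1 on real samples and 0 on fakes
    d_loss = -np.mean(np.log(d_real + eps) + np.log(1.0 - d_fake + eps))
    # G wants D to output 1 on its fakes
    g_loss = -np.mean(np.log(d_fake + eps))
    return d_loss, g_loss

# a confident discriminator: high on real, low on fake
d_loss, g_loss = gan_losses(np.array([0.9]), np.array([0.1]))
```

When the discriminator is confident, the generator's loss is large, which is exactly the adversarial pressure that drives the generator toward the real data distribution.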

Residual Network
He et al. [17] proposed the residual network to address the degradation problem that arises when a network becomes excessively deep.
As shown in Figure 2, adding the details of the shallow layers to the subsequent deep layers allows the deep layers to focus on learning the residual mapping, avoids the loss of feature information, and prevents model degradation. Consequently, applying this technique to deep networks can address issues such as vanishing or exploding gradients during training.
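The shortcut idea can be sketched in a few lines of numpy. This toy block uses a single linear layer plus ReLU as the residual branch F, an illustrative stand-in for the stacked convolutions in Figure 2:

```python
import numpy as np

def residual_block(x, weight):
    """Toy residual block: output = F(x) + x, where F is one
    linear transform followed by ReLU (an assumption; the paper's
    blocks use stacked convolutions)."""
    fx = np.maximum(weight @ x, 0.0)  # residual branch F(x)
    return fx + x                     # identity shortcut

x = np.array([1.0, -2.0, 3.0])
y = residual_block(x, np.eye(3))           # non-trivial residual branch
z = residual_block(x, np.zeros((3, 3)))    # zero weights -> pure identity
```

With zero weights the block reduces to the identity, which is why a residual layer can never make the network worse than its shallower counterpart: the shortcut always carries the input (and its gradient) through unchanged.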


Densely Connected Convolutional Network
The distinction between densely connected convolutional networks (DenseNets) [18] and residual networks lies in the fact that DenseNets facilitate the transmission of data between the various layers of the network and increase the number of links in each layer, leading to improved feature reuse and more powerful gradient propagation. As illustrated in Figure 3, the feature output of each preceding layer is sent to all the following layers, and the transmission of feature information is improved by linking the layers of the network in pairs. Because each layer is connected to all the previous layers, when the gradient is back-propagated it is transferred to all the preceding layers in turn: a small number of convolutional kernels can still produce a substantial amount of feature information, and the preceding layers can fine-tune their parameters by taking advantage of the gradient data from the subsequent layers. Through this process, the issue of gradient disappearance is lessened, and the network's training performance is enhanced. The dense connections in Figure 3 can be represented as X_n = H_n([X_0, X_1, ..., X_{n-1}]), where [X_0, X_1, ..., X_{n-1}] is the concatenation of the feature maps output by all preceding layers and H_n denotes the operations of the n-th layer.
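A minimal sketch of this connectivity pattern, with 1-D vectors standing in for feature maps and two toy "layers" whose transforms are purely illustrative:

```python
import numpy as np

def dense_forward(x0, layers):
    """Toy dense connectivity: layer n receives the concatenation
    [x0, x1, ..., x_{n-1}] of ALL earlier outputs (1-D vectors
    stand in for feature maps here)."""
    features = [x0]
    for layer in layers:
        inp = np.concatenate(features)   # reuse every earlier feature
        features.append(layer(inp))
    return np.concatenate(features)      # final output keeps everything

# two toy "layers" that each emit a 2-element feature vector
layers = [lambda v: v[:2] * 2.0, lambda v: v[:2] + 1.0]
out = dense_forward(np.array([1.0, 2.0]), layers)
```

Each layer's input grows with the number of preceding layers, which is exactly how DenseNets achieve feature reuse with comparatively few kernels per layer.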


Generative Network
Our generator includes a multi-scale feature extraction module (MSFEM) and a dense residual block (DRB), facilitating the generation of clear underwater images. Taking the yellow convolution unit in the MSFEM as an example, as shown in Figure 4, Conv1 is a convolution with a stride of 1, the number of convolution kernels is 16, ReLU is the activation function, and BN denotes batch normalization.

As shown in Figure 5, the 1 × 1 and 3 × 3 convolution kernels focus on extracting image detail information, the 5 × 5 and 7 × 7 convolution kernels better extract the global features of the image, and each parallel convolution unit consists of two identical convolution kernels. By combining the above, the MSFEM can extract different spatial features from the input image. We use concatenation to combine the four feature maps to realize feature information fusion, which, to a certain extent, avoids the loss of shallow details. The output can be expressed as F = Concat(f_1, f_2, f_3, f_4), where F is the feature image processed by the module, and f_1, f_2, f_3, and f_4 are the feature images obtained by the four convolution units, respectively.
As illustrated in Figure 4, we designed a DRB composed of 3 × 3 convolution kernels. The goal of dense connections is to process as much information as possible from all layers, and the use of residual connections not only improves the utilization of information and ensures image integrity but also enables the next dense residual block to preprocess the information.
By combining residual and dense connections, the DRB ensures the correct transmission of feature information and reduces the computational complexity of the module. More importantly, compared with other networks, the network efficiency is improved without additional computational cost, making feature information fusion more efficient. The result of the residual can be expressed as Output = h(x_l) + F(x_l, W_l), where Output is the output image of the module, h(x_l) is the direct mapping, F(x_l, W_l) is the residual part, x_l is the input, and W_l is the convolution operation. Based on the SSIM test results shown in Figure 6, the number of dense residual block cycles in the network is set to 7 to achieve the best outcome.



Discrimination Network
The proposed discrimination network, as illustrated in Figure 7, is made up of five convolution units, following the design of PatchGAN [19]. Each convolution unit follows the Conv-BN-Leaky ReLU structure, and the stride of multiple units is set to 2 to increase the receptive field of the output features. Because neurons can no longer learn once the ReLU function enters the negative interval, we chose the leaky ReLU function to limit the appearance of silent neurons. The network uses the generated image and the undistorted image as inputs, and it outputs a feature map with a size of 30 × 30. The discrimination network operates on small image patches, which greatly reduces the number of parameters and the amount of computation, while also alleviating the slow convergence that is characteristic of GANs. The leaky ReLU expression is f(x) = x for x > 0 and f(x) = αx otherwise, where α is a small constant used to retain some negative-axis values so that the information on the negative axis is not completely lost. In the last layer, the sigmoid function S(x) = 1/(1 + e^(−x)) is used to map the output pixel range to [0, 1], which makes it possible to clearly distinguish between the generated image and the undistorted image in a given area.
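The two activation functions used by the discriminator can be sketched directly; α = 0.2 is a common default here, not a value stated in the paper:

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    """Keeps a small slope alpha on the negative axis so neurons
    never go fully silent (alpha=0.2 is an assumed default)."""
    return np.where(x > 0, x, alpha * x)

def sigmoid(x):
    """Maps the final-layer outputs to (0, 1) probabilities."""
    return 1.0 / (1.0 + np.exp(-x))

lr = leaky_relu(np.array([-1.0, 0.5]))  # negative input survives, scaled
p = sigmoid(np.array([0.0]))            # zero logit -> probability 0.5
```

Unlike plain ReLU, leaky ReLU keeps a nonzero gradient for negative inputs, which is why it avoids the "silent neuron" problem the text describes.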


Loss Function
The adjustment of the network is facilitated by a linear combination of GAN loss, SSIM loss, and perceptual loss, as outlined below.
(1) GAN Loss. The GAN loss is used to make the generated sample distribution as close as possible to the true sample distribution. The adversarial loss is defined as L_GAN(G, D) = E_Y[log D(Y)] + E_{X,Z}[log(1 − D(G(X, Z)))], where X is the degraded image, Y is the undistorted image, E is the mathematical expectation, and Z denotes the random noise. To ensure that D recognizes the image produced by G as an undistorted image, G generates an image that conforms to the undistorted data distribution as much as possible.
(2) SSIM Loss. The structural similarity of the two images is measured using the SSIM loss. The SSIM loss behaves similarly to the human visual system: it is sensitive to local structural changes and is conducive to enhancing the image's texture details. The SSIM loss is defined as L_SSIM(P) = 1 − SSIM(p), where P is the image block and p is the image block's center pixel.
(3) Perceptual Loss. The perceptual loss is defined over the feature maps of a trained convolutional neural network; the image details obtained when this loss participates in training are more realistic. The perceptual loss is defined as L_per = (1/(W_{i,j} H_{i,j})) Σ_{x,y} (φ_{i,j}(Y)_{x,y} − φ_{i,j}(G(X))_{x,y})², where φ_{i,j} is the feature map output by the j-th convolution layer before the i-th pooling layer in the pre-trained VGG19 network, and W_{i,j} and H_{i,j} are the dimensions of the feature map. In this paper, i is taken as 4 and j as 3; the VGG_{4,3} convolution feature map is selected to define the loss.
(4) Overall Loss. The overall loss, obtained via a linear combination of the three loss functions, can effectively improve the robustness of the network and is defined as L = λ_1 L_GAN + λ_2 L_SSIM + λ_3 L_per. After many experiments, λ_1 is set to 1, λ_2 to 100, and λ_3 to 10.
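The weighted combination above can be sketched as a one-line function using the reported weights; pairing λ_1, λ_2, λ_3 with the GAN, SSIM, and perceptual terms follows the listing order (an assumption), and the loss values passed in below are placeholders:

```python
def overall_loss(l_gan, l_ssim, l_perc, lam1=1.0, lam2=100.0, lam3=10.0):
    """Linear combination of the three loss terms with the weights
    reported in the paper (lambda1=1, lambda2=100, lambda3=10)."""
    return lam1 * l_gan + lam2 * l_ssim + lam3 * l_perc

# placeholder per-term loss values, purely for illustration
total = overall_loss(0.5, 0.02, 0.1)
```

The large λ_2 means even a small SSIM deficit dominates the total, pushing the generator toward structurally faithful outputs.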

Experiment
To verify the effectiveness of DRGAN, in this study, we first set out the experimental details. Then, we compared DRGAN with different representative methods: Fusion [6], ICCB [20], Lˆ2UWE [8], FUnIE-GAN [12] (abbreviated to FUnIE below), Semi-UIR [14], and UWCNN [21]. Finally, to validate the components of DRGAN, we performed ablation experiments. Furthermore, we conducted experiments such as feature point matching and edge detection to validate the usefulness of our approach in real-world applications.

Experimental Details
We conducted experiments on the Underwater ImageNet [10] and RUIE [22] datasets, respectively. The details are as follows. (1) From the Underwater ImageNet dataset, we randomly selected 4000 pairs of images from underwater scenes for training and 2000 pairs for testing. (2) We used the model trained on the Underwater ImageNet dataset to test on the RUIE dataset, which demonstrated the generalization ability of DRGAN. We trained DRGAN with Adam and set the training and test image size to 256 × 256 × 3, the batch size to 2, and the number of epochs to 50. TensorFlow was used as the deep learning framework on an Ubuntu 18.04 machine with 32 GB RAM and a GTX1070Ti (8 GB).

Subjective Evaluation
The color of an undistorted swatch image is degraded by the complex underwater imaging environment. Therefore, the color restoration capability of DRGAN can be efficiently tested through color recovery experiments on a color card [23].
As can be seen in Figure 8, the Fusion method reduces the contrast between the yellow and pink color blocks while deepening the overall hue of the color card picture, and the image processed via the ICCB algorithm suffers from color distortion. Although the Semi-UIR algorithm can achieve color correction, the visual effect is negatively affected by the overall redness of the processed image. The image processed via the Lˆ2UWE algorithm shows low discrimination, as evidenced by the dark purple and green color cards that are visually close to black. The color cards processed via the UWCNN algorithm suffer from poor color correction, as shown by their blueish hue, and the FUnIE algorithm tends to make the color cards appear red. On the contrary, our method achieves promising visual results with the color card images, especially when dealing with indistinguishable color patches (specifically black, purple, and dark green), validating the superiority of the color correction capability of our method. Next, the method was applied to images from a complex underwater environment. The input images were affected by different degrees of color distortion, low brightness, and turbidity, resulting in various degradation phenomena. Figure 9 illustrates the processing results for each method. Images 1-2 are the normal degraded images, Images 3-4 are the atomized images, and Images 5-6 and Images 7-8 are green and blue partial images, respectively.
As can be seen in Figure 9, the Fusion algorithm fails to improve the sharpness and quality of low-brightness and color-distorted images. The ICCB algorithm has some success in improving brightness and color correction, but the vividness of the image colors is greatly reduced. The Lˆ2UWE algorithm fails to improve the green, blue, and normal degraded images; although the fogging problem can be mitigated, the generated images appear insufficiently bright. The FUnIE algorithm can solve the problem of low brightness, but the problem of color distortion remains; the fogged image processed via FUnIE shows an obvious reddish tint, which is not consistent with the real image, and the image processed via the UWCNN algorithm cannot achieve a good visual effect due to its overall bluish color. Image processing using the Semi-UIR method achieves some success in defogging and color correction, but the overall brightness of the final image is low. In addition, as shown in Images 5-8, the ICCB method is unable to perform effective deblurring, as evidenced by the severe color distortion. On the contrary, our method produces brighter and clearer images than all the tested comparison algorithms. The algorithm can address degradation in complex underwater environments (off-color, low brightness, high turbidity, etc.) and exhibits strong robustness. Subjective evaluation shows that our method produces clearer results for images with different degrees of degradation than other, similar new methods.

Objective Evaluation
The image quality when applying our method was further evaluated through five objective evaluation indexes: UCIQE, UIQM, SSIM, PSNR, and CIEDE2000.
(1) The underwater color image quality evaluation index (UCIQE) [24] is proportional to underwater picture quality and is calculated as UCIQE = c_1 σ_c + c_2 con_l + c_3 μ_s, where σ_c is the chromaticity standard deviation, con_l represents the contrast in brightness, μ_s represents the average saturation, and c_1, c_2, and c_3 are weighting coefficients.
(2) The underwater image quality measurement (UIQM) [25] is a no-reference underwater image quality indicator inspired by the human visual system. It is a linear combination of the underwater image colorfulness measure (UICM), underwater image sharpness measure (UISM), and underwater image contrast measure (UICONM): UIQM = c_1 UICM + c_2 UISM + c_3 UICONM, where c_1 is set to 0.0282, c_2 to 0.2953, and c_3 to 3.5735. The higher the UIQM, the better the image's color balance, sharpness, and contrast.
(3) The structural similarity index measurement (SSIM) [26] determines how similar two images are. Given two images x and y, it is calculated as SSIM(x, y) = ((2 μ_x μ_y + c_1)(2 σ_xy + c_2)) / ((μ_x² + μ_y² + c_1)(σ_x² + σ_y² + c_2)), where μ_x and μ_y are the averages of x and y, respectively; σ_x² and σ_y² are the variances of x and y; σ_xy is the covariance of x and y; and c_i = (k_i L)², (i = 1, 2) is a constant that maintains stability, with k_1 = 0.01, k_2 = 0.03, and L the dynamic range of the pixel values.
(4) The peak signal-to-noise ratio (PSNR) is an index to measure image quality. For two compared images I_o and I_p of size m × n, the mean square error (MSE) is MSE = (1/(mn)) Σ_{i,j} (I_o(i,j) − I_p(i,j))², and the PSNR is obtained from the MSE as PSNR = 10 log_10(MAX² / MSE), where MAX is the maximum possible pixel value.
(5) The CIEDE2000 evaluation index [27], which has a range of [0, 100], measures the color changes between the standard color card and each processed color block; the color differences decrease as the index decreases. For the evaluation in Figure 8, we used the CIEDE2000 evaluation index. Table 1 displays the results.
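As a concrete example of the PSNR computation above, here is a minimal numpy sketch for 8-bit images (MAX = 255):

```python
import numpy as np

def psnr(img_o, img_p, max_val=255.0):
    """PSNR from the MSE between two images; higher means the
    processed image is closer to the reference."""
    diff = img_o.astype(np.float64) - img_p.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")   # identical images: PSNR is unbounded
    return 10.0 * np.log10(max_val ** 2 / mse)

a = np.full((4, 4), 100.0)
b = np.full((4, 4), 110.0)    # uniform error of 10 -> MSE = 100
val = psnr(a, b)
```

A uniform pixel error of 10 on an 8-bit scale gives MSE = 100 and thus a PSNR of about 28 dB, a handy sanity check when implementing the metric.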
Comparing the data in Table 1, we can see that, like DRGAN, FUnIE and Semi-UIR both achieve good results; FUnIE adds residual connections to the generator to enhance the network performance. DRGAN's average CIEDE2000 result is the lowest, showing that our technique performs better in terms of color recovery.
We used UCIQE to evaluate the images in Figure 9, and the results are shown in Table 2. The results show that the average value of the DRGAN algorithm is higher than that of the other algorithms. For Image 1 and Image 2, DRGAN's UCIQE was lower than that of Lˆ2UWE because the original images were less degraded, and Semi-UIR's recovery was better than DRGAN's enhancement. However, ICCB, despite its higher UCIQE, showed significant color aberration in Image 6 and unnatural color restoration in Image 8.
The UIQM results from Figure 9 are shown in Table 3; the average UIQM of DRGAN is higher than that of the other algorithms. The light degradation of Image 1 and Image 3 leads to a higher UIQM for the ICCB restoration algorithm than for the enhancement effect of DRGAN, and when processed via FUnIE, Image 2 has a yellow color cast. Although Semi-UIR achieves good enhancement results, it is not thorough enough in detail processing, as shown in Image 8.
(5) The CIEDE2000 evaluation index [27], which has a range of [0, 100], measures the color changes between the standard color card and each processed color block.The color differences are reduced when the index decreases.For the evaluation in Figure 8, we used the CIEDE2000 evaluation index.Table 1 displays the results.Comparing the data in Table 1, we can see that, like DRGAN, FUnIE, and Semi-UIR both achieve good results, and FUnIE adds residual connections to the generator to enhance the network performance.DRGAN's CIEDE2000 average result is the lowest, showing that our technique performs better in terms of color recovery.
We used UCIQE to evaluate the images in Figure 9, and the results are shown in Table 2. The results show that the average value of the DRGAN algorithm is higher than that of the compared methods.

Then, we randomly selected an image for subjective comparison. Figure 10 shows that the image processed w/o DRB has artifacts and a yellow color cast, while the image processed w/o MSFEM is subjectively better than that w/o DRB but still shows a slight color cast. The image enhanced via the full model is the best and the most visually natural. Figure 10 also demonstrates that, after the removal of the residual connections in the dense residual block, color recovery is poor and artifacts appear; conversely, after the removal of the dense connections in the dense residual block, the image is over-enhanced.
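As a rough illustration of the UCIQE score used in Table 2, it can be sketched as a weighted sum of chroma standard deviation, luminance contrast, and mean saturation over CIELAB channels. The coefficients below are those reported in the original UCIQE paper; the saturation approximation and the helper name are illustrative assumptions, not our exact evaluation code.

```python
import numpy as np

# UCIQE linear coefficients reported by Yang & Sowmya (2015).
W1, W2, W3 = 0.4680, 0.2745, 0.2576

def uciqe_from_lab(L, a, b):
    """UCIQE sketch on CIELAB channel arrays: weighted sum of chroma std,
    luminance contrast (top 1% minus bottom 1%), and mean saturation."""
    chroma = np.sqrt(a**2 + b**2)
    sigma_c = chroma.std()
    con_l = np.percentile(L, 99) - np.percentile(L, 1)
    # Saturation approximated as chroma relative to combined chroma+lightness;
    # exact definitions vary between implementations.
    mu_s = np.mean(chroma / np.maximum(np.sqrt(chroma**2 + L**2), 1e-8))
    return W1 * sigma_c + W2 * con_l + W3 * mu_s

# A colorful image scores higher than a gray one with the same luminance.
rng = np.random.default_rng(1)
L = rng.uniform(20, 80, (16, 16))
a = rng.uniform(-40, 40, (16, 16))
b = rng.uniform(-40, 40, (16, 16))
print(uciqe_from_lab(L, a, b) > uciqe_from_lab(L, np.zeros_like(a), np.zeros_like(b)))
```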


Additional Experiments
Underwater images contain relatively little feature information, which makes underwater image detection challenging. As shown in Figures 11 and 12, several images were selected for SURF feature point matching and Canny operator experiments, verifying that our method can enhance edges and other feature information in underwater images.

Figure 11 shows the results of the SURF feature point matching; the processed images contain significantly more matched feature points than the original underwater images. These experiments show that the proposed algorithm successfully enriches the characteristics of underwater images, making subsequent information processing much easier.

Figure 12 shows the results of the Canny operator; after processing with our method, more image details (such as coral patterns) are recovered. Compared with the degraded images, DRGAN clearly reveals the contour information of the picture, making the detection and tracking of features of interest by underwater robots far less taxing.
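To illustrate the kind of comparison behind Figure 12, the sketch below implements only the Sobel gradient-magnitude step at the core of Canny detection, in plain NumPy, and counts strong-edge pixels. The `edge_density` helper and its threshold are illustrative choices, not the detector used in our experiments.

```python
import numpy as np

# 3x3 Sobel kernels (the gradient step at the core of Canny detection)
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
SOBEL_Y = SOBEL_X.T

def conv2d(img, kernel):
    """Valid 2-D correlation of a grayscale image with a 3x3 kernel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

def edge_density(img, threshold=0.5):
    """Fraction of pixels whose gradient magnitude exceeds a threshold."""
    gx = conv2d(img, SOBEL_X)
    gy = conv2d(img, SOBEL_Y)
    mag = np.hypot(gx, gy)
    return float(np.mean(mag > threshold))

# A sharp step edge yields a higher edge density than a flat region,
# which is how enhanced and degraded images can be compared numerically.
step = np.zeros((8, 8)); step[:, 4:] = 1.0
flat = np.zeros((8, 8))
print(edge_density(step) > edge_density(flat))  # True
```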

Conclusions
In this paper, we propose DRGAN as a means of enhancing underwater images, drawing inspiration from ResNet and DenseNet. Through a multi-scale feature extraction module and a dense residual block in the generator, multi-scale feature information is integrated. The incorporation of these multi-stage features broadens the receptive field and guards against the decline in network performance caused by vanishing gradients. Additionally, DRGAN leverages the benefits of both residual and dense connections; notably, the generator is more computationally efficient than networks that rely solely on dense blocks. We employ a PatchGAN-style discriminator for adversarial training, which strengthens the generator's ability to sharpen images. The experiments conducted on complex underwater scenes indicate that DRGAN greatly enhances image quality compared with several well-known techniques. In the future, we plan to apply the proposed method in other areas of marine engineering, such as object recognition and detection within wider underwater scenes.

Figure 1. Image comparison: (a) degraded image, (b) FUnIE-GAN, (c) UDCP, and (d) ours; the images in the second row are the enlarged details of the yellow boxes in the first row.


(1) A multi-scale feature extraction module is proposed to extract image detail information and expand the receptive field.
(2) A dense residual block is proposed to fuse feature maps into clear images, not only fully utilizing all layers with local dense connections but also adding residual connections to reuse information.
(3) We combine multiple loss functions to facilitate the generator's learning of clear image generation.
The experimental results show that DRGAN outperforms the state-of-the-art methods in terms of qualitative and quantitative indicators.
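The dense residual wiring in contribution (2) can be sketched at the connectivity level: each layer receives the concatenation of all earlier feature maps, and the block input is added back at the output. The NumPy "layers" below are random linear maps standing in for the actual convolutions, so this illustrates only the wiring, not the real network.

```python
import numpy as np

rng = np.random.default_rng(0)
FEATS = 16  # channel width of every internal "layer" (illustrative)

def layer(x, out_dim=FEATS):
    """Stand-in for a conv layer: a random linear map followed by ReLU."""
    W = rng.standard_normal((out_dim, x.shape[0])) * 0.1
    return np.maximum(W @ x, 0.0)

def dense_residual_block(x):
    """Dense connections: each layer sees the concatenation of all previous
    outputs. Residual connection: the block input is added back at the end."""
    features = [x]
    for _ in range(3):                      # three densely connected layers
        inp = np.concatenate(features)      # reuse all earlier feature maps
        features.append(layer(inp))
    out = layer(np.concatenate(features), out_dim=x.shape[0])  # fusion layer
    return out + x                          # residual shortcut

x = rng.standard_normal(FEATS)
y = dense_residual_block(x)
print(y.shape)  # (16,)
```

Because the shortcut preserves the input shape, several such blocks can be chained end to end, which is how the cyclic dense residual network described above is formed.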


Figure 3. The diagram of our densely connected convolutional network structure. Different colors are used to distinguish and emphasize the different nodes of the structure.


Figure 4. The diagram of our generative network structure, where different colors represent convolutional layers with different convolutional kernels.

Figure 5. The diagram of our multi-scale feature extraction module.


Figure 6. The number of dense residual block cycles.


Figure 7. The diagram of our discriminator network structure.

Figure 12. The Canny operator results. The first and third rows show the degraded images and the images enhanced by our algorithm, respectively, while the second and fourth rows show the corresponding Canny detection outputs. The red boxes in the fourth row highlight the more comprehensive image information acquired through our algorithm's processing.

Table 1. The evaluation results for CIEDE2000; the black bold font represents the best data.


Table 2. The quantitative comparison using the UCIQE metric; the black bold font represents the best data.

Table 3. The quantitative comparison using the UIQM metric; the black bold font represents the best data.

Table 6. The ablation experiment results for different variants of DRGAN; w/o refers to "without". The black bold font represents the best data.