An Overview of Underwater Vision Enhancement: From Traditional Methods to Recent Deep Learning

Abstract: Underwater video images, as the primary carriers of underwater information, play a vital role in human exploration and development of the ocean. Because of the optical characteristics of water bodies, underwater video images generally suffer from color cast and poor clarity, and their quality is severely degraded. Degraded images adversely affect the visual tasks of underwater vehicles, such as recognition and detection. Therefore, it is vital to obtain high-quality underwater video images. Firstly, this paper analyzes the imaging principle of underwater images and the reasons for their decline in quality and briefly classifies various existing methods. Secondly, it focuses on the currently popular deep learning technology in underwater image enhancement, and underwater video enhancement technologies are also discussed. It also introduces some standard underwater data sets, common video image evaluation indexes, and underwater image-specific indexes. Finally, this paper discusses possible future developments in this area.


Introduction
The ocean covers 71% of the Earth's surface, with a total area of 360 million square kilometers, and contains rich resources. Exploration and development of the ocean have been long-term concerns of human development. With the increasing scarcity of resources, it has become an inevitable choice to strengthen the exploration and development of the ocean [1]. However, due to the harsh and complex underwater environment, it is too dangerous to explore and develop it manually. Therefore, it is safer and more efficient to adopt autonomous underwater vehicles (AUVs) to carry out ocean exploration and development. In addition, AUVs are also widely used in lakes, rivers, and other water areas.
Visual information, which plays an essential role in detecting and perceiving the environment, is easy for underwater vehicles to obtain. However, due to the many uncertainties in the aquatic environment and the absorption and scattering of light by water, the quality of directly captured underwater images can degrade significantly. Large amounts of dissolved substances, particulate matter, and other inhomogeneous media in the water cause less light to enter the camera than in the natural environment. According to the Beer-Lambert-Bouguer law, light attenuates exponentially with the distance it travels through the medium. Therefore, the attenuation model of light in the process of underwater propagation is expressed as

$$E(r) = E(0)\,e^{-(a+b)r} \tag{1}$$

In Equation (1), $E$ is the illumination of light (with $E(0)$ its value at $r = 0$), $r$ is the distance, $a$ is the absorption coefficient of the water body, and $b$ is the scattering coefficient of the water body. The sum of $a$ and $b$ is equivalent to the total attenuation coefficient of the medium.
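As a quick numerical illustration of the attenuation model, the following sketch evaluates the standard exponential form E(r) = E(0) exp(-(a + b) r) for a few path lengths. The function names and the per-channel coefficients in `ASSUMED_COEFFS` are hypothetical placeholders chosen only to show the trend (stronger attenuation for red than for blue); they are not measurements reported in this paper.

```python
import math

def attenuated_illumination(e0: float, r: float, a: float, b: float) -> float:
    """Illumination after a path of length r: E(r) = E(0) * exp(-(a + b) * r)."""
    if r < 0 or a < 0 or b < 0:
        raise ValueError("distance and coefficients must be non-negative")
    return e0 * math.exp(-(a + b) * r)

# Hypothetical (absorption, scattering) coefficients in 1/m, for illustration only.
ASSUMED_COEFFS = {
    "red": (0.60, 0.05),
    "green": (0.07, 0.05),
    "blue": (0.02, 0.05),
}

def remaining_fraction(channel: str, r: float) -> float:
    """Fraction of the initial illumination left in a channel after r metres."""
    a, b = ASSUMED_COEFFS[channel]
    return attenuated_illumination(1.0, r, a, b)

if __name__ == "__main__":
    for r in (1.0, 5.0, 20.0):
        fractions = {ch: round(remaining_fraction(ch, r), 4) for ch in ASSUMED_COEFFS}
        print(f"r = {r:5.1f} m -> {fractions}")
```

Because only the sum a + b enters the exponent, absorption and scattering cannot be separated from a single attenuation measurement; this is one reason the restoration methods discussed later rely on priors or additional measurements.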
The process of underwater imaging is shown in Figure 1. As light travels through water, it is absorbed and scattered. Water bodies absorb light of different wavelengths to different degrees. As shown in Figure 1, red light attenuates the fastest and disappears at about 5 m underwater, whereas blue and green light attenuate slowly, and blue light disappears at about 60 m underwater. Scattering by suspended particles and other media causes light to change direction during transmission and to spread unevenly. The scattering process is influenced by the properties of the medium, the light, and polarization. McGlamery et al. [2] presented a computational model for underwater camera systems. The irradiance of non-scattered light, scattered light, and backscattered light can be calculated from the input geometry, the source properties, and the optical properties of the water. Finally, parameters such as contrast, transmittance, and signal-to-noise ratio can be obtained. Then, the classical Jaffe-McGlamery [3] underwater imaging model was proposed. It indicates that the total illuminance entering the camera is a linear superposition of the direct component, the forward scatter component, and the backscattered component:

$$E_T = E_d + E_f + E_b \tag{2}$$

In the equation, $E_T$ is the total illuminance, and $E_d$, $E_f$, and $E_b$ represent the components of direct irradiation, forward scattering, and backscattering, respectively. The direct irradiation component is the light reflected from the surface of the object directly into the receiver. The forward scattering component is the light reflected by the target object that is deflected at small angles by suspended particles on its way to the receiver. Backscattering refers to illumination light that reaches the receiver after being scattered by the water body. In general, the forward scattering of light attenuates more energy than the backscattering of light.
Due to the absorption and scattering of incident light by water bodies, the video images collected underwater generally appear blue-green and have an apparent fog-like effect. In addition, blur, low contrast, color distortion, more noise, unclear details, and limited visual range are the typical problems that degrade the quality of underwater video images [4]. Figure 2 shows some low-quality underwater images. There is an obvious color cast in images a and b, whose overall tone is green. The problem with image c is low contrast. Image d shows the haze commonly seen in underwater images. Low-quality video images are poorly matched to human visual perception. They also affect subsequent computer vision tasks, such as video image segmentation [5,6], target detection [7], 3D reconstruction [8,9], and other visual processing tasks. In practical applications, low-quality underwater video images pose significant challenges to underwater archaeology, biological research, acquisition, and other projects. How to use existing technology to obtain high-quality underwater video images is therefore a very important question. Improving the underwater imaging environment and optimizing the acquisition equipment would strengthen video image acquisition. Although these actions have some effect, the implementation cost is too high. In contrast, digital image processing on computer equipment can produce high-quality images more conveniently and quickly.
Underwater vision enhancement uses computer technology to process degraded underwater images and convert original low-quality images into high-quality images. Vision enhancement technology addresses the color cast, low contrast, and haze of original underwater video images. Enhanced video images improve visual perception and are beneficial for subsequent visual tasks. Therefore, underwater video image enhancement technology has important scientific significance and application value.
The literature for this article was retrieved from the Google Scholar and CNKI databases using keywords such as 'underwater image enhancement', 'underwater image processing', and 'underwater video enhancement'. A total of 106 relevant articles were selected, including nine reviews; the rest describe specific algorithms, which were analyzed and summarized. In addition, some commonly used underwater image data sets and evaluation indicators are summarized, involving a total of 28 references. In this paper, the existing underwater image enhancement techniques are classified and summarized, as shown in Figure 3. The current algorithms are mainly divided into traditional and deep learning-based methods. Traditional methods include model-based and non-model methods. Non-model enhancement methods, such as the histogram algorithm, directly enhance the visual effect through pixel changes without considering the imaging principle. Model-based enhancement is also known as image restoration. According to the imaging model, the relationship between the clear image, the blurred image, and the transmission map is estimated, and the clear image is derived, for example through the dark channel prior (DCP) algorithm [10]. With the rapid development of deep learning technology and its excellent performance in computer vision, underwater image enhancement technology based on deep learning is also developing rapidly. The methods based on deep learning can be divided into those based on convolutional neural networks (CNN) [11] and those based on generative adversarial networks (GAN) [12]. Most existing underwater video enhancement techniques are extensions of underwater single-image enhancement techniques to the video domain. Since underwater video enhancement technology is not yet fully mature, this paper does not classify it for the time being.
We introduce underwater visual enhancement technologies (for both video and images), their development, and their current status to promote further exploration in this field. Prior to this paper, there have been many excellent review articles focused on underwater image enhancement. As time goes on, these reviews need to be updated with new methods, especially algorithms based on deep learning. In addition, the urgent need for underwater video enhancement technology also deserves attention in a review article. According to the function of the algorithm, the article [13] divides the algorithms into underwater image dehazing and underwater image color evaluation, surveys the intelligence algorithms in underwater image dehazing and restoration, demonstrates the performance of different methods, and summarizes the applications of underwater image processing. However, there is no obvious distinction between algorithms based on different principles, and the overview of deep learning-based algorithms is not comprehensive. The article [14] selects representative methods for discussion, classifies the approaches into two categories, image restoration (physical-based model) and image enhancement (nonphysical-based model), and compares and analyzes these methods from both qualitative and quantitative perspectives. Although some deep learning algorithms are introduced, the popular generative adversarial network-based approach is missing. Similar to the article [14], the article [15] reviews the image enhancement and restoration methods that tackle typical underwater image impairments, including some extreme degradations and distortions. Moreover, a large number of experiments were conducted to compare and evaluate different algorithms, using subjective and objective analysis. Although the enhancement algorithms based on CNN are classified, the popular generative adversarial network-based algorithms are not included. The article [16] reviews relatively mature and representative underwater image processing models, which are classified into seven categories: enhancement, fog removal, noise reduction, segmentation, salient object detection, color constancy, and restoration. This helps readers understand the whole field of underwater image processing, but its coverage of algorithms for the specific task of underwater image enhancement is not comprehensive enough. In the article [17], the authors categorize, analyze, and compare underwater image filtering methods for restoration and enhancement and discuss the merits and limitations of these methods and of the evaluation measures used for their validation. That paper presents a number of tables to compare different algorithms, but the referenced algorithms are not described in detail. In addition, the causes of underwater image degradation, the data sets, and the evaluation indexes are summarized in the above papers.
On the basis of existing reviews, we update and supplement the latest developments in underwater visual enhancement technology and divide underwater image enhancement methods into traditional methods and deep learning-based methods. The algorithms are then classified according to their deep neural network structures and whether physical models are used. At the same time, similar to the above articles, we summarize the causes of underwater image degradation and the characteristics of low-quality images. In addition, we provide links to commonly used data sets and the calculation formulas of the evaluation indexes. We classify and analyze the existing underwater image evaluation indexes and summarize the differences between each type of evaluation index and the shortcomings of the existing indexes in underwater video and image quality evaluation.
However, the above articles, including [1,4,18], only focus on image enhancement algorithms, ignoring underwater video enhancement technology, which has higher practical application value. This paper makes the following contributions: (1) We focus on the introduction of specific algorithms to help readers better understand the characteristics and development of certain kinds of algorithms. (2) Because the application requirements of underwater video enhancement methods are more extensive, we summarize the algorithms suitable for underwater video enhancement and reveal the difficulties existing in the field of underwater video enhancement, together with some solutions.
The rest of this paper is organized as follows. Section 2 introduces the traditional underwater image enhancement algorithms, including the histogram-based, retinex-based, fusion-based, polarization-based, and dark channel prior-based methods. Section 3 introduces the underwater image enhancement algorithms based on deep learning, including CNN-based and GAN-based methods. In Section 4, some existing, still imperfect underwater video enhancement algorithms are introduced. In Section 5, some commonly used data sets and quality evaluation indexes in underwater visual enhancement are listed. Section 6 summarizes the problems of the existing algorithms and puts forward some future research directions.

Non-Physical Model Enhancement Methods
Due to the unique underwater optical environment, traditional image enhancement methods have some limitations when applied directly to underwater images, so many targeted algorithms have been proposed, including histogram-based, retinex-based, and image fusion-based algorithms.
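As background for the histogram-based family named above, plain histogram equalization maps each gray level through the normalized cumulative histogram so that the occupied levels spread over the full range. The sketch below is a minimal single-channel version operating on a flat list of integer gray levels; the function name and the list-based image representation are assumptions made for illustration, and the adaptive and underwater-specific variants surveyed in this section add further steps that are not shown.

```python
from typing import List, Sequence

def equalize_histogram(pixels: Sequence[int], levels: int = 256) -> List[int]:
    """Global histogram equalization of gray levels in the range [0, levels - 1]."""
    if not pixels:
        return []
    # Histogram of gray levels.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Cumulative distribution (unnormalized).
    cdf = []
    running = 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    total = len(pixels)
    if total == cdf_min:
        # Constant image: there is nothing to spread out.
        return list(pixels)
    scale = (levels - 1) / (total - cdf_min)
    # Map every pixel through the rescaled cumulative distribution.
    return [round((cdf[p] - cdf_min) * scale) for p in pixels]

if __name__ == "__main__":
    narrow = [100, 100, 101, 102, 103, 103]  # low-contrast input
    print(equalize_histogram(narrow))
```

The mapping is monotonic, so the ordering of gray levels is preserved while the narrow input range is stretched to the full output range.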
(1) Histogram-based methods
Image enhancement based on the histogram equalization (HE) algorithm [19] transforms the image histogram from a narrow unimodal distribution to a balanced one. As a result, the equalized image has roughly the same number of pixels in most gray levels. After that, the adaptive histogram equalization (AHE) algorithm [20] was derived to improve the local contrast of the image. The contrast limited adaptive histogram equalization (CLAHE) algorithm [21] improves the calculation speed. In the field of underwater imagery, Iqbal et al. [22] proposed an unsupervised color correction method (UCM) based on color correction and selective histogram stretching, which can effectively remove the bluish color cast and improve the weak red channel and the brightness. Ahmad et al. [23,24] proposed an adaptive histogram enhancement method using Rayleigh-stretched contrast-limited enhancement to improve image contrast, enhance details, and reduce over-enhancement, oversaturated areas, and noise introduction. However, when the color percentage of the image is low, the color image will be distorted. They then proposed a Recursive Adaptive Histogram Modification (RAHIM) algorithm to modify the image color in the HSV color space and improve the contrast of the background region, at the cost of higher algorithmic complexity. Li et al. [25] proposed a histogram distribution prior algorithm based on underwater image defogging, which effectively improves contrast and brightness and is simple and fast. The disadvantage is that the enhancement effect is not obvious when the image is dark. Li et al. [26] proposed a hybrid framework for underwater image enhancement, which combines an improved underwater white balance algorithm with histogram stretching. By establishing a variational contrast and saturation enhancement model, contrast and saturation are improved and the blur caused by scattering is eliminated, yielding better color correction, haze removal, and clearer details. The histogram-based underwater image enhancement methods are shown in Table 1.
(2) Retinex-based methods
Retinex theory, based on color constancy, obtains the true appearance of the scene by eliminating the influence of the illumination component on the color of the object and removing uneven illumination. In [27], Plutino et al. gave a very detailed review of the automatic color equalization (ACE) algorithm, including its application to underwater images. Jobson et al. [28,29] proposed a multiscale retinex (MSR) algorithm for image and color enhancement. Joshi et al. [30] applied retinex theory to underwater images to enhance degraded images. The visual effects of underwater images are improved, but the enhancement is limited. Fu et al. [31] proposed a variational framework based on retinex, using an alternating-direction optimization strategy to solve for reflectance and illumination and adding a color correction step to address underexposure and blurring. However, iterative optimization results in higher algorithm complexity. Bianco et al. [32] presented the first proposal for color correction of underwater images using the lαβ color space. The chromatic components are changed, moving their distributions around the white point (white balancing), and histogram cutoff and stretching of the luminance component are performed to improve image contrast. Zhang et al. [33] proposed an underwater image enhancement algorithm based on extended multiscale retinex, which combines bilateral and trilateral filtering to suppress the halo phenomenon. The disadvantages are that the contrast enhancement is not obvious and the trilateral filtering is time-consuming. Mercado et al. [34] proposed deep-sea dark image enhancement based on multiscale retinex with color restoration (MSRCR) and reversed the color loss, which overcame the problem of uneven illumination. The illumination intensity of the enhanced image tended to peak in the central intensity region. Li et al. [35] combined the MSRCR algorithm with a correction algorithm based on histogram quantization of each color channel. Zhang et al. [36] proposed a single-image defogging method based on multi-channel convolution (MC) with MSRCR. It can be applied to underwater scenes to enhance the global contrast and detail information of the image, reduce noise, and eliminate the effect of illumination on the color of the defogged image. However, overexposure may also occur. Tang et al. [37] proposed an underwater video image enhancement method called IMSRCP. First, the image is pre-corrected to even out the pixel distribution and reduce the dominant color. The classic multiscale retinex with intensity channels is then applied to the pre-corrected images to further improve contrast and color. Because of its many steps, the real-time performance of the algorithm is not high. Hu et al. [38] proposed an underwater image enhancement optimization (MSR-PO) algorithm, which uses a no-reference image quality assessment (NR-IQA) index as the optimization objective. The gravitational search algorithm (GSA) is used to optimize the underwater image enhancement algorithm based on MSR and the NIQE index. The experimental results show that this algorithm can adapt to environmental changes. However, using the GSA algorithm to optimize parameters consumes more computing resources.
Tang et al. [39] proposed an underwater image enhancement algorithm based on adaptive feedback and the retinex algorithm. Guided filtering was used to improve the algorithm, reducing the time required for underwater image processing. They also proposed a method for adaptive feedback stretching of saturation that maintains the structural information of the underwater image while improving its clarity, but the global contrast enhancement is not significant enough. Zhuang et al. [40] developed a Bayesian retinex algorithm for enhancing a single underwater image with multiorder gradient priors of reflectance and illumination. A maximum a posteriori formulation for underwater image enhancement is established on the color-corrected image by imposing multiorder gradient priors on reflectance and illumination. The method is effective in color correction, naturalness preservation, structure and detail enhancement, and artifact or noise suppression. However, the decomposition and alternating optimization of subproblems require too much time.
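The core retinex operation shared by the methods above can be sketched as follows: the illumination is approximated by a Gaussian-smoothed copy of the signal, and the reflectance estimate is the difference of logarithms, averaged over several scales. This is a deliberately simplified one-dimensional, single-channel sketch; the function names, the scale values, and the edge-replication padding are assumptions, and the color restoration, filtering, and optimization steps of the cited algorithms are not included.

```python
import math
from typing import List, Sequence

def gaussian_kernel(sigma: float) -> List[float]:
    """Normalized 1-D Gaussian kernel truncated at about three standard deviations."""
    radius = max(1, int(3 * sigma))
    weights = [math.exp(-(k * k) / (2 * sigma * sigma)) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def blur(signal: Sequence[float], sigma: float) -> List[float]:
    """Gaussian smoothing with edge replication; serves as the illumination estimate."""
    kernel = gaussian_kernel(sigma)
    radius = len(kernel) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the borders
            acc += w * signal[j]
        out.append(acc)
    return out

def multiscale_retinex(signal: Sequence[float],
                       sigmas: Sequence[float] = (1.0, 3.0, 9.0),
                       eps: float = 1.0) -> List[float]:
    """Equal-weight average over scales of log(I + eps) - log(G_sigma * I + eps)."""
    result = [0.0] * len(signal)
    for sigma in sigmas:
        illumination = blur(signal, sigma)
        for i, (value, light) in enumerate(zip(signal, illumination)):
            result[i] += (math.log(value + eps) - math.log(light + eps)) / len(sigmas)
    return result

if __name__ == "__main__":
    row = [20.0] * 6 + [80.0] * 6  # a dark region next to a bright region
    print([round(v, 3) for v in multiscale_retinex(row)])
```

With `eps` set to zero the output is unchanged when the whole signal is multiplied by a constant, which illustrates how retinex discounts a global change in illumination. The output is a log-domain value that still has to be rescaled to a displayable range, and that rescaling is one of the places where the parameters mentioned in this section enter.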
It can be seen that the direct application of retinex in underwater image enhancement is limited. The enhanced image suffers from either too low contrast or overexposure. The common practice is to use RGB combined with HSV and other color spaces to adjust the color and lighting. In addition, retinex can be combined with filtering, contrast stretching, color correction, and other pre-processing or post-processing methods. This can lead to obvious visual enhancement. An unavoidable problem is that the better models of this type often contain too many parameters, which need to be adjusted to suit different underwater environments. Table 2 contains the retinex algorithms applied to underwater images.
(3) Fusion-based methods
The image fusion algorithm fuses multiple images of the same scene so that their complementary information yields richer and more accurate image information after enhancement. Ancuti et al. [41] first used image fusion to improve underwater image quality. In this algorithm, white balance and histogram equalization are used to enhance degraded underwater images. Then, the fusion coefficients are defined according to the characteristics of the underwater images, and enhanced images are obtained by multiscale fusion. It reduces noise, improves global contrast, enhances edges and details, and is suitable for underwater video enhancement. The method showed good dehazing performance but still struggled with artificial lighting. On this basis, Ancuti et al. [42,43] continued to optimize the fusion algorithm, making full use of the complementary information between multiple images. The acquisition of the fused images and the definition of the weight information are optimized, and an enhancement method is used to improve exposure and preserve image edges, but selective compensation cannot be implemented. Pan et al. [44] obtained dehazed and color-corrected versions of the original image through dehaze-net and white balance and then used a Laplacian pyramid fusion strategy for fusion and a mixed wavelet for denoising and edge enhancement, but the method does not noticeably improve image contrast. Chang et al. [45] proposed an adaptive fusion algorithm for underwater image restoration. Based on knowledge of optical characteristics and image processing, the background light and the transmission maps are extracted, and adaptive weighted fusion is performed according to their respective saliency maps. The algorithm can effectively restore sharpness and natural color in the foreground of the scene while maintaining a certain degree of blur in the background, but the contrast is still lacking. Gao et al. [46] proposed a method based on local contrast correction (LCC) and multiscale fusion to resolve the low contrast and color distortion of underwater images. The local contrast-corrected images are fused with sharpened images by the multiscale fusion method. The results show that this method can be applied to degraded underwater images from different environments, effectively addressing the color distortion, low contrast, and indistinct details of underwater images. However, for low-resolution images that contain mosaic artifacts, the unnatural block mosaics are amplified during detail enhancement. Song et al. [47] proposed a method based on multiscale fusion and global stretching of dual-model (MFGS). White balancing is used to eliminate the undesirable color deviation, and an updated saliency weight coefficient strategy combining contrast and spatial cues is presented to achieve high-quality fusion. At the same time, global stretching of all channels in the red, green, blue (RGB) model is applied to enhance the color contrast. The algorithm still has deficiencies in the color richness of the resulting images and in execution time.
The fusion method can effectively improve the quality of underwater images. However, these methods need to obtain multiple fusion images and fusion weights. How to adopt efficient strategies to obtain the most suitable fusion weights is the key to solving the problem. Table 3 shows some distinctive underwater image enhancement methods based on image fusion algorithms.
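To make the role of the fusion weights concrete, the sketch below fuses several enhanced versions of the same image with per-pixel weights that are normalized to sum to one. It is a single-scale simplification written for illustration: the cited methods use multiscale (pyramid) fusion and carefully designed weight maps, whereas `contrast_weight` here is a hypothetical stand-in that simply favors pixels far from the image mean.

```python
from typing import List, Sequence

def contrast_weight(image: Sequence[float], eps: float = 1e-3) -> List[float]:
    """Hypothetical weight map: distance of each pixel from the image mean."""
    mean = sum(image) / len(image)
    return [abs(v - mean) + eps for v in image]

def fuse(inputs: Sequence[Sequence[float]],
         weights: Sequence[Sequence[float]]) -> List[float]:
    """Per-pixel weighted average of the inputs, with weights normalized per pixel."""
    fused = []
    for px in range(len(inputs[0])):
        total = sum(w[px] for w in weights)
        blended = sum(w[px] * img[px] for img, w in zip(inputs, weights))
        fused.append(blended / total)
    return fused

if __name__ == "__main__":
    # Two enhanced versions of the same flattened image, e.g. one color-corrected
    # and one contrast-stretched (the values are made up for the example).
    color_corrected = [0.2, 0.4, 0.6]
    contrast_stretched = [0.1, 0.5, 0.9]
    weights = [contrast_weight(color_corrected), contrast_weight(contrast_stretched)]
    print(fuse([color_corrected, contrast_stretched], weights))
```

Because the weights are normalized at every pixel, each fused value lies between the smallest and largest input value at that pixel; the quality of the result therefore depends almost entirely on how the inputs and weight maps are chosen.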

Physical Model-Based Enhancement Algorithm
Unlike the non-physical model enhancement algorithms, algorithms based on a physical model analyze the imaging process and invert the imaging model to obtain a clear image, improving image quality from the imaging principle. This approach is also known as image restoration.
Underwater imaging models play a crucial role in physical model-based enhancement methods. The Jaffe-McGlamery underwater imaging model is a very widely used recovery model. In addition, Zhao et al. [48] found a correlation between the degraded original underwater image and the optical characteristics of the water body. According to this correlation, inherent optical properties are extracted from the background color of the original image, and a new physical model is established by inverting the degradation process. Zhang et al. [49] proposed a model that takes into account the wavelength-dependent attenuation of underwater light and the color projection of underwater images and optimizes the estimation of the global background light and the medium transmission of the RGB color channels. Akkaynak et al. [50] improved the classic Jaffe-McGlamery model, used the actual measured depth of the scene to make a spatial estimation of the attenuation coefficient and the restored image, and proposed a new underwater imaging model.
After the underwater imaging model is established, the unknown parameters in the imaging model are obtained by using prior knowledge and other methods, and the undegraded image in the model is then solved for. The mainstream methods include image restoration based on light polarization, image restoration based on prior information, and integral imaging-based methods.
(1) Polarization-based methods
Underwater image restoration based on the principle of polarization imaging uses the polarization characteristics of scattered light to separate scene light from scattered light, estimate the intensity of the scattered light and the transmission coefficient, and thereby enhance the image. Schechner et al. [51] used the polarization effect of light scattering in the water to restore the visibility, scene contrast, and color of underwater images. However, for images with very strong scattering, blur will appear after enhancement. Based on independent component analysis, Namer et al. [52] estimated the degree of polarization and the intensity of the background light from the polarization image. They then calculated the depth map of the scene to restore the hazy image. Chen et al. [53], targeting non-uniform illumination, segmented the underwater image according to whether a region is artificially illuminated, compensated the artificially illuminated regions, and eliminated the influence of artificial illumination on the underwater image. However, overexposure may occur. Han et al. [54] considered the impact of backscattering in the imaging process, mitigated the scattering effect by changing the light source, obtained two images under orthogonal polarization, and proposed a point spread estimation method based on light polarization. However, it has not been verified in a real environment. Ferreira et al. [55] estimated the polarization parameters through particle swarm optimization and used a no-reference quality measure as the cost function for restoration, achieving better visual quality and better adaptability. However, the parameter optimization process increases the time complexity of the algorithm.
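The separation of scene light and scattered light described above can be illustrated with the commonly used two-image formulation: from images taken at the polarizer orientations of maximum and minimum backscatter, the backscatter is estimated from their difference and then removed. The sketch below works on a single pixel and assumes that the scene signal is unpolarized and that the degree of polarization `p` and the backscatter at infinite distance `b_inf` are already known (in practice they are typically estimated from an object-free background region). The `simulate` helper is a hypothetical forward model included only to check the inversion.

```python
from typing import Tuple

def simulate(scene: float, transmission: float, p: float, b_inf: float) -> Tuple[float, float]:
    """Hypothetical forward model: returns (i_max, i_min) for one pixel."""
    signal = scene * transmission                 # attenuated, unpolarized scene light
    backscatter = b_inf * (1.0 - transmission)    # partially polarized veiling light
    i_max = signal / 2 + backscatter * (1 + p) / 2
    i_min = signal / 2 + backscatter * (1 - p) / 2
    return i_max, i_min

def recover_pixel(i_max: float, i_min: float, p: float, b_inf: float,
                  t_min: float = 0.05) -> float:
    """Estimate the scene radiance from two orthogonally polarized measurements."""
    i_total = i_max + i_min
    backscatter = (i_max - i_min) / p                          # B = (I_max - I_min) / p
    transmission = max(1.0 - backscatter / b_inf, t_min)       # t = 1 - B / B_inf
    return (i_total - backscatter) / transmission              # L = (I_total - B) / t

if __name__ == "__main__":
    i_max, i_min = simulate(scene=0.8, transmission=0.5, p=0.6, b_inf=1.0)
    print(recover_pixel(i_max, i_min, p=0.6, b_inf=1.0))  # close to the scene value 0.8
```

The lower bound `t_min` is an added safeguard against division by very small transmission values, which would otherwise amplify noise in distant regions.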
The restoration method based on polarization does not fully consider the absorption of light in the underwater scene or the noise contained in the image, which affects the restoration of underwater images. Moreover, this method requires multiple images of the same scene taken at different polarization angles as prior knowledge, limiting its practical application scope. Table 4 shows the underwater image restoration algorithms based on polarization. Due to the limitations of the polarization-based algorithms, only some typical models are listed.
(2) Dark channel prior-based methods
He et al. [10] proposed the dark channel prior (DCP) algorithm. Statistics show that in most local regions of a fog-free image, at least one color channel contains pixels with a very low gray value; this is called the dark channel. The dark channel prior is used to solve for the transmission map and the atmospheric light value, and the atmospheric scattering model is used to restore the image. Liu et al. [56] directly used DCP for underwater image enhancement, but from the perspective of subjective vision there was no obvious enhancement effect, and distortion even appeared. Yang et al. [57] proposed a fast underwater image restoration method based on DCP, using median filtering instead of image matting to estimate the depth of field of the image, and introduced a color correction method to improve image contrast, but underwater images with a color cast or low brightness cannot be restored. Chiang et al. [58] used wavelength compensation and image dehazing (WCID) to restore underwater images, compensating for the different attenuation of the three color channels, correcting the image blur caused by artificial light sources, and improving image quality. However, the background might be overly bright. Drews et al. [59,60] proposed the underwater dark channel prior (UDCP) method, which only considers the blue and green channels, and obtained a more accurate transmission map than the DCP algorithm, thus improving the restoration effect. However, its reliability and robustness are insufficient due to the limitations of its assumptions. Galdran et al. [61] proposed automatic red-channel underwater image restoration (ARUIR) based on a red-channel prior, which improved DCP by minimizing over the inverted red channel and the blue and green channels, and introduced saturation information to adjust for the influence of artificial light sources. This algorithm requires more additional information. Li et al. [62] proposed a method based on red-channel correction and blue-green channel defogging, using the gray world algorithm to perform color correction on the red channel and using an adaptive exposure map to handle over-bright and over-dark areas, thus improving visibility and contrast. However, the restored image is of poor quality under non-uniform lighting. Meng et al. [63], based on color correction and image sharpening, applied the color balance and volume methods to underwater images. When the red channel value is close to that of the blue channel, the color balance method is used to restore the image. Otherwise, the DCP-based method is used for recovery. A sharpening method based on maximum a posteriori (MAP) estimation reduces blur, improves visibility, and better preserves foreground textures, but too many parameters are introduced.
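The dark channel computation and the restoration step that the methods above build on can be sketched in a few functions. The version below operates on a tiny image stored as nested lists of RGB tuples in [0, 1]; the function names, the `omega` and `t_min` defaults, and the choice to pass the background light in as a known value are assumptions for illustration, and the transmission refinement (matting or filtering) used by the cited algorithms is omitted. Restricting `channels` to green and blue gives a variant in the spirit of the UDCP idea of ignoring the strongly attenuated red channel.

```python
from typing import List, Sequence, Tuple

Pixel = Tuple[float, float, float]
Image = List[List[Pixel]]

def dark_channel(image: Sequence[Sequence[Pixel]], radius: int = 1,
                 channels: Sequence[int] = (0, 1, 2)) -> List[List[float]]:
    """Minimum over the chosen channels and a (2 * radius + 1)^2 neighbourhood."""
    h, w = len(image), len(image[0])
    per_pixel = [[min(px[c] for c in channels) for px in row] for row in image]
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            ys = range(max(0, y - radius), min(h, y + radius + 1))
            xs = range(max(0, x - radius), min(w, x + radius + 1))
            row.append(min(per_pixel[j][i] for j in ys for i in xs))
        out.append(row)
    return out

def restore(image: Sequence[Sequence[Pixel]], background: Pixel,
            omega: float = 0.95, t_min: float = 0.1, radius: int = 1,
            channels: Sequence[int] = (0, 1, 2)) -> Image:
    """t = 1 - omega * dark(I / A), then J = (I - A) / max(t, t_min) + A."""
    normalized = [[tuple(px[c] / background[c] for c in range(3)) for px in row]
                  for row in image]
    dark = dark_channel(normalized, radius, channels)
    restored = []
    for row, dark_row in zip(image, dark):
        out_row = []
        for px, d in zip(row, dark_row):
            t = max(1.0 - omega * d, t_min)  # clamp to avoid amplifying noise
            out_row.append(tuple((px[c] - background[c]) / t + background[c]
                                 for c in range(3)))
        restored.append(out_row)
    return restored

if __name__ == "__main__":
    img = [[(0.2, 0.5, 0.6), (0.0, 0.4, 0.7)],
           [(0.3, 0.6, 0.8), (0.1, 0.5, 0.9)]]
    print(dark_channel(img, radius=0))                    # all three channels
    print(dark_channel(img, radius=0, channels=(1, 2)))   # green and blue only
```

In this toy image the red channel is the darkest one at every pixel, so including it drives the dark channel toward zero regardless of haze; this mirrors the motivation for the underwater-specific variants, where red is low because of absorption rather than because the scene is haze-free.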
The DCP algorithm has excellent defogging performance. When applied to underwater images, the dark channel is affected because the water absorbs too much red light. Therefore, underwater DCP algorithms are usually adapted to this characteristic. Table 5 lists the underwater-specific DCP algorithms.
(3) Integral imaging-based methods
Integral imaging technology is based on a multi-lens stereo vision system, which uses a lens array or camera array to quickly obtain information from different perspectives of the target and combines all element images (each recording information from a different perspective of a three-dimensional object) into an element image array (EIA). According to the principle of optical path reversibility, a lens array with the same recording parameters can be placed in front of the EIA to reconverge the rays emitted by the EIA and realize the optical reproduction of a 3D scene. Integral imaging is considered one of the interesting solutions for 3D visualization in a scattering environment, so it can be applied to underwater image degradation caused by water scattering, especially underwater 3D image visualization.
Cho et al. [64] used integral imaging for 3D reconstruction of objects in turbid water. The multi-perspective images, which are degraded by light scattering, are processed with statistical image processing and computational 3D reconstruction algorithms to remedy the effects of scattering and to visualize the 3D scene. Lee et al. [65] proposed a visualization method for 3D objects in a scattering medium. The method applies spectral analysis to computational integral imaging reconstruction and introduces a signal model with a visibility parameter to analyze the scattering signal and improve the visual quality of 3D images. The reconstructed image shows better color, edges, and detail. However, the orange object was not reconstructed well in the experiment. Satoruet et al. [66] applied a descattering method to 3D integral imaging. A scattering mitigation process is applied to 3D reconstruction to reduce the effect of scattering. A Bayesian scattering mitigation process is implemented by computing maximum a posteriori estimates of the mean and variance of the turbid media containing object information. The proposed method achieves a higher structural similarity index measure (SSIM). Neumann et al. [67] proposed a fast enhancement method for color correction of underwater images, which is based on the gray-world assumption applied in the Ruderman-opponent color space and can cope with non-uniformly illuminated scenes. The method exploits integral images to perform fast color correction and takes locally changing luminance and chrominance into account. However, the details of the reconstruction were missing. Bar et al. [68] proposed an approach to enhance image quality when recovering objects hidden in turbid liquid by fusing single-shot multi-view circularly polarized speckle images collected by a lens array with a deconvolution algorithm based on multiple medium sub-PSF viewpoints. The quality of the reconstructed images is evaluated with the image quality index, and the feasibility of imaging in turbid media is verified. Li et al. [69] proposed a thresholded single-photon imaging and detection scheme to extract photon signals from the noisy underwater environment. This method reconstructs images obtained in a high-loss underwater environment by using photon-limited computational algorithms and, in principle, improves the PSNR in the high-noise regime.
Integral imaging technology can integrate signals from multiple images and is remarkably effective against the serious scattering of turbid water. Underwater image enhancement based on integral imaging can reconstruct objects obscured by muddy water, enhance detail features, and restore brightness and contrast. However, this method depends on setting up the imaging system, and the implementation cost is high. For images obtained from conventional water bodies, more convenient single-image enhancement techniques are usually used.

Deep Learning-Based Enhancement Method
In recent years, deep learning has been widely applied in image processing due to its powerful classification performance and feature learning ability, and underwater image enhancement algorithms based on deep learning have also developed rapidly. Based on the differences in the deep learning network models, they can be divided into convolutional neural network (CNN) and generative adversarial network (GAN) methods.

Convolutional Neural Network Methods
LeCun et al. [11] first proposed the convolutional neural network structure LeNet. The convolutional neural network is a kind of deep feedforward artificial neural network. It is composed of multiple convolutional layers that can effectively extract different feature expressions, from low-level details to high-level semantics, and is widely used in computer vision. CNN-based underwater image enhancement algorithms can be divided into non-physical and combined physical methods, according to whether the algorithm uses a physical imaging model for restoration.
(1) Combined physical model methods
Cai et al. [70] first proposed the deep neural network DehazeNet, which uses a convolutional neural network to estimate the medium transmission image from the input image and restores the image with an atmospheric scattering model to achieve end-to-end single image dehazing. Its performance when applied directly to underwater images is not ideal. Shin et al. [71] proposed a general convolution structure that learns the transmission image and background light of the underwater image at the same time to realize image restoration. The results show promising dehazing ability, but color overcompensation appears. Ding et al. [72] used an adaptive color correction algorithm to compensate for color distortion. A CNN was used to estimate the depth map of the color-corrected image, which was directly converted into a transmission image for restoration. The robustness, adaptability, and real-time performance of the algorithm need to be improved. Wang et al. [73] proposed a CNN-based underwater image enhancement network (UIE-NET), which can estimate color-corrected and transmission images from input underwater images. Its main learning strategy is to train the color-correction and "defogging" processes simultaneously to better extract the inherent characteristics of local blocks. In the training process, a pixel interference strategy is used to suppress the small texture interference information contained in the regional block, which improves the convergence speed and accuracy of the learning process. This method improves the brightness and contrast of the underwater image. However, red overcompensation occurs. Since it is difficult to obtain a corresponding reference image for underwater images, Barbosa et al. [74] used a group of image quality measures to guide the CNN-based restoration process. By processing simulated data to recover the image, the difficulty of measuring real scene data is avoided. Good results have been achieved on the UCIQE measure, but other indicators need to be improved. Hou et al. [75] proposed an underwater-residuals CNN. The underwater image enhancement task was modeled as learning the transmission map and scene residuals simultaneously and estimating the global background light from the blue and green channels. The model includes a data-driven residual structure for transmission image estimation and a knowledge-driven scene residual calculation method for underwater lighting balance, and color correction is performed on the image to obtain the restored underwater image. The disadvantage is the need to use color correction algorithms for reprocessing. Cao et al. [76] constructed a deep neural network to directly estimate the background light and transmission image from the input image to further improve the restoration effect. They then used the atmospheric model to realize image restoration with better contrast and brighter color. Strictly speaking, this is not an end-to-end enhancement network, and there are additional parameters. Wang et al. [77] proposed a parallel convolutional neural network for underwater image processing, consisting of two parallel branches: a transmission estimation network and a global ambient light estimation network. The network uses cross-layer connections and multiscale estimation to prevent halo artifacts and maintain edge features. However, the contrast enhancement is not significant enough. Li et al. [78] presented an underwater image enhancement network via medium transmission-guided multi-color space embedding, called Ucolor. The network has a multi-color space encoder and a medium transmission-guided decoder, which can effectively improve the visual quality of underwater images by exploiting multiple color space embedding and the advantages of both physical model-based and learning-based methods. However, it fails to produce a visually compelling result when processing an underwater image with limited lighting. The CNN-based underwater image enhancement methods combined with a physical model are shown in Table 7.

(2) Non-physical model methods
In the non-physical model, the original underwater image is fed into the network model, relying on the CNN's powerful learning ability. The enhanced underwater image is directly output after convolution, pooling, deconvolution, and other operations. The process is shown in Figure 5. This approach removes the constraints of model assumptions or prior conditions and directly learns the mapping between the original underwater image and the clear underwater image. Perez et al. [79] first formed a paired dataset of degraded and clear underwater images and used deep learning to learn the mapping between the two. An underwater image enhancement model based on the convolutional neural network was built to complete underwater image enhancement. Sun et al. [80] proposed a pixel-to-pixel deep learning model for underwater image enhancement. This model adopts the encoder-decoder framework, using convolutional layers as the encoder to denoise the underwater image and deconvolutional layers as the decoder to enhance the details of the pixelated underwater image. This method has a remarkable effect on underwater image denoising. The algorithm performs well on turbid images, but it is not effective in color enhancement. Li et al. [81] proposed a gated fusion network, WATER-NET. White balance, histogram equalization, and gamma correction algorithms are used to enhance the underwater image, and the final image is obtained by combining the results with the confidence maps of the different enhancement algorithms. Although its quantitative results are not good, it has good generalization performance and room for improvement as a reference model.
Li et al. [82] proposed an underwater image enhancement CNN model based on the underwater scene prior, called UWCNN. The model does not need to estimate the parameters of the underwater imaging model but combines the physical model of the image and the optical characteristics of the underwater scene to synthesize image degradation datasets covering different types and degrees of degradation. The corresponding training data are used to train the network, and multiple losses are optimized jointly to reconstruct clear underwater images while retaining the original structure and texture. However, UWCNN cannot handle all degradation types with a single model. Naik et al. [83] proposed a shallow neural network (Shallow-UWnet) composed of a fully connected convolutional network and three densely connected convolutional blocks in series. By using convolutional blocks and skip connections, the network prevents overfitting and has better generalization performance. The real-time performance of the algorithm is good, but the enhancement effect needs to be improved.
Han et al. [84] proposed a deep supervised residual dense network (DS_RD_Net). DS_RD_Net first uses residual dense blocks to extract features and enhance feature utilization. Then, it adds residual path blocks between the encoder and decoder to reduce the semantic differences between low-level and high-level features. Finally, it employs a deep supervision mechanism to guide network training and improve gradient propagation. The proposed method can fully retain the local details of the image while performing color restoration and defogging. Because the method favors preserving detailed features, the color index of the result is not superior. Yang et al. [85] proposed a trainable end-to-end neural model consisting of two parts. The first is a non-parametric layer for preliminary color correction. The second consists of parametric layers for self-adaptive refinement, namely the channel-wise linear shift. The proposed method can obtain high-quality enhancement results with better details, contrast, and colorfulness, but it does not work well in overall style and texture retention. Wang et al. [86] used the HSV color space for underwater image enhancement based on deep learning. They proposed a convolutional neural network using two color spaces (UICE2-Net) to efficiently and effectively integrate the RGB and HSV color spaces in one CNN. The RGB pixel-level block implements fundamental operations such as denoising and removing color cast, and the HSV global-adjust block globally adjusts underwater image luminance, color, and saturation by adopting a novel neural curve layer. The disadvantage is that extracting features from different color spaces increases the time complexity of the algorithm. The underwater image enhancement methods of non-physical model CNNs are shown in Table 8.

Generative Adversarial Network-Based Methods
The generative adversarial network (GAN) was proposed by Goodfellow et al. [12]. A GAN produces better output through the adversarial learning of a generator and a discriminator. Through learning, the generator produces an image as similar to the actual image as possible so that the discriminator cannot distinguish between true and false images. The discriminator is used to indicate whether the image is a composite or an actual image. As long as the discriminator is not fooled, the generator continues to learn. The process is shown in Figure 5. The input of the generator is a low-quality image, and the output is a generated image. The input of the discriminator network is the generated image and the actual sample, and the output is the probability that the generated image is real. The probability value is between 0 and 1. As an excellent generative model, GAN has a wide range of applications in image generation, image enhancement and restoration, and image style transfer.
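The adversarial game described above can be made concrete with a small NumPy sketch of the two objectives. It assumes the discriminator outputs probabilities in (0, 1) and uses the commonly adopted non-saturating generator loss. The function names and the `eps` stabilizer are illustrative assumptions and are not tied to any specific method in this section.

```python
import numpy as np

def discriminator_loss(d_real, d_fake, eps=1e-12):
    """Binary cross-entropy: push D(real) toward 1 and D(fake) toward 0."""
    d_real = np.asarray(d_real, dtype=float)
    d_fake = np.asarray(d_fake, dtype=float)
    return -np.mean(np.log(d_real + eps)) - np.mean(np.log(1.0 - d_fake + eps))

def generator_loss(d_fake, eps=1e-12):
    """Non-saturating generator loss: push D(fake) toward 1."""
    d_fake = np.asarray(d_fake, dtype=float)
    return -np.mean(np.log(d_fake + eps))
```

When the discriminator can no longer separate real from generated images, it outputs 0.5 for both, and the discriminator loss settles at 2·ln 2.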
The initial application of GAN in the underwater image field was to expand underwater image datasets. In view of the insufficient data caused by the difficulty in obtaining natural and effective underwater images, Chen et al. [87] first used GAN to generate many images of the underwater environment. Paired training sets were formed with clear images taken on land to train the model. Later, Anwar et al. [88] used indoor scenes to synthesize underwater images as the CNN training set and reconstructed clear underwater latent images using the established network. Yang et al. [89] also used generative adversarial networks to construct underwater image datasets. The difference is that this network model uses double discriminators to obtain global semantic information of underwater images and synthesize more realistic images. This method improves the overall clarity of underwater images, but the images still have fuzzy details and an unclear edge structure.

(1) CGAN-based methods
The original GAN automatically learns the distribution characteristics of data with random noise as input and outputs a random image following a certain distribution. The disadvantage is that the category attribute of the image cannot be controlled. By adding conditional information to the generator and discriminator of the original GAN, the conditional generative adversarial network (CGAN) [90] is obtained. The additional conditional information can be category tags or other ancillary information. Therefore, the image generation process of CGAN is controllable. The structure is shown in Figure 6. Li et al. [91] proposed the underwater image generative adversarial network (WaterGAN), which uses in-air images and depth maps to synthesize underwater images as an end-to-end network training dataset for color correction of a single underwater image. Then, a two-stage deep learning network is constructed using the original underwater image, actual in-air color image, and depth map to realize real-time color correction of a single underwater image. The performance of the algorithm depends on the training dataset. Guo et al. [92] proposed a new multiscale dense generative adversarial network for underwater image enhancement, denoted as UWGAN, which introduces residual multiscale dense blocks into the generator. Multiscale manipulation, dense cascading, and residual learning are used to improve performance, render more detail, and take full advantage of features, respectively. The discriminator adopts spectral normalization to stabilize its training. The real-time performance and adaptive ability of the algorithm are still lacking. Liu et al. [93] proposed a multiscale feature fusion network for underwater image color correction, denoted as MLFCGAN. This method realizes multiscale global and local feature fusion in the generator. The fusion of global and local features yields a more discriminative and effective feature expression and contributes to more effective and faster network learning. Thus, it has better performance in color correction and detail retention. Yang et al. [89] proposed an underwater image enhancement method based on GAN. In the generator, a multiscale structure is used to generate a clear underwater image. In the discriminator, a double discriminator is designed to obtain local and global semantic information. Therefore, the results generated by the constrained multiscale generator are real and natural, but the detail retention needs to be improved. Li et al. [94] proposed a simple and effective fusion adversarial network, which employs a fusion method to extract the features of the degraded underwater image. A multi-term objective function combining generator loss, fusion enhanced image loss, SSIM loss, and PSNR loss is leveraged to correct color casts effectively, and spectral normalization is utilized to improve image quality. The proposed method is superior in both qualitative and quantitative evaluations. The disadvantage is that the choice of parameters affects the generalization ability of the model. Liu et al. [95] offered an integrated approach, in which the revised underwater image formation model, i.e., the Akkaynak-Treibitz model, is embedded into the network. The embedded physical model guides network learning, and the generative adversarial network (GAN) is adopted to estimate the coefficients. This method takes full advantage of the merits of the two approaches so that they benefit each other, and it can effectively restore the color of underwater images with fine details and alleviate unwanted artifacts. However, the real-time performance of the algorithm needs to be tested. The underwater image enhancement methods based on CGAN are shown in Table 9.

(2) CycleGAN-based methods
The cycle-consistent adversarial network (CycleGAN) is an improvement on the traditional GAN network structure. A ring network consisting of two mirror-symmetric GAN generators and two corresponding discriminators is constructed. The structure is shown in Figure 7. CycleGAN trains two GAN networks. There are two generators, G and F, and two discriminators, D_X and D_Y. G and F learn the mappings from the X domain to the Y domain and from the Y domain to the X domain, respectively. So that the input image and the generated image remain correlated, it is required that F(G(x)) ≈ x and G(F(y)) ≈ y, and CycleGAN introduces a cycle-consistency loss function to enforce this. This structure removes the requirement for paired training data and is well suited to underwater images, which lack paired data. Fabbri et al. [96] proposed UGAN, a network based on a generative adversarial network (GAN), to improve the visual quality of underwater images. It first uses unpaired clear and degraded underwater images to train and then forms a training set with clear underwater images generated by CycleGAN and the corresponding degraded underwater images. Absolute error loss and gradient loss are added to restore the underwater image. Lu et al. [97] proposed a multiscale CycleGAN network for underwater image restoration that combines dark channel prior and CycleGAN. An adaptive image restoration process is established by using the dark channel prior to obtain the depth information of underwater images. The depth information is then input into the network to guide the multiscale calculation. Underwater image quality and detailed structural information are improved, and the method performs well in contrast enhancement and color correction. However, this model cannot produce a reliable image under non-uniform illumination.
Park et al. [98] added a pair of discriminators to CycleGAN. The model consists of two generators and four discriminators. Image enhancement is achieved while retaining the content of the input image. An adaptive weighting method is introduced to limit the loss of the two discriminators. Although the model is difficult to train, this mechanism makes full use of the advantages of each discriminator while suppressing their negative effects. Islam et al. [99] proposed a fast underwater enhancement model, FUnIE-GAN, which develops an objective function based on global content, image content, local texture, and style information to evaluate perceived image quality. Specifically, absolute error loss is used as the global loss, and a pre-trained VGG-19 network extracts high-level features to construct the content loss. The local consistency of texture and style is enforced through the adversarial training of the discriminator. In addition, different objective functions are designed for paired image training based on CGAN and unpaired image training based on CycleGAN. The algorithm performs very well in color restoration and sharpening, and the processing speed is fast. Experiments have also been carried out on underwater video enhancement. However, the contrast of the generated image is not very good. Hu et al. [100] proposed adding the natural image quality evaluation (NIQE) index to the GAN to give generated images higher contrast and make them more in line with the perception of the human eye, and at the same time to make generated images better than the ground-truth images provided by the existing dataset. Contrast and NIQE are good, but texture details are missing. Zhang et al. [101] proposed an end-to-end dual generative adversarial network (DuGAN) for underwater image enhancement. The images are segmented into clear and unclear parts, and two discriminators complete adversarial training on the different areas of the images with different training strategies. However, this method obtains reference images through a user-guided approach, which makes it difficult to train with new images. The underwater image enhancement methods based on CycleGAN are shown in Table 10.
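The cycle-consistency requirement F(G(x)) ≈ x and G(F(y)) ≈ y that underlies the CycleGAN-based methods can be written as an L1 penalty on the two round trips. The following is a minimal NumPy sketch. The generators are passed in as plain Python callables, which is an illustrative assumption; in practice they would be neural networks.

```python
import numpy as np

def cycle_consistency_loss(G, F, x_batch, y_batch):
    """L1 cycle loss: mean |F(G(x)) - x| + mean |G(F(y)) - y|."""
    x_batch = np.asarray(x_batch, dtype=float)
    y_batch = np.asarray(y_batch, dtype=float)
    forward_cycle = np.mean(np.abs(F(G(x_batch)) - x_batch))   # X -> Y -> X
    backward_cycle = np.mean(np.abs(G(F(y_batch)) - y_batch))  # Y -> X -> Y
    return forward_cycle + backward_cycle
```

The loss is zero exactly when the two mappings invert each other on the training samples, which is what allows training without paired degraded and clear images.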

Underwater Video Enhancement
With the development of underwater video acquisition and data communication technology, real-time underwater video transmission becomes possible. Underwater video with spatiotemporal information and motion characteristics has higher application prospects than underwater images in ocean development. Because of the optical properties, underwater video has some similar problems to underwater images, such as color bias, image blur, low contrast, uneven illumination, etc. At the same time, due to the influence of water flow on video acquisition equipment, the texture features and details of moving objects are weakened or disappear. These problems seriously affect the ability of the underwater video system to accurately collect scene and object features. Unlike atmospheric video enhancement technology, which tends to solve blur and jitter, underwater video enhancement focuses more on solving the harmful effects of the unique optical environment on color and visibility.
Compared with underwater image enhancement technology, underwater video enhancement is more complicated. Research in this direction has not yet reached a mature stage. Most of the existing underwater video enhancement methods are extensions of single image enhancement algorithms. When underwater image enhancement technology is directly applied to video, each frame is enhanced and the frames are then joined into a new video. Due to the differences in transmission images and background light between frames, the continuity of the enhanced video is not well maintained, and temporal artifacts and interframe flicker can occur.
Because of this defect, some scholars propose reducing the flicker phenomenon by accelerating the processing of each underwater image frame. IMSRCP, proposed by Tang et al. [37], is a fast MSR enhancement method applicable to underwater video. Considering that frequent large-scale convolution operations seriously affect the calculation speed, this algorithm extracts features from low-resolution images after subsampling. Subsampling and an IIR Gaussian filter are combined to form a fast filter, and 1/2 subsampling is performed on the image repeatedly until the filter size reaches a reasonable range. The IIR Gaussian filter is composed of forward and reverse recursion. The filter is applied along the vertical and horizontal directions of a two-dimensional image to complete the fast two-dimensional convolution operation. This strategy effectively relieves the strain that larger scales place on computing resources and has a significant advantage in speed, enabling it to be extended to underwater video enhancement.
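The fast-filter idea above (repeated 1/2 subsampling followed by a recursive filter run forward and backward along each image axis) can be sketched as follows. This is a simplified illustration, not the filter of [37]: it assumes a first-order recursive smoother as a stand-in for the IIR Gaussian, nearest-neighbor upsampling, and illustrative values for `alpha` and `levels`.

```python
import numpy as np

def recursive_smooth(x, alpha, axis):
    """First-order IIR smoothing: one forward and one backward pass along axis."""
    y = np.moveaxis(np.asarray(x, dtype=float), axis, 0).copy()
    for i in range(1, len(y)):            # forward (causal) recursion
        y[i] = alpha * y[i] + (1.0 - alpha) * y[i - 1]
    for i in range(len(y) - 2, -1, -1):   # backward (anti-causal) recursion
        y[i] = alpha * y[i] + (1.0 - alpha) * y[i + 1]
    return np.moveaxis(y, 0, axis)

def fast_blur(img, alpha=0.3, levels=2):
    """Blur at reduced resolution, then upsample back to the input size."""
    small = np.asarray(img, dtype=float)
    for _ in range(levels):
        small = small[::2, ::2]           # repeated 1/2 subsampling
    small = recursive_smooth(small, alpha, axis=0)  # vertical pass
    small = recursive_smooth(small, alpha, axis=1)  # horizontal pass
    factor = 2 ** levels
    up = np.repeat(np.repeat(small, factor, axis=0), factor, axis=1)
    return up[:img.shape[0], :img.shape[1]]
```

The cost of the recursive filter does not grow with the effective blur radius, and each subsampling step cuts the pixel count by a factor of four, which is why the combination suits the large surround scales used by MSR.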
UWCNN, proposed by Li et al. [82], adopts a lightweight modular structure to adapt to underwater video enhancement. Unlike the widely used DenseNet structure, the convolutional layers in the UWCNN network structure are not connected to other convolutional layers in the same block. Moreover, the network does not use any fully connected layers or batch normalization, making the network memory efficient and fast. Further, feeding the input image directly into each enhancement unit reduces the need for a deep network. The whole network comprises three enhancement units, each composed of three convolution layers. There is a single convolution layer at the end of the network. The total depth of the network is only 10 layers, which reduces the computational cost and makes the network easy to train for frame-by-frame enhancement of underwater video.
FUnIE-GAN, proposed by Islam et al. [99], also has excellent image enhancement speed. A simpler model is adopted in the generator of the GAN, which learns only 256 feature maps of size 8 × 8 with fewer parameters to realize fast inference. The discriminator bases its decision on patch-level information rather than on the image as a whole. This configuration is computationally more efficient because it requires fewer parameters. The entire network requires only 17 MB of memory. The computing speed reaches 25.4 frames per second on an embedded system (NVIDIA Jetson TX2), 148.5 frames per second on a graphics card (NVIDIA GTX 1080), and 7.9 frames per second on a CPU (Intel Core i3-6100U). It can meet the real-time requirements of underwater robots and has been shown experimentally to improve underwater target detection, saliency prediction, and human posture estimation.
In addition, some enhancement algorithms increase speed by reducing computational complexity. Lu et al. [102] proposed a prior estimation method based on attenuation differences between red channels to estimate the transmission, using a triangular filter to compensate the transmission, preserve edges, remove noise, and speed up the calculation. At the same time, in the color correction method, the summation operation is replaced by a convolution operation to reduce the computational complexity. Bicnao et al. [103] proposed a fast enhancement method for underwater images with nonuniform illumination. Based on the gray world hypothesis, color correction is carried out according to local changes of brightness and chromaticity using the summed-area table technique, which reduces the computational complexity and is suitable for video enhancement. Liu et al. [104] proposed a real-time multithreading underwater image enhancement system that uses an automatic multithreading method. The hardware's computing power is exploited by creating an optimal number of processing threads for both consistency and real-time performance. The framework improves computing efficiency by optimizing the computing strategy of the processor; it is independent of the enhancement algorithm used and improves real-time performance. These algorithms often sacrifice image quality to achieve faster processing. Commonly used techniques include fast filtering, optimized computing strategies, and compressed deep network models. How to strike a balance between video quality and computing speed is an urgent problem to be solved.
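The summed-area table technique mentioned above makes the local mean over any rectangular window cost four table lookups per pixel, independent of the window size. The following NumPy sketch applies it to a simplified local gray-world correction in RGB. The window radius, the per-channel gain formula, and the function names are illustrative assumptions and do not reproduce the color space or parameters of [103].

```python
import numpy as np

def summed_area_table(channel):
    """Table with a zero first row/column so window sums need no special cases."""
    sat = np.zeros((channel.shape[0] + 1, channel.shape[1] + 1))
    sat[1:, 1:] = channel.cumsum(axis=0).cumsum(axis=1)
    return sat

def local_mean(channel, radius):
    """Mean over a (2*radius+1)^2 window, clipped at the image borders."""
    h, w = channel.shape
    sat = summed_area_table(channel)
    r0 = np.clip(np.arange(h) - radius, 0, h)
    r1 = np.clip(np.arange(h) + radius + 1, 0, h)
    c0 = np.clip(np.arange(w) - radius, 0, w)
    c1 = np.clip(np.arange(w) + radius + 1, 0, w)
    total = sat[r1][:, c1] - sat[r0][:, c1] - sat[r1][:, c0] + sat[r0][:, c0]
    area = (r1 - r0)[:, None] * (c1 - c0)[None, :]
    return total / area

def local_gray_world(img, radius=8, eps=1e-6):
    """Scale each channel so its local mean matches the local gray level."""
    means = np.stack([local_mean(img[..., c], radius) for c in range(3)], axis=-1)
    gray = means.mean(axis=-1, keepdims=True)
    return np.clip(img * gray / (means + eps), 0.0, 1.0)
```

Because the table is built with two cumulative sums per channel, the whole correction is linear in the number of pixels, which is what makes it attractive for video.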
The other approach to enhancing underwater video is to exploit the temporal characteristics of the video together with the temporal relationship between frames. Simply enhancing video frames and joining them breaks the correlations between adjacent frames, so the color consistency of the entire video is not maintained. Furthermore, the time complexity is very high. In the image fusion algorithm proposed by Ancuti et al. [41], a temporal bilateral filtering strategy is applied to the white-balanced version of the video frame. A bilateral filter is a non-iterative edge-preserving filter that combines each pixel with the neighboring pixels in its window, preserving sharpness while improving the stability of smooth regions. In temporal bilateral filtering, time-sequence information is added. During window selection, temporally aligned pixels in adjacent frames are selected to achieve smoothing between frames and maintain temporal coherence.
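The temporal bilateral filtering idea can be sketched for a single target frame as follows. The sketch assumes that the frames are already spatially aligned (no motion compensation), that intensities lie in [0, 1], and that the window is purely temporal. The parameter names and default values are illustrative assumptions, not those of [41].

```python
import numpy as np

def temporal_bilateral(frames, t, radius=2, sigma_t=1.0, sigma_r=0.1):
    """Filter frame t using neighboring frames within +/- radius.

    Each neighbor pixel is weighted by its temporal distance (sigma_t) and by
    its intensity difference to the center pixel (sigma_r), so static regions
    are smoothed over time while abrupt changes are left untouched.
    """
    frames = np.asarray(frames, dtype=float)
    center = frames[t]
    acc = np.zeros_like(center)
    norm = np.zeros_like(center)
    for k in range(max(0, t - radius), min(len(frames), t + radius + 1)):
        w_time = np.exp(-((k - t) ** 2) / (2.0 * sigma_t ** 2))
        w_range = np.exp(-((frames[k] - center) ** 2) / (2.0 * sigma_r ** 2))
        acc += w_time * w_range * frames[k]
        norm += w_time * w_range
    return acc / norm  # norm >= 1 because the center frame has weight 1
```

Small frame-to-frame fluctuations (flicker) receive a high range weight and are averaged out, whereas a pixel that genuinely changes, for example because an object moves into it, receives a near-zero weight and does not bleed into the current frame.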
Li et al. [105] performed video dehazing and stereo reconstruction simultaneously. The depth cues from stereo matching and the fog information reinforce each other, producing better results than traditional stereo or fog removal algorithms. First, the photo-consistency term is improved to model the appearance change caused by the scattering effect. A matting Laplacian prior on fog transmission imposes a smoothness constraint that preserves details in the scene depth, and an ordering constraint strengthens the consistency between the scene depth and the fog transmission of adjacent points. These constraints are added to the constructed MRF framework, and auxiliary variables are introduced for iterative optimization. The algorithm calculates the fog transmission of each pixel directly from the scene depth (and the estimated fog density). This ensures that the stereo reconstruction and defogging results are consistent. Eliminating the airlight-albedo ambiguity during defogging maintains the temporal consistency of the final defogged video. In the experiment, the algorithm was applied to underwater video and obtained an excellent defogging effect.
Qing et al. [106] proposed a space-time information fusion algorithm for underwater video defogging. Based on the DCP algorithm, it optimizes the estimation of the transmission image and of the atmospheric light value. For transmission image extraction, the transmission image of the first frame of the video is extracted based on DCP and then refined by the guided filter. Since there is little difference between the transmission images of adjacent frames, the transmission image of each subsequent frame is obtained by linear translation filtering, with the transmission image of the previous frame as input and the grayscale image of the frame as guidance. In estimating the background light, an adjustment factor combines the current frame's background light estimate with the previous value to avoid frequent changes of the atmospheric light value. The computational complexity is reduced because the transmission image is extracted with the dark channel prior only in the first frame. In addition, the correlation between adjacent frames of the video is captured through the transmission image and the atmospheric light estimate, which fuse spatial and temporal information; this reduces the flicker caused by changes in the transmission image and atmospheric light value.
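The role of the adjustment factor in stabilizing the background light can be illustrated with a simple recursive blend of the previous smoothed value and the current per-frame estimate. The exact update rule of [106] is not given in the text, so the exponential form below, the name `adjustment`, and its default value are assumptions made purely for illustration.

```python
import numpy as np

def smooth_background_light(per_frame_estimates, adjustment=0.9):
    """Blend each frame's estimate with the previous smoothed value.

    adjustment close to 1 keeps the background light nearly constant;
    adjustment = 0 reduces to independent per-frame estimation.
    """
    smoothed = []
    previous = None
    for estimate in per_frame_estimates:
        estimate = np.asarray(estimate, dtype=float)
        if previous is None:
            previous = estimate  # first frame: no history available
        else:
            previous = adjustment * previous + (1.0 - adjustment) * estimate
        smoothed.append(previous)
    return np.stack(smoothed)
```

With per-frame estimates that alternate between two values, the smoothed sequence changes by only a small fraction of the raw jump from one frame to the next, which removes one source of interframe flicker in the restored video.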
This algorithm makes full use of the relationship between frames and preserves the temporal characteristics of the video. Although extracting time sequence information increases the complexity of the algorithm, key frame parameters can be used in place of the adjacent multi-frame images to shorten the calculation time. The underwater video enhancement algorithms are shown in Table 11.
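The idea of stabilizing the background light across frames can be sketched as follows. This is a minimal illustration, not the method of [106]: the exponential blending factor `alpha` stands in for the adjustment factor, whose actual form is not given here, and the guided-filter propagation of the transmission image is omitted. All function names are hypothetical.

```python
import numpy as np

def dark_channel(img, patch=3):
    """Minimum over color channels, then minimum over a patch x patch window."""
    min_rgb = img.min(axis=2)
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode="edge")
    h, w = min_rgb.shape
    out = np.full((h, w), np.inf)
    for dy in range(patch):
        for dx in range(patch):
            out = np.minimum(out, padded[dy:dy + h, dx:dx + w])
    return out

def estimate_background_light(img, dark, top_fraction=0.001):
    """Mean color of the pixels with the largest dark-channel values."""
    n = max(1, int(dark.size * top_fraction))
    idx = np.argsort(dark.ravel())[-n:]
    return img.reshape(-1, 3)[idx].mean(axis=0)

def smooth_background_light(frames, alpha=0.9):
    """Blend each frame's estimate with the previous smoothed value.

    alpha is an assumed smoothing weight: larger values change more slowly.
    """
    smoothed = []
    prev = None
    for frame in frames:
        current = estimate_background_light(frame, dark_channel(frame))
        prev = current if prev is None else alpha * prev + (1 - alpha) * current
        smoothed.append(prev)
    return np.array(smoothed)

# Synthetic frames with values in [0, 1], used only to exercise the functions.
rng = np.random.default_rng(0)
frames = [rng.random((16, 16, 3)) for _ in range(5)]
lights = smooth_background_light(frames)
```

With this blending, the background light can change by at most a fraction (1 - alpha) of the gap between the old value and the new estimate per frame, which limits frame-to-frame flicker at the cost of slower adaptation to real scene changes.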

| Author | Algorithm | Contribution |
| --- | --- | --- |
| Tang et al. [37] | Extracts features from low-resolution images after subsampling; subsampling and IIR Gaussian filter are used to form a fast filter to complete the fast two-bit convolution operation | Effectively relieves the strain on computing resources caused by increased scale |
| Li et al. [82] | The convolutional layer in the network structure of UWCNN is not connected to other convolutional layers in the same block, and the network does not use any fully connected layer or batch normalization | The total depth of the network is only 10 layers, which reduces the computational cost and makes it easy to train |
| Islam et al. [99] | In | |
| Qing et al. [106] | The images of subsequent frames are guided by the grayscale image of the first frame, and the current frame's background light estimation is combined to avoid frequent changes of the atmospheric light value | Reduces the computational complexity and the flicker caused by changes in the transmission image and atmospheric light value |

Underwater Vision Dataset
For underwater video and image enhancement, an underwater vision dataset serves as the object to be enhanced, as a training sample for deep learning models, and as a test set for algorithm performance. Through the efforts of many scholars, some well-recognized datasets have been established. Li et al. [91] built the Port Royal dataset by using WaterGAN. Jian et al. [107] established the OUC-Vision dataset by taking photos of different postures and positions of individuals underwater. Berman et al. [108] collected images from different locations and water quality levels and annotated them to establish the SQUID dataset. Li et al. [81] established a large-scale real underwater image enhancement benchmark dataset (UIEBD), which includes underwater images with different degrees of degradation and corresponding high-quality reference images. Liu et al. [109] established the real-world underwater image enhancement (RUIE) dataset by using a multi-view underwater imaging system. It includes the underwater image quality set (UIQS), the underwater color cast set (UCCS), and the underwater higher-level task-driven set (UHTS). Islam et al. [99] constructed the EUVP dataset with 12,000 paired and 8000 unpaired cases by capturing underwater video from different cameras and the internet. Wei et al. [110] established MABLs, the first dataset for background light estimation of underwater images, consisting of 500 images with different scenes and distortion levels, with manually labeled background light values. Underwater vision datasets are shown in Table 12.

used in underwater data without labels. At the same time, the strategy of using NR assessments to optimize parameters in the algorithm is also very common.
Due to the universality of NR indicators, many quality assessment models based on deep learning have been proposed, for example, neural image assessment (NIMA) [119], which is a novel approach to predict both technical and aesthetic qualities of images, deep image quality assessor (DIQA) [120], and multi-task end-to-end optimized network (MEON) [121]. Quality assessment models based on deep learning have better data fitting ability and less error with respect to subjective indicators.
No-reference quality evaluation indices designed especially for underwater images include underwater color image quality evaluation (UCIQE) proposed by Yang et al. [122], underwater image quality measure (UIQM) proposed by Panetta et al. [123], and colorfulness-contrast-fog density (CCF) proposed by Wang et al. [124]. Compared with general evaluation indicators, these indices better reflect the contrast, color richness, and haze of underwater images. Underwater vision enhancement algorithms use these quality assessment models to measure their performance.
There are some similarities between video quality and image quality evaluation methods, and image quality evaluation indices such as PSNR and SSIM can be applied directly to video. However, they only reflect the quality of individual frames. They cannot reflect the motion characteristics and accompanying temporal information that distinguish video from still images. Quality assessment methods used for video include motion-based video integrity evaluation (MOVIE) [125], video quality model (VQM) [126], spatiotemporal most-apparent-distortion model (STMAD) [127], and deep learning-based video quality evaluation models such as DeepVQA [128], C3DVQA [129], SACONVA [130], and Deep BVQA [131].
At present, there are few studies on underwater video quality evaluation, and the existing ones do not have good applicability. For example, Moreno-Roldan et al. [132] proposed the generalization-non-linear regression model (NLR.G) and accuracy-non-linear regression model (NLR.A) to evaluate underwater video quality. Song et al. [133] proposed a no-reference underwater video quality assessment model (NR-UVQA) based on spatial natural characteristics and coding parameters. These models have not been widely used due to insufficient samples or over-fitting. The development of underwater video enhancement technology urgently needs more evaluation indices with good performance.
Due to the complexity of the underwater environment, there are many reasons for distortion and quality degradation. General video and image quality indices cannot fully reflect the natural underwater environment. Even the widely used indicators for underwater images, UCIQE and UIQM, tend to score highly colored underwater images more favorably. Reasonable and universal underwater video quality assessments are even rarer. Therefore, it is of great significance for underwater video image enhancement to develop more accurate and suitable indicators that reflect underwater video image quality and have good generalization performance. Some indices with high utilization rates in underwater image quality assessment are calculated as follows:

(1) MSE

$$\mathrm{MSE} = \frac{1}{MN} \sum_{m=1}^{M} \sum_{n=1}^{N} \left[ f(m, n) - I(m, n) \right]^2$$

In the equation, M and N are the height and width of the image, f(m, n) is the pixel value of the reference image at (m, n), I(m, n) is the pixel value of the image to be tested at (m, n), and the double sum runs over the whole image. The smaller the MSE is, the closer the image to be measured is to the reference image, and the higher the quality is.

(2) PSNR

$$\mathrm{PSNR} = 10 \log_{10} \frac{\mathrm{MAX}_I^2}{\mathrm{MSE}}$$

In the equation, MAX_I in the numerator represents the maximum possible value of an image sample, which is 255 if each sample point is represented by 8 bits. The denominator is the MSE. The larger the PSNR is, the higher the fidelity of the image to be tested to the reference image is, and the higher the image quality is.

(3) SSIM

$$\mathrm{SSIM}(x, y) = \frac{(2 \mu_x \mu_y + C_1)(2 \sigma_{xy} + C_2)}{(\mu_x^2 + \mu_y^2 + C_1)(\sigma_x^2 + \sigma_y^2 + C_2)}$$

In the equation, x is the reference image, and y is the image to be measured. µ represents the average value of the image, while σ is the standard deviation of the image. σ_xy represents the covariance between x and y. To avoid having a zero denominator, C_1 and C_2 are very small constants. In practice, the SSIM is between 0 and 1; the closer it is to 1, the better the image quality is.

(4) Entropy

$$\mathrm{Entropy} = -\sum_{i=0}^{n} p(i) \log_2 p(i)$$

In the equation, p(i) represents the probability of gray value i appearing in the image, and n is the highest gray level of the image, 0 ≤ n ≤ 255. The higher the Entropy, the higher the image quality.

(5) NIQE

$$\mathrm{NIQE} = \sqrt{(v_1 - v_2)^{T} \left( \frac{\sigma_1 + \sigma_2}{2} \right)^{-1} (v_1 - v_2)}$$

In the equation, the calculation of NIQE requires the mean vectors v_1, v_2 and covariance matrices σ_1, σ_2 of the natural image model and the distorted image, obtained by fitting each of them. The distance between the fitted parameters of the natural image and the distorted image is then calculated to measure the image quality. NIQE represents the distance between the image to be measured and the natural image, and the smaller the value, the higher the quality of the image.

(6) UCIQE

$$\mathrm{UCIQE} = c_1 \delta_c + c_2 \, con_l + c_3 \mu_s$$

In the equation, δ_c is the standard deviation of chroma, con_l is the contrast of brightness, and µ_s is the mean value of saturation. c_1, c_2, c_3 are weighting coefficients with values of 0.4680, 0.2745, and 0.2576, respectively. The coefficients are taken from Reference [122]. The higher the UCIQE, the better the image quality.

(7) UIQM

$$\mathrm{UIQM} = c_1 \cdot \mathrm{UICM} + c_2 \cdot \mathrm{UISM} + c_3 \cdot \mathrm{UIConM}$$

In the equation, UIQM is a weighted combination of UICM (underwater image colorfulness measure), UISM (underwater image sharpness measure), and UIConM (underwater image contrast measure). The weight coefficients are c_1 = 0.0282, c_2 = 0.2953, and c_3 = 3.5753. This coefficient setting is taken from Reference [123] and was fitted by using multiple linear regression. The larger UIQM is, the better the overall quality of the image is.
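Several of the indices above are simple enough to compute directly. The following is a minimal sketch, assuming 8-bit grayscale arrays for MSE, PSNR, and Entropy. The `uciqe` and `uiqm` helpers only perform the final weighted combination with the coefficients quoted above; their component measures are assumed to be computed elsewhere, and all function names are illustrative.

```python
import numpy as np

def mse(ref, test):
    """Mean squared error between a reference image and a test image."""
    ref = np.asarray(ref, dtype=np.float64)    # cast first to avoid uint8 wrap-around
    test = np.asarray(test, dtype=np.float64)
    return float(np.mean((ref - test) ** 2))

def psnr(ref, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(ref, test)
    if err == 0:
        return float("inf")
    return float(10.0 * np.log10(max_value ** 2 / err))

def entropy(gray, levels=256):
    """Shannon entropy (in bits) of the gray-level histogram."""
    hist = np.bincount(np.asarray(gray, dtype=np.int64).ravel(), minlength=levels)
    p = hist / hist.sum()
    p = p[p > 0]                               # treat 0 * log2(0) as 0
    return float(-np.sum(p * np.log2(p)))

def uciqe(delta_c, con_l, mu_s, c=(0.4680, 0.2745, 0.2576)):
    """Weighted sum of chroma std, luminance contrast, and mean saturation."""
    return c[0] * delta_c + c[1] * con_l + c[2] * mu_s

def uiqm(uicm, uism, uiconm, c=(0.0282, 0.2953, 3.5753)):
    """Weighted sum of colorfulness, sharpness, and contrast measures."""
    return c[0] * uicm + c[1] * uism + c[2] * uiconm

# Tiny hand-checkable example: one pixel differs by 10 gray levels.
ref = np.array([[0, 255], [0, 255]], dtype=np.uint8)
test = np.array([[0, 255], [0, 245]], dtype=np.uint8)
print(mse(ref, test))    # 25.0, since 10**2 / 4 pixels
print(entropy(ref))      # 1.0, two equally likely gray levels
print(psnr(ref, test))
```

The weights show why UIQM is dominated by the contrast term when the three component measures are of similar magnitude, which is worth keeping in mind when comparing scores across methods.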

Algorithm Result
To verify the performance of these algorithms, we selected some typical algorithms from different categories, including CLAHE [21], MSRCR [34], FUSION [42], UDCP [59], UWCNN [82], UGAN [96], and FGAN [99]. We tested them on an effective and public underwater test dataset (U45) [134], which includes the color casts, low contrast, and haze-like effects of underwater degradation. These represent typical features of low-quality underwater images. The results are shown in Figures 8-10. The performance of an algorithm cannot be fully reflected by subjective visual perception alone. Therefore, the UCIQE underwater image quality index and the NIQE natural image quality index were selected for testing and evaluation. The average results are shown in Table 13.

We evaluated both subjective visual effects and objective quality indicators. In terms of subjective visual effects, the histogram-based algorithm has an obvious effect on color and contrast enhancement, but the red component may be over-enhanced. After enhancement by the MSRCR algorithm, the brightness is greatly improved, but some colors are distorted and the details are blurred. The fusion-based algorithm improves green and foggy scenes, but produces a red shift in blue scenes. The UDCP algorithm has excellent defogging performance but has obvious defects in improving image color and even deepens green and blue. The performance of traditional algorithms varies with different datasets. The CLAHE and UDCP algorithms do not perform well on green images. The MSRCR algorithm has insufficient ability to remove fog. The fusion-based algorithm adapts well to degraded images in various environments. Although the DCP algorithm has a significant effect on fog removal, the green and blue parts are significantly deepened. The algorithms based on deep learning are more natural in color fidelity, without obvious distortion or excessive enhancement, and have a good enhancement effect for different underwater environments. Because the large underwater image dataset used to train the networks covers a variety of underwater degraded images, the deep networks fully learn these degradation features.

In terms of objective indicators, when using UCIQE, a dedicated evaluation index for underwater images, the traditional MSRCR and FUSION algorithms achieve higher UCIQE values on green scene images than the CLAHE and UDCP algorithms. The three algorithms based on deep learning are better than the traditional algorithms in UCIQE. In blue scenes, the UCIQE of images processed by the UDCP algorithm is better than that of other traditional methods, and is comparable to the quality of images enhanced by the deep learning-based UWCNN and UGAN. In hazy scenes, the UCIQE of images processed by the UDCP algorithm is significantly improved, which is due to the excellent defogging ability of the original DCP algorithm. The algorithms based on deep learning have yet to be improved in this respect. Taking the three scenarios together, traditional algorithms perform differently on images of different scenarios, while deep learning algorithms have good enhancement effects in different underwater environments. As a supplement, the evaluation results of the natural image index NIQE also show that images processed by the deep learning-based enhancement methods differ significantly from the natural or underwater images in terms of this indicator, and the change is quite obvious.

Conclusions and Future Research Directions
As an essential carrier of marine ecosystem information, underwater video images play an indispensable role in advanced computer vision tasks such as underwater target recognition and detection and underwater navigation. However, due to the interference of the complex underwater environment and natural factors, underwater video images suffer from serious blurring and color fading. With the ongoing efforts of many scholars, underwater image enhancement technology has made significant progress, but the technology is not yet mature. Traditional enhancement methods can achieve better results when aimed at a certain type of underwater image or an image with certain characteristics. Still, their applicability is not broad enough due to the changing and complex underwater environment. Methods based on deep learning can reduce the impact of the complex underwater environment on the results by learning from many samples. However, they are highly dependent on the dataset, and the coverage of current datasets is still limited. At the same time, most deep learning-based methods do not fully integrate the underwater imaging model and focus only on enhancement. Therefore, the development of underwater video image enhancement technology can be further strengthened in the following directions:

(1) Improve adaptability and robustness. Although single-image processing methods have made significant progress, due to the complex underwater environment most of the existing image processing methods are only effective for a specific type of underwater image environment. The adaptability and robustness still need to be improved.
(2) Establish a more comprehensive underwater image dataset. Deep learning is highly dependent on the quality of datasets, but the lack of sufficient reference images for underwater images greatly limits the effectiveness of deep learning-based methods. Building a more comprehensive dataset covering different subsea environments will help improve the adaptability of the algorithm, and the dataset can be used to test and enhance the algorithm's performance.
(3) Improve the underwater video and image quality evaluation system. At present, most researchers only evaluate the performance of underwater image processing methods through subjective indicators, UIQM and UCIQE. Although these are widely used, they are based on the characteristics of the human visual system and tend to score over-enhanced color images favorably. Therefore, it is of great significance to develop an objective evaluation index with good generalization performance and strong robustness to interference. We think that aesthetic image quality indicators, such as NIMA, can assist in underwater image optimization to improve the subjective perception of the image.
(4) Improve real-time performance and strengthen research on underwater video enhancement technology. Existing methods mainly focus on single underwater images and on improving enhancement performance, and cannot be directly applied to underwater video enhancement, let alone meet the high real-time requirements of underwater vehicles. At the same time, we should pay more attention to the enhancement effect of underwater video and make full use of its temporal characteristics.

Figure 3. Classification of underwater image enhancement methods.

Combined physical methods

Traditional model-based underwater image enhancement methods usually need to estimate the transmission map and parameters of the underwater image based on prior knowledge and other strategies, and those estimated values thus have poor adaptability. The method combined with the physical model mainly uses the excellent feature extraction ability of the convolutional neural network (CNN) to solve for the parameter values in the imaging model, such as the transmission map. In this process, the CNN replaces the assumptions or prior knowledge used in traditional methods, such as the dark channel prior. The network model is usually divided into two parts: first, the original image is input to the CNN for feature extraction to obtain the transmission map; then, the transmission map and the original image are entered into the optical imaging model for inverse calculation to restore a clear image. The algorithm process is shown in Figure 4.
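The inversion step in the second part of such a pipeline can be sketched as follows. This is a minimal illustration that assumes the widely used simplified imaging model I = J * t + B * (1 - t), where I is the observed image, J the clear scene, t the transmission map, and B the background light. The transmission map is treated as a given array here (in practice it would come from the CNN), and the function name and the lower bound `t_min` are hypothetical.

```python
import numpy as np

def restore_scene(image, transmission, background_light, t_min=0.1):
    """Invert I = J * t + B * (1 - t) to recover the scene radiance J.

    image:            H x W x 3 array with values in [0, 1]
    transmission:     H x W array, e.g. predicted by a CNN (assumed given)
    background_light: length-3 array, one value per color channel
    t_min:            assumed lower bound that avoids division by near-zero t
    """
    image = np.asarray(image, dtype=np.float64)
    t = np.clip(np.asarray(transmission, dtype=np.float64), t_min, 1.0)[..., None]
    b = np.asarray(background_light, dtype=np.float64)
    restored = (image - b * (1.0 - t)) / t
    return np.clip(restored, 0.0, 1.0)

# Synthetic round trip: degrade a known scene, then invert the model.
rng = np.random.default_rng(0)
scene = rng.random((8, 8, 3))                       # hypothetical clear image J
t_map = rng.uniform(0.3, 1.0, size=(8, 8))          # hypothetical transmission
light = np.array([0.1, 0.5, 0.6])                   # hypothetical bluish-green B
degraded = scene * t_map[..., None] + light * (1.0 - t_map[..., None])
restored = restore_scene(degraded, t_map, light)
```

In this synthetic setting the restored image matches the original scene because the transmission and background light are exact; with estimated parameters, errors in the transmission map are amplified where t is small, which is why a lower bound on t is commonly applied.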

Figure 4. CNN underwater image enhancement combined with model.

Figure 5. CNN underwater image enhancement for non-physical model.

Table 1. Histogram-based underwater image enhancement methods.

Table 2. Underwater image enhancement methods based on retinex theory.

Table 3. Underwater image enhancement methods based on image fusion algorithms.

Table 4. Underwater image restoration algorithms based on polarization.

Table 5. Underwater image restoration algorithms based on the DCP algorithm.
Table 6 lists the underwater image restoration algorithms based on integral imaging.

Table 6. Underwater image restoration algorithms based on integral imaging.

Table 7. Underwater image enhancement methods based on CNNs combined with a physical model.

Table 8. Underwater image enhancement methods of non-physical model CNNs.

Table 9. Underwater image enhancement methods based on CGAN.

Table 10. Underwater image enhancement and restoration based on CycleGAN.

Table 13. Objective evaluation index of the test image.