Underwater Image Enhancement Using Dynamic Color Correction and Lightweight Attention-Embedded SRResNet

Kui Zhang; Yingying Zhang; Da Yuan; Xiandong Feng

doi:10.3390/jmse13081546

,

and

¹

Institute of Oceanographic Instrumentation, Qilu University of Technology (Shandong Academy of Sciences), Qingdao 266300, China

²

State Key Laboratory of Physical Oceanography, Qingdao 266300, China

^*

Author to whom correspondence should be addressed.

J. Mar. Sci. Eng.2025, 13(8), 1546;https://doi.org/10.3390/jmse13081546

This article belongs to the Section Ocean Engineering

Version Notes

Order Reprints

Abstract

An enhancement method integrating dynamic color correction with a lightweight residual network is proposed to resolve the challenges of color bias and insufficient contrast in underwater imaging. The dynamic color correction module is implemented based on the gray-world assumption, adaptively adjusting inter-channel color shifts to mitigate blue-green dominance in acquired images. Subsequently, the corrected images are processed through an improved SRResNet architecture incorporating lightweight residual blocks with embedded channel–spatial attention mechanisms, enhancing the responses of feature channels and the saliency of spatial regions Model complexity is reduced through depthwise separable convolutions and channel dimension reduction, ensuring computational efficiency. Validation on UIEB and RUIE datasets demonstrates superior qualitative and quantitative performance, achieving PSNR gains of 0.92–5.95 dB and UCIQE improvements of 0.14–0.74, compared with the established methodologies. Ablation studies quantify the contributions of the color correction and attention mechanisms to the overall enhancement efficiency, verifying the network’s effectiveness.

Keywords:

underwater imaging; image enhancement method; dynamic color correction; lightweight residual blocks; channel–spatial attention mechanism

1. Introduction

The ocean’s abundant resources and biodiversity have established it as a strategically vital region for all nations. Underwater imaging plays a pivotal role in uncovering critical features such as seabed topography, sediment distribution, and biodiversity patterns. The effective exploration of marine resources and the assessment of ecosystem health depend on high-quality underwater imagery to support scientific research and inform engineering decision-making. However, the complex and dynamic underwater environment considerably alters light propagation, presenting major challenges for imaging systems. As shown in Figure 1, intense scattering and absorption in seawater generate blue-green channel shifts and contrast reduction, resulting in substantial image quality degradation []. This deterioration not only impairs visual clarity but also obstructs tasks such as target detection, identification, and 3D reconstruction. Consequently, underwater image enhancement (UIE) techniques, aiming at rectifying color casts and enhancing contrast, have garnered significant attention and extensive research [].

Figure 1. Underwater degradation image.

Drews-Jr et al. [] proposed the Underwater Dark Channel Prior (UDCP), utilizing the lowest 0.1% of blue and green pixels to estimate transmission and ambient light, effectively removing haze by mitigating red-light attenuation. Retinex-based methods [] separate illumination and reflectance through multi-scale filtering to correct color casts and enhance local contrast. However, they necessitate manual or semi-automatic parameter tuning, rendering them unstable under varying noise and lighting conditions. Pizer et al. [] introduced Contrast-Limited Adaptive Histogram Equalization (CLAHE), which restricts noise amplification by clipping local histograms, substantially improving local contrast and detail clarity. Schechner and Karpel [] employed polarization imaging to separate backscatter from transmitted light, effectively suppressing noise and improving visibility in turbid waters. Ancuti et al. [] developed a Laplacian pyramid fusion framework that initially enhances various preprocessed outputs (e.g., white balance, gamma correction) independently and subsequently fuses them adaptively based on naturalness and contrast, balancing global color consistency with detail preservation. Peng et al. [] presented a Physical Dicolor Model (PDM) histogram equalization approach that restores color distributions using an underwater imaging model before applying histogram equalization to enhance contrast, achieving both physical consistency and visual appeal. Although these methods obviate the need for large-scale annotations, their dependence on parameter estimation and susceptibility to environmental variations frequently result in inconsistent performance.

Recently, end-to-end deep learning has dominated underwater image enhancement by leveraging large labeled or synthetic datasets to learn direct degradation-enhancement mappings, enabling robust color bias removal and concurrent contrast and detail restoration. Li et al. [] introduced WaterNet, employing a gated network to fuse white balance, histogram equalization and gamma correction outputs for adaptive color-cast correction. Anwar et al. [] trained a lightweight UWCNN on synthetic datasets encompassing diverse degradations, circumventing physical-model errors while efficiently restoring fine details. Islam et al. [] proposed FUnIE-GAN, integrating multi-scale perceptual losses into a GAN framework to generate natural textures and colors in real time on embedded platforms. Wei et al. [] designed UHD-SFNet with dual branches—frequency-domain texture extraction via 1D convolutions and spatial color processing via U-RSGNet—whose fusion enables ultra-high-resolution reconstructions. Sun et al. [] realized image enhancement through an end-to-end adaptive approach based on deep convolutional neural networks with an encoding–decoding structure. A key advantage of this method is that it eliminates the need to account for physical environmental factors. Wang et al. [] presented UIEC²-Net, simultaneously extracting RGB and HSV features with adaptive attention fusion for end-to-end luminance and chromatic correction. Li et al. [] introduced Emerging from Water, a weakly supervised CycleGAN-based color-transfer method utilizing adversarial, cycle-consistency and SSIM losses to learn underwater-to-air mappings without paired data. Lin et al. [] proposed a dilated generative adversarial network (DGAN). Notably, this network not only enables the discriminator to judge the authenticity of the entire image, but also equips it with the capability to perform pixel-level classification for each constituent pixel. Xiang et al. [] proposed a four-input deep learning model with polarized residual dense networks to effectively recover underwater mode images. Fu et al. [] proposed an unsupervised underwater image restoration method. This approach leverages the homology between the original underwater image and its undegraded counterpart to effectively enhance image quality. Qin et al. [] proposed MCRNet, fusing RGB/HSV/Lab features via self-attention to substantially improve color correction and contrast in severely degraded images. Shin et al. [] proposed a three-stage cooperative ambient light estimation network. This network incorporates multiscale feature extraction, feature fusion, and non-linear regression to achieve accurate estimation of Ac. Hao et al. [] proposed a two-stage underwater image restoration method based on physical modeling and causal intervention. By introducing a novel underwater image degradation formula and constructing an image restoration network integrated with multi-scale convolution and skip connections, this method achieves high-quality image restoration. Spandana et al. [] put forward a multi-scale underwater image enhancement framework based on attention mechanisms and residual learning. Through designing an adaptive feature weighting module and building a cross-layer information interaction network, this framework realizes effective improvement of underwater image clarity. Despite excelling in color correction, contrast restoration, and detail preservation, these deep models remain constrained by large parameter counts, high computational costs, and substantial dependence on labeled data.

To address these issues, an underwater image enhancement method using dynamic color correction and lightweight attention-embedded SRResNet is presented in this paper. Dynamic color restoration is grounded in the gray-world hypothesis to adaptively rectify global color bias. A lightweight residual network—specifically, an enhanced SRResNet variant—was created for fine-grained detail refinement and contrast augmentation. By synergistically combining traditional color-constancy techniques for rapid bias mitigation with deep-learning-based high-frequency detail recovery, the presented approach substantially elevates visual fidelity. The contributions of this paper are listed as follows:

Dynamic Color-Recovery Module: Derives per-channel correction factors directly from image mean values—eliminating prior assumptions—to achieve effective color cast removal.
Lightweight Residual Blocks: An SRResNet-derived architecture leveraging depthwise separable convolutions with reduced channel dimensions and layer counts to minimize computational overhead, while preserving feature propagation through residual connections.
Comprehensive Experimental Validation: Conducts comparative and ablation studies on the UIEB dataset against traditional and established deep-learning methods, employing qualitative and quantitative metrics to substantiate the method’s superiority in color correction and detail enhancement.

The remainder of this paper is structured as follows: Section 2 details the proposed method; Section 3 presents experimental results and analyses; Section 4 concludes and discusses future directions.

2. Proposed Method

The proposed enhancement framework comprises two principal components, as illustrated in Figure 2. This section details the architectural design of (1) the Dynamic Color Recorrection module and (2) the lightweight residual-based SRResNet backbone, followed by the theoretical rationale for the training strategy and loss function design.

Figure 2. Workflow architecture of the proposed underwater image enhancement framework.

2.1. Dynamic Color Recorrection Module

Underwater images frequently manifest significant blue-green color casts due to wavelength-dependent light attenuation in aquatic environments. To address this phenomenon, a Dynamic Color Recorrection (DCR) module is introduced, grounded in the gray-world assumption that the mean intensities of all color channels in a natural image converge to equilibrium. This module adaptively rectifies color biases through a pixel-wise correction mechanism, as illustrated in Figure 3.

Figure 3. DCR module workflow architecture.

Given an input underwater image

I \in R^{H \times W \times 3}

, the proposed DCR module quantifies and rectifies color deviations through a four-step pipeline. First, the global mean intensities of the R, G, and B channels are computed to characterize color distribution:

{\bar{I}}_{c} = \frac{1}{H \times W} \underset{i = 1}{\sum^{H}} \underset{j = 1}{\sum^{W}} I_{c} (i, j), c \in {R, G, B}

(1)

Here,

{\bar{I}}_{c}

denotes the mean luminance of channel

c

, with its deviation directly reflecting color cast severity. Following the gray-world hypothesis, the ideal average intensity for the three channels is identical. Thus, the arithmetic mean of these channel averages is defined as the target grayscale reference:

{\bar{I}}_{g} = \frac{{\bar{I}}_{R} + {\bar{I}}_{G} + {\bar{I}}_{B}}{3}

(2)

This value establishes the neutral gray baseline for subsequent gain calculations. To ensure numerical stability, a lower-bound constraint is applied to the gray scale reference

{\bar{I}}_{g} = m a x ({\bar{I}}_{g}, ϵ)

, incorporating a constant

ϵ = 0.01

for stability and calculating per-channel correction gains:

g_{c} = \frac{{\bar{I}}_{g}}{{\bar{I}}_{c}}, c \in {R, G, B}

(3)

The gain factor

g_{c}

quantifies required compensation: if a channel’s mean intensity is below the target gray level (for instance, the red channel under a blue-green cast),

g_{c}

exceeds unity to boost its response; otherwise,

g_{c}

attenuates the channel. The corrected image is generated by applying these channel-specific gains to each pixel:

{\hat{I}}_{c} (i, j) = g_{c} \cdot I_{c} (i, j), c \in {R, G, B}

(4)

To prevent pixel values from exceeding valid ranges after channel-wise gain adjustment—which may lead to overexposure or color distortion—a clipping operation is introduced following color correction. This step constrains each channel’s pixel values within the valid range of [0, 1], and is expressed as follows:

I_{c}^{'} (i, j) = \min (\max ({\hat{I}}_{c} (i, j), 0), 1), c \in {R, G, B}

(5)

The output

I^{'} \in R^{H \times W \times 3}

exhibits balanced color chromaticity while preserving the global luminance structure. This dynamic gain strategy integrates the classical gray-world color constancy principle with an adaptive correction mechanism, effectively eliminating underwater color casts without relying on prior scene-specific knowledge.

2.2. Lightweight Residual Enhancement Network

Although dynamic color correction mitigates color casts, the enhanced image

I^{'}

often exhibits persistent low contrast and blurred details due to residual degradation. To address these limitations, a lightweight residual enhancement network based on SRResNet is proposed, as illustrated in Figure 4. The residual network integrates three key design principles: depthwise separable convolutions to reduce computational costs to approximately 1/8 of standard convolutional operations; a dual attention mechanism to amplify salient features; and residual connections to maintain information flow while mitigating gradient vanishing, thereby rendering the network suited for underwater image enhancement tasks that demand high-fidelity detail preservation.

Figure 4. Architecture of the proposed lightweight residual enhancement network.

The network initiates feature extraction through a head convolutional layer:

F_{0} = σ (W_{h} * I^{'} + b_{h})

(6)

where

W_{h} \in R^{9 \times 9 \times 3 \times 64}

denotes the convolutional kernel, and

σ

represents the activation function. The

9 \times 9

kernel captures spatial features within an extended receptive field, while ReLU function introduces non-linear transformation to enhance feature discriminability. The bias term

b_{h}

adaptively modulates activation threshold across channels, enabling robust adaptation to diverse input distributions.

Subsequently, the initial feature map

F_{0}

undergoes progressive refinement through eight cascaded lightweight residual blocks. Each residual block

R e s B l o c k_{i}

produces output defined as follows:

F_{i} = R e s B l o c k_{i} (F_{i - 1}), i = 1,2, \dots, N

(7)

This cascaded structure enables progressive extraction of multilevel feature representations, transitioning from low-level textures to high-level semantic information, ultimately producing the enhanced image through the tail fusion module:

I_{E} = W_{t} * F_{N} + b_{t}

(8)

where

W_{t}

maps the high-dimensional features back into the three-channel RGB color space, and

b_{t}

is the bias parameter of the tail convolutional layer used to adjust the reference values of RGB channels in the output image.

Core Design of Lightweight Residual Block

The lightweight residual block serves as the core building block of the enhancement network. Integrating depthwise separable convolution with an attention mechanism, achieves computational complexity of approximately 1/8 that of standard convolutions while maintaining almost equivalent performance. Given an input feature map

F_{i n}

, the block performs sequential feature transformation through two depthwise separable convolutional layers, where the first layer simultaneously performs spatial feature extraction and channel fusion, formulated as follows:

F_{1} = σ (B N (W_{p w 1} * (W_{d w 1} *_{g r o u p s} F_{i n})))

(9)

where

W_{p w}

is the depthwise convolution kernel (

3 \times 3 \times 1

, groups = 64) that independently processes the spatial information per channel, and

W_{p w}

is the pointwise convolution kernel

(1 \times 1 \times 64 \times 64)

for cross-channel fusion. The second layer preserves linear representation by omitting non-linear activation, formulated as follows:

F_{2} = B N (W_{p w 2} * (W_{d w 2} *_{g r o u p s} F_{1}))

(10)

To enhance feature representation, a tandem structure combining channel attention and spatial attention is introduced. The structure of the channel attention is illustrated in Figure 5, where the channel attention mechanism captures channel-wise statistics via global average pooling, as follows:

F_{c a} = F_{2} \otimes σ (M L P (G A P (F_{2})) + M L P (G M P (F_{2})))

(11)

Figure 5. Architecture of channel attention module.

GAP and GMP denote global average pooling and global max pooling, respectively, where MLP represents a two-layer fully connected network with shared weights across channels, and

\otimes

indicates element-wise multiplication. The spatial attention module generates spatial weight maps by aggregating channel-wise statistical information as follows:

F_{s a} = F_{c a} \otimes σ (C o n v_{7 \times 7} ([A v g P o o l (F_{c a}); M a x P o o l (F_{c a})])

(12)

Figure 6 shows the structure of spatial attention. AvgPool and MaxPool represent channel-wise global average pooling and global max pooling operations, respectively, which aggregate spatial information by computing mean and maximum statistics across spatial dimensions. Their concatenated outputs are processed through a

7 \times 7

convolution followed by the activation of

σ

to generate the spatial attention map. This dual attention design enables simultaneous focus on critical feature channels and spatial locations, enhancing feature discriminability. The enhanced features are fused with the original input through the residual concatenation as follows:

F_{o u t} = F_{i n} + F_{s a}

(13)

Figure 6. Architecture of spatial attention module.

Residual connections mitigate the vanishing gradient issues in deep networks while preserving intact information flow, thereby enabling the network to concentrate on learning enhancement-specific residuals rather than reproducing input features.

2.3. Loss Functions and Optimization Strategy

A hybrid loss function is designed to jointly optimize pixel-level accuracy and structural similarity:

L_{t} = λ L_{L 1} + (1 - λ) L_{S S I M}

(14)

where

L_{L 1} = | | I_{E} - I_{g t} | |_{1}

ensures pixel-level accuracy, and

L_{S S I M} = 1 - S S I M (I_{E}, I_{g t})

preserves structural consistency. Here,

I_{E}

denotes the enhanced image output by the mode,

I_{g t}

represents the corresponding high-quality underwater reference image, and

λ

is the weighting coefficient. This formulation guarantees that the enhanced output maintains both pixel-level proximity to the ground truth and structurally coherent characteristics, thereby ensuring multi-scale perceptual quality preservation. The complete underwater image enhancement process that integrates dynamic color restoration and residual learning can be formulated as follows:

I_{E} = F_{R} (F_{c} (I))

(15)

where

F_{c}

denotes the dynamic color restoration function, and

F_{R}

represents the lightweight residual network mapping. This two-stage framework synergistically combines physical priors of underwater image degradation with data-driven refinement, achieving efficient high-fidelity enhancement.

3. Experimental Validation

3.1. Dataset Selection and Experimental Configuration

Underwater datasets constitute a pivotal foundation for deep-learning-based image enhancement, enabling model training and performance evaluation. Underwater datasets are typically constructed via two paradigms: (1) synthetic data generation using style transfer techniques such as CycleGAN [] to generate clear reference images from underwater scenes, offering cost-effective scalability; and (2) real-world data acquisition involving underwater scenes capture followed by traditional enhancement for reference generation, which better preserves complex real-world degradation patterns despite higher resource requirements.

Evaluation is performed on two widely used UIE datasets: UIEB [] and RUIE []. UIEB comprises 950 real underwater images, including 890 paired samples for training and 60 challenging unpaired cases exhibiting greenish or bluish casts, low contrast, and turbidity. RUIE provides extended quality variations for robustness validation. Performance is assessed using both full-reference metrics (PSNR, SSIM) and no-reference metrics (UCIQE, UIQM).

Experiments were conducted on a platform equipped with an AMD R5-5600 CPU, 16 GB RAM, and an AMD Radeon RX 6750 GPU (AMD Inc., Santa Clara, CA, USA). The training input images were cropped to 512 × 512 pixels. The network architecture was implemented using PyTorch (version 2.0.1+cpu), with model optimization performed via the Adam optimizer. Data augmentation was achieved through horizontal flipping and random in-plane rotations of 90° and 180°. The training protocol was configured with a batch size of 8, 80 epochs, and an initial learning rate of 0.0002.

3.2. Qualitative Comparison

Qualitative evaluation was conducted on the UIEB dataset (including the Challenging Set) and the raw images from the RUIE dataset. The proposed model was compared against several existing underwater image enhancement algorithms, encompassing traditional methods such as UDCP [] and MLLE [], and deep-learning-based approaches including DICAM [], FA+Net [], DeepWater [], and PUIE-Net []. Most of the comparison methods were implemented using publicly available code or pretrained models released by the original authors, and re-evaluated under the same test conditions for fairness. Representative qualitative results are presented in Figure 7 and Figure 8.

Figure 7. Enhancement results on the UIEB dataset. (a) RAW input; (b) UDCP []; (c) MLLE []; (d) DICAM []; (e) FA+Net []; (f) DeepWater []; (g) PUIE-Net []; (h) proposed method.

Figure 8. Enhancement results on the RUIE dataset. (a) RAW input; (b) UDCP []; (c) MLLE []; (d) DICAM []; (e) FA+Net []; (f) DeepWater []; (g) PUIE-Net []; (h) proposed method.

As illustrated in Figure 7 and Figure 8, all examples were selected from the UIEB test set and the RUIE dataset. The first row displays the original underwater images, while rows 2–8 present enhancement results from seven comparative methods in the following order: UDCP [], MLLE [], DICAM [], FA+Net [], DeepWater [], PUIE-Net [], and the proposed method. Traditional approaches such as UDCP and MLLE exhibit some limitations, including color overcorrection, detail loss, and inadequate handling of complex underwater degradation, often resulting in unnatural outputs due to excessive compensation in blue and green channels. In contrast, deep-learning-based methods demonstrate improved detail restoration, where DICAM and FA+Net effectively enhance contrast but may introduce residual color casts or oversaturated regions, while PUIE-Net achieves more natural color reproduction yet maintains a slight greenish tint with constrained contrast improvement.

The experimental results demonstrate that the proposed method not only effectively eliminates blue-green color casts but also simultaneously enhances image contrast and texture details, thereby producing underwater images with significantly improved clarity and naturalness. The method exhibits high fidelity in structural detail preservation, generating output images that achieve better visual consistency with ground truth references. Across multiple comparative trials, the method consistently yields enhanced results characterized by natural color reproduction and rich detail representation, confirming its superior subjective visual quality.

3.3. Quantitative Comparison

To quantitatively evaluate the performance of the aforementioned methods, full-reference metrics including Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) are employed. PSNR [] quantifies the ratio of maximum signal power to corrupting noise power, calculated based on pixel-wise differences between reference and distorted images:

P S N R = 10 \cdot {l o g}_{10} (\frac{M A X_{I}^{2}}{M S E})

(16)

where

M A X_{I}^{2}

is the maximum possible pixel value in the image (usually 255 for 8-bit images), and

M S E

is the mean square error. SSIM [] integrates luminance, contrast, and structural information into a comprehensive quality assessment of the perceived image, grounded in human visual perception, defined as follows:

S S I M (x, y) = \frac{(2 μ_{x} μ_{y} + C_{1}) (2 σ_{x y} + C_{2})}{(μ_{x}^{2} + μ_{y}^{2} + C_{1}) (σ_{x}^{2} + σ_{y}^{2} + C_{2})}

(17)

μ_{x}

,

μ_{y}

denote the mean values of images

x

and

y

, respectively.

σ_{x}^{2}

and

σ_{y}^{2}

represent the variances of images

x

and

y

, respectively;

σ_{x y}

stands for the covariance between images

x

and

y

. Additionally,

C_{1}, C_{2}

are constants introduced to stabilize the denominator. Higher PSNR and SSIM values indicate superior enhancement performance. For no-reference assessment, Underwater Color Image Quality Evaluation (UCIQE) and Underwater Image Quality Measure (UIQM) are adopted. UCIQE [] combines chroma, saturation, and contrast through a linear combination as follows:

U C I Q E = c_{1} \times σ_{c} + c_{2} \times c o n + c_{3} \times μ_{s}

(18)

where

σ_{c}

is the standard deviation of chromaticity,

c o n

denotes luminance contrast,

μ_{s}

denotes the mean value of saturation, and

c_{1}, c_{2}, c_{3}

are weighting factors. UIQM [] evaluates the colorfulness, sharpness, and contrast of underwater images using the following formula:

U I Q M = b_{1} \times U I C M + b_{2} \times U I S M + b_{3} \times U I C o n M

(19)

U I C M

denotes the color metric of the underwater image,

U I S M

denotes the sharpness metric of the underwater image,

U I C o n M

denotes the contrast metric of the underwater image, and

b_{1}, b_{2}, b_{3}

are weighting factors. Higher UIQM values reflect better perceptual quality of the underwater images. All methods were evaluated on 50 images from both the UIEB and RUIE datasets, with average scores reported in Table 1. The highest values in each category are highlighted in red.

Table 1. Quantitative comparison of enhancement methods in terms of PSNR, SSIM, UCIQE, and UIQM.

Table 1 summarizes the comparative performance of the evaluated methods on the UIEB and RUIE test sets. On the UIEB dataset, the proposed approach achieves the highest PSNR of 20.093 dB, demonstrating superior noise suppression capability while preserving fine structural details for high-fidelity image reconstruction. It attains leading scores of 0.965 in UCIQE and 1.563 in UIQM, where UCIQE quantifies the efficacy of color correction and contrast enhancement, and UIQM evaluates comprehensive colorfulness, sharpness, and contrast—performance metrics that this method outperforms others in. The substantial margin over competing methods validates the effectiveness of the proposed color-compensation and contrast-enhancement strategies in addressing light attenuation and color-cast distortions.

For the more challenging RUIE dataset, the framework exhibits robust generalization performance. It achieves the highest UCIQE score among all methods, indicating superior enhancement of underwater image quality in terms of colorfulness, sharpness, and contrast. Although RUIE does not provide paired high-quality reference images for full-reference metrics such as PSNR and SSIM, the improvements in UCIQE and UIQM, together with consistent visual inspection, confirm the effectiveness of the proposed method. Notably, under extreme degradation caused by suspended particles and light scattering, the method effectively reconstructs natural color tones and restores fine details, confirming its robustness and adaptability in complex underwater environments.

3.4. Ablation Studies

To validate the contributions of key components, four ablation experiments were systematically conducted to evaluate the individual and combined effects of the Dynamic Color Recorrection (DCR) module and the Convolutional Block Attention Module (CBAM). All models were trained under identical hyperparameter configurations on the UIEB dataset and evaluated across both UIEB and RUIE test sets. Qualitative results are summarized in Table 2, with performance metrics including PSNR, SSIM, UCIQE, and UIQM.

Table 2. Performance comparison of underwater image enhancement performance under different module configurations.

Table 2 reports the quantitative performance of ablation configurations on the UIEB and RUIE test sets. The baseline model, excluding both DCR and CBAM, exhibits the lowest performance, confirming inadequate detail reconstruction and color correction capabilities. The DCR module demonstrates significant improvements in PSNR and related metrics, highlighting its critical role in color restoration. Sole integration of the CBAM substantially elevates UCIQE scores, validating its effectiveness in enhancing salient features through attention mechanisms. The full model, incorporating both DCR and CBAM, achieves optimal performance across all metrics, underscoring the synergistic advantages of their combined implementation.

4. Conclusions

This paper presents a novel underwater image enhancement framework that synergistically combines gray-world-based dynamic color restoration with a lightweight residual network. The proposed Dynamic Color Recorrection module adaptively corrects underwater color casts, while the subsequent residual network integrates spatial and channel attention mechanisms to enhance image details and contrast. Experimental results on the UIEB and RUIE datasets demonstrate that this method effectively suppresses blue-green color casts, significantly improves image clarity and contrast, and outperforms both traditional and deep-learning-based methods across PSNR, SSIM, UCIQE, and UIQM metrics. The framework generates visually natural results while maintaining computational efficiency. Future work will focus on real-time inference optimization, deployment in lightweight underwater robotic systems, and extended validation in diverse marine environments.

Author Contributions

K.Z., Conceptualization, Methodology and Writing—original draft. Y.Z., Resources, Conceptualization and Funding acquisition. D.Y., Conceptualization, Methodology and Resources. X.F., Software, Validation and Supervision. All authors have read and agreed to the published version of the manuscript.

Funding

This study was financially supported by National Key R&D Program of China (No. 2024YFE0108700), Taishan Scholars Program, Key R&D Program Project of Shandong Province (No. 2023CXPT015; 2024KJHZ020), Qingdao Science and Technology Innovation Project (No. 24-1-3-hygg-20-hy), National Natural Science Foundation of Qingdao (No. 24-4-4-zrjj-166-jch) and Science and Education Production Project of Qilu University of Technology (Shandong Academy of Sciences) (No. 2024GH10).

Data Availability Statement

Data will be available upon a suitable request from the corresponding author.

Conflicts of Interest

The authors declare no conflicts of interest.

Correction Statement

This article has been republished with a minor correction to the readability of Table 1 and the main text. This change does not affect the scientific content of the article.

References

Raihan, J.A.; Abas, P.A.; De Silva, L.C. Review of underwater image restoration algorithms. IET Image Process. 2019, 13, 1587–1596. [Google Scholar] [CrossRef]
Li, C.; Guo, C.; Ren, W.; Cong, R.; Hou, J.; Kwong, S.; Tao, D. An Underwater Image Enhancement Benchmark Dataset and Beyond. IEEE Trans. Image Process. 2019, 29, 4376–4389. [Google Scholar] [CrossRef] [PubMed]
Drews-Jr, P.L.; Nascimento, E.; Moraes, F.; Botelho, S.; Campos, M. Transmission estimation in underwater single images. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Sydney, Australia, 2–8 December 2013; pp. 825–830. [Google Scholar]
Land, E.H.; McCann, J.J. Lightness and Retinex Theory. J. Opt. Soc. Am. 1971, 61, 1–11. [Google Scholar] [CrossRef] [PubMed]
Pizer, S.M.; Amburn, E.P.; Austin, J.D.; Cromartie, R.; Geselowitz, A.; Greer, T.; Romeny, B.T.H.; Zimmerman, J.B.; Zuiderveld, K. Adaptive histogram equalization and its variations. Comput. Vis. Graph. Image Process. 1987, 39, 355–368. [Google Scholar] [CrossRef]
Schechner, Y.Y.; Karpel, N. Recovery of underwater visibility and structure by polarization analysis. IEEE J. Ocean. Eng. 2005, 30, 570–587. [Google Scholar] [CrossRef]
Ancuti, C. Color Balance and Fusion for Underwater Image Enhancement. IEEE Trans. Image Process. 2017, 27, 379–393. [Google Scholar] [CrossRef]
Peng, Y.T.; Chen, Y.R.; Chen, Z.; Wang, J.H. Underwater image enhancement based on histogram-equalization approximation using physics-based dichromatic modeling. Sensors 2022, 22, 2168. [Google Scholar] [CrossRef]
Anwar, S.; Li, C.; Porikli, F. Deep Underwater Image Enhancement. arXiv 2018, arXiv:1807.03528. [Google Scholar]
Islam, M.J.; Xia, Y.; Sattar, J. Fast underwater image enhancement for improved visual perception. IEEE Robot. Autom. Lett. 2020, 5, 3227–3234. [Google Scholar] [CrossRef]
Wei, Y.; Zheng, Z.; Jia, X. UHD underwater image enhancement via frequency-spatial domain aware network. In Proceedings of the Asian Conference on Computer Vision, Macao, China, 4–8 December 2022; pp. 299–314. [Google Scholar]
Sun, X.; Liu, L.P.; Dong, J.Y. Underwater image enhancement with encoding-decoding deep CNN networks. In Proceedings of the IEEE SmartWorld, Ubiquitous Intelligence & Computing, Advanced & Trusted Computed, Scalable Computing & Communications, Cloud & Big Data Computing, Internet of People and Smart City Innovation, San Francisco, CA, USA, 4–8 August 2017; IEEE Computer Society Press: Los Alamitos, CA, USA, 2017; pp. 1–6. [Google Scholar]
Wang, Y.; Guo, J.; Gao, H.; Yue, H. UIEC²-Net: CNN-based Underwater Image Enhancement Using Two Color Space. Signal Process. Image Commun. 2021, 96, 116250. [Google Scholar] [CrossRef]
Li, C.; Guo, J.; Guo, C. Emerging from Water: Underwater Image Color Correction Based on Weakly Supervised Color Transfer. IEEE Signal Process. Lett. 2018, 25, 323–327. [Google Scholar] [CrossRef]
Lin, J.C.; Hsu, C.B.; Lee, J.C. Dilated generative adversarial networks for underwater image restoration. J. Mar. Sci. Eng. 2022, 10, 500. [Google Scholar] [CrossRef]
Xiang, Y.; Yang, X.; Ren, Q.; Wang, G.; Gao, J.; Chew, K.H.; Chen, R.P. Underwater polarization imaging recovery based on polarimetric residual dense network. IEEE Photon. J. 2022, 14, 7860206. [Google Scholar] [CrossRef]
Fu, Z.; Lin, H.; Yang, Y.; Chai, S.; Sun, L.; Huang, Y.; Ding, X. Unsupervised underwater image restoration: From a homology perspective. In Proceedings of the 36th AAAI Conference on Artificial Intelligence, Online, 22 February–1 March 2022; AAAI Press: Palo Alto, CA, USA, 2022; pp. 643–651. [Google Scholar]
Qin, N.; Wu, J.; Liu, X. MCRNet: Underwater image enhancement using multi-color space residual network. Biomim. Intell. Robot. 2024, 4, 100169. [Google Scholar] [CrossRef]
Shin, Y.S.; Cho, Y.; Pandey, G.; Kim, A. Estimation of ambient light and transmission map with common convolutional architecture. In Proceedings of the OCEANS 2016 MTS/IEEE Monterey, Monterey, CA, USA, 19–23 November 2016; IEEE: New York, NY, USA, 2016; pp. 1–7. [Google Scholar]
Hao, J.; Yang, H.; Hou, X.; Zhang, Y. Two-stage underwater image restoration algorithm based on physical model and causal intervention. IEEE Signal Process. Lett. 2023, 30, 120–124. [Google Scholar] [CrossRef]
Spandana, C.; Srisurya, I.V.; Priyadharshini, A.R.; Krithika, S.; Nandhini, S.A. Underwater image enhancement and restoration using cycle GAN. In Proceedings of the International Conference on Innovative Computing and Communication, New Delhi, India, 16–17 February 2023; Springer Nature: Singapore, 2023; pp. 99–110. [Google Scholar]
Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2223–2232. [Google Scholar]
Liu, R.; Fan, X.; Zhu, M. Real-world underwater enhancement: Challenges, benchmarks, and solutions under natural light. IEEE Trans. Circuits Syst. Video Technol. 2020, 30, 4861–4875. [Google Scholar] [CrossRef]
Zhang, W.; Zhuang, P.; Sun, H. Underwater image enhancement via minimal color loss and locally adaptive contrast enhancement. IEEE Trans. Image Process. 2022, 31, 3997–4010. [Google Scholar] [CrossRef]
Tolie, H.F.; Ren, J.; Elyan, E. DICAM: Deep inception and channel-wise attention modules for underwater image enhancement. Neurocomputing 2024, 584, 127585. [Google Scholar] [CrossRef]
Jiang, J.; Ye, T.; Bai, J. Five A + Network: You Only Need 9K Parameters for Underwater Image Enhancement. arXiv 2023, arXiv:2305.08824. [Google Scholar]
Fu, Z.; Li, X.; Ma, H.; Yan, P.; Li, Q.; Chen, W. Uncertainty Inspired Underwater Image Enhancement. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; Springer Nature: Cham, Switzerland, 2022; pp. 465–482. [Google Scholar]
Kumar, N.N.; Ramakrishna, S. An Impressive Method to Get Better Peak Signal Noise Ratio (PSNR), Mean Square Error (MSE) Values Using Stationary Wavelet Transform (SWT). Glob. J. Comput. Sci. Technol. Graph. Vis. 2012, 12, 34–40. [Google Scholar]
Hore, A.; Ziou, D. Image quality metrics: PSNR vs. SSIM. In Proceedings of the 2010 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369. [Google Scholar]
Yang, M.; Sowmya, A. An underwater color image quality evaluation metric. IEEE Trans. Image Process. 2015, 24, 6062–6071. [Google Scholar] [CrossRef]
Panetta, K.; Gao, C.; Agaian, S. Human-visual-system-inspired underwater image quality measures. IEEE J. Ocean Eng. 2015, 41, 541–551. [Google Scholar] [CrossRef]

Figure 1. Underwater degradation image.

Figure 2. Workflow architecture of the proposed underwater image enhancement framework.

Figure 3. DCR module workflow architecture.

Figure 4. Architecture of the proposed lightweight residual enhancement network.

Figure 5. Architecture of channel attention module.

Figure 6. Architecture of spatial attention module.

Figure 7. Enhancement results on the UIEB dataset. (a) RAW input; (b) UDCP []; (c) MLLE []; (d) DICAM []; (e) FA+Net []; (f) DeepWater []; (g) PUIE-Net []; (h) proposed method.

Figure 8. Enhancement results on the RUIE dataset. (a) RAW input; (b) UDCP []; (c) MLLE []; (d) DICAM []; (e) FA+Net []; (f) DeepWater []; (g) PUIE-Net []; (h) proposed method.

Table 1. Quantitative comparison of enhancement methods in terms of PSNR, SSIM, UCIQE, and UIQM.

Methods	UIEB				RUIE
Methods	PSNR	SSIM	UCIQE	UIQM	UCIQE	UIQM
UDCP	14.147	0.628	0.223	0.357	0.195	0.328
MLLE	17.756	0.717	0.253	1.340	0.259	1.623
DICAM	18.208	0.755	0.261	1.105	0.251	1.348
FA+Net	19.174	0.836	0.273	1.208	0.246	1.468
DeepWater	17.079	0.680	0.315	0.641	0.317	0.895
PUIE-Net	19.048	0.829	0.829	1.080	0.242	1.559
Ours	20.093	0.805	0.965	1.563	0.492	1.611

Table 2. Performance comparison of underwater image enhancement performance under different module configurations.

Module Name	Adding Modules	PSNR	SSIM	UCIQE	UIQM
Methods 1	-	15.344	0.492	0.523	1.361
Methods 2	+DCR	17.826	0.663	0.701	1.511
Methods 3	+CBAM	16.087	0.618	0.690	1.448
Methods 4	DCR + CBAM	19.236	0.747	0.729	1.587

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Underwater Image Enhancement Using Dynamic Color Correction and Lightweight Attention-Embedded SRResNet

Abstract

1. Introduction

2. Proposed Method

2.1. Dynamic Color Recorrection Module

2.2. Lightweight Residual Enhancement Network

Core Design of Lightweight Residual Block

2.3. Loss Functions and Optimization Strategy

3. Experimental Validation

3.1. Dataset Selection and Experimental Configuration

3.2. Qualitative Comparison

3.3. Quantitative Comparison

3.4. Ablation Studies

4. Conclusions

Author Contributions

Funding

Data Availability Statement

Conflicts of Interest

Correction Statement

References

Article Metrics

Citations

Article Access Statistics