MambaRA-GAN: Underwater Image Enhancement via Mamba and Intra-Domain Reconstruction Autoencoder
Abstract
1. Introduction
- A Triple-Path Gated Mamba (TG-Mamba) module is proposed to fully leverage the local feature extraction capabilities of CNNs and the long-range dependency modeling abilities of Mamba.
- An intra-domain reconstruction autoencoder is constructed. By quantifying differences in reconstruction quality within the cycle consistency loss, it generates structured supervision signals that guide model training.
- The proposed method generates images with markedly improved visual quality, particularly in color cast correction and detail enhancement. Across five performance metrics (SSIM, PSNR, UIQM, UCIQE, and AG), it outperforms all compared deep learning methods and attains the highest SSIM, PSNR, and AG of any compared method.
2. Related Work
2.1. Nonphysical Model-Based Methods
2.2. Physical Model-Based Methods
2.3. Data-Driven Methods
2.4. Mamba and Its Variants
3. Materials and Methods
3.1. MambaRA-GAN
3.2. TG-Mamba Generator Network Structure
3.3. Discriminator Network Structure
3.4. Loss Function
- (1) GAN Loss
- (2) Cycle Loss
- (3) Structural Consistency Loss (SC Loss)
- (4) Total loss function
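
As a rough illustration of how the terms listed above might be combined, the following NumPy sketch computes an L1 cycle loss and a weighted total. The least-squares adversarial form, the weights `lambda_cyc` and `lambda_sc`, and all function names are assumptions made for illustration only. The SC term is passed in as a precomputed scalar because its exact definition is not reproduced here.

```python
import numpy as np

def lsgan_generator_loss(d_fake):
    """Least-squares adversarial loss for the generator (assumed form)."""
    return float(np.mean((d_fake - 1.0) ** 2))

def cycle_loss(x, x_rec):
    """Mean absolute (L1) error between an image and its cycle reconstruction."""
    return float(np.mean(np.abs(x - x_rec)))

def total_loss(gan, cyc, sc, lambda_cyc=10.0, lambda_sc=1.0):
    """Weighted sum of the three terms; both weights are hypothetical."""
    return gan + lambda_cyc * cyc + lambda_sc * sc

# Toy inputs standing in for an image, its reconstruction,
# and a patch-wise discriminator output.
x = np.zeros((4, 4, 3))
x_rec = np.full((4, 4, 3), 0.1)
d_fake = np.full((2, 2), 0.5)

loss = total_loss(lsgan_generator_loss(d_fake), cycle_loss(x, x_rec), sc=0.2)
```

In a real training loop the same combination would be applied to framework tensors so that gradients can flow through each term.
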
4. Experiments
4.1. Dataset Description and Experimental Setup
- (1) UIEB: 890 paired samples (800 training/90 testing);
- (2) EUVP: 100 paired images randomly selected from the EUVP-515 dataset were used as the test set;
- (3) RUIE: 90 unreferenced images with significant blue–green color bias served as the test set.
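
The splits above could be reproduced along the following lines. This is a minimal sketch: the integer IDs stand in for image pairs, the seed and the helper name `split_pairs` are hypothetical, shuffling UIEB before splitting is an assumption (only the 800/90 counts are stated), and it is assumed that "515" in EUVP-515 is the number of available pairs.

```python
import random

def split_pairs(pair_ids, n_train, seed=0):
    """Shuffle IDs reproducibly and split them into train and test lists."""
    rng = random.Random(seed)
    ids = list(pair_ids)
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]

# UIEB: 890 paired samples -> 800 for training, 90 for testing.
uieb_train, uieb_test = split_pairs(range(890), n_train=800)

# EUVP: 100 pairs drawn at random (without replacement) from 515 pairs.
euvp_test = random.Random(0).sample(range(515), k=100)
```

Fixing the seed keeps the test sets identical across runs, which matters when the compared methods are evaluated separately.
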
4.2. Evaluation Indicators
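
Of the five indicators reported in the result tables (SSIM, PSNR, UIQM, UCIQE, and AG), PSNR and AG are simple enough to sketch directly. The code below uses their commonly cited definitions, which may differ in detail from the implementation used for the reported numbers. The function names and the 8-bit `max_value` default are assumptions.

```python
import numpy as np

def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two same-sized images."""
    ref = reference.astype(np.float64)
    enh = enhanced.astype(np.float64)
    mse = np.mean((ref - enh) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))

def average_gradient(gray):
    """Average gradient (AG) of a 2-D grayscale image.

    Uses forward differences and the common form
    mean(sqrt((dx^2 + dy^2) / 2)).
    """
    g = gray.astype(np.float64)
    dx = g[:-1, 1:] - g[:-1, :-1]   # horizontal differences
    dy = g[1:, :-1] - g[:-1, :-1]   # vertical differences
    return float(np.mean(np.sqrt((dx ** 2 + dy ** 2) / 2.0)))
```

PSNR needs a reference image, so it applies only to paired test sets. AG is computed from the enhanced image alone and can be reported for unreferenced images as well.
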
4.3. Visual Comparison with Other Methods
- (1) Color cast corrections
- (2) Detail richness
4.4. Objective Comparison with Other Methods
4.5. Ablation Experiment
4.6. Real-Time Analysis and Discussion
4.7. Application
4.8. Failure Cases
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Method | SSIM ↑ | PSNR ↑ | UIQM ↑ | UCIQE ↑ | AG ↑ |
---|---|---|---|---|---|
CLAHE | 0.7190 | 17.3471 | 1.0293 | 0.4429 | 10.5692 |
UCM | 0.7177 | 16.6368 | 1.0138 | 0.4927 | 10.9695 |
UDCP | 0.6439 | 12.1219 | 1.3371 | 0.5230 | 9.8035 |
UWCNN | 0.7066 | 19.0402 | 0.5164 | 0.3707 | 6.7591 |
CycleGAN | 0.7554 | 20.4743 | 0.7244 | 0.4504 | 11.1365 |
SSIM-CycleGAN | 0.7717 | 21.8249 | 0.8319 | 0.4618 | 11.3564 |
SESS-CycleGAN | 0.7593 | 21.5742 | 0.8330 | 0.4608 | 11.3594 |
MuLA-GAN | 0.7759 | 23.6261 | 0.8325 | 0.4542 | 10.0260 |
DiffWater | 0.7839 | 22.3570 | 0.7863 | 0.4387 | 12.1463 |
PUGAN | 0.7714 | 24.3907 | 0.7921 | 0.4628 | 9.2452 |
ours | 0.8012 | 25.1418 | 0.9256 | 0.4893 | 13.2751 |
Method | SSIM ↑ | PSNR ↑ | UIQM ↑ | UCIQE ↑ | AG ↑ |
---|---|---|---|---|---|
CLAHE | 0.7295 | 17.4002 | 1.0586 | 0.4597 | 8.3591 |
UCM | 0.7573 | 18.6918 | 0.9554 | 0.4745 | 8.6528 |
UDCP | 0.5799 | 14.4813 | 1.1924 | 0.5204 | 7.3796 |
UWCNN | 0.7437 | 18.7315 | 0.5329 | 0.3958 | 6.0778 |
CycleGAN | 0.7635 | 19.7940 | 0.7643 | 0.4457 | 9.0708 |
SSIM-CycleGAN | 0.7771 | 20.7512 | 0.7989 | 0.4482 | 9.5739 |
SESS-CycleGAN | 0.7747 | 20.7194 | 0.7775 | 0.4454 | 9.4611 |
MuLA-GAN | 0.7730 | 19.2041 | 0.7819 | 0.4571 | 9.9494 |
DiffWater | 0.7923 | 20.7983 | 0.7659 | 0.4126 | 10.8531 |
PUGAN | 0.7647 | 23.6889 | 0.8178 | 0.4495 | 9.0193 |
ours | 0.8128 | 24.7524 | 0.8763 | 0.4694 | 12.3856 |
Method | UIQM ↑ | UCIQE ↑ | AG ↑ |
---|---|---|---|
CLAHE | 0.4619 | 0.3409 | 7.8120 |
UCM | 0.7004 | 0.3932 | 8.6002 |
UDCP | 0.9736 | 0.4227 | 6.1208 |
UWCNN | 0.5164 | 0.3707 | 6.7591 |
CycleGAN | 0.5765 | 0.3618 | 7.8345 |
SSIM-CycleGAN | 0.6150 | 0.3638 | 8.2624 |
SESS-CycleGAN | 0.6209 | 0.3627 | 8.4896 |
MuLA-GAN | 0.6394 | 0.3688 | 8.3231 |
DiffWater | 0.6125 | 0.3492 | 8.9647 |
PUGAN | 0.6288 | 0.3715 | 8.7618 |
ours | 0.6819 | 0.3914 | 9.8128 |
Experiments | Mamba | SC | SSIM ↑ | PSNR ↑ | UIQM ↑ | UCIQE ↑ | AG ↑ |
---|---|---|---|---|---|---|---|
T1 | — | — | 0.7554 | 20.4743 | 0.7244 | 0.4504 | 11.1365 |
T2 | √ | — | 0.7831 | 22.7584 | 0.8748 | 0.4729 | 12.1372 |
T3 | — | √ | 0.7915 | 23.6293 | 0.8561 | 0.4688 | 12.6835 |
T4 | √ | √ | 0.8012 | 25.1418 | 0.9256 | 0.4893 | 13.2751 |
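
To make the contribution of each component in the ablation table easier to read, the short script below subtracts the baseline row T1 (neither component) from T2 (Mamba only), T3 (SC only), and T4 (both). The values are copied from the table; the helper name `gains_over_baseline` is hypothetical.

```python
# Ablation results copied from the table above.
ablation = {
    "T1": {"SSIM": 0.7554, "PSNR": 20.4743, "UIQM": 0.7244, "UCIQE": 0.4504, "AG": 11.1365},
    "T2": {"SSIM": 0.7831, "PSNR": 22.7584, "UIQM": 0.8748, "UCIQE": 0.4729, "AG": 12.1372},
    "T3": {"SSIM": 0.7915, "PSNR": 23.6293, "UIQM": 0.8561, "UCIQE": 0.4688, "AG": 12.6835},
    "T4": {"SSIM": 0.8012, "PSNR": 25.1418, "UIQM": 0.9256, "UCIQE": 0.4893, "AG": 13.2751},
}

def gains_over_baseline(results, baseline="T1"):
    """Return per-metric differences of every other row against the baseline."""
    base = results[baseline]
    return {
        name: {metric: round(value - base[metric], 4) for metric, value in row.items()}
        for name, row in results.items()
        if name != baseline
    }

gains = gains_over_baseline(ablation)
for name, row in gains.items():
    print(name, row)
```

For PSNR, the gains over T1 are 2.2841 dB for T2, 3.1550 dB for T3, and 4.6675 dB for T4. The combined gain is smaller than the sum of the two individual gains (5.4391 dB).
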
Model/Method | Params (M) ↓ | FLOPs (G) ↓ | FPS (Hz) ↑ |
---|---|---|---|
CLAHE | — | — | 154.2 |
UCM | — | — | 128.4 |
UDCP | — | — | 142.3 |
UWCNN | 1.1 | 3.08 | 123.7 |
CycleGAN | 11.4 | 58.2 | 99.2 |
SSIM-CycleGAN | 13.6 | 63.3 | 89.4 |
SESS-CycleGAN | 13.1 | 62.7 | 90.9 |
MuLA-GAN | 17.3 | 30.2 | 117.8 |
DiffWater | 155.2 | 155.3 | 11.8 |
PUGAN | 95.7 | 72.5 | 62.5 |
ours | 12.1 | 60.5 | 96.5 |
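
The FPS column can be converted into per-frame latency, which is often easier to compare against a real-time budget. The sketch below does this for a few rows of the table above. It assumes FPS was measured as single-image throughput, so latency is taken as its reciprocal; the function name `latency_ms` is hypothetical.

```python
# FPS values copied from the table above (subset of methods).
fps = {
    "CycleGAN": 99.2,
    "MuLA-GAN": 117.8,
    "DiffWater": 11.8,
    "PUGAN": 62.5,
    "ours": 96.5,
}

def latency_ms(frames_per_second):
    """Per-frame latency in milliseconds, assuming latency = 1 / FPS."""
    return 1000.0 / frames_per_second

latencies = {name: round(latency_ms(value), 2) for name, value in fps.items()}
for name, ms in latencies.items():
    print(f"{name}: {ms} ms/frame")
```

Under this assumption the proposed model needs about 10.4 ms per frame, compared with 16 ms for PUGAN and about 84.7 ms for DiffWater.
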
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Wu, J.; Zhang, G.; Fan, Y. MambaRA-GAN: Underwater Image Enhancement via Mamba and Intra-Domain Reconstruction Autoencoder. J. Mar. Sci. Eng. 2025, 13, 1745. https://doi.org/10.3390/jmse13091745