SARM: Scene-Aware Retinex Mamba for Underwater Image Enhancement
Highlights
- SARM integrates Retinex physical priors with state space models (SSMs), providing a new perspective for self-supervised underwater image enhancement.
- Scene-aware adaptation and global modeling with linear complexity yield significant image quality improvements across multiple underwater visual benchmarks.
- The prior-guided mechanism provides an effective paradigm for tackling paired data scarcity and highly heterogeneous degradations in real waters, highlighting the importance of physical laws in deep feature modeling.
- The framework proposes an efficient and highly generalizable enhancement strategy that can serve as a visual preprocessing front-end for marine edge devices, enabling stable performance gains in downstream underwater tasks such as feature matching and edge extraction.
Abstract
1. Introduction
- A prior-guided self-supervised underwater image enhancement framework, SARM, is proposed. By innovatively embedding the Retinex physical prior into the Mamba architecture, this framework effectively balances the trade-off between long-range dependency modeling and computational overhead without the need for real paired training data. Consequently, it offers an efficient enhancement paradigm for underwater platforms with limited computational resources.
- An illumination decoupling mechanism based on multi-color space analysis and adaptive residual modulation is proposed. To address the issue where the direct division operation in traditional Retinex models is prone to causing numerical instability and amplifying high-frequency noise in dark waters, we introduce a residual modulation strategy in the feature domain. By leveraging the complementary features of the RGB, LAB, and HSV color spaces, this mechanism achieves a more stable decoupling of water attenuation from the underlying reflectance, thereby providing a robust illumination prior for the denoising network.
- A Retinex-Mamba global denoising architecture with linear computational complexity is constructed. To tackle the prohibitively high inference latency of existing global models, our method injects photometric priors into the state space model. Utilizing the 2D Selective Scan (SS2D) mechanism and a dual-track parallel encoding strategy, the network achieves global context modeling while maintaining linear computational complexity. This effectively removes underwater color casts and suppresses non-uniform scattering noise.
- An unpaired data-driven Scene-Aware Adapter (SAA) and a dual adaptive routing mechanism are designed. To alleviate the challenges of model generalization in highly heterogeneous waters, we utilize high-quality physical pseudo-labels to provide self-supervised constraints during the training phase. Concurrently, by quantitatively assessing the degradation characteristics of the samples (e.g., color cast, low illumination, and blur), we implement dynamic loss scheduling and feature gating, significantly enhancing the model’s scene adaptability in unknown dynamic marine environments.
- An effective balance between high inference speed and high-quality visual perception is achieved. Extensive experiments on benchmark datasets including UIEB, EUVP, and UCCS demonstrate that, under the unpaired data setting, SARM achieves superior subjective and objective image quality (e.g., CCF 35.76, URanker 2.491). Furthermore, the model achieves an ultra-fast inference speed of 136.52 FPS and substantially improves the performance of downstream underwater vision tasks, such as feature matching, demonstrating significant potential for real-world deployment.
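The numerical argument behind the illumination decoupling contribution can be illustrated with a small sketch. It contrasts direct Retinex division, R = I / L, with a bounded residual form on a single dark-water pixel. The residual form `I + I * gain * tanh(m)` and the names `retinex_divide`, `residual_modulate`, and `gain` are illustrative assumptions; the source does not give the exact formulation used in SARM.

```python
import math

def retinex_divide(pixel, illum, eps=1e-6):
    """Classical Retinex reflectance estimate R = I / L."""
    return pixel / (illum + eps)

def residual_modulate(pixel, modulation, gain=1.0):
    """Hypothetical residual form: I + I * gain * tanh(m).

    With gain = 1 the multiplicative factor stays inside (0, 2),
    so the output can never be amplified by more than 2x.
    """
    return pixel + pixel * gain * math.tanh(modulation)

# Dark-water pixel: a small noise perturbation on a near-zero illumination
clean, noisy = 0.010, 0.012   # pixel intensity without / with noise
illum = 0.005                 # estimated illumination, close to zero

# Output difference caused by the same 0.002 input perturbation
div_gap = abs(retinex_divide(noisy, illum) - retinex_divide(clean, illum))
res_gap = abs(residual_modulate(noisy, 1.5) - residual_modulate(clean, 1.5))

print(f"division amplifies the perturbation to {div_gap:.4f}")
print(f"residual modulation keeps it at {res_gap:.4f}")
```

Under these assumptions the division amplifies the perturbation by roughly 1 / L, while the residual form keeps it within a fixed factor regardless of how small the illumination estimate becomes.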
2. Related Work
2.1. Traditional Underwater Image Enhancement Methods
2.1.1. Physical Model-Based Methods
2.1.2. Non-Physical Model-Based Methods
2.2. Deep Learning-Based Underwater Image Enhancement Methods
3. Methodology
3.1. Overall Architecture
3.2. Illumination Estimator
- (1)
- Multi-Color Space Feature Extraction
- (2)
- Adaptive Color Cast Correction and Prior Mapping
- (3)
- Adaptive Illumination Residual Modulation
3.3. Mamba Denoiser
- (1)
- Dual-Track Parallel Encoding
- (2)
- Physical Prior Injection
- (3)
- Two-Dimensional Selective Scan and Dynamic Gating
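As a rough illustration of how a 2D selective scan reaches global context at linear cost, the sketch below unfolds a small grid along four traversal paths (row-major, column-major, and their reverses, as in VMamba-style SS2D), runs a 1D recurrence over each, and merges the results back onto the grid. The `running_mean` recurrence is only a stand-in for the selective state space update, and all function names are hypothetical rather than taken from the SARM implementation.

```python
def cross_scan(grid):
    """Unfold an H x W grid into four 1D sequences (four traversal paths)."""
    h, w = len(grid), len(grid[0])
    row_major = [grid[r][c] for r in range(h) for c in range(w)]
    col_major = [grid[r][c] for c in range(w) for r in range(h)]
    return [row_major, col_major, row_major[::-1], col_major[::-1]]

def cross_merge(seqs, h, w):
    """Undo each traversal and sum the four sequences back onto the grid."""
    row_major, col_major, row_rev, col_rev = seqs
    row_rev, col_rev = row_rev[::-1], col_rev[::-1]
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            out[r][c] = (row_major[r * w + c] + row_rev[r * w + c]
                         + col_major[c * h + r] + col_rev[c * h + r])
    return out

def running_mean(seq):
    """Stand-in for the selective SSM: a causal recurrence, linear in len(seq)."""
    out, acc = [], 0.0
    for i, x in enumerate(seq, start=1):
        acc += x
        out.append(acc / i)
    return out

grid = [[1.0, 2.0, 3.0],
        [4.0, 5.0, 6.0]]
seqs = cross_scan(grid)
merged_identity = cross_merge(seqs, 2, 3)   # no per-sequence processing
merged_scanned = cross_merge([running_mean(s) for s in seqs], 2, 3)
```

Each of the four passes touches every position once, so the total cost grows linearly with the number of pixels, while the forward and reverse paths together let every output position depend on the whole grid.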
3.4. Scene-Aware Adapter
- (1)
- Physical Degradation Quantization and Index Modeling
- (2)
- Global Anchor Matching and Dynamic Loss Scheduling
- (3)
- Adaptive Feature Gated Fusion in the Inference Stage
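The idea of quantifying degradation (color cast, low illumination, blur) and using it to schedule loss weights can be sketched as follows. The three scoring formulas, the softmax scheduling rule, the `temperature` parameter, and every function name here are assumptions made for illustration; the source does not specify how the Scene-Aware Adapter computes its indices or weights.

```python
import math

def degradation_scores(img):
    """img: H x W list of (r, g, b) tuples in [0, 1]. Returns hypothetical scores in [0, 1]."""
    pixels = [p for row in img for p in row]
    n = len(pixels)
    means = [sum(p[k] for p in pixels) / n for k in range(3)]
    # Color cast: spread between the strongest and weakest channel mean
    cast = max(means) - min(means)
    # Low illumination: one minus the mean luminance
    luma = [[0.299 * r + 0.587 * g + 0.114 * b for r, g, b in row] for row in img]
    dark = 1.0 - sum(sum(row) for row in luma) / n
    # Blur: one minus the mean absolute horizontal gradient (flat images score high)
    grads = [abs(row[c + 1] - row[c]) for row in luma for c in range(len(row) - 1)]
    blur = 1.0 - sum(grads) / len(grads)
    return {"color": cast, "illumination": dark, "sharpness": blur}

def schedule_loss_weights(scores, temperature=0.5):
    """Softmax over the scores: the dominant degradation gets the largest loss weight."""
    exps = {k: math.exp(v / temperature) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}

# A greenish, dark, low-contrast 2 x 3 patch
patch = [[(0.05, 0.30, 0.20), (0.06, 0.31, 0.21), (0.05, 0.29, 0.20)],
         [(0.04, 0.30, 0.19), (0.05, 0.32, 0.20), (0.05, 0.30, 0.21)]]
scores = degradation_scores(patch)
weights = schedule_loss_weights(scores)
print(scores)
print(weights)
```

The resulting weights sum to one and could multiply the corresponding loss terms during training; a lower `temperature` makes the schedule focus more sharply on the dominant degradation.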
3.5. Pseudo-Label Generation and Caching
- (1)
- Multi-Scale Prior Fusion
- (2)
- Memory Hash Caching
3.6. Dynamic Composite Loss
- (1)
- Multi-Dimensional Perceptual Constraints
- (2)
- Structural Clarity Loss
- (3)
- Scene-Adaptive Dynamic Total Loss
4. Experiments
4.1. Experimental Settings
- UCIQE and UIQM: As the most classical quantitative standards in the field of underwater vision, UCIQE utilizes a linear combination of chroma standard deviation ($\sigma_c$), luminance contrast ($con_l$), and saturation mean ($\mu_s$) to quantify chromatic distortion, calculated as $\mathrm{UCIQE} = c_1 \sigma_c + c_2\, con_l + c_3 \mu_s$; meanwhile, UIQM (Underwater Image Quality Measure) performs a weighted summation of three components, namely color richness (UICM), sharpness (UISM), and contrast (UIConM), with its typical formulation being $\mathrm{UIQM} = c_1\,\mathrm{UICM} + c_2\,\mathrm{UISM} + c_3\,\mathrm{UIConM}$, where $c_1$, $c_2$, and $c_3$ denote the weighting coefficients of the respective metric. Both indices comprehensively reflect the fundamental capabilities of the model in color constancy and contrast stretching.
- CCF and FDUM: To further evaluate dehazing and detail fidelity, the Colorfulness–Contrast–Fog (CCF) index is introduced, which penalizes residual color casts by evaluating the mapping relationship between local chromatic variance and fog density; FDUM objectively quantifies the model’s suppression effect on mid-to-low-frequency blur caused by underwater suspended particles by analyzing the high-frequency energy distribution in the frequency domain.
- URanker: Because traditional hand-crafted metrics are prone to bias when assessing complex generative architectures, this paper additionally adopts URanker, a perceptual metric based on deep neural networks. Trained to align with a large-scale human visual preference dataset, URanker reflects human subjective judgments more accurately and robustly.
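Since UCIQE and UIQM are both weighted sums of three components, the final combination step can be written in a few lines. The default coefficients below are the values commonly quoted from the original metric papers; the source does not state which coefficients it uses, so treat them as an assumption. Computing the components themselves (for example the chroma standard deviation or UISM) is outside the scope of this sketch, and the sample inputs are made up.

```python
def uciqe(sigma_c, con_l, mu_s, coeffs=(0.4680, 0.2745, 0.2576)):
    """UCIQE = c1 * sigma_c + c2 * con_l + c3 * mu_s (coefficients assumed)."""
    c1, c2, c3 = coeffs
    return c1 * sigma_c + c2 * con_l + c3 * mu_s

def uiqm(uicm, uism, uiconm, coeffs=(0.0282, 0.2953, 3.5753)):
    """UIQM = c1 * UICM + c2 * UISM + c3 * UIConM (coefficients assumed)."""
    c1, c2, c3 = coeffs
    return c1 * uicm + c2 * uism + c3 * uiconm

# Made-up component values, only to show the call pattern
print(round(uciqe(sigma_c=0.1, con_l=0.8, mu_s=0.7), 5))
print(round(uiqm(uicm=1.0, uism=1.0, uiconm=0.1), 5))
```

With these coefficients the contrast term dominates UIQM, which is one reason the two indices can rank the same set of results differently.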
4.2. Comparison with State-of-the-Art Methods
4.2.1. Quantitative Analysis
4.2.2. Computational Efficiency Analysis
4.2.3. Qualitative Analysis
- Accurate Color Correction and Artifact Suppression: When processing highly heterogeneous color-cast waters, existing methods frequently exhibit insufficient correction or introduce unnatural tones. For example, Phaseformer and NU2Net still retain a certain cyan-green color cast when processing blue-cast images, while Fusion and HFM significantly improve contrast but introduce reddish or purplish color artifacts in the background water body or on rocks (e.g., the lobster background in Figure 5). Reddish or orange tones also appear in the results of DCD and CLIP-UIE for the scenes in the 5th and 6th rows of Figure 4. In contrast, SARM uses multi-color space illumination estimation and the Retinex mechanism to separate out the selective absorption effect of the water body. This enables SARM to suppress color shifts while restoring the grayish-white color of rocks and the white sand background.
- Dehazing and Detail Enhancement: Backward scattering in highly turbid waters drastically reduces image visibility. Conventional methods (such as Phaseformer and HSR) often exhibit incomplete dehazing and an overall dark visual effect in such scenes (e.g., the diver scene in the 2nd row of Figure 3). Meanwhile, models constrained by local receptive fields are prone to losing texture details when processing complex topological structures. Leveraging the global context modeling capability of the State Space Model (Mamba) combined with the dynamic weight adjustment of the Scene-Aware Adapter (SAA), SARM eliminates the global fogging phenomenon. As can be seen from the 4th row of Figure 3, the model maintains smooth spatial transitions while improving the distinguishability of high-frequency textures on rock surfaces.
- Physical Realism and Over-exposure Suppression: In shallow-water highlight or complex light source scenes, some deep models lacking physical constraints are prone to highlight clipping when forcibly stretching local contrast (e.g., the over-exposure phenomenon of CLIP-UIE in certain regions). Due to the introduction of prior-based physical pseudo-labels to provide stable regularization supervision, SARM better follows the distribution laws of natural underwater illumination when brightening dark details (e.g., the dark parts of the lobster), avoiding over-saturation and local whitening. This visual performance also lays the foundation for the subsequent improvement of comprehensive perceptual metrics.
4.3. Ablation Study
4.3.1. Architecture Analysis
4.3.2. Ablation of Core Modules
- (1) Illumination Estimator: After removing this module (w/o Estimator, Row 4), the model’s frequency-domain uniformity metric FDUM decreases from 0.6340 to 0.6219. This decline indicates that, without an explicit illumination decomposition mechanism, the network’s ability to process non-uniform underwater illumination fields (such as local light spots or depth attenuation) is somewhat limited. By extracting illumination priors, the illumination estimation module assists the network in maintaining the global structure of the image to a certain extent, alleviating local over-exposure or the loss of texture in dark regions.
- (2) Scene-Aware Adapter: When this adapter is removed (w/o SAA, Row 5), the color correlation metric CCF decreases from 35.76 in the Full Model to 27.45. This performance degradation indicates that static, fixed loss weights have limited generalization capability when dealing with highly heterogeneous natural water body distributions. Facing underwater environments with different turbidities and color cast tendencies, a single optimization objective struggles to achieve an adaptive balance between dehazing intensity and color correction. By constructing a dynamic mapping between degradation features and loss weights, this adapter enhances the network’s targeted adjustment capability for complex scenes, thereby improving the final color restoration.
- (3) Physical Pseudo-Label Constraints: In the absence of real paired data, the selection of supervision signals has a decisive impact on the generation results. After removing the physical pseudo-labels (w/o Pseudo, Row 6), the model’s CCF metric drops to 23.70, and the perceptual metric URanker also decreases from 2.491 to 2.044. This data comparison demonstrates that relying solely on conventional self-supervised adversarial learning is prone to producing uncontrolled color shifts and local distortions during the feature mapping process. Introducing pseudo-labels generated based on physical fusion mechanisms provides the network with relatively reliable physical regularization constraints. Consequently, while improving local contrast, it better maintains a color distribution that conforms to natural laws.
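The size of each ablation effect is easier to compare as a relative change against the full model. The short script below does that arithmetic for the three module ablations, using the FDUM, CCF, and URanker values reported in the ablation table (Rows 4 to 7); the variable and function names are illustrative.

```python
# Values copied from the ablation table (Rows 4-7)
FULL = {"FDUM": 0.6340, "CCF": 35.7609, "URanker": 2.491}
ABLATIONS = {
    "w/o Estimator": {"FDUM": 0.6219, "CCF": 34.7141, "URanker": 2.499},
    "w/o SAA":       {"FDUM": 0.6303, "CCF": 27.4480, "URanker": 2.514},
    "w/o Pseudo":    {"FDUM": 0.5994, "CCF": 23.7037, "URanker": 2.044},
}

def relative_change(variant, metric):
    """Percent change of an ablated variant relative to the full model."""
    return 100.0 * (ABLATIONS[variant][metric] - FULL[metric]) / FULL[metric]

for name in ABLATIONS:
    deltas = {m: round(relative_change(name, m), 2) for m in FULL}
    print(name, deltas)
```

Read this way, removing the pseudo-labels costs about a third of the CCF score and removing the adapter about a quarter, while the FDUM changes are within a few percent. The table also shows URanker marginally higher for Rows 4 and 5 than for the full model, so the sign of the change is not uniform across metrics.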
4.4. Application Study
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
References
- Hsieh, Y.Z.; Chang, M.C. Underwater image enhancement and attenuation restoration based on depth and backscatter estimation. IEEE Trans. Comput. Imaging 2025, 11, 321–332.
- Liang, Y.; Li, L.; Zhou, Z.; Tian, L.; Xiao, X.; Zhang, H. Underwater image enhancement via adaptive bi-level color-based adjustment. IEEE Trans. Instrum. Meas. 2025, 74, 5018916.
- Zhang, W.; Liu, Q.; Lu, H.; Wang, J.; Liang, J. Underwater image enhancement via wavelet decomposition fusion of advantage contrast. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 7807–7820.
- Liu, S.; Zheng, Y.; Li, J.; Lu, H.; An, D.; Shen, Z.; Wang, Z. Turbid underwater image enhancement with illumination-constrained and structure-preserved retinex model. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 10844–10861.
- Chang, H.H.; Kuan, P.Y. Underwater image enhancement using illuminant intensity compensation with foreground edge map rectification. IEEE J. Ocean. Eng. 2025, 50, 835–850.
- Zhou, J.; Wang, S.; Lin, Z.; Jiang, Q.; Sohel, F. A pixel distribution remapping and multi-prior retinex variational model for underwater image enhancement. IEEE Trans. Multimed. 2024, 26, 7838–7849.
- Kong, D.; Zhang, Y.; Zhao, X.; Wang, Y.; Cai, L. MUFFNet: Lightweight dynamic underwater image enhancement network based on multi-scale frequency. Front. Mar. Sci. 2025, 12, 1541265.
- Wu, Z.; Ji, P.; Chen, K.; Gao, F.; Zhao, H.; Sun, X. MSCT: Multi-Scale Conv-Transformer for Underwater Image Enhancement. IEEE Multimed. 2025, 32, 105–114.
- Lu, L.; Wu, D.; Wang, L.; Zhang, W.; Liu, T. Underwater image enhancement based on transformer, attention and multi-color-space inputs. IEEE Access 2025, 13, 103682–103696.
- Liu, X.; Xu, H.; Ju, Y.; Wang, S.; Liu, C.; Chen, L. DA-GAN: Dual-Attention GAN for Underwater Image Enhancement with Contrast and Color Correction. IEEE Trans. Geosci. Remote Sens. 2026, 64, 4200816.
- Kong, D.; Mao, J.; Zhang, Y.; Zhao, X.; Wang, Y.; Wang, S. Dual-Domain Adaptive Synergy GAN for Enhancing Low-Light Underwater Images. J. Mar. Sci. Eng. 2025, 13, 1092.
- Bi, H.; Chen, L.; Cao, J.; Wang, J.; Sun, J.; Rao, Y.; Dong, J. SeaDiff: Underwater image enhancement with degradation-aware diffusion model. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 12212–12226.
- Ou, Y.; Esmaeilzehi, A.; Ahmad, M.O.; Swamy, M. UADiff: A deep underwater image enhancement network using generative diffusion prior and uncertainty-aware learning. IEEE Trans. Geosci. Remote Sens. 2025, 63, 4208114.
- Cao, J.; Zeng, Z.; Zhang, X.; Zhang, H.; Fan, C.; Jiang, G.; Lin, W. Unveiling the underwater world: CLIP perception model-guided underwater image enhancement. Pattern Recognit. 2025, 162, 111395.
- Liu, S.; Li, K.; Ding, Y.; Qi, Q. Underwater image enhancement by diffusion model with customized clip-classifier. Pattern Recognit. 2025, 112232.
- Zhang, Y.; Yu, X.; Cai, Z. Uwmambanet: Dual-branch underwater image reconstruction based on w-shaped mamba. Mathematics 2025, 13, 2153.
- Fang, Y.; Sun, H.; Li, Y.; Yuan, S.; Zhao, F. Symmetry-Constrained Dual-Path Physics-Guided Mamba Network: Balancing Performance and Efficiency in Underwater Image Enhancement. Symmetry 2025, 17, 1742.
- Pramanick, A.; Roy, S.; Sur, A. D2Mamba: Dual Domain Guided Informed Search in State Space Model for Underwater Image Enhancement. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Tucson, AZ, USA, 6–10 March 2026; pp. 7126–7136.
- Luan, X.; Fan, H.; Wang, Q.; Yang, N.; Liu, S.; Li, X.; Tang, Y. FMambaIR: A hybrid state-space model and frequency domain for image restoration. IEEE Trans. Geosci. Remote Sens. 2025, 63, 4201614.
- Jaffe, J.S. Underwater optical imaging: The past, the present, and the prospects. IEEE J. Ocean. Eng. 2014, 40, 683–700.
- Duntley, S.Q. Light in the sea. J. Opt. Soc. Am. 1963, 53, 214–233.
- Drews, P.; Nascimento, E.; Moraes, F.; Botelho, S.; Campos, M. Transmission estimation in underwater single images. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Sydney, Australia, 1–8 December 2013; pp. 825–830.
- Akkaynak, D.; Treibitz, T. Sea-thru: A method for removing water from underwater images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 1682–1691.
- Ancuti, C.O.; Ancuti, C.; De Vleeschouwer, C.; Bekaert, P. Color balance and fusion for underwater image enhancement. IEEE Trans. Image Process. 2017, 27, 379–393.
- Zhang, W.; Zhou, L.; Zhuang, P.; Li, G.; Pan, X.; Zhao, W.; Li, C. Underwater image enhancement via weighted wavelet visual perception fusion. IEEE Trans. Circuits Syst. Video Technol. 2023, 34, 2469–2483.
- Zhao, G.; Xiao, Y.; Huang, C.; Wang, Z.; Wu, H. Underwater image enhancement via adaptive white-balancing and multi-restoration image fusion. Opt. Rev. 2025, 32, 76–92.
- Li, T.; Rong, S.; Zhao, W.; Chen, L.; Liu, Y.; Zhou, H.; He, B. Underwater image enhancement using adaptive color restoration and dehazing. Opt. Express 2022, 30, 6216–6235.
- Li, C.; Anwar, S.; Hou, J.; Cong, R.; Guo, C.; Ren, W. Underwater image enhancement via medium transmission-guided multi-color space embedding. IEEE Trans. Image Process. 2021, 30, 4985–5000.
- Islam, M.J.; Xia, Y.; Sattar, J. Fast underwater image enhancement for improved visual perception. IEEE Robot. Autom. Lett. 2020, 5, 3227–3234.
- Peng, L.; Zhu, C.; Bian, L. U-shape transformer for underwater image enhancement. IEEE Trans. Image Process. 2023, 32, 3066–3079.
- Wang, Z.; Cun, X.; Bao, J.; Zhou, W.; Liu, J.; Li, H. Uformer: A general u-shaped transformer for image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 19–20 June 2022; pp. 17683–17693.
- Li, T.; Rong, S.; Chen, L.; Zhou, H.; He, B. Underwater motion deblurring based on cascaded attention mechanism. IEEE J. Ocean. Eng. 2022, 49, 262–278.
- Yan, H.; Zhang, Z.; Xu, J.; Wang, T.; An, P.; Wang, A.; Duan, Y. UW-CycleGAN: Model-driven CycleGAN for underwater image restoration. IEEE Trans. Geosci. Remote Sens. 2023, 61, 4207517.
- Zhang, L.; Chen, Y.; Lan, J.; Niu, Y. MSSCE-GAN: Multi-Scale Structural and Color Enhanced Generative Adversarial Network for Unpaired Underwater Image Enhancement. In Proceedings of the 2023 5th International Conference on Frontiers Technology of Information and Computer (ICFTIC), Qingdao, China, 17–19 November 2023; pp. 837–841.
- Guan, M.; Xu, H.; Jiang, G.; Yu, M.; Chen, Y.; Luo, T.; Zhang, X. DiffWater: Underwater image enhancement based on conditional denoising diffusion probabilistic model. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 17, 2319–2335.
- Song, J.; Xu, H.; Jiang, G.; Yu, M.; Chen, Y.; Luo, T.; Song, Y. Frequency domain-based latent diffusion model for underwater image enhancement. Pattern Recognit. 2025, 160, 111198.
- Gu, A.; Goel, K.; Ré, C. Efficiently modeling long sequences with structured state spaces. arXiv 2021, arXiv:2111.00396.
- Liu, Y.; Tian, Y.; Zhao, Y.; Yu, H.; Xie, L.; Wang, Y.; Ye, Q.; Jiao, J.; Liu, Y. Vmamba: Visual state space model. Adv. Neural Inf. Process. Syst. 2024, 37, 103031–103063.
- Hou, X.; Zhang, L. Saliency detection: A spectral residual approach. In Proceedings of the 2007 IEEE Conference on Computer Vision and Pattern Recognition, Minneapolis, MN, USA, 17–22 June 2007; pp. 1–8.
- Sauvola, J.; Pietikäinen, M. Adaptive document image binarization. Pattern Recognit. 2000, 33, 225–236.
- Guo, X.; Li, Y.; Ling, H. LIME: Low-light image enhancement via illumination map estimation. IEEE Trans. Image Process. 2016, 26, 982–993.
- Reinhard, E.; Stark, M.; Shirley, P.; Ferwerda, J. Photographic tone reproduction for digital images. In Seminal Graphics Papers: Pushing the Boundaries, Volume 2; ACM: New York, NY, USA, 2023; pp. 661–670.
- An, S.; Xu, L.; Deng, Z.; Zhang, H. HFM: A hybrid fusion method for underwater image enhancement. Eng. Appl. Artif. Intell. 2024, 127, 107219.
- Guo, C.; Wu, R.; Jin, X.; Han, L.; Zhang, W.; Chai, Z.; Li, C. Underwater ranker: Learn which is better and how to be better. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 702–709.
- Jiang, W.; Tan, Y.; Qiu, Z.; Wang, Z.; Yu, Y.; Jiang, Q. PyUIE: A Coarse-to-Fine Deep Pyramid Network for Underwater Image Enhancement. IEEE Trans. Multimed. 2026, 28, 3054–3067.
- Huang, S.; Wang, K.; Liu, H.; Chen, J.; Li, Y. Contrastive semi-supervised learning for underwater image restoration via reliable bank. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 18145–18155.
- Zhou, J.; Sun, J.; Li, C.; Jiang, Q.; Zhou, M.; Lam, K.M.; Zhang, W.; Fu, X. HCLR-Net: Hybrid contrastive learning regularization with locally randomized perturbation for underwater image enhancement. Int. J. Comput. Vis. 2024, 132, 4132–4156.
- Yu, M.; Shen, L.; Yu, Y.; Zhang, Y.; Le, R. Task-Driven Underwater Image Enhancement via Hierarchical Semantic Refinement. IEEE Trans. Image Process. 2026, 35, 42–56.
- Khan, R.; Negi, A.; Kulkarni, A.; Phutke, S.S.; Vipparthi, S.K.; Murala, S. Phaseformer: Phase-based attention mechanism for underwater image restoration and beyond. In Proceedings of the Winter Conference on Applications of Computer Vision, Tucson, AZ, USA, 26 February–6 March 2025; pp. 9600–9611.
- Fan, G.; Zhou, Y.; Zhou, J.; Ju, Y.; Chen, G.Y.; Li, J.; Kot, A.C. DCD-UIE: Decoupled Chromatic Diffusion Model for Underwater Image Enhancement. IEEE Trans. Image Process. 2026, 35, 449–464.










| Methods | URanker ↑ (UIEB) | URanker ↑ (UCCS) | URanker ↑ (EUVP) | CCF ↑ (UIEB) | CCF ↑ (UCCS) | CCF ↑ (EUVP) | UCIQE ↑ (UIEB) | UCIQE ↑ (UCCS) | UCIQE ↑ (EUVP) | UIQM ↑ (UIEB) | UIQM ↑ (UCCS) | UIQM ↑ (EUVP) | FDUM ↑ (UIEB) | FDUM ↑ (UCCS) | FDUM ↑ (EUVP) |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PyUIE | 2.112 | 2.269 | 2.493 | 29.2996 | 29.4248 | 35.1891 | 0.6023 | 0.5976 | 0.6144 | 1.3508 | 1.3808 | 1.3881 | 0.5671 | 0.4702 | 0.5369 |
| Phaseformer | 1.441 | 0.272 | 1.527 | 18.1431 | 11.5598 | 17.9430 | 0.5773 | 0.4851 | 0.5791 | 1.1403 | 0.7416 | 1.0839 | 0.4462 | 0.1913 | 0.3394 |
| Fusion | 1.377 | 1.494 | 2.285 | 20.0503 | 20.9589 | 28.3084 | 0.5923 | 0.5407 | 0.5887 | 1.3454 | 1.2888 | 1.4045 | 0.5386 | 0.3278 | 0.4862 |
| CLIP-UIE | 2.371 | 2.214 | 2.712 | 34.8420 | 27.1263 | 40.8142 | 0.6227 | 0.5601 | 0.6305 | 1.4609 | 1.3152 | 1.4316 | 0.7246 | 0.4461 | 0.5893 |
| DCD | 1.232 | 1.565 | 2.667 | 29.8477 | 25.7823 | 27.3271 | 0.6201 | 0.5654 | 0.5867 | 1.2435 | 1.2636 | 1.3995 | 0.5972 | 0.4144 | 0.5950 |
| HCLR | 1.689 | 2.532 | 1.080 | 26.3193 | 19.5395 | 36.9285 | 0.6128 | 0.5288 | 0.6176 | 1.3600 | 1.1630 | 1.4106 | 0.6086 | 0.3581 | 0.5547 |
| HFM | 1.192 | 2.016 | 2.807 | 32.2691 | 32.9097 | 27.3271 | 0.6269 | 0.5741 | 0.5867 | 1.4786 | 1.4035 | 1.3995 | 0.6481 | 0.4311 | 0.5950 |
| NU2Net | 1.753 | 1.632 | 2.568 | 20.6933 | 20.1309 | 29.1227 | 0.5984 | 0.5534 | 0.6056 | 1.2588 | 1.2152 | 1.3886 | 0.5095 | 0.3697 | 0.5234 |
| Semi-UIR | 1.684 | 1.389 | 2.689 | 27.8894 | 21.7851 | 40.0093 | 0.6166 | 0.5537 | 0.6174 | 1.3860 | 1.2728 | 1.4713 | 0.6308 | 0.4130 | 0.6247 |
| HSR | 1.906 | 1.844 | 2.438 | 22.8217 | 19.8717 | 27.6385 | 0.5688 | 0.5085 | 0.5879 | 1.3574 | 1.0659 | 1.3480 | 0.5406 | 0.2354 | 0.4604 |
| WWPF | 2.050 | 2.179 | 2.818 | 38.7457 | 34.0076 | 51.6859 | 0.6142 | 0.5853 | 0.6175 | 1.5273 | 1.4581 | 0.4815 | 0.7047 | 0.5080 | 0.6390 |
| D2Mamba | 2.105 | 1.980 | 1.900 | 25.7562 | 22.9233 | 27.1353 | 0.6012 | 0.5565 | 0.5823 | 1.4937 | 1.3074 | 1.4097 | 0.6904 | 0.4027 | 0.5548 |
| FmambaIR | 2.260 | 1.869 | 2.576 | 32.9522 | 22.5091 | 37.1278 | 0.6194 | 0.5402 | 0.6271 | 1.5237 | 1.2084 | 1.4112 | 0.7825 | 0.3802 | 0.5697 |
| SARM (Ours) | 2.491 | 2.804 | 3.100 | 35.7609 | 38.2005 | 52.5657 | 0.6214 | 0.5615 | 0.6423 | 1.4758 | 1.4031 | 1.4491 | 0.6340 | 0.4033 | 0.5956 |

| Metrics | DCD | CLIP-UIE | HCLR | HSR | Semi-UIR | Phaseformer | D2Mamba | FmambaIR | SARM (Ours) |
|---|---|---|---|---|---|---|---|---|---|
| Params (M) | 97.82 | 97.81 | 4.87 | 14.56 | 1.67 | 1.77 | 4.25 | 4.20 | 8.10 |
| FLOPs (G) | 360.13 | 358.67 | 5651.99 | 386.75 | 36.43 | 13.00 | 19.37 | 11.81 | 37.85 |

| ID | Method/Variant | Backbone | Fidelity: FDUM ↑ | Color: UCIQE ↑ | Color: CCF ↑ | Perceptual: UIQM ↑ | Perceptual: URanker ↑ | Efficiency: FPS ↑ |
|---|---|---|---|---|---|---|---|---|
| 1 | Baseline-CNN | U-Net | 0.5996 | 0.5992 | 27.3529 | 1.3984 | 2.214 | 115.42 |
| 2 | Baseline-Trans | Swin-T | 0.6068 | 0.5991 | 28.2320 | 1.4021 | 2.225 | 31.29 |
| 3 | Base-Mamba | SSM | 0.5583 | 0.5752 | 25.6510 | 1.4117 | 2.187 | 142.35 |
| 4 | w/o Estimator | SSM | 0.6219 | 0.6067 | 34.7141 | 1.4746 | 2.499 | 136.80 |
| 5 | w/o SAA | SSM | 0.6303 | 0.6089 | 27.4480 | 1.4711 | 2.514 | 136.25 |
| 6 | w/o Pseudo | SSM | 0.5994 | 0.5930 | 23.7037 | 1.3790 | 2.044 | 135.92 |
| 7 | Ours (Full) | SSM | 0.6340 | 0.6132 | 35.7609 | 1.4758 | 2.491 | 136.52 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Fu, Z.; Yang, S.; Sun, A.; Xiong, R.; Chen, N. SARM: Scene-Aware Retinex Mamba for Underwater Image Enhancement. Remote Sens. 2026, 18, 1652. https://doi.org/10.3390/rs18101652

