Hierarchical Scale-Adaptive Diffusion Priors for Efficient Remote Sensing Dehazing
Highlights
- A Hierarchical Diffusion Prior Representation is developed to decompose global diffusion latents into multi-scale embeddings, enabling fine-grained and scale-aware restoration.
- A Scale-Adaptive Prior Injection mechanism is introduced to dynamically modulate prior contributions across feature levels, improving feature utilization and robustness.
- The proposed method improves dehazing performance under heavy and spatially variant haze, achieving the highest PSNR among the compared methods on all five evaluated subsets (Sate1K Thin, Moderate, and Thick; Rice1; Rice2) together with improved visual quality.
- It provides an efficient and robust solution for remote sensing image restoration, enhancing the reliability of downstream Earth observation applications.
Abstract
1. Introduction
1.1. Background and Challenge
1.2. Existing Methods and Limitations
1.3. The Rise of Diffusion Models and the Gap
1.4. Proposed Solution
- Hierarchical Diffusion Prior Representation: The global diffusion latent is decomposed into multi-scale representations. This strategy aligns the generative priors with the hierarchical feature maps of the restoration network, enabling fine-grained guidance without increasing latent dimensionality.
- Scale-Adaptive Injection Mechanism: A learnable modulation module is introduced to adaptively re-weight the influence of diffusion priors across different feature scales. This mechanism enables the network to autonomously optimize the utilization of priors, enhancing robustness against non-uniform haze.
- Highly Competitive Performance on Remote Sensing Benchmarks: Extensive experiments demonstrate that the proposed method generally outperforms the DiffIR baseline and the other compared methods. It attains the highest PSNR on all five evaluated subsets, with gains of 0.26 to 0.93 dB over DiffIR, while SSIM and LPIPS are best or competitive in most settings, indicating reliable behavior in challenging scenarios with heavy and spatially variant haze.
2. Related Work
2.1. Remote Sensing Image Dehazing
2.2. Diffusion Models for Image Restoration
2.3. Prior Injection and Multi-Scale Feature Modeling
3. Methodology
3.1. Problem Formulation
3.2. Preliminaries: DiffIR Framework
- Compact Latent Representation: Unlike standard diffusion methods that operate in the high-dimensional pixel space (or in a sizeable latent space, as in Latent Diffusion), DiffIR first learns a highly compact latent space that encapsulates the global structural and semantic information of ground-truth (GT) images. Let $I_{hazy}$ denote the hazy image, and $I_{gt}$ denote the clean GT image. An encoder $E$ maps $I_{gt}$ and $I_{hazy}$ to a low-dimensional latent vector $Z \in \mathbb{R}^{C}$: $Z = E(I_{gt}, I_{hazy})$. This latent serves as the target for the diffusion model. Since the dimension $C$ is small, the diffusion process operates on far fewer elements than pixel-space diffusion, which makes it efficient.
- Conditional Diffusion for Prior Generation: DiffIR trains a conditional diffusion model to estimate this latent code from the input hazy image $I_{hazy}$. Forward Process: A Markov chain gradually adds Gaussian noise to $Z$ over $T$ steps: $q(Z_t \mid Z_{t-1}) = \mathcal{N}(Z_t; \sqrt{1-\beta_t}\, Z_{t-1}, \beta_t \mathbf{I})$, where $\beta_t$ represents the predefined variance schedule and $\mathbf{I}$ denotes the identity matrix. Reverse Process: The diffusion model learns to denoise the latent variable, conditioned on the degradation features extracted from the hazy image $I_{hazy}$. The training objective is to minimize the noise prediction error: $\mathcal{L}_{diff} = \mathbb{E}_{Z, t, \epsilon}\left[\lVert \epsilon - \epsilon_\theta(Z_t, t, D) \rVert_2^2\right]$, where $D$ represents the condition features extracted from $I_{hazy}$. During inference, the model samples random noise $Z_T \sim \mathcal{N}(0, \mathbf{I})$ and iteratively denoises it to obtain the estimated latent code $\hat{Z}$. This is termed the Image Prior Representation (IPR).
- IPR-Guided Restoration: The estimated IPR ($\hat{Z}$) contains rich global priors but lacks high-frequency spatial details due to its compactness. Therefore, it is injected into a deterministic restoration network, DIRformer, which performs the final pixel-wise reconstruction: $I_{rec} = \mathrm{DIRformer}(I_{hazy}, \hat{Z})$. While DiffIR achieves efficiency, it treats the IPR as a holistic, scale-agnostic vector. The same vector uniformly guides both the shallow layers (processing fine textures) and the deep layers (processing semantic shapes) of the DIRformer. As discussed in Section 1, this design is suboptimal for remote sensing dehazing, where degradation is intrinsically hierarchical. This limitation motivates our proposed HS-DiffIR framework.
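The forward noising of a compact latent described above can be illustrated with a minimal sketch. It is not the authors' implementation: the linear schedule, its endpoint values, the step count, and the latent size are all assumptions chosen for illustration, and the closed-form marginal used in `forward_closed_form` is the standard DDPM identity rather than something stated in this section.

```python
import math
import random

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=2e-2):
    """Linearly spaced variance schedule; the endpoint values are assumptions."""
    if num_steps == 1:
        return [beta_start]
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def forward_step(z_prev, beta_t, rng):
    """One Markov transition: z_t ~ N(sqrt(1 - beta_t) * z_prev, beta_t * I)."""
    return [math.sqrt(1.0 - beta_t) * z + math.sqrt(beta_t) * rng.gauss(0.0, 1.0)
            for z in z_prev]

def alpha_bar(betas, t):
    """Cumulative product of (1 - beta_s) for s = 0..t (steps are 0-indexed)."""
    prod = 1.0
    for beta in betas[: t + 1]:
        prod *= 1.0 - beta
    return prod

def forward_closed_form(z0, t, betas, rng):
    """Sample z_t directly from z0; also return the noise used as the training target."""
    a_bar = alpha_bar(betas, t)
    eps = [rng.gauss(0.0, 1.0) for _ in z0]
    z_t = [math.sqrt(a_bar) * z + math.sqrt(1.0 - a_bar) * e for z, e in zip(z0, eps)]
    return z_t, eps

rng = random.Random(0)
betas = linear_beta_schedule(num_steps=4)      # T = 4 is a hypothetical step count
z0 = [rng.gauss(0.0, 1.0) for _ in range(8)]   # hypothetical compact latent, C = 8
z_t, eps = forward_closed_form(z0, t=3, betas=betas, rng=rng)
```

Because the latent is a short vector rather than an image, each diffusion step touches only a handful of values, which is the source of the efficiency argument made for DiffIR.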
3.3. Hierarchical Image Prior Decomposition (H-IPR)
3.4. Scale-Adaptive Prior Injection (S-API)
- Learnable Injection Strength: For each scale k, we define a learnable gating parameter $\alpha_k$. Given the scale-k prior $Z_k$, we obtain the modulated prior as $\tilde{Z}_k = \sigma(\alpha_k) \cdot Z_k$, where $\sigma(\cdot)$ is the Sigmoid function, which ensures the modulation strength is bounded in $(0, 1)$. Crucially, we initialize $\alpha_k$ to a small negative value, so that the initial gate $\sigma(\alpha_k)$ lies below 0.5. The network therefore starts with weak prior influence and gradually learns to incorporate the generative guidance where necessary.
- Injection to DIRformer: The DIRformer module (Dynamic IR Transformer) from the original DiffIR framework is leveraged. Let $F_i$ denote the input feature map of the i-th stage. In the proposed framework, instead of using the shared global prior $Z$, the stage-specific modulated prior $\tilde{Z}_k$ is employed to modulate the image features $F_i$. The injection process at stage i is formulated as $F_i' = W_k^{1} \tilde{Z}_k \odot \mathrm{Norm}(F_i) + W_k^{2} \tilde{Z}_k$, where $W_k^{1}$ and $W_k^{2}$ represent linear projection layers tailored to scale k for generating the scale and shift parameters, $\odot$ denotes element-wise multiplication, and $\mathrm{Norm}(\cdot)$ denotes layer normalization. Replacing the global input $Z$ with the hierarchical $\tilde{Z}_k$ ensures that the guidance received by the i-th stage is structurally aligned (via the scale-specific projection) and intensity-calibrated (via the gate $\sigma(\alpha_k)$).
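The two steps above, gating the prior with a Sigmoid and then using it to produce scale and shift parameters, can be sketched on a single feature vector. This is a simplified illustration under stated assumptions: the helper names (`inject_prior`, `w_scale`, `w_shift`), the tiny dimensions, and the gate initialization of -4.0 are hypothetical, and a real DIRformer stage would apply the same idea to full feature maps inside its attention and feed-forward blocks.

```python
import math

def sigmoid(x):
    """Logistic function; maps any real gate logit into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def linear(weights, vec):
    """Matrix-vector product with the weights given as a list of rows."""
    return [sum(w * v for w, v in zip(row, vec)) for row in weights]

def layer_norm(feat, eps=1e-6):
    """Normalize one feature vector to zero mean and unit variance."""
    mean = sum(feat) / len(feat)
    var = sum((f - mean) ** 2 for f in feat) / len(feat)
    return [(f - mean) / math.sqrt(var + eps) for f in feat]

def inject_prior(feat, prior_k, gate_logit, w_scale, w_shift):
    """Gate the scale-k prior, then apply it to the features as scale and shift."""
    gate = sigmoid(gate_logit)
    modulated = [gate * p for p in prior_k]   # bounded modulation strength
    scale = linear(w_scale, modulated)        # per-channel scale parameters
    shift = linear(w_shift, modulated)        # per-channel shift parameters
    normed = layer_norm(feat)
    return [s * n + b for s, n, b in zip(scale, normed, shift)]

# Hypothetical toy sizes: 3 feature channels, 2-dimensional scale-k prior.
feat = [1.0, 2.0, 3.0]
prior_k = [0.5, -1.0]
w_scale = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_shift = [[0.1, 0.0], [0.0, 0.1], [0.1, 0.1]]

out_init = inject_prior(feat, prior_k, -4.0, w_scale, w_shift)  # gate near 0.018
out_open = inject_prior(feat, prior_k, 4.0, w_scale, w_shift)   # gate near 0.982
```

In this sketch the output is linear in the gate value, so a strongly negative initialization suppresses the prior-driven term almost entirely, while a learned positive logit lets it through at nearly full strength.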
3.5. Training Strategy
- Stage 1: Pretraining for Hierarchical Prior-Guided Restoration: In the first stage, training aims to optimize the restoration network (DIRformer) for effective prior utilization. An optimal latent code is extracted from ground-truth (GT) images to serve as supervision, facilitating the learning of an accurate restoration mapping.
  - Process: A compact prior extraction network, denoted by $\mathrm{CPEN}_{S1}$, is employed. This network takes the concatenation of the ground-truth image $I_{gt}$ and the hazy image $I_{hazy}$ as input to produce a reference global latent $Z$: $Z = \mathrm{CPEN}_{S1}(\mathrm{Concat}(I_{gt}, I_{hazy}))$. Crucially, in this stage, the proposed Hierarchical Decomposition and Scale-Adaptive Injection modules are integrated into the framework.
  - Optimization: The parameters of $\mathrm{CPEN}_{S1}$, the DIRformer, and the proposed modules (the hierarchical projectors and the gating scalars $\alpha_k$) are jointly optimized. The loss function minimizes the distance between the restored image $I_{rec}$ and the ground truth $I_{gt}$.
  - Note: By the end of Stage 1, the DIRformer has learned to effectively utilize hierarchically decomposed priors, and $\mathrm{CPEN}_{S1}$ has learned to encode the essential image manifold into a compact space.
- Stage 2: Training the Diffusion Model for Prior Estimation: In the second stage, the diffusion model is trained to estimate the target latent solely from degraded inputs, as $I_{gt}$ is unavailable during inference.
  - Configuration: With the parameters of $\mathrm{CPEN}_{S1}$ from Stage 1 frozen, a second extraction network, $\mathrm{CPEN}_{S2}$ (receiving solely $I_{hazy}$), and a denoising network $\epsilon_\theta$ are employed.
  - Forward Process: The frozen $\mathrm{CPEN}_{S1}$ is used to extract the target latent $Z$ from the training pair. $Z$ is then diffused into noise via the standard Gaussian transition $q(Z_t \mid Z_{t-1})$.
  - Reverse Process (Training): The denoising network $\epsilon_\theta$ is trained to predict the noise added to the latent. It is conditioned on a vector $D$ extracted from the hazy image via $\mathrm{CPEN}_{S2}$: $D = \mathrm{CPEN}_{S2}(I_{hazy})$. The optimization objective follows the standard diffusion loss: $\mathcal{L}_{diff} = \mathbb{E}_{Z, t, \epsilon}\left[\lVert \epsilon - \epsilon_\theta(Z_t, t, D) \rVert_2^2\right]$, where $t$ is the time step and $Z_t$ is the noisy latent.
  - Inference: During the inference phase, the trained diffusion model ($\mathrm{CPEN}_{S2}$ and $\epsilon_\theta$) is utilized to generate a predicted latent $\hat{Z}$. This $\hat{Z}$ is then passed through the frozen hierarchical projectors and injection modules (learned in Stage 1) to guide the DIRformer.
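The inference path of Stage 2, sampling noise and iteratively denoising it into a predicted latent, can be sketched as follows. This is a generic DDPM ancestral sampler written under assumptions, not the authors' code: `toy_denoiser` is a hypothetical stand-in for the trained denoising network, the `condition` argument stands in for the vector D, and the schedule values and latent size are illustrative. Implementations may also omit the stochastic term in the reverse step.

```python
import math
import random

def make_schedule(betas):
    """Return per-step alphas and their cumulative products."""
    alphas = [1.0 - b for b in betas]
    alpha_bars, prod = [], 1.0
    for a in alphas:
        prod *= a
        alpha_bars.append(prod)
    return alphas, alpha_bars

def reverse_step(z_t, t, eps_pred, betas, alphas, alpha_bars, rng):
    """One ancestral step from z_t to z_{t-1}; t is 0-indexed."""
    coef = betas[t] / math.sqrt(1.0 - alpha_bars[t])
    mean = [(z - coef * e) / math.sqrt(alphas[t]) for z, e in zip(z_t, eps_pred)]
    if t == 0:
        return mean  # no noise is added at the final step
    sigma = math.sqrt(betas[t])
    return [m + sigma * rng.gauss(0.0, 1.0) for m in mean]

def sample_latent(denoiser, condition, dim, betas, rng):
    """Start from Gaussian noise and denoise step by step into a latent estimate."""
    alphas, alpha_bars = make_schedule(betas)
    z = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    for t in reversed(range(len(betas))):
        eps_pred = denoiser(z, t, condition)
        z = reverse_step(z, t, eps_pred, betas, alphas, alpha_bars, rng)
    return z

def toy_denoiser(z_t, t, condition):
    """Hypothetical placeholder for the trained network: always predicts zero noise."""
    return [0.0 for _ in z_t]

betas = [1e-4, 1e-2, 2e-2]   # illustrative 3-step schedule
z_hat = sample_latent(toy_denoiser, condition=None, dim=8, betas=betas,
                      rng=random.Random(0))
```

In the framework described above, the resulting latent would then be decomposed by the frozen hierarchical projectors and gated per scale before guiding the restoration network.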
4. Experiments
4.1. Datasets and Implementation Details
4.2. Comparison with Existing Methods
- Qualitative Comparison. Figure 2 presents a visual comparison on a Sate1K Thick remote sensing scene characterized by a highly non-uniform haze distribution: the upper-left region is covered by dense fog, while the lower-right region remains relatively clear. This scenario poses a significant challenge for scale-agnostic methods.
- (1) Failure of Traditional and CNN Methods
- (2) Limitations of Transformer and Baseline
- (3) Superiority of HS-DiffIR
4.3. Ablation Study
4.4. Analysis of Hierarchical Disentanglement
4.5. Efficiency Analysis
5. Discussion
5.1. Mechanism of Hierarchical Disentanglement
5.2. Interpreting Scale-Adaptive Calibration
5.3. Limitations and Future Prospects
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest







| Method | Sate1K Thin PSNR | Sate1K Thin SSIM | Sate1K Thin LPIPS | Sate1K Moderate PSNR | Sate1K Moderate SSIM | Sate1K Moderate LPIPS | Sate1K Thick PSNR | Sate1K Thick SSIM | Sate1K Thick LPIPS | Rice1 PSNR | Rice1 SSIM | Rice1 LPIPS | Rice2 PSNR | Rice2 SSIM | Rice2 LPIPS |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| DCP | 17.6711 | 0.8674 | 0.1255 | 18.3316 | 0.9000 | 0.1284 | 9.3920 | 0.5715 | 0.4287 | 18.2124 | 0.8183 | 0.1845 | 16.6889 | 0.5762 | 0.5169 |
| AOD-Net | 12.9109 | 0.6882 | 0.2073 | 13.1130 | 0.6656 | 0.3116 | 13.2382 | 0.6792 | 0.2418 | 13.6038 | 0.4314 | 0.4211 | 12.2970 | 0.2497 | 0.5578 |
| GridDehazeNet | 22.7932 | 0.8983 | 0.0700 | 25.0822 | 0.9325 | 0.0651 | 20.3605 | 0.8267 | 0.1547 | 30.4821 | 0.9402 | 0.0516 | 32.3731 | 0.8697 | 0.1839 |
| AIDNet | 23.1221 | 0.9052 | 0.0603 | 25.0894 | 0.9124 | 0.0675 | 20.5650 | 0.8325 | 0.1281 | 29.9344 | 0.9402 | 0.0485 | - | - | - |
| Uformer | 24.7021 | 0.9193 | 0.0696 | 25.9305 | 0.9431 | 0.0634 | 22.3350 | 0.8541 | 0.1676 | 30.6672 | 0.9383 | 0.0624 | 33.6961 | 0.8759 | 0.2123 |
| DehazeFormer | 24.2275 | 0.9149 | 0.0591 | 25.7707 | 0.9418 | 0.0708 | 21.5320 | 0.8414 | 0.1605 | 31.6247 | 0.9370 | 0.0612 | 32.8495 | 0.8602 | 0.2234 |
| DiffIR | 24.9692 | 0.9259 | 0.0607 | 26.6415 | 0.9473 | 0.0638 | 22.8874 | 0.8784 | 0.1477 | 31.1950 | 0.9461 | 0.0702 | 34.1305 | 0.8823 | 0.1980 |
| HS-DiffIR (Ours) | 25.5510 | 0.9307 | 0.0548 | 27.2444 | 0.9458 | 0.0625 | 23.2747 | 0.8837 | 0.1372 | 32.1260 | 0.9481 | 0.0691 | 34.3883 | 0.8824 | 0.1939 |

| Model Variant | H-IPR | S-API | PSNR | SSIM |
|---|---|---|---|---|
| (a) Baseline (DiffIR) | ✗ | ✗ | 22.8874 | 0.8784 |
| (b) H-IPR only | ✓ | ✗ | 23.1091 | 0.8789 |
| (c) S-API only | ✗ | ✓ | 22.9922 | 0.8830 |
| (d) H-IPR + S-API | ✓ | ✓ | 23.2747 | 0.8837 |

| Method | Parameters (M) | FLOPs (G) | Inference Time (ms) |
|---|---|---|---|
| DiffIR (Baseline) | 26.91 | 451.64 | 248.2 |
| HS-DiffIR (Ours) | 27.11 | 451.64 | 249.8 |
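
The relative cost of the added modules can be read off the efficiency table with a few lines of arithmetic. The sketch below only recomputes percentages from the reported values; the helper name `relative_overhead` is introduced here for illustration.

```python
def relative_overhead(baseline, proposed):
    """Percentage increase of the proposed value over the baseline."""
    return 100.0 * (proposed - baseline) / baseline

# Values copied from the efficiency table (DiffIR baseline vs. HS-DiffIR).
param_overhead = relative_overhead(26.91, 27.11)    # parameters, in millions
flops_overhead = relative_overhead(451.64, 451.64)  # FLOPs, in G
time_overhead = relative_overhead(248.2, 249.8)     # inference time, in ms

print(f"parameters: +{param_overhead:.2f}%")   # parameters: +0.74%
print(f"FLOPs: +{flops_overhead:.2f}%")        # FLOPs: +0.00%
print(f"inference time: +{time_overhead:.2f}%")  # inference time: +0.64%
```

Under these reported numbers, the hierarchical projectors and gates add about 0.20 M parameters and 1.6 ms per image, each an increase of well under 1% over the baseline.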
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Ju, W.; Liang, Z.; Chen, H.; Shen, J. Hierarchical Scale-Adaptive Diffusion Priors for Efficient Remote Sensing Dehazing. Remote Sens. 2026, 18, 1907. https://doi.org/10.3390/rs18121907

