Multi-Scale Image Defogging Network Based on Cauchy Inverse Cumulative Function Hybrid Distribution Deformation Convolution
Abstract
1. Introduction
2. Theory and Methods
2.1. Cauchy and Inverse Convolution Deformation Model
2.1.1. Mathematical Foundations
1. Polynomial decay is slower.
2. Better dimensional flexibility.
3. Better gradient stability.
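The slower polynomial decay can be checked numerically. The following is a minimal sketch that compares the standard Cauchy and Gaussian densities far from the centre; the function names and the unit scale parameters are illustrative assumptions, not values taken from the paper.

```python
import math


def cauchy_pdf(x, x0=0.0, gamma=1.0):
    """Cauchy density 1 / (pi * gamma * (1 + ((x - x0) / gamma)^2))."""
    z = (x - x0) / gamma
    return 1.0 / (math.pi * gamma * (1.0 + z * z))


def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian density with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def tail_ratio(x):
    """How many times more mass density the Cauchy tail keeps at x."""
    return cauchy_pdf(x) / gaussian_pdf(x)


# The Cauchy tail falls off like 1/x^2, the Gaussian tail like exp(-x^2 / 2),
# so the ratio grows quickly with distance from the centre.
for x in (1.0, 3.0, 5.0):
    print(x, tail_ratio(x))
```

Under these assumptions, distant sampling positions keep a non-negligible weight under the Cauchy density, which is the property the list above refers to.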
2.1.2. Innovative Solutions
2.1.3. Operating Network Design
- Basic offset: the initial coarse offset is obtained by sampling through the inverse cumulative distribution function (ICDF):
- Adaptive correction: a lightweight CNN with 3/5/7 layers (number of parameters < 1 k) is used to predict pixel-wise correction coefficients.
- Design of adaptive truncation strategy constraints.
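As an illustration of the basic-offset and truncation steps, the sketch below draws offsets through the Cauchy inverse cumulative distribution function, x0 + gamma * tan(pi * (u - 1/2)), and clips them to a fixed range. The function names, the offset tensor shape, `gamma`, and `max_offset` are hypothetical choices made for this example; the paper's own truncation rule and the CNN-based correction are not reproduced here.

```python
import numpy as np


def cauchy_icdf(u, x0=0.0, gamma=1.0):
    """Inverse CDF (quantile function) of the Cauchy distribution."""
    return x0 + gamma * np.tan(np.pi * (u - 0.5))


def sample_base_offsets(shape, gamma=1.0, max_offset=3.0, eps=1e-6, seed=0):
    """Draw coarse offsets via inverse-transform sampling, then truncate.

    `max_offset` is an assumed fixed clipping bound standing in for the
    adaptive truncation strategy; `eps` keeps u away from 0 and 1, where
    the tangent diverges.
    """
    rng = np.random.default_rng(seed)
    u = rng.uniform(eps, 1.0 - eps, size=shape)
    raw = cauchy_icdf(u, gamma=gamma)
    return np.clip(raw, -max_offset, max_offset)


# Example: x/y offsets for a 3x3 kernel (2 * 9 = 18 channels) on an 8x8 map.
offsets = sample_base_offsets((2, 18, 8, 8))
print(offsets.shape, float(offsets.min()), float(offsets.max()))
```

Clipping is the simplest way to keep the heavy tail from producing sampling positions far outside the feature map; a learned or data-dependent bound would replace `max_offset` in a full implementation.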
2.2. Theory and Implementation of the Cauchy–Gaussian Hybrid Distribution Attention Mechanism
2.2.1. Mathematical Foundations
2.2.2. Innovative Cauchy Attention Branch
- (1) Linearization Implementation Method
- (2) Gradient constraint conditions
2.2.3. Innovative Gaussian Attention Branch
- (1) Multi-Scale Modulation Mechanism
- (2) Gaussian Receptive Field Constraint Strategy
2.2.4. Cauchy–Gauss Branch Dynamic Fusion Module
- (1) Hybrid Coefficient Generation
- (2) Gradient problem analysis of mixed operations
2.2.5. Operating Network Design
1. Dual-branch parallel processing:
2. The Gaussian branch:
3. Feature reconstruction output:
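The dual-branch idea can be sketched with two distance-based kernels whose normalized weights are mixed by a coefficient. Everything below is an assumption made for illustration: the kernel forms, the scalar `alpha`, and the quadratic-cost formulation. It is not the linearized implementation or the learned fusion coefficient described in Section 2.2.

```python
import numpy as np


def pairwise_sq_dist(q, k):
    """Squared Euclidean distances between rows of q (n, d) and k (m, d)."""
    return ((q[:, None, :] - k[None, :, :]) ** 2).sum(axis=-1)


def hybrid_attention(q, k, v, alpha=0.5, gamma=1.0, sigma=1.0):
    """Mix a heavy-tailed Cauchy kernel with a local Gaussian kernel.

    alpha (hypothetical) weights the Cauchy branch; 1 - alpha weights the
    Gaussian branch. Each branch is row-normalized before mixing.
    """
    d2 = pairwise_sq_dist(q, k)
    cauchy = 1.0 / (1.0 + d2 / gamma**2)      # long-range branch
    gauss = np.exp(-d2 / (2.0 * sigma**2))    # local branch
    cauchy = cauchy / cauchy.sum(axis=1, keepdims=True)
    gauss = gauss / gauss.sum(axis=1, keepdims=True)
    weights = alpha * cauchy + (1.0 - alpha) * gauss
    return weights @ v, weights


rng = np.random.default_rng(0)
tokens = rng.normal(size=(6, 4))
values = rng.normal(size=(6, 3))
out, attn = hybrid_attention(tokens, tokens, values)
print(out.shape, attn.sum(axis=1))
```

Because both branches are normalized separately, the mixed weights in each row still sum to one for any `alpha` in [0, 1].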
2.3. Theory and Implementation of Tree-like Multi-Path Coding Structure
2.3.1. Principles of Bio-Inspired Design
2.3.2. Innovative Cauchy Deformation Bimodal Convolution
2.3.3. Dynamic Gate Fusion
- (1) Fog concentration estimation module.
- (2) Gate coefficient generation.
2.3.4. Multi-Path Constraint Strategy
- (1) Multi-path gradient allocation.
- (2) Orthogonality Constraint Between Paths
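One common way to express an orthogonality constraint between paths is to penalize the squared cosine similarity of their flattened features. The sketch below shows that generic form; the function name and the exact penalty are assumptions, since the paper's loss term is not given in this section.

```python
import numpy as np


def path_orthogonality_penalty(paths):
    """Sum of squared cosine similarities over all ordered pairs of paths.

    paths: list of equally shaped feature arrays, one per path.
    Returns 0 when all paths are mutually orthogonal.
    """
    flat = np.stack([p.reshape(-1) for p in paths]).astype(np.float64)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-12)
    gram = flat @ flat.T                       # cosine similarity matrix
    off_diag = gram - np.diag(np.diag(gram))   # drop self-similarity
    return float((off_diag ** 2).sum())


orthogonal = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
identical = [np.array([1.0, 0.0]), np.array([1.0, 0.0])]
print(path_orthogonality_penalty(orthogonal))
print(path_orthogonality_penalty(identical))
```

Adding such a term to the training loss would push the paths toward encoding different information, which is the stated purpose of the constraint.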
2.3.5. Detailed Implementation Steps
1. Input Feature Division
2. Independent processing of paths
3. Dynamic feature fusion
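The three steps above can be sketched as follows. This is a simplified illustration under stated assumptions: each path receives the full input feature map (matching the `[B, C0, H, W] × N1` comment in Algorithm 1), the per-path operators are placeholder callables, and the gate logits are supplied directly instead of being predicted from a fog concentration estimate.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over a 1-D array of gate logits."""
    logits = np.asarray(logits, dtype=np.float64)
    e = np.exp(logits - logits.max())
    return e / e.sum()


def fuse_paths(x, path_fns, gate_logits):
    """Run every path on x and fuse the results with gate weights.

    x: feature map (C, H, W); path_fns: hypothetical per-path operators;
    gate_logits: one logit per path (assumed to come from a gating module).
    """
    outs = [fn(x) for fn in path_fns]             # independent processing
    gates = softmax(gate_logits)                  # gate coefficients
    fused = sum(g * o for g, o in zip(gates, outs))
    return fused, gates


x = np.ones((4, 8, 8))
paths = [lambda t: t, lambda t: 2.0 * t, lambda t: 3.0 * t]
fused, gates = fuse_paths(x, paths, gate_logits=[0.0, 0.0, 0.0])
print(gates, float(fused.mean()))
```

With equal logits the gates are uniform, so the fused output is the plain average of the path outputs; unequal logits shift the weight toward particular scales.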
3. Operational Process and Innovative Advantages
3.1. Operational Process
Algorithm 1. EnhancedMB-CauchyFormer pseudocode

Input: I, image tensor (B, C, H, W); apply 3x3 convolution with stride 1 and padding 1
Output: O, embedded patches (B, embed_dim, H, W)

    function EnhancedMB-CauchyFormer(I):
        x = I                                                    // Receive raw image input
        x = OverlapPatchEmbed(x)                                 // Initial feature extraction [B, C0, H, W]

        // --- Level 1 ---
        x1_list = Patch_Embed_stage_Cauchy(x, num_path=N1)       // Generate N1 paths [B, C0, H, W] × N1
        x1 = EnhancedMHCA_stage(x1_list) + x                     // Tree-based multi-head attention + residual connection

        // --- Level 2 ---
        x2 = Downsample(x1)                                      // Downsampling [B, C1, H/2, W/2]
        x2_list = Patch_Embed_stage_Cauchy(x2, num_path=N2)
        x2 = EnhancedMHCA_stage(x2_list) + x2

        // --- Level 3 ---
        x3 = Downsample(x2)                                      // Downsampling [B, C2, H/4, W/4]
        x3_list = Patch_Embed_stage_Cauchy(x3, num_path=N3)
        x3 = EnhancedMHCA_stage(x3_list) + x3

        // --- Level 4 (Latent) ---
        x4 = Downsample(x3)                                      // Downsampling [B, C3, H/8, W/8]
        x4_list = Patch_Embed_stage_Cauchy(x4, num_path=N4)
        x4 = EnhancedMHCA_stage(x4_list) + x4

        // ======== 4. Decoder Upsampling Path ========
        // --- Level 3 ---
        x3_up = Upsample(x4)                                     // Upsampling [B, C2, H/4, W/4]
        x3_up = Concat(x3_up, x3)                                // Concatenate with encoder features
        x3_up = ReduceChannels(x3_up)                            // 1 × 1 convolution for dimension reduction
        x3_up_list = Patch_Embed_stage_Cauchy(x3_up, num_path=N3)
        x3_up = EnhancedMHCA_stage(x3_up_list) + x3_up

        // --- Level 2 ---
        x2_up = Upsample(x3_up)                                  // Upsampling [B, C1, H/2, W/2]
        x2_up = Concat(x2_up, x2)
        x2_up = ReduceChannels(x2_up)
        x2_up_list = Patch_Embed_stage_Cauchy(x2_up, num_path=N2)
        x2_up = EnhancedMHCA_stage(x2_up_list) + x2_up

        // --- Level 1 ---
        x1_up = Upsample(x2_up)                                  // Upsampling [B, C0, H, W]
        x1_up = Concat(x1_up, x1)
        x1_up_list = Patch_Embed_stage_Cauchy(x1_up, num_path=N1)
        x1_up = EnhancedMHCA_stage(x1_up_list) + x1_up

        // ======== 5. Refinement Stage ========
        x_refine_list = Patch_Embed_stage_Cauchy(x1_up, num_path=N1)
        x_refine = EnhancedMHCA_stage(x_refine_list) + x1_up

        // ======== 6. Output Processing ========
        if dual_pixel_task:
            x_refine = x_refine + SkipConv(x)                    // Skip connection
            O = OutputConv(x_refine)                             // 3-channel output
        else:
            O = OutputConv(x_refine) + I                         // Residual output
        return O
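To make the resolution schedule of Algorithm 1 easier to follow, the sketch below traces the tensor shapes named in its comments: the encoder halves the spatial size at each level down to H/8, and the decoder mirrors it back up. The helper name and the channel widths in the example are hypothetical; only the shape pattern comes from the pseudocode.

```python
def trace_shapes(batch, height, width, channels):
    """Return encoder and decoder shapes for a 4-level U-shaped network.

    channels: (C0, C1, C2, C3), assumed widths for levels 1 to 4.
    Decoder entries are the shapes right after each Upsample step.
    """
    encoder = [
        (batch, c, height // 2**level, width // 2**level)
        for level, c in enumerate(channels)
    ]
    # The decoder revisits levels 3, 2, 1 in that order.
    decoder = list(reversed(encoder[:-1]))
    return encoder, decoder


enc, dec = trace_shapes(2, 256, 256, (24, 48, 72, 96))
for name, shapes in (("encoder", enc), ("decoder", dec)):
    for shape in shapes:
        print(name, shape)
```

The input height and width should be divisible by 8 for the encoder and decoder shapes to line up at the concatenation steps.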
3.2. Methodological Advantages
3.2.1. Cauchy Deformable Convolution
3.2.2. Innovative Dual-Distribution Hybrid Mechanism
3.2.3. Innovative Tree-Based Self-Attention Model Block
4. Experimental Setup and Result Analysis
4.1. Algorithm Setup
4.2. Experiment on Synthetic Blur and Real Blur Images
4.3. Ablation Studies
4.3.1. Cauchy Deformable Convolution Layer
4.3.2. Cauchy–Gauss Hybrid Attention Mechanism
4.3.3. Tree-like Multi-Path Branch Architecture
4.3.4. Module Synergy Analysis
4.4. Real Target Recognition and Verification
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
Abbreviation | Definition |
---|---|
ICDF | Inverse Cumulative Distribution Function |
CNN | Convolutional Neural Network |
MB-TaylorFormer | Multi-Branch Taylor Transformer |
DCP | Dark Channel Prior |
AOD-Net | All-in-One Dehazing Network |
MSBDN | Multi-Scale Boosted Dehazing Network |
FFA-Net | Feature Fusion Attention Network |
Restormer | Restoration Transformer |
Dehamer | Deep Hybrid Atmospheric-scattering Model Guided Network |
EnhancedMB-CauchyFormer | Enhanced Multi-Branch Cauchy Transformer Network |
PSNR | Peak Signal-to-Noise Ratio |
SSIM | Structural Similarity Index Measure |
References
1. Narasimhan, S.G.; Nayar, S.K. Contrast restoration of weather degraded images. IEEE Trans. Pattern Anal. Mach. Intell. 2003, 25, 713–724.
2. He, K.M.; Sun, J.; Tang, X.O. Single Image Haze Removal Using Dark Channel Prior. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 2341–2353.
3. Liu, Z.H.; Zhao, S.J.; Wang, X. Research on Driving Obstacle Detection Technology in Foggy Weather Based on GCANet and Feature Fusion Training. Sensors 2023, 23, 2822.
4. Zai, W.J.; Yan, L.S. Multi-Patch Hierarchical Transmission Channel Image Dehazing Network Based on Dual Attention Level Feature Fusion. Sensors 2023, 23, 7026.
5. Zamanidoost, Y.; Ould-Bachir, T.; Marterl, S. OMS-CNN: Optimized Multi-Scale CNN for Lung Nodule Detection Based on Faster R-CNN. IEEE J. Biomed. Health Inform. 2024, 20, 2148–2160.
6. Liu, Z.H.; Yan, J.; Zhang, J.Z. Research on a Recognition Algorithm for Traffic Signs in Foggy Environments Based on Image Defogging and Transformer. Sensors 2024, 24, 4370.
7. Zhang, H.; Sindagi, V.; Patel, V.M. Multi-scale Single Image Dehazing Using Perceptual Pyramid Deep Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 1015–1023.
8. Anvari, Z.; Athitsos, V. Dehaze-GLCGAN: Unpaired Single Image De-hazing via Adversarial Training. arXiv 2020, arXiv:2008.06632.
9. Liang, L.H.; Zhang, S.Q.; Li, J. Multiscale DenseNet Meets with Bi-RNN for Hyperspectral Image Classification. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2022, 15, 5401–5405.
10. Zhang, J.; Tu, B.; Liu, B.; Li, J.; Plaza, A. Hyperspectral Image Classification via Neighborhood Adaptive Graph Isomorphism Network. IEEE Trans. Geosci. Remote Sens. 2025, 63, 5515717.
11. Xia, Z.F.; Pan, X.R.; Song, S.J.; Li, L.E.; Huang, G. Vision Transformer with Deformable Attention. arXiv 2022, arXiv:2201.00520.
12. Liu, Z.; Hu, H.; Lin, Y.T.; Yao, Z.L.; Xie, Z.D.; Wei, Y.X. Swin Transformer V2: Scaling Up Capacity and Resolution. arXiv 2022, arXiv:2111.09883.
13. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2022, arXiv:2010.11929.
14. Liu, W.N.; Zhang, J.H. Research on Image Defogging Algorithm Combining Homomorphic Filtering and Retinex. In Proceedings of the International Symposium on Computer Technology and Information Science, Xi’an, China, 12–14 July 2024; pp. 596–599.
15. Yang, P.W.; Wang, L. Multi-Scale Dehaze Network Based on Frequency Domain Assistance and Detailed Brightness Information Guidance. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing, Hyderabad, India, 6–11 April 2025; pp. 1–5.
16. Qiu, Y.W.; Zhang, K.H.; Wang, C.X.; Luo, W.H.; Li, H.D.; Jin, Z. MB-TaylorFormer: Multi-branch Efficient Transformer Expanded by Taylor Formula for Image Dehazing. In Proceedings of the IEEE International Conference on Computer Vision, Paris, France, 2–3 October 2023; pp. 12756–12767.
17. Jin, Z.; Qiu, Y.W.; Zhang, K.H.; Li, H.D.; Luo, W.H. MB-TaylorFormer V2: Improved Multi-Branch Linear Transformer Expanded by Taylor Formula for Image Restoration. arXiv 2025, arXiv:2501.04486.
18. Wang, M.; Wang, J.S.; Li, X.D.; Zhang, M.; Hao, W.K. Harris Hawk Optimization Algorithm Based on Cauchy Distribution Inverse Cumulative Function and Tangent Flight Operator. Appl. Intell. 2021, 52, 10999–11026.
19. Wang, S.; Gao, Y.B.; Li, S.; Lv, C.; Cai, X.; Li, C.K.; Yuan, H.; Zhang, J.L. MetricGrids: Arbitrary Nonlinear Approximation with Elementary Metric Grids based Implicit Neural Representation. arXiv 2025, arXiv:2503.10000.
20. Zhang, Y.F.; Wang, C.Y.; Wang, X.G.; Zeng, W.J.; Liu, W.Y. FairMOT: On the Fairness of Detection and Re-Identification in Multiple Object Tracking. arXiv 2021, arXiv:2004.01888.
21. Dubey, S.R.; Singh, S.K.; Chaudhuri, B.B. Activation functions in deep learning: A comprehensive survey and benchmark. Neurocomputing 2022, 6, 92–108.
22. Wang, T.; Dai, W.F.; Wu, Y.J. Nonuniform and pathway-specific laminar processing of spatial frequencies in the primary visual cortex of primates. Nat. Commun. 2024, 15, 4005–4021.
23. Luo, W.J.; Li, Y.J.; Urtasun, R.; Zemel, R. Understanding the Effective Receptive Field in Deep Convolutional Neural Networks. arXiv 2017, arXiv:1701.04128.
24. Li, B.Y.; Peng, X.L.; Wang, Z.Y.; Xu, J.Z.; Feng, D. AOD-Net: All-in-One Dehazing Network. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 4780–4788.
25. Zheng, T.Y.; Xu, T.Y.; Li, X.D.; Zhao, X.W. Improved AOD-Net Dehazing Algorithm for Target Image. In Proceedings of the International Conference on Computer Engineering and Intelligent Control, Guangzhou, China, 11–13 October 2024; pp. 333–337.
26. Wang, Y.; Yu, Z.; Guo, C.; Zhao, J. Layer Separation Network with Contrastive Loss for Robust Dehazing. In Proceedings of the International Conference on Cloud Computing and Big Data Analytics, Chengdu, China, 26–28 April 2023; pp. 290–296.
27. Zheng, Y.J.; Liu, S.C.; Bruzzone, L. An Attention-Enhanced Feature Fusion Network (AeF2N) for Hyperspectral Image Classification. IEEE Geosci. Remote Sens. Lett. 2023, 20, 5511005.
28. Cui, Y.N.; Ren, W.Q.; Cao, X.C.; Knoll, A. Revitalizing Convolutional Network for Image Restoration. IEEE Trans. Pattern Anal. Mach. Intell. 2024, 46, 9423–9438.
29. Faramarzi, F.; Heidarinejad, M.; Mirjalili, S.; Gandomi, A.H. Marine predators algorithm: A nature-inspired metaheuristic. Expert Syst. Appl. 2020, 152, 1929–1957.
30. Li, B.Y.; Ren, W.Q.; Fu, D.P.; Tao, D.C.; Feng, D.; Zeng, W.J. Benchmarking Single-Image Dehazing and Beyond. IEEE Trans. Image Process. 2019, 28, 492–505.
31. Ancuti, C.O.; Ancuti, C.; Timofte, R.; Vleeschouwer, C.D. O-haze: A dehazing benchmark with real hazy and haze-free outdoor images. In Proceedings of the CVF Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 754–762.
32. Ancuti, C.O.; Ancuti, C.; Timofte, R.; Vleeschouwer, C.D. Dense-haze: A benchmark for image dehazing with dense-haze and haze-free images. In Proceedings of the IEEE International Conference on Image Processing, Taipei, Taiwan, 22–25 September 2019; pp. 1014–1018.
Dataset | Training Sample Count | Test Sample Count | Fog Concentration Distribution | Evaluation Subset |
---|---|---|---|---|
RESIDE-ITS [30] | 13,990 | 500 | Low → high (indoor) | SOTS |
RESIDE-OTS [30] | 313,950 | 500 | Uniform (outdoor) | SOTS |
O-HAZE [31] | 40 | 5 | Medium (real scene) | Last 5 |
Dense-HAZE [32] | 50 | 5 | Extremely high (heavy fog) | Last 5 |
Method | SOTS-Indoor PSNR (dB) ↑ | SOTS-Indoor SSIM ↑ | SOTS-Outdoor PSNR (dB) ↑ | SOTS-Outdoor SSIM ↑ | O-HAZE PSNR (dB) ↑ | O-HAZE SSIM ↑ | Dense-Haze PSNR (dB) ↑ | Dense-Haze SSIM ↑ | Params (M) | MACs (G) |
---|---|---|---|---|---|---|---|---|---|---|
Based on physical models | - | - | - | - | - | - | - | - | - | - |
DCP [1] | 16.62 | 0.818 | 19.13 | 0.815 | 16.78 | 0.653 | 12.72 | 0.442 | - | 0.78 |
Improved AOD-Net [24,25] | 24.61 | 0.854 | 22.356 | 0.934 | 20.577 | 0.723 | 13.59 | 0.544 | 0.025 | 2.3 |
MSBDN-DFF [26] | 34.29 | 0.983 | 33.65 | 0.970 | 24.87 | 0.758 | 16.34 | 0.638 | 31.45 | 43.82 |
Self-attention model | - | - | - | - | - | - | - | - | - | - |
FFA-Net [27] | 36.39 | 0.989 | 33.57 | 0.984 | 22.12 | 0.770 | 15.70 | 0.549 | 4.46 | 287.8 |
ConvIR [28] | 41.53 | 0.994 | 37.95 | 0.990 | 25.36 | 0.784 | 16.86 | 0.600 | 5.53 | 42.1 |
SOTA | - | - | - | - | - | - | - | - | - | - |
FIGD-Net [15] | 41.11 | 0.996 | 38.19 | 0.9992 | 26.12 | 0.805 | 17.71 | 0.608 | 124.24 | 70.76 |
MB-TaylorFormerV2 [16,17] | 42.86 | 0.995 | 39.25 | 0.992 | 25.43 | 0.792 | 16.95 | 0.621 | 7.29 | 86.0 |
Ours (SOTA breakthrough) | - | - | - | - | - | - | - | - | - | - |
EnhancedMB-CauchyFormer | 45.12 | 0.997 | 39.55 | 0.994 | 24.98 | 0.800 | 17.83 | 0.584 | 15.80 | 96.8 |
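For readers reproducing the PSNR columns, the standard definition is 10 * log10(MAX^2 / MSE). The sketch below implements that textbook formula; the assumption that images are scaled to [0, 1] (so `max_value=1.0`) is ours, and the paper's exact evaluation protocol (colour space, cropping, averaging) is not specified in this section.

```python
import numpy as np


def psnr(reference, restored, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_value is the largest possible pixel value (1.0 for images in
    [0, 1], 255 for 8-bit images).
    """
    ref = np.asarray(reference, dtype=np.float64)
    out = np.asarray(restored, dtype=np.float64)
    mse = np.mean((ref - out) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value**2 / mse))


clean = np.zeros((4, 4))
noisy = np.full((4, 4), 0.1)   # uniform error of 0.1 gives MSE = 0.01
print(psnr(clean, noisy))
```

Because the scale is logarithmic, a gain of about 3 dB corresponds to roughly halving the mean squared error.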
Method | PSNR (dB) | SSIM | Parameters (M) | MACs (G) |
---|---|---|---|---|
Standard convolution | 40.71 | 0.991 | 1.8 | 3.2 |
Gaussian deformable convolution | 41.62 | 0.992 | 2.5 | 5.1 |
Cauchy deformable convolution | 41.87 | 0.993 | 2.7 | 7.8 |
Attention Type | PSNR (dB) | SSIM | Approximation Error | Long-Range Dependency Strength |
---|---|---|---|---|
Standard attention | 40.71 | 0.991 | 0 | 1.0 |
Linear attention | 36.12 | 0.973 | 0.148 | 0.32 |
Gaussian kernel approximation | 40.05 | 0.989 | 0.087 | 0.75 |
Cauchy–Gaussian Kernel Approximation | 42.59 | 0.995 | 0.021 | 0.92 |
Active Branches | PSNR (dB) | SSIM | Main Domain |
---|---|---|---|
L1 | 43.95 | 0.994 | Edge texture |
L1 + L2 + L3 | 44.98 | 0.993 | Contour shape integration |
L1 + L2 + L3 + L4 | 45.02 | 0.995 | Global color/texture shape modeling |
L1 + L2 + L3 + L4 + L5 | 45.08 | 0.996 | Target recognition |
Full branch + dynamic gate control | 45.12 | 0.996 | Cross-scale feature fusion |
Module Combination | PSNR (dB) | SSIM | ΔPSNR vs. Baseline | Contribution Decomposition | MACs (G) |
---|---|---|---|---|---|
Baseline (standard convolution + attention) | 40.71 | 0.991 | — | — | 3.2 |
+ Cauchy deformable convolution | 41.87 | 0.993 | +1.16 | 26.3% | 7.8 |
+ Mixed distribution attention | 42.59 | 0.995 | +1.88 | 16.3% | 58.8 |
+ Tree-like multi-branching (complete model) | 45.12 | 0.996 | +4.41 | 57.4% | 96.8 |
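The Contribution Decomposition column can be reproduced from the PSNR column: each module's share is its incremental PSNR gain over the previous row divided by the total gain of the complete model over the baseline. A short check, with a helper name chosen for this example:

```python
def contribution_shares(psnr_values):
    """Percentage of the total PSNR gain added by each successive module.

    psnr_values: PSNR of the baseline followed by each cumulative
    configuration, in the order the modules are added.
    """
    total_gain = psnr_values[-1] - psnr_values[0]
    step_gains = [b - a for a, b in zip(psnr_values, psnr_values[1:])]
    return [round(100.0 * gain / total_gain, 1) for gain in step_gains]


# Baseline, + deformable convolution, + hybrid attention, complete model.
shares = contribution_shares([40.71, 41.87, 42.59, 45.12])
print(shares)  # [26.3, 16.3, 57.4]
```

The three shares correspond to incremental gains of 1.16 dB, 0.72 dB, and 2.53 dB out of a total of 4.41 dB.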
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ji, L.; Chen, C. Multi-Scale Image Defogging Network Based on Cauchy Inverse Cumulative Function Hybrid Distribution Deformation Convolution. Sensors 2025, 25, 5088. https://doi.org/10.3390/s25165088