Haze-Aware Attention Network for Single-Image Dehazing

Single-image dehazing is a pivotal challenge in computer vision that seeks to remove haze from images and restore clean background details. Recognizing the limitations of traditional physical model-based methods and the inefficiencies of current attention-based solutions, we propose a new dehazing network combining an innovative Haze-Aware Attention Module (HAAM) with a Multiscale Frequency Enhancement Module (MFEM). The HAAM is inspired by the atmospheric scattering model, skillfully integrating physical principles into high-dimensional features for targeted dehazing; it extracts latent features during the image restoration process that substantially improve quantitative metrics. The MFEM efficiently enhances high-frequency details while sidestepping the complexities of wavelet or Fourier transforms, employing multiscale receptive fields to extract and emphasize key frequency components with minimal parameter overhead. Integrated into a simple U-Net framework, our Haze-Aware Attention Network (HAA-Net) for single-image dehazing significantly outperforms existing attention-based and transformer models in efficiency and effectiveness. Tested across various public datasets, the HAA-Net sets new performance benchmarks. Our work not only advances the field of image dehazing but also offers insights into the design of attention mechanisms for broader applications in computer vision.


Introduction
Single-image dehazing [1][2][3] aims to eliminate haze from images and accurately restore the details of a clean background. This process is a classic example of an ill-posed problem, where the solution is not unique. Despite these challenges, single-image dehazing is a key area of research due to its wide range of applications in computer vision tasks, including outdoor surveillance [4], outdoor scene understanding [5,6], and object detection [7,8]. Removing the visual effect of fog is crucial for improving the accuracy and effectiveness of these tasks, so the pursuit of effective single-image dehazing methods has become a focal point of research. Over the past decade, this field has attracted considerable interest from researchers and engineers, leading to the development of various innovative technologies and algorithms. These efforts are driven by the potential benefits that dehazing can bring to many applications, making it a dynamic and vibrant area of study within the broader field of computer vision.
In recent years, the swift advancement in deep learning technologies has brought attention-based dehazing networks to the forefront of interest. The key to their growing popularity lies in the attention module's ability to selectively target various areas and channels. This adaptability is especially valuable for dehazing networks, given the uneven spatial spread of haze degradation. Such attention modules offer a tailored approach to effectively restoring clean images, thus addressing the specific challenges posed by haze. The attention-based dehazing methods [2] have achieved performance far beyond that of traditional physical model-based methods [1]. The PCFA module [9] taps feature pyramid and channel attention to extract and focus on crucial image features for effective dehazing. Zhang et al. [10] introduced an RMAM with an attention block to help networks concentrate on key features during learning. Zhang et al. [11] designed a network that teams up multilevel feature blending with mixed convolution attention to steadily and smartly boost dehazing results. However, the explanation behind these attention-based approaches remains unclear, as they do not have a solid link to the atmospheric scattering model [1,12]. Additionally, some of these techniques [2,13,14] employ intricate designs or self-attention mechanisms, thus leading to suboptimal efficiency. In this study, we take a fresh look at the attention mechanism designs in dehazing networks and introduce a new approach inspired by physical priors [15][16][17], which is named the Haze-Aware Attention Module (HAAM). This module ingeniously applies the attention mechanism to mimic the parameters of the atmospheric scattering model, thus expressing these parameters through high-dimensional features. By employing these features constrained by the physical model, we conducted the dehazing process within the feature space, thus achieving outstanding results in both performance and efficiency.
In addition to HAAM, we also developed a new Multiscale Frequency Enhancement Module (MFEM) designed to boost high-frequency details without relying on wavelet [18] or Fourier transforms [19]. MFEM uses a 4-scale receptive field to extract contextual features, thus fully adapting to receptive fields of different sizes. It also emphasizes important information in the frequency domain through lightweight kernel learning parameters in the channel dimension, thus effectively enhancing the dehazing effect. This approach sidesteps the extra computational work typically needed for these methods' reverse transformations, thus resulting in an efficient and stable enhancement of features. By combining our proposed HAAM and MFEM with the straightforward U-Net architecture, we have created the Haze-Aware Attention Network (HAA-Net) for single-image dehazing. Our method has been tested on both synthetic and real-world datasets. In terms of metrics and visual effects, our approach significantly outperformed traditional attention-based methods, as well as transformer-based methods.
The contributions of this work are summarized as follows:
• We developed an efficient attention mechanism known as the HAAM, which is inspired by the atmospheric scattering model and smartly incorporates physical principles into high-dimensional features.
• We crafted a multiscale frequency enhancement module that tunes high-frequency features, thus effectively bringing back the finer details of hazy images.
• Our HAA-Net set new benchmarks in performance across several public datasets. Notably, it reached a PSNR/SSIM of 41.23 dB/0.996 on the RESIDE-Indoor dataset, thus showcasing its exceptional dehazing performance.

Prior-Based Image Dehazing
Research on single-image dehazing in computer vision and computer graphics has been widely explored. Traditional methods have relied on priors such as the dark channel prior (DCP) [12], color attenuation prior [20], and nonlocal prior [21] to estimate scattering light, atmospheric light, depth, and transmission map [12,15,[20][21][22]. These methods are backed by strong principles and are interpretable. However, they may not perform well in real-world image dehazing scenarios because they only extract features based on the atmospheric scattering model at the image level, without accessing deep latent features. The Atmospheric Scattering Model (ASM) has been the cornerstone for many previous works in constructing image dehazing networks. These methods have explicitly incorporated the ASM to enhance the generalization capability of their models, thus thoroughly validating the effectiveness of the ASM. In contrast to these approaches, we introduced the ASM at the feature level, thus leveraging it to learn more latent features. Additionally, we allocated a substantial number of channels for the atmospheric light value A, thus aiming to better adapt to the complexities of real-world scenarios.

Deep Learning-Based Image Dehazing
Due to the inability of prior-based dehazing methods to adapt well to all haze scenes, recent dehazing efforts have moved away from using priors. Some end-to-end networks directly estimated haze-free images [2,3,14,[23][24][25][26][27]. AECRNet, SFNet, and others use a U-shaped structure, which has been proven to be superior for haze removal. These methods have achieved some results, but they perform poorly in dehazing real images. Recently, transformers [28][29][30][31][32] have been used in image tasks due to their advantage in capturing long-range relationships. However, their computational complexity increases quadratically with resolution, thereby making them unsuitable for pixel-to-pixel tasks like dehazing. Moreover, these methods lack theoretical interpretability. Therefore, instead of using transformers, we developed our own more efficient attention mechanism based on physical priors.

Attention-Based Image Dehazing
Attention mechanisms have been playing a crucial role in the field of dehazing, and many effective attention mechanisms have been proposed to enhance hazy images. FFA-Net [2] introduced attention mechanisms and achieved impressive results in metrics like the PSNR and SSIM. MSAFF-Net [33] used a channel attention module and a multiscale spatial attention module to focus on areas with features related to fog. Chen et al. [34] proposed the Detail-Enhanced Attention Block (DEAB), which enhances feature learning by combining Detail-Enhanced Convolution and Content-Guided Attention, thereby further improving dehazing performance. Zhang et al. [35] proposed a Residual Nonlocal Attention Network that takes into account the uneven distribution of information in corrupted images. They designed both local and nonlocal attention blocks to extract features for high-quality image restoration. Mou et al. [36] introduced the COLA-Net for image restoration, which combines local and nonlocal attention mechanisms to restore areas with complex textures and highly repetitive details. However, these methods had high complexity and slow processing during the dehazing process, and they overlook physical characteristics. To tackle this, we propose the Haze-Aware Attention Module, which considers the physical model in the feature space of low-resolution images. By incorporating physical priors, we obtained effective features with fewer parameters, thus leading to higher PSNR and SSIM values.

Frequency-Based Image Dehazing
Due to the convolution theorem, Fourier analysis is widely used to address various low-level vision problems. Numerous algorithms have been researched and developed from a frequency domain perspective for low-level vision issues. Some CNN-based frameworks [37][38][39] have been utilized to bridge the frequency gap between blurred and GT image pairs. For instance, Chen et al. [40] proposed a hierarchical desnow network based on dual-tree complex wavelet transform to reduce snow noise in images. Yang et al. [41] developed a wavelet transform-based U-Net model to replace traditional upsampling and downsampling operations. Zou et al. [18] employed wavelet transform to divide the input into four frequency sub-bands and processed each sub-band with separate convolutions to prevent interference between different frequency parts. Yu et al. [42] used deep Fourier transform to handle global frequency data and reconstruct the phase spectrum under the guidance of the amplitude spectrum, which then aids in enhancing the learning of local features within the spatial domain. Liu et al. [43] achieved impressive results by removing the haze effect from the low-frequency part based on the prior that haze is typically distributed in the low-frequency spectrum of its multiscale wavelet decomposition. However, these methods all incur the complexity of wavelet or Fourier transforms, thus making computation more costly. We have explored a more straightforward and efficient Multiscale Frequency Enhancement Module (MFEM), which enriches and emphasizes the frequencies extracted from four receptive-field sizes using ultralightweight learnable parameters and weights the features along the channel dimension, thus achieving satisfactory results.

Image Dehazing
As shown in Figure 1, our dehazing model employs a classic encoder-decoder architecture as its backbone. This framework performs a 4× downsampling operation, which greatly reduces memory usage during both training and inference, thus enhancing the model's operational efficiency. It is worth noting that our model uses three different types of activation functions. ReLU, validated in the gUnet [44] study for image dehazing, effectively learns complex patterns for image dehazing. Tanh, with its output range of (−1, 1), constrains the model's output to prevent extreme values, thus enhancing stability and output quality. Additionally, the model employs a dynamic fusion module to merge features from the downsampling and upsampling layers, a strategy that helps retain more information from the image and strengthens the model's ability to capture details, thus resulting in a more compact and efficient dehazing model. Within this refined feature space, we have further enhanced feature extraction and optimization through the cascade of HAABs. Each HAAB has been meticulously designed and consists of two key components: a Haze-Aware Attention Module and a Multiscale Frequency Enhancement Module. The HAAM, inspired by physical priors, guides the network to progressively extract clear, fog-free features, which are crucial for the dehazing effect in images. Meanwhile, the MFEM enriches the features through multiscale modulation, thus intelligently identifying and emphasizing features that contain important information and then fusing these features through channelwise weighted fusion using learnable parameters, which further improves the model's performance. With this innovative structural design, our dehazing model can effectively handle a variety of complex haze images. It not only preserves the original details of the image but also significantly improves the clarity and quality of the image.
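To make the backbone concrete, the following is a minimal PyTorch sketch of the encoder-decoder skeleton described above, assuming a 4× downsampling path with skip connections. The channel widths, the plain conv block standing in for the cascaded HAABs, and the 1×1 conv standing in for SKFusion are illustrative choices, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class TinyHAANet(nn.Module):
    # Illustrative 4x-down encoder-decoder skeleton. The HAAB cascade is
    # stubbed with plain conv+ReLU blocks, and skip fusion is stubbed
    # with a 1x1 conv over concatenated features (stand-in for SKFusion).
    def __init__(self, ch=16):
        super().__init__()
        self.inc = nn.Conv2d(3, ch, 3, padding=1)
        self.down1 = nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1)      # 2x down
        self.down2 = nn.Conv2d(ch * 2, ch * 4, 3, stride=2, padding=1)  # 4x down
        self.body = nn.Sequential(  # stand-in for the cascaded HAABs
            nn.Conv2d(ch * 4, ch * 4, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(ch * 4, ch * 4, 3, padding=1), nn.ReLU(inplace=True),
        )
        self.up1 = nn.ConvTranspose2d(ch * 4, ch * 2, 2, stride=2)
        self.fuse1 = nn.Conv2d(ch * 4, ch * 2, 1)  # stand-in for SKFusion
        self.up2 = nn.ConvTranspose2d(ch * 2, ch, 2, stride=2)
        self.fuse2 = nn.Conv2d(ch * 2, ch, 1)
        self.outc = nn.Conv2d(ch, 3, 3, padding=1)

    def forward(self, x):
        e1 = self.inc(x)
        e2 = self.down1(e1)
        b = self.body(self.down2(e2))
        d1 = self.fuse1(torch.cat([self.up1(b), e2], dim=1))
        d2 = self.fuse2(torch.cat([self.up2(d1), e1], dim=1))
        return torch.tanh(self.outc(d2)) + x  # Tanh bounds the learned residual
```

As in the text, the Tanh on the output keeps the predicted correction within (−1, 1), which helps avoid extreme values during training.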

Haze-Aware Attention Module
We introduce a new Haze-Aware Attention Module (HAAM), which cleverly applies physical models at the feature level to guide the model in feature extraction, thus pulling out many potentially important latent features. Empirical evidence demonstrates that this module not only provides good interpretability but also significantly improves performance metrics. Impressively, the introduction of the physical model appears to make our module more robust and adaptable for processing real images. As shown in Figure 2, especially for the restoration of sky areas, it significantly outperformed other state-of-the-art methods, thus making the enhanced images more visually appealing. Next, we delve into the principles of the model. Initially, leveraging the atmospheric scattering model, the generation of a hazy image can be described as follows:

I(x) = J(x)t(x) + A(1 − t(x)),

where I symbolizes the hazy image, J represents the GT image, and A denotes the atmospheric light, whose scattering can reduce image contrast and visibility. t is the transmission map, which reflects the proportion of light that travels from a point in the scene to the camera without scattering, and x indicates the pixel location. The transmission map is expressed as t(x) = e^(−βd(x)), wherein β signifies the atmospheric scattering coefficient and d signifies the depth of the scene. As the scene depth increases, the amount of light that reaches the camera decreases exponentially, because the light encounters more scattering as it passes through the atmosphere. To streamline the process for convolution operations, we reconfigure the equation in matrix representation:

I = J ⊙ T + A ⊙ (1 − T),

where J, T, I, and A denote the matrix-vector representations of J, t, I, and A, respectively, and ⊙ is elementwise multiplication. Based on the equations presented above, we constructed the Haze-Aware Attention Module in an intuitive and effective manner. We assumed that the atmospheric light is uniform and derived A by transforming the global contextual information of the entire image captured through Global Average Pooling (GAP).
Specifically,

A = σ(Conv_(N/8, N)(Conv_(N, N/8)(GAP(X)))).

Here, X represents the input feature map, GAP(·) signifies global average pooling, Conv_(N, N/8)(·) refers to a convolution layer with N input channels and N/8 output channels, σ denotes the Sigmoid activation function, and N is set to 64. In obtaining A, we use a process that first reduces dimensions and then increases them, which improves computational efficiency.
Given that GAP(·) captures only global information and neglects local details and textures, we employed a 3 × 3 convolution layer to extract features for T. By introducing physical priors, this approach balances global and local features, thereby facilitating a more effective restoration of hazy images.
Subsequently, we performed elementwise multiplication between A and (1 − T), and JT was obtained via X − A(1 − T). Given that division might lead to training instability, we approximated 1/T with a learned surrogate T′ rather than dividing directly. Finally, J was acquired through

J = (X − A(1 − T)) ⊙ T′.

HAAM is an advanced attention mechanism that stands out significantly from traditional spatial and channel attention mechanisms. Its core advantage lies in its ability to integrate physical prior knowledge, thus allowing the model to learn discriminative clean features more effectively during the training process. These clean features more accurately reflect the essence of the image, thus reducing the interference of noise and artifacts, which is crucial for high-definition image reconstruction. By integrating physical priors, HAAM not only enhances the model's understanding and processing of image content but also strengthens its generalization ability. Moreover, the design of HAAM also takes into account the computational efficiency of the model, thus reducing computational costs while improving module performance and making it a widely applicable attention mechanism.
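The HAAM pipeline above can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions: the exact formula the paper uses to approximate T′ ≈ 1/T is not reproduced here, so we substitute a bounded 1 + sigmoid(conv(X)) stand-in, and all layer choices beyond those named in the text are illustrative.

```python
import torch
import torch.nn as nn

class HAAM(nn.Module):
    # Sketch of the Haze-Aware Attention Module: A from GAP + channel
    # reduce/expand + Sigmoid, T from a 3x3 conv, then
    # J = (X - A(1 - T)) * T'. The expression for T' (~ 1/T) is an
    # illustrative stand-in, bounded and division-free.
    def __init__(self, n=64):
        super().__init__()
        self.a_branch = nn.Sequential(       # A: uniform atmospheric light
            nn.AdaptiveAvgPool2d(1),         # GAP -> global context
            nn.Conv2d(n, n // 8, 1),         # reduce dimensions
            nn.Conv2d(n // 8, n, 1),         # then increase them
            nn.Sigmoid(),
        )
        self.t_branch = nn.Sequential(       # T: local transmission features
            nn.Conv2d(n, n, 3, padding=1),
            nn.Sigmoid(),
        )
        self.t_inv = nn.Conv2d(n, n, 3, padding=1)  # feeds the T' stand-in

    def forward(self, x):
        A = self.a_branch(x)                 # (B, N, 1, 1), broadcast over H, W
        T = self.t_branch(x)
        jt = x - A * (1.0 - T)               # J*T = X - A(1 - T)
        t_prime = 1.0 + torch.sigmoid(self.t_inv(x))  # stand-in for T' ~ 1/T
        return jt * t_prime                  # J = (X - A(1 - T)) (*) T'
```

Note that the sigmoid keeps T in (0, 1) and the stand-in keeps T′ in (1, 2), so the dehazed feature stays numerically stable without any explicit division.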

Multiscale Frequency Enhancement Module
Traditional image restoration methods often focus on enhancing the frequency characteristics of images to improve their clarity and detail. These methods use transformation techniques, such as wavelet and Fourier transforms, to decompose the frequency features of an image into several different frequency bands. The purpose of this is to create isolation between signals of different frequencies, thus reducing their mutual interference, so that each band can be processed independently. Applying different convolutional kernels to these bands can further extract and enhance information within specific frequency ranges. This approach allows for the optimization of high-frequency details and low-frequency contours separately during image processing to achieve better restoration results. However, there are some limitations to this method. Firstly, it may not accurately identify and select the frequency components that carry the most important information in the image. Secondly, the need to process multiple bands separately not only increases the complexity of the algorithm but also significantly raises the computational cost. Especially during the inverse transformation, the operation must be performed separately for each band, which can be a bottleneck when computational resources are limited. Furthermore, since the size of the degradation blur is always variable, the receptive field is crucial in the image restoration process. We propose an exceptionally concise and efficient multiscale frequency enhancement module that employs extremely lightweight learnable parameters to effectively decompose frequencies into distinct components, thereby highlighting parts that contain key information. As depicted in Figure 3, we fully considered the impact of the receptive field in our design by utilizing convolutional kernels of sizes 3 × 3, 5 × 5, and 7 × 7, along with a global kernel, to capture four low-frequency components with different receptive field sizes. By subtracting these low-frequency components from the original input, we were able to generate high-frequency components and enhance the frequency sub-bands carrying significant information through network parameters. Subsequently, we applied learnable channel weights to the different frequency sub-bands. This process not only allows for individual processing of each frequency sub-band but also achieves fine-tuning of features, thus further enhancing the quality of image restoration.
The MFEM primarily consists of two main parts: the decoupler and the modulator. The decoupler acquires the various frequency sub-bands using multiscale filtering. The modulator then highlights the significant frequency sub-bands and processes each sub-band individually through learnable parameters on the channel dimension.
For any input feature map X ∈ R^(C×H×W), we obtain the lowest frequency spectrum through average pooling. Then, by subtracting the low-frequency part from X, we obtain the high-frequency part. To fully capture spectral information from different receptive fields, we process X using kernels of sizes 3 × 3, 5 × 5, and 7 × 7 and a global kernel:

X^l_(k×k) = AP_(k×k)(X), X^h_(k×k) = X − X^l_(k×k), k ∈ {3, 5, 7},
X^l_g = GAP(X), X^h_g = X − X^l_g.

In this context, X^l_g and X^h_g represent the global low-frequency and high-frequency sub-bands, respectively, and X^l_(3×3), X^h_(3×3), X^l_(5×5), X^h_(5×5), X^l_(7×7), and X^h_(7×7) denote the low-frequency and high-frequency sub-bands for the different receptive field sizes. To emphasize the frequency sub-bands that carry important information, we apply learnable weight parameters to the obtained frequency sub-bands. Taking only the global receptive field as an example,

X̃^l_g = W^l_g ⊙ X^l_g,

where X̃^l_g represents the global low-frequency sub-band after emphasizing important information and W^l_g is a learnable channel weight. Finally, we modulate the weighted frequency sub-bands along the channel dimension using learnable parameters, and the final output of the MFEM is obtained by summing these elements together. Our MFEM excels at addressing uneven fog densities and irregular shapes in images, thus successfully achieving the goal of high-quality image reconstruction. This module, with its innovative multiscale processing approach, can accurately identify and handle various details and textures within the image, thus maintaining clarity and realism even under complex conditions. The core strength of the MFEM lies in its fine control over different frequency components, thus allowing it to provide customized treatment for every detail in the image. By separately optimizing low-frequency and high-frequency information, the MFEM can significantly enhance the clarity of edges and textures while preserving the overall structure of the image. Moreover, the lightweight design of the module also means it has a significant advantage in computational efficiency, thereby enabling it to quickly process a large amount of image data without compromising performance.
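The decoupler/modulator pipeline above can be sketched as follows. This is a minimal illustration under stated assumptions: average pooling stands in for the multiscale low-pass filtering, and modulation is realized as one learnable per-channel weight vector per sub-band, with fusion by summation; the class and parameter layout are illustrative rather than the paper's exact implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MFEM(nn.Module):
    # Sketch of the Multiscale Frequency Enhancement Module: average
    # pooling at kernel sizes 3/5/7 plus a global pool acts as the
    # "decoupler" (low-pass); subtraction yields high-frequency parts;
    # learnable channel weights per sub-band act as the "modulator".
    def __init__(self, c):
        super().__init__()
        self.ks = [3, 5, 7]
        # one weight vector per (scale, low/high) sub-band, plus the
        # global pair: (len(ks) + 1) * 2 sub-bands in total
        n_bands = (len(self.ks) + 1) * 2
        self.w = nn.Parameter(torch.ones(n_bands, c, 1, 1))

    def forward(self, x):
        bands = []
        for k in self.ks:
            low = F.avg_pool2d(x, k, stride=1, padding=k // 2)  # AP_{k x k}
            bands += [low, x - low]            # low- and high-frequency parts
        g = x.mean(dim=(2, 3), keepdim=True)   # global low frequency (GAP)
        bands += [g.expand_as(x), x - g]
        # weight each sub-band channelwise and fuse by summation
        return sum(self.w[i] * b for i, b in enumerate(bands))
```

Because each low/high pair sums back to X, the module at initialization (all weights equal) acts as a scaled identity, and training then re-balances the sub-bands channel by channel.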
For the loss function, we designate the dehazed image, the clear ground truth J_gt, and the hazy image I as the anchor, positive sample, and negative sample, respectively: L_CR = CR(HAA-Net(I), J_gt, I). Finally, we combine the loss obtained from Contrastive Regularization with the L1 loss function to form the final loss:

L = L_1(HAA-Net(I), J_gt) + λ_1 L_CR.

Our experimental validation shows that λ_1 = 0.5 achieved excellent metrics.
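The combined objective can be sketched as below. The `feat_fn` argument is a placeholder for the frozen feature extractor that Contrastive Regularization [3] relies on (the original formulation uses pretrained network features); the ratio form, pulling the anchor toward the positive while pushing it from the negative, is a simplified illustration of CR rather than its exact definition.

```python
import torch
import torch.nn.functional as F

def dehazing_loss(pred, gt, hazy, feat_fn, lam1=0.5):
    # Sketch of L = L1(pred, gt) + lam1 * L_CR, with the dehazed output
    # as anchor, the ground truth as positive, and the hazy input as
    # negative. feat_fn is a placeholder feature extractor for CR.
    l1 = F.l1_loss(pred, gt)
    fa, fpos, fneg = feat_fn(pred), feat_fn(gt), feat_fn(hazy)
    # contrastive term: small distance to positive, large to negative
    l_cr = F.l1_loss(fa, fpos) / (F.l1_loss(fa, fneg) + 1e-7)
    return l1 + lam1 * l_cr
```

With lam1 = 0.5 this matches the weighting reported in the text.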

Experiments

Implementation Details
We used PyTorch 1.11.0 on four NVIDIA RTX 4090 GPUs to conduct all the experiments. During training, the images were randomly cropped to 320 × 320 patches. When calculating model complexity, we set the input size to 128 × 128. We used the Adam optimizer with a decay rate of 0.9 for β_1 and 0.999 for β_2. The starting learning rate was set at 0.00015, and we scheduled it with a cosine annealing strategy. The batch size was set to 64. Empirically, we set the penalty parameter λ to 0.2 and γ to 0.25, and we trained for 80 k steps. We employed Contrastive Regularization (CR) [3] to better restore dehazed images.
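The optimizer and schedule described above can be sketched as follows; the tiny stand-in model and the 3-step loop are illustrative (the actual schedule runs for the full 80 k steps).

```python
import torch

# Sketch of the training configuration: Adam with betas (0.9, 0.999),
# initial lr 1.5e-4, cosine annealing over 80k steps, 320x320 crops.
# `model` is a toy stand-in for HAA-Net.
model = torch.nn.Conv2d(3, 3, 3, padding=1)
opt = torch.optim.Adam(model.parameters(), lr=1.5e-4, betas=(0.9, 0.999))
sched = torch.optim.lr_scheduler.CosineAnnealingLR(opt, T_max=80_000)

for step in range(3):                  # 80k steps in the real schedule
    x = torch.rand(2, 3, 320, 320)     # random 320x320 crops
    loss = (model(x) - x).abs().mean() # placeholder for L1 + lam1 * L_CR
    opt.zero_grad()
    loss.backward()
    opt.step()
    sched.step()
```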

Datasets and Metrics
We used the PSNR and SSIM to evaluate the performance of our HAA-Net. We trained and tested the network on six datasets: RESIDE-Indoor [47], Haze4K [48], RTTS [46], RESIDE-Outdoor [47], NH-HAZE [49], and Dense-Haze [50]. Specifically, the RESIDE-Indoor dataset has a total of 13,990 image pairs. We trained our model using 13,000 of those pairs and then tested the model on the remaining 990 images from the RESIDE-Indoor set. We also conducted training and testing on the RESIDE-Outdoor dataset, which is larger and offers a more diverse set of data; this fully demonstrates the model's generalization capabilities. The Haze4K dataset comprises 4000 image pairs, with 3000 used for training and the remaining 1000 for testing. Compared to RESIDE-Indoor, Haze4K includes both indoor and outdoor scenes, thus making it more realistic. The RTTS dataset consists of 1000 real haze images, which is ideal for assessing the generalization of our model trained on RESIDE-Indoor and Haze4K. It differs significantly from the other two datasets, thus providing a challenging and effective benchmark for evaluating the performance of HAA-Net. The NH-HAZE dataset is made up of 55 image pairs, with 50 pairs used for training and 5 pairs for testing. This setup thoroughly demonstrates our model's ability to handle fog with uneven distribution and varying densities. The Dense-Haze dataset comprises 55 pairs, including hazy images of varying sizes and densities along with their corresponding GT images. We utilized 50 pairs for training and reserved 5 pairs for testing, thereby validating the robustness of our model.

Comparison with State-of-the-Art Methods
Results on Synthetic Datasets. We compared our approach with the state of the art on simulated haze images from the RESIDE-Indoor, Haze4K, and RESIDE-Outdoor datasets. For the RESIDE-Indoor dataset, we can observe visually that KDDN [51], MSBDN [25], and AOD-Net [1] suffered from texture detail loss and color distortion when dealing with small patches of haze (yellow box in Figure 4). They also exhibited edge distortion issues (green box in Figure 4). While DehazeFormer-L [13], Dehamer [52], and FFA-Net [2] produced improved images, they sometimes overly brightened the images, thus darkening certain details (yellow box in Figure 4), and showed slight edge distortion (green box in Figure 4). In contrast, our method excelled in preserving details, the clarity of textures, and color authenticity. For the RESIDE-Outdoor dataset, as shown in Figure 5, AOD-Net [1] and GridDehazeNet [14] left a lot of haze in the images, and FFA-Net [2] and KDDN [51] both left small amounts of residual haze.
Real-World Visual Comparisons. We performed real-world haze tests using samples from the RTTS, NH-HAZE, and Dense-Haze datasets, which are more challenging than synthetic ones. The RTTS dataset includes dense and uneven haze, thereby really testing the robustness and effectiveness of the model. As shown in Figure 2, AOD-Net [1] had large fog remnants, and both GridDehazeNet [14] and FFA-Net [2] had quite a bit of fog left over, with overenhancement in the sky areas. The MSBDN [25], KDDN [51], and Dehamer [52] all had residual fog, and when the fog was heavy, the enhanced images turned out too dark. DehazeFormer-L [13] also had residual fog and severe texture loss. Clearly, the images restored by our HAA-Net are clear in texture and realistic in color, thus closely matching the clean images. This fully demonstrates the superior robustness and effectiveness of our method. The NH-HAZE dataset contains nonhomogeneous real hazy images. As shown in Figure 6, a side-by-side comparison makes clear that our HAA-Net could adapt to haze of different concentrations, achieving a PSNR of 21.32 dB and an SSIM of 0.692, which is significantly better than other state-of-the-art methods. The Dense-Haze dataset includes images with various extents and densities of haze, which poses a more significant challenge for dehazing. Nonetheless, our HAA-Net surpassed the best-performing methods to date, obtaining a PSNR of 18.74 dB and an SSIM of 0.620. These achievements strongly validate that our HAA-Net is capable of effectively handling haze of different magnitudes and concentrations.

Ablation Study
We performed an ablation study on our HAA-Net using the Haze4K dataset, gradually adding the key parts of the model to show how effective each module is, as presented in Table 2. We started with a "Base" model, which is a straightforward U-Net structure with basic 3 × 3 depth convolutions. When the HAAM was added to the model, its performance improved significantly, achieving a PSNR of 31.76 dB and an SSIM of 0.97. What is truly impressive is that the PSNR increased by 6.3 dB with the addition of the HAAM, thus demonstrating the effectiveness of the HAAM module for dehazing images. As we continued to enhance the HAAM, we observed the PSNR rising even higher to 32.32 dB, along with an SSIM of 0.98. When we integrated the MFEM with the HAAM, the model truly excelled, thus achieving a PSNR of 33.46 dB and an SSIM of 0.99. Furthermore, by incorporating SKFusion technology, we elevated the PSNR to a new peak of 33.93 dB while maintaining an SSIM of 0.99. These outcomes not only validate the effectiveness of the modules we have developed but also establish HAA-Net as a standout performer in the field of image dehazing.

Limitations
While our method shows excellent performance, it is not without its limitations. Specifically, due to the high complexity of our HAA-Net model and the inclusion of attention mechanisms, the number of parameters is relatively high. This could lead to increased computational costs and pose challenges in situations where computational resources are constrained. Additionally, although our network's complex design is advantageous for capturing fine-grained features in hazy images, it also results in a more complex model structure. This complexity may potentially affect the model's interpretability and could require more training data to achieve optimal performance.
Unfortunately, despite delivering remarkable results, our model, with a parameter count of 18.7 million and a computational complexity of 122.48 GMacs, still requires further optimization to be deployable on embedded devices.The deployment on embedded systems will necessitate further trade-offs between the actual performance and operational speed.

Conclusions
In this paper, we have introduced the HAA-Net, a novel image dehazing framework that uses the U-Net structure and includes HAABs. Each HAAB is made up of two key parts: the Haze-Aware Attention Module and the Multiscale Frequency Enhancement Module. The HAAM cleverly mixes in physical rules at the feature level, which helps the network pick up more useful underlying details during image restoration. By including these physical models in our HAA-Net, we have obtained impressive results when clearing up real-world hazy images. Meanwhile, the MFEM focuses on pulling out frequency features using a multiscale field of view and highlights important information across different channels, thus making it great for dealing with fog of all sizes and densities. We put our model to the test on both synthetic and real-world datasets, and the thorough evaluations show that the HAA-Net is robust and effective for all kinds of dehazing tasks. It is clear that our method outperforms other state-of-the-art methods, thus proving its potential as a leading solution in the fields of image processing and computer vision.

Figure 1. The overview of our Haze-Aware Attention Network architecture. We give details of the structure and configurations in Section 3. SKFusion [45] is a feature fusion method.
Figure 2. Visual results comparisons on real-world hazy images from the RTTS dataset [46]. Zoom in for best view.

Figure 6. Visual results comparisons on real-world hazy images from the NH-HAZE dataset [49]. Zoom in for best view.
Figure 3. Multiscale Frequency Enhancement Module. W_(k×k), k × k ∈ {3 × 3, 5 × 5, 7 × 7}, represents the channel attention weight maps from the filters of various scales. GAP stands for Global Average Pooling. AP_(k×k) means an Average Pooling operation with a kernel size of k × k. Modulation is a process that recalibrates the channels by setting attention weights as directly learnable parameters, without adding any extra layers. Learnable parameters are adjustable values that help adjust the weights at different scales.