1. Introduction
Image quality is critically important in vision-based applications such as object detection, intelligent driving, and video surveillance systems. However, under foggy conditions, suspended atmospheric particles cause light absorption, refraction, and reflection, significantly degrading image visibility, contrast, texture details, and color fidelity. This degradation severely impairs the performance of computer vision algorithms, making defogging an essential preprocessing step for visual perception tasks. The critical importance of defogging is further underscored by its diverse applications: In intelligent driving systems, it restores obscured traffic elements (e.g., pedestrians and road signs), directly improving object detection accuracy and mitigating safety risks. For remote sensing and aerial monitoring, it corrects atmospheric distortions in satellite/UAV imagery, enabling precise land cover classification and disaster assessment. In surveillance security systems, edge-preserving algorithms recover identifiable features (e.g., faces and license plates) from foggy footage for forensic analysis. For drone-based topographic mapping, it rectifies contrast loss in mountainous terrain where localized fog causes significant detail distortion. These mission-critical applications highlight defogging’s societal impact in maintaining computer vision reliability under environmental degradation. To address these needs, current defogging algorithms are primarily categorized into three approaches: image enhancement methods [1,2,3], image restoration techniques [4,5,6,7], and hybrid approaches [8,9,10].
Image enhancement techniques aim to improve the visual quality of images by enhancing contrast and sharpness. These methods directly process the image to remove haze and improve visual clarity. Although they do not explicitly model the physical effects of atmospheric scattering, they can effectively address issues such as uneven illumination and enhance image details, which makes them particularly useful for improving the accuracy of atmospheric light estimation and transmittance computation in hazy images. Early enhancement algorithms were based on the gray-level histogram of the image, such as histogram equalization [11], gamma correction [12], and logarithmic transformation [13]. These methods, however, often fail to consider the local relationships between pixels, leading to potential over-enhancement or under-enhancement of the image. More recent approaches, such as those based on wavelet transforms [14] and Retinex theory [15,16,17,18], have shown greater promise in addressing these limitations. Retinex theory, in particular, has gained significant attention in the field of image enhancement. This theory, which models the way the human visual system perceives color and brightness, decomposes an image into illumination and reflectance components. By manipulating these components, Retinex-based methods can effectively enhance image details and improve overall visual quality. For instance, the Center–Surround Retinex (CSR) model computes the local contrast of an image, which can be used to enhance details in dark areas and improve the overall image quality [15]. Zhuang et al. [16] proposed a super-Laplacian prior for reflectance and used an L1-norm regularization on the multi-level gradient of the reflectance to highlight important structural elements and fine-scale details in images. In the context of non-uniform illumination, Retinex-based methods have been particularly effective. These methods simulate the perception of physical reflectance images by the visual system through the computation of contrast: the image is decomposed into perceived reflectance and perceived illumination components, and the dynamic range of the illumination component is then adjusted in accordance with the visual system’s response to light intensity, thereby achieving image enhancement. This approach not only enhances the details in under-exposed regions but also maintains the overall naturalness of the image [17]. Specifically, when the perceived reflectance and perceived illumination components of an image are computed using Center–Surround Retinex (CSR) models, adjusting the dynamic range of the perceived illumination component enhances the details in dark areas while maintaining the naturalness of the image [18]. This method has shown promising results in improving the visual quality of images with non-uniform illumination.
Image restoration defogging approaches are grounded in the physics of aerosol scattering on imaging. An atmospheric scattering model describes this effect, as seen in Figure 1. The characteristics of a clear image are analyzed to estimate the transmission map and atmospheric light intensity of a foggy image, and the model’s inverse process is then carried out to recover a haze-free image. Through a statistical analysis of extensive outdoor clear-image datasets, researchers such as He et al. observed that in most non-sky patches, at least one RGB channel exhibits very low intensity values, often approaching zero. Dark channel prior (DCP) theory [19], which is used to create a transmission map and calculate atmospheric light intensity, was put forth in response to this phenomenon. He et al. refined the transmission map obtained from the DCP using a guided filter [20]. However, this method fails to account for sky regions, leading to inaccurate estimates that cause artifacts such as intense halos and color distortion in dehazed images. To address these limitations, numerous researchers have proposed methods focusing on refining transmission maps and atmospheric light estimation. Zhen et al. presented a haze reduction approach based on global illumination adjustment [21] to lessen the effects of background interference on atmospheric light and transmission map estimation. Kong et al. presented a vector orthogonality-based technique [22] for determining the amplitude and direction of the atmospheric light vector. To accomplish accurate transmission map estimation, Cheng et al. applied gamma correction [23] to the transmission map obtained from the dark channel and then fused the original and corrected transmission maps using a weighted averaging image fusion algorithm. To obtain accurate transmission maps, Li et al. combined the benefits of the image enhancement and image restoration defogging procedures by introducing a Gaussian-weighted image fusion method [24]. To restore transmission maps and atmospheric light intensity, Guo et al. [25] suggested a Rayleigh scattering and adaptive color compensation method that uses chromaticity and luminance variations in areas where the dark channel prior is ineffective for regional segmentation. Hong et al. proposed a pixel-level transmission estimation technique based on estimated radiance patches to address outliers caused by depth discontinuities during the refinement of initial transmission maps [26]. Meng et al. [27] employed a boundary-constrained method for coarse transmittance estimation, followed by refinement using weighted L1-norm regularization. Fattal et al. [28] presented a haze removal method based on independent component analysis; however, it is ineffective under dense haze conditions. Tarel et al. [29] presented a fast defogging technique based on the atmospheric dissipation coefficient. Although this technique offers high processing speed, it requires extensive parameter tuning, its performance is highly sensitive to parameter changes, and it can cause brightness reduction and halo artifacts at depth edges in dehazed images. The color attenuation prior (CAP) [30] was derived from Zhu et al.’s observation that haze concentration is correlated with the difference between brightness and saturation in hazy images. Based on this prior, a linear regression model extracts scene depth information from hazy images, which is then combined with the atmospheric scattering model for haze removal.
Image fusion-based defogging algorithms aggregate and recombine pertinent or valuable information [31] from multiple input sources to provide high-quality visual data. This approach usually first applies certain enhancement techniques to make hazy images clearer and then integrates the improved images using fusion algorithms to produce a dehazed output. For example, Zhu et al. applied nonlinear gamma correction [32] with different settings to enhance hazy images, increasing the robustness of image defogging; the final haze-free image was then created by fusing four luminance-adjusted images. For defogging, Zheng et al. combined multi-exposure image fusion with adaptive structural decomposition [33]. As deep learning has advanced, researchers have turned their attention to multi-feature fusion algorithms based on image fusion, which greatly enhance defogging performance through dynamic optimization mechanisms and multi-level feature interaction. To significantly improve detail recovery in areas with dense fog, Wang et al. devised MFID-Net [34], which uses a multi-feature fusion (MF) module that dynamically weights channel- and pixel-level information.
The effectiveness of image enhancement techniques for defogging in the approaches discussed above is limited: the excessive suppression of high-frequency components during enhancement can blur edge details, and artifacts can be introduced during the frequency-domain inverse transformation. Nonetheless, these techniques can handle some inhomogeneous haze conditions and produce better outcomes when paired with restoration-based defogging algorithms. While supervised deep learning-based fusion techniques typically require well-annotated samples (incurring acquisition costs), unsupervised methods (e.g., GANs) and synthetic data generation (e.g., 3D-rendered hazy images) offer alternatives that reduce annotation dependency. Nevertheless, hybrid approaches combining physical models with efficient learning remain competitive in latency-sensitive applications. To overcome problems inherent in the dark channel prior approach, including imprecise atmospheric light estimation, halo artifacts, and color distortion in sky regions, this study combines the benefits of image enhancement and restoration techniques. We propose a defogging approach that integrates Retinex theory with the dark channel prior. This algorithm’s primary contributions are as follows:
- (1)
A naturalness-preserving Retinex algorithm for non-uniform illumination image enhancement is applied to reduce the effects of non-uniform lighting on the estimation of atmospheric light intensity and transmission maps, improve contrast, and enrich the detail information of foggy images.
- (2)
Since the estimation of atmospheric light intensity is susceptible to interference from bright objects, the Otsu algorithm and edge extraction are used to exclude bright-object regions from the image, enabling an accurate estimate of the atmospheric light value.
- (3)
To solve the halo effect and the color distortion of the sky region in the defogged image, the dark channel is compensated by mean filtering to suppress the halo effect and the transmittance is optimized; furthermore, the color distortion is resolved by correcting the transmittance of the sky region of the dark channel using a depth map estimated with the CAP prior.
This paper’s remaining sections are arranged as follows: Section 2 introduces the theoretical basis and key elements of Retinex-based image enhancement and dark channel prior defogging. Section 3 introduces a novel defogging algorithm that integrates the dark channel prior with Retinex theory to address the limitations of existing defogging methods. Section 4 presents both subjective assessments and quantitative comparisons with current state-of-the-art defogging algorithms, followed by conclusions and discussions in Section 5.
3. The Proposed Methods
Figure 2 shows the flow of the image defogging algorithm. First, we perform Retinex-based enhancement on the input image. Next, the enhanced image is binarized using Otsu thresholding to segment potentially dense haze and high-brightness regions. The binarization result is then fused with an edge map to isolate haze-affected regions while excluding bright objects. To improve the localization accuracy of the atmospheric light region, a morphological dilation operation is performed on the fusion result, yielding an approximate atmospheric light region. Within this region, the pixel values of the enhanced image at the positions of the top 0.1% brightest pixels are selected, and their average is taken as the atmospheric light estimate. When computing the dark channel map, an edge-preserving optimization is introduced to protect the edge information of the image: a mean filter is used to obtain a mean dark channel map, and a weighted fusion of the minimum dark channel map (the original dark channel image) and the mean dark channel map compensates the original dark channel while refining the transmittance. Furthermore, the depth map estimated using the CAP is used to correct the transmittance values of the dark channel prior in the sky region. Finally, we recover the fog-free image using the atmospheric scattering model.
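For readers who prefer a procedural overview, the sketch below summarizes the flow of Figure 2 in Python/NumPy. The helper names retinex_enhance, estimate_atmospheric_light, compensated_dark_channel, and correct_sky_transmission are hypothetical placeholders for the steps detailed in Sections 3.1–3.3, and the transmission formula shown is the standard dark channel prior estimate; the authors implement the method in MATLAB 2023a, so this is an illustration under stated assumptions rather than the actual code.

```python
import numpy as np

def defog(img_bgr, t0=0.1):
    """Sketch of the overall flow in Figure 2; helper names are illustrative placeholders."""
    # 1. Naturalness-preserving Retinex enhancement of the hazy input.
    enhanced = retinex_enhance(img_bgr)            # hypothetical helper

    # 2. Atmospheric light A from Otsu segmentation fused with Sobel edges (Section 3.2).
    A = estimate_atmospheric_light(enhanced)       # per-channel vector, hypothetical helper

    # 3. Compensated dark channel and transmission estimate (Section 3.1).
    dark = compensated_dark_channel(enhanced / A)  # hypothetical helper
    t = 1.0 - 0.95 * dark                          # standard DCP-style transmission (assumption)

    # 4. CAP-guided correction of the transmission in sky regions (Section 3.3).
    t = correct_sky_transmission(enhanced, t)      # hypothetical helper
    t = np.clip(t, t0, 1.0)                        # lower bound avoids over-amplification

    # 5. Invert the atmospheric scattering model: J = (I - A) / t + A.
    I = enhanced.astype(np.float64)
    J = (I - A) / t[..., None] + A
    return np.clip(J, 0, 255).astype(np.uint8)
```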
3.1. Dark Channel Prior Compensation
When calculating the dark channel values in regions with large depth-of-field changes at image edges, the minimum filter tends to produce dark channel values that are too low, leading to overestimated transmittance values and, in turn, severe halo effects in the defogged image; a mean filter is therefore used to obtain a mean dark channel map. The mean filter raises the local dark channel value and reduces the local transmittance, but it may also darken the overall color of the recovered image. We therefore use a weighted combination of the mean-filtered and minimum-filtered values as the new dark channel value in halo regions, which suppresses the halo effect while limiting the overall darkening.
First, the per-pixel minimum of the enhanced image over the R, G, and B channels is computed to obtain the channel-minimum map $M(x)$, as in Equation (15):

$$M(x)=\min_{c\in\{R,G,B\}} I_{e}^{c}(x) \qquad (15)$$

where $c$ denotes the R, G, and B color channels and $I_{e}^{c}$ denotes channel $c$ of the Retinex-enhanced image.
Subsequently, the result obtained from Equation (15) is subjected to minimum filtering and mean filtering to derive the dark channel map and the mean dark channel map, respectively. The expressions are as follows:

$$D_{\min}(x)=\min_{y\in\Omega(x)} M(y) \qquad (16)$$

$$D_{\mathrm{mean}}(x)=\frac{1}{\left|\Omega(x)\right|}\sum_{y\in\Omega(x)} M(y) \qquad (17)$$

where $\Omega(x)$ denotes the local filtering window centered at pixel $x$, $D_{\min}$ denotes the original dark channel image, and $D_{\mathrm{mean}}$ represents the mean dark channel image.
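As a concrete illustration of Equations (15)–(17), the following Python/OpenCV sketch computes the channel-minimum map, the minimum-filtered dark channel, and the mean-filtered dark channel. The 15 × 15 window size is an illustrative assumption rather than a value prescribed by the paper.

```python
import cv2
import numpy as np

def dark_channel_maps(img, patch=15):
    """Channel-minimum map (Eq. 15), min-filtered dark channel (Eq. 16),
    and mean-filtered dark channel (Eq. 17) for a float image in [0, 1]."""
    # Eq. (15): per-pixel minimum over the R, G, and B channels.
    m = img.min(axis=2).astype(np.float32)

    kernel = np.ones((patch, patch), np.uint8)
    # Eq. (16): grayscale erosion acts as a local minimum filter -> original dark channel.
    d_min = cv2.erode(m, kernel)
    # Eq. (17): box filter gives the local mean -> mean dark channel.
    d_mean = cv2.blur(m, (patch, patch))
    return m, d_min, d_mean
```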
Mean filtering increases the dark channel value in a local area, thereby reducing the transmittance in that area and improving the defogging effect, but it may darken the overall color of the defogged image. Therefore, this paper adopts the weighted combination of the mean-filtered and minimum-filtered values as the new dark channel value of the halo region, in order to suppress the halo effect while minimizing the darkening of the overall color. The halo-region value is calculated as follows.
The absolute difference between the original dark channel image $D_{\min}$ and the channel-minimum grayscale map $M$ is computed at each pixel to obtain the difference image $E_{d}$, as in Equation (18):

$$E_{d}(x)=\left|D_{\min}(x)-M(x)\right| \qquad (18)$$

where $E_{d}$ denotes the edge image obtained from the difference, which marks the halo regions.
Finally, an adaptive threshold $T$ is introduced, whereby pixels whose difference $E_{d}(x)$ is greater than or equal to $T$ are identified as edge pixels and require correction. This correction is achieved through the weighted fusion of the original dark channel map and the mean dark channel map, resulting in a refined dark channel image $D_{\mathrm{new}}$. The optimal value of the weight $\omega$, determined through multiple experiments, is 0.8, as in Equations (19) and (20):

$$D_{\mathrm{new}}(x)=\omega D_{\mathrm{mean}}(x)+(1-\omega)D_{\min}(x),\qquad E_{d}(x)\ge T \qquad (19)$$

$$D_{\mathrm{new}}(x)=D_{\min}(x),\qquad E_{d}(x)<T \qquad (20)$$
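Continuing the previous sketch, the fragment below applies Equations (18)–(20): it forms the difference image, marks halo-prone pixels with a threshold, and blends the mean and minimum dark channels there. The threshold value T = 0.1 is an illustrative assumption; only the weight ω = 0.8 comes from the paper.

```python
import numpy as np

def compensate_dark_channel(m, d_min, d_mean, T=0.1, w=0.8):
    """Halo-region correction of Eqs. (18)-(20); T is illustrative, w = 0.8 follows the paper."""
    # Eq. (18): absolute difference between the dark channel and the channel-minimum map,
    # which highlights depth-discontinuity (halo-prone) regions.
    e_d = np.abs(d_min - m)

    halo = e_d >= T                      # edge pixels that require correction
    d_new = d_min.copy()
    # Eqs. (19)-(20): blend the mean and minimum dark channels only inside halo regions.
    d_new[halo] = w * d_mean[halo] + (1.0 - w) * d_min[halo]
    return d_new
```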
The weighting coefficient $\omega=0.8$ is determined by minimizing the halo artifact metric $H$, defined as the variance of the gradient magnitude in depth-discontinuous regions:

$$H(\omega)=\operatorname{Var}_{x\in\Omega_{e}}\!\left(\lVert\nabla J_{\omega}(x)\rVert\right)$$

where $\Omega_{e}$ denotes the set of edge pixels and $J_{\omega}$ denotes the image restored with weight $\omega$. As shown in Figure 3d, $H$ reaches its minimum at $\omega=0.8$ across 100 test images from the RESIDE dataset. Values within [0.75, 0.85] yield comparable results ($H<1.2\min(H)$), confirming robustness.
Figure 3 presents the enhanced foggy image, the minimum-value dark channel map, the mean-compensated dark channel image, and a sensitivity analysis of the weighting coefficient $\omega$. In Figure 3b, blocky artifacts are observed in regions with significant depth-of-field variation (building edges and tree branches), which directly contribute to halo artifacts in the restored image. These artifacts manifest as unnatural bright/dark bands along depth discontinuities, degrading visual quality. Figure 3c demonstrates substantial improvements: 1. enhanced detail preservation at object boundaries; 2. smoother transitions in homogeneous regions (uniform sky areas); 3. effective suppression of halo artifacts caused by abrupt depth changes. The quantitative basis for these improvements is established in Figure 3d, where the halo artifact metric $H$ shows a distinct minimum at $\omega=0.8$. Halo artifact reduction: there are 42% fewer halo artifacts than when using $\omega=0.5$. Edge–halo balance: edge preservation is 23% better than when using $\omega=1.0$, while halo suppression remains effective. The synergistic relationship between the visual improvements in Figure 3c and the quantitative optimization in Figure 3d confirms that $\omega=0.8$ optimally addresses the artifacts shown in Figure 3b while preserving critical image details.
3.2. Optimization of Atmospheric Light Values
The dark channel prior defogging algorithm selects the top 0.1% brightest pixels in the dark channel map as the global atmospheric light estimate, assuming that the brightest position corresponds to a point at infinity. When a bright object is present in the image, however, the algorithm is incorrectly localized on that object, resulting in an erroneous atmospheric light estimate. In this paper, we adopt the Otsu algorithm combined with edge extraction to estimate the atmospheric light value.
First, the Otsu algorithm is used to perform binary segmentation on the enhanced image in Figure 4a; the segmented image is shown in Figure 4b. The binarization is expressed as

$$B(x,y)=\begin{cases}1, & I_{e}(x,y)\ge T_{b}\\ 0, & I_{e}(x,y)<T_{b}\end{cases}$$

In this formula, $I_{e}(x,y)$ represents the pixel value of the enhanced image; $T_{b}$ represents the optimal threshold for binary segmentation, which takes the value of 0.65; the value 1 represents the foggy and bright areas of the image; and the value 0 represents the fog-free and lightly fogged areas. The white car in the image is a bright object that affects the judgment of the atmospheric light region and needs to be excluded separately. Considering that the overall gray level of foggy images is relatively smooth and contains considerable noise, and that the Sobel operator locates edges relatively accurately in high-noise images with gradual grayscale changes, the Sobel operator is used to extract the edges. The extraction results are shown in Figure 4c. The convolution kernels in the $x$ and $y$ directions are $S_{x}$ and $S_{y}$, respectively:

$$S_{x}=\begin{bmatrix}-1 & 0 & 1\\ -2 & 0 & 2\\ -1 & 0 & 1\end{bmatrix},\qquad S_{y}=\begin{bmatrix}-1 & -2 & -1\\ 0 & 0 & 0\\ 1 & 2 & 1\end{bmatrix}$$

$$G_{x}=S_{x}\ast I_{e},\qquad G_{y}=S_{y}\ast I_{e} \qquad (23)$$

where $I_{e}$ denotes the Retinex-enhanced image, and $S_{x}$ and $S_{y}$ denote the convolution kernels in the $x$ and $y$ directions, respectively.
The gradients $G_{x}$ and $G_{y}$ obtained from Equation (23) are combined by summing their absolute values to give the edge extraction result $E(x,y)$:

$$E(x,y)=\left|G_{x}(x,y)\right|+\left|G_{y}(x,y)\right|$$

where $E(x,y)$ represents the edge extraction result.
To eliminate the influence of highlighted objects on the atmospheric light region, we fuse the result of the Otsu algorithm with the edge extraction result. The specific fusion process is as follows: the edge extraction result $E$ is first inverted, so that the original white edge pixels become black and the original black non-edge pixels become white; a pixel-wise AND operation is then performed between the Otsu result $B$ and the inverted edge map $\bar{E}$ to obtain the fusion result $F$, which removes the highlighted object. The fusion result is shown in Figure 4d and is expressed as

$$F(x,y)=B(x,y)\cdot\bar{E}(x,y)$$

where $F$ denotes the fusion result. Since $F$ contains holes and cracks as well as a large amount of noise, a morphological dilation operation is performed on $F$ to improve the localization accuracy of the atmospheric light region, yielding the approximate atmospheric light region $R_{A}$, as shown in Figure 4e. The pixels of the enhanced foggy image at the positions of the top 0.1% brightest pixels within $R_{A}$ are then selected, and their average is taken as the atmospheric light estimate.
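The atmospheric light estimation of this subsection can be prototyped as follows. The Sobel edge threshold, the dilation kernel size, and the use of grayscale intensity to rank the top 0.1% brightest pixels are illustrative assumptions; the fusion step implements the pixel-wise AND of the Otsu mask with the inverted edge map described above.

```python
import cv2
import numpy as np

def estimate_atmospheric_light(enhanced_bgr, edge_thresh=40, dilate_size=15, top=0.001):
    """Atmospheric light from Otsu segmentation plus Sobel edge exclusion (Section 3.2)."""
    gray = cv2.cvtColor(enhanced_bgr, cv2.COLOR_BGR2GRAY)

    # Otsu binarization: foggy / bright regions become white.
    _, otsu = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)

    # Sobel gradients (Eq. 23) and edge map from |Gx| + |Gy|.
    gx = cv2.Sobel(gray, cv2.CV_64F, 1, 0, ksize=3)
    gy = cv2.Sobel(gray, cv2.CV_64F, 0, 1, ksize=3)
    edges = (np.abs(gx) + np.abs(gy)) > edge_thresh

    # Fusion: keep Otsu foreground that does not coincide with strong edges (bright objects).
    fused = (otsu > 0) & ~edges

    # Morphological dilation fills holes and cracks -> approximate atmospheric light region.
    kernel = np.ones((dilate_size, dilate_size), np.uint8)
    region = cv2.dilate(fused.astype(np.uint8), kernel) > 0

    # Average the enhanced-image pixels at the top 0.1% brightest positions inside that region.
    cutoff = np.percentile(gray[region], 100 * (1 - top))
    mask = region & (gray >= cutoff)
    return enhanced_bgr[mask].mean(axis=0)   # per-channel atmospheric light estimate
```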
3.3. The Correction of the Transmittance Value in the Sky Area
Because the sky region has a high average brightness and contains almost no dark elements, the calculated transmittance there is too small, so the final result is over-amplified and the sky region suffers color distortion. The color attenuation prior proposed by Zhu et al. [30] estimates the scene depth map by building a linear model of depth versus brightness and saturation and therefore performs better in sky regions. We thus use the depth map estimated by the CAP to correct the transmittance of the DCP in the sky region; this reflects the fog concentration in these regions more accurately, allowing the CAP depth map to compensate for the deficiency of the DCP there.
According to the color attenuation prior, the fog concentration at any pixel of a foggy image is positively correlated with the difference between the brightness and the saturation of that pixel, as shown in Equation (26):

$$c(x)\propto v(x)-s(x) \qquad (26)$$

where $c(x)$ denotes the fog density at pixel $x$, $v(x)$ denotes the pixel brightness, and $s(x)$ denotes the pixel saturation. The linear model expressing depth in terms of brightness and saturation is as follows:

$$d(x)=\theta_{0}+\theta_{1}v(x)+\theta_{2}s(x)+\varepsilon(x) \qquad (27)$$

where $\theta_{0}$, $\theta_{1}$, and $\theta_{2}$ denote the linear coefficients ($\theta_{0}=0.121779$, $\theta_{1}=0.959710$, $\theta_{2}=-0.780245$), and $\varepsilon(x)$ denotes the random error of the model with zero mean. Substituting Equation (27) into Equation (9) produces the following:

$$t_{\mathrm{CAP}}(x)=e^{-\beta d(x)} \qquad (28)$$

where $t_{\mathrm{CAP}}(x)$ denotes the transmittance derived from the color attenuation prior and $\beta$ is the scattering coefficient. Subsequently, the sky region is detected using dual thresholds for luminance and saturation, as in Equation (29):

$$S(x)=\begin{cases}1, & v(x)>T_{v}\ \text{and}\ s(x)<T_{s}\\ 0, & \text{otherwise}\end{cases} \qquad (29)$$

When $S(x)$ equals 1, the pixel belongs to the sky region in the image; a value of 0 signifies a non-sky area. $T_{v}$ is the luminance threshold and $T_{s}$ is the saturation threshold, with $T_{v}=0.85$ and $T_{s}=0.12$ based on sky-region statistics from 100 foggy images in the RESIDE dataset. Within the sky region, weights are adjusted according to luminance and saturation to prevent edge discontinuities caused by segmentation, and a weighting function is employed to correct the transmittance in the color-distorted areas, where $t_{\mathrm{DCP}}(x)$ denotes the transmittance obtained by the original dark channel prior defogging algorithm, as in Equations (30) and (31).
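A sketch of the CAP-guided sky correction is given below. The scattering coefficient β = 1.0 and the smooth brightness-driven blending weight are illustrative assumptions standing in for Equations (30) and (31), whose exact weighting function is defined by the paper; the depth coefficients are those reported by Zhu et al. [30].

```python
import cv2
import numpy as np

def correct_sky_transmission(enhanced_bgr, t_dcp, beta=1.0, t_v=0.85, t_s=0.12):
    """CAP-based correction of the DCP transmission in sky regions (Section 3.3)."""
    hsv = cv2.cvtColor(enhanced_bgr, cv2.COLOR_BGR2HSV).astype(np.float64)
    s = hsv[..., 1] / 255.0          # saturation in [0, 1]
    v = hsv[..., 2] / 255.0          # brightness (value) in [0, 1]

    # Eq. (27): linear CAP depth model with the coefficients of Zhu et al.
    d = 0.121779 + 0.959710 * v - 0.780245 * s
    # Eq. (28): transmission from depth, t = exp(-beta * d).
    t_cap = np.exp(-beta * d)

    # Eq. (29): sky mask from dual thresholds on brightness and saturation.
    sky = (v > t_v) & (s < t_s)

    # Stand-in for Eqs. (30)-(31): a brightness-driven weight that moves smoothly
    # toward the CAP transmission inside the sky region to avoid edge discontinuities.
    w = np.clip((v - t_v) / (1.0 - t_v), 0.0, 1.0) * sky
    return (1.0 - w) * t_dcp + w * t_cap
```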
3.4. An Evaluation of the Results of Image Defogging
At present, the commonly used quality evaluation standards for restored images fall into two categories: subjective evaluation and objective evaluation. Subjective evaluation considers the human visual experience of an image, while objective evaluation relies on quantitative evaluation parameters and standards. Usually, a combination of subjective and objective evaluation is used.
3.4.1. Subjective Evaluation
Subjective evaluation is a qualitative method of analysis. It is based on people’s subjective feelings and visual experience and judges the quality of an image by observing its clarity, resolution, color accuracy, and level of detail. This evaluation method requires considerable time and human resources and is easily affected by subjective impressions, but combining it with objective evaluation indicators makes the assessment more reliable.
3.4.2. Objective Evaluation
Objective evaluation is a quantitative analysis methodology based on specific parameters and metrics, categorized into reference-based and non-reference-based approaches. It evaluates images by analyzing pixel values, structural information, and statistical features to provide empirical data support. Reference-based evaluation requires fog-free images of the same scene as a benchmark, typically utilizing synthetically generated image pairs for assessment. Non-reference evaluation does not require the original clear image; it directly analyzes a single dehazed image. Considering the difficulty of acquiring clear images in real foggy conditions, we adopt the visibility assessment method based on perceptual edges proposed by Hautière et al. as an objective criterion [37] for algorithm performance comparison. This method evaluates image visibility through three specific metrics.
- (1)
Ratio of visible edges ($e$):
The visibility of edges in the dehazed image is quantified by the visible edge ratio $e$, where a higher value of $e$ indicates a greater number of detectable edges and more detailed information in the dehazed image. Its mathematical expression is as follows:

$$e=\frac{n_{r}-n_{0}}{n_{0}}$$

where $n_{0}$ denotes the number of visible edges in the original foggy image, and $n_{r}$ denotes the number of visible edges after image defogging.
- (2)
Ratio of average gradient ($\bar{r}$):
The mean visibility of the dehazed image is quantified by the average gradient ratio $\bar{r}$, where a higher value of $\bar{r}$ indicates a clearer dehazed image. Its mathematical expression is as follows:

$$\bar{r}=\frac{\bar{g}_{r}}{\bar{g}_{0}}$$

where $\bar{g}_{r}$ denotes the mean gradient of the dehazed image, and $\bar{g}_{0}$ denotes the mean gradient of the original foggy image.
- (3)
Saturation pixel ratio ($\sigma$):
The number of completely black and completely white pixels in the dehazed image is quantified using the saturated pixel percentage $\sigma$. A lower value of $\sigma$ indicates higher image contrast and a more effective defogging process, expressed by the following formula:

$$\sigma=\frac{n_{s}}{\dim_{x}\times\dim_{y}}$$

where $n_{s}$ denotes the number of completely black or white pixels in the dehazed image, $\dim_{x}$ denotes the number of rows in the image, and $\dim_{y}$ denotes the number of columns in the image.
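For reference, the three indicators can be approximated with the sketch below. The gradient-threshold edge detector is a simplification of the visible-edge definition of Hautière et al. [37], so the snippet should be read as an illustration of the formulas rather than a faithful reimplementation of that assessment method.

```python
import numpy as np

def blind_assessment(foggy_gray, dehazed_gray):
    """Approximate e, r-bar, and sigma for two grayscale uint8 images of the same scene."""
    def grad_mag(img):
        gy, gx = np.gradient(img.astype(np.float64))
        return np.hypot(gx, gy)

    g0, gr = grad_mag(foggy_gray), grad_mag(dehazed_gray)
    visible = 0.05 * 255                        # illustrative visibility threshold
    n0 = np.count_nonzero(g0 > visible)         # visible edges in the foggy image
    nr = np.count_nonzero(gr > visible)         # visible edges in the dehazed image

    e = (nr - n0) / max(n0, 1)                  # ratio of newly visible edges
    r_bar = gr.mean() / max(g0.mean(), 1e-6)    # ratio of average gradients
    ns = np.count_nonzero((dehazed_gray == 0) | (dehazed_gray == 255))
    sigma = ns / dehazed_gray.size              # saturated (black/white) pixel ratio
    return e, r_bar, sigma
```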
4. Experimental Outcomes and Discussion
The hardware environment for this experiment comprises an 11th Generation Intel(R) Core(TM) i7-11800H CPU operating at 2.30 GHz and running Windows 11. The technique is implemented on the MATLAB 2023a platform. To verify the effectiveness and generalizability of this technique, we conducted comparative experiments with the MSR [15], DCP [19], Meng [27], Tarel [29], and Grid_Net [38] algorithms.
As shown in Figure 5, we selected 10 test images: 6 self-selected images (Img1–6) and 4 public dataset images (Img7–10). Img7 and Img8 are from the NH-HAZE dataset, and Img9 and Img10 are from the RESIDE dataset. Img1, Img3, Img4, Img5, Img7, and Img9 contain sky regions, while Img2, Img6, Img8, and Img10 do not. A comprehensive evaluation of the Retinex and dark channel prior defogging algorithms was performed using both subjective and objective evaluations, with particular attention paid to halo artifact suppression in non-sky regions, color fidelity in sky scenes, and consistency across datasets.
4.1. Subjective Evaluation
With an emphasis on image contrast, feature preservation, and color fidelity, Figure 6 shows the defogging outcomes for non-sky images from multiple datasets: Img2 and Img6 (self-collected), Img8 (NH-HAZE), and Img10 (RESIDE). The original hazy images exhibit monochromatic tones, low contrast, and fog-obscured details.
The MSR enhancement technique improves contrast but retains noticeable haze, failing to enhance overall visibility. For non-sky images, the conventional dark channel prior (DCP) algorithm boosts contrast and recovers significant features without major color distortion, though mild halo artifacts persist at depth discontinuities. Meng et al.’s technique restores natural colors effectively but lacks flexibility in complex scenes, as evidenced by obscured low-contrast details. Tarel et al.’s method achieves satisfactory contrast but introduces localized color variations (bluish tint on roads in Img8). Grid_Net robustly removes haze, but it overly smooths the texture, and the defogged images show relatively severe color distortion (such as the colors of houses in Img2 and trees in Img5).
In contrast, our proposed algorithm achieves an optimal balance among contrast enhancement (vivid foreground vegetation in Img5), detail recovery (preserved road structures in Img8), and color integrity (natural tones without halos in Img9). This consistency across self-collected and public datasets (NH-HAZE and RESIDE) demonstrates strong adaptability to diverse haze distributions.
Using several defogging techniques, Figure 7 shows the defogging outcomes for sky-containing images from multiple sources: self-collected data (Img1, Img3, Img4, and Img5), the NH-HAZE benchmark (Img7), and the RESIDE dataset (Img9).
The conventional dark channel prior produces significant sky region artifacts, including color distortion (cyan cast in Img4 and Img9 sky) and halo effects (Img5 tree boundaries). Meng’s weighted L1-norm approach effectively removes haze but introduces color shifts (yellowish buildings in Img3) and inconsistent saturation (darkened skies in Img9). Tarel’s method causes over-saturation (mountain in Img3) and amplifies halo artifacts (Img5 tree edges). Grid_Net demonstrates competitive performance on synthetic haze (Img9) but exhibits limitations in real-world conditions: patchy sky restoration in Img1 and texture loss in dense haze (Img4 foreground).
Our proposed method overcomes these limitations by achieving the following: 1. Eliminating color distortion through depth-guided transmission correction (neutral skies in Img5/Img9). 2. Suppressing halo artifacts via mean dark channel compensation (clean horizon in Img4 and Img5). 3. Preserving structural details in diverse haze densities. This consistent performance across self-collected and public benchmarks validates our approach’s robustness for sky-containing hazy images.
4.2. Objective Evaluation
Three established metrics were used across all datasets: the visible edge ratio ($e$), the average gradient ratio ($\bar{r}$), and the saturation pixel ratio ($\sigma$) (Table 1). Key findings demonstrate our method’s consistent superiority: 1. In terms of edge preservation, the highest $e$ was achieved in Img9 and Img10 (Img9: 4.86 vs. Grid_Net’s 4.01), which is a 68% average improvement over traditional methods on the self-collected data. 2. Regarding structural detail recovery, $\bar{r}$ dominated in all test cases, presenting significant gains of 23% over the DCP in Img4 and 8% over Grid_Net in Img10. 3. With respect to color fidelity, the $\sigma$ value obtained is 0 in Img8 and Img9, outperforming Grid_Net (0 vs. an average $\sigma$ of 0.0009) and showing a minor compromise only in dense haze (Img4’s $\sigma$ is 0.024). The public data confirm the generalizability of these results: on NH-HAZE our method scores 15% higher than Grid_Net, and on RESIDE 12% better than the best alternative. In the comparison with Grid_Net, Grid_Net matches the edge recovery in synthetic haze (Img9) but falls short in color preservation.
In summary, the quantitative analysis presented in Table 1 unequivocally validates the superior performance of the proposed algorithm. The key innovations, mean dark channel compensation and Retinex-enhanced transmittance correction, collectively yield significant improvements across all established metrics. Specifically, mean dark channel compensation demonstrably enhances edge visibility, evidenced by substantial increases in the visible edge ratio (a 32% improvement in Img6). Concurrently, the Retinex preprocessing method combined with refined transmittance estimation significantly boosts structural detail recovery, reflected in consistently higher average gradient ratios. Crucially, our approach excels in preserving color fidelity, achieving near-zero saturation pixel ratios in most test images and effectively mitigating color distortion. This comprehensive superiority over both conventional (MSR, DCP, Meng, Tarel) and deep learning (Grid_Net) benchmarks underscores the efficacy of integrating Retinex theory with the enhanced dark channel prior framework. The robust quantitative results across diverse datasets (self-collected, NH-HAZE, RESIDE) and scene types (with/without sky) confirm the algorithm’s reliability and generalizability in producing high-quality, haze-free images.
4.3. Computational Efficiency Analysis
To analyze computational complexity and runtime performance, we measured the average processing time of all compared algorithms on the complete RESIDE dataset (400 × 400 images). The results in Table 2 show the competitive efficiency of our approach.
As demonstrated in Table 2, the proposed algorithm achieves a favorable balance between computational efficiency and restoration quality. While MSR exhibits the fastest processing time (0.004 s) owing to its computationally lightweight multi-scale convolution, traditional restoration methods such as the DCP (0.018 s) incur higher costs, primarily due to the computationally intensive guided filtering step for transmission refinement. Meng (0.006 s) and Tarel (0.007 s) show moderate efficiency. The deep learning-based Grid_Net, despite its strong performance, is the slowest (0.059 s) due to the inherent complexity of its multi-scale CNN architecture. In contrast, our method processes images in 0.023 s on average, a roughly 2.6× speedup over Grid_Net. This efficiency stems from deliberate design choices within our bi-level optimization framework: (1) the Retinex preprocessing and subsequent parameter estimation steps (atmospheric light via Otsu/edge fusion, compensated dark channel via mean/min fusion) rely primarily on efficient filtering and thresholding operations, avoiding heavy optimization or learning; (2) the depth-guided transmission correction using the CAP leverages a simple linear model, adding minimal overhead compared to deep feature extraction. In practical applications, such as the perception layer of autonomous driving with a real-time constraint of under 100 ms, the proposed defogging algorithm currently consumes an average of 23 ms, which satisfies the 20–100 ms latency constraint. Crucially, this computational efficiency is attained without compromising the significant visual quality improvements demonstrated in Section 4.1 and Section 4.2 and quantified in Table 1. The proposed method thus establishes a practical efficiency–quality trade-off, being substantially faster than state-of-the-art deep learning alternatives while maintaining competitive timing with traditional methods and delivering superior defogging results.