Improving Object Detection Performance by Preprocessing Dehazing with a DCP-Based Lightweight U-Net

Han, Jinru; Han, Yunho; Kim, Jiyoung; Park, Woo-Chan

doi:10.3390/ai7060190

¹ Department of Computer Science and Engineering, Sejong University, Seoul 05006, Republic of Korea

² EXARION, Seoul 05006, Republic of Korea

* Author to whom correspondence should be addressed.

AI 2026, 7(6), 190; https://doi.org/10.3390/ai7060190

Submission received: 10 March 2026 / Revised: 10 May 2026 / Accepted: 19 May 2026 / Published: 25 May 2026

Abstract

Atmospheric scattering caused by fog degrades image quality and significantly reduces the reliability of computer vision systems. Existing dehazing studies have mainly evaluated dehazing performance using pixel-level metrics such as PSNR and SSIM. However, these metrics do not fully reflect the actual impact of dehazing on downstream object detection performance. Therefore, this paper treats image dehazing as a preprocessing step for object detection in foggy environments and analyzes its effect using standard object detection evaluation metrics. The experimental results demonstrate that, under three fog-density conditions, β = 0.005, 0.010, and 0.020, images processed by the DL-U-Net-based dehazing method achieved higher mAP@0.5 values than the corresponding original hazy images, with relative improvements of +0.39%, +6.60%, and +13.37%, respectively. Furthermore, under the dense fog condition of β = 0.020, Recall improved more substantially than Precision. These results indicate that, as fog density increases, dehazing preprocessing becomes more effective in restoring object structural information, reducing missed detections, and enhancing downstream object detection performance.

Keywords:

image dehazing; foggy object detection; dehazing preprocessing; Foggy Cityscapes; YOLO26n; DL-U-Net; mAP

1. Introduction

Recent advances in artificial intelligence (AI), particularly deep learning-based image recognition, have enabled computer vision systems to achieve significant success in real-world applications such as autonomous driving, intelligent surveillance, and unmanned aerial vehicle (UAV) remote sensing [1,2,3,4]. The performance of these systems fundamentally depends on the quality of the input images [5]. However, atmospheric scattering caused by fog, mist, or smoke in complex environments significantly reduces visibility and contrast, thereby undermining detection stability and accuracy [6,7].

Suspended fine particles in the atmosphere degrade images through scattering and absorption, causing contrast reduction, color distortion, and loss of fine details. This degradation not only reduces visual quality but also impairs the performance of deep learning models in high-level visual tasks such as detection, tracking, and semantic segmentation [7,8,9].

To address this, preprocessing that simultaneously improves both image quality and detection accuracy is required. Such an approach ensures that the system reliably performs its tasks even in degraded environments such as fog [10,11].

Existing dehazing studies have focused on low-level metrics such as PSNR and SSIM; even when visual clarity improves, the effect on high-level visual tasks such as detection is often overlooked [12,13]. Some studies still use subjective clarity as a key metric, and the effects of dehazing preprocessing on Precision, Recall, and false negatives (FN) in complex scenes have not yet been analyzed systematically [14].

In object detection tasks under foggy environments, improvements in the visual quality of images do not necessarily lead to improved detection performance. Dehazing preprocessing can enhance image contrast and restore some structural information of objects, but it may also introduce local artifacts and affect the detector’s process of judging object features [15]. Therefore, metrics limited to evaluating visual feature restoration are insufficient to fully explain the actual role of dehazing preprocessing in downstream object detection tasks.

To address this issue, this study applies image dehazing as a preprocessing step for object detection in foggy environments and analyzes changes in detection performance before and after dehazing using standard object detection evaluation metrics. This analysis verifies whether dehazing preprocessing can improve object detection performance in foggy environments.

The experiments were conducted using the Foggy Cityscapes dataset [16]. Foggy Cityscapes was constructed by Sakaridis, Dai, and Van Gool based on the Cityscapes dataset [16,17]. It applies synthetic fog effects based on a physical model to real urban scene images. In this dataset, fog density is controlled by the atmospheric attenuation coefficient β [16]. This study used three β settings, 0.005, 0.010, and 0.020, corresponding to light, moderate, and dense fog scenes, respectively [16]. By comparing detection results between the original hazy images and the dehazed images under each fog-density condition, this study analyzed the effect of dehazing preprocessing on object detection performance.

The experimental results show that mAP@0.5 was higher for the dehazed images than for the original hazy images under all three fog-density conditions. This indicates that dehazing preprocessing improves object–background clarity and makes objects easier to detect by mitigating fog-induced occlusion and contrast degradation. For mAP@0.75, consistent performance improvement was not observed under light or moderate fog conditions. This suggests that artifacts generated during dehazing may have affected bounding box localization. Under dense fog conditions, the restoration of object structural information became more evident, leading to improved detection performance at mAP@0.75. The IoU-averaged metric mAP@0.5:0.95 was likewise higher for the dehazed images than for the original hazy images under all fog-density conditions.

This study aims to analyze the effect of dehazing-based preprocessing on object detection performance in foggy environments. Although visual artifacts may occur during dehazing, the experimental results show that the benefits of contrast enhancement and structural information restoration outweigh their negative effects on object detection performance.

The remainder of this paper is organized as follows. Section 2 reviews related work, and Section 3 presents the object detection evaluation method based on dehazing preprocessing. Section 4 presents the experimental setup and result analysis, and Section 5 concludes the paper and discusses future research directions.

2. Related Works

Early dehazing techniques commonly model image degradation using the atmospheric scattering model (ASM) [18], which is expressed as follows:

I(x) = J(x) t(x) + A (1 - t(x))

Here, I(x) is the observed hazy image, J(x) is the haze-free scene radiance, t(x) is the transmission, and A is the global atmospheric light. Although methods using polarization imagery or multi-view information to estimate the transmission t and atmospheric light A offer physical interpretability, their reliance on multiple frames limits their applicability to single-image dehazing. Single-image inverse-problem approaches rely heavily on statistical assumptions, which can lead to performance variability and limited generalization under non-homogeneous haze or strong illumination conditions.
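As a minimal sketch of the forward model above, the snippet below applies the atmospheric scattering model to a single pixel. It assumes the homogeneous-fog transmission t = exp(−β·d) that underlies Foggy Cityscapes; the depth of 100 m, the atmospheric light A = 0.9, and the function names are illustrative assumptions rather than values taken from this paper.

```python
import math


def transmission(beta, depth_m):
    """Transmission t = exp(-beta * d) for homogeneous fog (assumed form)."""
    return math.exp(-beta * depth_m)


def apply_haze(clear_value, beta, depth_m, airlight=0.9):
    """Atmospheric scattering model: I = J * t + A * (1 - t)."""
    t = transmission(beta, depth_m)
    return clear_value * t + airlight * (1.0 - t)


if __name__ == "__main__":
    # Hypothetical dark pixel (J = 0.2) on an object 100 m from the camera.
    for beta in (0.005, 0.010, 0.020):
        t = transmission(beta, 100.0)
        hazy = apply_haze(0.2, beta, depth_m=100.0)
        print(f"beta={beta:.3f}  t={t:.3f}  I={hazy:.3f}")
```

Under these assumptions, the transmission at 100 m falls from about 0.61 at β = 0.005 to about 0.14 at β = 0.020, so the observed pixel is increasingly dominated by the atmospheric light term.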

The dark channel prior (DCP) [19] is based on the statistical observation that, in haze-free natural images, at least one color channel within a local window has an intensity close to zero. The dark channel of an image I is formulated as follows:

I^{dark}(x) = \min_{y \in \Omega(x)} \left( \min_{c \in \{r, g, b\}} I^{c}(y) \right)

Here, x denotes the current pixel position, and Ω(x) represents a local window centered at x. The variable y denotes a pixel position within the corresponding window, and c represents an RGB color channel. I^{c}(y) denotes the pixel intensity of the c-th color channel at position y in image I. This equation calculates the dark channel value at each position by first selecting the minimum value among the RGB channels at each pixel within the local window Ω(x), and then taking the minimum over the entire local window [19]. Through this process, DCP estimates the transmission and atmospheric light in the atmospheric scattering model and restores a haze-free image.
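The two-step minimum described above can be sketched in plain Python as follows. This is an illustrative, unoptimized version: the image is assumed to be a list of rows of (r, g, b) tuples, the window is clipped at the image border, and the function name and `radius` parameter are assumptions made for this example.

```python
def dark_channel(image, radius=1):
    """Dark channel of an RGB image given as rows of (r, g, b) tuples.

    Step 1: per-pixel minimum over the three color channels.
    Step 2: minimum of that map over a square window of half-width `radius`
            (windows are clipped at the image border).
    """
    height, width = len(image), len(image[0])
    channel_min = [[min(pixel) for pixel in row] for row in image]
    dark = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            rows = range(max(0, y - radius), min(height, y + radius + 1))
            cols = range(max(0, x - radius), min(width, x + radius + 1))
            dark[y][x] = min(channel_min[j][i] for j in rows for i in cols)
    return dark


# Toy 3x3 image: bright everywhere except one pixel with a dark green channel.
bright = (0.9, 0.8, 0.7)
toy_image = [
    [bright, bright, bright],
    [bright, (0.5, 0.1, 0.6), bright],
    [bright, bright, bright],
]
toy_dark = dark_channel(toy_image, radius=1)
```

With a 3 × 3 window, every position in the toy image sees the single dark pixel, so the dark channel is 0.1 everywhere; with `radius=0` only the center position takes that value.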

With advances in deep neural networks, dehazing research has shifted from physics-model-driven approaches to data-driven end-to-end strategies [20]. These approaches directly learn the degradation-to-restoration mapping, eliminating the need for explicit estimation of transmission and atmospheric light [21]. Owing to their strong feature-learning capabilities, deep learning-based dehazing methods can achieve superior visual quality and quantitative performance even in complex scenes.

Recently, several studies have investigated the integration of image enhancement or dehazing with object detection. AOD-Net used an end-to-end dehazing network as a front-end module for Faster R-CNN and demonstrated improved object detection performance on hazy images [22]. IA-YOLO introduced a differentiable image processing module that adaptively enhances input images according to the degree of weather degradation, thereby improving YOLO detection performance in foggy and low-light environments [23]. BAD-Net proposed detection-friendly dehazing by connecting the dehazing and object detection modules in an end-to-end manner [12]. ERUP-YOLO proposed an integrated image-adaptive processing framework that enhances adverse-weather images through differentiable filters, improving YOLO detection performance in foggy and low-light scenes [24].

Among deep learning-based dehazing methods, U-Net-based encoder–decoder architectures have been widely used for image restoration. These architectures extract multi-scale degradation features through the encoder and progressively restore image details through the decoder. In addition, skip connections partially preserve texture and boundary information from shallow layers, helping restore object contours and local structures in hazy images [25,26,27]. The DL-U-Net model [28] used in this study belongs to this class of lightweight dehazing models. Its simple structure and low inference cost make it suitable as a preprocessing module for object detection systems. However, this study does not propose a new network; instead, it uses DL-U-Net as a dehazing preprocessing module and analyzes the impact of dehazing preprocessing on object detection performance.

Existing studies reveal two issues that require further analysis. First, the effect of dehazing on object detection performance is not always positive and may vary depending on fog density, IoU threshold, and object class [12,29]. Second, most studies focus on new network architectures or joint optimization frameworks, while insufficient attention has been given to whether dehazing preprocessing can consistently improve detector performance [12,23,24,29].

This study uses image dehazing as a preprocessing step for object detection in foggy environments and analyzes its actual impact on detection performance. Specifically, this study conducted comparative experiments on original hazy images and dehazed images without modifying the object detector, and evaluated the extent to which dehazing preprocessing contributes to detection performance under each fog-density condition using standard object detection metrics, including Precision, Recall, F1-score, mAP@0.5, mAP@0.75, and mAP@0.5:0.95.

3. Task-Oriented Evaluation of Dehazing Preprocessing

Figure 1 shows the object detection evaluation framework based on dehazing preprocessing. The experiments used images from the Foggy Cityscapes dataset under three fog-density conditions, with atmospheric attenuation coefficients β of 0.005, 0.010, and 0.020. These settings correspond to light, moderate, and dense fog conditions, respectively. This study compares object detection performance with and without dehazing preprocessing under the three fog-density conditions.

As shown in Figure 1, each input hazy image is divided into two evaluation paths. Path A represents the baseline direct path, in which no dehazing is applied and the original hazy image is directly fed into the YOLO26n object detector. Path B represents the proposed path, in which the same hazy image is first processed by the DL-U-Net-based dehazing module, and the resulting dehazed image is then fed into the same YOLO26n object detector. The detection results from the two paths are then compared to evaluate the impact of dehazing preprocessing on object detection performance.

The two paths use the same input procedure, fog-density settings, object detector, and evaluation criteria. The performance differences between the two paths indicate how effectively contrast and structural information restored during preprocessing contribute to downstream object detection. This detection-oriented comparison is designed to examine whether dehazing preprocessing leads to improved detection performance.

Figure 2 presents a simplified dual-path comparison structure for analyzing the effect of dehazing preprocessing on object detection performance. The upper part represents the preprocessing dehazing step, whereas the lower part represents the object detection step. Path A represents the baseline direct path, in which the hazy image is directly input into the object detection module without dehazing to produce the skip-dehazing results. In contrast, Path B represents the proposed path, in which the hazy image is first processed by the preprocessing dehazing module and then fed into the same object detection module to produce the with-preprocessing results.

Although the two paths use different image conditions, they share the same detector and evaluation criteria. During the experiments, the detector weights, input image size, class mapping rules, ground truth annotations, IoU matching criteria, and evaluation metrics were kept identical. The detection results from the two paths were then integrated and quantitatively evaluated using Precision, Recall, F1-score, mAP@0.5, mAP@0.75, and mAP@0.5:0.95.

This experimental design aims to verify whether dehazing preprocessing improves not only image clarity but also object detection performance. Through these experiments, this study systematically analyzes the effect of dehazing preprocessing on detection performance under different fog-density conditions.
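The dual-path comparison described in this section can be sketched as a small harness. The callables `detector`, `dehazer`, and `score_fn` are hypothetical placeholders standing in for YOLO26n, DL-U-Net, and the metric computation; they are not the actual interfaces used in the experiments.

```python
def evaluate_dual_path(hazy_images, detector, dehazer, score_fn):
    """Run Path A (hazy -> detector) and Path B (hazy -> dehazer -> detector).

    The same detector and the same scoring function are used on both paths,
    so the returned delta isolates the effect of the dehazing step.
    """
    path_a = [detector(img) for img in hazy_images]
    path_b = [detector(dehazer(img)) for img in hazy_images]
    score_a, score_b = score_fn(path_a), score_fn(path_b)
    return {"hazy": score_a, "dehazed": score_b, "delta": score_b - score_a}


# Toy stand-ins: an "image" is a visibility value, detection succeeds above 0.5.
toy_result = evaluate_dual_path(
    hazy_images=[0.4, 0.6, 0.2],
    detector=lambda visibility: visibility > 0.5,
    dehazer=lambda visibility: visibility + 0.2,
    score_fn=lambda hits: sum(hits) / len(hits),
)
```

Keeping the detector and scoring function as shared arguments makes it explicit that only the input image differs between the two paths.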

4. Experiments

This section quantitatively verifies whether dehazing-based preprocessing can recover the object detection performance that is lost in foggy environments. The focus of this study is not to propose a new algorithm or network architecture, but to analyze the practical impact of dehazing preprocessing on downstream object detection tasks. To this end, under each fog-density setting, the original hazy images and dehazed images are input into the same YOLO26n detector, and the resulting changes in detection performance are compared and analyzed.

Section 4.1 presents the experimental environment, dataset configuration, and evaluation details. Section 4.2 analyzes the changes in detection performance under each fog-density condition. Section 4.3 compares the detection results for each object class. Section 4.4 provides framework-level comparative experiments. Section 4.5 presents visual comparisons of the detection results. Section 4.6 compares computational costs and conducts statistical significance analysis of the experimental data.

4.1. Experimental Setup, Dataset Configuration, and Evaluation Details

The implementation is based on Python 3.9.16 and PyTorch 2.6.0 with CUDA 11.8 support on a Windows 11 environment. The hardware comprises an Intel Core i7-14700K, an NVIDIA RTX 4090 (24 GB), and 64 GB of memory, providing a sufficient experimental environment for image dehazing preprocessing, object detection inference, and evaluation metric calculation.

In terms of data usage, Foggy Cityscapes includes 2975 training-set images and 500 validation-set images under each fog-density setting. With three fog-density versions (β = 0.005, β = 0.010, and β = 0.020), the dataset therefore contains (2975 + 500) × 3 = 10,425 hazy images in total when the training and validation sets are combined.

The object detection labels used in the experiments were derived from the gtFine label information of Cityscapes. These labels were generated by converting the object regions annotated in the original Cityscapes scenes. Because the hazy images under the three β settings correspond to the same original clear Cityscapes images, the same scene shares identical Ground Truth annotations across different fog-density conditions.

Since the label-class taxonomy of Foggy Cityscapes differs from the pre-trained class taxonomy of YOLO26n, this study retained only object classes with clear semantic correspondence between Cityscapes and YOLO/COCO to ensure the consistency of the evaluation results.

Specifically, a total of seven classes were included: person, bicycle, car, motorcycle, bus, train, and truck. These classes were mapped to the corresponding YOLO/COCO classes of person, bicycle, car, motorcycle, bus, train, and truck, respectively. These classes not only correspond to major object classes frequently appearing in urban scenes but also have clear one-to-one correspondence with the pre-trained class system of YOLO26n. Therefore, they were set as the unified object detection evaluation classes in this study.
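The class mapping above can be written as a small lookup table. The sketch below assumes that the COCO pre-trained weights follow the usual 80-class COCO index ordering (person = 0, bicycle = 1, car = 2, motorcycle = 3, bus = 5, train = 6, truck = 7); the prediction tuple layout and the function name are hypothetical.

```python
# Evaluation classes from the text mapped to COCO indices (assumed ordering).
EVAL_CLASS_TO_COCO_ID = {
    "person": 0,
    "bicycle": 1,
    "car": 2,
    "motorcycle": 3,
    "bus": 5,
    "train": 6,
    "truck": 7,
}
COCO_ID_TO_EVAL_CLASS = {v: k for k, v in EVAL_CLASS_TO_COCO_ID.items()}


def filter_predictions(predictions):
    """Keep only predictions whose COCO class id is an evaluation class.

    Each prediction is a hypothetical (class_id, confidence, box) tuple.
    """
    return [
        (COCO_ID_TO_EVAL_CLASS[cls], conf, box)
        for cls, conf, box in predictions
        if cls in COCO_ID_TO_EVAL_CLASS
    ]


example_predictions = [
    (2, 0.90, (0, 0, 10, 10)),    # car: kept
    (9, 0.80, (5, 5, 8, 8)),      # a COCO class outside the seven: dropped
    (0, 0.50, (20, 20, 30, 40)),  # person: kept
]
kept = filter_predictions(example_predictions)
```

Filtering detector outputs to the seven shared classes before matching keeps predictions from unrelated COCO classes from being counted as false positives.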

To ensure that the experimental results reflect the effect of dehazing preprocessing itself on object detection performance, YOLO26n always used the MS COCO pre-trained weights, and no additional training or fine-tuning was performed using the Foggy Cityscapes dataset. In the preprocessing setting, DL-U-Net was used as the dehazing front-end, which is the main analysis target of this study, while AOD-Net and DehazeFormer were used as comparative dehazing preprocessing models. Each dehazing model was used only to generate dehazed images, and the generated dehazing results were input into the same YOLO26n detector.

Throughout the experiment, the detector weights, input image size, class mapping rules, Ground Truth labels, IoU matching criteria, and evaluation metrics were kept identical. The detection results were evaluated following the unified evaluation protocol described in Section 3. Specifically, this study determined the matching between Prediction boxes and Ground Truth boxes according to whether their IoU reached the predefined threshold, and then calculated standard object detection metrics, including Precision, Recall, F1-score, mAP@0.5, mAP@0.75, and mAP@0.5:0.95.
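A minimal sketch of the IoU-based matching step is given below. It assumes boxes in (x1, y1, x2, y2) format and a greedy one-to-one assignment in descending confidence order for a single class in a single image; the exact matching rules used in the experiments may differ in detail.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def precision_recall_f1(predictions, ground_truths, iou_threshold=0.5):
    """Greedy one-to-one matching of predictions to ground-truth boxes.

    `predictions` is a list of (confidence, box); `ground_truths` is a list
    of boxes. A prediction is a true positive if its best unmatched
    ground-truth box reaches the IoU threshold.
    """
    matched = set()
    tp = 0
    for _, box in sorted(predictions, key=lambda p: p[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            tp += 1
    fp = len(predictions) - tp
    fn = len(ground_truths) - tp
    precision = tp / (tp + fp) if predictions else 0.0
    recall = tp / (tp + fn) if ground_truths else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1
```

For example, a predicted box (0, 0, 10, 7) against a ground-truth box (0, 0, 10, 10) has an IoU of 0.7, so it counts as a true positive at a threshold of 0.5 but as a false positive at 0.75. This is why mAP@0.75 is more sensitive to localization errors than mAP@0.5.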

4.2. Changes in Detection Performance Under Each Fog-Density Condition

This section quantitatively evaluates the practical gain provided by the DL-U-Net dehazing front-end for YOLO26n, which is used as a fixed downstream object detector. The experiments were performed under three fog-density conditions with atmospheric attenuation coefficients β = 0.005, β = 0.010, and β = 0.020. In the same detector environment, the detection performance of the original hazy images (Hazy) and the DL-U-Net-processed dehazed images (Dehazed) was compared. Standard detection metrics, including Precision and Recall, together with IoU-threshold-based F1-score, mAP@0.5, mAP@0.75, and mAP@0.5:0.95, were used to separately analyze detection and localization performance. Table 1 summarizes the detection performance of the original hazy images (Hazy) and the DL-U-Net dehazed images (Dehazed) for each fog-density condition, along with the corresponding differences (Δ).

First, the detection performance gain obtained by DL-U-Net dehazing grows with fog density, and it grows nonlinearly. The relative improvements in mAP@0.5 are +0.39%, +6.60%, and +13.37% for β = 0.005, β = 0.010, and β = 0.020, respectively. While β doubles at each step, the relative gain rises roughly 17-fold between β = 0.005 and β = 0.010 and about twofold between β = 0.010 and β = 0.020. A similar nonlinear trend is consistently observed in mAP@0.5:0.95 and Recall.
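The nonlinearity can be checked directly from the relative mAP@0.5 gains quoted in the text. The short sketch below compares the step-to-step growth of β with that of the relative gain; the helper name is an assumption made for this example.

```python
betas = [0.005, 0.010, 0.020]
relative_gain_pct = [0.39, 6.60, 13.37]  # mAP@0.5 gains reported in the text


def step_ratios(values):
    """Ratio between each consecutive pair of values."""
    return [later / earlier for earlier, later in zip(values, values[1:])]


beta_ratios = step_ratios(betas)              # beta doubles at each step
gain_ratios = step_ratios(relative_gain_pct)  # growth of the relative gain
```

The gain ratios come out at about 16.9 and 2.03, against a constant ratio of 2 for β, so most of the nonlinearity lies in the transition from light to moderate fog.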

This indicates that the detector exhibits a nonlinear response to changes in input quality. In regions where sufficient detection signals are already preserved, additional signal restoration provides only a limited marginal benefit. In contrast, in regions where the signal has degraded below a critical level, the same degree of restoration may determine whether an object is detectable. Therefore, the practical utility of detection-oriented dehazing is not uniformly observed across all fog conditions but is concentrated under conditions in which visibility degradation exceeds a certain level.

Second, the effect of dehazing is asymmetric between Precision and Recall. At β = 0.020, ΔRecall (+0.0205) is approximately 3.7 times as large as ΔPrecision (+0.0056). This can be explained by the fact that candidate detections suppressed below the confidence threshold by fog-induced degradation are raised above the threshold through structural restoration by the dehazing front-end and are therefore detected. This indicates that dehazing primarily contributes to recovering missed detections rather than suppressing false positives (FP).
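The thresholding mechanism described above can be illustrated with a toy example. All confidence values and the threshold of 0.25 below are hypothetical numbers chosen for illustration; they are not measurements from the experiments.

```python
def recall_at_threshold(confidences, threshold=0.25):
    """Fraction of true objects whose best candidate passes the threshold."""
    detected = sum(1 for c in confidences if c >= threshold)
    return detected / len(confidences)


# Hypothetical best-candidate confidence for five true objects.
hazy_conf = [0.62, 0.31, 0.22, 0.18, 0.08]
dehazed_conf = [0.66, 0.35, 0.29, 0.26, 0.11]  # each raised slightly

hazy_recall = recall_at_threshold(hazy_conf)
dehazed_recall = recall_at_threshold(dehazed_conf)
```

In this toy case a modest upward shift in confidence moves two objects across the threshold, raising recall from 2/5 to 4/5, while the objects already above the threshold are unaffected. This mirrors the argument that dehazing mainly recovers missed detections.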

Meanwhile, ΔPrecision peaks at +0.0169 for β = 0.010 and then decreases to +0.0056 for β = 0.020, showing a pattern different from Recall, which consistently increases as fog density becomes higher. This suggests that, under dense fog conditions, residual artifacts introduce some false positives, partially offsetting the improvement in Precision. Consequently, the mAP improvement observed under dense fog conditions is mainly driven by Recall recovery.

Third, when the changes (Δ) in Table 1 are examined according to the IoU threshold, the effect of dehazing varies qualitatively depending on the threshold. mAP@0.5 consistently improves under all fog-density conditions. In contrast, mAP@0.75, which reflects stricter localization accuracy, shows almost no improvement or a slight decrease at β = 0.005 and β = 0.010, and shows only a small improvement of +0.0046 at β = 0.020.

This can be explained by the fact that dehazing produces mutually competing effects. While it restores object visibility and confidence responses, thereby improving classification-related performance, it can also partially degrade precise bounding box regression by introducing subtle artifacts and pixel smoothing. When fog density is low, the regression-disturbing effect is more pronounced than the classification improvement. However, under dense fog conditions, the effect of contour restoration outweighs the negative influence of artifacts. Meanwhile, mAP@0.5:0.95, which is averaged over 10 IoU thresholds, maintains stable improvement across all fog-density conditions, indicating that the slight decreases observed at specific thresholds do not appear at the level of the aggregated metric. Therefore, dehazing preprocessing stably improves classification-related performance across all fog-density conditions, whereas its benefit for precise localization regression appears only under dense fog conditions.
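The aggregation behind mAP@0.5:0.95 can be sketched as a plain average over the 10 IoU thresholds 0.50, 0.55, ..., 0.95. The AP values below are hypothetical and serve only to show how a slight decrease at one threshold can coexist with an improved aggregate.

```python
def coco_iou_thresholds():
    """The 10 IoU thresholds 0.50, 0.55, ..., 0.95."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]


def map_50_95(ap_by_threshold):
    """Mean of the AP values over the 10 IoU thresholds."""
    thresholds = coco_iou_thresholds()
    return sum(ap_by_threshold[t] for t in thresholds) / len(thresholds)


thresholds = coco_iou_thresholds()
# Hypothetical AP values per threshold (not taken from Table 1).
hazy_ap = dict(zip(
    thresholds, [0.26, 0.24, 0.22, 0.20, 0.17, 0.14, 0.11, 0.08, 0.05, 0.02]))
dehazed_ap = dict(zip(
    thresholds, [0.29, 0.27, 0.24, 0.21, 0.18, 0.139, 0.11, 0.08, 0.05, 0.02]))

hazy_map = map_50_95(hazy_ap)
dehazed_map = map_50_95(dehazed_ap)
```

In this illustration the AP at IoU 0.75 drops slightly (0.140 to 0.139), yet the averaged metric rises from 0.149 to about 0.159 because the gains at lower thresholds dominate the mean.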

Based on these results, the utility of the DL-U-Net dehazing front-end can be summarized as follows. The degree to which dehazing contributes to detection performance increases nonlinearly as fog density becomes higher, and this contribution is mainly achieved through Recall recovery, that is, the reduction in missed detections. In addition, dehazing stably enhances classification-related performance across all fog-density conditions, whereas its benefit for precise localization regression appears only under dense fog conditions. These results indicate that the detection-oriented evaluation framework proposed in this paper can quantitatively evaluate the effectiveness of the dehazing front-end from the perspective of downstream detection tasks, rather than simple visual quality.

4.3. Class-Wise Detection Results

This section evaluates the effect of DL-U-Net on class-wise object detection performance using AP@0.5 and AP@0.75 as evaluation metrics. By comparing the performance differences before and after dehazing, this section demonstrates how DL-U-Net contributes to feature restoration for downstream detection tasks.

According to the AP@0.5 results in Table 2, the overall changes before and after dehazing are relatively small under the light fog condition of β = 0.005. Slight increases are observed for the bicycle, car, train, and truck classes, whereas slight decreases appear for the person, motorcycle, and bus classes. This suggests that, under light fog conditions, object contours, textures, and local contrast information are still relatively well preserved in the original images, thereby limiting the additional detection gain provided by DL-U-Net preprocessing.

Under the β = 0.010 condition, AP@0.5 increases for most classes except truck. In particular, positive changes are observed for the car, train, motorcycle, and bicycle classes. This can be attributed to the fact that DL-U-Net partially restores the structural information required for detection under moderate fog conditions, where object boundaries and fine structures begin to be weakened.

Under the dense fog condition of β = 0.020, this tendency becomes more pronounced. AP@0.5 increases for all classes except bus, with the car class showing the largest increase. This result indicates that the vehicle contours and local contrast weakened by dense fog are partially recovered after DL-U-Net processing, thereby enhancing the detection response of YOLO26n. Therefore, based on AP@0.5, the effect of DL-U-Net is more clearly observed under moderate and dense fog conditions than under light fog conditions.

Compared with AP@0.5, the AP@0.75 results in Table 3 show a more conservative trend. Under the β = 0.005 and β = 0.010 conditions, AP@0.75 decreases for some classes, indicating that although dehazing can help determine object presence or increase detection confidence, it does not always improve precise bounding box localization. Even when object contours and contrast are enhanced through DL-U-Net processing, smoothing in certain regions or subtle texture changes may still impose a burden on bounding box regression.

In contrast, under the β = 0.020 condition, AP@0.75 increases for person, bicycle, car, train, and truck. In particular, cars and trucks show positive changes in both AP@0.5 and AP@0.75. This confirms that, under dense fog conditions, the contour restoration effect of DL-U-Net can positively affect not only object detectability but also the localization stability of certain object classes.

In summary, the effect of the DL-U-Net dehazing front-end is not consistent across all object classes and IoU criteria. Based on AP@0.5, detection performance improves for more classes as fog density increases, indicating that DL-U-Net helps reduce missed detections by restoring object contours, textures, and local contrast information weakened by fog. On the other hand, based on AP@0.75, the performance improvement is limited, and some classes even show decreases. Therefore, the effect on precise localization regression varies depending on fog density and object type. These results show that the detection-oriented evaluation framework proposed in this paper can analyze the influence of the dehazing front-end in detail, not only in terms of overall mAP but also with respect to object class and IoU criterion.

4.4. Framework-Level Comparative Experiments

This section conducts a framework-level front-end replacement experiment to verify whether the proposed evaluation framework can be generally applied to different dehazing front-ends. In addition to the DL-U-Net adopted in this paper, AOD-Net and DehazeFormer are applied at the same position, while all other settings, including the Foggy Cityscapes validation dataset, the YOLO26n detector, class mapping rules, and evaluation metrics, are kept identical.

In Table 4, Reference denotes the original hazy input images without dehazing. As shown in Table 4, detection performance varies substantially under the same fog condition depending on the type of dehazing front-end. Under the lightest fog condition, β = 0.005, the mAP@0.5 changes relative to Reference differ across the front-ends: DehazeFormer achieves +0.0152 (+6.0%), DL-U-Net achieves +0.0010 (+0.4%), and AOD-Net shows −0.0060 (−2.4%). AOD-Net thus records lower performance than Reference despite dehazing. By contrast, as the fog density increases, the performance gap becomes narrower. At β = 0.020, DehazeFormer, DL-U-Net, and AOD-Net show improvements of +0.0268, +0.0254, and +0.0128, respectively, indicating clear detection gains for all three front-ends. A likely explanation is that, under dense fog conditions, a large amount of structural information is lost, so signal restoration above a certain level contributes to detection regardless of the restoration method. Under light fog conditions, however, sufficient detection-related information is already preserved in the original image, and the side effects of the restoration method therefore become relatively more apparent in the detection results.

As a result, even for the same detection task, downstream detection performance may either improve or even decrease depending on the design of the dehazing front-end. The proposed evaluation framework can quantitatively identify these differences based on detection performance itself, rather than visual restoration quality.

In addition, the mAP@0.5 gap between DL-U-Net and DehazeFormer is 0.0142 at β = 0.005, 0.0027 at β = 0.010, and 0.0014 at β = 0.020, indicating that the gap rapidly narrows as fog density increases. Under the dense fog condition (β = 0.020), the detection performance of the two front-ends becomes nearly identical. Their mAP@0.5:0.95 values are 0.1258 and 0.1300, respectively, with a difference of only 0.0042. This indicates that although DehazeFormer has an advantage in visual restoration quality due to its deeper and heavier architecture, the lightweight DL-U-Net can also provide sufficient effectiveness under dense fog conditions in terms of restoring structural information useful for detection. In other words, as fog density increases, the amount of structural information that needs to be restored becomes a key factor determining detection performance, while differences in model capacity are not fully translated into differences in detection performance. This characteristic is the basis for selecting DL-U-Net as the main analysis target in this paper and using DehazeFormer as a strong baseline for comparison.

Finally, the results of AOD-Net show low consistency. At β = 0.005, its Precision is 0.4101, which is slightly higher than that of Reference (0.4050), but its Recall decreases substantially to 0.2494 compared with Reference (0.2701), resulting in a decrease in mAP@0.5. Even at β = 0.020, although mAP@0.5 improves by +0.0128, Precision remains lower at 0.3781 than that of Reference (0.3818). This can be interpreted as a result of AOD-Net dehazing outputs introducing color bias and local distortions, thereby partially damaging the detailed information that the detector needs to utilize. In other words, although AOD-Net visually removes some haze, the side effects introduced during this process disturb the detector response, preventing the dehazing effect from being stably converted into detection gains.
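The trade-off at β = 0.005 can be made concrete by computing the harmonic mean of the Precision and Recall values quoted above. This is only a worked example under the standard F1 definition; the F1 values reported in Table 4 are not reproduced here and may differ, for instance if they are averaged per class.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Precision and Recall at beta = 0.005 as quoted in the text.
reference_f1 = f1_score(0.4050, 0.2701)
aodnet_f1 = f1_score(0.4101, 0.2494)
```

With these inputs the harmonic mean is about 0.324 for Reference and about 0.310 for AOD-Net, so the small Precision gain does not compensate for the loss in Recall.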

These results confirm that the proposed evaluation framework has both generality and discriminative capability, allowing different front-end models to be compared under the same criteria. Within this framework, DehazeFormer provides the highest mAP@0.5 and mAP@0.5:0.95 across all fog-density conditions, while DL-U-Net shows stable detection improvement and achieves performance close to that of DehazeFormer under dense fog conditions. In contrast, AOD-Net shows variability, including a decrease in detection performance under light fog conditions.

4.5. Visual Comparison of Detection Results

In this section, representative samples under different fog-density conditions were selected for visualization analysis to more intuitively verify the effect of dehazing preprocessing on subsequent object detection results. All images were input into the same YOLO26n detector, and the input size, confidence threshold, and inference settings were kept identical.

As shown in Figure 3, in the original hazy images, the detector frequently exhibits missed detections, low confidence scores, and positional bias in bounding boxes. Such degradation is particularly pronounced for distant small objects or traffic objects in complex backgrounds. As the fog density increases, these problems become visually more apparent. Under the light fog condition of β = 0.005, the differences in detection results before and after dehazing are barely distinguishable, whereas under the dense fog condition of β = 0.020, object contours become clearly sharper after processing, and objects that were previously missed are detected again. This visually corresponds to the nonlinear amplification pattern of detection performance improvement derived in Section 4.2, namely, that the detection gain from dehazing increases sharply as fog becomes denser. In particular, the recovery of missed detections analyzed in Section 4.2 is confirmed to be the most prominent change in the visualization.

As shown in Figure 4, the visual restoration characteristics of the three models do not result in consistent differences in the detection results. Although images processed by AOD-Net show enhanced contrast in some scenes, they also exhibit brightness bias and local texture distortion, which leads to unstable bounding box locations or missed targets. This is consistent with the decrease in Precision observed in Section 4.4 and with the result that AOD-Net showed lower detection performance than the original hazy input under light fog conditions. Images processed by DL-U-Net stably restore the contours and local structures of major traffic objects, such as vehicles and pedestrians, under dense fog conditions, making the detection boxes clearer and the responses more stable. Images processed by DehazeFormer show the highest overall clarity; however, under dense fog conditions, the difference between its detection results and those of the DL-U-Net-processed images is marginal. This is consistent with the quantitative result in Section 4.4, where the mAP@0.5 gap between the two models narrowed to 0.0014 at β = 0.020.

The results thus intuitively support the conclusions of the quantitative analysis presented in Section 4.2 and Section 4.4. In particular, although DehazeFormer shows the best visual restoration quality, its detection results under dense fog conditions are not clearly different from those of DL-U-Net. This visually supports the argument of this paper that the value of a dehazing front-end cannot be judged solely by visual metrics, but should be evaluated from the perspective of downstream detection tasks.

4.6. Computational Cost and Statistical Significance Analysis

To compare the computational cost of different dehazing preprocessing front-ends, this section measures the pure inference time of each model. Here, pure inference time is defined as the duration from the moment the input tensor is passed to the model to the completion of the neural network forward pass. Accordingly, this measurement excludes data loading, preprocessing, result saving, and other pipeline operations, focusing only on the computational cost of the dehazing model itself.
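A measurement of this kind can be sketched as follows. This is a minimal illustration using only the Python standard library, with a hypothetical `dummy_model` standing in for a dehazing network. Excluding the first call from the average is an assumption, motivated by Table 5 reporting the first-image time separately. When timing on a GPU, the device would additionally need to be synchronized before each clock reading (for example with `torch.cuda.synchronize()` in PyTorch), which this CPU-only sketch omits.

```python
import time
from statistics import mean


def time_forward_passes(model, inputs, warmup=1):
    """Time only the model call for each input, in milliseconds.

    The first `warmup` calls are left out of the average because one-off
    start-up costs would otherwise distort it.
    """
    timings_ms = []
    for x in inputs:
        start = time.perf_counter()
        model(x)  # forward pass only; no loading, preprocessing, or saving
        timings_ms.append((time.perf_counter() - start) * 1000.0)
    avg_ms = mean(timings_ms[warmup:])
    return {"first_ms": timings_ms[0], "avg_ms": avg_ms, "fps": 1000.0 / avg_ms}


def dummy_model(x):
    """Hypothetical stand-in for a dehazing network."""
    return sum(v * v for v in x)


stats = time_forward_passes(dummy_model, [list(range(10_000))] * 50)
print(stats)
```

Placing the clock readings immediately around the model call is what keeps data loading and result saving out of the measured interval.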

Table 5 presents a comparison of the pure inference times of AOD-Net, DL-U-Net, and DehazeFormer under the same experimental conditions. AOD-Net achieved the fastest processing speed, with an average inference time of 5.74 ms and 174.28 FPS. DL-U-Net recorded an average inference time of 11.90 ms and 84.07 FPS. Although it is slower than AOD-Net, its inference time remains within approximately 12 ms per image, indicating that its computational cost is sufficiently low for real-time processing. In contrast, DehazeFormer exhibited the highest inference time, with an average of 46.30 ms and 21.60 FPS.

These differences mainly arise from the structural complexity of each model. AOD-Net achieves the fastest speed because of its simple and lightweight CNN-based architecture, whereas DehazeFormer requires substantially longer inference time due to its Transformer-based attention operations and multi-scale feature processing. Although DL-U-Net requires more computation than AOD-Net because of its 9-channel input and U-Net-based architecture, it is approximately 3.89 times faster than DehazeFormer. These results indicate that DL-U-Net provides a relatively well-balanced preprocessing model in terms of the trade-off between detection performance improvement and computational cost.
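The relationship between the reported times, frame rates, and the speed-up factor can be reproduced with a few lines of arithmetic. The sketch below uses the rounded average times from Table 5; the variable names are illustrative.

```python
# Average inference times in milliseconds, copied from Table 5.
AVG_MS = {"AOD-Net": 5.74, "DL-U-Net": 11.90, "DehazeFormer": 46.30}

# Frames per second implied by each rounded average time.
fps = {name: 1000.0 / ms for name, ms in AVG_MS.items()}

# How many times faster DL-U-Net is than DehazeFormer.
speedup = AVG_MS["DehazeFormer"] / AVG_MS["DL-U-Net"]

for name, value in fps.items():
    print(f"{name}: {value:.2f} FPS")
print(f"DL-U-Net vs DehazeFormer speed-up: {speedup:.2f}x")
```

The frame rates recomputed from the rounded times differ slightly from those in Table 5 in the second decimal place, presumably because the table was derived from unrounded timings.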

As shown in Table 6, dehazing preprocessing increased Recall and F1-score under all three fog-density conditions. Since none of the corresponding 95% confidence intervals included 0, these differences can be regarded as statistically significant. Under the β = 0.005 condition, the increases in Recall and F1-score were relatively small, at +0.006 and +0.008, respectively, suggesting that the detection performance gain from dehazing is limited under light fog conditions.

As fog density increased, the magnitude of improvement in both metrics became larger. Under the β = 0.010 condition, Recall and F1-score increased by +0.020 and +0.024, respectively, and under the β = 0.020 condition, they increased by +0.031 and +0.036, respectively. Overall, the statistical significance analysis indicates that dehazing preprocessing has a relatively stable positive effect on Recall and F1-score. This effect mainly contributes to reducing missed detections to some extent and improving the overall detection performance. The improvement tendency becomes more evident under dense fog conditions.
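The criterion that a 95% confidence interval excluding 0 indicates a significant difference can be illustrated with a paired percentile bootstrap. The resampling procedure used to produce Table 6 is not described in this section, so the sketch below is only one common way to obtain such an interval. The function name `paired_bootstrap_ci`, the number of resamples, and the synthetic per-image values are all assumptions and do not reproduce the paper's data.

```python
import random
from statistics import mean


def paired_bootstrap_ci(hazy, dehazed, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean paired difference (dehazed - hazy)."""
    if len(hazy) != len(dehazed):
        raise ValueError("paired samples must have equal length")
    rng = random.Random(seed)
    diffs = [d - h for h, d in zip(hazy, dehazed)]
    n = len(diffs)
    # Resample the paired differences with replacement and record each mean.
    boot_means = sorted(
        mean(rng.choice(diffs) for _ in range(n)) for _ in range(n_resamples)
    )
    lo = boot_means[int((alpha / 2) * n_resamples)]
    hi = boot_means[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(diffs), (lo, hi)


# Hypothetical per-image Recall values; the real per-image data are not given here.
data_rng = random.Random(1)
hazy = [data_rng.uniform(0.1, 0.4) for _ in range(200)]
dehazed = [h + data_rng.uniform(0.0, 0.06) for h in hazy]

delta, (ci_low, ci_high) = paired_bootstrap_ci(hazy, dehazed)
print(f"mean difference = {delta:.4f}, 95% CI = [{ci_low:.4f}, {ci_high:.4f}]")
print("CI excludes 0:", ci_low > 0 or ci_high < 0)
```

Resampling the per-image differences, rather than the two groups independently, respects the pairing: every image is evaluated both before and after dehazing.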

5. Conclusions

This study addressed object detection performance degradation in foggy environments by analyzing the effect of DL-U-Net-based dehazing preprocessing on downstream object detection. To this end, DL-U-Net-based dehazing preprocessing was integrated with YOLO26n, and experiments were conducted using the Foggy Cityscapes dataset.

The experimental results indicate that the contribution of dehazing to detection performance is not limited to improvement in a single metric. The primary improvement was observed in Recall, indicating that dehazing reduces missed detections caused by fog-induced occlusion and contrast degradation. Under dense fog conditions, the improvement in Recall was more pronounced than that in Precision. This suggests that dehazing preprocessing plays a greater role in helping the detector identify previously difficult-to-detect objects than in merely reducing false positives. While mAP@0.5 improved across all fog-density conditions, mAP@0.75 showed clear improvement only under dense fog conditions. This suggests that, although dehazing can consistently improve object detectability, its contribution to precise bounding box localization depends on fog density and artifacts generated during restoration.

Although the effectiveness of the proposed evaluation framework and DL-U-Net-based dehazing preprocessing was confirmed, several limitations remain. The current approach uses a serial architecture that separates the dehazing model from the object detector, preventing end-to-end joint optimization of the two modules. Consequently, the visual information restored by the dehazing module may not align with the detector’s feature requirements for classification and localization.

Future research will focus on developing an end-to-end network that jointly optimizes dehazing and object detection by directly constraining the dehazing process with detection loss. This is expected to facilitate the learning of image representations that are more suitable for object recognition and bounding box regression. Furthermore, subsequent studies will explore lightweight model design and deployment optimization for hardware platforms such as FPGAs and NPUs.

Author Contributions

Conceptualization, J.H. and W.-C.P.; methodology, J.H. and W.-C.P.; software, J.H. and Y.H.; validation, J.H.; formal analysis, J.H.; investigation, J.H.; resources, J.H.; data curation, J.H.; writing—original draft preparation, J.H. and J.K.; writing—review and editing, J.H., Y.H. and W.-C.P.; visualization, J.H.; supervision, W.-C.P.; project administration, W.-C.P.; funding acquisition, W.-C.P. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported in part by the Commercialization Promotion Agency for R&D Outcomes (COMPA) grant funded by the Korea government (Ministry of Science and ICT) under the project “Upgrading and Commercializing IP for Real-Time Denoising AI Hardware” (RS-2025-02315892, Contribution Rate: 50%), and in part by the IITP (Institute of Information & Communications Technology Planning & Evaluation)-ITRC (Information Technology Research Center)(IITP-2026-RS-2022-00156354, Contribution Rate: 50%) grant funded by the Korea government (Ministry of Science and ICT).

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Data are available upon request to the corresponding author.

Conflicts of Interest

Author Yunho Han is employed as a researcher at EXARION. Author Woo-Chan Park is the founder and CEO of EXARION. The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

References

Zhao, J.; Wu, Y.; Deng, R.; Xu, S.; Gao, J.; Burke, A. A survey of autonomous driving from a deep learning perspective. ACM Comput. Surv. 2025, 57, 263.
Duong, H.-T.; Le, V.-T.; Hoang, V.T. Deep learning-based anomaly detection in video surveillance: A survey. Sensors 2023, 23, 5024.
Tang, G.; Ni, J.; Zhao, Y.; Gu, Y.; Cao, W. A survey of object detection for UAVs based on deep learning. Remote Sens. 2024, 16, 149.
He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.
Li, G.; Li, J.; Chen, G.; Wang, Z.; Jin, S.; Ding, C.; Zhang, W. Delving deeper into image dehazing: A survey. IEEE Access 2023, 11, 131759–131774.
Kumar, D.; Muhammad, N. Object detection in adverse weather for autonomous driving through data merging and YOLOv8. Sensors 2023, 23, 8471.
Guo, F.; Yang, J.; Liu, Z.; Tang, J. Haze removal for single image: A comprehensive review. Neurocomputing 2023, 537, 85–109.
Gupta, H.; Kotlyar, O.; Andreasson, H.; Lilienthal, A.J. Robust object detection in challenging weather conditions. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2024; pp. 7523–7532.
Kirillova, N.; Mirza, M.J.; Bischof, H.; Possegger, H. Into the fog: Evaluating robustness of multiple object tracking. In Proceedings of the British Machine Vision Conference (BMVC), Glasgow, UK, 25–28 November 2024; pp. 1–15.
Wang, T.-S.; Kim, G.-T.; Kim, M.; Jang, J. Contrast enhancement-based preprocessing process to improve deep learning object task performance and results. Appl. Sci. 2023, 13, 10760.
Wang, T.-S.; Kim, G.-T.; Shin, J.; Jang, S.-W. Hierarchical image quality improvement based on illumination, resolution, and noise factors for improving object detection. Electronics 2024, 13, 4438.
Li, C.; Zhou, H.; Liu, Y.; Yang, C.; Xie, Y.; Li, Z.; Zhu, L. Detection-friendly dehazing: Object detection in real-world hazy scenes. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 8284–8295.
Dwivedi, P.; Chakraborty, S. A comprehensive qualitative and quantitative survey on image dehazing based on deep neural networks. Neurocomputing 2024, 610, 128582.
Hodges, C.; Bennamoun, M.; Boussaid, F. Quantitative performance evaluation of object detectors in hazy environments. Pattern Recognit. Lett. 2021, 152, 150–157.
Zhang, Z.; Zhao, L.; Liu, Y.; Zhang, S.; Yang, J. Unified density-aware image dehazing and object detection in real-world hazy scenes. In Proceedings of the Asian Conference on Computer Vision (ACCV), Kyoto, Japan, 30 November–4 December 2020; pp. 119–135.
Sakaridis, C.; Dai, D.; Van Gool, L. Semantic foggy scene understanding with synthetic data. Int. J. Comput. Vis. 2018, 126, 973–992.
Cordts, M.; Omran, M.; Ramos, S.; Rehfeld, T.; Enzweiler, M.; Benenson, R.; Franke, U.; Roth, S.; Schiele, B. The Cityscapes dataset for semantic urban scene understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 3213–3223.
Ju, M.; Gu, Z.; Zhang, D. Single image haze removal based on the improved atmospheric scattering model. Neurocomputing 2017, 260, 180–191.
He, K.; Sun, J.; Tang, X. Single image haze removal using dark channel prior. IEEE Trans. Pattern Anal. Mach. Intell. 2011, 33, 2341–2353.
Qin, X.; Wang, Z.; Bai, Y.; Xie, X.; Jia, H. FFA-Net: Feature fusion attention network for single image dehazing. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), New York, NY, USA, 7–12 February 2020; Volume 34, pp. 11908–11915.
Guo, C.-L.; Yan, Q.; Anwar, S.; Cong, R.; Ren, W.; Li, C. Image dehazing transformer with transmission-aware 3D position embedding. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 5812–5820.
Li, B.; Peng, X.; Wang, Z.; Xu, J.; Feng, D. AOD-Net: All-in-one dehazing network. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 4770–4778.
Liu, W.; Ren, G.; Yu, R.; Guo, S.; Zhu, J.; Zhang, L. Image-adaptive YOLO for object detection in adverse weather conditions. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Virtual, 22 February–1 March 2022; Volume 36, pp. 1792–1800.
Ogino, Y.; Shoji, Y.; Toizumi, T.; Ito, A. ERUP-YOLO: Enhancing object detection robustness for adverse weather condition by unified image-adaptive processing. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, AZ, USA, 26 February–6 March 2025; pp. 8586–8594.
Yang, H.-H.; Fu, Y. Wavelet U-Net and the chromatic adaptation transform for single image dehazing. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Taipei, Taiwan, 22–25 September 2019; pp. 2736–2740.
Nguyen, Q.H.; Kang, B. An end-to-end single image dehazing method using U-Net architecture. J. KIIT 2021, 19, 93–100.
Zhou, H.; Chen, Z.; Li, Q.; Tao, T. Dehaze-U-Net: A lightweight network based on U-Net for single-image dehazing. Electronics 2024, 13, 2082.
Han, Y.; Kim, J.; Lee, J.; Nah, J.-H.; Ho, Y.-S.; Park, W.-C. Efficient haze removal from a single image using a DCP-based lightweight U-Net neural network model. Sensors 2024, 24, 3746.
Huang, S.-C.; Le, T.-H.; Jaw, D.-W. DSNet: Joint semantic learning for object detection in inclement weather conditions. IEEE Trans. Pattern Anal. Mach. Intell. 2021, 43, 2623–2633.

Figure 1. Dehazing-preprocessing-based object detection evaluation framework.

Figure 2. Dual-path dehazing-based object detection evaluation diagram.

Figure 3. Comparison of object detection results between original hazy images and DL-U-Net dehazed images.

Figure 4. Comparison of object detection results by dehazing model according to fog density.

Table 1. Detection performance comparison between Hazy and DL-U-Net Dehazed images by fog-density condition, with differences (Δ).

Hazy Level β	Input	Precision	Recall	F1-Score	mAP@0.5	mAP@0.75	mAP@0.5:0.95
0.005	Hazy	0.4050	0.2701	0.3240	0.2541	0.1447	0.1511
	Dehazed	0.4077	0.2745	0.3281	0.2551	0.1436
	Δ	0.0027	0.0044	0.0041	0.0010	−0.0011
0.010	Hazy	0.4029	0.2393	0.3002	0.2288	0.1346	0.1381
	Dehazed	0.4198	0.2589	0.3203	0.2439	0.1345
	Δ	0.0169	0.0196	0.0201	0.0151	−0.0001
0.020	Hazy	0.3818	0.2132	0.2736	0.1900	0.1144	0.1160
	Dehazed	0.3874	0.2337	0.2915	0.2154	0.1191
	Δ	0.0056	0.0205	0.0179	0.0254	0.0046

Table 2. Class-wise AP@0.5 changes before and after DL-U-Net dehazing under each fog-density condition.

Hazy Level β	Input	Person	Bicycle	Car	Motorcycle	Bus	Train	Truck
0.005	Foggy	0.2889	0.2230	0.5364	0.1535	0.3776		0.1321
	Dehazed	0.2875	0.2263	0.5426	0.1417	0.3684	0.0670	0.1398
	Δ	−0.0014	0.0033	0.0062	−0.0118	−0.0092		0.0077
0.010	Foggy	0.2697	0.2126	0.5006	0.1347	0.3407		0.1170
	Dehazed	0.2772	0.2226	0.5298	0.1458	0.3443	0.0262	0.1156
	Δ	0.0075	0.0100	0.0292	0.0111	0.0036		−0.0014
0.020	Foggy	0.2303	0.1849	0.4323	0.0936	0.2840		0.0936
	Dehazed	0.2559	0.2165	0.4951	0.1216	0.2816	0.0112	0.1050
	Δ	0.0256	0.0316	0.0628	0.0280	−0.0024		0.0114

Table 3. Class-wise AP@0.75 changes before and after DL-U-Net dehazing under each fog-density condition.

Hazy Level β	Input	Person	Bicycle	Car	Motorcycle	Bus	Train	Truck
0.005	Foggy	0.1390	0.0892	0.3327	0.0486	0.2942		0.0924
	Dehazed	0.1336	0.0834	0.3383	0.0373	0.2871	0.0170	0.1061
	Δ	−0.0054	−0.0058	0.0056	−0.0113	−0.0071		0.0137
0.010	Foggy	0.1339	0.0892	0.3164	0.0438	0.2750		0.0791
	Dehazed	0.1333	0.0790	0.3218	0.0352	0.2655	0.0049	0.0880
	Δ	−0.0006	−0.0102	0.0054	−0.0086	−0.0095		0.0089
0.020	Foggy	0.1150	0.0805	0.2840	0.0258	0.2338		0.0614
	Dehazed	0.1194	0.0825	0.2992	0.0233	0.2292	0.0006	0.0728
	Δ	0.0044	0.0020	0.0152	−0.0025	−0.0046		0.0114

Table 4. Detection performance comparison of different front-ends under each fog-density condition.

Hazy Level β	Front-End	Precision	Recall	mAP@0.5	mAP@0.5:0.95
0.005	Reference	0.4050	0.2701	0.2541	0.1511
	AOD-Net	0.4101	0.2494	0.2481	0.1451
	DL-U-Net	0.4077	0.2745	0.2551	0.1515
	DehazeFormer	0.4201	0.2883	0.2693	0.1584
0.010	Reference	0.4029	0.2393	0.2288	0.1381
	AOD-Net	0.3913	0.2297	0.2295	0.1347
	DL-U-Net	0.4198	0.2589	0.2439	0.1437
	DehazeFormer	0.4017	0.2551	0.2466	0.1465
0.020	Reference	0.3818	0.2132	0.1900	0.1160
	AOD-Net	0.3781	0.2238	0.2028	0.1204
	DL-U-Net	0.3874	0.2337	0.2154	0.1258
	DehazeFormer	0.4047	0.2445	0.2168	0.1300

Table 5. Computational cost comparison of dehazing preprocessing models.

Method	Number of Images	Avg. Inference Time	First Image Time	FPS
AOD-Net	500	5.74 ms	90.77 ms	174.28
DL-U-Net	500	11.90 ms	167.55 ms	84.07
DehazeFormer	500	46.30 ms	240.05 ms	21.60

Table 6. Statistical significance analysis of Recall and F1-score under different fog-density conditions.

Hazy Level β	Metric	Hazy	Dehazed	Δ	95% CI	p-Value
0.005	Recall	0.274	0.280	0.006	[0.0014, 0.0100]	0.0396
0.005	F1-score	0.404	0.412	0.008	[0.0025, 0.0123]	0.0198
0.010	Recall	0.251	0.272	0.020	[0.0157, 0.0247]	0.0198
0.010	F1-score	0.379	0.403	0.024	[0.0185, 0.0293]	0.0198
0.020	Recall	0.218	0.249	0.031	[0.0263, 0.0373]	0.0198
0.020	F1-score	0.341	0.377	0.036	[0.0302, 0.0434]	0.0198
