1. Introduction
Recent advances in artificial intelligence (AI), particularly deep learning-based image recognition, have enabled computer vision systems to achieve significant success in real-world applications such as autonomous driving, intelligent surveillance, and unmanned aerial vehicle (UAV) remote sensing [1,2,3,4]. The performance of these systems fundamentally depends on the quality of the input images [5]. However, atmospheric scattering caused by fog, mist, or smoke in complex environments significantly reduces visibility and contrast, thereby undermining detection stability and accuracy [6,7].
Suspended fine particles in the atmosphere degrade images through scattering and absorption, causing contrast reduction, color distortion, and loss of fine details. This degradation not only reduces visual quality but also impairs the performance of deep learning models in high-level visual tasks such as detection, tracking, and semantic segmentation [7,8,9].
To address this, a preprocessing step that improves both image quality and detection accuracy is required. Such an approach allows the system to perform its tasks reliably even in degraded environments such as fog [10,11].
Existing dehazing techniques have focused on low-level metrics such as PSNR and SSIM; despite improvements in visual clarity, their effect on high-level visual tasks such as detection is often overlooked [12,13]. Some studies still use subjective clarity as a key metric, and there is still no systematic analysis of how dehazing preprocessing affects metrics such as Precision, Recall, and false negatives (FN) in complex scenes [14].
In object detection tasks under foggy environments, improvements in the visual quality of images do not necessarily lead to improved detection performance. Dehazing preprocessing can enhance image contrast and restore some structural information of objects, but it may also introduce local artifacts and affect the detector’s process of judging object features [15]. Therefore, metrics limited to evaluating visual feature restoration are insufficient to fully explain the actual role of dehazing preprocessing in downstream object detection tasks.
To address this issue, this study applies image dehazing as a preprocessing step for object detection in foggy environments and analyzes changes in detection performance before and after dehazing using standard object detection evaluation metrics. This analysis verifies whether dehazing preprocessing can improve object detection performance in foggy environments.
The experiments were conducted using the Foggy Cityscapes dataset [16]. Foggy Cityscapes was constructed by Sakaridis, Dai, and Van Gool based on the Cityscapes dataset [16,17]. It applies synthetic fog effects based on a physical model to real urban scene images. In this dataset, fog density is controlled by the atmospheric attenuation coefficient $\beta$ [16]. This study used three $\beta$ settings, 0.005, 0.010, and 0.020, corresponding to light, moderate, and dense fog scenes, respectively [16]. By comparing detection results between the original hazy images and the dehazed images under each fog-density condition, this study analyzed the effect of dehazing preprocessing on object detection performance.
The experimental results show that mAP@0.5 was higher for the dehazed images than for the original hazy images under all three fog-density conditions. This indicates that dehazing preprocessing improves object–background clarity and makes objects easier to detect by mitigating fog-induced occlusion and contrast degradation. For mAP@0.75, consistent performance improvement was not observed under light or moderate fog conditions. This suggests that artifacts generated during dehazing may have affected bounding box localization. Under dense fog conditions, the restoration of object structural information became more evident, leading to improved detection performance at mAP@0.75. In terms of the aggregated IoU metric, mAP@0.5:0.95 also showed superior detection performance across all fog-density conditions.
This study aims to analyze the effect of dehazing-based preprocessing on object detection performance in foggy environments. Although visual artifacts may occur during dehazing, the experimental results show that the benefits of contrast enhancement and structural information restoration outweigh their negative effects on object detection performance.
The remainder of this paper is organized as follows. Section 2 reviews related work, and Section 3 presents the object detection evaluation method based on dehazing preprocessing. Section 4 presents the experimental setup and result analysis, and Section 5 concludes the paper and discusses future research directions.
2. Related Works
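Early dehazing techniques commonly model image degradation using the atmospheric scattering model (ASM) [18], which is expressed as follows:

$$I(x) = J(x)\,t(x) + A\,\bigl(1 - t(x)\bigr)$$

where $I(x)$ is the observed hazy image, $J(x)$ is the haze-free scene radiance, $t(x)$ is the transmission, and $A$ is the atmospheric light.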
Although methods using polarization imagery or multi-view information to estimate the transmission t and atmospheric light A offer physical interpretability, their reliance on multiple frames limits their applicability to single-image dehazing. Single-image inverse-problem approaches rely heavily on statistical assumptions, which can lead to performance variability and limited generalization under non-homogeneous haze or strong illumination conditions.
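As a minimal sketch of the atmospheric scattering model discussed above, the following Python snippet synthesizes a hazy image from a clear image and a depth map. It assumes the commonly used homogeneous-medium relation t = exp(-beta * depth), which the text does not state explicitly; the function name `apply_asm`, the array shapes, and the constant atmospheric light are illustrative choices, not the procedure used to build any particular dataset.

```python
import numpy as np

def apply_asm(clear, depth, beta, airlight=1.0):
    """Synthesize haze with I = J * t + A * (1 - t), where t = exp(-beta * depth).

    clear    : float array of shape (H, W, 3) in [0, 1] (scene radiance J)
    depth    : float array of shape (H, W), scene distance (assumed input)
    beta     : attenuation coefficient; larger values give denser fog
    airlight : global atmospheric light A (assumed constant here)
    """
    t = np.exp(-beta * depth)[..., None]   # transmission, broadcast over channels
    return clear * t + airlight * (1.0 - t)

# Toy example: a uniform dark-grey scene at a fixed distance, three fog settings.
clear = np.full((2, 2, 3), 0.2)
depth = np.full((2, 2), 100.0)
for beta in (0.005, 0.010, 0.020):
    hazy = apply_asm(clear, depth, beta)
    print(beta, round(float(hazy[0, 0, 0]), 3))
```

As beta grows, the transmission shrinks and every pixel is pulled toward the atmospheric light, which is the contrast loss that dehazing tries to invert.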
The dark channel prior (DCP) [19] is based on the natural image statistic that, within a local window, at least one color channel has an intensity close to zero. In other words, DCP assumes that, in haze-free images, the pixel values of at least one channel in a local region are nearly zero, which can be formulated as follows:

$$I^{\mathrm{dark}}(x) = \min_{y \in \Omega(x)} \Bigl( \min_{c \in \{r,g,b\}} I^{c}(y) \Bigr)$$

Here, $x$ denotes the current pixel position, and $\Omega(x)$ represents a local window centered at $x$. The variable $y$ denotes a pixel position within the corresponding window, and $c$ represents an RGB color channel. $I^{c}(y)$ denotes the pixel intensity of the $c$-th color channel at position $y$ in image $I$. This equation calculates the dark channel value at each position by first selecting the minimum value among the RGB channels at each pixel within the local window $\Omega(x)$, and then taking the minimum over the entire local window [19]. Through this process, DCP estimates the transmission and atmospheric light in the atmospheric scattering model and restores a haze-free image.
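The two-stage minimum described above (over color channels, then over the local window) can be sketched in a few lines of NumPy. This is an illustrative implementation only: the function name `dark_channel`, the edge-replicating border handling, and the default window size are assumptions, and the window is assumed to be odd.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def dark_channel(image, window=15):
    """Dark channel of an (H, W, 3) image using an odd-sized square window."""
    per_pixel_min = image.min(axis=2)                  # minimum over the RGB channels
    pad = window // 2
    padded = np.pad(per_pixel_min, pad, mode="edge")   # replicate borders (assumption)
    patches = sliding_window_view(padded, (window, window))
    return patches.min(axis=(2, 3))                    # minimum over each local window

# Toy example: a bright image with a single dark value in one channel.
img = np.ones((5, 5, 3))
img[2, 2, 1] = 0.0
dc = dark_channel(img, window=3)
print(dc)
```

Every output pixel whose 3 x 3 window contains the dark pixel takes the value 0, while the corners stay at 1, which mirrors how a single dark channel value dominates its neighbourhood.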
With advances in deep neural networks, dehazing research has shifted from physics-model-driven approaches to data-driven end-to-end strategies [20]. These approaches directly learn the degradation-to-restoration mapping, eliminating the need for explicit estimation of transmission and atmospheric light [21]. Owing to their strong feature-learning capabilities, deep learning-based dehazing methods can achieve superior visual quality and quantitative performance even in complex scenes.
Recently, several studies have investigated the integration of image enhancement or dehazing with object detection. AOD-Net used an end-to-end dehazing network as a front-end module for Faster R-CNN and demonstrated improved object detection performance on hazy images [22]. IA-YOLO introduced a differentiable image processing module that adaptively enhances input images according to the degree of weather degradation, thereby improving YOLO detection performance in foggy and low-light environments [23]. BAD-Net proposed detection-friendly dehazing by connecting the dehazing and object detection modules in an end-to-end manner [12]. ERUP-YOLO proposed an integrated image-adaptive processing framework that enhances adverse-weather images through differentiable filters, improving YOLO detection performance in foggy and low-light scenes [24].
Among deep learning-based dehazing methods, U-Net-based encoder–decoder architectures have been widely used for image restoration. These architectures extract multi-scale degradation features through the encoder and progressively restore image details through the decoder. In addition, skip connections partially preserve texture and boundary information from shallow layers, helping restore object contours and local structures in hazy images [25,26,27]. The DL-U-Net model [28] used in this study belongs to this class of lightweight dehazing models. Its simple structure and low inference cost make it suitable as a preprocessing module for object detection systems. However, this study does not propose a new network; instead, it uses DL-U-Net as a dehazing preprocessing module and analyzes the impact of dehazing preprocessing on object detection performance.
Existing studies reveal two issues that require further analysis. First, the effect of dehazing on object detection performance is not always positive and may vary depending on fog density, IoU threshold, and object class [12,29]. Second, most studies focus on new network architectures or joint optimization frameworks, while insufficient attention has been given to whether dehazing preprocessing can consistently improve detector performance [12,23,24,29].
This study uses image dehazing as a preprocessing step for object detection in foggy environments and analyzes its actual impact on detection performance. Specifically, this study conducted comparative experiments on original hazy images and dehazed images without modifying the object detector, and evaluated the extent to which dehazing preprocessing contributes to detection performance under each fog-density condition using standard object detection metrics, including Precision, Recall, F1-score, mAP@0.5, mAP@0.75, and mAP@0.5:0.95.
3. Task-Oriented Evaluation of Dehazing Preprocessing
Figure 1 shows the object detection evaluation framework based on dehazing preprocessing. The experiments used images from the Foggy Cityscapes dataset under three fog-density conditions, with atmospheric attenuation coefficients $\beta$ of 0.005, 0.010, and 0.020. These settings correspond to light, moderate, and dense fog conditions, respectively. This study compares object detection performance with and without dehazing preprocessing under the three fog-density conditions.
As shown in Figure 1, each input hazy image is divided into two evaluation paths. Path A represents the baseline direct path, in which no dehazing is applied and the original hazy image is directly fed into the YOLO26n object detector. Path B represents the proposed path, in which the same hazy image is first processed by the DL-U-Net-based dehazing module, and the resulting dehazed image is then fed into the same YOLO26n object detector. The detection results from the two paths are then compared to evaluate the impact of dehazing preprocessing on object detection performance.
The two paths use the same input procedure, fog-density settings, object detector, and evaluation criteria. The performance differences between the two paths indicate how effectively contrast and structural information restored during preprocessing contribute to downstream object detection. This detection-oriented comparison is designed to examine whether dehazing preprocessing leads to improved detection performance.
Figure 2 presents a simplified dual-path comparison structure for analyzing the effect of dehazing preprocessing on object detection performance. The upper part represents the preprocessing dehazing step, whereas the lower part represents the object detection step. Path A represents the baseline direct path, in which the hazy image is directly input into the object detection module without dehazing to produce the skip-dehazing results. In contrast, Path B represents the proposed path, in which the hazy image is first processed by the preprocessing dehazing module and then fed into the same object detection module to produce the with-preprocessing results.
Although the two paths use different image conditions, they share the same detector and evaluation criteria. During the experiments, the detector weights, input image size, class mapping rules, ground truth annotations, IoU matching criteria, and evaluation metrics were kept identical. The detection results from the two paths were then integrated and quantitatively evaluated using Precision, Recall, F1-score, mAP@0.5, mAP@0.75, and mAP@0.5:0.95.
This experimental design aims to verify whether dehazing preprocessing improves not only image clarity but also object detection performance. Through these experiments, this study systematically analyzes the effect of dehazing preprocessing on detection performance under different fog-density conditions.
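The dual-path comparison can be sketched as a small harness that feeds each hazy image to one fixed detector twice, once directly and once after dehazing. The function name `run_dual_path` and the callable-based interface are hypothetical; the actual detector and dehazing model would be wrapped as the `detector` and `dehazer` callables.

```python
from typing import Any, Callable, Dict, Iterable, List

def run_dual_path(images: Iterable[Any],
                  detector: Callable[[Any], Any],
                  dehazer: Callable[[Any], Any]) -> Dict[str, List[Any]]:
    """Run the same detector on hazy inputs (Path A) and dehazed inputs (Path B)."""
    results: Dict[str, List[Any]] = {"path_a": [], "path_b": []}
    for hazy in images:
        results["path_a"].append(detector(hazy))            # Path A: no dehazing
        results["path_b"].append(detector(dehazer(hazy)))   # Path B: dehaze, then detect
    return results

# Stand-in callables, only to show the data flow.
demo = run_dual_path([1, 2], detector=lambda x: x * 2, dehazer=lambda x: x + 1)
print(demo)
```

Because both paths call the same `detector` object, any difference between `path_a` and `path_b` can only come from the preprocessing step, which is the point of the design.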
4. Experiments
This section quantitatively verifies whether dehazing-based preprocessing can improve object detection performance degraded in foggy environments. The focus of this study is not to propose a new algorithm or network architecture, but to analyze the practical impact of dehazing preprocessing on downstream object detection tasks. To this end, under each fog-density setting, the original hazy images and dehazed images are input into the same YOLO26n detector, and the resulting changes in detection performance are compared and analyzed. Section 4.1 presents the experimental environment, dataset configuration, and evaluation details. Section 4.2 analyzes the changes in detection performance under each fog-density condition. Section 4.3 compares the detection results for each object class. Section 4.4 provides framework-level comparative experiments. Section 4.5 presents visual comparisons of the detection results. Section 4.6 compares computational costs and conducts statistical significance analysis of the experimental data.
4.1. Experimental Setup, Dataset Configuration, and Evaluation Details
The implementation is based on Python 3.9.16 and PyTorch 2.6.0 with CUDA 11.8 support in a Windows 11 environment. The hardware comprises an Intel Core i7-14700K, an NVIDIA RTX 4090 (24 GB), and 64 GB of memory, which is sufficient for image dehazing preprocessing, object detection inference, and evaluation metric calculation.
In terms of data usage, Foggy Cityscapes includes 2975 training images and 500 validation images under each fog-density setting. Because the dataset provides three fog-density versions with $\beta = 0.005$, $\beta = 0.010$, and $\beta = 0.020$, it comprises a total of (2975 + 500) × 3 = 10,425 hazy images when the training and validation sets are combined.
The object detection labels used in the experiments were derived from the gtFine label information of Cityscapes. These labels were generated by converting the object regions annotated in the original Cityscapes scenes. Because the hazy images under the three $\beta$ settings correspond to the same original clear Cityscapes images, the same scene shares identical ground truth annotations across different fog-density conditions.
Since the label-class taxonomy of Foggy Cityscapes differs from the pre-trained class taxonomy of YOLO26n, this study retained only object classes with clear semantic correspondence between Cityscapes and YOLO/COCO to ensure the consistency of the evaluation results.
Specifically, a total of seven classes were included: person, bicycle, car, motorcycle, bus, train, and truck. These classes were mapped to the corresponding YOLO/COCO classes of person, bicycle, car, motorcycle, bus, train, and truck, respectively. These classes not only correspond to major object classes frequently appearing in urban scenes but also have clear one-to-one correspondence with the pre-trained class system of YOLO26n. Therefore, they were set as the unified object detection evaluation classes in this study.
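The seven-class correspondence can be written down as a small lookup table. The sketch below uses the standard 80-class COCO indices for the listed categories; the dictionary name, the helper `filter_predictions`, and the prediction format (a dict with a `class_id` key) are hypothetical and only illustrate how detections outside the evaluated classes could be discarded.

```python
# Cityscapes class name -> index in the 80-class COCO label set.
CITYSCAPES_TO_COCO = {
    "person": 0,
    "bicycle": 1,
    "car": 2,
    "motorcycle": 3,
    "bus": 5,
    "train": 6,
    "truck": 7,
}

def filter_predictions(predictions):
    """Keep only predictions whose COCO class index is one of the seven evaluated classes.

    Each prediction is assumed to be a dict with a "class_id" key (hypothetical format).
    """
    kept = set(CITYSCAPES_TO_COCO.values())
    return [p for p in predictions if p["class_id"] in kept]

# Example: class 4 (airplane in COCO) is dropped, class 2 (car) is kept.
example = filter_predictions([{"class_id": 4}, {"class_id": 2}])
print(example)
```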
To ensure that the experimental results reflect the effect of dehazing preprocessing itself on object detection performance, YOLO26n always used the MS COCO pre-trained weights, and no additional training or fine-tuning was performed using the Foggy Cityscapes dataset. In the preprocessing setting, DL-U-Net was used as the dehazing front-end, which is the main analysis target of this study, while AOD-Net and DehazeFormer were used as comparative dehazing preprocessing models. Each dehazing model was used only to generate dehazed images, and the generated dehazing results were input into the same YOLO26n detector.
Throughout the experiment, the detector weights, input image size, class mapping rules, ground truth labels, IoU matching criteria, and evaluation metrics were kept identical. The detection results were evaluated following the unified evaluation protocol described in Section 3. Specifically, this study determined the matching between predicted boxes and ground truth boxes according to whether their IoU reached the predefined threshold, and then calculated standard object detection metrics, including Precision, Recall, F1-score, mAP@0.5, mAP@0.75, and mAP@0.5:0.95.
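The IoU-based matching step can be illustrated with a short, self-contained sketch for a single image and a single class. The greedy, confidence-ordered matching used here is a common convention and is an assumption; the text does not specify the exact matching procedure, and the function names are illustrative.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def precision_recall_f1(preds, gts, iou_thr=0.5):
    """preds: list of (box, score); gts: list of boxes (one class, one image)."""
    matched = set()
    tp = 0
    for box, _ in sorted(preds, key=lambda p: p[1], reverse=True):
        # Greedily match each prediction to the best still-unmatched ground truth box.
        best_j, best_iou = -1, 0.0
        for j, gt in enumerate(gts):
            if j in matched:
                continue
            value = iou(box, gt)
            if value > best_iou:
                best_j, best_iou = j, value
        if best_j >= 0 and best_iou >= iou_thr:
            matched.add(best_j)
            tp += 1
    fp = len(preds) - tp          # predictions without a matching ground truth box
    fn = len(gts) - tp            # ground truth boxes that were missed
    precision = tp / (tp + fp) if preds else 0.0
    recall = tp / (tp + fn) if gts else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
preds = [((0, 0, 10, 10), 0.9), ((50, 50, 60, 60), 0.8)]
print(precision_recall_f1(preds, gts))
```

Raising `iou_thr` from 0.5 to 0.75 turns loosely localized matches into false positives and false negatives, which is why the stricter threshold is more sensitive to small shifts in box position.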
4.2. Changes in Detection Performance Under Each Fog-Density Condition
This section quantitatively evaluates the practical gain provided by the DL-U-Net dehazing front-end for YOLO26n, which is used as a fixed downstream object detector. The experiments were performed under three fog-density conditions with atmospheric attenuation coefficients $\beta = 0.005$, $\beta = 0.010$, and $\beta = 0.020$. In the same detector environment, the detection performance of the original hazy images (Hazy) and the DL-U-Net-processed dehazed images (Dehazed) was compared. Standard detection metrics, including Precision and Recall, together with IoU-threshold-based F1-score, mAP@0.5, mAP@0.75, and mAP@0.5:0.95, were used to separately analyze detection and localization performance.
Table 1 summarizes the detection performance of the original hazy images (Hazy) and the DL-U-Net dehazed images (Dehazed) for each fog-density condition, along with the corresponding differences between the two.
First, the detection performance gain obtained by DL-U-Net dehazing is positively correlated with fog density, while the magnitude of the gain increases nonlinearly as fog density becomes higher. The relative improvements in are , , and for , , and , respectively, showing a much steeper increase than the arithmetic increase in . A similar nonlinear trend is consistently observed in and Recall.
This indicates that the detector exhibits a nonlinear response to changes in input quality. In regions where sufficient detection signals are already preserved, additional signal restoration provides only a limited marginal benefit. In contrast, in regions where the signal has degraded below a critical level, the same degree of restoration may determine whether an object is detectable. Therefore, the practical utility of detection-oriented dehazing is not uniformly observed across all fog conditions but is concentrated under conditions in which visibility degradation exceeds a certain level.
Second, the effect of dehazing is asymmetric between Precision and Recall. At , Recall is approximately 3.7 times greater than Precision . This can be explained by the fact that candidate detections suppressed below the confidence threshold by fog-induced degradation are raised above the threshold through structural restoration by the dehazing front-end and are therefore detected. This indicates that dehazing primarily contributes to recovering missed detections rather than suppressing false positives (FP).
Meanwhile, Precision peaks at for and then decreases to for , showing a pattern different from Recall, which consistently increases as fog density becomes higher. This suggests that, under dense fog conditions, residual artifacts introduce some false positives, partially offsetting the improvement in Precision. Consequently, the improvement observed under dense fog conditions is mainly driven by Recall recovery.
Third, when the changes in Table 1 are examined according to the IoU threshold, the effect of dehazing varies qualitatively depending on the threshold. mAP@0.5 consistently improves under all fog-density conditions. In contrast, mAP@0.75, which reflects stricter localization accuracy, shows almost no improvement or a slight decrease at $\beta = 0.005$ and $\beta = 0.010$, and shows only a small improvement at $\beta = 0.020$.
This can be explained by the fact that dehazing produces mutually competing effects. While it restores object visibility and confidence responses, thereby improving classification-related performance, it can also partially degrade precise bounding box regression by introducing subtle artifacts and pixel smoothing. When fog density is low, the regression-disturbing effect is more pronounced than the classification improvement. However, under dense fog conditions, the effect of contour restoration outweighs the negative influence of artifacts. Meanwhile, mAP@0.5:0.95, which is averaged over 10 IoU thresholds, maintains stable improvement across all fog-density conditions, indicating that the slight decreases observed at specific thresholds do not appear at the level of the aggregated metric. Therefore, dehazing preprocessing stably improves classification-related performance across all fog-density conditions, whereas its benefit for precise localization regression appears only under dense fog conditions.
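The averaging behind mAP@0.5:0.95 can be sketched as follows. The ten thresholds run from 0.50 to 0.95 in steps of 0.05; the per-threshold gains in the example are made-up numbers chosen only to show how a small drop at one threshold can vanish in the average, and they are not results from this study.

```python
def coco_iou_thresholds():
    """Ten IoU thresholds from 0.50 to 0.95 in steps of 0.05."""
    return [round(0.50 + 0.05 * i, 2) for i in range(10)]

def mean_over_thresholds(value_at):
    """Average a per-threshold quantity (e.g. AP or an AP difference) over the ten thresholds."""
    thresholds = coco_iou_thresholds()
    return sum(value_at(t) for t in thresholds) / len(thresholds)

# Hypothetical AP changes: +0.02 at every threshold except a -0.01 dip at IoU 0.75.
def toy_gain(t):
    return -0.01 if abs(t - 0.75) < 1e-9 else 0.02

print(coco_iou_thresholds())
print(mean_over_thresholds(toy_gain))
```

In this toy case the averaged change stays positive even though the change at IoU 0.75 is negative, matching the qualitative pattern described above.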
Based on these results, the utility of the DL-U-Net dehazing front-end can be summarized as follows. The degree to which dehazing contributes to detection performance increases nonlinearly as fog density becomes higher, and this contribution is mainly achieved through Recall recovery, that is, the reduction in missed detections. In addition, dehazing stably enhances classification-related performance across all fog-density conditions, whereas its benefit for precise localization regression appears only under dense fog conditions. These results indicate that the detection-oriented evaluation framework proposed in this paper can quantitatively evaluate the effectiveness of the dehazing front-end from the perspective of downstream detection tasks, rather than simple visual quality.
4.3. Class-Wise Detection Results
This section evaluates the effect of DL-U-Net on class-wise object detection performance using mAP@0.5 and mAP@0.75 as evaluation metrics. By comparing the performance differences before and after dehazing, this section demonstrates how DL-U-Net contributes to feature restoration for downstream detection tasks.
According to the mAP@0.5 results in Table 2, the overall changes before and after dehazing are relatively small under the light fog condition of $\beta = 0.005$. Slight increases are observed for bicycles, cars, trains, and trucks, whereas slight decreases appear for persons, motorcycles, and buses. This suggests that, under light fog conditions, object contours, textures, and local contrast information are still relatively well preserved in the original images, thereby limiting the additional detection gain provided by DL-U-Net preprocessing.
Under the $\beta = 0.010$ condition, mAP@0.5 increases for most classes except truck. In particular, positive changes are observed for cars, trains, motorcycles, and bicycles. This can be attributed to the fact that DL-U-Net partially restores the structural information required for detection under moderate fog conditions, where object boundaries and fine structures begin to be weakened.
Under the dense fog condition of $\beta = 0.020$, this tendency becomes more pronounced. mAP@0.5 increases for all classes except bus, with the car class showing the largest increase. This result indicates that the vehicle contours and local contrast weakened by dense fog are partially recovered after DL-U-Net processing, thereby enhancing the detection response of YOLO26n. Therefore, based on mAP@0.5, the effect of DL-U-Net is more clearly observed under moderate and dense fog conditions than under light fog conditions.
Compared with mAP@0.5, the mAP@0.75 results in Table 3 show a more conservative trend. Under the $\beta = 0.005$ and $\beta = 0.010$ conditions, mAP@0.75 decreases for some classes, indicating that although dehazing can help determine object presence or increase detection confidence, it does not always improve precise bounding box localization. Even when object contours and contrast are enhanced through DL-U-Net processing, smoothing in certain regions or subtle texture changes may still impose a burden on bounding box regression.
In contrast, under the $\beta = 0.020$ condition, mAP@0.75 increases for person, bicycle, car, train, and truck. In particular, cars and trucks show positive changes in both mAP@0.5 and mAP@0.75. This confirms that, under dense fog conditions, the contour restoration effect of DL-U-Net can positively affect not only object detectability but also the localization stability of certain object classes.
In summary, the effect of the DL-U-Net dehazing front-end is not consistent across all object classes and IoU criteria. Based on mAP@0.5, detection performance improves for more classes as fog density increases, indicating that DL-U-Net helps reduce missed detections by restoring object contours, textures, and local contrast information weakened by fog. On the other hand, based on mAP@0.75, the performance improvement is limited, and some classes even show decreases. Therefore, the effect on precise localization regression varies depending on fog density and object type. These results show that the detection-oriented evaluation framework proposed in this paper can analyze the influence of the dehazing front-end in detail, not only in terms of overall mAP but also with respect to object class and IoU criterion.
4.4. Framework-Level Comparative Experiments
This section conducts a framework-level front-end replacement experiment to verify whether the proposed evaluation framework can be generally applied to different dehazing front-ends. In addition to the DL-U-Net adopted in this paper, AOD-Net and DehazeFormer are applied at the same position, while all other settings, including the Foggy Cityscapes validation dataset, the YOLO26n detector, class mapping rules, and evaluation metrics, are kept identical.
Reference in
Table 4 refers to the input original hazy images without dehazing processing. As shown in
Table 4, detection performance varies substantially under the same fog condition depending on the type of dehazing front-end. Under the lightest fog condition,
, the
improvements compared with Reference differ across the front-ends: DehazeFormer achieves
, DL-U-Net achieves
, and AOD-Net shows
. AOD-Net records lower performance than Reference despite dehazing processing. By contrast, as the fog density increases, the performance gap becomes narrower. At
, DehazeFormer, DL-U-Net, and AOD-Net show improvements of
,
, and
, respectively, indicating clear detection gains for all three front-ends. This is because, under dense fog conditions, a large amount of structural information is lost, so signal restoration above a certain level contributes to detection regardless of the restoration method. However, under light fog conditions, sufficient detection-related information is already preserved in the original image, and the side effects of the restoration method therefore become relatively more apparent in the detection results.
As a result, even for the same detection task, downstream detection performance may either improve or even decrease depending on the design of the dehazing front-end. The proposed evaluation framework can quantitatively identify these differences based on detection performance itself, rather than visual restoration quality.
In addition, the gap between DL-U-Net and DehazeFormer is at , at , and at , indicating that the gap rapidly narrows as fog density increases. Under the dense fog condition , the detection performance of the two front-ends becomes nearly identical. Their values are and , respectively, with a difference of only . This indicates that although DehazeFormer has an advantage in visual restoration quality due to its deeper and heavier architecture, the lightweight DL-U-Net can also provide sufficient effectiveness under dense fog conditions in terms of restoring structural information useful for detection. In other words, as fog density increases, the amount of structural information that needs to be restored becomes a key factor determining detection performance, while differences in model capacity are not fully translated into differences in detection performance. This characteristic is the basis for selecting DL-U-Net as the main analysis target in this paper and using DehazeFormer as a strong baseline for comparison.
Finally, the results of AOD-Net show low consistency. At , its Precision is , which is slightly higher than that of Reference , but its Recall decreases substantially to compared with Reference , resulting in a decrease in . Even at , although improves by , Precision remains lower at than that of Reference . This can be interpreted as a result of AOD-Net dehazing outputs introducing color bias and local distortions, thereby partially damaging the detailed information that the detector needs to utilize. In other words, although AOD-Net visually removes some haze, the side effects introduced during this process disturb the detector response, preventing the dehazing effect from being stably converted into detection gains.
These results confirm that the proposed evaluation framework has both generality and discriminative capability, allowing different front-end models to be compared under the same criteria. Within this framework, DehazeFormer provides the highest absolute performance across all fog-density conditions, while DL-U-Net shows stable detection improvement and achieves performance close to that of DehazeFormer under dense fog conditions. In contrast, AOD-Net shows variability, including a decrease in detection performance under light fog conditions.
4.5. Visual Comparison of Detection Results
In this section, representative samples under different fog-density conditions were selected for visualization analysis to more intuitively verify the effect of dehazing preprocessing on subsequent object detection results. All images were input into the same YOLO26n detector, and the input size, confidence threshold, and inference settings were kept identical.
As shown in Figure 3, in the original hazy images, the detector frequently exhibits missed detections, low confidence scores, and positional bias in bounding boxes. Such degradation is particularly pronounced for distant small objects or traffic objects in complex backgrounds. As the fog density increases, these problems become visually more apparent. Under the light fog condition of $\beta = 0.005$, the differences in detection results before and after dehazing are barely distinguishable, whereas under the dense fog condition of $\beta = 0.020$, object contours become clearly sharper after processing, and objects that were previously missed are detected again. This visually corresponds to the nonlinear amplification pattern of detection performance improvement derived in Section 4.2, namely, that the detection gain from dehazing increases sharply as fog becomes denser. In particular, the recovery of missed detections analyzed in Section 4.2 is confirmed to be the most prominent change in the visualization.
As shown in Figure 4, the visual restoration characteristics of the three models do not result in consistent differences in the detection results. Although images processed by AOD-Net show enhanced contrast in some scenes, they also exhibit brightness bias and local texture distortion, which leads to unstable bounding box locations or missed targets. This is consistent with the decrease in Precision observed in
Section 4.4 and with the result that AOD-Net showed lower detection performance than the original hazy input under light fog conditions. Images processed by DL-U-Net stably restore the contours and local structures of major traffic objects, such as vehicles and pedestrians, under dense fog conditions, making the detection boxes clearer and the responses more stable. Images processed by DehazeFormer show the highest overall clarity; however, under dense fog conditions, the difference between its detection results and those of the DL-U-Net-processed images is marginal. This is consistent with the quantitative result in
Section 4.4, where the
gap between the two models narrowed to
at
.
The results thus intuitively support the conclusions of the quantitative analysis presented in Section 4.2 and Section 4.4. In particular, although DehazeFormer shows the best visual restoration quality, its detection results under dense fog conditions are not clearly different from those of DL-U-Net. This visually supports the argument of this paper that the value of a dehazing front-end cannot be judged solely by visual metrics, but should be evaluated from the perspective of downstream detection tasks.
4.6. Computational Cost and Statistical Significance Analysis
To compare the computational cost of different dehazing preprocessing front-ends, this section measures the pure inference time of each model. Here, pure inference time is defined as the duration from the moment the input tensor is passed to the model to the completion of the neural network forward pass. Accordingly, this measurement excludes data loading, preprocessing, result saving, and other pipeline operations, focusing only on the computational cost of the dehazing model itself.
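A timing harness consistent with this definition of pure inference time might look like the sketch below. It times only the forward call on already-prepared inputs; the function name `mean_inference_ms`, the warm-up count, and the optional `sync` hook are assumptions. When timing a PyTorch model on a GPU, `sync` would typically be `torch.cuda.synchronize`, since CUDA kernels are launched asynchronously.

```python
import time

def mean_inference_ms(forward, inputs, warmup=5, sync=None):
    """Average wall-clock time of forward(x) in milliseconds.

    forward : callable that runs only the model forward pass
    inputs  : list of already-prepared inputs (loading and preprocessing excluded)
    sync    : optional callable that blocks until queued device work has finished
    """
    for x in inputs[:warmup]:          # warm-up runs are not timed
        forward(x)
    if sync:
        sync()
    total = 0.0
    for x in inputs:
        start = time.perf_counter()
        forward(x)
        if sync:
            sync()                     # ensure asynchronous work is included in the timing
        total += time.perf_counter() - start
    return 1000.0 * total / len(inputs)

# Stand-in "model": an identity function timed on three dummy inputs.
elapsed_ms = mean_inference_ms(lambda x: x, [1, 2, 3])
print(elapsed_ms >= 0.0)
```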
Table 5 presents a comparison of the pure inference times of AOD-Net, DL-U-Net, and DehazeFormer under the same experimental conditions. AOD-Net achieved the fastest processing speed, with an average inference time of 5.74 ms and 174.28 FPS. DL-U-Net recorded an average inference time of 11.90 ms and 84.07 FPS. Although it is slower than AOD-Net, its inference time remains within approximately 12 ms per image, indicating that its computational cost is sufficiently low for real-time processing. In contrast, DehazeFormer exhibited the highest inference time, with an average of 46.30 ms and 21.60 FPS.
These differences mainly arise from the structural complexity of each model. AOD-Net achieves the fastest speed because of its simple and lightweight CNN-based architecture, whereas DehazeFormer requires substantially longer inference time due to its Transformer-based attention operations and multi-scale feature processing. Although DL-U-Net requires more computation than AOD-Net because of its 9-channel input and U-Net-based architecture, it is approximately 3.89 times faster than DehazeFormer. These results indicate that DL-U-Net provides a relatively well-balanced preprocessing model in terms of the trade-off between detection performance improvement and computational cost.
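The speed ratio quoted above follows directly from the mean inference times reported for Table 5, as this short check shows. The frame rates derived here as 1000 divided by the mean latency land close to, but not exactly on, the reported FPS values, which suggests the reported FPS was computed slightly differently; that interpretation is an assumption.

```python
# Mean pure inference times in milliseconds, as reported in the text.
latency_ms = {"AOD-Net": 5.74, "DL-U-Net": 11.90, "DehazeFormer": 46.30}

# How many times faster DL-U-Net is than DehazeFormer.
speedup = latency_ms["DehazeFormer"] / latency_ms["DL-U-Net"]

# Frames per second implied by each mean latency.
fps_from_mean = {name: 1000.0 / ms for name, ms in latency_ms.items()}

print(round(speedup, 2))
for name, fps in fps_from_mean.items():
    print(name, round(fps, 2))
```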
As shown in Table 6, dehazing preprocessing increased Recall and F1-score under all three fog-density conditions. Since none of the corresponding 95% confidence intervals included 0, these differences can be regarded as statistically significant. Under the $\beta = 0.005$
condition, the increases in Recall and F1-score were relatively small, at
and
, respectively, suggesting that the detection performance gain from dehazing is limited under light fog conditions.
As fog density increased, the magnitude of improvement in both metrics became larger. Under the condition, Recall and F1-score increased by and , respectively, and under the condition, they increased by and , respectively. Overall, the statistical significance analysis indicates that dehazing preprocessing has a relatively stable positive effect on Recall and F1-score. This effect mainly contributes to reducing missed detections to some extent and improving the overall detection performance. The improvement tendency becomes more evident under dense fog conditions.
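One way to obtain a 95% confidence interval for a paired difference is a percentile bootstrap over per-image differences. The sketch below is illustrative only: the text does not state which procedure was used for Table 6, so the bootstrap itself, the per-image pairing, the function name, and the toy data are all assumptions.

```python
import random

def paired_bootstrap_ci(diffs, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean of paired differences."""
    rng = random.Random(seed)
    n = len(diffs)
    means = []
    for _ in range(n_boot):
        sample = [diffs[rng.randrange(n)] for _ in range(n)]   # resample with replacement
        means.append(sum(sample) / n)
    means.sort()
    lower = means[int((alpha / 2) * n_boot)]
    upper = means[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper

# Toy per-image differences (Dehazed minus Hazy); not data from this study.
toy_diffs = [0.02, 0.01, 0.03, 0.00, 0.02, 0.01, 0.04, 0.02, 0.01, 0.03]
low, high = paired_bootstrap_ci(toy_diffs)
print(low > 0.0, low <= high)
```

If the resulting interval excludes 0, the mean difference is treated as statistically significant at the chosen level, which is the criterion applied in the paragraph above.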