1. Introduction
Images captured in bad weather suffer from scattering by haze, fog, and dust in the atmosphere. This scattering reduces scene visibility, attenuates color, and blurs details. With the development of outdoor computer vision systems, single image dehazing approaches have been widely studied in recent years. These approaches can be divided into two categories: physical model-based approaches and non-physical model-based approaches.
The prime reason for the low visibility of hazy images is the absorption and scattering of light by suspended particles in the atmosphere [1]; the scattering attenuates light during transmission between the target and the camera and adds a layer of air light scattering [2]. Narasimhan et al. [3] built the Atmospheric Scattering Model (ASM) to explain these scattering factors. The model, widely used for dehazing, is formulated by the following equation:
I(x) = J(x) t(x) + A (1 − t(x)),  with t(x) = e^(−βz),

where I is the observed hazy image, J is the scene radiance, A is the air light, t is the scattering medium transmittance, β is the attenuation coefficient of the scattering medium, and z is the scene depth.
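As a minimal illustration of the forward process described by the ASM, the following NumPy sketch synthesizes a hazy image from a clear scene and a depth map; the values of A, β, and the toy scene are illustrative assumptions, not settings from our experiments.

```python
import numpy as np

def synthesize_haze(J, z, A=0.9, beta=1.2):
    """Forward process of the ASM: I = J * t + A * (1 - t),
    with transmittance t = exp(-beta * z)."""
    t = np.exp(-beta * z)                # scattering medium transmittance
    if J.ndim == 3:                      # broadcast depth map over color channels
        t = t[..., None]
    return J * t + A * (1.0 - t)

# toy scene: radiance gradient with linearly increasing depth
J = np.tile(np.linspace(0.2, 0.8, 64), (64, 1))
z = np.tile(np.linspace(0.0, 3.0, 64), (64, 1))
I = synthesize_haze(J, z)                # distant pixels tend toward the air light A
```

Note how pixels at large depth converge to the air light A regardless of their radiance, which is exactly the behavior the transmittance term encodes.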
Since light scattering is the main cause of image degradation in bad weather, physically grounded single image dehazing based on the ASM has gained popularity. The essence of these algorithms is to estimate the transmittance and air light using priors [4,5,6]. Deep learning strategies [7,8,9] have also been proposed to estimate the ASM parameters more accurately without hand-crafted priors.
Decreases in contrast and saturation are among the most remarkable features of hazy images, and most non-physical-model-based approaches aim to increase contrast or saturation to suit human vision. In the early days, image contrast enhancement methods, including Histogram Equalization (HE) algorithms [10,11], Retinex algorithms [12,13,14], and image fusion algorithms [15,16], were used for haze removal; as they do not consider physical constraints, their effectiveness is quite limited. Current data-driven methods [17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32] that map the input hazy image directly to the dehazed image can obtain satisfactory results. However, many challenges remain. Prior-based methods suffer from shortcomings such as inadequate edge-information recovery and halo artifacts in sky regions. Data-driven deep learning algorithms, meanwhile, are not grounded in the physical process and are prone to artifacts or color shifts under dense or non-uniform haze. Furthermore, acquiring large-scale, high-quality paired datasets has long been a persistent challenge for such algorithms.
Therefore, ideas for improving the physical scattering model have been investigated. Ju et al. [33] established a concise gamma-correction-based dehazing model (GDM) that captures the inner relationship between gamma correction and the ASM. The GDM compensates for the brightness deficiency of the ASM to a certain extent, but the defects of the model itself are hard to overcome: because the transmittance t takes the form of an exponential function, it decreases dramatically with scene depth.
Considering these issues, this paper proposes a new scattering model called the United Scattering Transmission Model (USTM), based on the image degradation model:

I(x) = h(x, z) ∗ J(x) + n(x, z),

where h(x, z) is the point spread function caused by forward scattering and n(x, z) is the noise caused by back scattering, both of which are related to the scene depth z. In building the model, both forward scattering and back scattering are taken into consideration physically. The haze-free image, the hazy image, and its edge operator are then related by means of a Taylor expansion, as shown in Figure 1.
Haze removal with the USTM is carried out and compared with other scattering models and also with deep learning networks. We summarize the contributions of our work as follows:
In contrast to existing scattering models in the literature that merely superimpose forward and backward scattering effects in a decoupled manner, we remodel the medium using a layered model and propose a physically self-consistent unified framework, USTM, that achieves the coupled modeling of the two scattering mechanisms via diffusion equations. It overcomes the drawbacks of traditional combined models, such as premature convergence in large-depth scenes and insufficient compensation for high-frequency details.
We use Taylor expansion to introduce a derivative edge operator into the USTM, which contributes to the preservation of details.
The depth parameter in the USTM is formulated as the attenuation optical depth βz rather than the transmittance t. Since the dynamic range of the attenuation optical depth is larger than that of the transmittance, which plays an important role in constraining image over-saturation, this model performs better in the sky region and foreground areas.
4. Simulated Scattering Experiment
To compare the linear relationship between the optical depth and the actual distance in the three models above, we set up a simulation experiment environment as shown in
Figure 3.
We used a sink full of muddy water as the scattering medium, then positioned a camera at one end of the sink, and a light source at the other end to assist in the generation of slit light. The blackboard could slide and stop at a certain distance from the camera. In this way, we acquired a series of slit images at different depths, which are listed in
Figure 4.
Let the optical depth be βz, as in Equations (1), (3), and (25). Since each slit image was captured at a fixed depth z, the optical depth should be a constant. However, the attenuation coefficient β of the scattering medium is unknown, which makes it impossible to restore the haze-free image with the exact value of βz. We therefore chose various values of βz for restoration and evaluated the restored results; the value of βz corresponding to the optimal restored result should approximate the actual optical depth. We separately applied the ASM, GDM, and USTM to the sets of slit images in Figure 4, obtaining the restoration results of each model for different βz values, as shown in Figure 5. Because the light attenuation coefficients of different wavelengths are dissimilar in water, we transformed the color from RGB space to HSI space and restored the image in the brightness subchannel.
To evaluate the restoration results, we compared the optical depths corresponding to the optimal restoration results (determined by the PSNR of each image) with the actual depth; owing to the simplicity of the slit images, the required clean reference images were generated directly by code.
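The selection procedure described above can be sketched as a sweep over candidate optical depths: invert the scattering model for each candidate and keep the one whose restoration maximizes PSNR against the clean reference. For brevity this sketch inverts only the ASM; the candidate grid and the air light value are illustrative assumptions.

```python
import numpy as np

def psnr(ref, img, max_val=1.0):
    """PSNR between a reference and a restored image."""
    mse = np.mean((ref - img) ** 2)
    return float('inf') if mse == 0 else 10 * np.log10(max_val ** 2 / mse)

def best_optical_depth(I, ref, A, candidates):
    """Invert the ASM for each candidate optical depth d (t = exp(-d))
    and keep the candidate whose restoration best matches the reference."""
    best_d, best_score = None, -np.inf
    for d in candidates:
        t = np.exp(-d)
        J = np.clip((I - A * (1 - t)) / t, 0.0, 1.0)   # ASM inversion
        score = psnr(ref, J)
        if score > best_score:
            best_d, best_score = d, score
    return best_d

# toy check: a slit image hazed at optical depth 0.8 should be recovered
ref = np.random.default_rng(0).random((32, 32))
I = ref * np.exp(-0.8) + 0.9 * (1 - np.exp(-0.8))
d_hat = best_optical_depth(I, ref, 0.9, np.linspace(0.1, 2.0, 39))
```

In the actual experiment the same sweep is run per slit image for each of the three models, and the winning candidate is compared against the known physical depth.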
Figure 6 shows the actual depth curve and optical depth curves obtained by the ASM, GDM, and USTM. It can be noted from the figure that the optical depths estimated by the USTM retain a proportional relationship with the actual depths in both small and large scenes, while the other two models have the problem of premature convergence at the larger depth.
It is worth noting that this experiment is not intended to simulate atmospheric scattering conditions, but to verify the feasibility of the USTM modeling method and the applicability of the model's physical constraint linking optical depth and actual depth. Although aqueous and atmospheric scattering media differ in their properties, they follow the same fundamental scattering laws, both exhibiting the superposition of forward and back scattering during light propagation. The experimental results confirm that the USTM maintains a linear correlation between optical depth and actual depth, which validates the rationality of the model's layered medium decomposition and diffusion equation derivation.
5. Haze Removal Using USTM
Using the USTM to remove haze also requires two estimates: one estimates the attenuation optical depth map; the other finds the sky region and then uses its power spectrum and mean value to fit the scattering medium characteristic parameters, namely the scattering parameter and A. The pseudo code for the dehazing process is provided in Algorithm 1.
The algorithm takes the hazy image I as input and outputs the haze-free image J as the final result. In Steps 1 and 2, the atmospheric light A and the transmission map t are estimated using the methods described below. In Steps 3 and 4, the power spectrum of the sky region is calculated. In Steps 5 to 11, a frequency map is constructed based on the sky region, and the zero-frequency component is removed via a mask; the term in Step 10 corresponds to the denominator of Equation (24), and the masked spectrum represents the power spectrum after excluding the zero-frequency point. In Steps 12 and 13, the scattering parameter is estimated via the least squares method. Finally, in Steps 14 to 16, the result of the Laplacian operator and the dehazed image derived from the USTM model are computed.
5.1. Scattering Medium Characteristic Parameters’ Estimation
Air light A in Equations (1), (3), and (25) represents the intensity of atmospheric light at infinity. In images containing sky regions, the sky's brightness is used as an approximation for this distant atmospheric light. In images without clear sky regions, a small portion of the brightest pixels or regions is typically selected, and their average value is used as an estimate for A, because the brightest parts of an image usually correspond to areas of extremely large scene depth or to inherently bright, highlighted objects. Moreover, this approach has been demonstrated to be robust even for images without a clear sky region, as evidenced by numerous publications in the field.
Algorithm 1: Single Image Dehazing Algorithm
Input: Hazy image I
Output: Dehazed image J
1:  Estimate the air light A                       // this estimation module can be replaced
2:  Estimate the transmission map t                // this estimation module can be replaced
    // Calculate the scattering parameter using the power spectrum method
3:  Compute the 2D Fourier transform of the sky region
4:  Compute its power spectrum
    // Create the frequency coordinate grid
5–8:  Build the frequency coordinate grid over the sky region
    // Exclude the zero-frequency point
9–11: Mask the zero-frequency point and form the masked power spectrum
    // Least squares fitting to estimate the scattering parameter
12: Define the fitting model of Equation (24)
13: Fit the scattering parameter by least squares
    // Perform dehazing calculation
14: Compute the Laplacian of the image
15: Compute the dehazed image via the USTM
16: return J
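A structural sketch of Steps 3 to 13 follows, under the assumption that the fitting expression of Equation (24) is supplied externally as `model_fn` (a hypothetical placeholder, since the exact expression is given in the paper's equations); the candidate grid and log-domain residual are also illustrative assumptions.

```python
import numpy as np

def fit_scattering_param(sky, model_fn):
    """Structural sketch of Algorithm 1, Steps 3-13: compute the sky
    region's power spectrum, mask the zero-frequency point, and fit one
    scattering parameter by least squares over a candidate grid.
    `model_fn(freq, p)` stands in for the model of Equation (24)."""
    F = np.fft.fft2(sky)                            # 2D Fourier transform
    P = np.abs(np.fft.fftshift(F)) ** 2             # power spectrum
    h, w = sky.shape
    fy, fx = np.meshgrid(np.fft.fftshift(np.fft.fftfreq(h)),
                         np.fft.fftshift(np.fft.fftfreq(w)),
                         indexing='ij')             # frequency coordinate grid
    freq = np.hypot(fx, fy)
    mask = freq > 0                                 # exclude zero-frequency point
    freq, P = freq[mask], P[mask]
    # coarse least-squares search over candidate parameter values
    cands = np.linspace(0.01, 5.0, 500)
    errs = [np.sum((np.log(P) - np.log(model_fn(freq, p))) ** 2)
            for p in cands]
    return cands[int(np.argmin(errs))]
```

The grid search is used here only to keep the sketch dependency-free; in practice any standard least-squares solver can take its place.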
In our algorithm, we adopt a quadtree-based method to estimate the air light A. This method recursively divides the image into four sub-blocks, calculates a score for each sub-block (defined as the average intensity minus the standard deviation), and selects the sub-block with the highest score for further iterative partitioning. The iteration terminates when the size of the current sub-block is less than a predefined minimum block size (e.g., 50 × 50 pixels). Finally, the average intensity of the selected sub-block is used as the estimate for the air light A.
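The quadtree procedure above can be sketched as follows; the scoring rule (mean minus standard deviation) and the 50 × 50 stopping size follow the description in the text, while the recursion is written iteratively for clarity.

```python
import numpy as np

def estimate_air_light(I, min_size=50):
    """Quadtree air light estimation: repeatedly keep the quarter with
    the highest (mean - std) score until the block is smaller than the
    minimum size, then return its average intensity."""
    gray = I.mean(axis=2) if I.ndim == 3 else I
    y0, y1, x0, x1 = 0, gray.shape[0], 0, gray.shape[1]
    while min(y1 - y0, x1 - x0) >= 2 * min_size:
        ym, xm = (y0 + y1) // 2, (x0 + x1) // 2
        blocks = [(y0, ym, x0, xm), (y0, ym, xm, x1),
                  (ym, y1, x0, xm), (ym, y1, xm, x1)]
        scores = [gray[a:b, c:d].mean() - gray[a:b, c:d].std()
                  for a, b, c, d in blocks]                  # mean minus std
        y0, y1, x0, x1 = blocks[int(np.argmax(scores))]      # keep best quarter
    if I.ndim == 3:
        return I[y0:y1, x0:x1].mean(axis=(0, 1))             # per-channel air light
    return gray[y0:y1, x0:x1].mean()
```

The (mean − std) score favors regions that are both bright and flat, which steers the search away from bright but textured objects such as white cars or buildings.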
After we obtain the air light A, the converted one-dimensional power spectrum can be used to fit the parameters in Equation (25). It should be mentioned that, because the edge operator in the USTM is very small, the result is robust as long as the fitted parameter value is correct to within an order of magnitude.
5.2. Depth Map Estimation
We use three prior-based depth map estimation methods (DCP, CAP, and NLD) in this paper.
DCP is based on a statistical regularity observed over a large number of haze-free images: the minimal gray level among the three RGB color channels of each image patch is very low and tends to zero. For any input image, its dark channel is defined as

J^dark(x) = min_{y∈Ω(x)} ( min_{c∈{r,g,b}} J^c(y) ).

For a clear haze-free image J, the mathematical representation of DCP is J^dark → 0. Moreover, based on statistical analysis of a considerable number of dark channels of the edge operators of hazy images, DCP also holds for them; in other words, the dark channel of the edge operator tends to zero as well. Then, assuming that the attenuation optical depth is constant within the same patch Ω, we perform dark channel operations on both sides of the USTM equation. Because both dark channel terms are zero, the resulting formula can be simplified. To prevent the denominator from being zero, we use a constraint coefficient. Finally, we estimate the refined attenuation optical depth map with a guided image filter [33]. In addition, transmission estimation methods that are independent of the scattering model, such as CAP and NLD, can be applied directly to the USTM. Since the dynamic range of the attenuation optical depth is larger than that of the transmittance t, we compare the depth maps estimated by DCP under both the ASM and the USTM.
Figure 7 shows the comparison of transmission maps estimated by the two models. The groups on the left are hazy images with a larger depth of field, and the groups on the right are images with a smaller depth scale. The results of the two models do not differ significantly in the right groups, but in the foreground areas of the left groups, the USTM shows greater anti-interference ability.
6. Experimental Results
6.1. Datasets
To make the comparison more comprehensive, our validation dataset employs both paired synthetic data and unpaired real-world data, covering diverse indoor/outdoor scenes and varying haze densities. Since obtaining reference-based evaluation metrics for the latter is difficult, we use no-reference metrics on the unpaired data.
RESIDE-RTTS [35]: The Real-world Task-Driven Testing Set (RTTS) is a subset of the RESIDE dataset created by Boyi Li et al. It contains over 4000 real hazy traffic images with varying resolutions and is suitable for multiple evaluation metrics.
Haze4K: Haze4K is a synthetic dataset comprising 4000 hazy images, each paired with corresponding clean images, ground truth transmission maps, and atmospheric light matrices. The images have a resolution of 400 × 400 pixels. Released in 2021 by multiple academic institutions, it serves as a large-scale, high-quality paired synthetic dataset.
Dense-Haze [36]: Dense-Haze consists of 33 pairs of real hazy and corresponding haze-free images, along with 22 additional pairs from the I-Haze and O-Haze datasets. The hazy images were captured under controlled conditions using a professional haze-generating machine, which produces dense and uniform haze that closely simulates real hazy environments. Furthermore, since the images were collected in a controlled setting, both hazy and haze-free ground truth images were captured under identical lighting conditions.
Classic hazy images: We collected about 30 classic hazy images frequently used in previous works as the dataset for real-world hazy scene images.
UA-DETRAC [37]: This dataset is used for the application experiment in Section 6.6. UA-DETRAC is a benchmark dataset for vehicle detection with precise annotations, captured in clear weather conditions. It contains four vehicle categories: Car, Bus, Van, and Other. In this work, it is utilized to train the object recognition model for the application experiment.
HazyDet [38]: This is a paired synthetic dataset captured by drones, designed for image dehazing research. In the application experiment of this paper, a subset is extracted and re-annotated according to the UA-DETRAC category standard to construct a hazy vehicle detection dataset for testing purposes.
For the comparative experiments, the code was executed on a cloud server with the configuration listed in Table 1.
6.2. Evaluation Metrics
We categorize the evaluation metrics into reference-based metrics, which require paired clear images, and no-reference metrics, which do not require paired images.
The reference-based metrics are PSNR, SSIM, and NIMA. The calculation methods for the first two metrics are as follows:
For the PSNR calculation in Equation (29), MAX denotes the maximum possible pixel value of the image, and MSE in Equation (30) is the Mean Squared Error, where m and n represent the image height and width in pixels, and I(i, j) and K(i, j) are the pixel values at position (i, j) in the original clear image and the processed dehazed image, respectively. For the SSIM index in Equation (31), which is computed between two local image patches x and y, μx and μy represent the mean pixel intensities (for luminance), σx and σy are the standard deviations (for contrast), and σxy is the covariance (for structural similarity). The constants C1 and C2 are small values introduced to stabilize the division and prevent a zero denominator.
NIMA (Neural Image Assessment) [39] is a deep learning-based method for image quality evaluation. It typically employs a pre-trained Convolutional Neural Network (CNN), such as Inception-v3, as a feature extractor. The extracted features are then fed into a regression or classification head to predict a quality score that correlates with human perceptual judgments.
The calculation methods in [34,40] for the no-reference evaluation metrics are as follows:
BRISQUE (Blind/Referenceless Image Spatial Quality Evaluator) is a no-reference image quality assessment method. Its core principle is that natural images have predictable statistical regularities (Natural Scene Statistics; NSS), and distortions disrupt these regularities. The key step is to compute the Mean Subtracted Contrast Normalized (MSCN) coefficients to eliminate local content variations:

Î(i, j) = (I(i, j) − μ(i, j)) / (σ(i, j) + C),

where μ(i, j) and σ(i, j) are the local mean and standard deviation calculated using a Gaussian-weighted window, and C is a small constant for stabilization. Subsequently, the algorithm extracts features from the distributions of the MSCN coefficients and their adjacent product coefficients (such as shape parameters fitted with a Generalized Gaussian Distribution; GGD), and inputs these features into a pre-trained Support Vector Regression (SVR) model to output a final quality score.
The newly visible edges metric e is obtained from the following equation:

e = (n_r − n_0) / n_0,

where n_r and n_0 are the numbers of visible edges in the restored clear image and the original hazy image, respectively, defined by adaptive thresholds; the contrast of an edge must be greater than 5%.
This metric takes into account the sensitivity of human eyes to high-contrast objects, focuses on the evaluation of local detail restoration, and can show the ability of the dehazing algorithm to recover the edges and texture details of objects.
The mean ratio of the gradients, r, is defined as

r = exp[ (1/n_r) Σ_i log r_i ],

where r_i = VL_r / VL_0, and VL_r and VL_0 denote the visibility levels of the object in the restored and original images, respectively. The visibility level is obtained from the Weber luminous contrast C. r is mathematically well defined because only the gradients of the visible edges in the restored image are considered. This metric focuses on edges but, unlike e, the mean ratio of the gradients is more sensitive to variations in optical depth; combining the two metrics therefore yields a more objective and accurate evaluation.
The mean saturated pixels metric s quantifies the proportion of pixels reaching the maximum or minimum luminance value in the processed image. It effectively evaluates the degree of dynamic-range loss induced by contrast enhancement in dehazing algorithms; higher s values indicate more severe image quality degradation. The mathematical formulation is given by

s = n_s / (m × n),

where n_s denotes the number of pixels with brightness value 0 or 255, and m × n is the size of the image.
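Simplified sketches of the e and s metrics follow; a fixed gradient threshold stands in for the adaptive thresholds of the original metric definitions, so these are structural illustrations rather than reference implementations.

```python
import numpy as np

def grad_edges(img, thresh):
    """Visible-edge map via central-difference gradients and a fixed
    threshold (the original metric uses adaptive thresholds with a
    5% contrast floor)."""
    gx = np.zeros_like(img, dtype=float)
    gy = np.zeros_like(img, dtype=float)
    gx[:, 1:-1] = img[:, 2:] - img[:, :-2]
    gy[1:-1, :] = img[2:, :] - img[:-2, :]
    return np.hypot(gx, gy) > thresh

def metric_e(restored, hazy, thresh=0.1):
    """Newly visible edges: e = (n_r - n_0) / n_0."""
    n_r = grad_edges(restored, thresh).sum()
    n_0 = grad_edges(hazy, thresh).sum()
    return (n_r - n_0) / max(n_0, 1)

def metric_s(img):
    """Mean saturated pixels: fraction of pixels at 0 or 255
    after quantizing a [0, 1] image to 8 bits."""
    u8 = np.clip(img * 255, 0, 255).round()
    return ((u8 == 0) | (u8 == 255)).mean()
```

A good dehazing result raises e (more edges become visible) while keeping s low (no clipping of the dynamic range).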
6.3. Haze Removal Comparison of the ASM, GDM, and USTM
In this experiment, we compared the haze removal quality of the ASM, GDM, and USTM under different prior estimations. A quick demonstration using DCP from
Section 4 is shown in
Figure 8.
We used five widely used classic real hazy images as samples: Bank (Figure 9), Stadium (Figure 10), and, from left to right in Figure 11, City, Snowy day, and Square. Additionally, we evaluated their newly visible edges e, ratio of the gradients r, and saturated pixels s, as shown in Table 2.
We can observe that the USTM achieves excellent performance across all scenarios, and its restoration of color, contrast, and edge details surpasses that of ASM and GDM in most cases. In terms of key metrics, the USTM shows an average improvement of approximately 23.5% in newly visible edges and an average improvement of approximately 18.1% in the mean ratio of the gradients.
Because the minimum constraint on transmission in the power term of the GDM is not equal to zero, the saturated pixels s of the GDM are lower than those of the other two models. Both the e and r scores of the Stadium example are consistent with the subjective perception that restoration using the USTM is better. In the Bank, Snowy day, and Square examples, due to the heavy fog, the transition discontinuity produces false edges at large depths in the ASM results, leading to higher e. In dense fog images, with the same transmittance estimation, the USTM suppresses the transition discontinuity very well. In the misty outdoor image of the City example, the ASM stretches the contrast in both the near and far regions and brightens the image, leading to a greater r.
From the visualization results, we can see that, whether the scene contains fog or mist, the ASM tends to over-saturate the sky with unsmooth transitions, as indicated by the red arrow in Figure 9; on the contrary, the USTM preserves the true color of the sky and suppresses super-saturation in the distant sky region better than the ASM. The zoomed area in the frame reveals more detail. The color fidelity of the GDM is also slightly better than that of the ASM, but its contrast is lower. The USTM also excels at highlighting details, as shown in green in the local amplification diagrams. Additional examples in Figure 11 show that the USTM performs better in terms of both over-saturation suppression and detail restoration.
To investigate why the three models exhibit these visual differences, we conducted a statistical analysis of pixel intensities at different brightness levels and optical depths. It should be mentioned that, unlike the ASM and USTM, the GDM does not act uniformly on bright and dark pixels, as shown in Figure 12. The ASM and GDM have a positive exponential correlation with the optical depth βz, whereas the USTM has a linear correlation with βz. This explains why the ASM over-recovers rapidly as βz increases at greater depths, while the GDM avoids this problem only for dark pixels.
Generally, regardless of subjective or objective evaluation, the performance of the USTM is satisfactory.
6.4. Comparison Between USTM and Deep Learning
In this experiment, we implemented several representative data-driven deep learning algorithms and compared their results with the USTM using various evaluation metrics. These deep learning methods can learn high-level image features and achieve excellent dehazing effects. Considering the effectiveness and time consumption of the transmittance estimation methods DCP, CAP, and NLD, we chose NLD for the USTM when comparing it with deep learning approaches. In the case of dense fog, the results are converted to HSB space, and the brightness component is compensated owing to the presence of light attenuation.
We selected seven real hazy images from the classic hazy image dataset, as shown in
Figure 13, and calculated the
e and
r metrics of the dehazing results for each method, which are two of the most representative indicators of the restoration quality. The results are presented in
Figure 14 and
Figure 15. More dehazing results of deep learning algorithms in this experiment are provided in the
Supplementary Materials.
It can be noticed that the USTM performs better than the other competing methodologies in both e and r. Since the USTM models forward scattering blur, its dehazing results ensure edge recovery. In addition, PSD is relatively effective at contrast enhancement but suffers from supersaturation, while EPDN appears able to recover more visible edges.
To better compare with the latest deep learning algorithms and make the experimental results fairer and more rigorous, we conducted supplementary experiments based on the 55 real hazy image pairs provided by the Dense-Haze dataset, using the widely recognized metrics PSNR, SSIM, NIMA, and BRISQUE as the basis for quantitative analysis. We then randomly selected hazy images from the RTTS and Haze4K datasets and added them to the test set, resulting in approximately 100 hazy samples covering both synthetic and real-world scenarios, for use in the visualization results.
The test results on the Dense-Haze dataset are shown in
Table 3. In terms of PSNR, the USTM achieves the optimal score of 11.87 dB, representing an approximately 3.85% improvement over the second-best method. The USTM ranks second in SSIM, with a gap of about 0.01 from first place and an approximately 11.76% improvement over the average level, and ranks third in the NIMA and BRISQUE scores. The USTM is thus still highly competitive among recently proposed dehazing algorithms.
The visualization results are shown in
Figure 16 and
Figure 17.
Figure 16 is from the Dense-Haze dataset, with the ground truth (GT) captured under the same exposure conditions without using a smoke machine.
It can be observed that algorithms such as RIDCP and DehazeSB remove haze unevenly and may produce artifacts like those in the DehazeSB results. This may be related to the fact that the domain mapping fitted by deep learning during training does not strictly follow physical constraints. In contrast, AOD-Net, PSD, and the USTM, which incorporate more physical constraints, do not generate artifacts.
Notably, HazeFlow shows blue color distortion in the dehazing results and suffers from the excessive loss of dark details in some scenes. In comparison, the USTM demonstrates accurate depth estimation even in different haze conditions, and maintains an excellent overall readability of the image content. This approach effectively preserves scene details while avoiding common issues such as over-saturation and color distortion.
The comprehensive evaluation based on these metrics further validates the effectiveness of the USTM in image dehazing. Benefiting from its robust physical framework, the USTM can effectively address the challenges posed by various haze conditions. This enables the USTM to achieve accurate detail restoration and recover image information.
6.5. Efficiency Comparison
Deep learning algorithms have higher hardware requirements, and the wide range of dehazing applications has made the demand for efficient algorithms increasingly important. In this experiment, we select representative algorithms with performance requirements ranging from low to high to conduct an efficiency comparison with the USTM.
The experiment is conducted on the Dense-Haze dataset, and the evaluation metrics include the number of parameters, Floating Point Operations (FLOPs), and algorithm speed. The results are shown in
Table 4.
Notably, the USTM is not a deep learning algorithm, so the metric of the number of parameters is not applicable. In addition, since the USTM is not implemented based on deep learning frameworks (e.g., PyTorch and TensorFlow), it is difficult to obtain an accurate calculation of FLOPs at the code level. According to our estimation, its FLOPs are approximately 0.28 G.
The USTM performs excellently in terms of its lightweight design and computational speed. Even when running on the CPU, it outperforms most compared algorithms.
6.6. Application Experiments
As an upstream task, dehazing does not need to be evaluated solely based on human visual perception; its compatibility with downstream tasks is equally important. In this experiment, we test two widely used versions of the YOLO (You Only Look Once) model, namely YOLOv5 and YOLOv11.
First, we train the model on the UA-DETRAC dataset for 100 epochs. Then, we sample 15 pairs of road-containing images from the HazyDet dataset and perform dehazing on these images using the USTM, resulting in three subsets: Hazy, Clear, and Dehazed. Hazy and Clear are obtained from the HazyDet dataset, while Dehazed denotes the dehazed images generated by the USTM with Hazy as input.
Notably, due to the scarcity of public detection datasets with paired hazy images, we re-annotated the sampled images from HazyDet. Considering the domain gap caused by differences in shooting angles and focal lengths between the two datasets, the performance evaluation values for clear images from HazyDet are lower than those on the UA-DETRAC validation set. Therefore, when analyzing the experimental results, we take the detection accuracy on clear images as the baseline and focus on the relative improvement.
We evaluate object detection performance using two core metrics: mAP50 and mAP50-95, both based on Intersection over Union (IoU)—the standard metric to quantify the alignment between predicted and ground-truth bounding boxes. mAP50 reflects the basic detection accuracy at 0.5 IoU, while mAP50-95 averages the mAP over IoU thresholds from 0.5 to 0.95 to rigorously measure localization precision. Our results, in terms of mAP50 and mAP50-95, are presented in
Table 5.
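For reference, the IoU underlying both mAP metrics can be computed as follows for axis-aligned boxes given in (x1, y1, x2, y2) form:

```python
def iou(box_a, box_b):
    """Intersection over Union for two axis-aligned boxes (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)     # overlap area (0 if disjoint)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)          # intersection / union
```

A prediction counts as correct at a given threshold when its IoU with a ground-truth box exceeds that threshold; mAP50-95 repeats this check at thresholds from 0.5 to 0.95 in steps of 0.05 and averages the results.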
To intuitively demonstrate the improvement of the USTM on the detection task, we selected a subset of experimental results for visualization, as shown in
Figure 18.
From the experimental results, it can be seen that there exists a performance gap between the two YOLO versions. The bounding boxes of YOLOv5 are less affected by haze, but still suffer from classification errors, while YOLOv11 is more vulnerable to haze interference.
It is clear that the detection results combined with USTM dehazing are much closer to those obtained directly using clean images. For YOLOv11, mAP50 is improved by 112.70% and mAP50-95 by 102.96%. For YOLOv5, mAP50 is improved by 6.65% and mAP50-95 by 12.57%.
These experiments demonstrate that the USTM has great potential for practical deployment in downstream vision tasks. We will further improve its applicability and generalization in future work.
7. Discussion
The traditional Atmospheric Scattering Model describes the main cause of image degradation, namely back scattering, and has proven effective for restoring mist images. However, whether the air light and transmission are estimated from priors or from deep image features learned by neural networks, it remains powerless against dense fog images and regions at infinite distance, such as the sky. Subsequent fusion-based deep learning dehazing models perform well in reducing halos and enhancing contrast, but introduce depth confusion because they do not consider the physical degradation mechanism. Moreover, deep learning approaches need large quantities of data to train model parameters, as well as strong computer hardware to support the training process.
Consequently, returning to a physical model is a promising way to solve these problems. The forward scattering neglected in previous scattering models is an important cause of edge blur, which is imperceptible in mist but prominent when the depth ratio is large. The USTM presented in this paper unifies both forward and back scattering in the process of scattering degradation.
USTM is well-suited for edge devices due to its lightweight inference design. Its core dehazing computation consists of simple pixel-wise operations combined with efficient prior estimation methods, enabling the USTM to achieve an excellent balance between dehazing performance and inference efficiency on edge platforms and demonstrating strong deployment potential for real-time dehazing tasks at low resolutions.
To fully unlock its deployment potential under the stringent memory and computational constraints of edge hardware, two adaptations can be made to the USTM. First, choose an appropriate image downsampling ratio: smaller images occupy less memory and require less computation for dehazing, and studying the specific application scenario of the edge device before deployment to determine the downsampling ratio can make effective use of device resources. Second, adopt more lightweight parameter estimation methods: the most direct approach is to modify the convolution kernel size of existing methods and use low-precision estimation suitable for the application scenario. Additionally, if the device is equipped with a sensor that can directly acquire scene depth, it can be used to replace the transmittance estimation step.
The Laplacian operator of the USTM delivers outstanding performance in edge-detail preservation. However, if the image resolution is extremely low and parameter estimation deviates beyond the valid range, it may slightly amplify the cosine noise introduced by image formatting. Our future work will focus on addressing this issue.