1. Introduction
Remote sensing images are increasingly utilized in numerous applications with the advances of remote sensing technology, such as agriculture and weather studies [1], land cover monitoring [2,3,4], and so on. However, remote sensing images are frequently degraded by various atmospheric conditions such as cloud, fog, and haze, which leads to low image quality and thus inefficient downstream analysis for many applications. Therefore, remote sensing image haze removal is a crucial and indispensable pre-processing task.
For the image dehazing problem, earlier works utilized multiple images of the same scene [5,6,7,8]. Despite some success, these methods are not practical in real applications, since acquiring several images of the same scene under different conditions is rather difficult. Subsequently, numerous single-image dehazing methods have been developed. Some of the earlier methods make use of image enhancement techniques, including histogram-based and contrast-based methods. In [9], Xu et al. presented a solution based on contrast-limited adaptive histogram equalization to remove haze from single color images. Narasimhan et al. [10] proposed a physics-based model to describe the appearance of a scene under uniform bad weather conditions and utilized a fast algorithm to recover the scene contrast. However, these enhancement methods do not take the causes of the image degradation into account, which commonly leads to over-enhancement, under-enhancement, and color shift problems.
With the physics-grounded atmospheric scattering model (ASM) developed in [11], many methods have followed this physical model and attempted to recover clear scenes. To tackle the ill-posed nature of the single-image haze removal problem, different priors and assumptions have been made. He et al. [12] presented the empirical, statistics-based dark channel prior (DCP), which states that in haze-free non-sky image patches, at least one color channel has some pixels with very low intensity. With the DCP, the transmission matrix can be estimated from the original hazy image, and a clear image is thus restored. However, this method produces halos and color distortion in sky regions, since the dark channel of sky regions has bright values, leading to underestimation of the transmission and thus unsatisfactory dehazing results. In addition to the DCP, many other prior-based methods have been developed. Meng et al. [13] proposed a boundary constraint and contextual regularization (BCCR) based dehazing method to obtain a sharper restoration. In [14], Zhu et al. developed a color attenuation prior to recover the depth information of the original hazy image through a linear model and estimate the transmission map. Based on the prior that the colors of a haze-free image form tight clusters in RGB space, which are stretched into lines in a hazy image, Berman et al. [15] developed a non-local single-image haze removal solution to recover the distance map and a clear image.
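As a concrete illustration of the DCP pipeline described above, the following is a minimal NumPy sketch of the dark channel computation and the resulting transmission estimate. The patch size, the weighting factor `omega`, and the array conventions are illustrative assumptions, not settings taken from any of the cited works.

```python
import numpy as np

def dark_channel(image, patch=15):
    """Per-pixel minimum over RGB channels, followed by a local minimum filter."""
    # image: H x W x 3 float array with values in [0, 1]
    min_rgb = image.min(axis=2)
    pad = patch // 2
    padded = np.pad(min_rgb, pad, mode='edge')
    h, w = min_rgb.shape
    dark = np.empty_like(min_rgb)
    for i in range(h):
        for j in range(w):
            dark[i, j] = padded[i:i + patch, j:j + patch].min()
    return dark

def estimate_transmission(image, airlight, omega=0.95, patch=15):
    """DCP transmission estimate: t = 1 - omega * dark_channel(I / A)."""
    normalized = image / airlight  # airlight: length-3 array (one value per channel)
    return 1.0 - omega * dark_channel(normalized, patch)
```

For a uniform gray image the dark channel equals the gray value itself, so with `airlight = 1` the estimated transmission is simply `1 - omega * gray`; real images, of course, produce spatially varying maps.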
In consideration of the success of convolutional neural networks (CNNs) in computer vision tasks, various haze removal techniques leverage CNNs to learn a transmission map directly from data. Cai et al. [16] developed a CNN-based system to learn the mapping from the original hazy image to the medium transmission matrix from training data, and leveraged empirical methods to estimate the global atmospheric light. The clear image was subsequently recovered with the ASM. In [17], a multi-scale deep CNN was presented by Ren et al. for single-image haze removal, which contains a coarse-scale network to predict an initial transmission matrix and a fine-scale network to refine the result locally. Since these data-driven methods leverage CNNs only for transmission estimation, they cannot perform haze removal directly end-to-end. To handle this issue, Li et al. [18] developed a lightweight network by reformulating the traditional ASM and minimizing the reconstruction error between the recovered image and the corresponding clear image. More recently, Zhang et al. [19] developed a densely connected pyramid dehazing network (DCPDN) to estimate the transmission matrix, the atmospheric light, and the dehazed result at the same time by embedding the traditional ASM into the proposed network.
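For reference, the ASM that these methods build on is commonly written as follows; the notation is the standard one from the dehazing literature rather than a reproduction of any particular cited paper's equations:

```latex
I(x) = J(x)\,t(x) + A\bigl(1 - t(x)\bigr), \qquad t(x) = e^{-\beta d(x)},
```

where $I$ is the observed hazy image, $J$ the scene radiance (clear image), $A$ the global atmospheric light, $t$ the medium transmission, $\beta$ the scattering coefficient, and $d$ the depth. Once $t$ and $A$ are estimated, the clear image is recovered by inverting the model, $J(x) = \bigl(I(x) - A\bigr)/\max\bigl(t(x), t_0\bigr) + A$, with a small constant $t_0$ to avoid division by near-zero transmission.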
Although much success has been achieved, these prevailing dehazing methods easily fail on remote sensing images, since hazy remote sensing images differ from natural hazy images in many aspects. For instance, natural images often contain sky regions, which can be used to estimate the atmospheric light, while remote sensing images contain no sky regions. At the same time, for natural close-range images the haze distribution changes with the depth of field, and estimating the depth of field is thus the focus of dehazing. For remote sensing images, however, the depth of field can be regarded as constant, since the distance between the sensor and the scene is always very large. As a result, the haze intensity distribution is mostly affected by the atmospheric conditions and is thus rather changeable and irregular. From this perspective, haze removal for a single remote sensing image is much more complicated, since the haze distribution follows no regular pattern. To handle single remote sensing image dehazing, Fu et al. [20] presented an enhancement solution that combines regularized histogram equalization with the discrete cosine transform. Makarau et al. [21] constructed a haze thickness map (HTM) through a local search for dark objects, and subtracted the HTM to restore the haze-free image. Some haze removal methods focus on the visible bands, since haze tends to pollute the visible bands more. Long et al. [22] leveraged the DCP and a low-pass Gaussian filter to estimate the transmission matrix and subsequently removed the haze from a single remote sensing image. Shen et al. [23] utilized the classic homomorphic filter to remove thin cloud (also regarded as haze) and restore the ground information in the frequency domain. The haze optimized transformation (HOT) was presented in [24] for the dehazing of Landsat scenes. Jiang et al. [25] further developed HOT to make it more robust and suitable for visible-band remote sensing images. Based on the HTM, Liu et al. [26] presented a ground-radiance-suppressed HTM to obtain a more accurate haze distribution estimation, and thus removed the haze component in every band. Xie et al. [27] modified the DCP for remote sensing images and developed a novel dark channel saturation prior. Despite being physically grounded, these methods are mostly sensitive to a non-uniform haze distribution, which is, however, the most common state of haze in remote sensing images.
To handle these issues, we propose a prior-based dense attentive dehazing network (DADN) for single remote sensing image dehazing. Firstly, taking the non-uniform haze distribution of hazy remote sensing images into account, we extract a haze density map (HDM) from the original hazy image, which can be regarded as a haze density prior, and subsequently use the HDM together with the original hazy image as input to the network. The proposed network adopts an encoder-decoder structure and directly learns the mapping from the original input images to the corresponding haze-free images, without any intermediate parameter estimation steps, enabling the network to measure the distortion of the clear image directly rather than that of intermediate parameters. Considering the advantages of dense networks, dense blocks are carefully constructed to effectively mine the haze-relevant information. Meanwhile, both spatial and channel attention blocks are leveraged to recalibrate the extracted feature maps, allowing for more adaptive and efficient training.
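To make the recalibration idea concrete, here is a minimal NumPy sketch of a squeeze-and-excitation-style channel attention step. The layer sizes and weight names are hypothetical, and the actual attention block design in the proposed network may differ.

```python
import numpy as np

def channel_attention(features, w1, w2):
    """Recalibrate C feature maps by learned per-channel weights (SE-style sketch).

    features: C x H x W array; w1: (C//r, C) and w2: (C, C//r) learned matrices,
    with r a hypothetical reduction ratio.
    """
    squeeze = features.mean(axis=(1, 2))          # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeeze, 0.0)        # FC + ReLU
    scale = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))  # FC + sigmoid -> per-channel weight in (0, 1)
    return features * scale[:, None, None]        # reweight each feature map
```

Channels whose global statistics the learned weights deem haze-relevant are amplified, while the rest are suppressed; a spatial attention block applies the same idea per pixel location instead of per channel.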
Our main contributions are listed as follows:
- (1)
A single remote sensing image dehazing solution, which combines a physical prior with deep learning technology, is presented to better describe the haze distribution in remote sensing images and thus deal with non-uniform haze removal. In this solution, we first extract an HDM from the original hazy image, and subsequently use the HDM prior together with the original hazy image as input to the network.
- (2)
An encoder-decoder structured dehazing framework is proposed to directly learn clear images from the input images, without the estimation of any intermediate parameters. The proposed network is built from dense blocks and attention blocks for accurate clear image estimation. Furthermore, we leverage a discriminator at the end of the network to fine-tune the output and ensure that the estimated dehazed result is indistinguishable from the corresponding clear image.
- (3)
A large-scale hazy remote sensing dataset is created as a benchmark, which contains uniform and non-uniform, high-resolution and low-resolution, and synthetic and real hazy remote sensing images. Experimental results on the proposed dataset demonstrate the outstanding performance of the proposed method.
The remainder of the paper is organized as follows. Section 2 describes the degradation procedure caused by haze, as well as the details of the proposed dense attentive dehazing network (DADN). The experimental settings, results, and analysis are presented in Section 3, and a further discussion is presented in Section 4. Finally, our conclusions are given in Section 5.
4. Discussion
In this study, we proposed a dense attentive dehazing network (DADN), which combines a physical prior and deep learning technology to directly learn the mapping between the original input images and the corresponding haze-free images. Designed specifically for single remote sensing image dehazing, the method first extracts an HDM from the original hazy image, which can be regarded as a haze density prior, and then combines the HDM with the original hazy image as input to the network for a better description of the non-uniform haze distribution in hazy remote sensing images. Meanwhile, both spatial and channel attention blocks are carefully constructed in the network to recalibrate the extracted feature maps, allowing for more adaptive and efficient training. To ensure that the estimated dehazed result is indistinguishable from the corresponding clear image, we further utilize a discriminator at the end of the network to refine the output.
To further validate the effectiveness of each module of the network, we conducted experiments on a network without the HDM (DADN_noHDM), a network without the discriminator (DADN_noDISCRI), and a network without the attention blocks (DADN_noRCSAB). The results are presented in Figure 13 and Table 6. DADN_noHDM and DADN_noRCSAB fail to detect the non-uniform haze, and obvious vestiges of haze remain, especially in the last two images, indicating that models without the HDM prior and the RCSAB lack the ability to mine high-level haze-relevant features and thus fail to remove all the non-uniform haze. Meanwhile, for the PSNR and SSIM criteria in Table 6, the proposed DADN method considerably outperforms DADN_noHDM and DADN_noRCSAB, which demonstrates that the haze density prior (HDM) and the attention module (RCSAB) are important and effective for detecting and removing the non-uniform haze in remote sensing images. In terms of visual quality, DADN_noDISCRI and DADN are the most competitive methods, with vivid colors, clear structures, and most of the non-uniform haze removed, while in the quantitative results, DADN outperforms DADN_noDISCRI, improving the PSNR by 0.5. The quantitative results on the large-scale test data further validate the effectiveness of the proposed discriminator. Furthermore, a comparison of the average processing time per image was conducted. As can be seen, our modules yield an obvious improvement in dehazing performance at the cost of an increase of less than 0.06 s per image, which is acceptable.
Overall, all the modules, i.e., the HDM prior, RCSAB, and the discriminator, are effective and necessary for single remote sensing image haze removal.