Electronics
  • Article
  • Open Access

9 June 2024

SIDGAN: Efficient Multi-Module Architecture for Single Image Defocus Deblurring

School of Computer Science and Engineering, Sichuan University of Science & Engineering, Zigong 643000, China
* Author to whom correspondence should be addressed.
This article belongs to the Special Issue Artificial Intelligence in Image Processing and Computer Vision

Abstract

In recent years, with the rapid development of deep learning and graphics processing units, learning-based defocus deblurring has achieved favorable results. However, current methods are not effective at processing blurred images with a large depth of field. The greater the depth of field, the blurrier the image; that is, the image contains large blurry regions and suffers from severe blur. The fundamental reason for the unsatisfactory results is that it is difficult to extract effective features from blurred images with large blurry regions. For this reason, a new FFEM (Fuzzy Feature Extraction Module) is proposed to enhance the encoder’s ability to extract features from images with large blurry regions. Using the FFEM during encoding improves the PSNR (Peak Signal-to-Noise Ratio) by 1.33% on the DPDD (Dual-Pixel Defocus Deblurring) dataset. Moreover, images with large blurry regions often cause current algorithms to generate artifacts in their results. Therefore, a new module named ARM (Artifact Removal Module) is proposed in this work and employed during decoding; it improves the PSNR by 2.49% on the DPDD. Using the FFEM and the ARM simultaneously improves the PSNR by 3.29% on the DPDD compared to the latest algorithms. Following previous research in this field, qualitative and quantitative experiments are conducted on the DPDD and the RealDOF (Real Depth of Field) datasets, and the results indicate that our method surpasses state-of-the-art algorithms in three objective metrics.

1. Introduction

Defocus blur of an image occurs when an object is out of the Depth of Field (DOF) of an imaging system. The aperture shape and lens design of the camera determine the blur shape, and the blur size varies depending on the depth of a scene point and the intrinsic parameters of the camera [1]. Defocused images impact not only human visual perception but also the performance of various vision tasks such as object detection, target recognition, image segmentation, and so forth. Despite extensive research on single image defocus deblurring, it remains a challenging problem because defocus blur not only spatially varies in size, but its shape also varies across the image. Therefore, it is essential to restore an all-in-focus image from its defocused one.
The conventional approach to defocus deblurring is to model defocus blur as a combination of convolution results obtained by applying various kernels to the sharp image [2,3,4,5,6,7,8,9,10,11,12,13,14,15,16]. According to this blur model, these methods estimate per-pixel blur kernels and then recover sharp images by performing non-blind deconvolution on the blurry image. However, because of the restrictive blur model, these methods often fail to recover sharp images from defocused images: they assume that the blurred images contain only specific blur kernels, such as Gaussian kernels or discs. In contrast, blurred images in the real world are more complex than images generated by algorithms or captured in the laboratory.
In recent years, with the rapid developments in hardware and deep learning, defocus deblurring has gradually evolved from traditional approaches to deep learning-based techniques. The first end-to-end learning-based algorithm for defocus deblurring, DPDNet (Dual-Pixel Deblurring Network) [17], was proposed by Abuolaim and Brown, who also created a new dataset called Dual-Pixel Defocus Deblurring (DPDD) [17]. By virtue of end-to-end learning, the sharp images generated by DPDNet surpass those of conventional single-image defocus deblurring methods in both quantitative and qualitative evaluation. Since then, learning-based methods for image defocus deblurring have been widely adopted [17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46].
Although previous research has achieved good image quality in single image defocus deblurring, current methods are not effective at processing blurred images with a large depth of field. The greater the depth of field, the blurrier the image; that is, the image contains large blurry regions and suffers from severe blur. The blurrier the image, the more features are lost. The fundamental reason for the unsatisfactory results is that it is difficult to extract effective features from blurred images that contain large blurry regions. For this reason, a new Fuzzy Feature Extraction Module (FFEM) is proposed to enhance the encoder’s ability to extract features from images with large blurry regions. Using the FFEM during encoding improves the PSNR (Peak Signal-to-Noise Ratio) by 1.33% on the DPDD.
Moreover, images with large blurry regions often cause current algorithms to generate artifacts in their results. Therefore, a new module named the Artifact Removal Module (ARM) is proposed in this work and employed during decoding; it improves the PSNR by 2.49% on the DPDD. As illustrated in Figure 1, when images contain large blurry regions and suffer from severe blur, our method is capable of not only restoring structural contents and textural details but also preserving the spatial smoothness of homogeneous regions. In summary, the contributions of this paper are as follows:
Figure 1. Visualization of defocused images and sharp images restored by our SIDGAN (Single Image Deblurring Generative Adversarial Network). The defocused image and the ground truth are from the DPDD dataset.
  • Firstly, we present the Artifact Removal Module (ARM) that contributes to the removal of artifacts during decoding in the field of single image defocus deblurring. The design of the ARM addresses the artifact problem in defocus deblurring, which provides a new solution for improving image quality.
  • Secondly, we propose the Fuzzy Feature Extraction Module (FFEM) that is conducive to enhancing the encoder’s ability to extract features from the defocused image with large blurry regions. The design of the FFEM focuses on the effectiveness and robustness of feature extraction, which provides an effective means for improvement in image quality.
  • Finally, a novel algorithm for single image defocus deblurring is proposed and carefully designed. Qualitative and quantitative experiments demonstrate that the proposed method surpasses the state-of-the-art methods in image quality.

3. Proposed Method

In this section, the overall architecture of SIDGAN is delineated first. Next, the proposed modules, the Fuzzy Feature Extraction Module (FFEM) and the Artifact Removal Module (ARM), are introduced. Then, the detailed structure of the discriminator is presented. Finally, the loss functions utilized by our SIDGAN are described in detail.

3.1. The Details of SIDGAN

As illustrated in Figure 2, SIDGAN is composed of two main components: the generator and the discriminator. The generator is responsible for producing sharp images from defocused images, whereas the discriminator is in charge of determining whether an image is real or fake. The detailed design of the generator is illustrated in Figure 3. Inspired by [66,67,68], multi-scale versions of the original defocused image are fed into a single encoder during downsampling, which helps preserve more detailed features. The SCM (Shallow Convolutional Module) and the FAM (Feature Attention Module) proposed by Cho et al. [66] extract features of the original image while preserving as much of its detail as possible. Finally, the gradient penalty of WGAN-GP is also employed to train the SIDGAN proposed in this work.
Figure 2. The overall architecture of SIDGAN. The defocused images are cropped and passed to the generator, then the sharp image is recovered. The discriminator calculates the distance between the fake samples generated by the generator and the real samples.
Figure 3. The elaborately designed architecture of SIDGAN. The diagram is divided into two parts: the upper part above the dashed line illustrates the detailed structure of the generator, and the lower part describes the details of each module.
As illustrated in Figure 3, the original defocused image is fed into the generator through four branches. The first adds the original image to the decoding results; this global skip connection contributes to preserving features. In the second, the blurry image is passed through 2 FAMs, 3 convolutions, and 3 FFEMs. In the third, the original blurred image at half resolution is first fed into the SCM and then passed into the FAM. In the final branch, the original at quarter resolution is fed into the SCM and then passed into the FAM again. This completes the encoder of SIDGAN. During decoding, 3 ARM modules, 2 deconvolutions, and 1 convolution are employed.
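For illustration, the following is a minimal PyTorch sketch of how the multi-scale inputs to the encoder branches could be produced; the function name and the bilinear downsampling choice are assumptions made for this sketch rather than details confirmed by the paper.

```python
import torch
import torch.nn.functional as F

def build_input_pyramid(blurred, scales=(1.0, 0.5, 0.25)):
    """Create full-, half-, and quarter-resolution copies of the defocused
    image for the encoder branches (full resolution goes through the
    FAM/convolution/FFEM path; the smaller copies go through SCM -> FAM)."""
    return [
        blurred if s == 1.0
        else F.interpolate(blurred, scale_factor=s, mode="bilinear", align_corners=False)
        for s in scales
    ]

# Example: a 256 x 256 cropped defocused patch, as used during training.
x = torch.randn(1, 3, 256, 256)
full, half, quarter = build_input_pyramid(x)
```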
As shown in the light green box in Figure 3, we design the FFEM to address the difficulty of extracting features from defocused images with large blurry regions. The core of the FFEM consists of short skip connections, a distant skip connection, and a deep residual network [69]. In the early stage of the FFEM, three convolutions with kernel sizes of 7, 3, and 3 are used, each followed by batch normalization and ReLU [70] activation. After these three convolutions, two residual blocks follow, for the purpose of effectively extracting features from images with large blurry regions. After that, two transposed convolution layers, a regularization layer, and a ReLU activation layer are used. Then, after a convolution and a Tanh, a distant skip connection directly fuses the output of the FAM with the output of the Tanh, so that features can be extracted effectively while the details of the original image are retained. Finally, the FFEM is inserted into the encoder of SIDGAN, which enhances the feature extraction capability of the encoder and lays the foundation for restoring high-quality images from defocused images.
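To make the preceding description concrete, below is a minimal PyTorch sketch of an FFEM-style block with the stated structure (three convolutions with kernels 7, 3, and 3 plus batch normalization and ReLU, two residual blocks, two transposed convolutions, and a distant skip connection fusing the FAM output with the Tanh output). Channel widths, strides, and padding are not specified in the paper and are assumptions of this sketch, not the authors' exact implementation.

```python
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    """Plain residual block (conv-BN-ReLU-conv-BN with a short skip)."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))
    def forward(self, x):
        return x + self.body(x)

class FFEM(nn.Module):
    """Sketch of a Fuzzy Feature Extraction Module: three convolutions
    (kernel sizes 7, 3, 3) with BN + ReLU, two residual blocks, two
    transposed convolutions with normalization + ReLU, a final conv + Tanh,
    and a distant skip connection that adds the FAM output back in."""
    def __init__(self, ch=64):
        super().__init__()
        self.head = nn.Sequential(
            nn.Conv2d(ch, ch, 7, padding=3), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True),
            nn.Conv2d(ch, ch, 3, stride=2, padding=1), nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
        self.res = nn.Sequential(ResBlock(ch), ResBlock(ch))
        self.up = nn.Sequential(
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1),
            nn.ConvTranspose2d(ch, ch, 4, stride=2, padding=1),
            nn.BatchNorm2d(ch), nn.ReLU(inplace=True))
        self.tail = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.Tanh())

    def forward(self, fam_out):
        x = self.head(fam_out)
        x = self.res(x)
        x = self.up(x)
        # distant skip connection: fuse the FAM output with the Tanh output
        return fam_out + self.tail(x)
```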
To address the common problem of artifacts in defocus deblurring when the image contains large blurry regions and suffers from severe blur, an ARM is put forward in this paper. The core of the ARM consists of distant skip connections, PixelShuffle, and a deep residual network. The specific design of the ARM is shown in the green box in Figure 3. At the beginning of the ARM, a convolution with kernel size 9 × 9 and PReLU [71] activation are adopted. Six residual blocks are then employed to restore details from the blurred image and to eliminate artifacts. After the residual computation, further convolution and normalization are performed. In order to eliminate artifacts in the decoding stage while effectively preserving the original image features, a distant skip connection is also adopted after the first convolution, fusing the features from the preceding normalization with the results of the first convolution. The main function of PixelShuffle is to obtain high-resolution feature maps from low-resolution feature maps by convolution and reorganization among multiple channels, and it has achieved good results in the field of image generation [72,73]. Therefore, three convolutions and two PixelShuffles are utilized at the end of the ARM to further eliminate artifacts during decoding. Three ARMs are inserted into the decoding process of SIDGAN, which enables our method to perceive the entire photo from coarse to fine and from local to overall.
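Similarly, a minimal PyTorch sketch of an ARM-style block is given below, following the described layout (a 9 × 9 convolution with PReLU, six residual blocks, a convolution plus normalization fused with the first convolution's output by a distant skip connection, and three convolutions with two PixelShuffle stages at the end). The channel width and the upscaling factor of each PixelShuffle stage are assumptions; the sketch is illustrative rather than the authors' implementation.

```python
import torch
import torch.nn as nn

class ResBlockPReLU(nn.Module):
    """Residual block with BatchNorm and PReLU activation."""
    def __init__(self, ch):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch), nn.PReLU(),
            nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))
    def forward(self, x):
        return x + self.body(x)

class ARM(nn.Module):
    """Sketch of an Artifact Removal Module: 9x9 conv + PReLU, six residual
    blocks, conv + normalization fused with the first conv's output through
    a distant skip connection, then convolutions and PixelShuffle stages."""
    def __init__(self, ch=64, upscale=2):
        super().__init__()
        self.head = nn.Sequential(nn.Conv2d(ch, ch, 9, padding=4), nn.PReLU())
        self.res = nn.Sequential(*[ResBlockPReLU(ch) for _ in range(6)])
        self.mid = nn.Sequential(nn.Conv2d(ch, ch, 3, padding=1), nn.BatchNorm2d(ch))
        # sub-pixel upsampling: each conv expands channels by upscale^2,
        # then PixelShuffle reorganizes them into higher spatial resolution
        self.up = nn.Sequential(
            nn.Conv2d(ch, ch * upscale ** 2, 3, padding=1), nn.PixelShuffle(upscale), nn.PReLU(),
            nn.Conv2d(ch, ch * upscale ** 2, 3, padding=1), nn.PixelShuffle(upscale), nn.PReLU(),
            nn.Conv2d(ch, ch, 3, padding=1))

    def forward(self, x):
        head = self.head(x)
        # distant skip connection: fuse normalized residual features
        # with the output of the first convolution
        fused = head + self.mid(self.res(head))
        return self.up(fused)
```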
The discriminator of our method is inspired by PatchGAN, and its detailed structure is shown in Figure 4. In total, the discriminator contains 5 convolutions, 4 activation functions, and 3 BatchNorms. The strides of the first 3 convolutions are set to 2 and those of the remaining convolutions are set to 1; all kernel sizes are set to 4 and the negative slopes of the activations are set to 0.2. After the first convolution, a PReLU is attached as the activation function, followed by the second convolution with BatchNorm and PReLU. After the second PReLU, a convolution, a BatchNorm, and a PReLU are adopted; behind the third activation function, another convolution, BatchNorm, and PReLU follow. Finally, a convolution is attached.
Figure 4. The detailed structure of the discriminator.
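A minimal PyTorch sketch of such a PatchGAN-style critic is shown below; the channel widths and padding are assumptions, while the 4 × 4 kernels, strides, and PReLU negative slopes initialized to 0.2 follow the description above.

```python
import torch
import torch.nn as nn

class Discriminator(nn.Module):
    """PatchGAN-style discriminator sketch: five 4x4 convolutions (the first
    three with stride 2), three BatchNorm layers, and four PReLU activations
    initialized with a negative slope of 0.2."""
    def __init__(self, in_ch=3, base=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, base, 4, stride=2, padding=1), nn.PReLU(init=0.2),
            nn.Conv2d(base, base * 2, 4, stride=2, padding=1), nn.BatchNorm2d(base * 2), nn.PReLU(init=0.2),
            nn.Conv2d(base * 2, base * 4, 4, stride=2, padding=1), nn.BatchNorm2d(base * 4), nn.PReLU(init=0.2),
            nn.Conv2d(base * 4, base * 8, 4, stride=1, padding=1), nn.BatchNorm2d(base * 8), nn.PReLU(init=0.2),
            nn.Conv2d(base * 8, 1, 4, stride=1, padding=1))  # per-patch critic scores

    def forward(self, x):
        return self.net(x)
```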

3.2. Loss Function

Similar to most image processing tasks, L1 loss helps improve image quality in defocus deblurring and, to some extent, reduces the artifacts in the images generated by defocus deblurring methods. During the training of SIDGAN, the results indicate that L1 loss indeed has a certain effect in removing artifacts. The essence of L1 loss is to compute the pixel-wise absolute difference between the real image and the generated image and then average it over all pixels. It can be described by the following formula:
$$L_{MAE} = \frac{1}{N} \sum_{i=1}^{N} \left| G_{\theta_G}(I_i^B) - I_i^S \right|$$

where $I_i^B$ represents the defocused image, $I_i^S$ denotes the sharp image, $G_{\theta_G}$ stands for the generator, and $N$ represents the total number of pixels in the image. Compared to ordinary L1 loss, perceptual loss [74] can effectively enhance the details of the resulting image. It can be defined as follows:
$$L_p = \frac{1}{W_{i,j} H_{i,j}} \sum_{x=1}^{W_{i,j}} \sum_{y=1}^{H_{i,j}} \left( \phi_{i,j}(I^S)_{x,y} - \phi_{i,j}\big(G_{\theta_G}(I^B)\big)_{x,y} \right)^2$$

In the above expression, $\phi_{i,j}$ denotes the feature map obtained from the pre-trained VGG19 (Visual Geometry Group 19) model through the $j$-th convolution (after activation) before the $i$-th max-pooling layer, whereas $W_{i,j}$ and $H_{i,j}$ represent the size of the feature map. The Euclidean distance between the reconstructed image $G_{\theta_G}(I^B)$ and the reference image $I^S$ is then calculated.
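As an illustrative sketch (not the authors' code), the perceptual loss can be computed with a frozen torchvision VGG19 feature extractor; the specific truncation point used here (features up to relu5_4) and the use of MSELoss are assumptions.

```python
import torch
import torch.nn as nn
from torchvision.models import vgg19, VGG19_Weights

class PerceptualLoss(nn.Module):
    """Perceptual loss: MSE between VGG19 feature maps of the restored image
    and the reference (sharp) image, both given as (N, 3, H, W) tensors."""
    def __init__(self, layer_index=36):
        super().__init__()
        # frozen ImageNet-pretrained VGG19 truncated at the chosen layer
        features = vgg19(weights=VGG19_Weights.IMAGENET1K_V1).features[:layer_index]
        for p in features.parameters():
            p.requires_grad = False
        self.features = features.eval()
        self.mse = nn.MSELoss()

    def forward(self, restored, sharp):
        return self.mse(self.features(restored), self.features(sharp))
```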
In the second section of this paper, we have discussed the origin of WGAN-GP and its ability to train models stably and reliably, which has been confirmed during the training of SIDGAN. The adversarial loss of SIDGAN can be further represented as:
$$L_{adv} = \sum_{n=1}^{N} -D_{\theta_D}\big(G_{\theta_G}(I^B)\big)$$

where $I^B$ represents the original defocused image, and $G_{\theta_G}$ and $D_{\theta_D}$ represent the generator network and the discriminator network, respectively.
In summary, our definitive loss function can be expressed as follows:
$$L_{total} = L_{adv} + \lambda L_{MAE} + \mu L_p$$

in the above expression, $\lambda = 0.01$ and $\mu = 100$.
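A short sketch of how these three terms could be combined for the generator update is given below; `perceptual_loss` is assumed to be a module such as the one sketched earlier, and the negative mean critic score follows the usual WGAN-GP generator objective.

```python
import torch

def generator_loss(d_fake_scores, restored, sharp, perceptual_loss,
                   lam=0.01, mu=100.0):
    """Total generator loss: adversarial + lambda * L1 (MAE) + mu * perceptual,
    with lambda = 0.01 and mu = 100 as reported in the paper."""
    l_adv = -d_fake_scores.mean()                    # WGAN-GP generator term
    l_mae = torch.mean(torch.abs(restored - sharp))  # pixel-wise L1 loss
    l_p = perceptual_loss(restored, sharp)           # VGG feature distance
    return l_adv + lam * l_mae + mu * l_p
```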

4. Experiments

4.1. Experimental Details

SIDGAN is implemented in PyTorch [75] and trained on one NVIDIA GeForce GTX 1080Ti graphics card. The Adam [76] optimizer is adopted to optimize the network parameters, with β1 = 0.5 and β2 = 0.999. Both the generator and the discriminator start with a learning rate of 0.0001, which remains unchanged for the first 150 epochs and is then linearly decayed to zero over the next 150 epochs. During training, adversarial loss, L1 loss, and perceptual loss are combined as the final loss. In addition, the batch size is set to 2 and training images are cropped to 256 × 256. Training is completed after 300 epochs. The training set of the DPDD dataset consists of 350 pairs of defocused and sharp images, its validation set includes 74 pairs, and its test set contains 76 pairs; this dataset is widely used in the field of image defocus deblurring. Unlike many available deblurring methods, our SIDGAN is trained on single images rather than dual-pixel pairs. After completing model training on the DPDD dataset, qualitative and quantitative evaluation experiments are conducted on the DPDD and RealDOF.
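For reference, a minimal sketch of the optimizer and learning-rate schedule under the settings just described (Adam with β1 = 0.5, β2 = 0.999, a constant learning rate of 1e-4 for 150 epochs, then linear decay to zero over the next 150) is shown below; the helper function name is hypothetical.

```python
import torch
from torch.optim import Adam
from torch.optim.lr_scheduler import LambdaLR

def make_optimizer_and_scheduler(model, total_epochs=300, decay_start=150):
    """Adam optimizer with the reported betas, plus a schedule that keeps the
    learning rate constant for `decay_start` epochs and then decays it
    linearly to zero over the remaining epochs."""
    opt = Adam(model.parameters(), lr=1e-4, betas=(0.5, 0.999))

    def lr_lambda(epoch):
        if epoch < decay_start:
            return 1.0
        # linear decay from 1.0 down to 0.0 over the remaining epochs
        return max(0.0, 1.0 - (epoch - decay_start) / (total_epochs - decay_start))

    return opt, LambdaLR(opt, lr_lambda)
```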

4.2. Experimental Results

To verify the effectiveness of the proposed SIDGAN, a large number of experiments are conducted on the DPDD and the RealDOF, both of which are publicly available datasets. The PSNR, SSIM, and MAE are chosen to quantitatively evaluate algorithm performance, and visual results are also presented for qualitative comparison. To ensure a fair comparison with existing methods, our training data and test data are the same as those of the previous algorithms. Moreover, most of the evaluation results are cited from the original papers, whereas a small amount of data comes from previously published papers.

4.2.1. Evaluation Experiments on the DPDD Dataset

Table 2 illustrates the quantitative evaluation results on the DPDD dataset. From the table, we can see that our SIDGAN performs the best in terms of the PSNR, SSIM, and MAE on the DPDD dataset. Compared to the latest IRNeXt model published in ICML (International Conference on Machine Learning) in 2023, SIDGAN achieves more than a 0.86 dB PSNR improvement on the DPDD dataset. In comparison with the latest FocalNet (Focal Network) model published in ICCV (International Conference on Computer Vision) in 2023, the PSNR of SIDGAN is improved by 0.98 dB. Compared to the recent Restormer and DRBNet (Dynamic Residual Block Network) published in CVPR (Conference on Computer Vision and Pattern Recognition) in 2022, SIDGAN achieves more than a 1.18 dB and a 1.43 dB PSNR improvement on the DPDD, respectively. In contrast with BAMBNet (Blur-Aware Multi-Branch Network), our method achieves a PSNR improvement of more than 0.76 dB on the DPDD. In comparison with MDPNet (Multi-Task Dual-Pixel Network), the PSNR of our SIDGAN is improved by 1.81 dB. As reported in the fifth row of the second column of Table 2, our method exceeds the PSNR of IFAN by 1.17 dB on the DPDD. Compared to KPAC (Kernel-Sharing Parallel Atrous Convolutional), RDPD (Recurrent Dual-Pixel Deblurring), DDDNet (Dual-Pixel-Based Depth and Deblur Network), DPDNet (Dual-Pixel Deblurring Network), and DMENet (Defocus Map Estimation Network), our method also performs better in terms of PSNR. From the third and fourth columns of Table 2, we can see that our SIDGAN also outperforms the others in terms of SSIM and MAE. In short, these objective results surpass those of previous methods, which demonstrates that the carefully designed architecture of SIDGAN is trustworthy and effective in the field of defocus deblurring.
Table 2. Defocus deblurring comparison on the DPDD dataset. ‘+’ indicates the method is trained with extra data. The best algorithm is highlighted in boldface. ↑ indicates that the larger is better. ↓ means that the smaller is better.
Figure 5 shows the visual results of different defocus deblurring methods on the DPDD dataset. From the first large blurry image, it is challenging to determine whether it is a cement floor or a stone floor. In order to provide a better viewing experience, we crop the image into small blocks and then present the corresponding sharp images processed by different methods. From the second image of the first group, we can see that the image is still smooth and it contains many artifacts on the wall and the floor after processing by DPDNet. The image processed by MDPNet not only becomes very smooth but also contains a lot of noise and artifacts. From the fourth image of the first group, it is apparent that Restormer achieves good visual effects, but the texture is not clear enough and there are many artifacts on the floor. Compared to Restormer and our SIDGAN, the results of DRBNet and RDPD are still smooth and there are many artifacts on the floor near the wall. From the final image of the first group, we can see that our method not only restores rich texture from the defocused image but also contains less noise and fewer artifacts than others. From the second group of Figure 5, it can be perceived that other methods cannot restore the window screen well and deblur effectively. However, our method not only restores the window screen effectively but also makes distant houses clear and makes our result contain fewer artifacts than others. In summary, our SIDGAN obtains better visual results, whereas most other methods will fail when they encounter large blurry regions.
Figure 5. Visual comparison on the DPDD dataset among DPDNet [17], MDPNet [20], Restormer [72], DRBNet [81], RDPD [78], and our SIDGAN. The red box represents the cropping area of the image.

4.2.2. Evaluation Experiments on the RealDOF Dataset

Table 3 illustrates the generalization performance of the latest and recent deblurring methods on the RealDOF dataset. Compared to EBDB (Edge-Based Defocus Blur), our SIDGAN achieves more than a 2.72 dB PSNR improvement on the RealDOF. In contrast with JNB (Just Noticeable Blur), our method achieves more than a 2.74 dB PSNR improvement. In comparison with DPDNet, the PSNR of our SIDGAN is improved by 2.43 dB. Compared to DMENet, our method improves the PSNR by 2.69 dB. In contrast with IFAN, MPRNet (Multi-Path Residual Network), and RDPD, the PSNR of our SIDGAN is improved by 0.39 dB, 0.73 dB, and 1.88 dB, respectively. In comparison with recent deblurring methods such as MDPNet, Restormer, and DRBNet, our method also achieves more than a 1.60 dB, 0.01 dB, and 0.22 dB PSNR improvement on the RealDOF, respectively. Moreover, compared to the latest deblurring methods FocalNet and IRNeXt, our SIDGAN achieves more than a 0.08 dB and 0.3 dB PSNR improvement on the RealDOF. According to the evaluation results in Table 3, our method achieves better image quality and better generalization performance than the others on this untrained dataset.
Table 3. Defocus deblurring comparison on the RealDOF dataset. The RealDOF is extensively adopted for testing the generalization performance of defocus deblurring algorithms and is not included in the training set. ‘+’ indicates the method is trained with extra data. The best algorithm is highlighted in boldface. ↑ indicates that the larger is better. ↓ means that the smaller is better.
Figure 6 shows the visual generalization results of different defocus deblurring methods on the RealDOF dataset, which is used only for testing purposes. To provide a better viewing experience, we crop the image into small blocks and then present the corresponding sharp images processed by different methods. From the second image of the first group, it is evident that the image processed by DPDNet is still blurry and there are many artifacts on the wall. The image output by MDPNet contains relatively few details and has many artifacts on the wall and the window. At first glance, the image processed by Restormer looks good; on closer inspection of the upper left corner of the fourth image in the first group, it can be found that Restormer is unable to handle window frames well. From the sixth image, processed by DRBNet, we can see that the brick wall is still blurry and there are a lot of artifacts at the edges of the shadows. The result of RDPD is not only blurry but also noisy and contains many artifacts on the wall and the window. Taken as a whole, from the final image of the first group in Figure 6, we find that the result of SIDGAN is clearer and contains fewer artifacts than the others. From the second group of Figure 6, we can see that DRBNet and Restormer obtain good visual results. However, these methods cannot restore details effectively: for example, the regions near the text and the glove are still very blurry, and the text and the glove in their output images are barely discernible, whereas our method restores more details effectively. The main reason for the better results is that our FFEM can extract effective features from blurred images with large blurry regions and our ARM is capable of removing artifacts. In short, our SIDGAN outperforms the other algorithms in terms of visual effects when the defocused image contains large blurry regions.
Figure 6. Visual comparison on the RealDOF dataset among DPDNet [17], MDPNet [20], Restormer [72], DRBNet [81], RDPD [78], and our SIDGAN. The red box represents the cropping area of the image.

5. Ablation Studies

In this section, in order to confirm whether the FFEM can enhance the ability to extract features from the defocus image that contains large blurry regions and encounters severe blur, we conduct qualitative and quantitative experiments on the DPDD and the RealDOF datasets. Moreover, for the sake of validating the ARM’s ability to remove artifacts during decoding, we also conduct qualitative and quantitative experiments on the DPDD and RealDOF datasets.

5.1. Ablation Studies on the DPDD Dataset

As shown in Table 4, we first conduct ablation studies on the DPDD dataset. The evaluation results for the different modules are reported in Table 4. The PSNR of the baseline is only 26.418 dB. After adding the FFEM, the new algorithm improves the PSNR by 0.352 dB. Compared with the baseline, the algorithm with the ARM improves the PSNR by 0.658 dB. When the FFEM and the ARM are adopted at the same time, the PSNR reaches 27.155 dB, an improvement of 0.737 dB. From the third column of the table, we can see that the SSIM gradually improves after adding the FFEM and the ARM, with the ARM contributing the largest improvement in the SSIM metric. In summary, the final row of Table 4 shows that the algorithm performs best after adding both the FFEM and the ARM proposed in this work.
Table 4. Ablation studies for different modules of SIDGAN on the DPDD dataset. ↑ indicates that the larger is better. ↓ means that the smaller is better.
In addition, the visual results of our SIDGAN equipped with various modules on the DPDD dataset are illustrated in Figure 7. Overall, no matter which module is employed, the visual results of the defocused image are improved. However, some modules cannot eliminate artifacts effectively. For instance, there are many artifacts near the brick joint in the image processed by the baseline. The main reason is that the first image is severely blurry and the baseline is unable to effectively extract features. After adding the FFEM to the algorithm, we obtain a clearer image with less noise as shown in the final column. The improvement in visual effects is mainly due to the feature extraction ability of the FFEM. Although the FFEM can further improve image quality, there are many artifacts near the brick joints. Therefore, in order to eliminate these artifacts, a new ARM is put forward. From the fifth image in Figure 7, we can see that a lot of artifacts near the brick joints are eliminated. When the FFEM and the ARM, as proposed in this paper, are added simultaneously, the algorithm is capable of not only restoring structural contents and textural details but also preserving the spatial smoothness of the homogeneous regions.
Figure 7. Visual comparison to reflect the effectiveness of our SIDGAN equipped with the FFEM and the ARM proposed in this paper on the DPDD dataset.
In Figure 8, we also provide the error maps of each module on the DPDD. The brighter the color, the larger the error. In general, the baseline, the FFEM, and the ARM are able to recover more details from the defocused image. From the left image in Figure 8, we can see many brighter pixels across the image, and the pixels near the brick joints are especially bright. The previous phenomenon indicates a significant error between the defocused image and the high-definition image. In a word, the first image in the second column of Figure 8 demonstrates that the error is reduced in comparison with the error map of the defocused image. Overall, from the first image in the final column of Figure 8, we can discover that the brighter pixels are further decreased in comparison with the baseline after adding the FFEM. From the first image in Figure 7 and the left image in Figure 8, it is easy to find that the defocused image contains large blurry regions. The previous observations indicate that the FFEM can enhance the feature extraction ability of our method when the input is a defocused image with large blurry regions. Compared to the preceding error maps, the ARM further reduces errors between the sharp image and the reconstructed image. From the final image in Figure 8, we can discover that it has the least number of brighter pixels among the five error maps, which indicates that the FFEM and the ARM are conducive to recovering high-quality images from defocused images.
Figure 8. The error maps reflect the effectiveness of our SIDGAN equipped with the FFEM and the ARM proposed in this paper on the DPDD dataset. The brighter the color, the larger the error.

5.2. Ablation Studies on the RealDOF Dataset

To further verify the effectiveness of the FFEM and the ARM proposed in this paper, a quantitative evaluation experiment is conducted on the RealDOF dataset, which is used only for testing purposes. It can be seen from the second column of Table 5 that the PSNR of the baseline is only 23.874 dB. After adding the FFEM proposed in this paper, the PSNR reaches 24.158 dB. Compared with the baseline, our algorithm achieves better performance after using the ARM, with a PSNR of 24.713 dB. When the FFEM and the ARM are used simultaneously, our algorithm achieves the best performance, and its PSNR is improved by 1.23 dB. From the third column of Table 5, we can see that the FFEM and the ARM further improve the SSIM. As demonstrated in the final column of Table 5, the MAEs gradually decrease after adopting the FFEM or the ARM, and using both simultaneously yields the lowest MAE of all. In the final analysis, the quantitative results in the table demonstrate that the FFEM and the ARM contribute to defocus deblurring.
Table 5. Ablation studies for different modules of SIDGAN on the RealDOF dataset. ↑ indicates that the larger is better. ↓ means that the smaller is better.
As illustrated in Figure 9, the visual results of the SIDGAN equipped with different modules on the RealDOF dataset are exhibited. From the first image, it is difficult to read the writing on the carriage. After processing by our method, the text on the carriage is clear and visible. Compared to the original defocused image, the image quality of the baseline shows a slight improvement, as seen in the first image of the second column in Figure 9. From the first row of the final column, it can be seen that the image quality is further improved after adding the FFEM. By comparison with the original defocused image, we can see that the visual effect is improved again after adding the ARM. In a word, the quality of the image processed by SIDGAN equipped with the FFEM and the ARM is the best of all.
Figure 9. Visual comparison to reflect the effectiveness of our SIDGAN equipped with the FFEM and the ARM proposed in this paper on the RealDOF dataset. The characters on the carriage mean Korean Express.
In Figure 10, we also provide the error maps of each module on the RealDOF dataset. The brighter the color, the larger the error. In general, the baseline, the FFEM, and the ARM can recover more details from the defocused image. From the left image in Figure 10, we can see many brighter pixels near the characters and at the bottom of the carriage. The first image in the middle column of Figure 10 indicates that the error is barely reduced in comparison with the error map of the defocused image. The first image in the final column shows that the number of bright pixels is further decreased as compared with the error maps of the defocused image and the baseline. The second image in the middle column of Figure 10 suggests that the error is reduced again in comparison with the error maps of the baseline and the FFEM. From the final image of Figure 10, we can discover that the number of bright pixels is the smallest of all the five error maps. The previous observations demonstrate that the FFEM benefits the extraction of features from the defocused image and the ARM is conducive to reducing artifacts during defocus deblurring.
Figure 10. The error maps reflect the effectiveness of our SIDGAN equipped with the FFEM and the ARM proposed in this paper on the RealDOF dataset. The characters on the carriage mean Korean Express.

6. Limitations

Although our research has made some progress in the field of single image defocus deblurring, our method still has some limitations. Firstly, the datasets used in this paper may not fully represent all scenarios encountered in real-world applications. These datasets only cover a limited number of scenes and image types and may lack samples from other scenarios. Consequently, the model’s generalization ability and applicability may be limited by the data it was trained on. Secondly, our model may produce errors when processing specific types of images; for instance, when the input images exhibit low contrast, complex backgrounds, or strong lighting variations, the outputs of our method are not satisfactory. Therefore, although our research has made some progress, we still need to consider and address these limitations in order to further improve the performance of our model. In the future, we are committed to addressing these challenges and making further contributions to the development of single image defocus deblurring.

7. Conclusions

In this paper, we propose an efficient multi-module architecture framework for single image defocus deblurring. For the sake of enhancing our method’s ability to extract features from the defocus image that contains large blurry regions and encounters severe blur, we propose a new module named FFEM. After that, we propose another new module named ARM for addressing the common issue of artifacts in the field of defocus deblurring. In addition, we incorporate multi-scale mechanisms into the network by adjusting the image resolution to one-half and one-quarter. Extensive experimental results demonstrate that our SIDGAN proposed in this paper outperforms the state-of-the-art algorithms in three objective metrics. Compared with currently available defocus deblurring methods, our SIDGAN has stronger feature extraction and artifact removal capabilities. Although our method outperforms all the latest defocus deblurring methods in image quality, our method still has shortcomings. For example, the parameters of our method are not the smallest of all, nor does it achieve the best performance in terms of FLOPs, and the generalization of our model needs further improvement. Future research work will aim: (1) to further reduce the computational complexity of SIDGAN while retaining (or improving) image quality, (2) to further optimize SIDGAN using novel loss functions, and (3) to optimize our method to handle more diverse blur scenarios.

Author Contributions

Conceptualization, S.L.; methodology, S.L. and H.Z.; software, S.L. and H.Z.; validation, S.L. and H.Z.; formal analysis, S.L.; investigation, S.L. and H.Z.; resources, S.L.; data curation, S.L.; writing—original draft preparation, S.L. and H.Z.; writing—review and editing, S.L. and L.C.; supervision, S.L.; project administration, L.C.; funding acquisition, S.L.; data analysis, L.C. All authors have read and agreed to the published version of the manuscript.

Funding

This research is supported by the Opening Project of International Joint Research Center of Robotics and Intelligence System of Sichuan Province grant number JQZN2023-006.

Data Availability Statement

The models generated and analyzed during the current study are available from the corresponding author on reasonable request.

Acknowledgments

The data utilized in this publication were acquired from the York University (DPDD, https://ln2.sync.com/dl/c45358c50/r7kpybwk-xw8hhszh-qkj249ap-y8k2344d (accessed on 1 September 2023)), and the Pohang University of Science and Technology (RealDOF, https://www.dropbox.com/s/arox1aixvg67fw5/RealDOF.zip?dl=1 (accessed on 1 September 2023)). This manuscript represents the perspectives of the authors and may not necessarily align with the opinions or viewpoints of the individual contributing the original data to the datasets.

Conflicts of Interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as potential conflicts of interest.

Abbreviations

The following abbreviations are used in this manuscript:
ARM: Artifact Removal Module
AiF: All-in-focus
BAMBNet: Blur-Aware Multi-Branch Network
CVPR: Conference on Computer Vision and Pattern Recognition
DOF: Depth of Field
DPDD: Dual-Pixel Defocus Deblurring
DP: Dual-Pixel
DMENet: Defocus Map Estimation Network
DBD: Defocus Blur Detection
DRBNet: Dynamic Residual Block Network
DPDNet: Dual-Pixel Deblurring Network
DDDNet: Dual-Pixel-Based Depth and Deblur Network
DNN: Deep Neural Network
EBDB: Edge-Based Defocus Blur
FFEM: Fuzzy Feature Extraction Module
FAM: Feature Attention Module
FocalNet: Focal Network
FADA: Focused Area Detection Attack
GANs: Generative Adversarial Networks
GRL: Global, Regional, and Local
GKM: Gaussian Kernel Mixture
IFAN: Iterative Filter Adaptive Network
ICML: International Conference on Machine Learning
iDFD: Indoor Depth from Defocus
ICCV: International Conference on Computer Vision
JDRL: Joint Deblurring And Reblurring Learning
JS divergence: Jensen–Shannon divergence
JNB: Just Noticeable Blur
KPAC: Kernel-Sharing Parallel Atrous Convolutional
MAE: Mean Absolute Error
MRNet: Multi-Refinement Network
MDPNet: Multi-Task Dual-Pixel Network
MPRNet: Multi-Path Residual Network
MSE: Mean Square Error
PSNR: Peak Signal-to-Noise Ratio
RealDOF: Real Depth of Field
RDPD: Recurrent Dual-Pixel Deblurring
SIDD: Single Image Defocus Deblurring
SIDGAN: Single Image Deblurring Generative Adversarial Networks
SCM: Shallow Convolutional Module
SSIM: Structural Similarity Index
SDD: Single-Image Defocus Deblurring
VGG19: Visual Geometry Group 19
WGAN-GP: Wasserstein Generative Adversarial Network With Gradient Penalty

References

  1. Son, H.; Lee, J.; Cho, S.; Lee, S. Single Image Defocus Deblurring Using Kernel-Sharing Parallel Atrous Convolutions. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 2622–2630. [Google Scholar]
  2. Tai, Y.W.; Brown, M. Single image defocus map estimation using local contrast prior. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Cairo, Egypt, 22–25 September 2009; pp. 1797–1800. [Google Scholar]
  3. Zhuo, S.J.; Sim, T. Defocus map estimation from a single image. Pattern Recognit. 2021, 44, 1852–1858. [Google Scholar] [CrossRef]
  4. Karaali, A.; Jung, C.R. Edge-Based Defocus Blur Estimation with Adaptive Scale Selection. IEEE Trans. Image Process. 2018, 3, 1126–1137. [Google Scholar] [CrossRef]
  5. Cho, S.; Lee, S. Convergence Analysis of MAP Based Blur Kernel Estimation. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 4818–4826. [Google Scholar]
  6. Fish, D.; Brinicombe, A.M.; Pike, E.R.; Walker, J.G. Blind deconvolution by means of the Richardson–Lucy algorithm. JOSA A 1995, 12, 58–65. [Google Scholar] [CrossRef]
  7. Levin, A.; Fergus, R.; Durand, F.; Freeman, W. Image and depth from a conventional camera with a coded aperture. Acm Trans. Graph. (Tog) 2007, 27, 70-es. [Google Scholar] [CrossRef]
  8. Krishnan, D.; Fergus, R. Fast image deconvolution using hyper-Laplacian priors. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 7–10 December 2009; pp. 1033–1041. [Google Scholar]
  9. Bando, Y.; Nishita, T. Towards Digital Refocusing from a Single Photograph. In Proceedings of the 15th Pacific Conference on Computer Graphics and Applications (PG’07), Maui, HI, USA, 29 October–2 November 2007; pp. 363–372. [Google Scholar]
  10. Shi, J.P.; Xu, L.; Jia, J.Y. Just noticeable defocus blur detection and estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 657–665. [Google Scholar]
  11. Park, J.; Tai, Y.W.; Cho, D.; Kweon, I. A Unified Approach of Multi-scale Deep and Hand-Crafted Features for Defocus Estimation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 21–26 July 2017; pp. 2760–2769. [Google Scholar]
  12. Xu, G.D.; Quan, Y.H.; Ji, H. Estimating Defocus Blur via Rank of Local Patches. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 5381–5389. [Google Scholar]
  13. D’Andres, L.; Salvador, J.; Kochale, A.; Süsstrunk, S. Non-Parametric Blur Map Regression for Depth of Field Extension. IEEE Trans. Image Process. 2016, 25, 1660–1673. [Google Scholar] [CrossRef]
  14. Liu, Y.Q.; Du, X.; Shen, H.L.; Chen, S.J. Estimating Generalized Gaussian Blur Kernels for Out-of-Focus Image Deblurring. IEEE Trans. Circuits Syst. Video Technol. 2021, 31, 829–843. [Google Scholar] [CrossRef]
  15. Goilkar, S.; Yadav, D.M. Implementation of Blind and Non-blind Deconvolution for Restoration of Defocused Image. In Proceedings of the International Conference on Emerging Smart Computing and Informatics (ESCI), Pune, India, 19–22 August 2021; pp. 560–563. [Google Scholar]
  16. Chan, S.; Nguyen, T. Single image spatially variant out-of-focus blur removal. In Proceedings of the IEEE International Conference on Image Processing, Brussels, Belgium, 11–14 September 2011; pp. 677–680. [Google Scholar]
  17. Abuolaim, A.; Brown, M. Defocus deblurring using dual-pixel data. In Proceedings of the European Conference on Computer Vision, Online, 22–28 August 2020; pp. 111–126. [Google Scholar]
  18. Lee, J.; Lee, S.; Cho, S.; Lee, S. Deep Defocus Map Estimation Using Domain Adaptation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 12214–12222. [Google Scholar]
  19. Lee, J.; Son, H.; Rim, J.; Cho, S.; Lee, S. Iterative Filter Adaptive Network for Single Image Defocus Deblurring. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Online, 19–25 June 2021; pp. 2034–2042. [Google Scholar]
  20. Abuolaim, A.; Afifi, M.; Brown, M. Improving Single-Image Defocus Deblurring: How Dual-Pixel Images Help Through Multi-Task Learning. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 3–8 January 2022; pp. 82–90. [Google Scholar]
  21. Zhao, W.D.; Wei, F.; He, Y.; Lu, H.C. United Defocus Blur Detection and Deblurring via Adversarial Promoting Learning. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 569–586. [Google Scholar]
  22. Quan, Y.H.; Yao, X.; Ji, H. Single Image Defocus Deblurring via Implicit Neural Inverse Kernels. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–6 October 2023; pp. 12566–12576. [Google Scholar]
  23. Zhang, A.; Sun, J. Joint Depth and Defocus Estimation From a Single Image Using Physical Consistency. IEEE Trans. Image Process. 2021, 30, 3419–3433. [Google Scholar] [CrossRef]
  24. Anwar, S.; Hayder, Z.; Porikli, F.M. Deblur and deep depth from single defocus image. Mach. Vis. Appl. 2021, 32, 1–13. [Google Scholar] [CrossRef]
  25. Karaali, A.; Harte, N.; Jung, C.R. Deep Multi-Scale Feature Learning for Defocus Blur Estimation. IEEE Trans. Image Process. 2022, 31, 1097–1106. [Google Scholar] [CrossRef]
  26. Yang, Y.; Pan, L.Y.; Liu, L.; Liu, M.M. K3DN: Disparity-Aware Kernel Estimation for Dual-Pixel Defocus Deblurring. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 13263–13272. [Google Scholar]
  27. Quan, Y.H.; Wu, Z.C.; Ji, H. Neumann Network with Recursive Kernels for Single Image Defocus Deblurring. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 5754–5763. [Google Scholar]
  28. Li, Y.W.; Fan, Y.C.; Xiang, X.Y.; Demandolx, D.; Ranjan, R.; Timofte, R.; Gool, L.V. Efficient and explicit modelling of image hierarchies for image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 18278–18289. [Google Scholar]
  29. Ye, Q.; Suganuma, M.; Okatani, T. Accurate Single-Image Defocus Deblurring Based on Improved Integration with Defocus Map Estimation. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Kuala Lumpur, Malaysia, 8–11 October 2023; pp. 750–754. [Google Scholar]
  30. Zhao, W.; Hu, G.; Wei, F.; Wang, H.P.; He, Y.; Lu, H.C. Attacking Defocus Detection With Blur-Aware Transformation for Defocus Deblurring. IEEE Trans. Multimed. 2024, 26, 5450–5460. [Google Scholar] [CrossRef]
  31. Ali, K.; Jung, C.R. SVBR-Net: A Non-Blind Spatially Varying Defocus Blur Removal Network. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 16–19 October 2022; pp. 566–570. [Google Scholar]
  32. Zhang, D.; Wang, X.B. Dynamic Multi-Scale Network for Dual-Pixel Images Defocus Deblurring with Transformer. In Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), Taipei, Taiwan, 18–22 July 2022; pp. 1–6. [Google Scholar]
  33. Saqib, N.; Lorenzo, V.; Manuel, M.; Victor, M.B.; Daniela, C. 2HDED:Net for Joint Depth Estimation and Image Deblurring from a Single Out-of-Focus Image. In Proceedings of the IEEE International Conference on Image Processing (ICIP), Bordeaux, France, 16–19 October 2022; pp. 2006–2010. [Google Scholar]
  34. Nazir, S.; Qiu, Z.Y.; Coltuc, D.; Martínez-Sánchez, J.; Arias, P. iDFD: A Dataset Annotated for Depth and Defocus. In Proceedings of the Scandinavian Conference on Image Analysis, Sirkka, Finland, 18–21 April 2023; pp. 67–83. [Google Scholar]
  35. Mazilu, I.; Wang, S.; Dummer, S.; Veldhuis, R.; Brune, C.; Strisciuglio, N. Defocus Blur Synthesis and Deblurring via Interpolation and Extrapolation in Latent Space. arXiv 2023, arXiv:2307.15461. [Google Scholar]
  36. Zhao, Z.J.; Yang, H.; Liu, P.; Nie, H.; Zhang, Z.; Li, C. Defocus blur detection via adaptive cross-level feature fusion and refinement. Vis. Comput. 2024, 1432–2315. [Google Scholar] [CrossRef]
  37. Zhang, K.H.; Ren, W.; Luo, W.; Lai, W.S.; Stenger, B.; Yang, M.H.; Li, H.D. Deep Image Deblurring: A Survey. Int. J. Comput. Vis. 2022, 130, 2103–2130. [Google Scholar] [CrossRef]
  38. Chai, S.; Zhao, X.; Zhang, J.; Kan, J. Defocus blur detection based on transformer and complementary residual learning. Multimed. Tools Appl. 2023, 83, 53095–53118. [Google Scholar] [CrossRef]
  39. Galetto, F.; Deng, G. Single image defocus map estimation through patch blurriness classification and its applications. Vis. Comput. 2022, 39, 4555–4571. [Google Scholar] [CrossRef]
  40. Zhang, N.; Yan, J.C. Rethinking the Defocus Blur Detection Problem and a Real-Time Deep DBD Model. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 22–28 August 2020; pp. 617–632. [Google Scholar]
  41. Lin, X.; Suo, J.L.; Cao, X.; Dai, Q.H. Iterative Feedback Estimation of Depth and Radiance from Defocused Images. In Proceedings of the Asian Conference on Computer Vision, Singapore, 20–23 May 2012; pp. 95–109. [Google Scholar]
  42. Quan, Y.H.; Wu, Z.C.; Cao, X.; Ji, H. Gaussian Kernel Mixture Network for Single Image Defocus Deblurring. Adv. Neural Inf. Process. Syst. 2021, 34, 20812–20824. [Google Scholar]
  43. Zhang, D.F.; Wang, X.B.; Jin, Z.Z. MRNET: Multi-Refinement Network for Dual-Pixel Images Defocus Deblurring. In Proceedings of the International Conference on Acoustics, Speech and Signal Processing (ICASSP), Rhodes Island, Greece, 4–9 June 2023; pp. 1–5. [Google Scholar]
  44. Jung, S.H.; Heo, Y.S. Disparity probability volume guided defocus deblurring using dual pixel data. In Proceedings of the International Conference on Information and Communication Technology Convergence (ICTC), Jeju Island, Korea, 20–22 October 2021; pp. 305–308. [Google Scholar]
  45. Zhai, J.C.; Liu, Y.; Zeng, P.C.; Ma, C.H.; Wang, X.; Zhao, Y. Efficient Fusion of Depth Information for Defocus Deblurring. In Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Seoul, Republic of Korea, 14–19 April 2024; pp. 2640–2644. [Google Scholar]
  46. Ma, H.Y.; Liu, S.J.; Liao, Q.M.; Zhang, J.C.; Xue, J.H. Defocus Image Deblurring Network With Defocus Map Estimation as Auxiliary Task. IEEE Trans. Image Process. 2021, 31, 216–226. [Google Scholar] [CrossRef]
  47. Ruan, L.Y.; Chen, B.; Li, J.; Lam, M.L. AIFNet: All-in-Focus Image Restoration Network Using a Light Field-Based Dataset. IEEE Trans. Comput. Imaging 2021, 7, 675–688. [Google Scholar] [CrossRef]
  48. Shi, J.P.; Xu, L.; Jia, J.Y. Discriminative Blur Detection Features. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 2965–2972. [Google Scholar]
  49. Li, Y.; Ren, D.; Shu, X.; Zuo, W. Learning Single Image Defocus Deblurring with Misaligned Training Pairs. In Proceedings of the AAAI Conference on Artificial Intelligence, Washington, DC, USA, 7–14 February 2023; pp. 1495–1503. [Google Scholar]
  50. Ian, G.; Jean, P.; Mehdi, M.; Bing, X.; David, W.F.; Sherjil, O.; Aaron, C.; Yoshua, B. Generative adversarial nets. In Proceedings of the Advances in Neural Information Processing Systems, Montreal, QC, Canada, 8–13 December 2014; pp. 2672–2680. [Google Scholar]
  51. Tim, S.; Ian, G.; Wojciech, Z.; Vicki, C.; Alec, R.; Xi, C.; Xi, C. Improved Techniques for Training GANs. In Proceedings of the International Conference on Neural Information Processing Systems, Barcelona, Spain, 5–10 December 2016; pp. 2234–2242. [Google Scholar]
  52. Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein Generative Adversarial Networks. In Proceedings of the International Conference on Machine Learning, Sydney, Australia, 6–11 August 2017; pp. 214–223. [Google Scholar]
  53. Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A. Improved Training of Wasserstein GANs. In Proceedings of the International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; pp. 5769–5779. [Google Scholar]
  54. Yang, F.Z.; Yang, H.; Fu, J.L.; Lu, H.T.; Guo, B.N. Learning Texture Transformer Network for Image Super-Resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 5790–5799. [Google Scholar]
  55. Vasluianu, F.A.; Seizinger, T.; Timofte, R.; Cui, S.; Huang, J.; Tian, S.; Xia, S. NTIRE 2023 Image Shadow Removal Challenge Report. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Vancouver, BC, Canada, 18–22 June 2023; pp. 1788–1807. [Google Scholar]
  56. Xie, C.H.; Liu, S.H.; Li, C.; Cheng, M.M.; Zuo, W.M.; Liu, X.; Wen, S.L.; Ding, E. Image inpainting with learnable bidirectional attention maps. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 8857–8866. [Google Scholar]
  57. Ling, S.G.; Fu, K.; Lin, Y.; You, D.; Cheng, P. Face illumination processing via dense feature maps and multiple receptive fields. Electron. Lett. 2021, 57, 627–629. [Google Scholar] [CrossRef]
  58. Cui, Y.N.; Ren, W.Q.; Cao, X.C.; Knoll, A. Focal Network for Image Restoration. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 2–6 October 2023; pp. 12955–12965. [Google Scholar]
  59. Zhang, H.G.; Dai, Y.C.; Li, H.D.; Koniusz, P. Deep Stacked Hierarchical Multi-Patch Network for Image Deblurring. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 27 October–2 November 2019; pp. 5971–5979. [Google Scholar]
  60. Olson, M.L.; Liu, S.S.; Anirudh, R.; Thiagarajan, J.; Bremer, P.T.; Wong, W.K. Cross-GAN Auditing: Unsupervised Identification of Attribute Level Similarities and Differences Between Pretrained Generative Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 7981–7990. [Google Scholar]
  61. Solano-Carrillo, E.; Rodríguez, Á.B.; Carrillo-Perez, B.; Steiniger, Y.; Stoppe, J. Look ATME: The Discriminator Mean Entropy Needs Attention. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 787–796. [Google Scholar]
  62. Mirza, M.; Osindero, S. Conditional Generative Adversarial Nets. arXiv 2014, arXiv:1411.1784. [Google Scholar]
  63. Zhu, J.Y.; Park, T.; Isola, P.; Efros, A. Unpaired Image-to-Image Translation Using Cycle-Consistent Adversarial Networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2242–2251. [Google Scholar]
  64. Isola, P.; Zhu, J.Y.; Zhou, T.H.; Efros, A. Image-to-Image Translation with Conditional Adversarial Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 5967–5976. [Google Scholar]
  65. Li, C.; Wand, M. Precomputed Real-Time Texture Synthesis with Markovian Generative Adversarial Networks. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 702–716. [Google Scholar]
  66. Cho, S.J.; Ji, S.W.; Hong, J.P.; Jung, S.W.; Ko, S.J. Rethinking Coarse-to-Fine Approach in Single Image Deblurring. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 694–711. [Google Scholar]
  67. Rădulescu, V.M.; Maican, C.A. Algorithm for image processing using a frequency separation method. In Proceedings of the International Carpathian Control Conference (ICCC), Sinaia, Romania, 29 May–1 June 2022; pp. 181–185. [Google Scholar]
  68. Ledig, C.; Theis, L.; Huszár, F.; Caballero, J.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.H.; Shi, W.Z. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 105–114. [Google Scholar]
  69. He, K.M.; Zhang, X.; Ren, S.Q.; Sun, J. Deep Residual Learning for Image Recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Caesars Palace, NV, USA, 27–30 June 2016; pp. 770–778. [Google Scholar]
  70. Nair, V.; Hinton, G. Rectified linear units improve restricted boltzmann machines. In Proceedings of the 27th International Conference on Machine Learning (ICML-10), Haifa, Israel, 21–24 June 2010; pp. 807–814. [Google Scholar]
  71. He, K.M.; Zhang, X.; Ren, S.Q.; Sun, J. Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Santiago, Chile, 7–13 December 2015; pp. 1026–1034. [Google Scholar]
  72. Zamir, S.W.; Arora, A.; Khan, S.H.; Hayat, M.; Khan, F.S.; Yang, M.H. Restormer: Efficient transformer for high-resolution image restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 5728–5739. [Google Scholar]
  73. Wang, X.T.; Xie, L.B.; Dong, C.; Shan, Y. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, QC, Canada, 7–10 October 2021; pp. 1905–1914. [Google Scholar]
  74. Johnson, J.; Alahi, A.; Li, F.F. Perceptual losses for real-time style transfer and super-resolution. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 694–771. [Google Scholar]
  75. Available online: https://pytorch.org/ (accessed on 1 July 2018).
  76. Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2017, arXiv:1412.6980v9. [Google Scholar]
  77. Pan, L.Y.; Chowdhury, S.; Hartley, R.; Liu, M.M.; Zhang, H.G.; Li, H.D. Dual Pixel Exploration: Simultaneous Depth Estimation and Image Restoration. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 10–25 June 2021; pp. 4338–4347. [Google Scholar]
  78. Abuolaim, A.; Delbracio, M.; Kelly, D.; Brown, M.; Milanfar, P. Learning to reduce defocus blur by realistically modeling dual-pixel data. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 10–17 October 2021; pp. 2289–2298. [Google Scholar]
  79. Mehri, A.; Ardakani, P.B.; Sappa, A.D. MPRNet: Multi-Path Residual Network for Lightweight Image Super Resolution. In Proceedings of the IEEE Winter Conference on Applications of Computer Vision (WACV), Piscataway, NJ, USA, 3–8 January 2021; pp. 2703–2712. [Google Scholar]
  80. Liang, P.W.; Jiang, J.; Liu, X.; Ma, J. BaMBNet: A Blur-Aware Multi-Branch Network for Dual-Pixel Defocus Deblurring. IEEE/CAA J. Autom. Sin. 2022, 9, 878–892. [Google Scholar] [CrossRef]
  81. Ruan, L.Y.; Chen, B.; Li, J.Z.; Lam, M. Learning to Deblur using Light Field Generated and Real Defocus Images. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 16283–16292. [Google Scholar]
  82. Cui, Y.N.; Ren, W.Q.; Yang, S.N.; Cao, X.C.; Knoll, A. IRNeXt: Rethinking Convolutional Network Design for Image Restoration. In Proceedings of the International Conference on Machine Learning, Honolulu, HI, USA, 15–17 December 2023; pp. 6545–6564. [Google Scholar]
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
