Figure 1.
Comparison of the workflows for multi-image DOF-E (left) and single-image DOF-E (right).
Figure 2.
Architecture of SDENet. The network adopts a multi-scale hierarchical design comprising (a) efficient MsS Transformer blocks and (b) a Feature Enhancement Module.
Figure 3.
The process of dynamic-sliding multi-head self-attention to calculate local attention on the block feature map.
Figure 4.
The Feature Enhancement Module employs a three-level pooling structure to enhance the features input to the CAM and SAM and then concatenates the enhanced features.
Figure 5.
The detailed implementation steps of SDENet.
Figure 6.
Examples of images selected for the MSED dataset.
Figure 7.
Visual comparisons with MIMO-UNet, Uformer, MPRNet, Blur2blur, and SDENet (ours) on MSED-TEST(B).
Figure 8.
Visual comparisons with MTRNN, MIMO-UNet, Uformer, MPRNet, DRBNet, Blur2blur, and SDENet (ours) on the DPDD and RealDOF test sets.
Figure 9.
Visual comparisons with MIMO-UNet, Uformer, MPRNet, DRBNet, and SDENet (ours) on depth information recovery.
Figure 10.
Example results demonstrating the fusion effects of various methods on unregistered input images, with SDENet performing DOF-E on a single image. The set includes 7 images; multi-focus image fusion fuses ‘a’–‘g’, whereas SDENet processes a single image, with ‘g’ yielding the best result.
Figure 11.
Example results demonstrating the fusion effects of various methods on unregistered input images, with SDENet performing DOF-E on a single image. The set includes 13 images; multi-focus image fusion fuses ‘a’–‘m’, whereas SDENet processes a single image, with ‘h’ yielding the best result.
Figure 12.
Visual comparison between multi-image fusion (two images, ‘a’ and ‘b’) and SDENet, where SDENet performs DOF-E on the single image ‘b’.
Figure 13.
Visual comparison between multi-image fusion and SDENet, where SDENet performs DOF-E on a single image. The set includes 8 images; multi-focus image fusion fuses ‘a’–‘h’, whereas SDENet processes a single image, with ‘e’ yielding the best result.
Figure 14.
Visual comparison between multi-image fusion and SDENet, where SDENet performs DOF-E on a single image. The set includes 8 images; multi-focus image fusion fuses ‘a’–‘h’, whereas SDENet processes a single image, with ‘d’ yielding the best result.
Table 1.
The advantages and disadvantages of previous related works.
| Category | Typical Methods | Advantages | Disadvantages |
|---|---|---|---|
| Hardware-based DOF-E | Coded apertures, focus-tunable lenses, active focus sweeping | High optical quality; directly captures sharp images | High cost; increased system complexity and bulkiness; computationally intensive |
| Multi-Focus Image Fusion (MFIF) | CNNs, GANs, Vision Transformers, Wavelet Transform | Produces exceptionally sharp images; utilizes multiple focal cues | Requires multiple captures (time-consuming); sensitive to misalignment and exposure variations |
| Single-Image Defocus Deblurring | Blur map estimation, Non-blind deconvolution, End-to-end networks | Only requires a single input; no hardware modifications | Struggles with complex spatially varying blur; susceptible to residual blur or artifacts |
| Recent Generative & Attention Trends | GRL, SKA-based Nets, Latent Diffusion Models | Superior texture synthesis; dynamic handling of complex blur kernels | Extremely high computational cost; potential for non-physical artifacts |
| SDENet (Ours) | MsS Transformer + FEM | Single input; avoids explicit blur map estimation; balances efficiency and quality | Requires large-scale specialized datasets (e.g., MSED) for training |
Table 2.
The sources of the MSED dataset.
| Training set (paired) | DPDD | DLDP | LFDOF | DEDD | Total Quantity |
|---|---|---|---|---|---|
| Selection Quantity | 350 | 300 | 385 | 737 | 1772 |

| TEST(A) | DPDD | DLDP | DEDD | Lytro | Total Quantity |
|---|---|---|---|---|---|
| Selection Quantity | 76 | 200 | 63 | 91 | 387 |

| TEST(B) | Xiaomi | Apple | Vivo | Honor | Total Quantity |
|---|---|---|---|---|---|
| Selection Quantity | 26 | 29 | 24 | 21 | 100 |
Table 3.
Quantitative comparisons of MIMO-UNet, Uformer, MPRNet, DRBNet, Blur2blur, and SDENet (ours) on MSED-TEST(A) and MSED-TEST(B).
| Method | MSED-TEST(A) SD↑ | MSED-TEST(A) Entropy↑ | MSED-TEST(B) SD↑ | MSED-TEST(B) Entropy↑ |
|---|---|---|---|---|
| MIMO-UNet [42] | 60.0311 | 7.3044 | 59.0302 | 7.0881 |
| MPRNet [44] | 60.0024 | 7.2846 | 60.3419 | 7.0500 |
| DRBNet [45] | 59.0151 | 7.3254 | 59.5713 | 7.0602 |
| Uformer [43] | 60.3174 | 7.3423 | 63.3031 | 7.0851 |
| Blur2blur [27] | 60.4036 | 7.3115 | 64.0659 | 7.0843 |
| SDENet (ours) | 60.5944 | 7.3256 | 65.4211 | 7.1235 |
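For context on the no-reference scores above: SD and Entropy are standard sharpness proxies. Below is a minimal sketch of their common definitions (grayscale standard deviation, and Shannon entropy of the 8-bit intensity histogram); the exact formulation used in the paper is assumed, not confirmed:

```python
import numpy as np

def sd_and_entropy(gray: np.ndarray):
    """No-reference sharpness proxies for an 8-bit grayscale image (H, W)."""
    g = gray.astype(np.float64)
    sd = g.std()  # standard deviation: spread of intensities (contrast/detail)
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))
    p = hist / hist.sum()
    p = p[p > 0]  # drop empty bins to avoid log(0)
    entropy = -np.sum(p * np.log2(p))  # Shannon entropy in bits
    return sd, entropy
```

Higher values of both indicate richer intensity variation, which sharper images tend to exhibit; neither requires a ground-truth reference.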
Table 4.
Quantitative comparisons of MTRNN, MIMO-UNet, Uformer, MPRNet, DRBNet, Blur2blur, and SDENet (ours) on the DPDD and RealDOF test sets.
| Method | DPDD PSNR↑ | DPDD SSIM↑ | DPDD LPIPS↓ | RealDOF PSNR↑ | RealDOF SSIM↑ | RealDOF LPIPS↓ |
|---|---|---|---|---|---|---|
| MTRNN [32] | 24.7811 | 0.7588 | 0.2815 | 22.6408 | 0.6482 | 0.4093 |
| MPRNet [44] | 24.7882 | 0.7673 | 0.2869 | 22.6356 | 0.6574 | 0.4197 |
| MIMO-UNet [42] | 26.3792 | 0.8251 | 0.2048 | 25.0924 | 0.7849 | 0.2515 |
| Uformer [43] | 26.8938 | 0.8444 | 0.1638 | 25.5951 | 0.8117 | 0.2168 |
| DRBNet [45] | 26.7756 | 0.8274 | 0.1411 | 24.6081 | 0.7284 | 0.2251 |
| Blur2blur [27] | 26.7992 | 0.8325 | 0.1406 | 25.3172 | 0.8254 | 0.1905 |
| SDENet (ours) | 26.9816 | 0.8461 | 0.1375 | 26.1756 | 0.7977 | 0.1788 |
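PSNR above is the standard full-reference fidelity metric (SSIM and LPIPS require their respective reference implementations). A minimal sketch, assuming 8-bit images with peak value 255:

```python
import numpy as np

def psnr(reference: np.ndarray, restored: np.ndarray, peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB between a reference and a restored image."""
    mse = np.mean((reference.astype(np.float64) - restored.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(peak ** 2 / mse))
```

Higher PSNR means the restored image is numerically closer to the ground truth; it does not always track perceptual quality, which is why LPIPS (lower is better) is reported alongside it.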
Table 5.
Quantitative comparisons of MTRNN, MIMO-UNet, Uformer, MPRNet, DRBNet, Blur2blur, and SDENet (ours) on the DLDP test set.
| Method | SD↑ | Entropy↑ |
|---|---|---|
| MTRNN [32] | 61.3427 | 7.2905 |
| MIMO-UNet [42] | 61.3427 | 7.2983 |
| Uformer [43] | 61.7846 | 7.3184 |
| MPRNet [44] | 62.2771 | 7.2919 |
| DRBNet [45] | 60.6655 | 7.2893 |
| Blur2blur [27] | 62.3564 | 7.2988 |
| SDENet (ours) | 62.2367 | 7.3240 |
Table 6.
Ablation experiments conducted on the DPDD test set.
| FEM | MsS Transformer | PSNR↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|---|
| - | - | 26.1636 | 0.8209 | 0.2093 |
| √ | - | 26.2849 | 0.8248 | 0.1920 |
| - | √ | 26.7472 | 0.8324 | 0.1829 |
| √ | √ | 26.9816 | 0.8461 | 0.1375 |
Table 7.
Ablation experiments conducted on the MSED-TEST(A).
| FEM | MsS Transformer | SD↑ | Entropy↑ |
|---|---|---|---|
| - | - | 60.2863 | 7.3044 |
| √ | - | 60.3807 | 7.3111 |
| - | √ | 60.2712 | 7.3256 |
| √ | √ | 60.5944 | 7.3287 |
Table 8.
Ablation study on the deepest feature resolution evaluated on the DPDD test set.
| Deepest Resolution | PSNR↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|
| H/8 × W/8 × 8C | 26.7132 | 0.8380 | 0.1520 |
| H/16 × W/16 × 16C | 26.9816 | 0.8461 | 0.1375 |
| H/32 × W/32 × 32C | 26.8901 | 0.8420 | 0.1440 |
Table 9.
Ablation study on module ordering evaluated on the DPDD test set.
| Module Order | PSNR↑ | SSIM↑ | LPIPS↓ |
|---|---|---|---|
| MsS → FEM | 26.7824 | 0.8430 | 0.1587 |
| FEM → MsS (Ours) | 26.9816 | 0.8461 | 0.1375 |