DPO-ESRGAN: Perceptually Enhanced Super-Resolution Using Direct Preference Optimization
Abstract
1. Introduction
2. Related Works
2.1. Image Super-Resolution
2.2. Direct Preference Optimization
3. Proposed Method
3.1. Preliminaries
3.2. Introducing DPO into the SR Process
3.3. Determining the Reference Model
3.4. Training ESRGAN Using the SR-DPO Loss
4. Experimental Results and Discussion
4.1. Setup
4.2. Effectiveness of Using the SR-DPO Loss
4.3. Performance Comparison Based on How to Design the SR-DPO Loss
4.4. Influence of the Reference Model
4.5. Performance Comparison of the PieAPP and LPIPS Models for Preference Calculation
4.6. Influence of the Batch Size for Preference Calculation
4.7. Performance Comparison with Other ESRGAN Improvement Models
5. Conclusions and Future Works
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
SR | Super-Resolution |
CNN | Convolutional Neural Network |
GAN | Generative Adversarial Network |
DPO | Direct Preference Optimization |
LR | Low Resolution |
HR | High Resolution |
DSPO | Direct Semantic Preference Optimization |
RLHF | Reinforcement Learning from Human Feedback |
LM | Large-scale Model |
KL | Kullback–Leibler |
References
1. Lepcha, D.C.; Goyal, B.; Dogra, A.; Goyal, V. Image super-resolution: A comprehensive review, recent trends, challenges and applications. Inf. Fusion 2023, 91, 230–260.
2. Ye, S.; Zhao, S.; Hu, Y.; Xie, C. Single-Image Super-Resolution Challenges: A Brief Review. Electronics 2023, 12, 2975.
3. Wang, X.; Yu, K.; Wu, S.; Gu, J.; Liu, Y.; Dong, C.; Qiao, Y.; Loy, C.C. ESRGAN: Enhanced Super-Resolution Generative Adversarial Networks. In Lecture Notes in Computer Science, Proceedings of the ECCV 2018 Workshops, Munich, Germany, 8–14 September 2018; Springer: Cham, Switzerland, 2018; pp. 63–79.
4. Rakotonirina, N.C.; Rasoanaivo, A. ESRGAN+: Further Improving Enhanced Super-Resolution Generative Adversarial Network. In Proceedings of the 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, Spain, 4–8 May 2020; pp. 3637–3641.
5. Choi, Y.; Park, H. Improving ESRGAN with an additional image quality loss. Multimed. Tools Appl. 2023, 82, 3123–3137.
6. Chen, Q.; Li, H.; Lu, G. Training ESRGAN with multi-scale attention U-Net discriminator. Sci. Rep. 2024, 14, 29036.
7. Rafailov, R.; Sharma, A.; Mitchell, E.; Ermon, S.; Manning, C.D.; Finn, C. Direct preference optimization: Your language model is secretly a reward model. In Proceedings of the 37th International Conference on Neural Information Processing Systems, New Orleans, LA, USA, 10–16 December 2023; Curran Associates Inc.: Red Hook, NY, USA, 2023.
8. Johnson, J.; Alahi, A.; Fei-Fei, L. Perceptual Losses for Real-Time Style Transfer and Super-Resolution. In Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 11–14 October 2016; pp. 694–711.
9. Ledig, C.; Theis, L.; Huszar, F.; Caballero, J.; Cunningham, A.; Acosta, A.; Aitken, A.; Tejani, A.; Totz, J.; Wang, Z.; et al. Photo-Realistic Single Image Super-Resolution Using a Generative Adversarial Network. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 105–114.
10. Song, J.; Yi, H.; Xu, W.; Li, X.; Li, B.; Liu, Y. ESRGAN-DP: Enhanced super-resolution generative adversarial network with adaptive dual perceptual loss. Heliyon 2023, 9, e15134.
11. Wei, Z.; Huang, Y.; Chen, Y.; Zheng, C.; Gao, J. A-ESRGAN: Training Real-World Blind Super-Resolution with Attention U-Net Discriminators. In Lecture Notes in Computer Science, Proceedings of the 20th Pacific Rim International Conference on Artificial Intelligence, Jakarta, Indonesia, 15–19 November 2023; Springer: Singapore, 2023; pp. 16–27.
12. Zhang, K.; Liang, J.; Van Gool, L.; Timofte, R. Designing a Practical Degradation Model for Deep Blind Image Super-Resolution. In Proceedings of the IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, QC, Canada, 10–17 October 2021; pp. 4771–4780.
13. Wang, X.; Xie, L.; Dong, C.; Shan, Y. Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, QC, Canada, 10–17 October 2021; pp. 1905–1914.
14. Liang, J.; Cao, J.; Sun, G.; Zhang, K.; Van Gool, L.; Timofte, R. SwinIR: Image Restoration Using Swin Transformer. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, QC, Canada, 10–17 October 2021; pp. 1833–1844.
15. Lu, Z.; Li, J.; Liu, H.; Huang, C.; Zhang, L.; Zeng, T. Transformer for Single Image Super-Resolution. In Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), New Orleans, LA, USA, 19–20 June 2022; pp. 456–465.
16. Moser, B.B.; Shanbhag, A.S.; Raue, F.; Frolov, S.; Palacio, S.; Dengel, A. Diffusion Models, Image Super-Resolution, and Everything: A Survey. IEEE Trans. Neural Netw. Learn. Syst. 2024, 36, 11793–11813.
17. Xiao, T.; Yuan, Y.; Zhu, H.; Li, M.; Honavar, V.G. Cal-DPO: Calibrated Direct Preference Optimization for Language Model Alignment. In Proceedings of the 37th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 10–15 December 2024; Curran Associates, Inc.: Red Hook, NY, USA, 2024; Volume 37, pp. 114289–114320.
18. Zhou, Z.; Liu, J.; Shao, J.; Yue, X.; Yang, C.; Ouyang, W.; Qiao, Y. Beyond One-Preference-Fits-All Alignment: Multi-Objective Direct Preference Optimization. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, Bangkok, Thailand, 11–16 August 2024; Association for Computational Linguistics: Bangkok, Thailand, 2024; pp. 10586–10613.
19. Ahn, D.; Choi, Y.; Kim, S.; Yu, Y.; Kang, D.; Choi, J. ISR-DPO: Aligning Large Multimodal Models for Videos by Iterative Self-Retrospective DPO. In Proceedings of the 39th Annual AAAI Conference on Artificial Intelligence, Philadelphia, PA, USA, 20–27 February 2025.
20. Zeng, Y.; Liu, G.; Ma, W.; Yang, N.; Zhang, H.; Wang, J. Token-level direct preference optimization. In Proceedings of the 41st International Conference on Machine Learning, Vienna, Austria, 21–27 July 2024.
21. Park, R.; Rafailov, R.; Ermon, S.; Finn, C. Disentangling Length from Quality in Direct Preference Optimization. In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics, Bangkok, Thailand, 11–16 August 2024; Association for Computational Linguistics: Bangkok, Thailand, 2024; pp. 4998–5017.
22. Wallace, B.; Dang, M.; Rafailov, R.; Zhou, L.; Lou, A.; Purushwalkam, S.; Ermon, S.; Xiong, C.; Joty, S.; Naik, N. Diffusion Model Alignment Using Direct Preference Optimization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 8228–8238.
23. Lee, K.; Kwak, S.; Sohn, K.; Shin, J. Direct Consistency Optimization for Robust Customization of Text-to-Image Diffusion models. In Proceedings of the 37th International Conference on Neural Information Processing Systems, Vancouver, BC, Canada, 10–15 December 2024; Curran Associates, Inc.: Red Hook, NY, USA, 2024; Volume 37, pp. 103269–103304.
24. Croitoru, F.A.; Hondru, V.; Ionescu, R.T.; Sebe, N.; Shah, M. Curriculum Direct Preference Optimization for Diffusion and Consistency Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 10–17 June 2025.
25. Cai, M.; Li, S.; Li, W.; Huang, X.; Chen, H.; Hu, J.; Wang, Y. DSPO: Direct Semantic Preference Optimization for Real-World Image Super-Resolution. arXiv 2025, arXiv:2504.15176.
26. Christiano, P.F.; Leike, J.; Brown, T.B.; Martic, M.; Legg, S.; Amodei, D. Deep reinforcement learning from human preferences. In Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 4302–4310.
27. Schulman, J.; Wolski, F.; Dhariwal, P.; Radford, A.; Klimov, O. Proximal Policy Optimization Algorithms. arXiv 2017, arXiv:1707.06347.
28. Bradley, R.A.; Terry, M.E. Rank Analysis of Incomplete Block Designs: I. The Method of Paired Comparisons. Biometrika 1952, 39, 324–345.
29. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. In Proceedings of the 3rd International Conference on Learning Representations, San Diego, CA, USA, 7–9 May 2015; Bengio, Y., LeCun, Y., Eds.; pp. 1–14.
30. Prashnani, E.; Cai, H.; Mostofi, Y.; Sen, P. PieAPP: Perceptual Image-Error Assessment Through Pairwise Preference. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 1808–1817.
31. Agustsson, E.; Timofte, R. NTIRE 2017 Challenge on Single Image Super-Resolution: Dataset and Study. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, Honolulu, HI, USA, 21–26 July 2017; pp. 1122–1131.
32. Huang, J.B.; Singh, A.; Ahuja, N. Single image super-resolution from transformed self-exemplars. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 5197–5206.
33. Martin, D.; Fowlkes, C.; Tal, D.; Malik, J. A database of human segmented natural images and its application to evaluating segmentation algorithms and measuring ecological statistics. In Proceedings of the Eighth IEEE International Conference on Computer Vision, Vancouver, BC, Canada, 7–14 July 2001; pp. 416–423.
34. Bevilacqua, M.; Roumy, A.; Guillemot, C.; Alberi Morel, M.-L. Low-Complexity Single-Image Super-Resolution based on Nonnegative Neighbor Embedding. In Proceedings of the British Machine Vision Conference, Surrey, UK, 3–7 September 2012; pp. 135.1–135.10.
35. Zeyde, R.; Elad, M.; Protter, M. On Single Image Scale-Up Using Sparse-Representations. In Proceedings of the International Conference on Computing Sciences (ICCS), Phagwara, India, 14–15 September 2012; pp. 711–730.
36. Horé, A.; Ziou, D. Image Quality Metrics: PSNR vs. SSIM. In Proceedings of the 20th International Conference on Pattern Recognition, Istanbul, Turkey, 23–26 August 2010; pp. 2366–2369.
37. Li, L.; Song, S.; Lv, M.; Jia, Z.; Ma, H. Multi-Focus Image Fusion Based on Fractal Dimension and Parameter Adaptive Unit-Linking Dual-Channel PCNN in Curvelet Transform Domain. Fractal Fract. 2025, 9, 157.
38. Lv, M.; Song, S.; Jia, Z.; Li, L.; Ma, H. Multi-Focus Image Fusion Based on Dual-Channel Rybak Neural Network and Consistency Verification in NSCT Domain. Fractal Fract. 2025, 9, 432.
39. Wang, Z.; Bovik, A.; Sheikh, H.; Simoncelli, E. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612.
40. Cao, Z.H.; Liang, Y.J.; Deng, L.J.; Vivone, G. An Efficient Image Fusion Network Exploiting Unifying Language and Mask Guidance. IEEE Trans. Pattern Anal. Mach. Intell. 2025.
41. Zhang, R.; Isola, P.; Efros, A.; Shechtman, E.; Wang, O. The Unreasonable Effectiveness of Deep Features as a Perceptual Metric. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 586–595.
42. Mittal, A.; Soundararajan, R.; Bovik, A.C. Making a “Completely Blind” Image Quality Analyzer. IEEE Signal Process. Lett. 2013, 20, 209–212.
43. Vo, K.D.; Bui, L.T. StarSRGAN: Improving Real-World Blind Super-Resolution. In Proceedings of the International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision, Pilsen, Czech Republic, 15–19 May 2023; pp. 62–72.
44. Li, B.; Li, X.; Zhu, H.; Jin, Y.; Feng, R.; Zhang, Z.; Chen, Z. SeD: Semantic-Aware Discriminator for Image Super-Resolution. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 25784–25795.
45. Schönfeld, E.; Schiele, B.; Khoreva, A. A U-Net Based Discriminator for Generative Adversarial Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8204–8213.

Dataset | Method | PSNR ↑ | SSIM ↑ | LPIPS ↓ | PieAPP ↓ | NIQE ↓ |
---|---|---|---|---|---|---|
DIV2K_valid | [3] | 25.6795 | 0.7278 | 0.1237 | 0.5507 | 4.6551 |
DIV2K_valid | [5] | 25.4889 | 0.7275 | 0.1414 | 0.2749 | 4.3927 |
DIV2K_valid | Our DPO method | 25.9753 | 0.7409 | 0.1244 | 0.2626 | 4.2588 |
Urban100 | [3] | 22.0044 | 0.6752 | 0.1587 | 0.8355 | 4.5552 |
Urban100 | [5] | 21.6073 | 0.6635 | 0.1831 | 0.5079 | 4.4453 |
Urban100 | Our DPO method | 22.1408 | 0.6860 | 0.1598 | 0.5072 | 4.4582 |
BSD100 | [3] | 23.4732 | 0.6037 | 0.1653 | 0.6956 | 5.4894 |
BSD100 | [5] | 22.8331 | 0.5874 | 0.1901 | 0.4011 | 5.0825 |
BSD100 | Our DPO method | 23.5257 | 0.5998 | 0.1696 | 0.3798 | 4.6984 |
Set5 | [3] | 26.6157 | 0.8027 | 0.0755 | 0.4486 | 5.4038 |
Set5 | [5] | 26.0691 | 0.7881 | 0.0917 | 0.3511 | 4.9664 |
Set5 | Our DPO method | 26.8175 | 0.7996 | 0.0748 | 0.3564 | 5.3249 |
Set14 | [3] | 23.8251 | 0.7659 | 0.1446 | 0.9212 | 4.9711 |
Set14 | [5] | 22.8668 | 0.7521 | 0.1626 | 0.5072 | 4.5064 |
Set14 | Our DPO method | 23.9951 | 0.7653 | 0.1432 | 0.5985 | 4.6497 |

Dataset | SR-DPO loss | PSNR ↑ | SSIM ↑ | LPIPS ↓ | PieAPP ↓ | NIQE ↓ |
---|---|---|---|---|---|---|
DIV2K_valid | Equation (4) | 24.0414 | 0.7183 | 0.1616 | 1.0361 | 4.3884 |
DIV2K_valid | Equation (5) | 20.7606 | 0.7073 | 0.2426 | 2.0506 | 5.5587 |
DIV2K_valid | Equation (6) | 20.3597 | 0.6494 | 0.2698 | 2.0456 | 4.9085 |
DIV2K_valid | Equation (7) | 25.9753 | 0.7409 | 0.1244 | 0.2626 | 4.2588 |
Urban100 | Equation (4) | 20.8525 | 0.6496 | 0.1966 | 1.1522 | 4.2652 |
Urban100 | Equation (5) | 18.4527 | 0.6154 | 0.2761 | 1.8359 | 5.1611 |
Urban100 | Equation (6) | 18.3813 | 0.5535 | 0.3031 | 1.8517 | 4.7215 |
Urban100 | Equation (7) | 22.1408 | 0.6860 | 0.1598 | 0.5072 | 4.4582 |
BSD100 | Equation (4) | 22.7156 | 0.5885 | 0.2064 | 1.1245 | 5.4479 |
BSD100 | Equation (5) | 20.6213 | 0.5903 | 0.2952 | 1.7535 | 6.9459 |
BSD100 | Equation (6) | 20.1085 | 0.5366 | 0.3418 | 1.8276 | 6.0400 |
BSD100 | Equation (7) | 23.5257 | 0.5998 | 0.1696 | 0.3798 | 4.6984 |
Set5 | Equation (4) | 24.2179 | 0.7456 | 0.1198 | 0.5111 | 4.9955 |
Set5 | Equation (5) | 21.4562 | 0.7253 | 0.1712 | 0.9298 | 6.8279 |
Set5 | Equation (6) | 19.7723 | 0.6271 | 0.2044 | 1.9173 | 6.5020 |
Set5 | Equation (7) | 26.8175 | 0.7996 | 0.0748 | 0.3564 | 5.3249 |
Set14 | Equation (4) | 22.8015 | 0.7515 | 0.1814 | 1.3275 | 5.0441 |
Set14 | Equation (5) | 20.2181 | 0.7316 | 0.2638 | 2.6234 | 6.0784 |
Set14 | Equation (6) | 18.8591 | 0.6548 | 0.3062 | 1.5138 | 5.6434 |
Set14 | Equation (7) | 23.9951 | 0.7653 | 0.1432 | 0.5985 | 4.6497 |

Dataset | Reference Model | PSNR ↑ | SSIM ↑ | LPIPS ↓ | PieAPP ↓ |
---|---|---|---|---|---|
DIV2K_valid | ESRGAN trained with | 15.2341 | 0.5422 | 0.3866 | 0.4054 |
DIV2K_valid | ESRGAN trained with | 25.9753 | 0.7409 | 0.1244 | 0.2626 |
DIV2K_valid | No reference model | 25.8533 | 0.7452 | 0.1259 | 0.2547 |
Urban100 | ESRGAN trained with | 13.9608 | 0.4315 | 0.4033 | 0.6819 |
Urban100 | ESRGAN trained with | 22.1408 | 0.6860 | 0.1598 | 0.5072 |
Urban100 | No reference model | 22.1086 | 0.6193 | 0.1624 | 0.4574 |
Set14 | ESRGAN trained with | 13.3721 | 0.5037 | 0.4509 | 0.5808 |
Set14 | ESRGAN trained with | 23.9951 | 0.7653 | 0.1432 | 0.5985 |
Set14 | No reference model | 23.8403 | 0.7581 | 0.1498 | 0.5224 |
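
For orientation on why the choice of reference model matters, the following is a minimal sketch of the standard DPO loss from Rafailov et al. [7] for a single preference pair. It is not the SR-DPO loss of this paper, whose Equations (4)–(7) are not reproduced in this excerpt. The function name `dpo_loss`, its scalar log-likelihood arguments, and the default `beta` are illustrative assumptions.

```python
import math


def dpo_loss(logp_w, logp_l, ref_logp_w, ref_logp_l, beta=0.1):
    """Standard DPO loss for one (preferred, rejected) pair.

    logp_w, logp_l: log-likelihoods of the preferred and rejected outputs
    under the model being trained (hypothetical scalar inputs).
    ref_logp_w, ref_logp_l: the same quantities under the frozen reference model.
    beta: strength of the implicit KL regularization (assumed value).
    """
    # Implicit reward margin: how much more the trained model favors the
    # preferred output over the rejected one, relative to the reference model.
    margin = beta * ((logp_w - ref_logp_w) - (logp_l - ref_logp_l))
    # -log(sigmoid(margin)) written in a numerically stable form.
    if margin >= 0:
        return math.log1p(math.exp(-margin))
    return -margin + math.log1p(math.exp(margin))


if __name__ == "__main__":
    # Identical to the reference model: margin is 0, loss is log(2).
    print(dpo_loss(-1.0, -1.0, -1.0, -1.0))
    # The trained model favors the preferred output more than the reference does.
    print(dpo_loss(-0.5, -2.0, -1.0, -1.0))
```

Because the reference log-likelihoods enter only through the two log-ratios, a poorly chosen reference shifts the margin for every pair, which is one plausible reading of why the reference model rows above differ so strongly.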

Dataset | PSNR ↑ | SSIM ↑ | LPIPS ↓ | PieAPP ↓ | NIQE ↓ |
---|---|---|---|---|---|
DIV2K_valid | 25.9707 | 0.7446 | 0.1186 | 0.5191 | 4.0304 |
Urban100 | 22.1383 | 0.6866 | 0.1566 | 0.7609 | 4.0996 |
BSD100 | 23.9345 | 0.6150 | 0.1613 | 0.6565 | 4.6466 |
Set5 | 26.8093 | 0.8044 | 0.0718 | 0.5903 | 4.9209 |
Set14 | 24.2466 | 0.7757 | 0.1337 | 0.7321 | 4.4916 |

Dataset | PSNR ↑ | SSIM ↑ | LPIPS ↓ | PieAPP ↓ | NIQE ↓ |
---|---|---|---|---|---|
DIV2K_valid | 25.7276 | 0.7134 | 0.1282 | 0.2828 | 4.1035 |
Urban100 | 21.9077 | 0.6654 | 0.1601 | 0.5648 | 4.0682 |
BSD100 | 23.5713 | 0.6057 | 0.1716 | 0.3612 | 4.9198 |
Set5 | 26.6223 | 0.8011 | 0.0784 | 0.3221 | 4.7340 |
Set14 | 23.3602 | 0.7508 | 0.1448 | 0.5263 | 4.6600 |

Dataset | Method | PSNR ↑ | SSIM ↑ | LPIPS ↓ | PieAPP ↓ | NIQE ↓ |
---|---|---|---|---|---|---|
DIV2K_valid | Bicubic | 26.6942 | 0.7663 | 0.3407 | 0.5804 | 7.2218 |
DIV2K_valid | Our DPO method | 25.9753 | 0.7409 | 0.1244 | 0.2626 | 4.2588 |
DIV2K_valid | Real-ESRGAN [13] | 21.8301 | 0.6209 | 0.2758 | 1.6711 | 3.5661 |
DIV2K_valid | StarSRGAN [43] | 24.5426 | 0.7147 | 0.1464 | 0.6914 | 3.0836 |
DIV2K_valid | ESRGAN-DP [10] | 26.5069 | 0.7555 | 0.0912 | 0.3996 | 2.8800 |
DIV2K_valid | A-ESRGAN [11] | 22.7443 | 0.6550 | 0.2358 | 1.3846 | 3.1801 |
DIV2K_valid | MSA-ESRGAN [6] | 24.8405 | 0.7204 | 0.1799 | 1.1802 | 3.5811 |
DIV2K_valid | SeD [44] | 27.7939 | 0.7934 | 0.0751 | 0.3469 | 3.1302 |
Urban100 | Bicubic | 21.6991 | 0.6517 | 0.4205 | 1.1409 | 7.1941 |
Urban100 | Our DPO method | 22.1408 | 0.6860 | 0.1598 | 0.5072 | 4.4582 |
Urban100 | Real-ESRGAN [13] | 18.1046 | 0.5409 | 0.2666 | 2.2056 | 4.2697 |
Urban100 | StarSRGAN [43] | 20.3558 | 0.6483 | 0.1712 | 1.1067 | 3.5930 |
Urban100 | ESRGAN-DP [10] | 22.7345 | 0.7149 | 0.1086 | 0.6753 | 3.6833 |
Urban100 | A-ESRGAN [11] | 18.7962 | 0.5721 | 0.2451 | 1.4945 | 3.5516 |
Urban100 | MSA-ESRGAN [6] | 21.0561 | 0.6575 | 0.1804 | 1.6531 | 4.0658 |
Urban100 | SeD [44] | 24.3847 | 0.7714 | 0.0887 | 0.6458 | 3.9645 |
BSD100 | Bicubic | 24.6507 | 0.6415 | 0.4561 | 0.7388 | 7.5764 |
BSD100 | Our DPO method | 23.5257 | 0.5998 | 0.1696 | 0.3798 | 4.6984 |
BSD100 | Real-ESRGAN [13] | 21.1057 | 0.5145 | 0.3272 | 2.0294 | 3.9643 |
BSD100 | StarSRGAN [43] | 22.7458 | 0.6093 | 0.1769 | 0.9638 | 3.8182 |
BSD100 | ESRGAN-DP [10] | 23.9337 | 0.6397 | 0.1328 | 0.6465 | 3.3965 |
BSD100 | A-ESRGAN [11] | 21.3103 | 0.5405 | 0.2711 | 1.5606 | 3.6621 |
BSD100 | MSA-ESRGAN [6] | 23.5678 | 0.6121 | 0.2372 | 1.2789 | 4.0619 |
BSD100 | SeD [44] | 25.0264 | 0.6638 | 0.1224 | 0.5664 | 3.5617 |
Set5 | Bicubic | 26.6902 | 0.7899 | 0.3004 | 0.9857 | 8.2823 |
Set5 | Our DPO method | 26.8175 | 0.7996 | 0.0748 | 0.3564 | 5.3249 |
Set5 | Real-ESRGAN [13] | 21.6163 | 0.6269 | 0.2277 | 1.5581 | 5.7241 |
Set5 | StarSRGAN [43] | 24.9597 | 0.7691 | 0.1107 | 0.7751 | 4.3798 |
Set5 | ESRGAN-DP [10] | 28.3406 | 0.8287 | 0.0598 | 0.3684 | 4.1061 |
Set5 | A-ESRGAN [11] | 21.9001 | 0.6427 | 0.1735 | 0.6849 | 4.8551 |
Set5 | MSA-ESRGAN [6] | 24.3240 | 0.7403 | 0.1438 | 1.4387 | 6.5172 |
Set5 | SeD [44] | 29.3011 | 0.8511 | 0.0521 | 0.3867 | 5.3606 |
Set14 | Bicubic | 24.2384 | 0.7737 | 0.3862 | 0.7347 | 7.7363 |
Set14 | Our DPO method | 23.9951 | 0.7653 | 0.1432 | 0.5985 | 4.6497 |
Set14 | Real-ESRGAN [13] | 20.3295 | 0.5336 | 0.2977 | 2.7718 | 4.2195 |
Set14 | StarSRGAN [43] | 23.3203 | 0.6298 | 0.1732 | 1.0428 | 3.8000 |
Set14 | ESRGAN-DP [10] | 24.4899 | 0.6859 | 0.1158 | 0.7899 | 3.3932 |
Set14 | A-ESRGAN [11] | 20.7806 | 0.6487 | 0.2367 | 1.7234 | 3.6678 |
Set14 | MSA-ESRGAN [6] | 23.3530 | 0.7386 | 0.1979 | 1.3044 | 4.5716 |
Set14 | SeD [44] | 25.4956 | 0.8026 | 0.0969 | 0.6217 | 3.8461 |
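
As a reading aid for the PSNR ↑ column in the tables above, here is a minimal sketch of how PSNR is conventionally computed from the mean squared error between a ground-truth HR image and an SR output. The evaluation protocol used for the reported numbers (color space, border cropping, value range) is not stated in this excerpt, so the flat pixel sequences, the function name `psnr`, and the default `max_value` of 255 are assumptions.

```python
import math


def psnr(reference, estimate, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images,
    each given as a flat sequence of pixel intensities (assumed layout)."""
    if len(reference) != len(estimate) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all pixels.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


if __name__ == "__main__":
    hr = [52, 55, 61, 59]  # toy ground-truth pixels
    sr = [54, 55, 60, 57]  # toy super-resolved pixels
    print(f"PSNR: {psnr(hr, sr):.2f} dB")
```

A higher value means a smaller pixel-wise error, which is why the column is marked with an upward arrow, whereas LPIPS, PieAPP, and NIQE are marked with a downward arrow because lower is better.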
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Yun, W.; Park, H. DPO-ESRGAN: Perceptually Enhanced Super-Resolution Using Direct Preference Optimization. Electronics 2025, 14, 3357. https://doi.org/10.3390/electronics14173357