Frequency-Auxiliary One-Shot Domain Adaptation of Generative Adversarial Networks
Abstract
1. Introduction
- We analyze the limitations of previous methods, which lack explicit modeling for the domain adaptation task, and introduce a frequency-domain perspective into generative domain adaptation.
- We propose a novel method called Frequency-Auxiliary GAN (FAGAN), which mainly contains two submodules: the low-frequency fusion module (LFF) and the high-frequency guide module (HFG), designed to improve the diversity and the fidelity of the generated images, respectively (a wavelet-decomposition sketch follows this list).
- Extensive experiments demonstrate that our method outperforms state-of-the-art methods on several benchmark datasets in both qualitative and quantitative comparisons.
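Both modules operate on wavelet sub-bands (DWT/IDWT; see the Abbreviations). The following minimal NumPy sketch shows the single-level 2D Haar decomposition and its inverse that such frequency-auxiliary methods build on; the function names and band labels are illustrative assumptions, not the paper's code.

```python
# Minimal sketch: single-level 2D Haar DWT/IDWT on one image channel.
# Band naming (LL = low-frequency, LH/HL/HH = high-frequency details)
# follows common wavelet convention; not the authors' implementation.
import numpy as np

def haar_dwt2(x):
    """Split an H x W array (H, W even) into one low-frequency band (LL)
    and three high-frequency detail bands (LH, HL, HH)."""
    a, b = x[0::2, 0::2], x[0::2, 1::2]  # top-left, top-right of each 2x2 block
    c, d = x[1::2, 0::2], x[1::2, 1::2]  # bottom-left, bottom-right
    ll = (a + b + c + d) / 2.0  # low-pass in both directions
    lh = (a + b - c - d) / 2.0  # detail across rows
    hl = (a - b + c - d) / 2.0  # detail across columns
    hh = (a - b - c + d) / 2.0  # diagonal detail
    return ll, lh, hl, hh

def haar_idwt2(ll, lh, hl, hh):
    """Inverse transform; exactly reconstructs the input of haar_dwt2."""
    h, w = ll.shape
    x = np.empty((2 * h, 2 * w), dtype=ll.dtype)
    x[0::2, 0::2] = (ll + lh + hl + hh) / 2.0
    x[0::2, 1::2] = (ll + lh - hl - hh) / 2.0
    x[1::2, 0::2] = (ll - lh + hl - hh) / 2.0
    x[1::2, 1::2] = (ll - lh - hl + hh) / 2.0
    return x

img = np.random.rand(256, 256).astype(np.float32)
assert np.allclose(haar_idwt2(*haar_dwt2(img)), img, atol=1e-5)
```

Intuitively, an LFF-style module would fuse the LL band (overall layout and color, which drives diversity), while an HFG-style module would constrain the LH/HL/HH bands (edges and texture, which drive fidelity).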
2. Related Work
2.1. Domain Adaptation of GANs
2.2. Frequency Information Used in GANs
3. Our Method
3.1. Low-Frequency Fusion Module
3.2. High-Frequency Guide Module
Algorithm 1 Construction of HFG
Input: the sampled generated images of the target model; the reference image.
Output: the proposed loss, which is used to update the parameters of the target model.
Stage 1: Local Selection
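Read together with the ablation on the number of tokens selected by the HFG module (Section 4.4.2), one plausible construction of this loss is sketched below in PyTorch: take the high-frequency residual of the generated and reference images, split it into patch tokens, locally select the k most energetic reference tokens, and penalize the mismatch of the generated tokens at those positions. The patch size, the energy-based top-k selection, and the L1 penalty are all assumptions for illustration; the actual module may operate on ViT-style feature tokens rather than raw pixel patches.

```python
# Hedged sketch of an HFG-style loss (not the authors' code): local selection
# of high-frequency tokens, then an L1 match at the selected positions.
import torch
import torch.nn.functional as F

def high_freq(x):
    """High-frequency residual of a (B, C, H, W) batch: the image minus its
    2x2 average-pooled (low-pass) approximation, upsampled back."""
    low = F.avg_pool2d(x, 2)
    return x - F.interpolate(low, scale_factor=2, mode="nearest")

def hfg_loss(generated, reference, patch=16, k=32):
    """Stage 1 (local selection): pick the k reference tokens with the most
    high-frequency energy. Then L1-match the generated tokens at those indices."""
    reference = reference.expand_as(generated)  # allow a single reference image
    hf_gen, hf_ref = high_freq(generated), high_freq(reference)
    # Non-overlapping patch tokens: (B, num_tokens, C * patch * patch).
    tok_gen = F.unfold(hf_gen, kernel_size=patch, stride=patch).transpose(1, 2)
    tok_ref = F.unfold(hf_ref, kernel_size=patch, stride=patch).transpose(1, 2)
    energy = tok_ref.pow(2).mean(dim=-1)              # (B, num_tokens)
    idx = energy.topk(k, dim=1).indices               # (B, k)
    idx = idx.unsqueeze(-1).expand(-1, -1, tok_ref.size(-1))
    return F.l1_loss(tok_gen.gather(1, idx), tok_ref.gather(1, idx))
```

In practice, such a term would be weighted and combined with the other objectives in the overall training loss (Section 3.3).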
3.3. Overall Training Loss
4. Experiments
4.1. Experimental Settings
4.1.1. Implementation Details
4.1.2. Datasets
4.1.3. Evaluation Metrics
4.2. Qualitative Results
4.3. Quantitative Results
4.4. Ablation Study
4.4.1. The Number of Layers Employing LFF-Module
4.4.2. The Number of Tokens Selected by HFG-Module
4.4.3. Effect of Two Frequency Modules
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| LFF | low-frequency fusion module |
| HFG | high-frequency guide module |
| FAGAN | frequency-auxiliary GAN |
| DWT | discrete wavelet transformation |
| IDWT | inverse discrete wavelet transformation |
References
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. Adv. Neural Inf. Process. Syst. 2014, 27. [Google Scholar] [CrossRef]
- Karras, T.; Aila, T.; Laine, S.; Lehtinen, J. Progressive growing of GANs for improved quality, stability, and variation. arXiv 2017, arXiv:1710.10196. [Google Scholar]
- Karras, T.; Laine, S.; Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 4401–4410. [Google Scholar]
- Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 8110–8119. [Google Scholar]
- Luo, W.; Yang, S.; Wang, H.; Long, B.; Zhang, W. Context-consistent semantic image editing with style-preserved modulation. In Proceedings of the European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2022; pp. 561–578. [Google Scholar]
- Li, N.; Plummer, B.A. Supervised attribute information removal and reconstruction for image manipulation. In Proceedings of the European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2022; pp. 457–473. [Google Scholar]
- Wang, T.; Zhang, Y.; Fan, Y.; Wang, J.; Chen, Q. High-fidelity GAN inversion for image attribute editing. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 11379–11388. [Google Scholar]
- Tian, C.; Zhang, X.; Lin, J.C.W.; Zuo, W.; Zhang, Y.; Lin, C.W. Generative adversarial networks for image super-resolution: A survey. arXiv 2022, arXiv:2204.13620. [Google Scholar]
- Li, B.; Li, X.; Zhu, H.; Jin, Y.; Feng, R.; Zhang, Z.; Chen, Z. SeD: Semantic-Aware Discriminator for Image Super-Resolution. arXiv 2024, arXiv:2402.19387. [Google Scholar]
- Yang, T.; Ren, P.; Xie, X.; Zhang, L. GAN prior embedded network for blind face restoration in the wild. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 672–681. [Google Scholar]
- Wang, Y.; Holynski, A.; Zhang, X.; Zhang, X. Sunstage: Portrait reconstruction and relighting using the sun as a light stage. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 20792–20802. [Google Scholar]
- Koley, S.; Bhunia, A.K.; Sain, A.; Chowdhury, P.N.; Xiang, T.; Song, Y.Z. Picture that sketch: Photorealistic image generation from abstract sketches. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 6850–6861. [Google Scholar]
- Careil, M.; Verbeek, J.; Lathuilière, S. Few-shot semantic image synthesis with class affinity transfer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 23611–23620. [Google Scholar]
- Karras, T.; Aittala, M.; Hellsten, J.; Laine, S.; Lehtinen, J.; Aila, T. Training generative adversarial networks with limited data. Adv. Neural Inf. Process. Syst. 2020, 33, 12104–12114. [Google Scholar]
- Yang, C.; Shen, Y.; Xu, Y.; Zhou, B. Data-efficient instance generation from instance discrimination. Adv. Neural Inf. Process. Syst. 2021, 34, 9378–9390. [Google Scholar]
- Tseng, H.Y.; Jiang, L.; Liu, C.; Yang, M.H.; Yang, W. Regularizing generative adversarial networks under limited data. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 7921–7931. [Google Scholar]
- Li, T.; Li, Z.; Rockwell, H.; Farimani, A.; Lee, T.S. Prototype memory and attention mechanisms for few shot image generation. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023. [Google Scholar]
- Ojha, U.; Li, Y.; Lu, J.; Efros, A.A.; Jae Lee, Y.; Shechtman, E.; Zhang, R. Few-shot Image Generation via Cross-domain Correspondence. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021. [Google Scholar] [CrossRef]
- Robb, E.; Chu, W.S.; Kumar, A.; Huang, J.B. Few-shot Adaptation of Generative Adversarial Networks. arXiv 2021, arXiv:2010.11943. [Google Scholar]
- Wang, Y.; Wu, C.; Herranz, L.; van de Weijer, J.; Gonzalez-Garcia, A.; Raducanu, B. Transferring GANs: Generating images from limited data. In Proceedings of the European Conference on Computer Vision, Munich, Germany, 8–14 September 2018. [Google Scholar]
- Xiao, J.; Li, L.; Wang, C.; Zha, Z.J.; Huang, Q. Few Shot Generative Model Adaption via Relaxed Spatial Structural Alignment. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022. [Google Scholar]
- Zhao, Y.; Ding, H.; Huang, H.; Cheung, N.M. A Closer Look at Few-shot Image Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022. [Google Scholar]
- Zhu, P.; Abdal, R.; Femiani, J.; Wonka, P. Mind the Gap: Domain Gap Control for Single Shot Domain Adaptation for Generative Adversarial Networks. arXiv 2021, arXiv:2110.08398. [Google Scholar]
- Gal, R.; Patashnik, O.; Maron, H.; Chechik, G.; Cohen-Or, D. StyleGAN-NADA: CLIP-Guided Domain Adaptation of Image Generators. ACM Trans. Graph. 2022, 41, 1–13. [Google Scholar] [CrossRef]
- Kwon, G.; Ye, J. One-Shot Adaptation of GAN in Just One CLIP. IEEE Trans. Pattern Anal. Mach. Intell. 2022, 45, 12179–12191. [Google Scholar] [CrossRef]
- Kim, S.; Kang, K.; Kim, G.; Baek, S.H.; Cho, S. DynaGAN: Dynamic Few-Shot Adaptation of GANs to Multiple Domains. In SIGGRAPH Asia 2022 Conference Papers; ACM: New York, NY, USA, 2022. [Google Scholar]
- Zhang, Y.; Yao, M.; Wei, Y.; Ji, Z.; Bai, J.; Zuo, W. Towards Diverse and Faithful One-shot Adaption of Generative Adversarial Networks. Adv. Neural Inf. Process. Syst. 2022, 35, 37297–37308. [Google Scholar]
- Radford, A.; Kim, J.W.; Hallacy, C.; Ramesh, A.; Goh, G.; Agarwal, S.; Sastry, G.; Askell, A.; Mishkin, P.; Clark, J.; et al. Learning Transferable Visual Models From Natural Language Supervision. In Proceedings of the 38th International Conference on Machine Learning, Virtual, 18–24 July 2021. [Google Scholar]
- Zhu, P.; Abdal, R.; Qin, Y.; Wonka, P. Improved StyleGAN Embedding: Where are the Good Latents? arXiv 2020, arXiv:2012.09036. [Google Scholar]
- Mo, S.; Cho, M.; Shin, J. Freeze the Discriminator: A Simple Baseline for Fine-Tuning GANs. arXiv 2020, arXiv:2002.10964. [Google Scholar]
- Zhao, M.; Yang, C.; Carin, L. On Leveraging Pretrained GANs for Generation with Limited Data. In Proceedings of the 37th International Conference on Machine Learning, Online, 13–18 July 2020. [Google Scholar]
- Noguchi, A.; Harada, T. Image Generation From Small Datasets via Batch Statistics Adaptation. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019. [Google Scholar] [CrossRef]
- Hou, X.; Liu, B.; Zhang, S.; Shi, L.; Jiang, Z.; You, H. Dynamic Weighted Semantic Correspondence for Few-Shot Image Generative Adaptation. In Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, Portugal, 10–14 October 2022. [Google Scholar] [CrossRef]
- Tov, O.; Alaluf, Y.; Nitzan, Y.; Patashnik, O.; Cohen-Or, D. Designing an encoder for StyleGAN image manipulation. ACM Trans. Graph. 2021, 40, 1–14. [Google Scholar] [CrossRef]
- Daubechies, I. The wavelet transform, time-frequency localization and signal analysis. IEEE Trans. Inf. Theory 1990, 36, 961–1005. [Google Scholar] [CrossRef]
- Gao, Y.; Wei, F.; Bao, J.; Gu, S.; Chen, D.; Wen, F.; Lian, Z. High-Fidelity and Arbitrary Face Editing. In Proceedings of the 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021. [Google Scholar] [CrossRef]
- Jiang, L.; Dai, B.; Wu, W.; Loy, C.C. Focal Frequency Loss for Image Reconstruction and Synthesis. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021. [Google Scholar] [CrossRef]
- Yu, Y.; Zhan, F.; Lu, S.; Pan, J.; Ma, F.; Xie, X.; Miao, C. WaveFill: A Wavelet-based Generation Network for Image Inpainting. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021. [Google Scholar] [CrossRef]
- Yoo, J.; Uh, Y.; Chun, S.; Kang, B.; Ha, J.W. Photorealistic Style Transfer via Wavelet Transforms. In Proceedings of the 2019 IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019. [Google Scholar] [CrossRef]
- Yang, M.; Wang, Z.; Chi, Z.; Zhang, Y. FreGAN: Exploiting Frequency Components for Training GANs under Limited Data. Adv. Neural Inf. Process. Syst. 2022, 35, 33387–33399. [Google Scholar]
- Yang, M.; Wang, Z.; Chi, Z.; Feng, W. WaveGAN: Frequency-aware GAN for High-Fidelity Few-shot Image Generation. In European Conference on Computer Vision; Springer: Cham, Switzerland, 2022. [Google Scholar]
- Bhardwaj, J.; Nayak, A. Haar wavelet transform—Based optimal Bayesian method for medical image fusion. Med. Biol. Eng. Comput. 2020, 58, 2397–2411. [Google Scholar] [CrossRef]
- Gu, Z.; Li, W.; Huo, J.; Wang, L.; Gao, Y. LoFGAN: Fusing Local Representations for Few-Shot Image Generation. In Proceedings of the 2021 IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021. [Google Scholar] [CrossRef]
- Choi, Y.; Uh, Y.; Yoo, J.; Ha, J.W. StarGAN v2: Diverse Image Synthesis for Multiple Domains. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020. [Google Scholar]
- Kingma, D.; Ba, J. Adam: A Method for Stochastic Optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
- Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
- Liu, M.; Li, Q.; Qin, Z.; Zhang, G.; Wan, P.; Zheng, W. BlendGAN: Implicitly GAN Blending for Arbitrary Stylized Face Generation. Adv. Neural Inf. Process. Syst. 2021, 34, 29710–29722. [Google Scholar]
- Yaniv, J.; Newman, Y.; Shamir, A. The face of art. ACM Trans. Graph. 2019, 38, 1–15. [Google Scholar] [CrossRef]
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. Adv. Neural Inf. Process. Syst. 2017, 30. [Google Scholar] [CrossRef]
- Bińkowski, M.; Sutherland, D.; Arbel, M.; Gretton, A. Demystifying MMD GANs. arXiv 2018, arXiv:1801.01401. [Google Scholar]
| Methods | FFHQ: Amedeo | FFHQ: Fernand | FFHQ: Raphael | AFHQ-Cat: Fox | AFHQ-Cat: Lion | AFHQ-Cat: Tiger |
|---|---|---|---|---|---|---|
| FSA [18] | 173.66 ± 19.43 | 227.19 ± 20.33 | 177.43 ± 29.29 | - | - | - |
| NADA [24] | 187.42 ± 21.33 | 258.35 ± 19.17 | 185.37 ± 30.14 | 84.28 ± 14.33 | 60.55 ± 17.44 | 16.74 ± 2.57 |
| MTG [23] | 207.17 ± 33.18 | 306.03 ± 51.29 | 184.35 ± 27.41 | 67.64 ± 17.77 | 63.28 ± 18.05 | 21.25 ± 8.32 |
| DynaGAN [26] | 229.19 ± 27.32 | 317.97 ± 47.98 | 186.03 ± 14.20 | 93.04 ± 20.01 | 86.35 ± 17.33 | 25.14 ± 9.79 |
| DiFa [27] | 178.72 ± 18.95 | 255.17 ± 16.61 | 170.76 ± 9.15 | 69.92 ± 15.55 | 42.01 ± 11.28 | 16.26 ± 2.88 |
| FAGAN (Ours) | 170.68 ± 17.99 | 281.74 ± 30.43 | 166.36 ± 14.70 | 65.39 ± 10.66 | 35.20 ± 10.98 | 21.41 ± 4.33 |
| Methods | FFHQ: Amedeo | FFHQ: Fernand | FFHQ: Raphael | AFHQ-Cat: Fox | AFHQ-Cat: Lion | AFHQ-Cat: Tiger |
|---|---|---|---|---|---|---|
| FSA [18] | 178.04 ± 4.97 | 190.26 ± 13.88 | 164.64 ± 60.19 | - | - | - |
| NADA [24] | 131.77 ± 23.57 | 177.47 ± 30.48 | 148.32 ± 45.89 | 77.31 ± 35.24 | 54.37 ± 18.02 | 14.22 ± 2.44 |
| MTG [23] | 138.55 ± 30.12 | 205.01 ± 28.11 | 128.51 ± 37.26 | 69.97 ± 25.99 | 69.55 ± 21.74 | 21.50 ± 7.78 |
| DynaGAN [26] | 159.31 ± 31.77 | 219.81 ± 23.89 | 112.08 ± 27.39 | 82.73 ± 37.88 | 77.14 ± 22.19 | 22.81 ± 8.10 |
| DiFa [27] | 120.78 ± 23.91 | 167.22 ± 33.15 | 110.13 ± 17.77 | 51.01 ± 31.24 | 36.73 ± 20.45 | 13.29 ± 1.97 |
| FAGAN (Ours) | 116.79 ± 21.58 | 189.67 ± 25.82 | 103.33 ± 23.15 | 47.91 ± 24.43 | 34.62 ± 16.80 | 20.90 ± 9.31 |
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).