ImbDef-GAN: Defect Image-Generation Method Based on Sample Imbalance
Abstract
1. Introduction
- (1) A lightweight StyleGAN3 [17] variant jointly generates the background and a background mask, while a matching discriminator enforces coherence between the generated background and mask. The Progress-coupled Gated Detail Injection (PGDI) module regulates detail strength according to training progress, raising high-frequency fidelity while maintaining training stability. The coupling of joint mask generation with coherence supervision and progress-driven detail control yields more realistic backgrounds and enables the subsequent defect-generation stage to capture complete morphology and sharper boundaries.
- (2) To mitigate unnatural transitions at defect boundaries under sample imbalance, the defect stage augments the background generator with a residual defect-feature branch, while a smoothing coefficient blends defect features with background features, yielding more natural boundaries and more realistic defect regions.
- (3) To address misalignment between defect masks and generated images, a mask-aware matching discriminator propagates mask information through a multilayer feature extractor, using an explicit image–mask matching signal to enforce layer-wise alignment and strengthen spatial localization of defect regions.
- (4) To increase the diversity of defect regions and avoid generation in invalid background areas, Edge Structure Loss (ESL) promotes boundary-aware morphological variation, while Region Consistency Loss (RCL) restricts the defect mask to the valid region of the background mask. Taken together, these objectives provide a unified treatment of boundary structure and region validity under sample imbalance.
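Contributions (1) and (2) both amount to gated blending operations. The excerpt gives no formulas, so the following is a minimal NumPy sketch under stated assumptions: the sigmoid schedule in `pgdi_gate` (with illustrative hyperparameters `k` and `midpoint`) and the scalar smoothing coefficient `alpha` are hypothetical stand-ins for the paper's actual parameterizations.

```python
import numpy as np

def pgdi_gate(progress: float, k: float = 10.0, midpoint: float = 0.5) -> float:
    """Progress-coupled gate: near 0 early in training, near 1 late.

    The sigmoid schedule and its hyperparameters are assumptions for
    illustration, not taken from the paper.
    """
    return float(1.0 / (1.0 + np.exp(-k * (progress - midpoint))))

def inject_detail(coarse: np.ndarray, detail: np.ndarray, progress: float) -> np.ndarray:
    """Add a high-frequency detail map to coarse features, scaled by the gate,
    so detail strength grows as training stabilizes."""
    return coarse + pgdi_gate(progress) * detail

def blend_defect(background: np.ndarray, defect_residual: np.ndarray,
                 mask: np.ndarray, alpha: float = 0.7) -> np.ndarray:
    """Composite a residual defect branch onto the background inside the
    (soft) defect mask; `alpha` smooths the transition at defect boundaries."""
    return background + alpha * mask * defect_residual
```

With `progress = 0` the gate suppresses detail injection almost entirely, and where the defect mask is zero the blended output reduces to the unmodified background, which is the behavior the two bullets describe.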
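Contribution (4) names two mask-level objectives without giving their forms. As a hedged sketch, one consistent reading is: ESL compares boundary structure via edge maps of the masks, and RCL penalizes defect-mask mass falling outside the valid region of the background mask. The finite-difference edge map, the L1 edge comparison, and the product-form penalty below are all assumptions, not the paper's definitions.

```python
import numpy as np

def edge_map(mask: np.ndarray) -> np.ndarray:
    """Finite-difference edge magnitude of a soft mask in [0, 1]."""
    gy, gx = np.gradient(mask)
    return np.hypot(gx, gy)

def edge_structure_loss(gen_mask: np.ndarray, ref_mask: np.ndarray) -> float:
    """Hypothetical ESL: L1 distance between the edge maps of a generated
    defect mask and a reference, encouraging boundary-aware structure."""
    return float(np.mean(np.abs(edge_map(gen_mask) - edge_map(ref_mask))))

def region_consistency_loss(defect_mask: np.ndarray, bg_mask: np.ndarray) -> float:
    """Hypothetical RCL: mean defect-mask mass outside the valid (foreground)
    region of the background mask; zero when the defect lies fully inside."""
    return float(np.mean(defect_mask * (1.0 - bg_mask)))
```

Under this reading, RCL vanishes exactly when the defect mask is confined to the background's valid region, which matches the stated goal of avoiding generation in invalid background areas.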
2. Related Work
2.1. Few-Shot Image Generation
2.2. Defect Image Generation
3. Methods
3.1. Background Image Generation
3.1.1. Progress-Coupled Gated Detail Injection Module
3.1.2. Matching Discrimination
3.2. Defect Image Generation with Background Conditioning
3.2.1. Defect Feature Extraction
3.2.2. Mask-Aware Matching Discriminator
3.2.3. Edge Structure Loss
3.2.4. Region Consistency Loss
4. Experiments and Results
4.1. Experimental Setup
4.1.1. Dataset: MVTec AD
4.1.2. Implementation Details
4.1.3. Evaluation Metrics
4.2. Background Image Generation Experiments
4.3. Defect Image Generation Experiments
4.3.1. Multi-Class Defect Image Generation on a Shared Background
4.3.2. Comparative Experiments on Defect Image Generation
4.3.3. Comparative Experiments for 5-Shot and 1-Shot Defect Image Generation
4.3.4. Comparative Experiments on Defect Image Generation in Other Scenarios
4.3.5. Ablation Study
4.4. Comparative Experiments on Defect Detection
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Pang, G.; Shen, C.; Cao, L.; Van Den Hengel, A. Deep learning for anomaly detection: A review. ACM Comput. Surv. (CSUR) 2021, 54, 1–38.
- Zhao, W.; Chen, F.; Huang, H.; Li, D.; Cheng, W. A new steel defect detection algorithm based on deep learning. Comput. Intell. Neurosci. 2021, 2021, 5592878.
- Zhao, C.; Shu, X.; Yan, X.; Zuo, X.; Zhu, F. RDD-YOLO: A modified YOLO for detection of steel surface defects. Measurement 2023, 214, 112776.
- Guo, Z.; Wang, C.; Yang, G.; Huang, Z.; Li, G. Msft-YOLO: Improved YOLOv5 based on transformer for detecting defects of steel surface. Sensors 2022, 22, 3467.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You only look once: Unified, real-time object detection. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 779–788.
- Ren, S.; He, K.; Girshick, R.; Sun, J. Faster R-CNN: Towards real-time object detection with region proposal networks. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Montreal, QC, Canada, 7–12 December 2015; Volume 28.
- Zavrtanik, V.; Kristan, M.; Skočaj, D. DSR—A dual subspace re-projection network for surface anomaly detection. In Proceedings of the European Conference on Computer Vision (ECCV), Tel Aviv, Israel, 23–27 October 2022; pp. 539–554.
- Bergmann, P.; Fauser, M.; Sattlegger, D.; Steger, C. Uninformed students: Student-teacher anomaly detection with discriminative latent embeddings. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 4183–4192.
- DeVries, T.; Taylor, G.W. Improved regularization of convolutional neural networks with cutout. arXiv 2017, arXiv:1708.04552.
- Li, C.-L.; Sohn, K.; Yoon, J.; Pfister, T. CutPaste: Self-supervised learning for anomaly detection and localization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 9664–9674.
- Lin, D.; Cao, Y.; Zhu, W.; Li, Y. Few-shot defect segmentation leveraging abundant defect free training samples through normal background regularization and crop-and-paste operation. arXiv 2020, arXiv:2007.09438.
- Ho, J.; Jain, A.; Abbeel, P. Denoising diffusion probabilistic models. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Virtual Conference, 6–12 December 2020; Volume 33, pp. 6840–6851.
- Goodfellow, I.; Pouget-Abadie, J.; Mirza, M.; Xu, B.; Warde-Farley, D.; Ozair, S.; Courville, A.; Bengio, Y. Generative adversarial nets. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Montreal, QC, Canada, 8–13 December 2014; Volume 27.
- Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2223–2232.
- Karras, T.; Laine, S.; Aila, T. A style-based generator architecture for generative adversarial networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 4401–4410.
- Karras, T.; Laine, S.; Aittala, M.; Hellsten, J.; Lehtinen, J.; Aila, T. Analyzing and improving the image quality of StyleGAN. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 14–19 June 2020; pp. 8110–8119.
- Karras, T.; Aittala, M.; Laine, S.; Härkönen, E.; Hellsten, J.; Lehtinen, J.; Aila, T. Alias-free generative adversarial networks. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Virtual Conference, 6–14 December 2021; Volume 34, pp. 852–863.
- Duan, Y.; Hong, Y.; Niu, L.; Zhang, L. Few-shot defect image generation via defect-aware feature manipulation. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Washington, DC, USA, 7–14 February 2023; Volume 37, pp. 571–578.
- Hu, T.; Zhang, J.; Yi, R.; Du, Y.; Chen, X.; Liu, L.; Wang, Y.; Wang, C. AnomalyDiffusion: Few-shot anomaly image generation with diffusion model. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 8526–8534.
- Niu, S.; Li, B.; Wang, X.; Lin, H. Defect image sample generation with GAN for improving defect recognition. IEEE Trans. Autom. Sci. Eng. 2020, 17, 1611–1622.
- Zhang, G.; Cui, K.; Hung, T.-Y.; Lu, S. Defect-GAN: High-fidelity defect synthesis for automated defect inspection. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, HI, USA, 5–9 January 2021; pp. 2524–2534.
- Liu, R.; Liu, W.; Zheng, Z.; Wang, L.; Mao, L.; Qiu, Q.; Ling, G. Anomaly-GAN: A data augmentation method for train surface anomaly detection. Expert Syst. Appl. 2023, 228, 120284.
- Deng, F.; Luo, J.; Fu, L.; Huang, Y.; Chen, J.; Li, N.; Zhong, J.; Lam, T.L. DG2GAN: Improving defect recognition performance with generated defect image sample. Sci. Rep. 2024, 14, 14787.
- He, Z.; Wu, K.; Wen, Y. Defect image generation through feature disentanglement using StyleGAN2-ADA. Neurocomputing 2025, 647, 130455.
- Khanam, R.; Hussain, M. YOLOv11: An overview of the key architectural enhancements. arXiv 2024, arXiv:2410.17725.
- Mo, S.; Cho, M.; Shin, J. Freeze the discriminator: A simple baseline for fine-tuning GANs. arXiv 2020, arXiv:2002.10964.
- Duan, Y.; Niu, L.; Hong, Y.; Zhang, L. WeditGAN: Few-shot image generation via latent space relocation. In Proceedings of the AAAI Conference on Artificial Intelligence (AAAI), Vancouver, BC, Canada, 20–27 February 2024; Volume 38, pp. 1653–1661.
- Zhao, Y.; Chandrasegaran, K.; Abdollahzadeh, M.; Du, C.; Pang, T.; Li, R.; Ding, H.; Cheung, N.-M. AdAM: Few-shot image generation via adaptation-aware kernel modulation. arXiv 2023, arXiv:2307.01465.
- Gal, R.; Alaluf, Y.; Atzmon, Y.; Patashnik, O.; Bermano, A.H.; Chechik, G.; Cohen-Or, D. An image is worth one word: Personalizing text-to-image generation using textual inversion. arXiv 2022, arXiv:2208.01618.
- Ruiz, N.; Li, Y.; Jampani, V.; Pritch, Y.; Rubinstein, M.; Aberman, K. DreamBooth: Fine-tuning text-to-image diffusion models for subject-driven generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 22500–22510.
- Wang, Y.; Zhou, Z.; Tan, X.; Pan, Y.; Yuan, J.; Qiu, Z.; Liu, C. Unveiling the potential of progressive training diffusion model for defect image generation and recognition in industrial processes. Neurocomputing 2024, 592, 127837.
- Jin, Y.; Peng, J.; He, Q.; Hu, T.; Wu, J.; Chen, H.; Wang, H.; Zhu, W.; Chi, M.; Liu, J.; et al. Dual-Interrelated Diffusion Model for Few-Shot Anomaly Image Generation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Denver, CO, USA, 3–7 June 2025; pp. 30420–30429.
- Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional block attention module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
- Gulrajani, I.; Ahmed, F.; Arjovsky, M.; Dumoulin, V.; Courville, A.C. Improved training of Wasserstein GANs. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 4–9 December 2017; Volume 30.
- Mao, Q.; Lee, H.-Y.; Tseng, H.-Y.; Ma, S.; Yang, M.-H. Mode seeking generative adversarial networks for diverse image synthesis. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 1429–1437.
- Bergmann, P.; Fauser, M.; Sattlegger, D.; Steger, C. MVTec AD—A comprehensive real-world dataset for unsupervised anomaly detection. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 9592–9600.
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In Proceedings of the Advances in Neural Information Processing Systems (NeurIPS), Long Beach, CA, USA, 4–9 December 2017; Volume 30.
- Bińkowski, M.; Sutherland, D.J.; Arbel, M.; Gretton, A. Demystifying MMD GANs. arXiv 2018, arXiv:1801.01401.
- Zhang, R.; Isola, P.; Efros, A.A.; Shechtman, E.; Wang, O. The unreasonable effectiveness of deep features as a perceptual metric. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Salt Lake City, UT, USA, 18–23 June 2018; pp. 586–595.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the Inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
Methods | Advantages | Limitations |
---|---|---|
Transformation-based augmentation (scaling, rotation, translation, flipping, etc.) | Simple and efficient; computationally lightweight | Cannot faithfully model realistic and complex defect structures |
Traditional defect image generation (CutPaste, Crop&Paste, etc.) | Simple and highly controllable; capable of producing localized structural perturbations | Limited realism; inadequate coverage of complex morphologies; constrained by source images |
Deep learning–based defect image generation (DFMGAN, AnomalyDiffusion, etc.) | Generalize across defect types and learn realistic, complex defect structures | Training is complex and may suffer from mode collapse or instability |
Defect Category (Hazelnut) | Count | Defect Category (Bottle) | Count | Defect Category (Metal_Nut) | Count |
---|---|---|---|---|---|
crack | 18 | broken_large | 20 | bent | 25 |
cut | 17 | broken_small | 22 | color | 22 |
hole | 18 | contamination | 21 | flip | 23 |
print | 17 | – | – | scratch | 23 |
Methods | FID ↓ |
---|---|
StyleGAN2 | 20.41 |
StyleGAN3 | 19.23 |
LStyleGAN3 | 17.05 |
Ours | 14.27 ± 0.68 |
Hazelnut | Crack | Cut | Hole | Print |
---|---|---|---|---|---|---|---|---|
Methods | KID↓ | LPIPS↑ | KID↓ | LPIPS↑ | KID↓ | LPIPS↑ | KID↓ | LPIPS↑ |
Crop&Paste [11] | – | 0.1894 | – | 0.2045 | – | 0.2108 | – | 0.2185 |
StyleGAN2 [16] | 22.51 | 0.0548 | 18.58 | 0.0734 | 30.81 | 0.0734 | 19.61 | 0.0842 |
StyleGAN3 [17] | 36.93 | 0.0891 | 60.60 | 0.0712 | 103.94 | 0.1370 | 135.15 | 0.0971 |
Defect-GAN [21] | 30.98 | 0.1905 | 32.69 | 0.1734 | 36.30 | 0.2007 | 33.35 | 0.2007 |
DFMGAN [18] | 19.73 | 0.2600 | 16.88 | 0.2073 | 20.78 | 0.2391 | 27.25 | 0.2649 |
AnomalyDiffusion [19] | 32.59 | 0.3111 | 21.19 | 0.2753 | 29.40 | 0.2846 | 31.01 | 0.3139 |
He et al. [24] | 15.83 | 0.2759 | 14.44 | 0.2263 | 20.32 | 0.2521 | 19.34 | 0.2359 |
Ours | 16.28 | 0.3261 | 12.59 | 0.2851 | 14.65 | 0.3065 | 17.01 | 0.3112 |
Bottle | Broken_Large | Broken_Small | Contamination | |||
---|---|---|---|---|---|---|
Methods | KID↓ | LPIPS↑ | KID↓ | LPIPS↑ | KID↓ | LPIPS↑ |
Defect-GAN [21] | 77.09 | 0.0593 | 59.18 | 0.0797 | 126.45 | 0.0693 |
DFMGAN [18] | 59.74 | 0.1162 | 76.38 | 0.0854 | 76.59 | 0.1661 |
AnomalyDiffusion [19] | 82.26 | 0.1898 | 75.49 | 0.1646 | 73.86 | 0.1766 |
Ours | 56.32 | 0.1652 | 55.25 | 0.1536 | 65.03 | 0.1884 |
Metal_Nut | Bent | Color | Flip | Scratch | ||||
---|---|---|---|---|---|---|---|---|
Methods | KID↓ | LPIPS↑ | KID↓ | LPIPS↑ | KID↓ | LPIPS↑ | KID↓ | LPIPS↑ |
Defect-GAN [21] | 55.94 | 0.3058 | 44.83 | 0.3138 | 148.86 | 0.2836 | 56.29 | 0.3063 |
DFMGAN [18] | 34.14 | 0.3153 | 35.72 | 0.3326 | 67.66 | 0.2919 | 38.65 | 0.3315 |
AnomalyDiffusion [19] | 46.28 | 0.2921 | 32.23 | 0.2644 | 74.51 | 0.3223 | 35.39 | 0.2927 |
Ours | 29.22 | 0.3254 | 30.52 | 0.3529 | 74.32 | 0.3027 | 33.95 | 0.3453 |
Methods | KID ↓ | LPIPS ↑ |
---|---|---|
(1) ResBlock32 | 23.75 | 0.2370 |
(2) ResBlock128 | 16.47 | 0.2621 |
(3) NoMAMatch | 20.65 | 0.2642 |
(4) NoMS | 15.60 | 0.2235 |
(5) NoRC | 17.32 | 0.2751 |
(6) Ours (ResBlock64) | 14.65 | 0.3065 |
Defect Category | Training Set: Original Count | Training Set: Generated Count | Testing Set: Original Count |
---|---|---|---|
crack | 6 | 200 | 12 |
cut | 5 | 200 | 12 |
hole | 6 | 200 | 12 |
print | 5 | 200 | 12 |
Total | 22 | 800 | 48 |
Methods | mAP@0.5/% ↑ |
---|---|
StyleGAN2 | 65.5 |
StyleGAN3 | 50.9 |
Defect-GAN | 65.2 |
DFMGAN | 76.1 |
AnomalyDiffusion | 78.2 |
Ours | 83.6 |
Jiang, D.; Tao, N.; Zhu, K.; Wang, Y.; Shao, H. ImbDef-GAN: Defect Image-Generation Method Based on Sample Imbalance. J. Imaging 2025, 11, 367. https://doi.org/10.3390/jimaging11100367