DCGAN-Based Image Data Augmentation in Rawhide Stick Products’ Defect Detection
Abstract
1. Introduction
2. Literature Review
2.1. GANs and Their Derivative Models
2.2. Residual Network
2.3. Attention Mechanism (AM)
2.4. Discussion
3. Residual Block and Hybrid Attention Mechanism-Based DCGAN
3.1. Theoretical Basis
3.2. ResB–HAM–DCGAN Model Framework
3.2.1. ResB
3.2.2. Hybrid Attention Mechanism
3.2.3. Generators and Discriminators
3.2.4. Improved Loss Function
3.3. ResB–HAM–DCGAN Model Training Process
4. Experimental Results and Analysis
4.1. Comparative Experiment
4.1.1. Analysis of Images Generated at Different Training Stages of ResB–HAM–DCGAN
4.1.2. Comparison Experiment of Images Generated by Different Machine Learning Models
4.2. Ablation Study of ResB–HAM–DCGAN Model
4.3. Effectiveness Assessment of Augmented Image Dataset
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References

Generator network structure of ResB–HAM–DCGAN.

No. | Input | Operation | Output |
---|---|---|---|
1 | 1 × 1 × 100 | TranConv2d (100, 512, k = 4 × 4, s = 1) + SN + BN + ReLU | 4 × 4 × 512 |
2 | 4 × 4 × 512 | ResB_G(512) | 8 × 8 × 256 |
3 | 8 × 8 × 256 | ResB_G(256) | 16 × 16 × 128 |
4 | 16 × 16 × 128 | ResB_G(128) | 32 × 32 × 64 |
5 | 32 × 32 × 64 | ResB_G(64) | 64 × 64 × 32 |
6 | 64 × 64 × 32 | ResB_G(32) | 128 × 128 × 16 |
7 | 128 × 128 × 16 | MA-block (16, b = 1, gamma = 2, k = 3) | 128 × 128 × 16 |
8 | 128 × 128 × 16 | ResB_G(16) | 256 × 256 × 8 |
9 | 256 × 256 × 8 | MA-block (8, b = 1, gamma = 2, k = 3) | 256 × 256 × 8 |
10 | 256 × 256 × 8 | TranConv2d (8, 1, k = 4 × 4, s = 2, p = 1) + Tanh | 512 × 512 × 1 |
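
The spatial sizes in this table can be checked with the standard transposed-convolution size formula, out = (in − 1)·s − 2p + k. The sketch below is only a shape trace, not the authors' implementation. `tconv_out` and `generator_shapes` are hypothetical helper names, padding 0 is assumed for row 1 (the table omits `p` there), and each ResB_G is treated as a black box that doubles the side length and halves the channel count, as the Input and Output columns show.

```python
def tconv_out(size: int, k: int, s: int, p: int = 0) -> int:
    """Output side of a transposed convolution (dilation 1, no output padding)."""
    return (size - 1) * s - 2 * p + k


def generator_shapes():
    """Return (height, width, channels) after each of the ten table rows."""
    # Row 1: TranConv2d(100, 512, k=4, s=1); p=0 is an assumption.
    side, channels = tconv_out(1, k=4, s=1, p=0), 512
    shapes = [(side, side, channels)]

    # Rows 2-9: six ResB_G blocks, each doubling the side and halving the
    # channels (read off the table). An MA-block follows the 5th and 6th
    # ResB_G and leaves the shape unchanged.
    for block in range(1, 7):
        side, channels = side * 2, channels // 2
        shapes.append((side, side, channels))
        if block >= 5:
            shapes.append((side, side, channels))  # MA-block row

    # Row 10: TranConv2d(k=4, s=2, p=1) down to a single output channel.
    side = tconv_out(side, k=4, s=2, p=1)
    shapes.append((side, side, 1))
    return shapes


if __name__ == "__main__":
    for row, shape in enumerate(generator_shapes(), start=1):
        print(row, shape)
```

Under these assumptions the trace reproduces the Output column row by row and ends at 512 × 512 × 1.
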

Discriminator network structure of ResB–HAM–DCGAN.

No. | Input | Operation | Output |
---|---|---|---|
1 | 512 × 512 × 1 | Conv2d (1, 8, k = 4 × 4, s = 2, p = 1) + SN + IN + LeakyReLU | 256 × 256 × 8 |
2 | 256 × 256 × 8 | ResB_D(8) | 128 × 128 × 16 |
3 | 128 × 128 × 16 | ResB_D(16) | 64 × 64 × 32 |
4 | 64 × 64 × 32 | ResB_D(32) | 32 × 32 × 64 |
5 | 32 × 32 × 64 | ResB_D(64) | 16 × 16 × 128 |
6 | 16 × 16 × 128 | ResB_D(128) | 8 × 8 × 256 |
7 | 8 × 8 × 256 | MA-block (256, b = 1, gamma = 2, k = 3) | 8 × 8 × 256 |
8 | 8 × 8 × 256 | ResB_D(256) | 4 × 4 × 512 |
9 | 4 × 4 × 512 | MA-block (512, b = 1, gamma = 2, k = 3) | 4 × 4 × 512 |
10 | 4 × 4 × 512 | Conv2d (512, 1, k = 4 × 4, s = 1) + LeakyReLU | 1 × 1 × 1 |
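
The discriminator's spatial sizes follow the standard convolution size formula, out = ⌊(in + 2p − k)/s⌋ + 1. The following is a minimal shape trace under stated assumptions, not the authors' code. `conv_out` and `discriminator_shapes` are hypothetical names, padding 0 is assumed for row 10 (the table omits `p`), and each ResB_D is treated as a black box that halves the side length and doubles the channel count, as the table shows.

```python
def conv_out(size: int, k: int, s: int, p: int = 0) -> int:
    """Output side of a strided convolution (dilation 1)."""
    return (size + 2 * p - k) // s + 1


def discriminator_shapes(side: int = 512):
    """Return (height, width, channels) after each of the ten table rows."""
    # Row 1: Conv2d(1, 8, k=4, s=2, p=1).
    side, channels = conv_out(side, k=4, s=2, p=1), 8
    shapes = [(side, side, channels)]

    # Rows 2-9: six ResB_D blocks, each halving the side and doubling the
    # channels (read off the table). An MA-block follows the 5th and 6th
    # ResB_D and leaves the shape unchanged.
    for block in range(1, 7):
        side, channels = side // 2, channels * 2
        shapes.append((side, side, channels))
        if block >= 5:
            shapes.append((side, side, channels))  # MA-block row

    # Row 10: Conv2d(512, 1, k=4, s=1); p=0 is an assumption.
    side = conv_out(side, k=4, s=1, p=0)
    shapes.append((side, side, 1))
    return shapes


if __name__ == "__main__":
    for row, shape in enumerate(discriminator_shapes(), start=1):
        print(row, shape)
```

With a 512 × 512 × 1 input the trace ends at a single 1 × 1 × 1 score per image, matching the last row of the table.
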

Training hyperparameters of the compared models.

Model | Generator Learning Rate | Discriminator Learning Rate | Penalty Coefficient |
---|---|---|---|
DCGAN and WGAN-GP | 0.0002 | 0.0002 | / |
ResB–HAM–DCGAN | 0.0001 | 0.0004 | 10 |
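
The table does not say how the penalty coefficient enters the loss. If it is used the way WGAN-GP uses its gradient-penalty weight (an assumption, not something stated here), the extra discriminator term would be the batch mean of λ(‖∇D(x̂)‖₂ − 1)² with λ = 10. The sketch below evaluates only that term from precomputed gradient norms. `gradient_penalty` is a hypothetical helper, and obtaining the norms themselves would need an autograd framework.

```python
def gradient_penalty(grad_norms, coeff=10.0):
    """Batch mean of coeff * (norm - 1)^2, the WGAN-GP style penalty term.

    grad_norms: L2 norms of the discriminator's gradient at interpolated
                samples, one value per sample (computed elsewhere).
    coeff:      penalty coefficient; 10 is the value listed in the table.
    """
    if not grad_norms:
        raise ValueError("need at least one gradient norm")
    return coeff * sum((n - 1.0) ** 2 for n in grad_norms) / len(grad_norms)


if __name__ == "__main__":
    # Gradients of norm 1 are not penalised.
    print(gradient_penalty([1.0, 1.0]))  # 0.0
    # Deviations in either direction are penalised equally.
    print(gradient_penalty([0.5, 1.5]))  # 2.5
```

The term is zero when every gradient norm equals 1 and grows quadratically with the deviation, scaled by the coefficient.
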

Image quality metrics for the compared models.

Model | IS ↑ | FID ↓ | SSIM ↑ |
---|---|---|---|
DCGAN | 6.12 | 105.61 | 0.72 |
WGAN-GP | 7.53 | 100.12 | 0.79 |
ResB–HAM–DCGAN | 10.41 | 91.51 | 0.83 |
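
To put the three metrics on a common scale, the table values can be turned into relative improvements over a baseline. This is a small worked calculation on the numbers above. `relative_gain` is a hypothetical helper, and the sign convention (positive means better, so a drop in FID counts as a gain) is a choice made for this sketch.

```python
# Values copied from the table above.
RESULTS = {
    "DCGAN": {"IS": 6.12, "FID": 105.61, "SSIM": 0.72},
    "WGAN-GP": {"IS": 7.53, "FID": 100.12, "SSIM": 0.79},
    "ResB–HAM–DCGAN": {"IS": 10.41, "FID": 91.51, "SSIM": 0.83},
}
PROPOSED = "ResB–HAM–DCGAN"

# Arrows in the table header: IS and SSIM should rise, FID should fall.
HIGHER_IS_BETTER = {"IS": True, "FID": False, "SSIM": True}


def relative_gain(baseline: str, model: str, metric: str) -> float:
    """Fractional improvement of `model` over `baseline`; positive is better."""
    base = RESULTS[baseline][metric]
    value = RESULTS[model][metric]
    change = (value - base) / base
    return change if HIGHER_IS_BETTER[metric] else -change


if __name__ == "__main__":
    for baseline in ("DCGAN", "WGAN-GP"):
        for metric in ("IS", "FID", "SSIM"):
            gain = relative_gain(baseline, PROPOSED, metric)
            print(f"{metric} vs {baseline}: {gain:+.1%}")
```

Against DCGAN this gives roughly 70% higher IS, 13% lower FID, and 15% higher SSIM. Against WGAN-GP the gains are smaller, at roughly 38%, 9%, and 5%.
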

Ablation results for the ResB–HAM–DCGAN model.

Model | IS ↑ | FID ↓ | SSIM ↑ |
---|---|---|---|
DCGAN | 6.12 | 105.61 | 0.72 |
DCGAN+Res | 6.93 | 100.25 | 0.77 |
DCGAN+Res+MA | 9.05 | 95.88 | 0.80 |
ResB–HAM–DCGAN | 10.41 | 91.51 | 0.83 |
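
An ablation table is easiest to read as row-to-row differences, since each row adds something to the one before it. The sketch below computes those stepwise changes from the values above. `stepwise_deltas` is a hypothetical helper, and rounding to two decimals matches the precision of the table.

```python
# (model, IS, FID, SSIM) copied from the table above, in row order.
ABLATION = [
    ("DCGAN", 6.12, 105.61, 0.72),
    ("DCGAN+Res", 6.93, 100.25, 0.77),
    ("DCGAN+Res+MA", 9.05, 95.88, 0.80),
    ("ResB–HAM–DCGAN", 10.41, 91.51, 0.83),
]


def stepwise_deltas(rows):
    """Change in each metric from one row to the next, rounded to 2 decimals."""
    deltas = []
    for previous, current in zip(rows, rows[1:]):
        changes = tuple(
            round(cur - prev, 2) for prev, cur in zip(previous[1:], current[1:])
        )
        deltas.append((current[0],) + changes)
    return deltas


if __name__ == "__main__":
    print("model, dIS, dFID, dSSIM")
    for row in stepwise_deltas(ABLATION):
        print(row)
```

On these numbers every step raises IS and SSIM and lowers FID. The largest single IS increase (2.12) comes at the DCGAN+Res+MA row, and the largest FID drop (5.36) at the DCGAN+Res row.
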

CNN structure (512 × 512 × 1 input, two outputs).

No. | Input | Operation | Output |
---|---|---|---|
1 | 512 × 512 × 1 | Conv2d (1, 16, k = 4 × 4, s = 2, p = 1) + BN + ReLU | 256 × 256 × 16 |
2 | 256 × 256 × 16 | MaxPool2d (k = 2 × 2, s = 2) | 128 × 128 × 16 |
3 | 128 × 128 × 16 | Conv2d (16, 32, k = 4 × 4, s = 2, p = 1) + BN + ReLU | 64 × 64 × 32 |
4 | 64 × 64 × 32 | MaxPool2d (k = 2 × 2, s = 2) | 32 × 32 × 32 |
5 | 32 × 32 × 32 | Conv2d (32, 64, k = 4 × 4, s = 2, p = 1) + BN + ReLU | 16 × 16 × 64 |
6 | 16 × 16 × 64 | MaxPool2d (k = 2 × 2, s = 2) | 8 × 8 × 64 |
7 | 8 × 8 × 64 | Conv2d (64, 128, k = 4 × 4, s = 2, p = 1) + BN + ReLU | 4 × 4 × 128 |
8 | 4 × 4 × 128 | MaxPool2d (k = 2 × 2, s = 2) | 2 × 2 × 128 |
9 | 2 × 2 × 128 | Linear (2 × 2 × 128, 2) | 1 × 1 |
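
The input size of the final Linear layer depends on every convolution and pooling step before it, so it is worth tracing. The sketch below walks the first eight rows with the usual size formula ⌊(in + 2p − k)/s⌋ + 1, which applies to both Conv2d and MaxPool2d. `LAYERS` and `trace_shapes` are hypothetical names, and padding 0 is assumed for the pooling layers because the table lists none.

```python
# (kind, out_channels, kernel, stride, padding) for rows 1-8 of the table.
# Pooling layers keep the channel count; their padding of 0 is an assumption.
LAYERS = [
    ("conv", 16, 4, 2, 1),
    ("pool", None, 2, 2, 0),
    ("conv", 32, 4, 2, 1),
    ("pool", None, 2, 2, 0),
    ("conv", 64, 4, 2, 1),
    ("pool", None, 2, 2, 0),
    ("conv", 128, 4, 2, 1),
    ("pool", None, 2, 2, 0),
]


def trace_shapes(side, channels, layers):
    """Return (height, width, channels) after each layer for a square input."""
    shapes = []
    for kind, out_channels, k, s, p in layers:
        side = (side + 2 * p - k) // s + 1
        if kind == "conv":
            channels = out_channels
        shapes.append((side, side, channels))
    return shapes


shapes = trace_shapes(512, 1, LAYERS)
height, width, depth = shapes[-1]
flat_features = height * width * depth  # input size of the Linear layer

if __name__ == "__main__":
    for row, shape in enumerate(shapes, start=1):
        print(row, shape)
    print("flattened features:", flat_features)
```

The trace ends at 2 × 2 × 128, that is 512 flattened features, which agrees with the `Linear (2 × 2 × 128, 2)` row.
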
© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ding, S.; Guo, Z.; Chen, X.; Li, X.; Ma, F. DCGAN-Based Image Data Augmentation in Rawhide Stick Products’ Defect Detection. Electronics 2024, 13, 2047. https://doi.org/10.3390/electronics13112047