Few-Shot Data Augmentation by Morphology-Constrained Latent Diffusion for Enhanced Nematode Recognition
Abstract
1. Introduction
- We propose a morphology-constrained, fine-tuned LDM framework that enables stable and controlled generation of nematode images under few-shot conditions.
- We design a novel method for generating nematode morphological constraints, introducing a mechanism that maps nematode images to their corresponding morphological constraints.
- We establish a multi-model evaluation system to assess the impact of the generated data on classifiers such as ResNet-50 and DenseNet-121 across different datasets. Our experiments show significant accuracy improvements over traditional augmentation methods.
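The evaluation system reports top-1 accuracy, precision, and recall for each classifier (see the tables in Section 4). A minimal sketch of such metrics is below; the support-weighted averaging convention (under which weighted recall equals top-1 accuracy, matching the result tables) is an assumption, not a detail taken from the paper.

```python
from collections import Counter

def evaluate(y_true, y_pred):
    """Top-1 accuracy with support-weighted precision and recall.

    Sketch only: assumes single-label multi-class predictions and
    weights each per-class score by that class's share of true labels.
    """
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    support = Counter(y_true)  # number of true instances per class
    precision = recall = 0.0
    for cls, sup in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == p == cls)
        predicted = sum(1 for p in y_pred if p == cls)
        weight = sup / n
        precision += weight * (tp / predicted if predicted else 0.0)
        recall += weight * (tp / sup)
    return accuracy, precision, recall
```

With this weighting, recall always coincides with accuracy, which is consistent with the identical "Top-1 acc" and "Recall" columns reported in Section 4.2.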
2. Related Work
2.1. Data Augmentation
2.1.1. Traditional Data Augmentation Methods
2.1.2. Diffusion Model-Based Data Augmentation
2.1.3. Applications of Data Augmentation in Agriculture and Forestry
2.2. Fine-Tuning Diffusion Models
3. Materials and Methods
3.1. Nematode Dataset
3.2. Morphological Constraint Generation
3.2.1. Binary Mask Generation
3.2.2. Morphological Diversity Enhancement
3.3. Diffusion Model
3.3.1. Fine-Tuning LDM
3.3.2. Morphological Control
4. Results
4.1. Experimental Setup and Evaluation Methods
4.1.1. Fine-Tuning Configuration of LDM
4.1.2. Configuration of Image Generation Parameters
4.1.3. Configuration of Image Classification Experiments
4.1.4. Evaluation Metrics
4.2. Data Augmentation Under Class-Imbalanced Conditions
4.3. Impact of Generative Image Proportions on Classification Performance
4.4. Ablation Study
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Dataset | Class | Number of Images | Resolution (Pixels) | DPI
---|---|---|---|---
UNBnema5 | Meloidogyne | 600 | 1388 × 1040 | 150
UNBnema5 | Pratylenchus | 280 | 1388 × 1040 | 150
UNBnema5 | Xiphinema | 270 | 1388 × 1040 | 150
UNBnema5 | Trichodorus | 200 | 1388 × 1040 | 150
UNBnema5 | Longidorus | 200 | 1388 × 1040 | 150
Bnema5 | Meloidogyne | 200 | 1388 × 1040 | 150
Bnema5 | Pratylenchus | 200 | 1388 × 1040 | 150
Bnema5 | Xiphinema | 200 | 1388 × 1040 | 150
Bnema5 | Trichodorus | 200 | 1388 × 1040 | 150
Bnema5 | Longidorus | 200 | 1388 × 1040 | 150
Training Data | Model | Top-1 Acc (%) | Precision (%) | Recall (%)
---|---|---|---|---
Original | ResNet-50 | 54.67 | 57.37 | 54.67
Original | DenseNet-121 | 52.00 | 54.34 | 52.00
Original | EfficientNet-V2-m | 54.00 | 57.23 | 54.00
Original | ResNeSt-50 | 58.00 | 61.73 | 58.00
Original | RepVGG-B1g2 | 56.33 | 56.65 | 56.33
RandAugment | ResNet-50 | 67.33 | 67.03 | 67.33
RandAugment | DenseNet-121 | 59.33 | 62.52 | 59.33
RandAugment | EfficientNet-V2-m | 64.00 | 67.59 | 64.00
RandAugment | ResNeSt-50 | 60.33 | 60.30 | 60.33
RandAugment | RepVGG-B1g2 | 59.67 | 60.00 | 59.67
Ours | ResNet-50 | 69.33 | 72.81 | 69.33
Ours | DenseNet-121 | 65.00 | 67.22 | 65.00
Ours | EfficientNet-V2-m | 66.67 | 72.13 | 66.67
Ours | ResNeSt-50 | 66.33 | 69.29 | 66.33
Ours | RepVGG-B1g2 | 63.67 | 66.73 | 63.67
Model | 0% Generated | 25% Generated | 50% Generated | 75% Generated | 100% Generated
---|---|---|---|---|---
ResNet-50 | 67.00 | 74.00 | 67.00 | 59.00 | 41.00
DenseNet-121 | 60.00 | 59.00 | 60.00 | 51.00 | 41.00
EfficientNet-V2-m | 68.00 | 68.00 | 69.00 | 56.00 | 42.00
ResNeSt-50 | 74.00 | 69.00 | 69.00 | 57.00 | 43.00
RepVGG-B1g2 | 65.00 | 63.00 | 63.00 | 47.00 | 38.00
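The table above varies the fraction of generated images in a fixed-size training set. A hedged sketch of that mixing protocol follows; the helper name `mix_training_set` and the fixed-size sampling scheme are illustrative assumptions, not the paper's exact procedure.

```python
import random

def mix_training_set(real, synthetic, gen_fraction, size, seed=0):
    """Compose a training set of `size` samples in which a fraction
    `gen_fraction` comes from generated images and the rest from real
    images (illustrative sketch; sampling is without replacement)."""
    rng = random.Random(seed)
    n_gen = round(size * gen_fraction)
    n_real = size - n_gen
    return rng.sample(real, n_real) + rng.sample(synthetic, n_gen)
```

For example, `gen_fraction=0.25` with `size=40` yields 30 real and 10 generated samples, mirroring the 25% column in the table.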
Constraints | FID | IS |
---|---|---|
LDM | 25.37 | 1.24 ± 0.036 |
Canny | 20.74 | 1.22 ± 0.082 |
Ours | 12.95 | 1.21 ± 0.057 |
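FID in the ablation table above is the Fréchet distance between Gaussians fitted to real and generated feature statistics: ||μ₁ − μ₂||² + Tr(Σ₁ + Σ₂ − 2(Σ₁Σ₂)^½). The sketch below computes this under a simplifying diagonal-covariance assumption (which is not how the standard metric is computed; the full FID uses complete covariance matrices of Inception features):

```python
import math

def fid_diagonal(mu1, var1, mu2, var2):
    """Fréchet distance between two Gaussians assuming diagonal
    covariances: ||mu1 - mu2||^2 + sum_i (sqrt(v1_i) - sqrt(v2_i))^2.
    Lower is better; identical statistics give 0."""
    mean_term = sum((a - b) ** 2 for a, b in zip(mu1, mu2))
    cov_term = sum((math.sqrt(a) - math.sqrt(b)) ** 2
                   for a, b in zip(var1, var2))
    return mean_term + cov_term
```

Under this formula, the drop from 25.37 (unconstrained LDM) to 12.95 (ours) reflects generated feature statistics moving closer to the real distribution.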
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Ouyang, X.; Zhuang, J.; Gu, J.; Ye, S. Few-Shot Data Augmentation by Morphology-Constrained Latent Diffusion for Enhanced Nematode Recognition. Computers 2025, 14, 198. https://doi.org/10.3390/computers14050198