Multi-Scale Cross-Domain Augmentation of Tea Datasets via Enhanced Cycle Adversarial Networks
Abstract
1. Introduction
- (1) Complex Backgrounds: Traditional single-domain GANs struggle with open-air tea garden backgrounds. While CycleGAN shows promise in style transfer, its multi-scale feature extraction and discriminator capabilities require enhancement.
- (2) Multi-Object Preservation: Existing agricultural GANs focus on single-object (e.g., diseased leaves) style transfer. Tea shoot detection demands preserving both complex backgrounds and multi-object bounding boxes, an understudied area.
- (3) Cultivar-Specific Generalization: No systematic method exists for cross-cultivar style transfer to address cultivar-specific model limitations.
- (1) Tea CycleGAN Architecture: Incorporates SKNet modules into the generator for multi-scale feature fusion and enhances discriminator depth to improve complex scene parsing.
- (2) Hierarchical Style Transfer: Implements global–local collaborative training to preserve multi-scale features, generating high-fidelity synthetic datasets.
- (3) Cross-Day–Night and Cross-Variety Framework: Develops a systematic data augmentation pipeline validated against state-of-the-art methods to support cross-day–night detection.
- (1) The proposed Tea CycleGAN architecture, with enhanced multi-scale feature fusion (via SKNet) and a deeper discriminator, will significantly improve the generation of realistic synthetic tea shoot images under both day and night conditions, particularly in complex open-air backgrounds, compared with traditional methods;
- (2) The hierarchical style transfer strategy will effectively preserve both the global scene context and local object details (including bounding-box integrity for multi-shoot scenarios) during domain adaptation;
- (3) The systematic cross-day–night and cross-variety data augmentation pipeline will demonstrably enhance the generalization capability and detection accuracy of deep learning models for previously unseen cultivars and lighting conditions, surpassing the performance achievable with conventional augmentation methods and existing agricultural GANs.

The remainder of this paper is structured as follows: Section 2 details the Tea CycleGAN architecture and the hierarchical transfer method; Section 3 presents ablation studies and comparative analyses; Section 4 discusses the findings; Section 5 concludes the work and outlines future directions in dynamic domain adaptation.
2. Materials and Methods
2.1. Overall Research Plan
- Data collection and preprocessing: Raw images of two tea cultivars (Longjing 43 and Zhongcha 108) were collected under daytime and nighttime conditions. Tea shoot positions were annotated, and datasets were split into training/validation sets to support subsequent model training.
- Tea CycleGAN design: An improved CycleGAN architecture was developed, integrating SKConv for multi-scale feature fusion and a “restoration-paste” strategy to preserve spatial consistency. Loss functions (cycle-consistency and adversarial losses) and training strategies (learning-rate scheduling, gradient clipping) were tailored to agricultural scenarios; a schematic loss sketch is given after this list.
- Cross-domain augmentation: The trained Tea CycleGAN generated augmented images across domains: cross-cultivar (e.g., Longjing 43→Zhongcha 108 style) and cross-Day–Night (e.g., daytime→nighttime style), expanding the dataset to cover underrepresented scenarios.
- Performance validation: Synthesis quality was evaluated via FID (distribution similarity) and MMD (feature alignment). Downstream detection performance was assessed using YOLOv7, with metrics including mAP, Precision, and Recall, to verify whether augmented data enhances model generalization.
- Comparative analysis: Tea CycleGAN was compared with traditional augmentation methods (Mosaic, Mixup) and state-of-the-art GANs (CycleGAN, DCGAN) in terms of synthesis quality and detection improvement, validating its superiority.
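The loss design referenced above follows the CycleGAN formulation. As a reference point (not the authors' exact implementation), the sketch below shows a minimal PyTorch version of the one-direction generator objective with a least-squares adversarial term and an L1 cycle-consistency term; the placeholder networks and the lambda = 10 weight are assumptions.

```python
import torch
import torch.nn as nn

# Placeholders standing in for the Tea CycleGAN networks of Section 2.3;
# the real generators/discriminators are far deeper and include SKConv.
G_AB = nn.Conv2d(3, 3, 3, padding=1)   # generator: domain A -> domain B
G_BA = nn.Conv2d(3, 3, 3, padding=1)   # generator: domain B -> domain A
D_B = nn.Conv2d(3, 1, 3, padding=1)    # PatchGAN-style discriminator for domain B

mse, l1 = nn.MSELoss(), nn.L1Loss()
lambda_cyc = 10.0                       # assumed cycle-consistency weight (CycleGAN default)

def generator_loss(real_A):
    """One translation direction only; the full objective sums both directions."""
    fake_B = G_AB(real_A)                             # e.g., daytime -> nighttime style
    rec_A = G_BA(fake_B)                              # cycle back to the source domain
    pred_fake = D_B(fake_B)
    adv = mse(pred_fake, torch.ones_like(pred_fake))  # least-squares adversarial loss
    cyc = l1(rec_A, real_A)                           # reconstruction should match the input
    return adv + lambda_cyc * cyc

loss = generator_loss(torch.randn(1, 3, 256, 256))    # toy forward pass
```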
2.2. Data Collection
2.3. Model
2.3.1. CycleGAN
2.3.2. Generator Combined with SKNet Attention Mechanism
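Since the SKConv-augmented generator is only named in this outline, the following is a minimal sketch of a two-branch selective-kernel block in the spirit of SKNet (Li et al., CVPR 2019); the channel widths, reduction ratio, and branch kernels are illustrative assumptions rather than the paper's exact configuration.

```python
import torch
import torch.nn as nn

class SKConv(nn.Module):
    """Two-branch selective-kernel convolution: branches with different receptive
    fields are fused by channel-wise soft attention (multi-scale feature fusion)."""
    def __init__(self, channels, reduction=16):
        super().__init__()
        self.branch3 = nn.Conv2d(channels, channels, 3, padding=1)               # 3x3 branch
        self.branch5 = nn.Conv2d(channels, channels, 3, padding=2, dilation=2)   # ~5x5 field
        mid = max(channels // reduction, 8)
        self.squeeze = nn.Sequential(nn.Linear(channels, mid), nn.ReLU(inplace=True))
        self.attn = nn.Linear(mid, channels * 2)      # per-branch, per-channel logits

    def forward(self, x):
        u3, u5 = self.branch3(x), self.branch5(x)
        s = (u3 + u5).mean(dim=(2, 3))                # fuse branches, global average pool
        z = self.squeeze(s)
        a = self.attn(z).view(x.size(0), 2, -1).softmax(dim=1)   # softmax over branches
        a3 = a[:, 0].unsqueeze(-1).unsqueeze(-1)
        a5 = a[:, 1].unsqueeze(-1).unsqueeze(-1)
        return a3 * u3 + a5 * u5                      # attention-weighted multi-scale fusion

out = SKConv(64)(torch.randn(1, 64, 32, 32))          # example: 64-channel feature map
```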
2.3.3. Enhanced Discriminator Architecture
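The enhanced discriminator is described only as deeper than the CycleGAN baseline. The sketch below shows a PatchGAN-style discriminator with one extra block as one plausible reading of "enhanced depth"; layer widths and normalization choices are assumptions, not the published configuration.

```python
import torch
import torch.nn as nn

def block(c_in, c_out, stride=2, norm=True):
    layers = [nn.Conv2d(c_in, c_out, 4, stride=stride, padding=1)]
    if norm:
        layers.append(nn.InstanceNorm2d(c_out))
    layers.append(nn.LeakyReLU(0.2, inplace=True))
    return layers

# PatchGAN-style discriminator with one additional stride-1 block for a larger
# receptive field (an assumed form of the deepened discriminator).
deep_discriminator = nn.Sequential(
    *block(3, 64, norm=False),
    *block(64, 128),
    *block(128, 256),
    *block(256, 512),
    *block(512, 512, stride=1),                 # extra depth relative to the baseline
    nn.Conv2d(512, 1, 4, stride=1, padding=1),  # per-patch real/fake score map
)

score_map = deep_discriminator(torch.randn(1, 3, 256, 256))
```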
2.4. Multi-Scale Style Transfer-Based Data Augmentation Framework
Input: original_image (600 × 600), bounding_boxes (list of (x, y, w, h)), generator_full (trained for 600 × 600 style transfer), generator_shoot (trained for 64 × 64 shoot transfer)
Output: augmented_image (600 × 600 with style-transferred background and shoots)

# Step 1: Extract real shoots and record coordinates
real_shoots = []
shoot_coords = []
for bbox in bounding_boxes:
    x, y, w, h = bbox
    shoot = original_image[y:y + h, x:x + w]  # crop shoot from the original image
    real_shoots.append(shoot)
    shoot_coords.append((x, y, w, h))

# Step 2: Generate the style-transferred background and shoots
transferred_background = generator_full(original_image)  # global style transfer
transferred_shoots = []
for shoot in real_shoots:
    resized_shoot = resize(shoot, (64, 64))  # resize to the shoot model's input size
    transferred_shoot_64 = generator_shoot(resized_shoot)  # local shoot style transfer
    transferred_shoots.append(transferred_shoot_64)

# Step 3: Paste transferred shoots back to their original positions with edge blending
augmented_image = transferred_background.copy()
for i in range(len(transferred_shoots)):
    x, y, w, h = shoot_coords[i]
    transferred_shoot = resize(transferred_shoots[i], (h, w))  # back to bounding-box size
    mask = create_gaussian_mask((h, w), kernel_size=3, sigma=0.5)  # Gaussian mask for edge blending
    augmented_image[y:y + h, x:x + w] = (transferred_shoot * mask) + \
        (augmented_image[y:y + h, x:x + w] * (1 - mask))  # blend shoot with background

return augmented_image
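The listing above calls a create_gaussian_mask helper that is not defined in this outline. A minimal sketch of one plausible implementation follows; the border width and the SciPy-based smoothing are assumptions, not the authors' exact routine.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def create_gaussian_mask(shape, kernel_size=3, sigma=0.5):
    """Blending mask that is ~1 inside the shoot region and decays to 0 at the
    bounding-box border, so the pasted shoot blends into the background."""
    h, w = shape
    mask = np.ones((h, w), dtype=np.float32)
    border = max(kernel_size // 2, 1)
    mask[:border, :] = mask[-border:, :] = 0.0    # zero a thin frame ...
    mask[:, :border] = mask[:, -border:] = 0.0
    mask = gaussian_filter(mask, sigma=sigma)     # ... then smooth the transition
    return mask[..., None]                        # channel axis for RGB broadcasting
```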
2.5. Classical Object Detection Network
2.6. Experimental Setup
2.6.1. Training Environment
2.6.2. Training Parameters
2.6.3. Evaluation Metrics
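FID and MMD are the synthesis-quality metrics reported in Section 3.1. For reference, a minimal NumPy/SciPy sketch of both, computed from pre-extracted Inception-v3 features, is shown below; the feature extractor, kernel choice, and bandwidth are assumptions and may differ from the paper's implementation.

```python
import numpy as np
from scipy.linalg import sqrtm

def fid(feats_real, feats_fake):
    """Frechet Inception Distance between two (N x D) feature matrices."""
    mu_r, mu_f = feats_real.mean(0), feats_fake.mean(0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    covmean = sqrtm(cov_r @ cov_f)
    if np.iscomplexobj(covmean):              # numerical noise can leave tiny imaginary parts
        covmean = covmean.real
    return float(np.sum((mu_r - mu_f) ** 2) + np.trace(cov_r + cov_f - 2 * covmean))

def mmd_rbf(x, y, sigma=10.0):
    """Biased estimate of squared MMD with an RBF kernel (bandwidth assumed)."""
    def k(a, b):
        d = np.sum(a**2, 1)[:, None] + np.sum(b**2, 1)[None, :] - 2 * a @ b.T
        return np.exp(-d / (2 * sigma**2))
    return float(k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean())
```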
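Detection results in Section 3 are reported as mAP, Precision, and Recall. The sketch below shows a single-class AP computation from ranked detections; the IoU-matching step (which produces the true-positive flags) and the 101-point interpolation are assumptions about the evaluation protocol.

```python
import numpy as np

def precision_recall_ap(scores, is_tp, num_gt):
    """Precision, Recall, and Average Precision for one class, given per-detection
    confidence scores, TP/FP flags from IoU matching, and the ground-truth count."""
    order = np.argsort(-np.asarray(scores, dtype=float))   # rank detections by confidence
    tp = np.asarray(is_tp, dtype=float)[order]
    cum_tp, cum_fp = np.cumsum(tp), np.cumsum(1.0 - tp)
    recall = cum_tp / max(num_gt, 1)
    precision = cum_tp / np.maximum(cum_tp + cum_fp, 1e-9)
    # 101-point interpolated AP over the precision-recall curve
    ap = np.mean([precision[recall >= r].max() if np.any(recall >= r) else 0.0
                  for r in np.linspace(0, 1, 101)])
    # Precision/Recall here are taken at the last operating point (all detections kept);
    # in practice they are usually reported at a chosen confidence threshold.
    return float(precision[-1]), float(recall[-1]), float(ap)
```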
3. Results and Discussion
3.1. Comparative Evaluation of GANs
3.2. Influence of Shoot Placement Methods on Virtual Dataset Effectiveness
3.3. Comparative Testing of Data Augmentation Methods
3.4. t-SNE Visual Embedding Analysis for Domain Alignment
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
References
- Chan, S.P.; Yong, P.Z.; Sun, Y.; Mahendran, R.; Wong, J.C.M.; Qiu, C.; Ng, T.P.; Kua, E.H.; Feng, L. Associations of long-term tea consumption with depressive and anxiety symptoms in community-living elderly: Findings from the Diet and Healthy Aging Study. J. Prev. Alzheimer’s Dis. 2018, 5, 21–25.
- Li, L.; Wen, M.; Hu, W.; Huang, X.; Li, W.; Han, Z.; Zhang, L. Non-volatile metabolite and in vitro bioactivity differences in green, white, and black teas. Food Chem. 2025, 477, 143580.
- Wang, Y.; Li, L.; Liu, Y.; Cui, Q.; Ning, J.; Zhang, Z. Enhanced quality monitoring during black tea processing by the fusion of NIRS and computer vision. J. Food Eng. 2021, 304, 110599.
- Lu, J.; Luo, H.; Yu, C.; Liang, X.; Huang, J.; Wu, H.; Wang, L.; Yang, C. Tea bud DG: A lightweight tea bud detection model based on dynamic detection head and adaptive loss function. Comput. Electron. Agric. 2024, 227, 109522.
- Chen, C.; Lu, J.; Zhou, M.; Yi, J.; Liao, M.; Gao, Z. A YOLOv3-based computer vision system for identification of tea buds and the picking point. Comput. Electron. Agric. 2022, 198, 107116.
- Zhang, L.; Zou, L.; Wu, C.; Jia, J.; Chen, J. Method of famous tea sprout identification and segmentation based on improved watershed algorithm. Comput. Electron. Agric. 2021, 184, 106108.
- Yu, T.J.; Chen, J.N.; Chen, Z.W.; Li, Y.T.; Tong, J.H.; Du, X.Q. DMT: A model detecting multispecies of tea buds in multi-seasons. Int. J. Agric. Biol. Eng. 2024, 17, 199–208.
- Li, Y.; He, L.; Jia, J.; Lv, J.; Chen, J.; Qiao, X.; Wu, C. In-field tea shoot detection and 3D localization using an RGB-D camera. Comput. Electron. Agric. 2021, 185, 106149.
- Wang, X.; Wu, Z.; Fang, C. TeaPoseNet: A deep neural network for tea leaf pose recognition. Comput. Electron. Agric. 2024, 225, 109278.
- Xu, W.; Zhao, L.; Li, J.; Shang, S.; Ding, X.; Wang, T. Detection and classification of tea buds based on deep learning. Comput. Electron. Agric. 2022, 192, 106547.
- Nishad, P.; Chezian, R. Various colour spaces and colour space conversion algorithms. J. Glob. Res. Comput. Sci. 2013, 4, 44–48.
- Bochkovskiy, A.; Wang, C.-Y.; Liao, H.-Y.M. YOLOv4: Optimal Speed and Accuracy of Object Detection. arXiv 2020, arXiv:2004.10934.
- Zhang, H.; Cisse, M.; Dauphin, Y.N.; Lopez-Paz, D. mixup: Beyond Empirical Risk Minimization. arXiv 2017, arXiv:1710.09412.
- Yun, S.; Han, D.; Oh, S.J.; Chun, S.; Choe, J.; Yoo, Y. CutMix: Regularization Strategy to Train Strong Classifiers with Localizable Features. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 6023–6032.
- Gui, Z.; Chen, J.; Li, Y.; Chen, Z.; Wu, C.; Dong, C. A lightweight tea bud detection model based on Yolov5. Comput. Electron. Agric. 2023, 205, 107636.
- Wu, Y.; Chen, J.; Wu, S.; Li, H.; He, L.; Zhao, R.; Wu, C. An improved YOLOv7 network using RGB-D multi-modal feature fusion for tea shoots detection. Comput. Electron. Agric. 2024, 216, 108541.
- Krichen, M. Generative Adversarial Networks. In Proceedings of the 2023 14th International Conference on Computing Communication and Networking Technologies (ICCCNT), Delhi, India, 6–8 July 2023; pp. 1–7.
- Xiao, D.; Zeng, R.; Liu, Y.; Huang, Y.; Liu, J.; Feng, J.; Zhang, X. Citrus greening disease recognition algorithm based on classification network using TRL-GAN. Comput. Electron. Agric. 2022, 200, 107206.
- Zhu, J.-Y.; Park, T.; Isola, P.; Efros, A.A. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Venice, Italy, 22–29 October 2017; pp. 2242–2251.
- Espejo-Garcia, B.; Mylonas, N.; Athanasakos, L.; Vali, E.; Fountas, S. Combining generative adversarial networks and agricultural transfer learning for weeds identification. Biosyst. Eng. 2021, 204, 79–89.
- Yang, X.; Guo, M.; Lyu, Q.; Ma, M. Detection and classification of damaged wheat kernels based on progressive neural architecture search. Biosyst. Eng. 2021, 208, 176–185.
- Cang, H.; Yan, T.; Duan, L.; Yan, J.; Zhang, Y.; Tan, F.; Lv, X.; Gao, P. Jujube quality grading using a generative adversarial network with an imbalanced data set. Biosyst. Eng. 2023, 236, 224–237.
- Cap, Q.H.; Uga, H.; Kagiwada, S.; Iyatomi, H. LeafGAN: An Effective Data Augmentation Method for Practical Plant Disease Diagnosis. IEEE Trans. Autom. Sci. Eng. 2022, 19, 1258–1267.
- Li, X.; Li, X.; Zhang, M.; Dong, Q.; Zhang, G.; Wang, Z.; Wei, P. SugarcaneGAN: A novel dataset generating approach for sugarcane leaf diseases based on lightweight hybrid CNN-Transformer network. Comput. Electron. Agric. 2024, 219, 108762.
- Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical Vision Transformer using Shifted Windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 10–17 October 2021; pp. 10012–10022.
- Zeng, S.; Zhang, H.; Chen, Y.; Sheng, Z.; Kang, Z.; Li, H. Swgan: A new algorithm of adhesive rice image segmentation based on improved generative adversarial networks. Comput. Electron. Agric. 2023, 213, 108226.
- Madsen, S.L.; Dyrmann, M.; Jørgensen, R.N.; Karstoft, H. Generating artificial images of plant seedlings using generative adversarial networks. Biosyst. Eng. 2019, 187, 147–159.
- Egusquiza, I.; Benito-Del-Valle, L.; Picón, A.; Bereciartua-Pérez, A.; Gómez-Zamanillo, L.; Elola, A.; Aramendi, E.; Espejo, R.; Eggers, T.; Klukas, C.; et al. When synthetic plants get sick: Disease graded image datasets by novel regression-conditional diffusion models. Comput. Electron. Agric. 2025, 229, 109690.
- Raya-González, L.E.; Alcántar-Camarena, V.A.; Saldaña-Robles, A.; Duque-Vazquez, E.F.; Tapia-Tinoco, G.; Saldaña-Robles, N. High-precision prototype for garlic apex reorientation based on artificial intelligence models. Comput. Electron. Agric. 2025, 235, 110375.
- Abbas, A.; Jain, S.; Gour, M.; Vankudothu, S. Tomato plant disease detection using transfer learning with C-GAN synthetic images. Comput. Electron. Agric. 2021, 187, 106279.
- Lacerda, C.F.; Ampatzidis, Y.; Costa Neto, A.d.O.; Partel, V. Cost-efficient high-resolution monitoring for specialty crops using AgI-GAN and AI-driven analytics. Comput. Electron. Agric. 2025, 237 Pt B, 110678.
- Krestenitis, M.; Ioannidis, K.; Vrochidis, S.; Kompatsiaris, I. Visual to near-infrared image translation for precision agriculture operations using GANs and aerial images. Comput. Electron. Agric. 2025, 237 Pt C, 110720.
- Afzal Maken, F.; Muthu, S.; Nguyen, C.; Sun, C.; Tong, J.; Wang, S.; Tsuchida, R.; Howard, D.; Dunstall, S.; Petersson, L. Improving 3D Reconstruction Through RGB-D Sensor Noise Modeling. Sensors 2025, 25, 950.
- Li, X.; Wang, W.; Hu, X.; Yang, J. Selective Kernel Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 16–20 June 2019; pp. 510–519.
- Wang, C.-Y.; Bochkovskiy, A.; Liao, H.-Y.M. YOLOv7: Trainable Bag-of-Freebies Sets New State-of-the-Art for Real-Time Object Detectors. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 18–22 June 2023; pp. 7464–7475.
- Heusel, M.; Ramsauer, H.; Unterthiner, T.; Nessler, B.; Hochreiter, S. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. arXiv 2017, arXiv:1706.08500.
- Radford, A.; Metz, L.; Chintala, S. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks. In Proceedings of the International Conference on Learning Representations (ICLR), San Juan, Puerto Rico, 2–4 May 2016.
- Arjovsky, M.; Chintala, S.; Bottou, L. Wasserstein GAN. arXiv 2017, arXiv:1701.07875.
- Rana, S.; Gatti, M. Comparative evaluation of modified Wasserstein GAN-GP and state-of-the-art GAN models for synthesizing agricultural weed images in RGB and infrared domain. MethodsX 2025, 14, 103309.
Cultivar | Total Quantity | Detection Network Training Dataset | Detection Network Test Dataset | Generative Adversarial Network Training Dataset | Generative Adversarial Network Test Dataset |
---|---|---|---|---|---|
LJ43 | 1000 | 500 | 100 | 300 | 100 |
ZC108 | 1000 | 500 | 100 | 300 | 100 |
Model | FID (Mean ± SD) | 95% CI for FID | MMD (Mean ± SD) | 95% CI for MMD |
---|---|---|---|---|
Tea CycleGAN | 42.26 ± 1.43 | [41.08, 43.44] | 0.02241 ± 0.0015 | [0.0212, 0.0236] |
CycleGAN + SKConv | 47.32 ± 1.69 | [45.98, 48.66] | 0.03263 ± 0.0021 | [0.0309, 0.0343] |
CycleGAN + Improved discriminator | 53.57 ± 2.01 | [51.95, 55.19] | 0.05942 ± 0.0028 | [0.0572, 0.0616] |
CycleGAN | 75.38 ± 2.56 | [73.35, 77.41] | 0.06137 ± 0.0032 | [0.0588, 0.0639] |
DCGAN | 131.98 ± 3.32 | [129.35, 134.61] | 0.07086 ± 0.0035 | [0.0680, 0.0737] |
WGAN | 254.76 ± 4.53 | [251.01, 258.51] | 0.10245 ± 0.0048 | [0.0985, 0.1064] |
Model | FID (Mean ± SD) | 95% CI for FID | MMD (Mean ± SD) | 95% CI for MMD |
---|---|---|---|---|
Tea CycleGAN | 26.75 ± 1.03 | [25.95, 27.55] | 0.02452 ± 0.0011 | [0.0237, 0.0253] |
CycleGAN + SKConv | 32.79 ± 1.26 | [31.82, 33.76] | 0.02529 ± 0.0012 | [0.0244, 0.0262] |
CycleGAN + Improved discriminator | 44.23 ± 1.61 | [42.98, 45.48] | 0.02538 ± 0.0012 | [0.0245, 0.0262] |
CycleGAN | 57.51 ± 1.94 | [55.98, 59.04] | 0.02845 ± 0.0014 | [0.0274, 0.0295] |
DCGAN | 67.32 ± 2.21 | [65.59, 69.05] | 0.04767 ± 0.0024 | [0.0458, 0.0495] |
WGAN | 82.25 ± 2.47 | [80.32, 84.18] | 0.04851 ± 0.0026 | [0.0465, 0.0505] |
[Figure table (first of two): generated-image comparison across Real Zhongcha 108, Real Longjing 43, Tea CycleGAN, CycleGAN + SKConv, CycleGAN + Improved discriminator, CycleGAN, DCGAN, and WGAN; images not reproduced here.]
[Figure table (second of two): generated-image comparison across Real Zhongcha 108, Real Longjing 43, Tea CycleGAN, CycleGAN + SKConv, CycleGAN + Improved discriminator, CycleGAN, DCGAN, and WGAN; images not reproduced here.]
Placement Method | Restoration-Paste | Probability-Based | Random
---|---|---|---
mAP (%) | 83.54 | 69.19 | 61.48
Precision (%) | 86.03 | 66.69 | 51.59
Recall (%) | 55.18 | 43.16 | 40.49
[Figure table: generated images for each shoot placement method (Restoration-Paste, Probability-Based, Random); images not reproduced here.]
Data Augmentation Method | mAP (%) | Precision (%) | Recall (%)
---|---|---|---
Ours + Real data | 83.54 | 86.03 | 55.18
Mosaic + Real data | 80.13 | 83.52 | 50.92
Mixup + Real data | 78.93 | 81.49 | 48.74
None + Real data | 75.21 | 79.12 | 46.20
Original data | 73.94 | 75.25 | 45.25
[Figure table: generated images for each data augmentation method (Ours, Mosaic, Mixup, None, Original data); images not reproduced here.]
Domain Pair | Mean Sigma | KL Divergence
---|---|---
Real LJ43 to Real ZC108 | 3.406924 | 0.621378
Fake LJ43 to Fake ZC108 | 2.837938 | 0.532295
Real LJ43 to Real LJ43 | 2.291254 | 0.410419
Real LJ43 to Fake LJ43 | 2.755302 | 0.490285
Real ZC108 to Real ZC108 | 2.412863 | 0.345618
Real ZC108 to Fake ZC108 | 2.613834 | 0.389483
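The table above summarizes domain alignment in the t-SNE embedding space (Section 3.4). The exact procedure is not reproduced in this outline; the sketch below is one illustrative way to obtain such statistics, assuming a joint t-SNE embedding, per-domain Gaussian fits, a closed-form Gaussian KL divergence, and "Mean Sigma" taken as the mean embedding standard deviation.

```python
import numpy as np
from sklearn.manifold import TSNE

def domain_gap(feats_a, feats_b, seed=0):
    """Embed two feature sets jointly with t-SNE, fit a 2-D Gaussian to each
    domain, and report (mean embedding std, KL divergence between the Gaussians)."""
    emb = TSNE(n_components=2, random_state=seed).fit_transform(
        np.vstack([feats_a, feats_b]))
    ea, eb = emb[:len(feats_a)], emb[len(feats_a):]
    mu_a, mu_b = ea.mean(0), eb.mean(0)
    ca, cb = np.cov(ea, rowvar=False), np.cov(eb, rowvar=False)
    diff = mu_b - mu_a
    kl = 0.5 * (np.trace(np.linalg.inv(cb) @ ca)
                + diff @ np.linalg.inv(cb) @ diff
                - 2.0 + np.log(np.linalg.det(cb) / np.linalg.det(ca)))
    return float(emb.std(0).mean()), float(kl)
```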