Unseen-Crop Plant Disease Classification via Disentangled Representation Learning
Abstract
1. Introduction
- The cross-crop plant disease recognition task is formally defined from a domain generalization perspective: crop domains are unseen at test time, while disease categories are seen during training. Under this setting, disease-specific, domain-invariant representations are learned, which reduces reliance on crop-appearance cues and lets classification rest on disease semantics across plant species, improving generalization in multi-crop disease classification.
- A text-guided semantic anchoring module is introduced. Disease and crop textual prompts constrain the disease and crop branches, respectively, mapping visual disease features into a shared, semantically invariant concept space. This promotes a structured separation of semantic and domain factors and improves cross-domain transferability and interpretability.
- A semantic-anchor-only contrastive disentanglement strategy is proposed. Within a mixed label space, it pulls together samples of the same disease across crops and pushes apart samples of different diseases. Crop-branch representations are treated as stop-gradient hard negatives, which explicitly weakens semantic–domain coupling and information leakage and makes recognition in unseen crop domains more robust.
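The last two contributions can be illustrated with a minimal sketch of an InfoNCE-style loss in which disease text embeddings act as anchors and the crop-branch feature of the same image is appended as one extra hard negative. This is an illustrative assumption rather than the authors' exact formulation: the function names, the toy vectors, and the precise form of the loss are hypothetical, and the default temperature of 0.07 is taken from the hyperparameter table. Plain Python has no autograd, so the stop-gradient is only indicated in a comment.

```python
import math


def l2_normalize(v):
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    return sum(a * b for a, b in zip(l2_normalize(u), l2_normalize(v)))


def anchor_contrastive_loss(disease_feat, crop_feat, text_anchors, label,
                            temperature=0.07):
    """InfoNCE-style loss for one sample (hypothetical form).

    Positive: the text anchor of the true disease class.
    Negatives: the other disease anchors plus the crop-branch feature.
    """
    logits = [cosine(disease_feat, t) / temperature for t in text_anchors]
    # Crop-branch feature as an additional hard negative. In an autograd
    # framework it would be detached (stop-gradient) so that only the
    # disease branch is updated by this term.
    logits.append(cosine(disease_feat, crop_feat) / temperature)
    # Numerically stable log-sum-exp over all candidates.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[label]


# Toy example: two disease anchors and one crop feature in 3-D.
text_anchors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
crop_feat = [0.0, 0.0, 1.0]

clean = anchor_contrastive_loss([0.9, 0.1, 0.0], crop_feat, text_anchors, label=0)
leaky = anchor_contrastive_loss([0.5, 0.1, 0.8], crop_feat, text_anchors, label=0)
```

In this toy setting, a disease feature that overlaps with the crop feature (`leaky`) receives a larger loss than one aligned with its disease anchor (`clean`), which is the behaviour the hard-negative term is meant to encourage.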
2. Related Work
2.1. Disease Recognition in Unseen Crop Domains
2.2. Disentangled Representation Learning for Cross-Domain Robustness
2.3. Vision–Language Multimodality and Textual Semantic Anchors
3. Materials and Methods
3.1. Problem Formulation
3.2. Segment–Raw Cross-View Contrastive Enhancement and Background Randomization
3.3. Multimodal Feature Extraction

3.4. Disentangled Representation Learning
3.4.1. CLIP Text-Guided Feature Disentanglement
3.4.2. Semantic-Anchor-Only Contrastive Disentanglement Strategy
3.4.3. Domain-Adversarial Semantic-Domain Disentanglement
3.5. Category-Prototype-Based Fine-Tuning for Classification
3.6. Dataset
4. Analysis of Experimental Results
4.1. Experimental Setup and Implementation Details
4.2. Comparative Evaluation
4.2.1. Performance Analysis on Seen and Unseen Domains
4.2.2. Parameter Efficiency and Inference Overhead
4.2.3. Cross-Dataset Generalization and Distribution Robustness Evaluation
4.3. Ablation Studies
4.4. Representation Visualization
4.5. Semantic Invariance and Disentanglement Verification
5. Discussion
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest





| Category | Crop Domain | With Background | Segmented |
|---|---|---|---|
| Bacterial spot | Pepper (bell) | 997 | 897 |
| Bacterial spot | Tomato | 2127 | 1914 |
| Bacterial spot | Peach | 2297 | 0 |
| Black rot | Apple | 621 | 559 |
| Black rot | Grape | 1180 | 1062 |
| Early blight | Potato | 1000 | 900 |
| Early blight | Tomato | 1000 | 900 |
| Late blight | Potato | 1000 | 900 |
| Late blight | Tomato | 1909 | 1718 |
| Powdery mildew | Cherry (including sour) | 1052 | 947 |
| Powdery mildew | Squash | 1835 | 1652 |
| Healthy | Apple | 1645 | 1481 |
| Healthy | Cherry (including sour) | 854 | 769 |
| Healthy | Grape | 423 | 381 |
| Healthy | Pepper (bell) | 1478 | 1330 |
| Healthy | Potato | 152 | 137 |
| Healthy | Tomato | 1591 | 1432 |
| Healthy | Peach | 360 | 0 |
| Total: 6 categories | 8 crop domains | 21,521 | 19,369 |

| Environment | Parameters |
|---|---|
| GPU | NVIDIA GeForce RTX 4090 (24 GB) |
| CPU | Intel(R) Core(TM) i9-14900K 3.20 GHz |
| Development | PyCharm 2024.3.4 |
| Language | Python 3.13.2 |
| Framework | PyTorch 2.6.0 |
| Operating platform | CUDA 12.4 |
| Operating System | Windows 11 Professional |

| Hyperparameter | Empirical Initialization | Search Space | Selected Value |
|---|---|---|---|
| Learning rate | 1 × 10⁻⁴ | {5 × 10⁻⁵, 1 × 10⁻⁴, 2 × 10⁻⁴, 5 × 10⁻⁴} | 1 × 10⁻⁴ |
| Batch size | 16 | {8, 16, 32} | 8 |
| Temperature | 0.07 | {0.03, 0.05, 0.07, 0.10, 0.15} | 0.07 |
| Text weight | 0.3 | {0.1, 0.3, 0.5, 0.8} | 0.3 |
| Da weight | 0.30 | {0.25, 0.30, 0.35, 0.40} | 0.35 |
| Epochs | 100 | {50, 100, 150} | 100 |
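
The search space in the hyperparameter table can be written down as a small configuration sketch. The dictionary keys below are hypothetical names, and the source does not state whether an exhaustive grid or a one-factor-at-a-time search was used, so both counts are computed purely for illustration.

```python
from itertools import product

# Search space as listed in the hyperparameter table (key names are hypothetical).
SEARCH_SPACE = {
    "learning_rate": [5e-5, 1e-4, 2e-4, 5e-4],
    "batch_size": [8, 16, 32],
    "temperature": [0.03, 0.05, 0.07, 0.10, 0.15],
    "text_weight": [0.1, 0.3, 0.5, 0.8],
    "da_weight": [0.25, 0.30, 0.35, 0.40],
    "epochs": [50, 100, 150],
}

# Selected values as reported in the same table.
SELECTED = {
    "learning_rate": 1e-4,
    "batch_size": 8,
    "temperature": 0.07,
    "text_weight": 0.3,
    "da_weight": 0.35,
    "epochs": 100,
}


def full_grid(space):
    """Enumerate every combination of the search space as a list of dicts."""
    keys = list(space)
    return [dict(zip(keys, values))
            for values in product(*(space[k] for k in keys))]


grid = full_grid(SEARCH_SPACE)

# An exhaustive grid needs one run per combination; varying one factor at a
# time needs only one run per listed value.
exhaustive_runs = len(grid)
one_at_a_time_runs = sum(len(values) for values in SEARCH_SPACE.values())
```

The gap between the two counts is one reason a one-factor-at-a-time search around an empirical initialization is often preferred when each training run is expensive.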

| Model | Seen (%) | Unseen (%) | HM (%) |
|---|---|---|---|
| ViT | 99.34 ± 0.1 | 24.78 ± 0.3 | 39.67 ± 0.3 |
| CMTL-ViT | 99.49 ± 0.2 | 31.72 ± 0.6 | 48.10 ± 0.6 |
| DADA | 97.76 ± 0.3 | 42.32 ± 0.4 | 59.07 ± 0.4 |
| FF-ViT | 99.53 ± 0.1 | 52.15 ± 0.3 | 68.44 ± 0.3 |
| CL-ViT | 99.31 ± 0.3 | 54.28 ± 0.4 | 70.19 ± 0.4 |
| CLIP | 98.59 ± 0.2 | 49.32 ± 0.3 | 65.75 ± 0.3 |
| CoCoOp | 99.36 ± 0.5 | 55.06 ± 0.6 | 70.86 ± 0.6 |
| FF-CLIP | 99.19 ± 0.2 | 69.35 ± 0.5 | 81.63 ± 0.4 |
| DisMAE | 98.56 ± 0.4 | 68.27 ± 0.8 | 80.67 ± 0.8 |
| TDC | 98.04 ± 0.4 | 74.29 ± 0.7 | 84.53 ± 0.6 |
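
The HM column in the table above appears consistent with the harmonic mean of the seen and unseen accuracies, HM = 2 · Seen · Unseen / (Seen + Unseen). The short check below recomputes it from the reported mean accuracies; the function and variable names are illustrative, and the ± terms are ignored.

```python
def harmonic_mean(seen, unseen):
    """Harmonic mean of seen-domain and unseen-domain accuracy (in %)."""
    return 2 * seen * unseen / (seen + unseen)


# (Seen %, Unseen %, reported HM %) taken from the comparison table.
REPORTED = {
    "ViT": (99.34, 24.78, 39.67),
    "CMTL-ViT": (99.49, 31.72, 48.10),
    "DADA": (97.76, 42.32, 59.07),
    "FF-ViT": (99.53, 52.15, 68.44),
    "CL-ViT": (99.31, 54.28, 70.19),
    "CLIP": (98.59, 49.32, 65.75),
    "CoCoOp": (99.36, 55.06, 70.86),
    "FF-CLIP": (99.19, 69.35, 81.63),
    "DisMAE": (98.56, 68.27, 80.67),
    "TDC": (98.04, 74.29, 84.53),
}

# Recompute HM for every model and compare with the reported value.
recomputed = {
    model: harmonic_mean(seen, unseen)
    for model, (seen, unseen, _) in REPORTED.items()
}
max_gap = max(
    abs(recomputed[model] - reported)
    for model, (_, _, reported) in REPORTED.items()
)
```

Because the harmonic mean is dominated by the smaller of its two inputs, a model with near-perfect seen accuracy but weak unseen accuracy, such as the plain ViT row, still receives a low HM.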

| Model | CMTL-ViT | FF-ViT | CL-ViT | FF-CLIP | TDC |
|---|---|---|---|---|---|
| Total parameters (M) | 89 | 200 | 125 | 310 | 85 |
| Execution time (ms) | 1.91 | 4.69 | 3.43 | 4.97 | 1.89 |
| HM (%) | 48.10 | 68.44 | 70.19 | 81.63 | 84.53 |

| Dataset | Category | Unseen (%) |
|---|---|---|
| PV | Bacterial spot | 71.27 |
| PlantDoc | Bacterial spot | 61.11 |
| SP | Powdery mildew | 81.99 |
| SP_low | Powdery mildew | 76.31 |

| Dual-Branch Backbone | Contrastive | Text Guidance | Domain Adversarial | HM (%) |
|---|---|---|---|---|
| ✓ | | | | |
| ✓ | ✓ | | | |
| ✓ | ✓ | | | |
| ✓ | ✓ | | | |
| ✓ | ✓ | ✓ | | |
| ✓ | ✓ | ✓ | | |
| ✓ | ✓ | ✓ | | |
| ✓ | ✓ | ✓ | ✓ | |

| Setting | HM (%) |
|---|---|
| Without prototype | |
| With prototype | |

| Prompt Type | Seen (%) | Unseen (%) |
|---|---|---|
| Descriptive template | 97.97 | 69.85 |
| Decoupled descriptive template | 98.00 | 72.21 |
| Category-name template (ours) | 98.04 | 74.29 |

| Hyperparameter Setting | Value | HM (%) |
|---|---|---|
| Text weight | 0.1 | 82.27 |
| Text weight | 0.3 | 84.53 |
| Text weight | 0.5 | 83.58 |
| Da weight | 0.30 | 83.61 |
| Da weight | 0.35 | 84.53 |
| Da weight | 0.40 | 80.63 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Wu, Z.; Guo, J.; Hou, W.; Zhou, K.; Cao, K.; Jung, H. Unseen-Crop Plant Disease Classification via Disentangled Representation Learning. Electronics 2026, 15, 1553. https://doi.org/10.3390/electronics15081553

