DLCPD-25: A Large-Scale and Diverse Dataset for Crop Disease and Pest Recognition
Abstract
1. Introduction
- Insufficient data scale and narrow category coverage. Early datasets, such as that of Bishshash et al. [47], contain only 2137 images, far fewer than modern deep learning models require. Most existing datasets also cover only a limited set of common pests or diseases, whereas real agricultural environments involve hundreds of distinct species that need to be recognized.
- Simplified collection environments. More than 70% of available datasets are captured under controlled laboratory conditions, lacking realistic variations in illumination, occlusion, and soil backgrounds. Consequently, models trained on these datasets often achieve high laboratory accuracy but suffer substantial degradation when deployed in complex field settings [38].
- Inadequate representation of intra-class variability and inter-class similarity. Pest and disease appearances can vary considerably across growth stages, plant conditions, and environmental contexts. Meanwhile, morphologically similar species, such as those within the IP102 dataset [44] or the grass-family species in CWD30 [46], remain difficult to distinguish. The absence of fine-grained annotations exacerbates this issue, reducing classification precision.
- Class imbalance and annotation limitations. Pests and diseases in real fields follow long-tailed distributions, yet many datasets artificially resample data to balance categories. Although this simplifies training, it compromises the model’s ability to generalize to the true data distribution [48].
- Extensive coverage and large sample size. DLCPD-25 spans 23 major crops such as cotton, citrus, tomato, maize, soybean, grape, mango, wheat, sugar beet, apple, peach, rice, and alfalfa, containing 203 categories and over 221,000 images.
- Inclusion of unlabeled field images for self-supervised learning (SSL) validation. The dataset provides unlabeled samples for evaluating SSL frameworks.
- Unified integration of pest, disease, and healthy samples. This design supports a transition from single-threat classification toward comprehensive diagnostic modeling for agricultural visual recognition.
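The long-tailed distribution described in the class-imbalance point above can be made concrete with a small calculation. The sketch below takes the image counts of the nine Wheat categories listed in Appendix A and computes the imbalance ratio (largest class over smallest class) together with inverse-frequency class weights, one common alternative to resampling. The function names are hypothetical and the weighting scheme is an illustrative assumption, not a procedure described in this paper.

```python
from typing import Dict

# Image counts for the nine Wheat categories, copied from Appendix A.
WHEAT_COUNTS: Dict[str, int] = {
    "Macrosiphum Avenae": 544,
    "Penthaleus Major": 362,
    "Rhopalosiphum Maidis": 134,
    "Rhopalosiphum Padi": 333,
    "Schizaphis Graminum": 380,
    "Sitobion Avenae": 362,
    "Brown Rust": 1530,
    "Wheat Healthy": 137,
    "Yellow Rust": 1346,
}


def imbalance_ratio(counts: Dict[str, int]) -> float:
    """Ratio of the largest class size to the smallest class size."""
    return max(counts.values()) / min(counts.values())


def inverse_frequency_weights(counts: Dict[str, int]) -> Dict[str, float]:
    """Weight each class by total / (num_classes * class_count).

    Hypothetical helper for illustration: classes smaller than the mean
    class size get weights above 1, larger classes get weights below 1.
    """
    total = sum(counts.values())
    num_classes = len(counts)
    return {name: total / (num_classes * n) for name, n in counts.items()}


if __name__ == "__main__":
    print(f"imbalance ratio: {imbalance_ratio(WHEAT_COUNTS):.2f}")
    weights = inverse_frequency_weights(WHEAT_COUNTS)
    for name, weight in sorted(weights.items(), key=lambda kv: kv[1]):
        print(f"{name:22s} {weight:.3f}")
```

Weights of this form could be passed to a weighted loss so that the original long-tailed distribution is kept intact during training rather than being flattened by resampling.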
2. Construction of Proposed Dataset
2.1. Online Collection and Curation
2.2. In-Field Data Collection
2.3. Data Fusion, Filtering, and Annotation
3. Comparative Analysis
3.1. Comparison with Other Datasets
3.2. Other Potential Advantages
4. Dataset Benchmarking
4.1. Evaluation Method
4.1.1. Masked Autoencoder
4.1.2. SimCLR Series
4.1.3. Momentum Contrast
4.2. Evaluation Procedure
4.2.1. Evaluation Protocol
4.2.2. Dataset and Evaluation Setup
4.2.3. Implementation Details and Pretraining Configuration
4.2.4. Model Evaluation
4.3. Results
5. Discussion
5.1. Advantages
5.2. Challenges
5.3. Future Perspectives
- Field deployment and validation: A key objective for future work is the deployment of DLCPD-25-trained models on edge-computing platforms, including drones, field robots, and mobile devices, to realize automated and real-time field monitoring systems [75]. Achieving this will require model compression, quantization, and architecture optimization to meet hardware constraints, as well as solutions for handling field-specific challenges such as motion blur, illumination changes, and target occlusion in dynamic environments.
- Data augmentation via generative AI (GenAI): To mitigate data scarcity for rare pest and disease classes and to expand coverage under extreme environmental conditions (such as drought, flooding, or frost), future studies may employ advanced generative artificial intelligence techniques, including diffusion models [76,77] and generative adversarial networks (GANs) [78,79]. These methods can synthesize high-quality and diverse imagery to supplement underrepresented categories and rare scenarios, thereby improving both model robustness and dataset completeness. Furthermore, the integration of digital twin technologies could enable the generation of physically consistent virtual crop environments [80,81], facilitating dynamic and controllable simulation of pest and disease progression under varying climatic and management conditions.
- Multimodal data fusion: In addition to visual imagery, integrating DLCPD-25 with multimodal data—such as meteorological variables, soil sensor measurements, and hyperspectral or multispectral imaging—could further enhance diagnostic precision and predictive capability [82]. Such integration would allow for a deeper understanding of crop–environment interactions and support the development of intelligent decision-support systems for precision agriculture.
- Comprehensive comparative benchmarking: While this study validated the effectiveness of DLCPD-25 as a pre-training resource, a valuable future study would be a direct, large-scale comparison against other public datasets, such as PlantVillage and CWD30. Training identical SSL models on these datasets and evaluating them on a standardized, unseen test set would provide direct quantitative evidence of the practical advantages of DLCPD-25’s scale, diversity, and field realism. We have identified this comparison as a priority for our ongoing research.
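The quantization step mentioned under field deployment can be illustrated with a minimal sketch of symmetric post-training int8 quantization of a weight vector. This is an assumption-laden toy example using only the Python standard library; the function names are hypothetical, and a real deployment would rely on the quantization tooling of the chosen inference framework rather than this code.

```python
from typing import List, Tuple


def quantize_int8(weights: List[float]) -> Tuple[List[int], float]:
    """Symmetric per-tensor int8 quantization.

    The scale maps the largest absolute weight to 127; each weight is
    divided by the scale, rounded to the nearest integer, and clipped
    to the range [-127, 127].
    """
    max_abs = max((abs(w) for w in weights), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized: List[int], scale: float) -> List[float]:
    """Map int8 codes back to approximate floating-point weights."""
    return [q * scale for q in quantized]


if __name__ == "__main__":
    weights = [0.5, -1.27, 0.0, 1.0]
    codes, scale = quantize_int8(weights)
    restored = dequantize(codes, scale)
    for w, q, r in zip(weights, codes, restored):
        print(f"{w:+.4f} -> {q:+4d} -> {r:+.4f}")
```

Each restored weight differs from the original by at most half the scale, which is the accuracy cost traded for storing one byte per weight instead of four.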
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Appendix A. Complete List of Categories
| Species | Scientific Name | Quantity | Num |
|---|---|---|---|
| Citrus | Adristyrannus(Citrus) | 186 | 1 |
| | Aleurocanthus Spiniferus(Citrus) | 414 | 2 |
| | Aphis Citricola Vander Goot(Citrus) | 210 | 3 |
| | Bactrocera Tsuneonis(Citrus) | 100 | 4 |
| | Ceroplastes Rubens(Citrus) | 154 | 5 |
| | Chinese Citrus Fly(Citrus) | 232 | 6 |
| | Chrysomphalus Aonidum(Citrus) | 135 | 7 |
| | Dacus Dorsalis(Hendel)(Citrus) | 263 | 8 |
| | Icerya Purchasi Maskell(Citrus) | 433 | 9 |
| | Nipaecoccus Vastalor(Citrus) | 59 | 10 |
| | Orange Huanglongbing(Citrus Greening) | 10,619 | 11 |
| | Panonchus Citri McGregor(Citrus) | 231 | 12 |
| | Papilio Xuthus(Citrus) | 269 | 13 |
| | Parlatoria Zizyphus Lucus(Citrus) | 44 | 14 |
| | Phyllocnistis Citrella Stainton(Citrus) | 242 | 15 |
| | Phyllocoptes Oleiverus Ashmead(Citrus) | 103 | 16 |
| | Prodenia Litura(Citrus) | 782 | 17 |
| | Toxoptera Aurantii(Citrus) | 135 | 18 |
| | Toxoptera Citricidus(Citrus) | 113 | 19 |
| | Unaspis Yanonensis(Citrus) | 251 | 20 |
| | Citrus Healthy | 367 | 21 |
| Mango | Chlumetia Transversa(Mango) | 183 | 1 |
| | Cicadellidae(Mango) | 3444 | 2 |
| | Deporaus Marginatus Pascoe(Mango) | 199 | 3 |
| | Drosophila Melanogaster(Mango) | 740 | 4 |
| | Erosomyia Mangicola Shi(Mango) | 224 | 5 |
| | Idioscopus Clypealis(Lethierry)(Mango) | 424 | 6 |
| | Parasa Lepida(Mango) | 213 | 7 |
| | Scirtothrips Dorsalis Hood(Mango) | 399 | 8 |
| | Sternochetus Frigidus(Mango) | 187 | 9 |
| | Mango Anthracnose | 513 | 10 |
| | Mango Flat Beak Leafhopper(Mango) | 260 | 11 |
| | Mango Healthy | 229 | 12 |
| Vitis | Ampelophaga Rubiginosa(Vitis) | 458 | 1 |
| | Colomerus Vitis(Vitis) | 317 | 2 |
| | Erythroneura Apicalis(Vitis) | 323 | 3 |
| | Lycorma Delicatula(Vitis) | 218 | 4 |
| | Nipponaphis(Vitis) | 167 | 5 |
| | Oides Decempunctata(Vitis) | 143 | 6 |
| | Parthenolecanium Corni(Vitis) | 132 | 7 |
| | Polyphylla Laticollis Lewis(Vitis) | 120 | 8 |
| | Pseudococcus Comstocki Kuwana(Vitis) | 388 | 9 |
| | Theretra Japonica(Vitis) | 303 | 10 |
| | Vespula Flaviceps(Vitis) | 300 | 11 |
| | Xylotrechus Pyrrhoderus(Vitis) | 212 | 12 |
| | Black Rot | 1361 | 13 |
| | Esca | 1559 | 14 |
| | Grape Healthy | 541 | 15 |
| | Leaf Blight | 1220 | 16 |
| Alfalfa | Alfalfa Plant Bug(Alfalfa) | 393 | 1 |
| | Alfalfa Seed Chalcid(Alfalfa) | 273 | 2 |
| | Alfalfa Weevil(Alfalfa) | 788 | 3 |
| | Aphids(Alfalfa) | 666 | 4 |
| | Armyworm(Alfalfa) | 885 | 5 |
| | Blister Beetle(Alfalfa) | 330 | 6 |
| | Caterpillar(Alfalfa) | 595 | 7 |
| | Click Beetle(Alfalfa) | 379 | 8 |
| | Cutworm(Alfalfa) | 345 | 9 |
| | Ladybug(Alfalfa) | 616 | 10 |
| | Leaf Hopper(Alfalfa) | 544 | 11 |
| | Lygus(Alfalfa) | 647 | 12 |
| | Thrips(Alfalfa) | 612 | 13 |
| | Western Corn Rootworm(Alfalfa) | 829 | 14 |
| Soybean | Anticarsia Gemmatalis(Soybean) | 150 | 1 |
| | Aphis Glycines(Soybean) | 446 | 2 |
| | Ascotis Selenaria(Soybean) | 172 | 3 |
| | Bemisia Tabaci(Soybean) | 557 | 4 |
| | Clanis Bilineata(Soybean) | 290 | 5 |
| | Cletus Schmidti(Soybean) | 186 | 6 |
| | Etiella Zinckenella(Soybean) | 304 | 7 |
| | Helicoverpa Armigera(Soybean) | 304 | 8 |
| | Heterodera Glycines(Soybean) | 114 | 9 |
| | Leguminivora Glycinivorella(Soybean) | 317 | 10 |
| | Maruca Testulalis(Soybean) | 222 | 11 |
| | Matsumuraeses Phaseoli(Soybean) | 346 | 12 |
| | Melanagromyza Sojae(Soybean) | 121 | 13 |
| | Monolepta Hieroglyphica(Soybean) | 218 | 14 |
| | Nezara Viridula(Soybean) | 250 | 15 |
| | Odontothrips Loti(Soybean) | 268 | 16 |
| | Omiodes Indicata(Soybean) | 159 | 17 |
| | Paraluperodes Suturalis(Soybean) | 195 | 18 |
| | Piedmont Bean Bug(Soybean) | 195 | 19 |
| | Plathypena Scabra(Soybean) | 190 | 20 |
| | Riptortus Pedestris(Soybean) | 205 | 21 |
| | Spodoptera Litura(Soybean) | 246 | 22 |
| | Tetranychus Cinnabarinus(Soybean) | 268 | 23 |
| | Angular Leaf Spot | 510 | 24 |
| | Downy Mildew | 510 | 25 |
| | Soybean Healthy | 5842 | 26 |
| Corn | Agrotis Ypsilon(Corn) | 350 | 1 |
| | Anaphothrips Obscurus(Corn) | 463 | 2 |
| | Apolygus Lucorum(Corn) | 371 | 3 |
| | Chilo Suppressalis(Corn) | 315 | 4 |
| | Gryllotalpa Orientalis(Corn) | 269 | 5 |
| | Holotrichia Diomphalia(Corn) | 337 | 6 |
| | Holotrichia Oblita(Corn) | 376 | 7 |
| | Holotrichia Parallela(Corn) | 323 | 8 |
| | Laodelphax Striatellus(Corn) | 245 | 9 |
| | Mythimna Separata(Corn) | 363 | 10 |
| | Ostrinia Furnacalis(Corn) | 265 | 11 |
| | Pleonomus Canaliculatus(Corn) | 108 | 12 |
| | Porn Cricket(Corn) | 989 | 13 |
| | Peach Borer(Corn) | 414 | 14 |
| | Protaetia Brevitarsis(Corn) | 339 | 15 |
| | Puccinia Polysora | 838 | 16 |
| | Red Spider(Corn) | 317 | 17 |
| | White Margined Moth(Corn) | 88 | 18 |
| | Wireworm(Corn) | 532 | 19 |
| | Yellow Cutworm(Corn) | 287 | 20 |
| Rice | Asiatic Rice Borer(Rice) | 631 | 1 |
| | Brown Plant Hopper(Rice) | 500 | 2 |
| | Grain Spreader Thrips(Rice) | 103 | 3 |
| | Paddy Stem Maggot(Rice) | 156 | 4 |
| | Rice Bacterial Leaf Blight | 1624 | 5 |
| | Rice Blast | 2219 | 6 |
| | Rice Brown Spot | 2163 | 7 |
| | Rice Gall Midge(Rice) | 303 | 8 |
| | Rice Hispa | 565 | 9 |
| | Rice Leaf Caterpillar(Rice) | 292 | 10 |
| | Rice Leaf Roller(Rice) | 669 | 11 |
| | Rice Leaf Smut | 40 | 12 |
| | Rice Leafhopper(Rice) | 242 | 13 |
| | Rice Shell Pest(Rice) | 245 | 14 |
| | Rice Stemfly(Rice) | 221 | 15 |
| | Rice Tungro | 1308 | 16 |
| | Rice Water Weevil(Rice) | 513 | 17 |
| | Small Brown Plant Hopper(Rice) | 331 | 18 |
| | White Backed Plant Hopper(Rice) | 271 | 19 |
| | Yellow Rice Borer(Rice) | 162 | 20 |
| Apple | Adoxophyes Orana(Apple) | 285 | 1 |
| | Aphis Citricola(Apple) | 579 | 2 |
| | Carposina Sasakii(Apple) | 417 | 3 |
| | Grapholitha Molesta(Apple) | 228 | 4 |
| | Panonchus Citri(Apple) | 410 | 5 |
| | Apple Black Rot | 671 | 6 |
| | Apple Healthy | 1899 | 7 |
| | Apple Rust | 305 | 8 |
| | Apple Scab | 680 | 9 |
| Wheat | Macrosiphum Avenae(Wheat) | 544 | 1 |
| | Penthaleus Major(Wheat) | 362 | 2 |
| | Rhopalosiphum Maidis(Wheat) | 134 | 3 |
| | Rhopalosiphum Padi(Wheat) | 333 | 4 |
| | Schizaphis Graminum(Wheat) | 380 | 5 |
| | Sitobion Avenae(Wheat) | 362 | 6 |
| | Brown Rust | 1530 | 7 |
| | Wheat Healthy | 137 | 8 |
| | Yellow Rust | 1346 | 9 |
| Cotton | Adelphocoris Fasciaticollis(Cotton) | 276 | 1 |
| | Adelphocoris Lineolatus(Cotton) | 356 | 2 |
| | Adelphocoris Suturalis(Cotton) | 174 | 3 |
| | Agrotis Segetum(Cotton) | 197 | 4 |
| | Aphis Gossypii Glover(Cotton) | 306 | 5 |
| | Creontiades Dilutus(Cotton) | 163 | 6 |
| | Earias Cupreoviridis(Cotton) | 136 | 7 |
| | Helicoverpa Armigera(Cotton) | 327 | 8 |
| | Lygus Lucorum(Cotton) | 361 | 9 |
| | Lygus Pratensis(Cotton) | 192 | 10 |
| | Pectinophora Gossypiella(Cotton) | 195 | 11 |
| | Phenacoccus Solenopsis(Cotton) | 164 | 12 |
| | Spodoptera Exigua(Cotton) | 345 | 13 |
| | Spodoptera Litura(Cotton) | 182 | 14 |
| | Tetranychus Cinnabarinus(Cotton) | 294 | 15 |
| | Tetranychus Truncatus(Cotton) | 259 | 16 |
| | Thrips Tabaci(Cotton) | 217 | 17 |
| Tea | Aapiletucara Cristata(Tea) | 191 | 1 |
| | Acapimya Theae(Tea) | 155 | 2 |
| | Aleurocanthus Spiniferus(Tea) | 133 | 3 |
| | Andraca Bipunctata(Tea) | 136 | 4 |
| | Ectropis Obliqua(Tea) | 187 | 5 |
| | Empoasca Onukii(Tea) | 142 | 6 |
| | Euproctis Pseudoconspersa(Tea) | 174 | 7 |
| | Hasora Anura(Tea) | 149 | 8 |
| | Homona Coffearia(Tea) | 121 | 9 |
| | Lymantria Dispar(Tea) | 172 | 10 |
| | Parasa Lepida(Tea) | 137 | 11 |
| | Scirtothrips Dorsalis(Tea) | 218 | 12 |
| | Teinopalpus Aureus(Tea) | 176 | 13 |
| | Toxoptera Aurantii(Tea) | 192 | 14 |
| | Xyleborus Fornicatus(Tea) | 174 | 15 |
| Peach | Grapholitha Molesta(Peach) | 115 | 1 |
| | Myzus Persicae(Peach) | 236 | 2 |
| | Bacterial Spot | 2522 | 3 |
| | Peach Healthy | 405 | 4 |
| Tomato | Bacterial Spot | 2349 | 1 |
| | Early Blight | 1100 | 2 |
| | Late Blight | 2076 | 3 |
| | Leaf Mold | 1052 | 4 |
| | Mosaic Virus | 418 | 5 |
| | Septoria Leaf Spot | 1940 | 6 |
| | Spider Mites Two-Spotted Spider Mite | 1839 | 7 |
| | Target Spot | 1555 | 8 |
| | Tomato Healthy | 1761 | 9 |
| | Tomato Yellow Leaf Curl Virus | 5775 | 10 |
| Potato | Early Blight | 1100 | 1 |
| | Late Blight | 1100 | 2 |
| | Potato Healthy | 167 | 3 |
| Pepper | Bacterial Spot | 1097 | 1 |
| | Pepper Healthy | 1625 | 2 |
| Strawberry | Leaf Scorch | 1232 | 1 |
| | Strawberry Healthy | 500 | 2 |
| Cherry | Cherry Healthy | 948 | 1 |
| | Powdery Mildew | 1169 | 2 |
| Raspberry | Raspberry Healthy | 405 | 1 |
| Blueberry | Blueberry Healthy | 1657 | 1 |
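Per-crop totals can be derived from the category list by summing the Quantity column within each crop. The sketch below does this for four of the smaller crops, with rows copied from the table above; the tuple layout and the function name are illustrative assumptions rather than part of any released DLCPD-25 tooling.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# (crop, category, image count) rows copied from Appendix A.
ROWS = [
    ("Potato", "Early Blight", 1100),
    ("Potato", "Late Blight", 1100),
    ("Potato", "Potato Healthy", 167),
    ("Pepper", "Bacterial Spot", 1097),
    ("Pepper", "Pepper Healthy", 1625),
    ("Strawberry", "Leaf Scorch", 1232),
    ("Strawberry", "Strawberry Healthy", 500),
    ("Cherry", "Cherry Healthy", 948),
    ("Cherry", "Powdery Mildew", 1169),
]


def totals_per_crop(rows: Iterable[Tuple[str, str, int]]) -> Dict[str, int]:
    """Sum the image counts of all categories belonging to each crop."""
    totals: Dict[str, int] = defaultdict(int)
    for crop, _category, count in rows:
        totals[crop] += count
    return dict(totals)


if __name__ == "__main__":
    for crop, total in totals_per_crop(ROWS).items():
        print(f"{crop:12s} {total}")
```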
References
- Yang, Z.Y.; Xia, W.K.; Chu, H.Q.; Su, W.H.; Wang, R.F.; Wang, H. A comprehensive review of deep learning applications in cotton industry: From field monitoring to smart processing. Plants 2025, 14, 1481.
- Wang, R.F.; Qu, H.R.; Su, W.H. From sensors to insights: Technological trends in image-based high-throughput plant phenotyping. Smart Agric. Technol. 2025, 12, 101257.
- Food and Agriculture Organization of the United Nations (FAO). Available online: https://www.fao.org/corporatepage/en (accessed on 17 October 2025).
- Devi, R.; Kumar, V.; Sivakumar, P. EfficientNetV2 Model for Plant Disease Classification and Pest Recognition. Comput. Syst. Sci. Eng. 2023, 45, 2249.
- Mallick, M.T.; Biswas, S.; Das, A.K.; Saha, H.N.; Chakrabarti, A.; Deb, N. Deep learning based automated disease detection and pest classification in Indian mung bean. Multimed. Tools Appl. 2023, 82, 12017–12041.
- Wang, S.; Xu, D.; Liang, H.; Bai, Y.; Li, X.; Zhou, J.; Su, C.; Wei, W. Advances in deep learning applications for plant disease and pest detection: A review. Remote Sens. 2025, 17, 698.
- Shoaib, M.; Sadeghi-Niaraki, A.; Ali, F.; Hussain, I.; Khalid, S. Leveraging deep learning for plant disease and pest detection: A comprehensive review and future directions. Front. Plant Sci. 2025, 16, 1538163.
- Wang, Z.; Zhang, H.W.; Dai, Y.Q.; Cui, K.; Wang, H.; Chee, P.W.; Wang, R.F. Resource-Efficient Cotton Network: A Lightweight Deep Learning Framework for Cotton Disease and Pest Classification. Plants 2025, 14, 2082.
- Li, W.; Han, X.; Lin, Z.; Rahman, A. Enhanced pest and disease detection in agriculture using deep learning-enabled drones. Acadlore Trans. AI Mach. Learn. 2024, 3, 1–10.
- Chodey, M.D.; Noorullah Shariff, C. Hybrid deep learning model for in-field pest detection on real-time field monitoring. J. Plant Dis. Prot. 2022, 129, 635–650.
- Guo, B.; Wang, J.; Guo, M.; Chen, M.; Chen, Y.; Miao, Y. Overview of pest detection and recognition algorithms. Electronics 2024, 13, 3008.
- Polk, S.L.; Chan, A.H.; Cui, K.; Plemmons, R.J.; Coomes, D.A.; Murphy, J.M. Unsupervised detection of ash dieback disease (Hymenoscyphus fraxineus) using diffusion-based hyperspectral image clustering. In Proceedings of the IGARSS 2022-2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 2287–2290.
- Wu, A.Q.; Li, K.L.; Song, Z.Y.; Lou, X.; Hu, P.; Yang, W.; Wang, R.F. Deep Learning for Sustainable Aquaculture: Opportunities and Challenges. Sustainability 2025, 17, 5084.
- Skendžić, S.; Novak, H.; Zovko, M.; Pajač Živković, I.; Lešić, V.; Maričević, M.; Lemić, D. Hyperspectral Canopy Reflectance and Machine Learning for Threshold-Based Classification of Aphid-Infested Winter Wheat. Remote Sens. 2025, 17, 929.
- Li, R.; Cui, K.; Chan, R.H.; Plemmons, R.J. Classification of hyperspectral images using SVM with shape-adaptive reconstruction and smoothed total variation. In Proceedings of the IGARSS 2022-2022 IEEE International Geoscience and Remote Sensing Symposium, Kuala Lumpur, Malaysia, 17–22 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1368–1371.
- Cui, K.; Shao, Z.; Larsen, G.; Pauca, V.; Alqahtani, S.; Segurado, D.; Pinheiro, J.; Wang, M.; Lutz, D.; Plemmons, R.; et al. Palmprobnet: A probabilistic approach to understanding palm distributions in ecuadorian tropical forest via transfer learning. In Proceedings of the 2024 ACM Southeast Conference, Marietta, GA, USA, 18–20 April 2024; pp. 272–277.
- Sethy, P.K.; Barpanda, N.K.; Rath, A.K.; Behera, S.K. Deep feature based rice leaf disease identification using support vector machine. Comput. Electron. Agric. 2020, 175, 105527.
- Liu, T.; Chen, W.; Wu, W.; Sun, C.; Guo, W.; Zhu, X. Detection of aphids in wheat fields using a computer vision technique. Biosyst. Eng. 2016, 141, 82–93.
- Rani, F.P.; Kumar, S.; Fred, A.L.; Dyson, C.; Suresh, V.; Jeba, P. K-means clustering and SVM for plant leaf disease detection and classification. In Proceedings of the 2019 International Conference on Recent Advances in Energy-Efficient Computing and Communication (ICRAECC), Nagercoil, India, 7–20 March 2019; IEEE: Piscataway, NJ, USA, 2019; pp. 1–4.
- Wang, R.F.; Su, W.H. The application of deep learning in the whole potato production Chain: A Comprehensive review. Agriculture 2024, 14, 1225.
- Cui, K.; Li, R.; Polk, S.L.; Murphy, J.M.; Plemmons, R.J.; Chan, R.H. Unsupervised spatial-spectral hyperspectral image reconstruction and clustering with diffusion geometry. In Proceedings of the 2022 12th Workshop on Hyperspectral Imaging and Signal Processing: Evolution in Remote Sensing (WHISPERS), Rome, Italy, 13–16 September 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–5.
- Cao, Z.; Xin, H.; Wang, R.; Nie, F. Superpixel-Based Bipartite Graph Clustering Enriched with Spatial Information for Hyperspectral and LiDAR Data. IEEE Trans. Geosci. Remote Sens. 2025, 63, 1–15.
- Cui, K.; Tang, W.; Zhu, R.; Wang, M.; Larsen, G.D.; Pauca, V.P.; Alqahtani, S.; Yang, F.; Segurado, D.; Fine, P.; et al. Efficient Localization and Spatial Distribution Modeling of Canopy Palms Using UAV Imagery. IEEE Trans. Geosci. Remote Sens. 2025, 63, 4413815.
- Zhao, C.T.; Wang, R.F.; Tu, Y.H.; Pang, X.X.; Su, W.H. Automatic lettuce weed detection and classification based on optimized convolutional neural networks for robotic weed control. Agronomy 2024, 14, 2838.
- Cui, K.; Li, R.; Polk, S.L.; Lin, Y.; Zhang, H.; Murphy, J.M.; Plemmons, R.J.; Chan, R.H. Superpixel-based and spatially regularized diffusion learning for unsupervised hyperspectral image clustering. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–18.
- Cao, Z.; Lu, Y.; Yuan, J.; Xin, H.; Wang, R.; Nie, F. Tensorized Graph Learning for Spectral Ensemble Clustering. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 2662–2674.
- Isinkaye, F.O.; Olusanya, M.O.; Singh, P.K. Deep learning and content-based filtering techniques for improving plant disease identification and treatment recommendations: A comprehensive review. Heliyon 2024, 10, e29583.
- Cui, K.; Zhu, R.; Wang, M.; Tang, W.; Larsen, G.D.; Pauca, V.P.; Alqahtani, S.; Yang, F.; Segurado, D.; Lutz, D.A.; et al. Detection and Geographic Localization of Natural Objects in the Wild: A Case Study on Palms. In Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence, IJCAI-25, Montreal, QC, Canada, 16–22 August 2025; International Joint Conferences on Artificial Intelligence Organization: Montreal, QC, Canada, 2025; Volume 8, pp. 9601–9609.
- Saki, M.; Keshavarz, R.; Franklin, D.; Abolhasan, M.; Lipman, J.; Shariati, N. A Data-Driven Review of Remote Sensing-Based Data Fusion in Precision Agriculture from Foundational to Transformer-Based Techniques. IEEE Access 2025, 13, 166188–166209.
- Huo, Y.; Liu, Y.; He, P.; Hu, L.; Gao, W.; Gu, L. Identifying Tomato Growth Stages in Protected Agriculture with StyleGAN3–Synthetic Images and Vision Transformer. Agriculture 2025, 15, 120.
- Elghawth, R.; Abbaoui, W.; Ariss, A.; Ziti, S. Deep Learning for Transformer-Based Plant Disease Detection: A Bibliometric Analysis. Eng. Proc. 2025, 112, 29.
- Liu, H.; Zhan, B.; Fang, R.; Zhang, Y.; Ma, Y.; Shen, Z.; Mao, Q. Recent advances in pest and disease recognition: A comprehensive review. J. Agric. Eng. 2025, 56.
- Wang, H.; Nguyen, T.H.; Nguyen, T.N.; Dang, M. PD-TR: End-to-end plant diseases detection using a transformer. Comput. Electron. Agric. 2024, 224, 109123.
- Wang, J.; Wang, T.; Xu, Q.; Gao, L.; Gu, G.; Jia, L.; Yao, C. RP-DETR: End-to-end rice pests detection using a transformer. Plant Methods 2025, 21, 63.
- Babu, P.R.; Atluri, S.K. Deep learning-assisted SVMs for efficacious diagnosis of tomato leaf diseases: A comparative study of GoogLeNet, AlexNet, and ResNet-50. Ing. Syst. D’Inf. 2023, 28, 639.
- Khan, A.T.; Jensen, S.M.; Khan, A.R.; Li, S. Plant disease detection model for edge computing devices. Front. Plant Sci. 2023, 14, 1308528.
- Hassan, S.M.; Jasinski, M.; Leonowicz, Z.; Jasinska, E.; Maji, A.K. Plant disease identification using shallow convolutional neural network. Agronomy 2021, 11, 2388.
- Ferentinos, K.P. Deep learning models for plant disease detection and diagnosis. Comput. Electron. Agric. 2018, 145, 311–318.
- Wang, R.F.; Qin, Y.M.; Zhao, Y.Y.; Xu, M.; Schardong, I.B.; Cui, K. RA-CottNet: A Real-Time High-Precision Deep Learning Model for Cotton Boll and Flower Recognition. AI 2025, 6, 235.
- Huo, Y.; Yao, M.; Wang, T.; Tian, Q.; Zhao, J.; Liu, X.; Wang, H. PR-DETR: Extracting and utilizing prior knowledge for improved end-to-end object detection. Image Vis. Comput. 2025, 163, 105745.
- Sun, H.; Chu, H.Q.; Qin, Y.M.; Hu, P.; Wang, R.F. Empowering Smart Soybean Farming with Deep Learning: Progress, Challenges, and Future Perspectives. Agronomy 2025, 15, 1831.
- Bilal, M.; Shah, A.A.; Abbas, S.; Khan, M.A. High-Performance Deep Learning for Instant Pest and Disease Detection in Precision Agriculture. Food Sci. Nutr. 2025, 13, e70963.
- Hughes, D.; Salathé, M. An open access repository of images on plant health to enable the development of mobile disease diagnostics. arXiv 2015, arXiv:1511.08060.
- Wu, X.; Zhan, C.; Lai, Y.K.; Cheng, M.M.; Yang, J. IP102: A large-scale benchmark dataset for insect pest recognition. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 8787–8796.
- New Plant Diseases Dataset. Available online: https://www.kaggle.com/datasets/vipoooool/new-plant-diseases-dataset (accessed on 20 October 2025).
- Ilyas, T.; Arsa, D.M.S.; Ahmad, K.; Lee, J.; Won, O.; Lee, H.; Kim, H.; Park, D.S. CWD30: A new benchmark dataset for crop weed recognition in precision agriculture. Comput. Electron. Agric. 2025, 229, 109737.
- Bishshash, P.; Nirob, A.S.; Shikder, H.; Sarower, A.H.; Bhuiyan, T.; Noori, S.R.H. A comprehensive cotton leaf disease dataset for enhanced detection and classification. Data Brief 2024, 57, 110913.
- Zhao, Y.; Chen, W.; Huang, K.; Zhu, J. Feature re-balancing for long-tailed visual recognition. In Proceedings of the 2022 International Joint Conference on Neural Networks (IJCNN), Padua, Italy, 18–23 July 2022; IEEE: Piscataway, NJ, USA, 2022; pp. 1–8.
- Zhao, Y.; Xie, Q. Review of Deep Learning Applications for Detecting Special Components in Agricultural Products. Computers 2025, 14, 309.
- Faisal, S.; Ooi, M.P.L.; Kuang, Y.C.; Abeysekera, S.K.; Fletcher, D. An overview of integrating deep learning methods with close-range hyperspectral imaging for agriculture. IEEE Access 2025, 13, 120257–120276.
- da Silva, M.P.; Correa, S.P.; Schaefer, M.A.; Reis, J.C.; Nunes, I.M.; dos Santos, J.A.; Oliveira, H.N. Advancing agricultural remote sensing: A comprehensive review of deep supervised and Self-Supervised Learning for crop monitoring. Comput. Graph. 2025, 133, 104434.
- Zhang, J.; Yang, L.; Mohammadabadi, S.M.S.; Yan, F. A survey on self-supervised learning: Recent advances and open problems. Neurocomputing 2025, 655, 131409.
- Carneiro, G.A.; Aubry, T.J.; Cunha, A.; Radeva, P.; Sousa, J.J. Progress in applications of self-supervised learning to computer vision in agriculture: A systematic review. Comput. Electron. Agric. 2025, 239, 111134.
- Liu, X.; Min, W.; Mei, S.; Wang, L.; Jiang, S. Plant Disease Recognition: A Large-Scale Benchmark Dataset and a Visual Region and Loss Reweighting Approach. IEEE Trans. Image Process. 2021, 30, 2003–2015.
- Barbedo, J.G.A.; Koenigkan, L.V.; Halfeld-Vieira, B.A.; Costa, R.V.; Nechet, K.L.; Godoy, C.V.; Junior, M.L.; Patricio, F.R.A.; Talamini, V.; Chitarra, L.G.; et al. Annotated plant pathology databases for image-based detection and recognition of diseases. IEEE Lat. Am. Trans. 2018, 16, 1749–1757.
- Singh, D.; Jain, N.; Jain, P.; Kayal, P.; Kumawat, S.; Batra, N. PlantDoc: A dataset for visual plant disease detection. In Proceedings of the 7th ACM IKDD CoDS and 25th COMAD, Hyderabad, India, 5–7 January 2020; pp. 249–253.
- Liu, Z.; Miao, Z.; Zhan, X.; Wang, J.; Gong, B.; Yu, S.X. Large-scale long-tailed recognition in an open world. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2537–2546.
- Chen, T.; Kornblith, S.; Norouzi, M.; Hinton, G. A simple framework for contrastive learning of visual representations. In Proceedings of the International Conference on Machine Learning (PMLR), Virtual, 13–18 July 2020; pp. 1597–1607.
- Xie, Q.; Luong, M.T.; Hovy, E.; Le, Q.V. Self-training with noisy student improves imagenet classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 10687–10698.
- Kamilaris, A.; Prenafeta-Boldu, F. Deep learning in agriculture: A survey. Comput. Electron. Agric. 2018, 147, 70–90.
- Barbedo, J.G.A. Plant disease identification from individual lesions and spots using deep learning. Biosyst. Eng. 2019, 180, 96–107.
- Kolesnikov, A.; Zhai, X.; Beyer, L. Revisiting self-supervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 1920–1929.
- He, K.; Fan, H.; Wu, Y.; Xie, S.; Girshick, R. Momentum Contrast for Unsupervised Visual Representation Learning. In Proceedings of the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020.
- He, K.; Chen, X.; Xie, S.; Li, Y.; Dollár, P.; Girshick, R. Masked autoencoders are scalable vision learners. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New Orleans, LA, USA, 18–24 June 2022; pp. 16000–16009.
- Cao, S.; Xu, P.; Clifton, D.A. How to understand masked autoencoders. arXiv 2022, arXiv:2202.03670.
- Guan, R.; Tu, W.; Li, Z.; Yu, H.; Hu, D.; Chen, Y.; Tang, C.; Yuan, Q.; Liu, X. Spatial-Spectral Graph Contrastive Clustering with Hard Sample Mining for Hyperspectral Images. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–16.
- Chen, C.; Cui, K.; Cascarano, P.; Tang, W.; Piccolomini, E.L.; Chan, R.H. Blind Restoration of High-Resolution Ultrasound Video. In Proceedings of the International Conference on Medical Image Computing and Computer-Assisted Intervention, Daejeon, Republic of Korea, 23–27 September 2025; Springer: Berlin/Heidelberg, Germany, 2025; pp. 77–87.
- Guan, R.; Liu, T.; Tu, W.; Tang, C.; Luo, W.; Liu, X. Sampling Enhanced Contrastive Multi-View Remote Sensing Data Clustering with Long-Short Range Information Mining. IEEE Trans. Knowl. Data Eng. 2025, 37, 5598–5612.
- Guan, R.; Li, Z.; Tu, W.; Wang, J.; Liu, Y.; Li, X.; Tang, C.; Feng, R. Contrastive Multiview Subspace Clustering of Hyperspectral Images Based on Graph Convolutional Networks. IEEE Trans. Geosci. Remote Sens. 2024, 62, 1–14.
- Chen, X.; Xie, S.; He, K. An empirical study of training self-supervised vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, QC, Canada, 11–17 October 2021; pp. 9640–9649.
- Tang, W.; Cui, K.; Chan, R.H. Optimized hard exudate detection with supervised contrastive learning. In Proceedings of the 2024 IEEE International Symposium on Biomedical Imaging (ISBI), Athens, Greece, 27–30 May 2024; IEEE: Piscataway, NJ, USA, 2024; pp. 1–5.
- Chen, T.; Kornblith, S.; Swersky, K.; Norouzi, M.; Hinton, G.E. Big self-supervised models are strong semi-supervised learners. Adv. Neural Inf. Process. Syst. 2020, 33, 22243–22255. [Google Scholar]
- Powers, D.M. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv 2020, arXiv:2010.16061. [Google Scholar] [CrossRef]
- Deng, J.; Dong, W.; Socher, R.; Li, L.J.; Li, K.; Fei-Fei, L. Imagenet: A large-scale hierarchical image database. In Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 20–25 June 2009; IEEE: Piscataway, NJ, USA, 2009; pp. 248–255. [Google Scholar]
- Wang, R.F.; Tu, Y.H.; Li, X.C.; Chen, Z.Q.; Zhao, C.T.; Yang, C.; Su, W.H. An Intelligent Robot Based on Optimized YOLOv11l for Weed Control in Lettuce. In Proceedings of the 2025 ASABE Annual International Meeting. American Society of Agricultural and Biological Engineers, Toronto, ON, Canada, 13–16 July 2025; p. 1. [Google Scholar]
- Du, M.; Wang, F.; Wang, Y.; Li, K.; Hou, W.; Liu, L.; He, Y.; Wang, Y. Improving long-tailed pest classification using diffusion model-based data augmentation. Comput. Electron. Agric. 2025, 234, 110244. [Google Scholar] [CrossRef]
- Hu, X.; Chen, H.; Duan, Q.; Ahn, C.K.; Shang, H.; Zhang, D. A Comprehensive Review of Diffusion Models in Smart Agriculture: Progress, Applications, and Challenges. arXiv 2025, arXiv:2507.18376. [Google Scholar] [CrossRef]
- Bhattacharya, D.C.; Tausif Mallick, M.; Saha, H.N.; Chakrabarti, A. A comparative review on GAN-based data augmentation techniques for plant-based pest detection. In Proceedings of the International Conference on Data Management, Analytics & Innovation, Kolkata, India, 17–19 January 2025; Springer: Berlin/Heidelberg, Germany, 2025; pp. 47–63.
- Zhang, Y.; Wa, S.; Zhang, L.; Lv, C. Automatic plant disease detection based on tranvolution detection network with GAN modules using leaf images. Front. Plant Sci. 2022, 13, 875693.
- Guan, A.; Zhou, S.; Gu, W.; Wu, Z.; Gao, M.; Liu, H.; Zhang, X.P. Dynamic Simulation and Parameter Calibration-Based Experimental Digital Twin Platform for Heat-Electric Coupled System. IEEE Trans. Sustain. Energy 2025.
- Nasirahmadi, A.; Hensel, O. Toward the next generation of digitalization in agriculture based on digital twin paradigm. Sensors 2022, 22, 498.
- Yang, Z.X.; Li, Y.; Wang, R.F.; Hu, P.; Su, W.H. Deep Learning in Multimodal Fusion for Sustainable Plant Care: A Comprehensive Review. Sustainability 2025, 17, 5255.





| Type | Crop Name | Classes | Number of Images |
|---|---|---|---|
| EC | Citrus | 21 | 15,342 |
| EC | Tomato | 20 | 46,201 |
| EC | Vitis | 21 | 20,134 |
| EC | Apple | 5 | 14,390 |
| EC | Soybean | 23 | 9613 |
| EC | Peach | 2 | 8133 |
| EC | Mango | 10 | 5840 |
| EC | Alfalfa | 11 | 5703 |
| EC | Bell Pepper | 2 | 5379 |
| EC | Strawberry | 2 | 5264 |
| EC | Cherry | 2 | 3972 |
| EC | Cotton | 11 | 3794 |
| EC | Squash | 1 | 3571 |
| EC | Blueberry | 1 | 3318 |
| EC | Raspberry | 1 | 2781 |
| EC | Cucumber | 7 | 2384 |
| EC | Beet | 7 | 2176 |
| EC | Pepper | 2 | 1689 |
| EC | Garlic | 1 | 279 |
| FC | Corn | 20 | 18,677 |
| FC | Rice | 21 | 14,450 |
| FC | Potato | 4 | 11,553 |
| FC | Wheat | 15 | 4522 |

| Dataset | Image Count | Category Count | Coverage | Availability | Reference | Main Task |
|---|---|---|---|---|---|---|
| PDDB | 46,409 | 56 | Crop and Fruit Diseases | Public | [55] | Image classification |
| CWD30 | 219,778 | 30 | Weeds | Public | [46] | Image classification |
| PlantVillage | 54,309 | 38 | Crop and Fruit Diseases | Public | [43] | Image classification |
| PlantDoc | 2598 | 17 | Crop and Fruit Diseases | Public | [56] | Image classification and object detection |
| PDD271 | 220,592 | 271 | Crop and Fruit Diseases | Private | [54] | Image classification |
| IP102 | 75,222 | 102 | Pests | Private | [44] | Image classification and object detection |
| DLCPD-25 | 221,943 | 203 | Diseases and Pests | Public | Ours | Image classification |

| Method | Accuracy (%) | F1 Score (%) | Precision (%) | Recall (%) |
|---|---|---|---|---|
| MAE | 70.2 | 69.9 | 72.0 | 68.0 |
| SimCLR v2 | 72.1 | 71.3 | 74.0 | 69.0 |
| MoCo v3 | 71.2 | 70.4 | 73.0 | 68.0 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Zhang, H.-W.; Wang, R.-F.; Wang, Z.; Su, W.-H. DLCPD-25: A Large-Scale and Diverse Dataset for Crop Disease and Pest Recognition. Sensors 2025, 25, 7098. https://doi.org/10.3390/s25227098

