A Multi-Scale Vision–Sensor Collaborative Framework for Small-Target Insect Pest Management
Simple Summary
Abstract
1. Introduction
- A multi-scale visual feature modeling mechanism for small-target pests is proposed, in which cross-scale information fusion mitigates the feature sparsity and scale mismatch problems that hamper small-target recognition;
- A vision–sensor collaborative prior modulation strategy is designed to integrate instantaneous environmental sensor readings deeply into the pest recognition process, improving the model’s robustness under complex local conditions (see the conceptual sketch after this list);
- Systematic experiments on real field-collected datasets verify that the proposed method significantly outperforms multiple mainstream approaches on small-target pest recognition tasks;
- The proposed framework enables early and accurate identification of small-target pests in support of precision pest management, with potential benefits for reducing pesticide input, lowering production costs, and improving agricultural productivity.
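The implementation is not reproduced in this outline, so the following is a minimal PyTorch sketch of how the three modules named in the contributions could fit together: a multi-scale visual branch, a sensor prior that modulates the visual features, and a collaborative classification head. All class names, layer sizes, and the sigmoid-gated modulation are illustrative assumptions rather than the authors' exact architecture.

```python
# Conceptual sketch only: module names, dimensions, and the gating form
# are assumptions for illustration, not the paper's exact design.
import torch
import torch.nn as nn

class MultiScaleVisualBranch(nn.Module):
    """Extracts feature maps at several spatial scales and fuses them."""
    def __init__(self, out_dim=256):
        super().__init__()
        self.stem = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU())
        self.stage1 = nn.Sequential(nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU())
        self.stage2 = nn.Sequential(nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU())
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Cross-scale fusion: concatenate pooled descriptors from each stage.
        self.fuse = nn.Linear(32 + 64 + 128, out_dim)

    def forward(self, x):
        f1 = self.stem(x)     # fine scale (small pests occupy few pixels)
        f2 = self.stage1(f1)  # medium scale
        f3 = self.stage2(f2)  # coarse scale
        feats = [self.pool(f).flatten(1) for f in (f1, f2, f3)]
        return self.fuse(torch.cat(feats, dim=1))

class SensorPriorModulation(nn.Module):
    """Maps sensor readings (e.g., temperature, humidity, illumination)
    to per-channel scale/shift factors applied to the visual features."""
    def __init__(self, n_sensors=3, feat_dim=256):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(n_sensors, 64), nn.ReLU(),
                                 nn.Linear(64, 2 * feat_dim))

    def forward(self, v, s):
        gamma, beta = self.mlp(s).chunk(2, dim=1)
        return v * torch.sigmoid(gamma) + beta  # gated modulation

class CollaborativeClassifier(nn.Module):
    """Vision-sensor collaborative discrimination and classification."""
    def __init__(self, n_classes=6):
        super().__init__()
        self.vision = MultiScaleVisualBranch()
        self.prior = SensorPriorModulation()
        self.head = nn.Linear(256, n_classes)

    def forward(self, image, sensors):
        v = self.vision(image)
        return self.head(self.prior(v, sensors))

model = CollaborativeClassifier()
logits = model(torch.randn(4, 3, 224, 224), torch.randn(4, 3))
print(logits.shape)  # torch.Size([4, 6])
```

The multiplicative gate lets sensor context rescale visual channels rather than merely being appended to them; that distinction is exactly what the last ablation row in Table 4 probes.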
2. Related Work
2.1. Deep Learning-Based Pest Recognition Methods
2.2. Small-Target Visual Recognition and Multi-Scale Modeling
2.3. Multimodal and Sensor-Assisted Agricultural Perception Methods
3. Materials and Methods
3.1. Data Collection
3.2. Data Pre-Processing and Augmentation Strategy
3.3. Proposed Method
3.3.1. Overview
3.3.2. Multi-Scale Small-Target Visual Feature Modeling Module
3.3.3. Environmental Sensor Prior Modulation Module
3.3.4. Vision–Sensor Collaborative Discrimination and Classification Module
4. Results and Discussion
4.1. Experimental Configuration
4.1.1. Hardware and Software Platform
4.1.2. Baseline Models and Evaluation Metrics
4.2. Overall Performance Comparison with Baseline Methods
4.3. Recognition Performance Across Different Pest Categories
4.4. Ablation Study on Key Modules
4.5. Recognition Performance Under Crowded Colony Conditions
4.6. Discussion
4.6.1. Practical Implications for Intelligent Pest Monitoring
4.6.2. Implications for Agricultural Economics and Sustainable Management
4.7. Limitations and Future Work
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
Table 1. Summary of the collected pest image data and environmental sensor records.

| Data Type/Pest Category | Number of Samples | Acquisition Scenario | Acquisition Period |
|---|---|---|---|
| Aphid | 3180 | Field and greenhouse | 2023.04–2023.11 |
| Thrips | 2460 | Greenhouse | 2023.04–2023.11 |
| Whitefly | 1985 | Greenhouse | 2023.05–2023.10 |
| Leafhopper | 1540 | Field | 2023.05–2023.09 |
| Spider mite | 1870 | Field and greenhouse | 2023.06–2023.10 |
| Leaf beetle | 1095 | Field | 2023.04–2023.08 |
| Total pest image data | 12,130 | Field and greenhouse | 2023.04–2023.11 |
| Air temperature data | 34,560 | Field and greenhouse | 2023.04–2023.11 |
| Relative humidity data | 34,560 | Field and greenhouse | 2023.04–2023.11 |
| Illumination intensity data | 25,920 | Field and greenhouse | 2023.05–2023.10 |
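The image and sensor streams in Table 1 are sampled on different schedules (12,130 images versus 34,560 temperature and humidity readings), so each image must be paired with concurrent sensor values before fusion. The exact alignment procedure is not shown here; the sketch below illustrates one plausible nearest-timestamp pairing, where the 10-minute sampling grid and the `nearest_reading` helper are both hypothetical.

```python
# Hypothetical pairing of each image with the sensor reading closest in
# time; the paper's actual alignment procedure may differ.
from bisect import bisect_left

def nearest_reading(image_ts, sensor_ts, sensor_vals):
    """sensor_ts must be sorted ascending; returns the value nearest image_ts."""
    i = bisect_left(sensor_ts, image_ts)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(sensor_ts)]
    j = min(candidates, key=lambda k: abs(sensor_ts[k] - image_ts))
    return sensor_vals[j]

# Hypothetical 10-minute sampling grid (timestamps in minutes since start).
times = list(range(0, 60, 10))            # [0, 10, 20, 30, 40, 50]
temps = [18.2, 18.9, 19.5, 20.1, 21.0, 21.4]
print(nearest_reading(37, times, temps))  # 21.0 (reading at t = 40)
```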
Table 2. Overall performance comparison with baseline methods (* marks baseline results reported as significantly different from the proposed method).

| Method | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
|---|---|---|---|---|
| SVM + handcrafted features | 78.4 ± 1.5 * | 76.9 ± 1.6 * | 74.2 ± 1.8 * | 75.5 ± 1.6 * |
| Random forest + handcrafted features | 80.1 ± 1.4 * | 78.6 ± 1.5 * | 76.8 ± 1.5 * | 77.7 ± 1.4 * |
| CNN (single-scale) | 84.7 ± 1.2 * | 83.2 ± 1.3 * | 81.5 ± 1.4 * | 82.3 ± 1.3 * |
| ResNet (single-scale) | 86.9 ± 1.0 * | 85.6 ± 1.1 * | 83.8 ± 1.2 * | 84.7 ± 1.1 * |
| Vision Transformer (single-scale) | 87.4 ± 1.1 * | 86.1 ± 1.0 * | 84.2 ± 1.1 * | 85.1 ± 1.0 * |
| FPN-based multi-scale vision (no sensor) | 89.6 ± 0.9 * | 88.3 ± 0.8 * | 87.1 ± 1.0 * | 87.7 ± 0.9 * |
| Multi-scale vision (w/o sensor prior) | 90.4 ± 0.8 * | 89.1 ± 0.9 * | 88.2 ± 0.8 * | 88.6 ± 0.8 * |
| Proposed method | 93.1 ± 0.5 | 92.0 ± 0.6 | 91.2 ± 0.5 | 91.6 ± 0.5 |
Table 3. Recognition performance across different pest categories.

| Pest Category | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
|---|---|---|---|---|
| Aphid | 94.2 | 93.5 | 92.8 | 93.1 |
| Thrips | 92.6 | 91.3 | 90.5 | 90.9 |
| Whitefly | 91.8 | 90.6 | 89.7 | 90.1 |
| Leafhopper | 92.1 | 91.0 | 90.2 | 90.6 |
| Spider mite | 93.4 | 92.2 | 91.5 | 91.8 |
| Leaf beetle | 90.7 | 89.4 | 88.6 | 89.0 |
| Macro-average | 92.5 | 91.3 | 90.6 | 90.9 |
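The macro-average row is simply the unweighted mean of the six per-category scores, which can be verified directly from the table's own values:

```python
# Sanity check of the macro-average in Table 3: the macro value is the
# unweighted mean of the six per-category F1 scores listed above.
per_class_f1 = {"aphid": 93.1, "thrips": 90.9, "whitefly": 90.1,
                "leafhopper": 90.6, "spider mite": 91.8, "leaf beetle": 89.0}
macro_f1 = sum(per_class_f1.values()) / len(per_class_f1)
print(f"macro F1 = {macro_f1:.1f}")  # 90.9, matching the table
```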
Table 4. Ablation study on key modules (* marks variants reported as significantly different from the full model).

| Variant | Accuracy (%) | Precision (%) | Recall (%) | F1-Score (%) |
|---|---|---|---|---|
| Full model (proposed) | 93.1 ± 0.5 | 92.0 ± 0.6 | 91.2 ± 0.5 | 91.6 ± 0.5 |
| w/o multi-scale visual modeling | 88.3 ± 1.0 * | 86.9 ± 1.1 * | 85.7 ± 1.2 * | 86.3 ± 1.1 * |
| w/o sensor prior modulation | 90.4 ± 0.8 * | 89.1 ± 0.9 * | 88.2 ± 0.8 * | 88.6 ± 0.8 * |
| w/o vision–sensor collaborative head | 89.6 ± 0.9 * | 88.2 ± 1.0 * | 87.1 ± 0.9 * | 87.6 ± 0.9 * |
| Replace conditional gating with concatenation | 89.1 ± 0.9 * | 87.8 ± 0.8 * | 86.9 ± 1.0 * | 87.3 ± 0.9 * |
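The last two rows isolate the fusion design: removing the collaborative head or replacing conditional gating with plain concatenation costs roughly 3.5 to 4 accuracy points. The sketch below contrasts the two fusion choices; the dimensions and layer definitions are illustrative assumptions, not the authors' exact fusion head.

```python
# Illustration of the final ablation row: conditional gating versus plain
# concatenation as the vision-sensor fusion step. Sizes are illustrative.
import torch
import torch.nn as nn

feat_dim, sensor_dim, n_classes = 256, 3, 6

# (a) Conditional gating: sensor features rescale the visual channels.
gate = nn.Sequential(nn.Linear(sensor_dim, feat_dim), nn.Sigmoid())
head_gated = nn.Linear(feat_dim, n_classes)

# (b) Concatenation: sensor features are simply appended as extra inputs.
head_concat = nn.Linear(feat_dim + sensor_dim, n_classes)

v = torch.randn(4, feat_dim)    # visual features
s = torch.randn(4, sensor_dim)  # sensor readings

logits_gated = head_gated(v * gate(s))                  # multiplicative prior
logits_concat = head_concat(torch.cat([v, s], dim=1))   # additive channels
print(logits_gated.shape, logits_concat.shape)
```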
Table 5. Recognition performance under sparse and dense (crowded colony) conditions.

| Method | Density | Accuracy (%) | Recall (%) | F1-Score (%) |
|---|---|---|---|---|
| Vision Transformer (single-scale) | Sparse | 88.6 | 85.9 | 86.8 |
| Vision Transformer (single-scale) | Dense | 84.9 | 81.7 | 83.1 |
| FPN-based multi-scale (no sensor) | Sparse | 90.8 | 88.3 | 89.2 |
| FPN-based multi-scale (no sensor) | Dense | 87.4 | 84.6 | 85.9 |
| Multi-scale vision (w/o sensor prior) | Sparse | 91.6 | 89.7 | 90.2 |
| Multi-scale vision (w/o sensor prior) | Dense | 88.8 | 86.1 | 87.3 |
| Proposed method | Sparse | 93.8 | 92.3 | 92.5 |
| Proposed method | Dense | 92.2 | 90.8 | 91.0 |
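A useful reading of Table 5 is the accuracy drop from sparse to dense scenes, which the snippet below computes from the reported values; the proposed method degrades the least under crowded colony conditions.

```python
# Sparse-to-dense accuracy drop per method, computed from Table 5.
accuracy = {
    "ViT (single-scale)":             (88.6, 84.9),
    "FPN multi-scale (no sensor)":    (90.8, 87.4),
    "Multi-scale (w/o sensor prior)": (91.6, 88.8),
    "Proposed method":                (93.8, 92.2),
}
for name, (sparse, dense) in accuracy.items():
    print(f"{name}: -{sparse - dense:.1f} points")
# The proposed method drops 1.6 points versus 3.7 for the single-scale ViT.
```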
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.