A Multimodal UAV-IoT Sensing Framework for Intelligent Pest Density Estimation in Smart Agricultural Systems
Abstract
1. Introduction
- A systematic multimodal modeling paradigm for pest density estimation is established: a multi-source agricultural dataset is constructed and the Pest Density Estimation Framework (PDEF) is proposed, providing a unified data-driven foundation for field-scale population monitoring.
- A cross-modal feature alignment module and an environment-aware enhancement module are designed to mitigate spatial, temporal, and semantic discrepancies across heterogeneous data sources.
- Experiments under real-world monitoring conditions show that PDEF outperforms traditional object-level detection approaches and single-modality prediction models in accuracy and robustness (MAE of 5.47 and R² of 0.84, versus 6.92 and 0.74 for the image-only ResNet50 baseline).
- The practical value of the framework is illustrated by its ability to provide data-driven insights for precision resource allocation and risk-aware decision-making in intelligent agricultural management.
2. Related Work
2.1. Vision-Based Agricultural Pest Detection Methods
2.2. Remote Sensing and UAV-Based Agricultural Monitoring
2.3. Multimodal Agricultural Sensor Data Fusion
2.4. Data-Driven Agri-Economics
3. Materials and Method
3.1. Data Collection
3.2. Data Preprocessing and Augmentation Strategy
3.3. Proposed Method
3.3.1. Overall
Algorithm 1. Inference Pathway of PDEF
3.3.2. Cross-Modal Feature Alignment Module
3.3.3. Environment-Aware Enhancement Module
3.3.4. Pest Density Regression Module
4. Results and Discussion
4.1. Experimental Setup
4.2. Baseline and Evaluation Metrics
4.3. Comparison with Baseline Methods
4.4. Ablation Study
4.5. Sensitivity Analysis of Environmental Factors
4.6. Performance Comparison Under Different Input Modalities
4.7. Discussion
4.7.1. Theoretical Analysis of Model Properties and Convergence
4.7.2. Model Generalization Capability
4.7.3. Practical Deployment in Intelligent Agricultural Systems
4.7.4. Broader Implications for Agricultural Systems
4.8. Limitations and Future Work
5. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest

| Data Modality | Data Source | Sampling Frequency | Data Volume |
|---|---|---|---|
| UAV imagery | UAV platform with RGB camera | Every 5–7 days | 18,500 images |
| Trap counts | Pheromone traps deployed in fields | Every 3–5 days | 12,000 records |
| Environmental sensors | Field IoT weather stations | Every 10–30 min | 2.6 × 10⁶ records |
| Temporal coverage | Entire growing season | – | April 2023–October 2024 |
| Spatial coverage | Multiple agricultural fields | – | 15+ field plots |

| Method | MAE ↓ | RMSE ↓ | MAPE (%) ↓ | R² ↑ |
|---|---|---|---|---|
| Linear Regression | 8.91 | 11.38 | 24.7 | 0.60 |
| Random Forest | 8.12 | 10.65 | 22.3 | 0.66 |
| MLP Regression | 7.58 | 10.02 | 20.8 | 0.70 |
| CNN (ResNet50) | 6.92 | 9.24 | 18.9 | 0.74 |
| LSTM (Sensor only) | 7.25 | 9.57 | 19.6 | 0.72 |
| CNN + LSTM (Fusion) | 6.34 | 8.51 | 17.2 | 0.78 |
| Transformer Fusion | 6.01 | 8.16 | 16.5 | 0.80 |
| CLIP | 5.82 | 7.95 | 15.8 | 0.81 |
| PDEF (Ours) | 5.47 | 7.62 | 14.9 | 0.84 |
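
The four metrics reported in the comparison above follow standard definitions. The sketch below shows how they could be computed from paired observations; the function name `regression_metrics` and the sample densities are hypothetical and are not taken from the paper's data or code. Skipping zero-valued targets in MAPE is an assumption made here to avoid division by zero.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, RMSE, MAPE (%) and R^2 for paired observations."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    # MAPE is undefined for a true value of zero; such samples are skipped
    # here (an assumption of this sketch).
    nonzero = [(t, e) for t, e in zip(y_true, errors) if t != 0]
    if nonzero:
        mape = 100.0 * sum(abs(e / t) for t, e in nonzero) / len(nonzero)
    else:
        mape = float("nan")

    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")

    return {"MAE": mae, "RMSE": rmse, "MAPE": mape, "R2": r2}


# Hypothetical pest densities, for illustration only.
observed = [10.0, 20.0, 30.0, 40.0]
predicted = [12.0, 18.0, 33.0, 37.0]
metrics = regression_metrics(observed, predicted)
print(metrics)
```

Lower MAE, RMSE, and MAPE indicate smaller errors, while R² closer to 1 indicates that more of the variance in observed density is explained, which matches the arrow directions in the table headers.
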

| Model Variant | MAE ↓ | RMSE ↓ | MAPE (%) ↓ | R² ↑ |
|---|---|---|---|---|
| Full PDEF | 5.47 | 7.62 | 14.9 | 0.84 |
| w/o Cross-Modal Alignment | 6.05 | 8.28 | 16.8 | 0.80 |
| w/o Environment-Aware Module | 5.92 | 8.14 | 16.3 | 0.81 |
| w/o Feature Decomposition | 5.84 | 8.05 | 15.9 | 0.82 |
| w/o Multimodal Fusion (Image only) | 6.92 | 9.24 | 18.9 | 0.74 |
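
To make the ablation results easier to compare, the MAE of each variant can be expressed as an absolute and relative increase over the full model. The sketch below does this arithmetic using only the MAE values listed in the ablation table; the helper name `degradation` is hypothetical.

```python
# MAE values copied from the ablation table above.
FULL_MAE = 5.47
ablation_mae = {
    "w/o Cross-Modal Alignment": 6.05,
    "w/o Environment-Aware Module": 5.92,
    "w/o Feature Decomposition": 5.84,
    "w/o Multimodal Fusion (Image only)": 6.92,
}


def degradation(full, ablated):
    """Absolute and relative (%) MAE increase when a component is removed."""
    delta = ablated - full
    return round(delta, 2), round(100.0 * delta / full, 1)


report = {name: degradation(FULL_MAE, mae) for name, mae in ablation_mae.items()}

# Rank variants from the largest to the smallest MAE increase.
ranked = sorted(report, key=lambda name: report[name][0], reverse=True)
for name in ranked:
    delta, pct = report[name]
    print(f"{name}: +{delta:.2f} MAE ({pct:.1f}%)")
```

On these figures, removing multimodal fusion altogether costs the most (+1.45 MAE), and among the individual modules, removing cross-modal alignment (+0.58) hurts more than removing the environment-aware module (+0.45) or feature decomposition (+0.37).
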

| Environmental Factor | MAE Increase ↑ | RMSE Increase ↑ | Importance Score |
|---|---|---|---|
| Temperature | 1.12 | 1.45 | 0.38 |
| Humidity | 0.89 | 1.18 | 0.29 |
| Wind Speed | 0.45 | 0.62 | 0.18 |
| Precipitation | 0.32 | 0.41 | 0.15 |
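
The procedure behind the "MAE Increase" column is not recoverable from this excerpt. One common way to obtain such a figure is permutation sensitivity: shuffle a single input factor across samples and measure how much the error rises. The sketch below illustrates that idea on a hypothetical toy model (`toy_predict`) with synthetic data; it is not the authors' method. Normalizing the increases so they sum to 1 is likewise only an assumption: the importance scores in the table sum to 1.00, but they are not simply the normalized MAE increases (1.12 / 2.78 ≈ 0.40, not 0.38).

```python
import random


def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def permutation_mae_increase(predict, rows, targets, feature, seed=0, repeats=20):
    """Average rise in MAE after shuffling one feature across samples."""
    rng = random.Random(seed)
    baseline = mae(targets, [predict(r) for r in rows])
    increases = []
    for _ in range(repeats):
        shuffled = [r[feature] for r in rows]
        rng.shuffle(shuffled)
        perturbed = [dict(r, **{feature: v}) for r, v in zip(rows, shuffled)]
        increases.append(mae(targets, [predict(r) for r in perturbed]) - baseline)
    return sum(increases) / repeats


# Hypothetical stand-in model: strong temperature effect, weak wind effect.
def toy_predict(row):
    return 2.0 * row["temperature"] + 0.1 * row["wind_speed"]


data_rng = random.Random(42)
rows = [
    {"temperature": data_rng.uniform(15, 35), "wind_speed": data_rng.uniform(0, 10)}
    for _ in range(200)
]
targets = [toy_predict(r) for r in rows]  # noise-free, so the baseline MAE is 0

increase = {
    f: permutation_mae_increase(toy_predict, rows, targets, f)
    for f in ("temperature", "wind_speed")
}
total = sum(increase.values())
importance = {f: v / total for f, v in increase.items()}  # illustrative normalization
print(increase, importance)
```

In this toy setting, the factor with the larger effect on the prediction (temperature) produces the larger MAE increase, which is the same qualitative reading the table supports: temperature and humidity rank above wind speed and precipitation.
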

| Input Modality | MAE ↓ | RMSE ↓ | MAPE (%) ↓ | R² ↑ |
|---|---|---|---|---|
| UAV only | 6.92 | 9.24 | 18.9 | 0.74 |
| Sensor only | 7.25 | 9.57 | 19.6 | 0.72 |
| Trap only | 8.05 | 10.78 | 23.1 | 0.67 |
| UAV + Sensor | 6.12 | 8.34 | 17.0 | 0.79 |
| UAV + Trap | 6.01 | 8.21 | 16.6 | 0.80 |
| Sensor + Trap | 6.58 | 8.88 | 17.9 | 0.77 |
| UAV + Sensor + Trap (PDEF) | 5.47 | 7.62 | 14.9 | 0.84 |
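
The modality comparison also allows a simple leave-one-out reading: how much MAE drops when each modality is added to the other two. The sketch below computes this from the MAE values in the table above; the names `mae_by_inputs` and `marginal_gain` are hypothetical.

```python
# MAE values copied from the input-modality table above.
mae_by_inputs = {
    frozenset({"UAV"}): 6.92,
    frozenset({"Sensor"}): 7.25,
    frozenset({"Trap"}): 8.05,
    frozenset({"UAV", "Sensor"}): 6.12,
    frozenset({"UAV", "Trap"}): 6.01,
    frozenset({"Sensor", "Trap"}): 6.58,
    frozenset({"UAV", "Sensor", "Trap"}): 5.47,
}

ALL = frozenset({"UAV", "Sensor", "Trap"})


def marginal_gain(modality):
    """MAE reduction obtained by adding `modality` to the other two."""
    without = ALL - {modality}
    return round(mae_by_inputs[without] - mae_by_inputs[ALL], 2)


gains = {m: marginal_gain(m) for m in sorted(ALL)}

# Relative improvement of the full tri-modal input over the best single modality.
best_single = min(v for k, v in mae_by_inputs.items() if len(k) == 1)
fusion_gain_pct = round(100.0 * (best_single - mae_by_inputs[ALL]) / best_single, 1)

print(gains)
print(f"Tri-modal vs best single modality: {fusion_gain_pct}% lower MAE")
```

On these figures, UAV imagery contributes the largest marginal reduction (1.11), followed by trap counts (0.65) and environmental sensors (0.54), and the tri-modal input lowers MAE by about 21% relative to UAV imagery alone.
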
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Zhang, Y.; Chen, J.; Zeng, X.; Chen, R.; Chen, L.; Xiao, S.; Song, Y. A Multimodal UAV-IoT Sensing Framework for Intelligent Pest Density Estimation in Smart Agricultural Systems. Sensors 2026, 26, 2877. https://doi.org/10.3390/s26092877
