Occlusion Avoidance for Harvesting Robots: A Lightweight Active Perception Model
Highlights
- A lightweight YOLOv8n model integrated with C2f-FasterBlock and SE attention achieves high apple detection accuracy (mAP = 0.885) and real-time performance (83 FPS), with 37% fewer parameters and a compact 4.3 MB model size.
- An end-to-end active perception framework based on ResNet50 and multi-modal fusion enables the robotic arm to autonomously navigate to optimal viewpoints, significantly reducing occlusion and improving recognition success.
- The proposed co-design of efficient perception and active sensing offers a practical solution for reliable fruit detection in cluttered orchard environments, addressing a key bottleneck in agricultural automation.
- The system’s direct mapping from visual input to motion planning demonstrates a scalable paradigm for closed-loop robotic harvesting, paving the way for deployment in real-world field conditions.
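The highlights reference the Squeeze-and-Excitation (SE) attention mechanism used to boost the lightweight detector. As a minimal, framework-free sketch of the channel-reweighting idea behind SE (all shapes, weights, and the reduction ratio below are illustrative, not the paper's actual configuration):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def se_block(x, w1, w2):
    """Squeeze-and-Excitation applied to a feature map x of shape (C, H, W).

    w1: (C//r, C) channel-reduction weights; w2: (C, C//r) expansion weights.
    """
    # Squeeze: global average pooling -> one descriptor per channel, shape (C,)
    z = x.mean(axis=(1, 2))
    # Excitation: bottleneck MLP (ReLU then sigmoid) -> per-channel gate in (0, 1)
    s = sigmoid(w2 @ np.maximum(w1 @ z, 0.0))
    # Scale: reweight each channel of the input feature map by its gate
    return x * s[:, None, None]

# Tiny demo: 4 channels, reduction ratio r = 2, random weights
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8, 8))
w1 = rng.standard_normal((2, 4)) * 0.1
w2 = rng.standard_normal((4, 2)) * 0.1
y = se_block(x, w1, w2)
print(y.shape)  # (4, 8, 8)
```

In the actual detector the excitation weights are learned end-to-end; the random weights here only demonstrate the squeeze–excite–scale data flow.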
Abstract
1. Introduction
2. Materials and Methods
2.1. Active Perception Strategy and Model Construction
2.1.1. Active Perception Process
2.1.2. Construction of Dataset
2.1.3. Active Perception Model
2.2. Construction of Lightweight Models
2.2.1. Lightweight YOLOv8n Model
2.2.2. C2f-FasterBlock
2.2.3. Squeeze-and-Excitation Networks
3. Experimental Design and Evaluation Indicators
3.1. Experimental Design
3.2. Evaluation Metrics for Network Models
4. Results
4.1. Active Perception Experiment
4.2. Comparative Experiments of Different Models
4.3. Ablation Study
4.4. Research Limitations
5. Discussion and Future Work
5.1. Discussion
5.2. Future Work
5.3. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
| Network | mAP | Model Weight Size (MB) | GFLOPs | Parameters | FPS |
|---|---|---|---|---|---|
| YOLOv8n | 0.854 | 6.3 | 8.9 | 3,157,200 | 66 |
| YOLOv8s | 0.875 | 22.6 | 28.5 | 11,166,560 | 69 |
| YOLOv8m | 0.877 | 52.1 | 79.3 | 25,902,640 | 40 |
| YOLOv9t | 0.828 | 4.7 | 7.6 | 1,974,684 | 70 |
| Our network | 0.885 | 4.3 | 5.1 | 1,983,068 | 83 |
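The comparison figures above can be sanity-checked directly from the table's own numbers; this small snippet recomputes the parameter reduction, speedup, and mAP gain of the proposed network relative to the YOLOv8n baseline row:

```python
# Figures taken from the model-comparison table above
baseline = {"mAP": 0.854, "params": 3_157_200, "fps": 66}
ours = {"mAP": 0.885, "params": 1_983_068, "fps": 83}

param_reduction = 1 - ours["params"] / baseline["params"]
speedup = ours["fps"] / baseline["fps"]
map_gain = ours["mAP"] - baseline["mAP"]

print(f"parameters: {param_reduction:.1%} fewer")  # ~37.2%, matching the reported "37% fewer"
print(f"throughput: {speedup:.2f}x faster")
print(f"mAP: +{map_gain:.3f}")
```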
| Network | C2f-FasterBlock | SE Attention Mechanism | mAP | Model Weight Size (MB) | GFLOPs | Parameters | FPS |
|---|---|---|---|---|---|---|---|
| YOLOv8n | × | × | 0.854 | 6.3 | 8.9 | 3,157,200 | 69 |
| YOLOv8n | √ | × | 0.872 | 4.3 | 5.1 | 1,991,260 | 79 |
| YOLOv8n | × | √ | 0.875 | 6.3 | 8.1 | 3,017,740 | 72 |
| YOLOv8n | √ | √ | 0.885 | 4.3 | 5.1 | 1,983,068 | 83 |
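The C2f-FasterBlock rows in the ablation above build on FasterNet's partial convolution (PConv), which convolves only a fraction of the channels and passes the rest through unchanged, cutting FLOPs and parameters. A minimal NumPy sketch of the idea (the shapes, `n_div`, and weights are illustrative, not the paper's configuration):

```python
import numpy as np

def pconv(x, w, n_div=4):
    """Partial convolution (PConv), a NumPy sketch.

    Applies a 3x3 convolution (weights w, 'same' padding) to only the first
    C // n_div channels of x; the remaining channels pass through untouched.
    x: (C, H, W); w: (Cp, Cp, 3, 3) with Cp = C // n_div.
    """
    c, h, wdt = x.shape
    cp = c // n_div
    out = x.copy()  # untouched channels are copied through as-is
    xp = np.pad(x[:cp], ((0, 0), (1, 1), (1, 1)))  # zero-pad spatial dims
    conv = np.zeros((cp, h, wdt))
    for o in range(cp):          # output channel
        for i in range(cp):      # input channel
            for di in range(3):  # kernel rows/cols as shifted slices
                for dj in range(3):
                    conv[o] += w[o, i, di, dj] * xp[i, di:di + h, dj:dj + wdt]
    out[:cp] = conv
    return out

# Tiny demo: 8 channels, n_div = 4 -> only 2 channels are convolved
rng = np.random.default_rng(1)
x = rng.standard_normal((8, 5, 5))
w = rng.standard_normal((2, 2, 3, 3))
y = pconv(x, w, n_div=4)
print(y.shape)  # (8, 5, 5)
```

The naive loops are for clarity only; a real FasterBlock uses an optimized convolution kernel, but the channel-splitting behavior is the same.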
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Zhang, T.; Huang, J.; Niu, J.; Liu, Z.; Zhang, L.; Song, H. Occlusion Avoidance for Harvesting Robots: A Lightweight Active Perception Model. Sensors 2026, 26, 291. https://doi.org/10.3390/s26010291