Benchmarking YOLO and Transformer-Based Detectors for Olive Tree Crown Identification in UAV Imagery
Abstract
1. Introduction
2. Related Works
3. Materials and Methods
3.1. Datasets
3.1.1. Olive Tree Detection (OTD) Dataset
3.1.2. Yalova Dataset
3.2. You Only Look Once (YOLO)
3.3. Real-Time DEtection TRansformer (RT-DETR)
3.4. Roboflow-DEtection Transformer (RF-DETR)
3.5. Experimental Details
4. Results
4.1. Olive Tree Detection in OTD Dataset
4.2. Segmentation Results in OTD Dataset
4.3. Olive Tree Detection in Yalova Dataset
4.4. Segmentation Results in the Yalova Dataset
4.5. Tree Crown Size Analysis
5. Discussion
5.1. Evaluation of Tree Detection and Segmentation Performance
5.2. Statistical Analysis for mAP Values
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| YOLO | You Only Look Once |
| UAV | Unmanned Aerial Vehicle |
| RT-DETR | Real-Time Detection Transformer |
| RF-DETR | Roboflow-Detection Transformer |
| OTD | Olive Tree Detection |
| VHR | Very High Resolution |
| RGB | Red, Green, Blue |
| SAM | Segment Anything Model |
| AI | Artificial Intelligence |
| CNN | Convolutional Neural Network |
| RMSE | Root Mean Square Error |
| OTC | Olive Tree Crown |
| CSPNet | Cross Stage Partial Network |
| FPN | Feature Pyramid Network |
| PAN | Path Aggregation Network |
| NAS | Neural Architecture Search |
| LiDAR | Light Detection and Ranging |
| ViT | Vision Transformer |
| FOV | Field of View |
| GSD | Ground Sampling Distance |
| NMS | Non-Maximum Suppression |
| SPPF | Spatial Pyramid Pooling—Fast |
| C2PSA | Cross Stage Partial with Spatial Attention |
| mAP | Mean Average Precision |
| AP | Average Precision |
| TP | True Positive |
| FP | False Positive |
| FN | False Negative |
| COCO | Common Objects in Context |
| SLIC | Simple Linear Iterative Clustering |
| OBIA | Object-Based Image Analysis |
| IoU | Intersection over Union |
Appendix A
Appendix A.1
Precision, recall, and F1-score of the YOLO models and the OBIA baseline on the OTD dataset.

| Model | Precision | Recall | F1-Score |
|---|---|---|---|
| YOLOv8n | 0.947 | 0.907 | 0.927 |
| YOLOv8s | 0.948 | 0.908 | 0.927 |
| YOLOv8m | 0.943 | 0.917 | 0.930 |
| YOLOv8l | 0.943 | 0.910 | 0.926 |
| YOLOv8x | 0.951 | 0.898 | 0.924 |
| YOLOv11n | 0.956 | 0.902 | 0.928 |
| YOLOv11s | 0.941 | 0.904 | 0.922 |
| YOLOv11m | 0.935 | 0.913 | 0.924 |
| YOLOv11l | 0.936 | 0.915 | 0.925 |
| YOLOv11x | 0.945 | 0.906 | 0.925 |
| OBIA | 0.880 | 0.859 | 0.869 |
Appendix A.2
Precision, recall, and F1-score of the YOLO models and the OBIA baseline on the Yalova dataset.

| Model | Precision | Recall | F1-Score |
|---|---|---|---|
| YOLOv8n | 0.847 | 0.771 | 0.807 |
| YOLOv8s | 0.866 | 0.748 | 0.802 |
| YOLOv8m | 0.830 | 0.774 | 0.801 |
| YOLOv8l | 0.868 | 0.795 | 0.830 |
| YOLOv8x | 0.863 | 0.792 | 0.826 |
| YOLOv11n | 0.850 | 0.748 | 0.796 |
| YOLOv11s | 0.827 | 0.812 | 0.820 |
| YOLOv11m | 0.851 | 0.786 | 0.817 |
| YOLOv11l | 0.786 | 0.842 | 0.813 |
| YOLOv11x | 0.835 | 0.809 | 0.822 |
| OBIA | 0.837 | 0.793 | 0.814 |
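The appendix tables report precision, recall, and F1-score. For orientation, a minimal sketch of how these metrics follow from the TP, FP, and FN counts defined in the abbreviation list; the definitions are standard, and the sample counts below are hypothetical, not taken from the study.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted trees that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)

def recall(tp: int, fn: int) -> float:
    """Fraction of reference trees that were found: TP / (TP + FN)."""
    return tp / (tp + fn)

def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

# Checking one table row: YOLOv8n on the OTD dataset (P=0.947, R=0.907)
# yields F1 = 0.927, matching Appendix A.1.
print(round(f1_score(0.947, 0.907), 3))
```

Note that F1 computed from the rounded precision and recall printed in the tables can differ from the reported value in the last digit when the underlying counts were used.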
References
- Besnard, G.; Hernández, P.; Khadari, B.; Dorado, G.; Savolainen, V. Genomic Profiling of Plastid DNA Variation in the Mediterranean Olive Tree. BMC Plant Biol. 2011, 11, 80.
- Šiljeg, A.; Marinović, R.; Domazetović, F.; Jurišić, M.; Marić, I.; Panđa, L.; Radočaj, D.; Milošević, R. GEOBIA and Vegetation Indices in Extracting Olive Tree Canopies Based on Very High-Resolution UAV Multispectral Imagery. Appl. Sci. 2023, 13, 739.
- Waleed, M.; Um, T.-W.; Khan, A.; Khan, U. Automatic Detection System of Olive Trees Using Improved K-Means Algorithm. Remote Sens. 2020, 12, 760.
- Araújo, R.G.; Chavez-Santoscoy, R.A.; Parra-Saldívar, R.; Melchor-Martínez, E.M.; Iqbal, H.M.N. Agro-Food Systems and Environment: Sustaining the Unsustainable. Curr. Opin. Environ. Sci. Health 2023, 31, 100413.
- Atapattu, A.J.; Ranasinghe, C.; Nuwarapaksha, T.D.; Udumann, S.S.; Dissanayaka, N.S. Sustainable Agriculture and Sustainable Development Goals (SDGs). In Emerging Technologies and Marketing Strategies for Sustainable Agriculture; IGI Global Scientific Publishing: Hershey, PA, USA, 2024; pp. 1–27.
- Li, S.; Brandt, M.; Fensholt, R.; Kariryaa, A.; Igel, C.; Gieseke, F.; Nord-Larsen, T.; Oehmcke, S.; Carlsen, A.H.; Junttila, S.; et al. Deep Learning Enables Image-Based Tree Counting, Crown Segmentation, and Height Prediction at National Scale. PNAS Nexus 2023, 2, pgad076.
- Srestasathiern, P.; Rakwatin, P. Oil Palm Tree Detection with High Resolution Multi-Spectral Satellite Imagery. Remote Sens. 2014, 6, 9749–9774.
- Jemaa, H.; Bouachir, W.; Leblon, B.; LaRocque, A.; Haddadi, A.; Bouguila, N. UAV-Based Computer Vision System for Orchard Apple Tree Detection and Health Assessment. Remote Sens. 2023, 15, 3558.
- Biyik, M.Y.; Atik, M.E.; Duran, Z. Deep Learning-Based Vehicle Detection from Orthophoto and Spatial Accuracy Analysis. Int. J. Eng. Geosci. 2023, 8, 138–145.
- Arkali, M.; Biyik, M.Y.; Atik, M.E. Comparative Analysis of Machine Learning Algorithms for Classification of UAV-Based Photogrammetric Cultural Heritage Point Clouds. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2025, 48, 17–22.
- Atik, Ş. Classification of Urban Vegetation Utilizing Spectral Indices and DEM with Ensemble Machine Learning Methods. Int. J. Environ. Geoinform. 2025, 12, 43–53.
- Minařík, R.; Langhammer, J. Use of a Multispectral UAV Photogrammetry for Detection and Tracking of Forest Disturbance Dynamics. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, 41, 711–718.
- Atik, M.E.; Arkali, M.; Atik, S.O. Impact of UAV-Derived RTK/PPK Products on Geometric Correction of VHR Satellite Imagery. Drones 2025, 9, 291.
- Moreno-Garcia, J.; Jimenez, L.; Rodriguez-Benitez, L.; Solana-Cipres, C.J. Fuzzy Logic Applied to Detect Olive Trees in High Resolution Images. In Proceedings of the International Conference on Fuzzy Systems, Barcelona, Spain, 18–23 July 2010; pp. 1–7.
- González, J.; Galindo, C.; Arevalo, V.; Ambrosio, G. Applying Image Analysis and Probabilistic Techniques for Counting Olive Trees in High-Resolution Satellite Images. In Proceedings of the Advanced Concepts for Intelligent Vision Systems; Blanc-Talon, J., Philips, W., Popescu, D., Scheunders, P., Eds.; Springer: Berlin/Heidelberg, Germany, 2007; pp. 920–931.
- Peters, J.; Van Coillie, F.; Westra, T.; De Wulf, R. Synergy of Very High Resolution Optical and Radar Data for Object-Based Olive Grove Mapping. Int. J. Geogr. Inf. Sci. 2011, 25, 971–989.
- Li, H.; Huang, J.; Gu, Z.; He, D.; Huang, J.; Wang, C. Positioning of Mango Picking Point Using an Improved YOLOv8 Architecture with Object Detection and Instance Segmentation. Biosyst. Eng. 2024, 247, 202–220.
- Sun, C.; Huang, C.; Zhang, H.; Chen, B.; An, F.; Wang, L.; Yun, T. Individual Tree Crown Segmentation and Crown Width Extraction from a Heightmap Derived from Aerial Laser Scanning Data Using a Deep Learning Framework. Front. Plant Sci. 2022, 13, 914974.
- Shehzadi, T.; Hashmi, K.A.; Liwicki, M.; Stricker, D.; Afzal, M.Z. Object Detection with Transformers: A Review. Sensors 2025, 25, 6025.
- Kim, J.; Kim, S.; Ju, C.; Son, H.I. Unmanned Aerial Vehicles in Agriculture: A Review of Perspective of Platform, Control, and Applications. IEEE Access 2019, 7, 105100–105115.
- Barbedo, J.G.A. A Review on the Use of Unmanned Aerial Vehicles and Imaging Sensors for Monitoring and Assessing Plant Stresses. Drones 2019, 3, 40.
- Chemin, Y.H.; Beck, P.S.A. A Method to Count Olive Trees in Heterogenous Plantations from Aerial Photographs. Preprints 2017, 2017100170.
- Moreno-Garcia, J.; Linares, L.J.; Rodriguez-Benitez, L.; Solana-Cipres, C. Olive Trees Detection in Very High Resolution Images. In Proceedings of the International Conference on Information Processing and Management of Uncertainty in Knowledge-Based Systems; Springer: Berlin/Heidelberg, Germany, 2010.
- Khan, A.; Khan, U.; Waleed, M.; Khan, A.; Kamal, T.; Marwat, S.N.K.; Maqsood, M.; Aadil, F. Remote Sensing: An Automated Methodology for Olive Tree Detection and Counting in Satellite Images. IEEE Access 2018, 6, 77816–77828.
- Li, W.; Fu, H.; Yu, L. Deep Convolutional Neural Network Based Large-Scale Oil Palm Tree Detection for High-Resolution Remote Sensing Images. In Proceedings of the 2017 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Fort Worth, TX, USA, 23–28 July 2017; pp. 846–849.
- Jintasuttisak, T.; Edirisinghe, E.; Elbattay, A. Deep Neural Network Based Date Palm Tree Detection in Drone Imagery. Comput. Electron. Agric. 2022, 192, 106560.
- Putra, Y.C.; Wijayanto, A.W. Automatic Detection and Counting of Oil Palm Trees Using Remote Sensing and Object-Based Deep Learning. Remote Sens. Appl. Soc. Environ. 2023, 29, 100914.
- Chen, Y.; Xu, H.; Zhang, X.; Gao, P.; Xu, Z.; Huang, X. An Object Detection Method for Bayberry Trees Based on an Improved YOLO Algorithm. Int. J. Digit. Earth 2023, 16, 781–805.
- Li, S.; Tao, T.; Zhang, Y.; Li, M.; Qu, H. YOLO V7-CS: A YOLO v7-Based Model for Lightweight Bayberry Target Detection Count. Agronomy 2023, 13, 2952.
- Abozeid, A.; Alanazi, R.; Elhadad, A.; Taloba, A.I.; Abd El-Aziz, R.M. A Large-Scale Dataset and Deep Learning Model for Detecting and Counting Olive Trees in Satellite Imagery. Comput. Intell. Neurosci. 2022, 2022, 1549842.
- Ye, Z.; Wei, J.; Lin, Y.; Guo, Q.; Zhang, J.; Zhang, H.; Deng, H.; Yang, K. Extraction of Olive Crown Based on UAV Visible Images and the U2-Net Deep Learning Model. Remote Sens. 2022, 14, 1523.
- Ksibi, A.; Ayadi, M.; Soufiene, B.O.; Jamjoom, M.M.; Ullah, Z. MobiRes-Net: A Hybrid Deep Learning Model for Detecting and Classifying Olive Leaf Diseases. Appl. Sci. 2022, 12, 10278.
- Șandric, I.; Irimia, R.; Petropoulos, G.P.; Anand, A.; Srivastava, P.K.; Pleșoianu, A.; Faraslis, I.; Stateras, D.; Kalivas, D. Tree’s Detection & Health’s Assessment from Ultra-High Resolution UAV Imagery and Deep Learning. Geocarto Int. 2022, 37, 10459–10479.
- Mamalis, M.; Kalampokis, E.; Kalfas, I.; Tarabanis, K. Deep Learning for Detecting Verticillium Fungus in Olive Trees: Using YOLO in UAV Imagery. Algorithms 2023, 16, 343.
- Hnida, Y.; Mahraz, M.A.; Yahyaouy, A.; Achebour, A.; Riffi, J.; Tairi, H. Enhanced Multi-Scale Detection of Olive Tree Crowns in UAV Orthophotos Using a Deep Learning Architecture. Smart Agric. Technol. 2025, 12, 101126.
- Zhao, T.; Yang, Y.; Niu, H.; Wang, D.; Chen, Y. Comparing U-Net Convolutional Network with Mask R-CNN in the Performances of Pomegranate Tree Canopy Segmentation. In Proceedings of the Multispectral, Hyperspectral, and Ultraspectral Remote Sensing Technology, Techniques, and Applications VII; SPIE: Bellingham, WA, USA, 2018; Volume 10780, pp. 210–218.
- Safonova, A.; Guirado, E.; Maglinets, Y.; Alcaraz-Segura, D.; Tabik, S. Olive Tree Biovolume from UAV Multi-Resolution Image Segmentation with Mask R-CNN. Sensors 2021, 21, 1617.
- Abdallah, A.B.; Kallel, A.; Dammak, M.; Ali, A.B. Olive Tree and Shadow Instance Segmentation Based on Detectron2. In Proceedings of the 2022 6th International Conference on Advanced Technologies for Signal and Image Processing (ATSIP), Moncton, NB, Canada, 24–27 May 2022; pp. 1–5.
- Alshammari, H.H.; Shahin, O.R. An Efficient Deep Learning Mechanism for the Recognition of Olive Trees in Jouf Region. Comput. Intell. Neurosci. 2022, 2022, 9249530.
- Berni, J.A.J.; Zarco-Tejada, P.J.; Sepulcre-Cantó, G.; Fereres, E.; Villalobos, F. Mapping Canopy Conductance and CWSI in Olive Orchards Using High Resolution Thermal Remote Sensing Imagery. Remote Sens. Environ. 2009, 113, 2380–2388.
- Taal, S.r.l. Tree Detected. 2024. Available online: https://zenodo.org/records/13121962 (accessed on 15 February 2026).
- Atik, M.E.; Duran, Z.; Özgünlük, R. Comparison of YOLO Versions for Object Detection from Aerial Images. Int. J. Environ. Geoinform. 2022, 9, 87–93.
- Redmon, J.; Divvala, S.; Girshick, R.; Farhadi, A. You Only Look Once: Unified, Real-Time Object Detection. In Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Las Vegas, NV, USA, 2016; pp. 779–788.
- Murat, A.A.; Kiran, M.S. A Comprehensive Review on YOLO Versions for Object Detection. Eng. Sci. Technol. Int. J. 2025, 70, 102161.
- Kang, S.; Hu, Z.; Liu, L.; Zhang, K.; Cao, Z. Object Detection YOLO Algorithms and Their Industrial Applications: Overview and Comparative Analysis. Electronics 2025, 14, 1104.
- Ghahremani, A.; Adams, S.D.; Norton, M.; Khoo, S.Y.; Kouzani, A.Z. Detecting Defects in Solar Panels Using the YOLO V10 and V11 Algorithms. Electronics 2025, 14, 344.
- Sohan, M.; Sai Ram, T.; Rami Reddy, C.V. A Review on YOLOv8 and Its Advancements. In Proceedings of the Data Intelligence and Cognitive Informatics; Jacob, I.J., Piramuthu, S., Falkowski-Gilski, P., Eds.; Springer Nature: Singapore, 2024; pp. 529–545.
- Tian, Y.; Ye, Q.; Doermann, D. YOLOv12: Attention-Centric Real-Time Object Detectors. arXiv 2025, arXiv:2502.12524.
- Terven, J.; Córdova-Esparza, D.-M.; Romero-González, J.-A. A Comprehensive Review of YOLO Architectures in Computer Vision: From YOLOv1 to YOLOv8 and YOLO-NAS. Mach. Learn. Knowl. Extr. 2023, 5, 1680–1716.
- Adil Raja, M.; Loughran, R.; Mc Caffery, F. A Review of Performance of Recent YOLO Models on Cholecystectomy Tool Detection. Meas. Digit. 2025, 2–3, 100007.
- Zhao, Y.; Lv, W.; Xu, S.; Wei, J.; Wang, G.; Dang, Q.; Liu, Y.; Chen, J. DETRs Beat YOLOs on Real-Time Object Detection. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: Seattle, WA, USA, 2024; pp. 16965–16974.
- Kong, Y.; Shang, X.; Jia, S. Drone-DETR: Efficient Small Object Detection for Remote Sensing Image Using Enhanced RT-DETR Model. Sensors 2024, 24, 5496.
- Wang, S.; Xia, C.; Lv, F.; Shi, Y. RT-DETRv3: Real-Time End-to-End Object Detection with Hierarchical Dense Positive Supervision. In Proceedings of the 2025 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Tucson, AZ, USA, 26 February–6 March 2025; pp. 1628–1636.
- Lv, Z.; Dong, S.; Xia, Z.; He, J.; Zhang, J. Enhanced Real-Time Detection Transformer (RT-DETR) for Robotic Inspection of Underwater Bridge Pier Cracks. Autom. Constr. 2025, 170, 105921.
- Hu, J.; Zheng, J.; Wan, W.; Zhou, Y.; Huang, Z. RT-DETR-EVD: An Emergency Vehicle Detection Method Based on Improved RT-DETR. Sensors 2025, 25, 3327.
- Robinson, I.; Robicheaux, P.; Popov, M.; Ramanan, D.; Peri, N. RF-DETR: Neural Architecture Search for Real-Time Detection Transformers. arXiv 2025, arXiv:2511.09554.
- Dahiya, N.; Prakash, D.; Kundu, S.; Kuttan, S.R.; Suwalka, I.; Ayadi, M.; Dubale, M.; Hashmi, A. Optimised RFO Tuned RF-DETR Model for Precision Urine Microscopy for Renal and Systemic Disease Diagnosis. Sci. Rep. 2025, 15, 25842.
- Sapkota, R.; Cheppally, R.H.; Sharda, A.; Karkee, M. RF-DETR Object Detection vs YOLOv12: A Study of Transformer-Based and CNN-Based Architectures for Single-Class and Multi-Class Greenfruit Detection in Complex Orchard Environments Under Label Ambiguity. arXiv 2025, arXiv:2504.13099.
- Cepni, S.; Atik, M.E.; Duran, Z. Vehicle Detection Using Different Deep Learning Algorithms from Image Sequence. Balt. J. Mod. Comput. 2020, 8, 347–358.
- Isiler, M.; Yanalak, M.; Atik, M.E.; Atik, S.O.; Duran, Z. A Semi-Automated Two-Step Building Stock Monitoring Methodology for Supporting Immediate Solutions in Urban Issues. Sustainability 2023, 15, 8979.
- Achanta, R.; Shaji, A.; Smith, K.; Lucchi, A.; Fua, P.; Süsstrunk, S. SLIC Superpixels Compared to State-of-the-Art Superpixel Methods. IEEE Trans. Pattern Anal. Mach. Intell. 2012, 34, 2274–2282.
- Breiman, L. Random Forests. Mach. Learn. 2001, 45, 5–32.
- Carion, N.; Massa, F.; Synnaeve, G.; Usunier, N.; Kirillov, A.; Zagoruyko, S. End-to-End Object Detection with Transformers. In Proceedings of the Computer Vision—ECCV 2020; Vedaldi, A., Bischof, H., Brox, T., Frahm, J.-M., Eds.; Springer International Publishing: Cham, Switzerland, 2020; pp. 213–229.
- Weinstein, B.G.; Marconi, S.; Bohlman, S.; Zare, A.; White, E. Individual Tree-Crown Detection in RGB Imagery Using Semi-Supervised Deep Learning Neural Networks. Remote Sens. 2019, 11, 1309.
- Pantaleo, E.; Giannico, V.; Cilli, R.; Camposeo, S.; Elia, M.; Lafortezza, R.; Monaco, A.; Sanesi, G.; Tangaro, S.; Bellotti, R.; et al. Automated Olive Grove Classification and Tree Counting in Very High Resolution Aerial Imagery Using Deep Learning. Smart Agric. Technol. 2025, 12, 101551.
- Efron, B.; Tibshirani, R. An Introduction to the Bootstrap; Monographs on Statistics and Applied Probability; Chapman & Hall: Boca Raton, FL, USA, 1998.

Characteristics of the OTD and Yalova datasets (Section 3.1).

| Parameters | OTD Dataset | Yalova Dataset |
|---|---|---|
| Data source | Public dataset | Generated for this study |
| UAV platform | DJI Mavic 3M | DJI Mavic 3M |
| Image type | Aerial RGB images | Orthophoto tiles |
| Spatial resolution | ~2 cm/pixel | ~3 cm/pixel |
| Image size | 960 × 640 pixels | 640 × 640 pixels |
| Annotation type (original) | Bounding box + segment | Segment |
| Augmentation techniques | Flip, ±90° and 180° rotation, ±15–20° rotation, crop, resize | Flip, ±10° shear, saturation (−30% to +30%), Gaussian noise (≤0.7%), rotation, reflection |
| Training images (original) | 1338 | 250 |
| Training images (augmented) | 9513 | 750 |
| Validation images | 385 | 36 |
| Test images | 202 | 69 |

Olive tree detection results on the OTD dataset (Section 4.1).

| Model | FPS | GFLOPs | Parameters | Precision | Recall | mAP50 | mAP50-95 |
|---|---|---|---|---|---|---|---|
| YOLOv8n | 120.48 | 11.3 | 3,258,259 | 0.946 | 0.870 | 0.939 | 0.870 |
| YOLOv8s | 135.13 | 39.9 | 11,779,987 | 0.939 | 0.879 | 0.941 | 0.873 |
| YOLOv8m | 120.48 | 104.3 | 27,222,963 | 0.947 | 0.859 | 0.934 | 0.877 |
| YOLOv8l | 90.91 | 210.1 | 45,912,659 | 0.957 | 0.862 | 0.940 | 0.880 |
| YOLOv8x | 60.98 | 327.9 | 71,721,619 | 0.935 | 0.860 | 0.939 | 0.879 |
| YOLOv10n | 769.23 | 6.5 | 2,265,363 | 0.958 | 0.888 | 0.963 | 0.819 |
| YOLOv10s | 500.00 | 21.4 | 7,218,387 | 0.956 | 0.889 | 0.956 | 0.828 |
| YOLOv10m | 277.78 | 58.9 | 15,313,747 | 0.964 | 0.876 | 0.949 | 0.830 |
| YOLOv10b | 243.90 | 91.6 | 19,004,883 | 0.954 | 0.880 | 0.946 | 0.825 |
| YOLOv10l | 204.08 | 120.0 | 24,310,099 | 0.953 | 0.886 | 0.949 | 0.829 |
| YOLOv10x | 147.06 | 160.0 | 29,397,491 | 0.959 | 0.882 | 0.948 | 0.827 |
| YOLOv11n | 588.23 | 6.3 | 2,582,347 | 0.943 | 0.900 | 0.964 | 0.812 |
| YOLOv11s | 370.37 | 21.3 | 9,413,187 | 0.948 | 0.895 | 0.954 | 0.828 |
| YOLOv11m | 243.90 | 67.6 | 20,030,803 | 0.957 | 0.883 | 0.944 | 0.822 |
| YOLOv11l | 222.22 | 86.6 | 25,280,083 | 0.961 | 0.890 | 0.947 | 0.830 |
| YOLOv11x | 217.39 | 194.4 | 56,828,179 | 0.958 | 0.884 | 0.945 | 0.825 |
| YOLOv12n | 307.37 | 6.3 | 2,556,923 | 0.952 | 0.896 | 0.966 | 0.820 |
| YOLOv12s | 294.12 | 21.2 | 9,231,267 | 0.963 | 0.892 | 0.958 | 0.834 |
| YOLOv12m | 204.08 | 67.1 | 20,105,683 | 0.959 | 0.896 | 0.954 | 0.835 |
| YOLOv12l | 163.93 | 88.5 | 26,339,843 | 0.950 | 0.899 | 0.957 | 0.836 |
| YOLOv12x | 97.09 | 198.5 | 59,044,499 | 0.967 | 0.886 | 0.953 | 0.834 |
| RT-DETR | 106.39 | 103.4 | 31,985,795 | 0.943 | 0.915 | 0.976 | 0.771 |
| RF-DETR | 66.21 | 76.3 | 31,854,308 | 0.945 | 0.739 | 0.963 | 0.788 |
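The mAP columns in these tables are IoU-based: a detection counts as a true positive when its overlap with a ground-truth crown exceeds a threshold. A minimal sketch of box IoU and the COCO-style threshold sweep that distinguishes mAP50 from mAP50-95; these are the standard definitions, not code from the study.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)  # overlap area, 0 if disjoint
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

# mAP50 scores detections at a single IoU threshold of 0.50, while
# mAP50-95 averages AP over thresholds 0.50, 0.55, ..., 0.95 (COCO convention).
COCO_THRESHOLDS = [0.50 + 0.05 * i for i in range(10)]
```

For example, two equal-sized boxes sharing half their width, such as (0, 0, 10, 10) and (5, 0, 15, 10), have IoU 1/3 and therefore do not count as a match at any of the COCO thresholds.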

Segmentation results on the OTD dataset (Section 4.2).

| Model | FPS | GFLOPs | Parameters | Precision | Recall | mAP50 | mAP50-95 |
|---|---|---|---|---|---|---|---|
| YOLOv8n | 666.66 | 12.0 | 3,258,269 | 0.949 | 0.920 | 0.968 | 0.898 |
| YOLOv8s | 286.71 | 42.4 | 11,779,987 | 0.951 | 0.919 | 0.968 | 0.899 |
| YOLOv8m | 163.93 | 110.0 | 27,222,963 | 0.951 | 0.929 | 0.968 | 0.896 |
| YOLOv8l | 111.11 | 220.1 | 45,912,659 | 0.954 | 0.923 | 0.969 | 0.902 |
| YOLOv8x | 62.50 | 343.7 | 71,721,619 | 0.944 | 0.921 | 0.968 | 0.906 |
| YOLOv10n | 769.23 | 6.5 | 2,265,363 | 0.959 | 0.936 | 0.983 | 0.841 |
| YOLOv10s | 500.00 | 21.4 | 7,218,387 | 0.952 | 0.940 | 0.981 | 0.842 |
| YOLOv10m | 277.78 | 58.9 | 15,313,747 | 0.944 | 0.953 | 0.984 | 0.845 |
| YOLOv10b | 243.90 | 91.6 | 19,004,883 | 0.955 | 0.937 | 0.983 | 0.842 |
| YOLOv10l | 204.08 | 120.0 | 24,310,099 | 0.951 | 0.940 | 0.983 | 0.840 |
| YOLOv10x | 147.06 | 160.0 | 29,397,491 | 0.951 | 0.940 | 0.983 | 0.840 |
| YOLOv11n | 588.23 | 6.3 | 2,582,347 | 0.960 | 0.953 | 0.987 | 0.841 |
| YOLOv11s | 370.37 | 21.3 | 9,413,187 | 0.951 | 0.965 | 0.987 | 0.847 |
| YOLOv11m | 243.90 | 67.6 | 20,030,803 | 0.955 | 0.950 | 0.984 | 0.846 |
| YOLOv11l | 222.22 | 86.6 | 25,280,083 | 0.957 | 0.942 | 0.982 | 0.841 |
| YOLOv11x | 217.39 | 194.4 | 56,828,179 | 0.933 | 0.955 | 0.980 | 0.828 |
| YOLOv12n | 307.37 | 6.3 | 2,556,923 | 0.939 | 0.965 | 0.986 | 0.840 |
| YOLOv12s | 294.12 | 21.2 | 9,231,267 | 0.949 | 0.961 | 0.987 | 0.845 |
| YOLOv12m | 204.08 | 67.1 | 20,105,683 | 0.938 | 0.962 | 0.984 | 0.839 |
| YOLOv12l | 163.93 | 88.5 | 26,339,843 | 0.952 | 0.949 | 0.983 | 0.840 |
| YOLOv12x | 97.09 | 198.5 | 59,044,499 | 0.938 | 0.954 | 0.982 | 0.825 |
| RT-DETR | 106.39 | 103.4 | 31,985,795 | 0.939 | 0.945 | 0.977 | 0.811 |
| RF-DETR | 66.21 | 76.3 | 31,854,308 | 0.913 | 0.724 | 0.977 | 0.806 |

Performance of the YOLOv8 and YOLOv11 models on the OTD dataset without and with data augmentation.

| Model | Precision (w/o aug.) | Recall (w/o aug.) | mAP50 (w/o aug.) | mAP50-95 (w/o aug.) | Precision (aug.) | Recall (aug.) | mAP50 (aug.) | mAP50-95 (aug.) |
|---|---|---|---|---|---|---|---|---|
| YOLOv8n | 0.947 | 0.907 | 0.956 | 0.833 | 0.924 | 0.850 | 0.916 | 0.793 |
| YOLOv8s | 0.948 | 0.908 | 0.955 | 0.830 | 0.921 | 0.857 | 0.919 | 0.796 |
| YOLOv8m | 0.943 | 0.917 | 0.955 | 0.832 | 0.928 | 0.839 | 0.914 | 0.807 |
| YOLOv8l | 0.943 | 0.910 | 0.954 | 0.835 | 0.936 | 0.844 | 0.921 | 0.806 |
| YOLOv8x | 0.951 | 0.898 | 0.953 | 0.842 | 0.924 | 0.835 | 0.917 | 0.806 |
| YOLOv11n | 0.956 | 0.902 | 0.955 | 0.815 | 0.917 | 0.845 | 0.922 | 0.796 |
| YOLOv11s | 0.941 | 0.904 | 0.949 | 0.824 | 0.936 | 0.843 | 0.923 | 0.812 |
| YOLOv11m | 0.935 | 0.913 | 0.954 | 0.831 | 0.923 | 0.850 | 0.919 | 0.805 |
| YOLOv11l | 0.936 | 0.915 | 0.954 | 0.841 | 0.945 | 0.836 | 0.924 | 0.790 |
| YOLOv11x | 0.945 | 0.906 | 0.954 | 0.836 | 0.921 | 0.843 | 0.919 | 0.789 |
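Section 5.2 reports a statistical analysis of the mAP values, and the reference list includes Efron and Tibshirani's bootstrap monograph. A minimal sketch of a percentile bootstrap confidence interval for a mean mAP, assuming that method was intended; the sample values below are illustrative, not taken from the tables.

```python
import random

def bootstrap_ci(samples, n_boot=10000, alpha=0.05, seed=42):
    """Percentile bootstrap (1 - alpha) confidence interval for the mean of `samples`."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    means = sorted(
        sum(rng.choices(samples, k=len(samples))) / len(samples)  # resample with replacement
        for _ in range(n_boot)
    )
    lo = means[int(alpha / 2 * n_boot)]
    hi = means[int((1 - alpha / 2) * n_boot) - 1]
    return lo, hi

# hypothetical mAP values for one model family
map_values = [0.939, 0.941, 0.934, 0.940, 0.939]
low, high = bootstrap_ci(map_values)
```

The percentile interval simply reads off quantiles of the resampled means; with only a handful of mAP values per model, the interval is wide and should be interpreted cautiously.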

Olive tree detection results on the Yalova dataset (Section 4.3).

| Model | FPS | GFLOPs | Parameters | Precision | Recall | mAP50 | mAP50-95 |
|---|---|---|---|---|---|---|---|
| YOLOv8n | 312.50 | 12.0 | 3,258,269 | 0.858 | 0.780 | 0.844 | 0.630 |
| YOLOv8s | 192.31 | 42.4 | 11,779,987 | 0.835 | 0.789 | 0.856 | 0.622 |
| YOLOv8m | 103.09 | 110.0 | 27,222,963 | 0.857 | 0.739 | 0.849 | 0.616 |
| YOLOv8l | 56.82 | 220.1 | 45,912,659 | 0.818 | 0.828 | 0.878 | 0.676 |
| YOLOv8x | 35.67 | 343.7 | 71,721,619 | 0.838 | 0.836 | 0.862 | 0.623 |
| YOLOv10n | 357.14 | 6.5 | 2,265,363 | 0.758 | 0.761 | 0.779 | 0.548 |
| YOLOv10s | 294.12 | 21.4 | 7,218,387 | 0.853 | 0.727 | 0.800 | 0.564 |
| YOLOv10m | 185.19 | 58.9 | 15,313,747 | 0.822 | 0.703 | 0.758 | 0.541 |
| YOLOv10b | 120.48 | 91.6 | 19,004,883 | 0.786 | 0.718 | 0.772 | 0.549 |
| YOLOv10l | 95.24 | 120.0 | 24,310,099 | 0.828 | 0.705 | 0.771 | 0.556 |
| YOLOv10x | 68.97 | 160.0 | 29,397,491 | 0.792 | 0.711 | 0.763 | 0.535 |
| YOLOv11n | 294.12 | 6.3 | 2,582,347 | 0.864 | 0.762 | 0.858 | 0.640 |
| YOLOv11s | 200.00 | 21.3 | 9,413,187 | 0.838 | 0.804 | 0.861 | 0.652 |
| YOLOv11m | 100.00 | 67.6 | 20,030,803 | 0.855 | 0.798 | 0.878 | 0.660 |
| YOLOv11l | 89.29 | 86.6 | 25,280,083 | 0.847 | 0.797 | 0.869 | 0.677 |
| YOLOv11x | 46.73 | 194.4 | 56,828,179 | 0.849 | 0.824 | 0.877 | 0.680 |
| YOLOv12n | 250.00 | 6.3 | 2,556,923 | 0.821 | 0.736 | 0.801 | 0.575 |
| YOLOv12s | 208.33 | 21.2 | 9,231,267 | 0.801 | 0.717 | 0.771 | 0.571 |
| YOLOv12m | 137.00 | 67.1 | 20,105,683 | 0.816 | 0.754 | 0.801 | 0.589 |
| YOLOv12l | 94.34 | 88.5 | 26,339,843 | 0.774 | 0.728 | 0.799 | 0.583 |
| YOLOv12x | 51.81 | 198.5 | 59,044,499 | 0.769 | 0.749 | 0.770 | 0.569 |
| RT-DETR | 102.04 | 103.4 | 31,985,795 | 0.812 | 0.718 | 0.775 | 0.548 |
| RF-DETR | 74.13 | 76.3 | 31,854,308 | 0.783 | 0.793 | 0.875 | 0.666 |

Segmentation results on the Yalova dataset (Section 4.4).

| Model | FPS | GFLOPs | Parameters | Precision | Recall | mAP50 | mAP50-95 |
|---|---|---|---|---|---|---|---|
| YOLOv8n | 285.71 | 12.0 | 3,258,269 | 0.845 | 0.751 | 0.835 | 0.636 |
| YOLOv8s | 181.82 | 42.4 | 11,779,987 | 0.851 | 0.780 | 0.863 | 0.653 |
| YOLOv8m | 100.00 | 110.0 | 27,222,963 | 0.857 | 0.739 | 0.849 | 0.616 |
| YOLOv8l | 58.83 | 220.1 | 45,912,659 | 0.881 | 0.798 | 0.884 | 0.676 |
| YOLOv8x | 37.73 | 343.7 | 71,721,619 | 0.875 | 0.803 | 0.884 | 0.679 |
| YOLOv10n | 312.50 | 6.5 | 2,265,363 | 0.728 | 0.728 | 0.758 | 0.555 |
| YOLOv10s | 416.67 | 21.4 | 7,218,387 | 0.766 | 0.763 | 0.806 | 0.575 |
| YOLOv10m | 208.33 | 58.9 | 15,313,747 | 0.765 | 0.703 | 0.776 | 0.563 |
| YOLOv10b | 121.95 | 91.6 | 19,004,883 | 0.717 | 0.710 | 0.727 | 0.501 |
| YOLOv10l | 94.34 | 120.0 | 24,310,099 | 0.772 | 0.743 | 0.773 | 0.556 |
| YOLOv10x | 69.44 | 160.0 | 29,397,491 | 0.750 | 0.719 | 0.748 | 0.544 |
| YOLOv11n | 333.33 | 6.3 | 2,582,347 | 0.815 | 0.798 | 0.859 | 0.632 |
| YOLOv11s | 238.05 | 21.3 | 9,413,187 | 0.824 | 0.823 | 0.863 | 0.684 |
| YOLOv11m | 144.93 | 67.6 | 20,030,803 | 0.853 | 0.785 | 0.866 | 0.675 |
| YOLOv11l | 125.00 | 86.6 | 25,280,083 | 0.789 | 0.845 | 0.875 | 0.679 |
| YOLOv11x | 46.51 | 194.4 | 56,828,179 | 0.892 | 0.772 | 0.878 | 0.686 |
| YOLOv12n | 270.27 | 6.3 | 2,556,923 | 0.739 | 0.785 | 0.788 | 0.563 |
| YOLOv12s | 200.00 | 21.2 | 9,231,267 | 0.750 | 0.760 | 0.785 | 0.582 |
| YOLOv12m | 125.00 | 67.1 | 20,105,683 | 0.794 | 0.731 | 0.775 | 0.570 |
| YOLOv12l | 92.59 | 88.5 | 26,339,843 | 0.765 | 0.725 | 0.782 | 0.578 |
| YOLOv12x | 52.08 | 198.5 | 59,044,499 | 0.789 | 0.713 | 0.781 | 0.572 |
| RT-DETR | 104.67 | 103.4 | 31,985,795 | 0.772 | 0.725 | 0.775 | 0.561 |
| RF-DETR | 74.70 | 76.3 | 31,854,308 | 0.819 | 0.794 | 0.875 | 0.667 |

Performance of the YOLOv8 and YOLOv11 models on the Yalova dataset without and with data augmentation.

| Model | Precision (w/o aug.) | Recall (w/o aug.) | mAP50 (w/o aug.) | mAP50-95 (w/o aug.) | Precision (aug.) | Recall (aug.) | mAP50 (aug.) | mAP50-95 (aug.) |
|---|---|---|---|---|---|---|---|---|
| YOLOv8n | 0.843 | 0.745 | 0.827 | 0.589 | 0.847 | 0.771 | 0.824 | 0.584 |
| YOLOv8s | 0.840 | 0.768 | 0.843 | 0.588 | 0.866 | 0.748 | 0.837 | 0.591 |
| YOLOv8m | 0.830 | 0.774 | 0.854 | 0.599 | 0.845 | 0.736 | 0.828 | 0.571 |
| YOLOv8l | 0.868 | 0.795 | 0.874 | 0.608 | 0.815 | 0.815 | 0.863 | 0.620 |
| YOLOv8x | 0.863 | 0.792 | 0.870 | 0.622 | 0.808 | 0.812 | 0.827 | 0.568 |
| YOLOv11n | 0.804 | 0.786 | 0.841 | 0.578 | 0.850 | 0.748 | 0.836 | 0.574 |
| YOLOv11s | 0.827 | 0.812 | 0.856 | 0.621 | 0.840 | 0.789 | 0.849 | 0.581 |
| YOLOv11m | 0.851 | 0.786 | 0.855 | 0.619 | 0.851 | 0.780 | 0.857 | 0.611 |
| YOLOv11l | 0.786 | 0.842 | 0.867 | 0.623 | 0.831 | 0.781 | 0.849 | 0.624 |
| YOLOv11x | 0.884 | 0.754 | 0.863 | 0.614 | 0.835 | 0.809 | 0.860 | 0.618 |

Tree crown size analysis (Section 4.5): deviations of estimated crown areas (m²) on the OTD and Yalova datasets.

| Model | OTD with Augmentation (m²) | OTD Without Augmentation (m²) | Yalova with Augmentation (m²) | Yalova Without Augmentation (m²) |
|---|---|---|---|---|
| YOLOv8n | ±0.478 | ±0.915 | ±0.556 | ±0.624 |
| YOLOv8s | ±0.395 | ±0.842 | ±0.473 | ±0.519 |
| YOLOv8m | ±0.312 | ±0.785 | ±0.419 | ±0.462 |
| YOLOv8l | ±0.288 | ±0.714 | ±0.386 | ±0.418 |
| YOLOv8x | ±0.245 | ±0.672 | ±0.342 | ±0.371 |
| YOLOv11n | ±0.210 | ±0.745 | ±0.315 | ±0.348 |
| YOLOv11s | ±0.192 | ±0.688 | ±0.287 | ±0.312 |
| YOLOv11m | ±0.165 | ±0.610 | ±0.251 | ±0.273 |
| YOLOv11l | ±0.148 | ±0.552 | ±0.228 | ±0.245 |
| YOLOv11x | ±0.131 | ±0.514 | ±0.204 | ±0.221 |
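Section 4.5 analyzes tree crown sizes, and RMSE appears in the abbreviation list, so the ± values above plausibly summarize area errors in m². A minimal sketch of RMSE between predicted and reference crown areas; the arrays below are hypothetical, not data from the study.

```python
import math

def rmse(pred, ref):
    """Root mean square error between predicted and reference values (here crown areas, m^2)."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(pred))

# hypothetical crown areas (m^2) for three trees
pred_areas = [12.4, 9.8, 15.1]
ref_areas = [12.0, 10.1, 14.9]
error = rmse(pred_areas, ref_areas)
```

RMSE penalizes large area errors quadratically, which suits crown-size assessment where a few badly segmented crowns matter more than many small deviations.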
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Atik, M.E.; Arkali, M. Benchmarking YOLO and Transformer-Based Detectors for Olive Tree Crown Identification in UAV Imagery. Geomatics 2026, 6, 22. https://doi.org/10.3390/geomatics6020022