SkyPin: Benchmarking Target Geo-Localization from UAV Imagery on 2.5D Maps
Highlights
- We introduce SkyPin, the first multi-modal benchmark dataset for UAV passive target geo-localization. It integrates 2.5D maps with centimeter-level RTK ground truth, effectively filling the gap in high-quality evaluation data.
- We propose a visual-geometric localization pipeline that reformulates the highly pose-sensitive projection problem as a cross-view feature alignment task. In a comprehensive evaluation, RoMa combined with raytracing is the best-performing baseline, reaching a Recall@5m of 0.90 on satellite maps and 0.94 on UAV maps for RGB queries.
- The proposed framework reduces the reliance on highly accurate prior UAV poses, offering a practical path toward deploying low-cost drones in complex environments.
- The error sources identified in the benchmark, particularly under large-tilt, low-texture, and thermal cross-modal conditions, delineate the limitations of existing methods and point to specific directions for improving localization robustness.
Abstract
1. Introduction
1. We introduce the first dataset for UAV passive target geo-localization, which includes 2.5D maps, multi-modal aerial imagery from drones, and centimeter-level ground-truth coordinates of the targets.
2. We propose an image-matching-based pipeline and apply it to UAV-based target geo-localization within 2.5D maps.
3. Using the proposed pipeline, we extensively evaluate multiple feature matching and geometric projection methods and derive practical recommendations to guide future research.
2. Related Work
2.1. Target Geo-Localization
2.2. Target Localization Dataset
2.3. UAV–Satellite Image Matching
3. SkyPin Dataset
3.1. Dataset Overview
3.2. Dataset Collection and Processing
3.3. Dataset Scale and Characteristics
1. Multi-modal query imagery: The dataset provides synchronously captured infrared and RGB data, establishing a foundation for cross-modal UAV-based target geo-localization research.
2. Precise target geo-localization labeling: Centimeter-level positioning for target points is achieved using RTK equipment, a level of accuracy that surpasses most existing datasets.
3. Multi-source map and multi-perspective queries: The dataset incorporates both satellite and UAV reference maps, complemented by aerial queries captured from multiple altitudes and viewpoints around each target, thus providing a rich experimental environment.
4. Method
4.1. Reference Map Cropping
4.2. Feature Matching Module
4.3. Coordinate Estimation Algorithms
- Homography-based warping
- PnP-based raytracing
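The two coordinate estimation options listed above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the reference map is assumed to be north-up with a simple origin and ground sampling distance, the DSM is a row-major grid of heights, and the camera pose for the raytracing branch is assumed to have been recovered beforehand (for example by a PnP solver on matched 2D-3D correspondences). All function names, the pinhole parameters `fx, fy, cx, cy`, and the fixed-step ray march are hypothetical.

```python
import math

def map_pixel_to_world(col, row, origin_x, origin_y, gsd):
    """North-up map (assumption): x grows with column, y shrinks with row."""
    return origin_x + col * gsd, origin_y - row * gsd

def world_to_map_pixel(x, y, origin_x, origin_y, gsd):
    """Inverse of map_pixel_to_world."""
    return (x - origin_x) / gsd, (origin_y - y) / gsd

def sample_dsm(dsm, col, row):
    """Nearest-neighbour height lookup; returns None outside the grid."""
    r, c = int(round(row)), int(round(col))
    if 0 <= r < len(dsm) and 0 <= c < len(dsm[0]):
        return dsm[r][c]
    return None

def locate_by_homography(H, u, v, dsm, origin_x, origin_y, gsd):
    """Warp query pixel (u, v) into the map with a 3x3 homography H,
    then read the target height from the DSM at the warped position."""
    w = H[2][0] * u + H[2][1] * v + H[2][2]
    col = (H[0][0] * u + H[0][1] * v + H[0][2]) / w
    row = (H[1][0] * u + H[1][1] * v + H[1][2]) / w
    x, y = map_pixel_to_world(col, row, origin_x, origin_y, gsd)
    return x, y, sample_dsm(dsm, col, row)

def locate_by_raytrace(cam_center, R_cam_to_world, fx, fy, cx, cy, u, v,
                       dsm, origin_x, origin_y, gsd,
                       step=0.5, max_range=2000.0):
    """Cast a ray through pixel (u, v) and march along it until it drops
    below the DSM surface. Returns (x, y, z) or None if nothing is hit."""
    # Ray direction in the camera frame (x right, y down, z forward).
    d_cam = ((u - cx) / fx, (v - cy) / fy, 1.0)
    # Rotate into the world frame and normalise to unit length.
    d = [sum(R_cam_to_world[i][j] * d_cam[j] for j in range(3))
         for i in range(3)]
    norm = math.sqrt(sum(c * c for c in d))
    d = [c / norm for c in d]
    t = 0.0
    while t <= max_range:
        x, y, z = (cam_center[i] + t * d[i] for i in range(3))
        col, row = world_to_map_pixel(x, y, origin_x, origin_y, gsd)
        h = sample_dsm(dsm, col, row)
        if h is not None and z <= h:
            return x, y, h
        t += step
    return None

# Toy scene: flat 100 x 100 DSM at 10 m, camera 100 m above it, looking down.
flat_dsm = [[10.0] * 100 for _ in range(100)]
nadir = [[1, 0, 0], [0, -1, 0], [0, 0, -1]]  # camera z axis points to -Z
hit = locate_by_raytrace((50.0, 50.0, 110.0), nadir, 1000.0, 1000.0,
                         500.0, 500.0, 500.0, 500.0,
                         flat_dsm, 0.0, 100.0, 1.0)
print(hit)
```

The homography branch treats the scene as locally planar and only consults the DSM for height, whereas the raytracing branch intersects the viewing ray with the surface itself; the accuracy of the fixed-step march is bounded by `step`, so a real system would likely refine the intersection.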
5. Results and Discussion
5.1. Experimental Methodology
- Median 2D Error (m): The median Euclidean distance between the predicted and ground-truth XY coordinates.
- Median Height Error (m): The median absolute difference between the predicted and ground-truth heights.
- 2D Recall@5m (Recall@5m): The percentage of queries whose predicted XY positions fall within 5 m of the ground truth, computed by averaging over all queries an indicator function that returns 1 if the condition is satisfied and 0 otherwise. In geo-localization tasks such as search-and-rescue, a 5-m XY tolerance is generally sufficient for practical operations, making Recall@5m a meaningful metric.
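The three metrics above can be computed along the following lines. This is a hedged sketch: the function name `evaluate`, the `(x, y, z)` tuple layout, and the convention of scoring a failed query as an infinite error are assumptions made for illustration (the last one is only a guess at why some table entries read `inf`).

```python
import math
from statistics import median

def evaluate(predictions, ground_truth, radius_m=5.0):
    """Compute median 2D error, median height error, and Recall@radius.

    predictions: list of (x, y, z) tuples in metres, or None when the
                 pipeline produced no estimate (assumed convention).
    ground_truth: list of (x, y, z) tuples in the same frame.
    """
    err_2d, err_h = [], []
    for pred, gt in zip(predictions, ground_truth):
        if pred is None:
            # Assumption: a failed localization counts as infinite error.
            err_2d.append(math.inf)
            err_h.append(math.inf)
        else:
            err_2d.append(math.hypot(pred[0] - gt[0], pred[1] - gt[1]))
            err_h.append(abs(pred[2] - gt[2]))
    # Indicator function averaged over all queries.
    recall = sum(e <= radius_m for e in err_2d) / len(err_2d)
    return {
        "median_2d_m": median(err_2d),
        "median_height_m": median(err_h),
        "recall": recall,
    }

preds = [(3.0, 4.0, 1.0), (0.0, 1.0, 0.0), None]
gts = [(0.0, 0.0, 0.0)] * 3
scores = evaluate(preds, gts)
print(scores)
```

Because the median is insensitive to a minority of gross failures, reporting it together with Recall@5m separates typical accuracy from failure rate.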
5.2. Localization Performance on RGB Data
5.3. Localization Performance on Thermal Data
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| UAV | Unmanned aerial vehicle |
| RTK | Real-Time Kinematic |
| TIR | Thermal infrared |
| DOM | Digital orthophoto map |
| DSM | Digital surface model |
| PnP | Perspective-n-Point |
References
1. Cai, Y.; Zhou, Y.; Zhang, H.; Xia, Y.; Qiao, P.; Zhao, J. Review of Target Geo-Location Algorithms for Aerial Remote Sensing Cameras without Control Points. Appl. Sci. 2022, 12, 12689.
2. Stich, E.J. Geo-Pointing and Threat Location Techniques for Airborne Border Surveillance. In Proceedings of the 2013 IEEE International Conference on Technologies for Homeland Security (HST); IEEE: New York, NY, USA, 2013; pp. 136–140.
3. Liu, X.; Teng, X.; Li, Z.; Yu, Q.; Bian, Y. A Fast Algorithm for High Accuracy Airborne SAR Geolocation Based on Local Linear Approximation. IEEE Trans. Instrum. Meas. 2022, 71, 1–12.
4. Jin, G.; Dong, Z.; He, F.; Yu, A. Background-Free Ground Moving Target Imaging for Multi-PRF Airborne SAR. IEEE Trans. Geosci. Remote Sens. 2019, 57, 1949–1962.
5. Tang, X.; Zhang, X.; Shi, J.; Wei, S.; Pu, L. Ground Slowly Moving Target Detection and Velocity Estimation via High-Speed Platform Dual-Beam Synthetic Aperture Radar. J. Appl. Remote Sens. 2019, 13, 026516.
6. Jin, M.; Bai, Y.; Devys, E.; Di, L. Toward a Standardized Encoding of Remote Sensing Geo-Positioning Sensor Models. Remote Sens. 2020, 12, 1530.
7. Paulin, G.; Sambolek, S.; Ivasic-Kos, M. Application of Raycast Method for Person Geolocalization and Distance Determination Using UAV Images in Real-World Land Search and Rescue Scenarios. Expert Syst. Appl. 2024, 237, 121495.
8. Qiao, C.; Ding, Y.; Xu, Y.; Xiu, J. Ground Target Geolocation Based on Digital Elevation Model for Airborne Wide-Area Reconnaissance System. J. Appl. Remote Sens. 2018, 12, 016004.
9. DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperPoint: Self-Supervised Interest Point Detection and Description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops (CVPRW); IEEE: New York, NY, USA, 2018; pp. 224–236.
10. Lindenberger, P.; Sarlin, P.E.; Pollefeys, M. LightGlue: Local Feature Matching at Light Speed. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; IEEE: New York, NY, USA, 2023; pp. 17581–17592.
11. Wang, Y.; He, X.; Peng, S.; Tan, D.; Zhou, X. Efficient LoFTR: Semi-Dense Local Feature Matching with Sparse-Like Speed. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2024; pp. 21666–21675.
12. Edstedt, J.; Sun, Q.; Bökman, G.; Wadenbäck, M.; Felsberg, M. RoMa: Robust Dense Feature Matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2024; pp. 19790–19800.
13. Sun, J.; Shen, Z.; Wang, Y.; Bao, H.; Zhou, X. LoFTR: Detector-Free Local Feature Matching with Transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2021; pp. 8922–8931.
14. Tuzcuoğlu, Ö.; Köksal, A.; Sofu, B.; Kalkan, S.; Alatan, A.A. XoFTR: Cross-modal Feature Matching Transformer. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2024; pp. 4275–4286.
15. Abdeen, M.A.R.; Nemer, I.A.; Sheltami, T.R. A Balanced Algorithm for In-City Parking Allocation: A Case Study of al Madinah City. Sensors 2021, 21, 3148.
16. Su, Y.; Liu, J. Research on UAV Target Location Algorithm of Linear Frequency Modulated Continuous Wave Laser Ranging Method. In Proceedings of the International Conference on Cognitive Computation and Systems; Springer: Singapore, 2023; pp. 107–122.
17. Wang, X.; Liu, J.; Zhou, Q. Real-Time Multi-Target Localization from Unmanned Aerial Vehicles. Sensors 2016, 17, 33.
18. Qiao, C.; Ding, Y.l.; Xu, Y.s.; Xiu, J.h.; Du, Y.l. Ground target geo-location using imaging aerial camera with large inclined angles. Opt. Precis. Eng. 2017, 25, 1714.
19. Bai, G.; Song, Y.; Zuo, Y.; Song, M.; Wang, X. Multitarget Location Capable of Adapting to Complex Geomorphic Environment for the Airborne Photoelectric Reconnaissance System. J. Appl. Remote Sens. 2020, 14, 036510.
20. El Habchi, A.; Moumen, Y.; Zerrouk, I.; Khiati, W.; Berrich, J.; Bouchentouf, T. CGA: A New Approach to Estimate the Geolocation of a Ground Target from Drone Aerial Imagery. In Proceedings of the 2020 Fourth International Conference On Intelligent Computing in Data Sciences (ICDS); IEEE: New York, NY, USA, 2020; pp. 1–4.
21. Huang, C.; Zhang, H.; Zhao, J. High-Efficiency Determination of Coastline by Combination of Tidal Level and Coastal Zone DEM from UAV Tilt Photogrammetry. Remote Sens. 2020, 12, 2189.
22. Cheng, B.T. A Simulation of Wide Area Surveillance (WAS) Systems and Algorithm for Digital Elevation Model (DEM) Extraction. In Proceedings of the Airborne Intelligence, Surveillance, Reconnaissance (ISR) Systems and Applications VII; SPIE: Bellingham, WA, USA, 2010; Volume 7668, pp. 90–104.
23. Yang, A.; Li, X.; Xie, J.; Wei, Y. Three-Dimensional Panoramic Terrain Reconstruction from Aerial Imagery. J. Appl. Remote Sens. 2013, 7, 073497.
24. Belkhouche, Y.; Duraisamy, P.; Buckles, B. Graph-Connected Components for Filtering Urban LiDAR Data. J. Appl. Remote Sens. 2015, 9, 096075.
25. Athmania, D.; Achour, H. External Validation of the ASTER GDEM2, GMTED2010 and CGIAR-CSI-SRTM v4.1 Free Access Digital Elevation Models (DEMs) in Tunisia and Algeria. Remote Sens. 2014, 6, 4600–4620.
26. Sambolek, S.; Ivasic-Kos, M. Automatic Person Detection in Search and Rescue Operations Using Deep CNN Detectors. IEEE Access 2021, 9, 37905–37922.
27. Mueller, M.; Smith, N.; Ghanem, B. A Benchmark and Simulator for UAV Tracking. In Proceedings of the European Conference on Computer Vision (ECCV); Springer: Cham, Switzerland, 2016; pp. 445–461.
28. Zhu, P.; Wen, L.; Bian, X.; Ling, H.; Hu, Q. Vision meets drones: A challenge. arXiv 2018, arXiv:1804.07437.
29. Xu, W.; Yao, Y.; Cao, J.; Wei, Z.; Liu, C.; Wang, J.; Peng, M. UAV-VisLoc: A Large-Scale Dataset for UAV Visual Localization. arXiv 2024, arXiv:2405.11936.
30. Xiao, J.; Tortei, D.; Roura, E.; Loianno, G. Long-Range UAV Thermal Geo-Localization with Satellite Imagery. In Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS); IEEE: New York, NY, USA, 2023; pp. 5820–5827.
31. Morbidi, F.; Mariottini, G.L. Active Target Tracking and Cooperative Localization for Teams of Aerial Vehicles. IEEE Trans. Control Syst. Technol. 2013, 21, 1694–1707.
32. Van Dalen, G.J.; Magree, D.P.; Johnson, E.N. Absolute Localization Using Image Alignment and Particle Filtering. In AIAA Guidance, Navigation, and Control Conference; American Institute of Aeronautics and Astronautics: Reston, VA, USA, 2016.
33. Yol, A.; Delabarre, B.; Dame, A.; Dartois, J.É.; Marchand, E. Vision-Based Absolute Localization for Unmanned Aerial Vehicles. In Proceedings of the 2014 IEEE/RSJ International Conference on Intelligent Robots and Systems; IEEE: New York, NY, USA, 2014; pp. 3429–3434.
34. Shan, M.; Wang, F.; Lin, F.; Gao, Z.; Tang, Y.Z.; Chen, B.M. Google Map Aided Visual Navigation for UAVs in GPS-denied Environment. In Proceedings of the IEEE International Conference on Robotics and Biomimetics (ROBIO), Zhuhai, China, 6–9 December 2015; IEEE: New York, NY, USA, 2015; pp. 114–119.
35. Mantelli, M.; Pittol, D.; Neuland, R.; Ribacki, A.; Maffei, R.; Jorge, V.; Prestes, E.; Kolberg, M. A Novel Measurement Model Based on abBRIEF for Global Localization of a UAV over Satellite Images. Robot. Auton. Syst. 2019, 112, 304–319.
36. He, Y.; Cisneros, I.; Keetha, N.; Patrikar, J.; Ye, Z.; Higgins, I.; Hu, Y.; Kapoor, P.; Scherer, S. FoundLoc: Vision-based Onboard Aerial Localization in the Wild. arXiv 2023, arXiv:2310.16299.
37. Fragoso, A.T.; Lee, C.T.; McCoy, A.S.; Chung, S.J. A Seasonally Invariant Deep Transform for Visual Terrain-Relative Navigation. Sci. Robot. 2021, 6, eabf3320.
38. Shetty, A.; Gao, G.X. UAV Pose Estimation Using Cross-View Geolocalization with Satellite Imagery. In Proceedings of the International Conference on Robotics and Automation (ICRA), Montreal, QC, Canada, 20–24 May 2019; IEEE: New York, NY, USA, 2019; pp. 1827–1833.
39. Goforth, H.; Lucey, S. GPS-denied UAV Localization Using Pre-Existing Satellite Imagery. In Proceedings of the International Conference on Robotics and Automation (ICRA); IEEE: New York, NY, USA, 2019; pp. 2974–2980.
40. Bianchi, M.; Barfoot, T.D. UAV Localization Using Autoencoded Satellite Images. IEEE Robot. Autom. Lett. 2021, 6, 1761–1768.
41. Ren, J.; Jiang, X.; Li, Z.; Liang, D.; Zhou, X.; Bai, X. MINIMA: Modality Invariant Image Matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); IEEE: New York, NY, USA, 2025; pp. 23059–23068.

Figure legend: one marker indicates that the scene includes both daytime and nighttime images, and another indicates multiple weather conditions.

Figure legend: one set of markers indicates predicted locations and another denotes ground truth; lines connect each prediction to its corresponding ground truth.

| Dataset | SARD [26] | UAV123 [27] | VisDrone [28] | UAV-VisLoc [29] | Boson-Nighttime [30] | SkyPin (Ours) |
|---|---|---|---|---|---|---|
| RGB | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Thermal | ✗ | ✗ | ✗ | ✗ | ✓ | ✓ |
| Source | drone | drone, synthetic | drone | drone, satellite | drone, satellite | drone, satellite |
| Target Annotation | ✓ human only | ✓ | ✓ | ✗ | ✗ | ✓ |
| Geo Annotation | ✗ | ✗ | ✗ | ✓ image center | ✓ image center | ✓ target |
| View Type | Oblique | Oblique | Mixed | Mixed | Nadir | Oblique |
| Humidity | ✗ | ✗ | ✗ | ✗ | ✗ | ✓ |
| Task | Search & Rescue | Obj. Track. | Obj. Det./Track. | UAV visual Geo-loc. | Thermal Geo-loc. | Target Geo-loc. |
| Covered Scenes | Area (m²) | Source |
|---|---|---|
| Lakeside/Pond/Square | 2,450,000 | DOM/DSM |
| | | DOM/DSM |
| Campus/Residential | 1,660,000 | DOM/DSM |
| | | DOM/DSM |
| Industrial/Farmland | 1,220,000 | DOM/DSM |
| | | DOM/DSM |
| Mountain | 1,450,000 | DOM/DSM |
| | | DOM/DEM |
| Modality | Matcher | Crop (ms) | Match (ms) | Homo. (ms) | Ray. (ms) | Full Homo. (ms) | Full Ray. (ms) | GPU Mem. (MB) |
|---|---|---|---|---|---|---|---|---|
| RGB | RoMa | 30.5 | 375.3 | 29.1 | 43.8 | 434.9 | 449.6 | 7174 |
| RGB | ELoFTR | 30.5 | 68.5 | 29.1 | 43.8 | 128.1 | 142.8 | 4694 |
| RGB | LoFTR | 30.5 | 96.2 | 29.1 | 43.8 | 155.8 | 170.5 | 5608 |
| RGB | XoFTR | 30.5 | 80.5 | 29.1 | 43.8 | 140.1 | 154.8 | 4152 |
| RGB | SP-LightGlue | 30.5 | 50.1 | 29.1 | 43.8 | 109.7 | 124.4 | 2874 |
| Thermal | RoMa | 14.7 | 336.4 | 29.1 | 43.8 | 380.2 | 394.9 | 7174 |
| Thermal | ELoFTR | 14.7 | 32.4 | 29.1 | 43.8 | 76.2 | 90.9 | 4694 |
| Thermal | LoFTR | 14.7 | 43.6 | 29.1 | 43.8 | 87.4 | 102.1 | 5608 |
| Thermal | XoFTR | 14.7 | 41.5 | 29.1 | 43.8 | 85.3 | 100.0 | 4152 |
| Thermal | SP-LightGlue | 14.7 | 31.0 | 29.1 | 43.8 | 74.8 | 89.5 | 2874 |
| Modality | Projection | Matcher | Satellite: Median 2D Error (m) | Satellite: R@5m | UAV: Median 2D Error (m) | UAV: R@5m |
|---|---|---|---|---|---|---|
| RGB | Homography | RoMa | 3.08 | 0.84 | 1.32 | 0.89 |
| RGB | Homography | ELoFTR | 4.13 | 0.55 | 2.28 | 0.63 |
| RGB | Homography | XoFTR | 56.95 | 0.16 | 20.93 | 0.33 |
| RGB | Homography | LoFTR | 196.37 | 0.02 | 103.11 | 0.07 |
| RGB | Homography | SP-LightGlue | 26.94 | 0.32 | 2.03 | 0.69 |
| RGB | Raytrace | RoMa | 2.93 | 0.90 | 0.87 | 0.94 |
| RGB | Raytrace | ELoFTR | 3.90 | 0.57 | 1.51 | 0.68 |
| RGB | Raytrace | XoFTR | 91.38 | 0.17 | 35.06 | 0.38 |
| RGB | Raytrace | LoFTR | 219.77 | 0.03 | 164.15 | 0.11 |
| RGB | Raytrace | SP-LightGlue | 28.54 | 0.38 | 1.42 | 0.73 |
| Thermal | Homography | RoMa | 3.05 | 0.87 | 1.20 | 0.95 |
| Thermal | Homography | ELoFTR | 5.58 | 0.47 | 2.74 | 0.62 |
| Thermal | Homography | XoFTR | 13.83 | 0.34 | 3.24 | 0.57 |
| Thermal | Homography | LoFTR | 83.61 | 0.04 | 38.06 | 0.14 |
| Thermal | Homography | SP-LightGlue | 33.52 | 0.13 | 3.25 | 0.58 |
| Thermal | Raytrace | RoMa | 2.89 | 0.93 | 0.87 | 0.98 |
| Thermal | Raytrace | ELoFTR | 5.27 | 0.49 | 2.16 | 0.62 |
| Thermal | Raytrace | XoFTR | 12.18 | 0.39 | 2.07 | 0.60 |
| Thermal | Raytrace | LoFTR | 93.39 | 0.06 | 49.43 | 0.19 |
| Thermal | Raytrace | SP-LightGlue | 49.33 | 0.16 | 1.97 | 0.61 |
| Projection | Matcher | Lakeside | Pond | Square | Campus | Residential | Industrial | Farmland | Mountain |
|---|---|---|---|---|---|---|---|---|---|
| | | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m |
| Homography | RoMa | 2.36, 0.96 | 2.86, 0.87 | 2.48, 0.87 | 3.18, 0.87 | 3.52, 0.77 | 3.14, 0.98 | 3.38, 0.80 | 27.09, 0.11 |
| Homography | ELoFTR | 2.89, 0.76 | 34.35, 0.33 | 2.93, 0.71 | 5.09, 0.50 | 13.02, 0.40 | 3.47, 0.69 | 3.66, 0.65 | 49.59, 0.14 |
| Homography | XoFTR | 89.85, 0.11 | 139.42, 0.04 | 41.71, 0.26 | 48.17, 0.13 | 72.78, 0.07 | 39.19, 0.27 | 30.45, 0.18 | 322.29, 0.04 |
| Homography | LoFTR | 119.75, 0.01 | 342.98, 0.00 | 271.62, 0.00 | 111.88, 0.01 | inf, 0.00 | 100.26, 0.07 | 224.98, 0.02 | inf, 0.00 |
| Homography | SP-LightGlue | 35.74, 0.27 | 44.31, 0.26 | 16.62, 0.38 | 40.98, 0.21 | 79.06, 0.09 | 3.47, 0.69 | 11.68, 0.37 | 89.24, 0.06 |
| Raytrace | RoMa | 2.16, 0.97 | 2.82, 0.84 | 2.22, 0.92 | 3.31, 0.93 | 3.25, 0.87 | 2.91, 0.99 | 3.12, 0.89 | 3.71, 0.68 |
| Raytrace | ELoFTR | 2.80, 0.65 | 48.86, 0.36 | 2.81, 0.69 | 4.80, 0.50 | 27.95, 0.38 | 3.15, 0.73 | 3.41, 0.69 | 9.28, 0.39 |
| Raytrace | XoFTR | 146.31, 0.09 | 182.23, 0.07 | 89.75, 0.19 | 61.16, 0.14 | 120.59, 0.04 | 50.71, 0.37 | 29.94, 0.29 | 400.06, 0.03 |
| Raytrace | LoFTR | 137.83, 0.00 | 297.40, 0.00 | 228.68, 0.00 | 165.23, 0.02 | inf, 0.00 | 142.84, 0.11 | 189.34, 0.04 | inf, 0.00 |
| Raytrace | SP-LightGlue | 31.71, 0.31 | 70.08, 0.30 | 17.97, 0.42 | 41.93, 0.27 | 110.90, 0.10 | 3.21, 0.74 | 5.22, 0.50 | 149.10, 0.11 |
| Projection | Matcher | Lakeside | Pond | Square | Campus | Residential | Industrial | Farmland | Mountain |
|---|---|---|---|---|---|---|---|---|---|
| | | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m |
| Homography | RoMa | 0.86, 0.98 | 1.35, 0.98 | 1.00, 0.88 | 1.39, 0.94 | 1.27, 0.96 | 1.52, 0.99 | 1.39, 0.78 | 18.82, 0.33 |
| Homography | ELoFTR | 0.99, 0.82 | 4.53, 0.51 | 1.54, 0.72 | 2.17, 0.65 | 2.94, 0.60 | 1.86, 0.70 | 2.86, 0.66 | 46.17, 0.12 |
| Homography | XoFTR | 6.64, 0.48 | 124.72, 0.18 | 12.94, 0.40 | 29.77, 0.32 | 29.18, 0.13 | 5.70, 0.48 | 7.06, 0.42 | 77.82, 0.09 |
| Homography | LoFTR | 52.68, 0.03 | 199.41, 0.05 | 107.76, 0.02 | 70.54, 0.04 | 232.77, 0.02 | 41.55, 0.26 | 69.00, 0.09 | inf, 0.00 |
| Homography | SP-LightGlue | 1.35, 0.76 | 1.51, 0.67 | 1.26, 0.79 | 2.49, 0.64 | 2.85, 0.63 | 1.72, 0.86 | 2.60, 0.69 | 50.28, 0.14 |
| Raytrace | RoMa | 0.57, 0.97 | 1.31, 0.83 | 0.66, 0.92 | 0.77, 1.00 | 1.00, 0.96 | 1.06, 0.99 | 1.10, 0.98 | 0.65, 0.79 |
| Raytrace | ELoFTR | 0.84, 0.76 | 4.40, 0.51 | 1.14, 0.75 | 1.52, 0.63 | 1.75, 0.62 | 1.63, 0.74 | 1.33, 0.80 | 0.85, 0.63 |
| Raytrace | XoFTR | 10.15, 0.45 | 149.33, 0.23 | 28.35, 0.39 | 39.15, 0.36 | 57.87, 0.09 | 2.22, 0.62 | 3.76, 0.57 | 109.30, 0.15 |
| Raytrace | LoFTR | 90.67, 0.05 | 295.62, 0.06 | 229.84, 0.02 | 122.51, 0.08 | 218.98, 0.02 | 41.97, 0.33 | 80.21, 0.25 | inf, 0.00 |
| Raytrace | SP-LightGlue | 0.90, 0.77 | 1.43, 0.67 | 0.98, 0.80 | 1.57, 0.66 | 2.09, 0.60 | 1.48, 0.87 | 1.29, 0.83 | 22.46, 0.38 |
| Projection | Matcher | Lakeside | Pond | Square | Campus | Residential | Industrial | Farmland | Mountain |
|---|---|---|---|---|---|---|---|---|---|
| | | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m |
| Homography | RoMa | 2.84, 0.94 | 3.10, 0.92 | 2.51, 0.94 | 3.30, 0.83 | 3.75, 0.76 | 2.82, 0.99 | 3.19, 0.90 | 5.14, 0.47 |
| Homography | ELoFTR | 3.46, 0.67 | 6.79, 0.45 | 4.46, 0.53 | 9.26, 0.38 | 6.20, 0.45 | 4.19, 0.57 | 4.96, 0.50 | 27.69, 0.15 |
| Homography | XoFTR | 7.22, 0.44 | 30.06, 0.20 | 4.42, 0.54 | 18.25, 0.23 | 26.13, 0.28 | 6.38, 0.43 | 8.71, 0.38 | inf, 0.00 |
| Homography | LoFTR | 64.36, 0.01 | 256.63, 0.02 | 78.50, 0.05 | 52.26, 0.02 | inf, 0.01 | 31.85, 0.13 | 113.65, 0.04 | inf, 0.00 |
| Homography | SP-LightGlue | 39.54, 0.07 | 32.76, 0.14 | 29.29, 0.15 | 38.16, 0.09 | 40.68, 0.15 | 27.75, 0.19 | 39.58, 0.11 | 29.72, 0.09 |
| Raytrace | RoMa | 2.50, 0.97 | 2.92, 0.95 | 2.31, 0.99 | 3.40, 0.92 | 3.41, 0.84 | 2.67, 0.96 | 2.98, 0.94 | 3.80, 0.77 |
| Raytrace | ELoFTR | 3.14, 0.61 | 5.68, 0.49 | 4.56, 0.52 | 7.05, 0.45 | 7.57, 0.47 | 3.65, 0.60 | 4.93, 0.50 | 34.65, 0.16 |
| Raytrace | XoFTR | 5.80, 0.48 | 30.16, 0.26 | 3.51, 0.55 | 17.34, 0.33 | 27.64, 0.32 | 4.43, 0.53 | 6.99, 0.45 | inf, 0.02 |
| Raytrace | LoFTR | 80.51, 0.00 | 193.46, 0.03 | 82.12, 0.06 | 69.53, 0.05 | inf, 0.00 | 45.64, 0.17 | 87.60, 0.06 | inf, 0.00 |
| Raytrace | SP-LightGlue | 63.68, 0.08 | 39.87, 0.18 | 56.36, 0.15 | 48.18, 0.16 | 63.30, 0.17 | 30.07, 0.23 | 53.46, 0.15 | 66.96, 0.09 |
| Projection | Matcher | Lakeside | Pond | Square | Campus | Residential | Industrial | Farmland | Mountain |
|---|---|---|---|---|---|---|---|---|---|
| | | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m |
| Homography | RoMa | 1.05, 0.98 | 1.38, 0.99 | 1.00, 0.96 | 1.39, 0.98 | 1.17, 0.95 | 1.22, 0.99 | 1.20, 0.90 | 2.24, 0.70 |
| Homography | ELoFTR | 1.13, 0.91 | 3.11, 0.57 | 2.37, 0.65 | 3.63, 0.56 | 2.70, 0.61 | 1.98, 0.76 | 2.65, 0.66 | 22.54, 0.17 |
| Homography | XoFTR | 1.15, 0.83 | 9.39, 0.43 | 1.86, 0.70 | 7.69, 0.44 | 6.24, 0.47 | 1.78, 0.75 | 2.53, 0.66 | inf, 0.09 |
| Homography | LoFTR | 27.74, 0.06 | 61.85, 0.11 | 40.42, 0.11 | 39.65, 0.07 | 56.02, 0.06 | 7.41, 0.40 | 32.31, 0.20 | inf, 0.00 |
| Homography | SP-LightGlue | 1.53, 0.70 | 4.80, 0.51 | 2.12, 0.64 | 5.66, 0.48 | 4.26, 0.54 | 1.78, 0.80 | 3.20, 0.60 | 22.72, 0.20 |
| Raytrace | RoMa | 0.83, 1.00 | 1.30, 0.99 | 0.71, 1.00 | 0.78, 1.00 | 0.95, 0.95 | 0.94, 0.96 | 0.83, 0.97 | 0.60, 0.96 |
| Raytrace | ELoFTR | 0.80, 0.90 | 2.72, 0.60 | 2.29, 0.60 | 3.90, 0.53 | 1.77, 0.65 | 1.56, 0.75 | 1.80, 0.65 | 23.59, 0.30 |
| Raytrace | XoFTR | 0.80, 0.88 | 5.80, 0.48 | 1.19, 0.72 | 3.63, 0.52 | 8.16, 0.47 | 1.47, 0.75 | 1.43, 0.67 | inf, 0.11 |
| Raytrace | LoFTR | 43.77, 0.08 | 64.19, 0.19 | 54.28, 0.15 | 50.71, 0.15 | 80.15, 0.09 | 10.26, 0.44 | 37.09, 0.26 | 161.28, 0.00 |
| Raytrace | SP-LightGlue | 1.10, 0.69 | 3.43, 0.54 | 1.28, 0.68 | 3.73, 0.53 | 2.32, 0.59 | 1.30, 0.79 | 1.30, 0.67 | 46.96, 0.20 |
| Map | Projection | Matcher | Pitch | Pitch | Pitch |
|---|---|---|---|---|---|
| | | | Median 2D error (m), R@5m | Median 2D error (m), R@5m | Median 2D error (m), R@5m |
| Satellite | Homography | RoMa | 2.71, 1.00 | 2.83, 0.97 | 3.70, 0.67 |
| Satellite | Homography | ELoFTR | 2.89, 0.95 | 3.05, 0.88 | 6.74, 0.41 |
| Satellite | Homography | XoFTR | 3.29, 0.76 | 5.50, 0.46 | 49.45, 0.12 |
| Satellite | Homography | LoFTR | 77.71, 0.02 | 29.06, 0.19 | 98.40, 0.04 |
| Satellite | Homography | SP-LightGlue | 14.95, 0.30 | 10.38, 0.33 | 67.36, 0.08 |
| Satellite | Raytrace | RoMa | 2.63, 1.00 | 2.71, 1.00 | 3.04, 0.82 |
| Satellite | Raytrace | ELoFTR | 2.74, 0.97 | 2.67, 1.00 | 4.32, 0.57 |
| Satellite | Raytrace | XoFTR | 2.99, 0.90 | 3.53, 0.63 | 46.27, 0.28 |
| Satellite | Raytrace | LoFTR | 95.55, 0.00 | 26.74, 0.27 | 117.21, 0.07 |
| Satellite | Raytrace | SP-LightGlue | 13.23, 0.36 | 5.67, 0.47 | 79.97, 0.15 |
| UAV | Homography | RoMa | 1.08, 1.00 | 1.33, 1.00 | 2.45, 0.85 |
| UAV | Homography | ELoFTR | 1.27, 1.00 | 1.45, 0.97 | 4.42, 0.53 |
| UAV | Homography | XoFTR | 1.29, 0.91 | 1.81, 0.82 | 47.65, 0.19 |
| UAV | Homography | LoFTR | 23.49, 0.21 | 6.13, 0.45 | 88.47, 0.13 |
| UAV | Homography | SP-LightGlue | 1.43, 0.85 | 1.70, 0.78 | 19.86, 0.38 |
| UAV | Raytrace | RoMa | 0.78, 1.00 | 1.10, 1.00 | 2.22, 0.88 |
| UAV | Raytrace | ELoFTR | 0.82, 0.99 | 1.10, 0.99 | 3.10, 0.63 |
| UAV | Raytrace | XoFTR | 0.83, 0.98 | 1.29, 0.90 | 41.21, 0.35 |
| UAV | Raytrace | LoFTR | 32.02, 0.24 | 2.66, 0.68 | 84.12, 0.20 |
| UAV | Raytrace | SP-LightGlue | 0.91, 0.89 | 1.34, 0.84 | 9.72, 0.44 |
| Map | Projection | Matcher | Low Humidity | High Humidity |
|---|---|---|---|---|
| | | | Median 2D error (m), R@5m | Median 2D error (m), R@5m |
| Satellite | Homography | RoMa | 2.91, 0.88 | 3.21, 0.86 |
| Satellite | Homography | ELoFTR | 5.93, 0.46 | 5.16, 0.49 |
| Satellite | Homography | XoFTR | 14.14, 0.33 | 13.56, 0.35 |
| Satellite | Homography | LoFTR | 77.09, 0.04 | 94.09, 0.06 |
| Satellite | Homography | SP-LightGlue | 29.75, 0.14 | 39.34, 0.11 |
| Satellite | Raytrace | RoMa | 2.85, 0.95 | 2.94, 0.91 |
| Satellite | Raytrace | ELoFTR | 5.97, 0.48 | 4.84, 0.51 |
| Satellite | Raytrace | XoFTR | 12.19, 0.39 | 12.12, 0.39 |
| Satellite | Raytrace | LoFTR | 95.01, 0.04 | 92.81, 0.08 |
| Satellite | Raytrace | SP-LightGlue | 43.52, 0.19 | 59.19, 0.12 |
| UAV | Homography | RoMa | 1.08, 0.97 | 1.38, 0.92 |
| UAV | Homography | ELoFTR | 2.59, 0.62 | 2.87, 0.63 |
| UAV | Homography | XoFTR | 3.03, 0.57 | 3.60, 0.56 |
| UAV | Homography | LoFTR | 39.40, 0.11 | 35.93, 0.18 |
| UAV | Homography | SP-LightGlue | 2.43, 0.65 | 6.30, 0.47 |
| UAV | Raytrace | RoMa | 0.86, 0.99 | 0.87, 0.97 |
| UAV | Raytrace | ELoFTR | 2.15, 0.61 | 2.16, 0.63 |
| UAV | Raytrace | XoFTR | 1.84, 0.62 | 2.63, 0.56 |
| UAV | Raytrace | LoFTR | 50.75, 0.18 | 47.39, 0.21 |
| UAV | Raytrace | SP-LightGlue | 1.58, 0.69 | 4.43, 0.51 |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Share and Cite
Wang, Z.; Wu, R.; Liu, Y.; Huang, Y.; Yan, S.; Zhang, M. SkyPin: Benchmarking Target Geo-Localization from UAV Imagery on 2.5D Maps. Drones 2026, 10, 500. https://doi.org/10.3390/drones10070500



