A Depth-Guided Local Outlier Rejection Methodology for Robust Feature Matching in Urban UAV Images
Highlights
- The proposed depth-guided local outlier rejection methodology integrates monocular depth estimation, DBSCAN clustering, and localized model estimation to improve feature matching reliability in complex urban UAV imagery.
- The method achieves higher Recall and F1-score than conventional 2D-based outlier rejection methods while maintaining comparable Precision, demonstrating robust inlier preservation under depth and viewpoint variations.
- Incorporating single-image depth information enhances geometric consistency and registration stability in depth-varying urban environments.
- The methodology effectively corrects depth- and viewpoint-related mismatches, enhancing UAV image registration reliability.
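The highlighted pipeline (monocular depth estimation → pseudo-3D feature coordinates → DBSCAN clustering → localized model estimation) can be sketched as follows. The function names, synthetic depth map, and parameter values below are illustrative assumptions, not the authors' implementation:

```python
import numpy as np
from sklearn.cluster import DBSCAN

def pseudo_3d(keypoints_xy, depth_map, depth_range=100.0):
    """Lift 2D keypoints into a pseudo-3D space by appending rescaled depth."""
    xy = np.asarray(keypoints_xy)
    d = depth_map[xy[:, 1].astype(int), xy[:, 0].astype(int)]
    d = (d - d.min()) / (np.ptp(d) + 1e-9) * depth_range  # depth -> [0, depth_range]
    return np.column_stack([xy, d])

def cluster_matches(points_3d, eps=45.0, min_samples=5):
    """Group pseudo-3D points into local regions with DBSCAN (-1 = noise)."""
    return DBSCAN(eps=eps, min_samples=min_samples).fit_predict(points_3d)

# Toy scene: a 64x64 depth map with two depth layers (e.g. ground vs rooftop).
rng = np.random.default_rng(0)
depth = np.zeros((64, 64))
depth[:, 32:] = 50.0
pts = np.vstack([rng.uniform(0, 31, (20, 2)),    # features on the near layer
                 rng.uniform(33, 63, (20, 2))])  # features on the far layer
labels = cluster_matches(pseudo_3d(pts, depth))
```

Without the depth channel, the two groups here are only about 2 px apart horizontally and would merge into one cluster; the rescaled depth is what separates them into two local regions.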
Abstract
1. Introduction
1.1. Related Work
1.1.1. Global Geometric Model-Based Approaches
1.1.2. Multi-Plane Fitting-Based Approaches
1.1.3. Photogrammetry-Based Approaches
1.1.4. Depth-Assisted Approaches
1.2. Limitations of Existing Approaches
1.3. Research Objectives
2. Methods
2.1. Dataset Configuration
2.2. Proposed Methodology
2.2.1. Depth Estimation
2.2.2. Feature Matching
2.2.3. Clustering of Pseudo-3D Coordinates
2.2.4. Local Clustering-Based RANSAC
2.3. Implementation Details
3. Results
3.1. Overall Performance Comparison
3.2. Performance Analysis by Image Category
3.3. Overall Results
4. Conclusions
- (1) By integrating Monocular Depth Estimation results into feature coordinates to form a pseudo-3D space, the method enables geometrically consistent correspondence verification even in urban imagery containing depth discontinuities.
- (2) Through a cluster-based local outlier rejection structure, it performs independent model estimation for each local region, overcoming the limitations of global registration.
- (3) Despite the integration of depth information, the method maintains stable Precision across all outlier rejection algorithms considered, confirming the effectiveness of the pseudo-3D constraint in selectively incorporating depth cues into the registration process.
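Point (2) can be illustrated with a minimal per-cluster RANSAC. For brevity this sketch fits a one-point translation model within each cluster rather than the homography a full implementation would estimate; the threshold and iteration count are assumptions:

```python
import numpy as np

def local_ransac(src, dst, labels, thresh=3.0, iters=100, seed=0):
    """Per-cluster RANSAC with a simple translation model.

    src, dst: (N, 2) matched coordinates; labels: cluster id per match
    (-1 marks DBSCAN noise, rejected outright). Returns a boolean inlier mask."""
    rng = np.random.default_rng(seed)
    keep = np.zeros(len(src), dtype=bool)
    for c in np.unique(labels):
        if c == -1:
            continue
        idx = np.flatnonzero(labels == c)
        best = np.zeros(len(idx), dtype=bool)
        for _ in range(iters):
            j = rng.choice(idx)                 # one-point translation hypothesis
            t = dst[j] - src[j]
            resid = np.linalg.norm(dst[idx] - (src[idx] + t), axis=1)
            inliers = resid < thresh
            if inliers.sum() > best.sum():      # keep the largest consensus set
                best = inliers
        keep[idx[best]] = True
    return keep

# Toy cluster: a clean translation of (5, -2) with the first 5 matches corrupted.
rng = np.random.default_rng(1)
src = rng.uniform(0, 100, (30, 2))
dst = src + np.array([5.0, -2.0])
dst[:5] += rng.uniform(20, 40, (5, 2))
mask = local_ransac(src, dst, np.zeros(30, dtype=int))
```

Because each cluster gets its own model, matches on one depth layer cannot be discarded merely for disagreeing with a global model fitted to another layer.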
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
Abbreviations
| UAV | Unmanned Aerial Vehicle |
| Visual SLAM | Visual Simultaneous Localization and Mapping |
| RANSAC | Random Sample Consensus |
| LMedS | Least Median of Squares |
| BA | Bundle Adjustment |
| SfM | Structure-from-Motion |
| SIFT | Scale-Invariant Feature Transform |
| DBSCAN | Density-Based Spatial Clustering of Applications with Noise |
| Depth Normalization Range | Precision (%) | Recall (%) | F1-Score | Number of Matches |
|---|---|---|---|---|
| 0–10 | 98.48 | 63.29 | 0.75 | 376 |
| 0–50 | 99.49 | 60.66 | 0.71 | 380 |
| 0–100 | 98.64 | 79.50 | 0.87 | 463 |
| 0–200 | 98.46 | 76.95 | 0.84 | 476 |
| 0–1000 | 98.49 | 63.43 | 0.75 | 375 |
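The table reflects a trade-off in choosing the depth normalization range: too narrow a range (e.g. 0–10) leaves the depth channel unable to separate layers that are close in the image, while intermediate ranges (0–100, 0–200) let depth contribute on a scale comparable to pixel distance. A toy calculation under assumed values:

```python
import numpy as np

def pseudo_dist(p, q, depth_gap, d_max):
    """Distance between two pseudo-3D keypoints once their relative
    depth gap (in [0, 1]) is rescaled into [0, d_max]."""
    a = np.array([*p, 0.0])
    b = np.array([*q, depth_gap * d_max])
    return float(np.linalg.norm(a - b))

# Two keypoints 30 px apart on different depth layers (full relative depth gap).
too_small = pseudo_dist((0, 0), (30, 0), depth_gap=1.0, d_max=10)   # ~31.6: depth barely matters
balanced  = pseudo_dist((0, 0), (30, 0), depth_gap=1.0, d_max=100)  # ~104.4: layers separate
```

With `d_max=10` the two points remain nearly as close as in 2D and would likely share a cluster; with `d_max=100` the depth gap dominates and places them in different local regions.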
| Method | Precision (%) | Recall (%) | F1-Score | Number of Matches | p-Value (F1-Score) |
|---|---|---|---|---|---|
| RANSAC | 98.59 | 66.63 | 0.78 | 424 | 0.0002 |
| Proposed + RANSAC | 98.79 | 81.62 | 0.88 | 519 | |
| LMedS | 87.31 | 17.81 | 0.24 | 143 | 0.0022 |
| Proposed + LMedS | 99.83 | 43.45 | 0.56 | 328 | |
| MAGSAC++ | 98.35 | 73.68 | 0.84 | 426 | 0.0094 |
| Proposed + MAGSAC++ | 98.60 | 84.25 | 0.91 | 513 |
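The metrics compared above follow the standard definitions over retained matches: Precision is the fraction of retained matches that are truly correct, Recall the fraction of ground-truth correct matches that are retained, and F1 their harmonic mean. A minimal, set-based sketch (names are illustrative):

```python
def match_metrics(kept, true_inliers):
    """Precision, Recall, F1 for an outlier-rejection result.

    kept: indices of matches retained by the method;
    true_inliers: indices of ground-truth correct matches."""
    kept, true_inliers = set(kept), set(true_inliers)
    tp = len(kept & true_inliers)                         # correctly retained matches
    precision = tp / len(kept) if kept else 0.0
    recall = tp / len(true_inliers) if true_inliers else 0.0
    f1 = 2 * precision * recall / (precision + recall) if tp else 0.0
    return precision, recall, f1

# 10 matches kept, 8 of them truly correct, 10 correct matches exist in total.
p, r, f1 = match_metrics(kept=range(10), true_inliers=list(range(8)) + [10, 11])
```

The pattern in the table, higher Recall at essentially unchanged Precision, means the proposed variants retain more of the true inliers without admitting additional outliers.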
| Class 1 | Method | Precision (%) | Recall (%) | F1-Score | Number of Matches |
|---|---|---|---|---|---|
| 1-1 | RANSAC | 100 | 68.34 | 0.81 | 1196 |
| | Proposed + RANSAC | 100 | 87.54 | 0.93 | 1532 |
| | LMedS | 100 | 9.71 | 0.18 | 170 |
| | Proposed + LMedS | 100 | 62.29 | 0.77 | 1090 |
| | MAGSAC++ | 99.83 | 66.74 | 0.80 | 1170 |
| | Proposed + MAGSAC++ | 100 | 86.69 | 0.93 | 1517 |
| 1-2 | RANSAC | 99.45 | 52.48 | 0.69 | 181 |
| | Proposed + RANSAC | 98.89 | 78.13 | 0.87 | 271 |
| | LMedS | 100 | 1.75 | 0.03 | 6 |
| | Proposed + LMedS | 100 | 40.23 | 0.57 | 138 |
| | MAGSAC++ | 98.98 | 56.85 | 0.72 | 197 |
| | Proposed + MAGSAC++ | 99.63 | 77.55 | 0.87 | 267 |
| 1-3 | RANSAC | 99.40 | 85.40 | 0.92 | 500 |
| | Proposed + RANSAC | 99.47 | 96.22 | 0.98 | 563 |
| | LMedS | 98.44 | 10.82 | 0.20 | 64 |
| | Proposed + LMedS | 98.90 | 15.46 | 0.27 | 91 |
| | MAGSAC++ | 99.00 | 85.40 | 0.92 | 502 |
| | Proposed + MAGSAC++ | 99.48 | 98.28 | 0.99 | 575 |
| Class 2 | Method | Precision (%) | Recall (%) | F1-Score | Number of Matches |
|---|---|---|---|---|---|
| 2-1 | RANSAC | 90.62 | 64.24 | 0.75 | 224 |
| | Proposed + RANSAC | 92.73 | 80.70 | 0.86 | 275 |
| | LMedS | 100 | 0.95 | 0.02 | 3 |
| | Proposed + LMedS | 100 | 17.72 | 0.30 | 56 |
| | MAGSAC++ | 89.82 | 64.24 | 0.75 | 226 |
| | Proposed + MAGSAC++ | 90.62 | 64.24 | 0.75 | 224 |
| 2-2 | RANSAC | 99.34 | 86.01 | 0.92 | 458 |
| | Proposed + RANSAC | 99.42 | 96.60 | 0.98 | 514 |
| | LMedS | 100 | 48.20 | 0.65 | 255 |
| | Proposed + LMedS | 99.72 | 67.49 | 0.80 | 358 |
| | MAGSAC++ | 99.32 | 82.80 | 0.90 | 441 |
| | Proposed + MAGSAC++ | 99.40 | 94.14 | 0.97 | 501 |
| 2-3 | RANSAC | 99.87 | 81.02 | 0.89 | 761 |
| | Proposed + RANSAC | 99.78 | 96.48 | 0.98 | 907 |
| | LMedS | 100 | 68.76 | 0.81 | 645 |
| | Proposed + LMedS | 100 | 90.41 | 0.95 | 848 |
| | MAGSAC++ | 99.87 | 81.02 | 0.89 | 761 |
| | Proposed + MAGSAC++ | 99.67 | 96.16 | 0.98 | 905 |
| Class 3 | Method | Precision (%) | Recall (%) | F1-Score | Number of Matches |
|---|---|---|---|---|---|
| 3-1 | RANSAC | 100 | 39.33 | 0.56 | 35 |
| | Proposed + RANSAC | 100 | 51.69 | 0.68 | 46 |
| | LMedS | 100 | 2.25 | 0.04 | 2 |
| | Proposed + LMedS | 100 | 25.84 | 0.41 | 23 |
| | MAGSAC++ | 100 | 61.80 | 0.76 | 55 |
| | Proposed + MAGSAC++ | 100 | 66.29 | 0.80 | 59 |
| 3-2 | RANSAC | 100 | 56.25 | 0.72 | 36 |
| | Proposed + RANSAC | 100 | 65.62 | 0.79 | 42 |
| | LMedS | - | - | - | 0 |
| | Proposed + LMedS | 100 | 28.12 | 0.44 | 18 |
| | MAGSAC++ | 100 | 90.62 | 0.95 | 58 |
| | Proposed + MAGSAC++ | 100 | 90.62 | 0.95 | 58 |
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Lee, G.; Youn, J.; Choi, K. A Depth-Guided Local Outlier Rejection Methodology for Robust Feature Matching in Urban UAV Images. Drones 2025, 9, 869. https://doi.org/10.3390/drones9120869

