Progressive Structure from Motion by Iteratively Prioritizing and Refining Match Pairs
Abstract
1. Introduction
2. Incremental SfM Pipeline
2.1. COLMAP
2.2. The Negative Influence of Problematic Match Pairs
2.2.1. Repetitive Structure
2.2.2. Very Short Baselines
3. Method
3.1. Overview
3.2. Construction of Weighted View-Graph
3.3. Outlier Elimination Using CCI
3.4. Initialization
3.5. Expansion
4. Results
4.1. Datasets
4.2. Performance on Five Small Datasets
4.3. Performance on Three Middle-Scale Datasets with Repetitive Structures
4.4. Performance on Three Benchmark Datasets
4.5. Performance on a Large-Scale Dataset
4.6. Performance without Iteratively Refining Match Pairs
4.7. Parameter Settings
5. Discussion
5.1. Effect of Iteratively Refining Match Pairs
5.2. Effect of Iteratively Prioritizing Match Pairs
5.3. Limitation on Condition of Completeness
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Acknowledgments
Conflicts of Interest
Abbreviations
SfM | Structure from motion |
CCI | Cycle consistency inference |
MST | Minimum spanning tree |
References
- Förstner, W.; Wrobel, B.P. Photogrammetric Computer Vision; Springer: Berlin/Heidelberg, Germany, 2016.
- McGlone, C.; Mikhail, E.; Bethel, J. Manual of Photogrammetry, 5th ed.; American Society of Photogrammetry: Falls Church, VA, USA, 2004.
- Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
- Longuet-Higgins, H.C. A computer algorithm for reconstructing a scene from two projections. Nature 1981, 293, 133–135.
- Stewenius, H.; Engels, C.; Nistér, D. Recent developments on direct relative orientation. ISPRS J. Photogramm. Remote Sens. 2006, 60, 284–294.
- Fischler, M.A.; Bolles, R.C. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 1981, 24, 381–395.
- Sweeney, C.; Sattler, T.; Hollerer, T.; Turk, M.; Pollefeys, M. Optimizing the viewing graph for structure-from-motion. In Proceedings of the IEEE International Conference on Computer Vision, Santiago, Chile, 7–13 December 2015; pp. 801–809.
- Shen, T.; Zhu, S.; Fang, T.; Zhang, R.; Quan, L. Graph-based consistent matching for structure-from-motion. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2016; pp. 139–155.
- Cui, H.; Shi, T.; Zhang, J.; Xu, P.; Meng, Y.; Shen, S. View-graph construction framework for robust and efficient structure-from-motion. Pattern Recognit. 2020, 114, 107712.
- Schonberger, J.L.; Frahm, J.M. Structure-from-motion revisited. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 4104–4113.
- Wang, X.; Xiao, T.; Kasten, Y. A hybrid global structure from motion method for synchronously estimating global rotations and global translations. ISPRS J. Photogramm. Remote Sens. 2021, 174, 35–55.
- Snavely, N.; Seitz, S.M.; Szeliski, R. Modeling the World from Internet Photo Collections. Int. J. Comput. Vis. 2008, 80, 189–210.
- Hartley, R.; Zisserman, A. Multiple View Geometry in Computer Vision; Cambridge University Press: Cambridge, UK, 2003.
- Triggs, B.; McLauchlan, P.F.; Hartley, R.I.; Fitzgibbon, A.W. Bundle adjustment—A modern synthesis. In International Workshop on Vision Algorithms; Springer: Berlin/Heidelberg, Germany, 1999; pp. 298–372.
- Wu, C. Towards linear-time incremental structure from motion. In Proceedings of the 2013 International Conference on 3D Vision-3DV 2013, Seattle, WA, USA, 29 June–1 July 2013; IEEE: Piscataway, NJ, USA, 2013; pp. 127–134.
- Mayer, H. Efficient hierarchical triplet merging for camera pose estimation. In German Conference on Pattern Recognition; Springer: Berlin/Heidelberg, Germany, 2014; pp. 399–409.
- Toldo, R.; Gherardi, R.; Farenzena, M.; Fusiello, A. Hierarchical structure-and-motion recovery from uncalibrated images. Comput. Vis. Image Underst. 2015, 140, 127–143.
- Xie, X.; Yang, T.; Li, D.; Li, Z.; Zhang, Y. Hierarchical clustering-aligning framework based fast large-scale 3D reconstruction using aerial imagery. Remote Sens. 2019, 11, 315.
- Chen, Y.; Shen, S.; Chen, Y.; Wang, G. Graph-based parallel large scale structure from motion. Pattern Recognit. 2020, 107, 107537.
- Govindu, V.M. Robustness in motion averaging. In Asian Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2006; pp. 457–466.
- Wilson, K.; Snavely, N. Robust global translations with 1DSfM. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2014; pp. 61–75.
- Agarwal, S.; Furukawa, Y.; Snavely, N.; Simon, I.; Curless, B.; Seitz, S.M.; Szeliski, R. Building Rome in a day. Commun. ACM 2011, 54, 105–112.
- Wang, X.; Rottensteiner, F.; Heipke, C. Structure from motion for ordered and unordered image sets based on random kd forests and global pose estimation. ISPRS J. Photogramm. Remote Sens. 2019, 147, 19–41.
- Jiang, S.; Jiang, C.; Jiang, W. Efficient structure from motion for large-scale UAV images: A review and a comparison of SfM tools. ISPRS J. Photogramm. Remote Sens. 2020, 167, 230–251.
- Cui, H.; Shen, S.; Gao, W.; Liu, H.; Wang, Z. Efficient and robust large-scale structure-from-motion via track selection and camera prioritization. ISPRS J. Photogramm. Remote Sens. 2019, 156, 202–214.
- Wang, X.; Xiao, T.; Gruber, M.; Heipke, C. Robustifying relative orientations with respect to repetitive structures and very short baselines for global SfM. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Long Beach, CA, USA, 16–17 June 2019.
- Enqvist, O.; Kahl, F.; Olsson, C. Non-sequential structure from motion. In Proceedings of the 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops), Barcelona, Spain, 6–13 November 2011; IEEE: Piscataway, NJ, USA, 2011; pp. 264–271.
- Wang, X.; Heipke, C. An Improved Method of Refining Relative Orientation in Global Structure from Motion with a Focus on Repetitive Structure and Very Short Baselines. Photogramm. Eng. Remote Sens. 2020, 86, 299–315.
- Michelini, M.; Mayer, H. Structure from motion for complex image sets. ISPRS J. Photogramm. Remote Sens. 2020, 166, 140–152.
- Jiang, N.; Tan, P.; Cheong, L.F. Seeing double without confusion: Structure-from-motion in highly ambiguous scenes. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 1458–1465.
- Heinly, J.; Dunn, E.; Frahm, J.M. Correcting for duplicate scene structure in sparse 3D reconstruction. In European Conference on Computer Vision; Springer: Berlin/Heidelberg, Germany, 2014; pp. 780–795.
- Zach, C.; Klopschitz, M.; Pollefeys, M. Disambiguating visual relations using loop constraints. In Proceedings of the 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, CA, USA, 13–18 June 2010; IEEE: Piscataway, NJ, USA, 2010; pp. 1426–1433.
- Moulon, P.; Monasse, P.; Perrot, R.; Marlet, R. OpenMVG: Open multiple view geometry. In International Workshop on Reproducible Research in Pattern Recognition; Springer: Berlin/Heidelberg, Germany, 2016; pp. 60–74.
- Jiang, N.; Cui, Z.; Tan, P. A global linear method for camera pose registration. In Proceedings of the IEEE International Conference on Computer Vision, Sydney, Australia, 1–8 December 2013; pp. 481–488.
- Cui, H.; Shen, S.; Gao, W.; Wang, Z. Progressive large-scale structure-from-motion with orthogonal MSTs. In Proceedings of the 2018 International Conference on 3D Vision (3DV), Verona, Italy, 5–8 September 2018; IEEE: Piscataway, NJ, USA, 2018; pp. 79–88.
- Snavely, N.; Seitz, S.M.; Szeliski, R. Skeletal graphs for efficient structure from motion. In Proceedings of the 2008 IEEE Conference on Computer Vision and Pattern Recognition, Anchorage, AK, USA, 24–26 June 2008; IEEE: Piscataway, NJ, USA, 2008; pp. 1–8.
- Cui, Z.; Jiang, N.; Tang, C.; Tan, P. Linear Global Translation Estimation with Feature Tracks. Proc. ECCV 2014, 3, 61–75.
- Kschischang, F.R.; Frey, B.J.; Loeliger, H.A. Factor graphs and the sum-product algorithm. IEEE Trans. Inf. Theory 2001, 47, 498–519.
- Prim, R.C. Shortest Connection Networks and Some Generalizations. Bell Syst. Tech. J. 1957, 36, 1389–1401.
- Cheng, J.; Leng, C.; Wu, J.; Cui, H.; Lu, H. Fast and accurate image matching with cascade hashing for 3D reconstruction. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 23–28 June 2014; pp. 1–8.
- Cohen, A.; Zach, C.; Sinha, S.N.; Pollefeys, M. Discovering and exploiting 3D symmetries in structure from motion. In Proceedings of the 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, RI, USA, 16–21 June 2012; IEEE: Piscataway, NJ, USA, 2012; pp. 1514–1521.
Framework | Feature extraction (FE) | Feature matching (FM) | Match pairs (MP) | Bundle adjustment (BA) |
---|---|---|---|---|
PRMP-PSfM | SiftGPU [15] | Nearest neighbor ratio | PRMP | LM [14] |
COLMAP | SiftGPU [15] | Nearest neighbor ratio | Original | LM [14] |
VisualSFM | SiftGPU [15] | Preemptive feature matching [15] | Original | PCG [15] |
OpenMVG | SIFT | Cascade hashing [40] | Original | LM [14] |
GraphSfM | SIFT | Cascade hashing [40] | Original | LM [14] |
APE | SiftGPU [15] | Wide baseline method [29] | Classification [29] | RBA [29] |
Name | Images | Resolution | Type | Reference |
---|---|---|---|---|
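Books | 21 | 1067 × 800 | Repetitive structure | Yes |
Cereal | 25 | 1067 × 800 | Repetitive structure | Yes |
Cup | 64 | 1067 × 800 | Repetitive structure | Yes |
Desk | 31 | 1067 × 800 | Repetitive structure | Yes |
Street | 19 | 1067 × 800 | Repetitive structure | Yes |
Indoor | 152 | 1200 × 800 | Repetitive structure | No |
Redmond | 148 | 3968 × 2232 | Repetitive structure | No |
ToH | 341 | 4368 × 2912 | Repetitive structure | No |
B1 | 182 | 3936 × 2624 | Repetitive structure, very short baselines | Yes |
B2 | 215 | 3936 × 2624 | Repetitive structure, very short baselines | Yes |
B3 | 342 | 3936 × 2624 | Repetitive structure, very short baselines | Yes |
Church | 1455 | 3264 × 2448, 3648 × 2736, 7360 × 4912 | Repetitive structure, very short baselines | No |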
Dataset | Pipeline | Rotation Error | | | | Translation Error () | | | |
---|---|---|---|---|---|---|---|---|---|
B1 | PRMP-PSfM | 0.02 | 0.11 | 0.13 | 0.52 | 0.11 | 0.53 | 0.54 | 3.21 |
B1 | COLMAP | 0.17 | 1.47 | 1.27 | 2.21 | 1.25 | 8.93 | 9.78 | 18.4 |
B1 | OpenMVG | 0.18 | 1.7 | 1.63 | 3.74 | 1.55 | 10.02 | 10.40 | 18.66 |
B2 | PRMP-PSfM | 0.03 | 0.08 | 0.07 | 0.32 | 0.08 | 0.55 | 0.49 | 3.27 |
B2 | COLMAP | 0.15 | 0.91 | 1.02 | 1.84 | 1.05 | 4.58 | 4.86 | 9.89 |
B2 | OpenMVG | 0.15 | 0.66 | 0.46 | 4.54 | 0.22 | 7.29 | 4.84 | 48.61 |
B3 | PRMP-PSfM | 0.02 | 0.10 | 0.09 | 0.46 | 0.06 | 0.74 | 0.66 | 2.89 |
B3 | OpenMVG | 0.06 | 0.39 | 0.40 | 0.89 | 0.58 | 3.02 | 2.55 | 88.61 |
Pipeline | |||
---|---|---|---|
PRMP-PSfM | 1448 (99.5) | 0.37 | 491,992 |
COLMAP | 1454 (99.9) | 1.09 | 549,957 |
VisualSFM | 288 (19.8) | 0.74 | 14,295 |
OpenMVG | 1452 (99.8) | 0.54 | 1,687,694 |
GraphSfM | 1439 (98.9) | 0.51 | 2,762,371 |
APE | - | 0.55 | 290,748 |
Dataset | PRMP-PSfM | | | PSfM | | |
---|---|---|---|---|---|---|
Cup | 0.26 | 0.67 | 0.23 | 0.26 | 0.73 | 0.23 |
Desk | 0.32 | 1.32 | 0.90 | 0.47 | 6.46 | 1.78 |
Street | 0.18 | 0.78 | 0.11 | 0.35 | 1.16 | 0.29 |
B3 | 0.39 | 0.06 | 0.16 | 0.36 | 0.35 | 0.42 |
Dataset | ||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|
B1 | 109 | 0.93 | 7.82 | 182 | 0.11 | 0.53 | 182 | 0.18 | 1.24 | 182 | 1.47 | 8.89 |
B2 | 215 | 0.06 | 0.51 | 215 | 0.08 | 0.55 | 215 | 0.31 | 1.08 | 215 | 0.43 | 1.22 |
B3 | 182 | 0.26 | 1.26 | 275 | 0.22 | 1.07 | 342 | 0.35 | 1.44 | 342 | 0.47 | 1.93 |
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Xiao, T.; Yan, Q.; Ma, W.; Deng, F. Progressive Structure from Motion by Iteratively Prioritizing and Refining Match Pairs. Remote Sens. 2021, 13, 2340. https://doi.org/10.3390/rs13122340