Tie Point Matching between Terrestrial and Aerial Images Based on Patch Variational Refinement
Abstract
1. Introduction
- (1) Image rectification-based methods. Warping all of the images to the same reference plane is a valid way to eliminate global geometric deformation between images [10]. Some researchers have proposed alleviating the difference in view angle by transforming images of arbitrary views into a standard view [1,16,17,18]. Kushnir and Shimshoni [16] transformed an arbitrary perspective view of a planar façade into a frontal view by applying an orthorectification preprocessing step (a minimal sketch of such plane-based rectification is given after the contribution list below). Finding a common reference plane and then projecting the images onto it is an effective way to alleviate geometric deformations [19,20,21], but a suitable reference plane cannot always be found. Thus, view-dependent rectifications that reduce geometric distortion by correcting each pair of images individually have been proposed [10]. In the work of [22], ground-based multiview stereo was used to produce a depth map of the terrestrial images and warp them to aerial views. Gao et al. [3] rendered 3D data onto a target view. These methods require a dense reconstruction to generate the model, and the quality of the mesh model determines the quality of the synthetic image; as the quality of the synthesized images decreases, the matching performance becomes limited. Although these methods effectively reduce descriptor variance, the large number of candidate matches and ambiguous repeated structures hinder the approach [23]; rectification is therefore usually acceptable only for local windows rather than for a whole image.
- (2) Deep-learning-based methods. Unlike traditional methods, these approaches use convolutional neural networks (CNNs) to extract image features; more adaptive and robust feature descriptors are obtained through iterative training of the networks. Some researchers [24,25] have obtained feature detectors that outperform traditional detectors such as SIFT. In recent years, with the development of artificial intelligence, deep-learning-based methods such as deep feature matching (DFM) [26], SuperPoint [27], and SuperGlue [28] have made great progress in cross-view image matching. SuperGlue [28] takes the features and descriptors detected by SuperPoint [27] as inputs and then uses a graph neural network to model the cross- and self-attention between features, which significantly improves the matching performance of the extracted features (a generic sketch of the mutual-nearest-neighbor matching step used by such pipelines follows below). However, the authors of [29] showed that, due to the tradeoff between invariance and discriminative power, a network generally cannot be made more invariant to rotation and illumination without losing discrimination. DFM [26] applies a pretrained VGG-19 [30] extractor, matching features in the terminal layers first and then hierarchically refining the matches up to the first layers; however, DFM requires the target object to be planar or approximately planar. It should also be noted that deep-learning methods cannot completely replace classical methods: [31] shows that, with proper settings, classical solutions may still outperform the perceived state-of-the-art approaches.
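The following is a minimal, self-contained sketch of the mutual-nearest-neighbor matching step that deep pipelines such as DFM apply to descriptor arrays. It is a generic illustration, not the authors' method or any network's actual API; the descriptor dimensions and random inputs are placeholders.

```python
import numpy as np

def mutual_nearest_neighbors(desc_a: np.ndarray, desc_b: np.ndarray) -> np.ndarray:
    """Match two L2-normalized descriptor sets (n_a x d and n_b x d), keeping
    only pairs that are each other's nearest neighbor in cosine similarity."""
    sim = desc_a @ desc_b.T                      # cosine similarity matrix
    nn_ab = sim.argmax(axis=1)                   # best match in b for each a
    nn_ba = sim.argmax(axis=0)                   # best match in a for each b
    idx_a = np.arange(desc_a.shape[0])
    keep = nn_ba[nn_ab[idx_a]] == idx_a          # mutual-consistency check
    return np.stack([idx_a[keep], nn_ab[idx_a[keep]]], axis=1)

# Toy usage with random unit-norm descriptors standing in for CNN features.
rng = np.random.default_rng(0)
da = rng.normal(size=(100, 128)); da /= np.linalg.norm(da, axis=1, keepdims=True)
db = rng.normal(size=(120, 128)); db /= np.linalg.norm(db, axis=1, keepdims=True)
matches = mutual_nearest_neighbors(da, db)       # array of (index_a, index_b) pairs
```

The main contributions of this work are summarized as follows: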
- (1) A robust aerial-terrestrial image-matching method is proposed. The method can handle drastic viewpoint changes, scale differences, and illumination differences between image pairs.
- (2) Mismatches caused by weak and repeated textures are alleviated. The geometric constraints between patches (inherited from the point clouds) can eliminate outliers in matching.
- (3) The matching performance is not affected by the shape of the target building, as the proposed method does not require the matched object to be planar.
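Returning to the rectification-based methods discussed in (1), the sketch below shows how a planar façade imaged under perspective can be warped to a frontal view with a single homography. This is an illustrative example under assumed inputs: the image path and the four corner correspondences are placeholders, and only the standard OpenCV calls cv2.findHomography and cv2.warpPerspective are used.

```python
import cv2
import numpy as np

# Placeholder corner correspondences: facade corners in the image (src) and
# the frontal rectangle they should map to (dst), e.g., 500 x 400 pixels.
src_pts = np.float32([[120, 85], [640, 60], [660, 470], [100, 440]])
dst_pts = np.float32([[0, 0], [500, 0], [500, 400], [0, 400]])

img = cv2.imread("terrestrial_view.jpg")           # hypothetical input image
H, _ = cv2.findHomography(src_pts, dst_pts)        # 3x3 projective transform
frontal = cv2.warpPerspective(img, H, (500, 400))  # resample to the frontal view
cv2.imwrite("frontal_view.jpg", frontal)
```

As noted above, such rectification tends to work best on local, approximately planar windows rather than on a whole image.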
2. Materials and Methods
2.1. Overview of the Proposed Method
2.2. Variational Patch Refinement
2.2.1. Energy Function Construction
2.2.2. Derivative of the Energy Function
- 1. Derivative of photo-consistency
- 2. Derivative of the homographic matrix
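The section bodies are not reproduced here, but the patch warping that the energy function builds on is the standard plane-induced homography between two calibrated views. The sketch below states that relationship under an explicit convention; the variable names and the toy check are assumptions for illustration, not the authors' exact parameterization.

```python
import numpy as np

def plane_induced_homography(K1, K2, R, t, n, d):
    """Homography mapping pixels of camera 1 to camera 2 induced by the plane
    n . X = d in camera-1 coordinates, with camera-2 pose X2 = R X1 + t.
    For X1 on the plane, X2 = (R + t n^T / d) X1, giving the formula below."""
    return K2 @ (R + np.outer(t, n) / d) @ np.linalg.inv(K1)

# Toy check: identical intrinsics and zero baseline yield the identity homography.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
H = plane_induced_homography(K, K, np.eye(3), np.zeros(3), np.array([0.0, 0.0, 1.0]), 5.0)
assert np.allclose(H, np.eye(3))
```

In a variational patch refinement of this kind, the plane parameters (n, d) are the quantities being optimized, which is why the derivative of the homographic matrix enters the derivative of the energy function.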
2.2.3. Initial Parameters of the Patch
2.3. Feature Matching with the Optimized Patch
3. Experiments and Analyses
3.1. Introduction to the Experimental Dataset
3.2. Result of Variational Patch Refinement
3.3. Evaluation of the Effectiveness of the Feature-Matching Algorithms
3.3.1. Effective Constraint Range of Feature Matching
3.3.2. Matching Results of Public and Local Datasets
3.3.3. Result of Image Reconstruction
3.3.4. Evaluation of the Effectiveness and Robustness of the Proposed Method
4. Conclusions
4.1. Contribution
4.2. Limitations of the Proposed Method
- Compared with traditional matching methods, the proposed method must first build a patch from images of the same view, which increases the computation time.
- The method rectifies the image scale difference by setting the size of the patch corresponding to each feature point based on GPS data. If no GPS assistance is available during image acquisition, the scale difference must instead be reduced by building an image pyramid on the patch (a minimal sketch of this idea follows the list below).
- Occlusion remains a difficult problem: occluded parts of images usually cannot be used to generate correct point clouds, and it is difficult to maximize the photo-consistency between occluded and unoccluded images.
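The following sketch illustrates the pyramid idea from the second limitation: a reference patch is repeatedly downsampled, and a target patch is scored against each level with zero-mean normalized cross-correlation (ZNCC), a common photo-consistency measure. It is an assumed, minimal implementation for illustration, not the authors' code; the function names and the number of levels are placeholders.

```python
import cv2
import numpy as np

def zncc(a: np.ndarray, b: np.ndarray) -> float:
    """Zero-mean normalized cross-correlation between two equal-sized patches."""
    a = a.astype(np.float64).ravel(); a -= a.mean()
    b = b.astype(np.float64).ravel(); b -= b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def best_pyramid_level(patch_ref: np.ndarray, patch_tgt: np.ndarray, levels: int = 4):
    """Score the target patch against a Gaussian pyramid of the reference patch
    and return (best ZNCC score, pyramid level), approximating the scale ratio."""
    scores = []
    level_img = patch_ref.copy()
    for lvl in range(levels):
        # Resize the target to the current reference resolution before comparing.
        h, w = level_img.shape[:2]
        resized_tgt = cv2.resize(patch_tgt, (w, h))
        scores.append((zncc(level_img, resized_tgt), lvl))
        level_img = cv2.pyrDown(level_img)  # halve the resolution for the next level
    return max(scores)  # tuples compare by score first
```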
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Wu, B.; Xie, L.F.; Hu, H.; Zhu, Q.; Yau, E. Integration of aerial oblique imagery and terrestrial imagery for optimized 3D modeling in urban areas. ISPRS J. Photogramm. Remote Sens. 2018, 139, 119–132.
2. Nex, F.; Remondino, F.; Gerke, M.; Przybilla, H.-J.; Bäumker, M.; Zurhorst, A. ISPRS benchmark for multi-platform photogrammetry. In Proceedings of the Joint ISPRS Conference, Munich, Germany, 25–27 March 2015; pp. 135–142.
3. Gao, X.; Shen, S.; Zhou, Y.; Cui, H.; Zhu, L.; Hu, Z. Ancient Chinese architecture 3D preservation by merging ground and aerial point clouds. ISPRS J. Photogramm. Remote Sens. 2018, 143, 72–84.
4. Balletti, C.; Guerra, F.; Scocca, V.; Gottardi, C. 3D integrated methodologies for the documentation and the virtual reconstruction of an archaeological site. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2015, XL-5/W4, 215–222.
5. Jian, M.; Dong, J.; Gong, M.; Yu, H.; Nie, L.; Yin, Y.; Lam, K.-M. Learning the Traditional Art of Chinese Calligraphy via Three-Dimensional Reconstruction and Assessment. IEEE Trans. Multimed. 2020, 22, 970–979.
6. Zhang, C.-S.; Zhang, M.-M.; Zhang, W.-X. Reconstruction of measurable three-dimensional point cloud model based on large-scene archaeological excavation sites. J. Electron. Imaging 2017, 26, 011027.
7. Ren, C.; Zhi, X.; Pu, Y.; Zhang, F. A multi-scale UAV image matching method applied to large-scale landslide reconstruction. Math. Biosci. Eng. 2021, 18, 2274–2287.
8. Rumpler, M.; Tscharf, A.; Mostegel, C.; Daftry, S.; Hoppe, C.; Prettenthaler, R.; Fraundorfer, F.; Mayer, G.; Bischof, H. Evaluations on multi-scale camera networks for precise and geo-accurate reconstructions from aerial and terrestrial images with user guidance. Comput. Vis. Image Underst. 2017, 157, 255–273.
9. Zhang, Y.; Ma, G.; Wu, J. Air-Ground Multi-Source Image Matching Based on High-Precision Reference Image. Remote Sens. 2022, 14, 588.
10. Zhu, Q.; Wang, Z.D.; Hu, H.; Xie, L.F.; Ge, X.M.; Zhang, Y.T. Leveraging photogrammetric mesh models for aerial-ground feature point matching toward integrated 3D reconstruction. ISPRS J. Photogramm. Remote Sens. 2020, 166, 26–40.
11. Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
12. Bay, H.; Ess, A.; Tuytelaars, T.; Van Gool, L. Speeded-up robust features (SURF). Comput. Vis. Image Underst. 2008, 110, 346–359.
13. Morel, J.-M.; Yu, G. ASIFT: A new framework for fully affine invariant image comparison. SIAM J. Imaging Sci. 2009, 2, 438–469.
14. Mikolajczyk, K.; Tuytelaars, T.; Schmid, C.; Zisserman, A.; Matas, J.; Schaffalitzky, F.; Kadir, T.; Van Gool, L. A Comparison of Affine Region Detectors. Int. J. Comput. Vis. 2005, 65, 43–72.
15. Xue, N.; Xia, G.S.; Bai, X.; Zhang, L.P.; Shen, W.M. Anisotropic-Scale Junction Detection and Matching for Indoor Images. IEEE Trans. Image Process. 2018, 27, 78–91.
16. Kushnir, M.; Shimshoni, I. Epipolar Geometry Estimation for Urban Scenes with Repetitive Structures. IEEE Trans. Pattern Anal. Mach. Intell. 2014, 36, 2381–2395.
17. Zhang, Q.; Li, Y.J.; Blum, R.S.; Xiang, P. Matching of images with projective distortion using transform invariant low-rank textures. J. Vis. Commun. Image Represent. 2016, 38, 602–613.
18. Yue, L.W.; Li, H.J.; Zheng, X.W. Distorted Building Image Matching with Automatic Viewpoint Rectification and Fusion. Sensors 2019, 19, 5205.
19. Hu, H.; Zhu, Q.; Du, Z.Q.; Zhang, Y.T.; Ding, Y.L. Reliable Spatial Relationship Constrained Feature Point Matching of Oblique Aerial Images. Photogramm. Eng. Remote Sens. 2015, 81, 49–58.
20. Jiang, S.; Jiang, W.S. On-Board GNSS/IMU Assisted Feature Extraction and Matching for Oblique UAV Images. Remote Sens. 2017, 9, 813.
21. Fanta-Jende, P.; Gerke, M.; Nex, F.; Vosselman, G. Co-registration of Mobile Mapping Panoramic and Airborne Oblique Images. Photogramm. Rec. 2019, 34, 149–173.
22. Shan, Q.; Wu, C.; Curless, B.; Furukawa, Y.; Hernandez, C.; Seitz, S.M. Accurate Geo-Registration by Ground-to-Aerial Image Matching. In Proceedings of the International Conference on 3D Vision (3DV), Tokyo, Japan, 8–11 December 2014; pp. 525–532.
23. Altwaijry, H.; Belongie, S. Ultra-wide Baseline Aerial Imagery Matching in Urban Environments. In Proceedings of the British Machine Vision Conference (BMVC), Bristol, UK, 9–13 September 2013; p. 15.11.
24. Yi, K.M.; Trulls, E.; Lepetit, V.; Fua, P. LIFT: Learned Invariant Feature Transform. In Proceedings of the European Conference on Computer Vision (ECCV), Amsterdam, The Netherlands, 11–14 October 2016; Springer: Cham, Switzerland, 2016; pp. 467–483.
25. Zhang, X.; Yu, F.X.; Karaman, S.; Chang, S. Learning Discriminative and Transformation Covariant Local Feature Detectors. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017; pp. 4923–4931.
26. Efe, U.; Ince, K.G.; Alatan, A.A. DFM: A Performance Baseline for Deep Feature Matching. In Proceedings of the CVPR, Virtual Event, 19–25 June 2021; pp. 4279–4288.
27. DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperPoint: Self-Supervised Interest Point Detection and Description. In Proceedings of the CVPR, Salt Lake City, UT, USA, 18–22 June 2018; pp. 337–349.
28. Sarlin, P.-E.; DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperGlue: Learning Feature Matching with Graph Neural Networks. In Proceedings of the CVPR, Virtual Event, 14–19 June 2020; pp. 4937–4946.
29. Pautrat, R.; Larsson, V.; Oswald, M.; Pollefeys, M. Online Invariance Selection for Local Feature Descriptors. In Proceedings of the ECCV, Glasgow, UK, 23–28 August 2020; Springer: Cham, Switzerland, 2020; pp. 707–724.
30. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2014, arXiv:1409.1556.
31. Jin, Y.; Mishkin, D.; Mishchuk, A.; Matas, J.; Fua, P.; Yi, K.M.; Trulls, E. Image Matching Across Wide Baselines: From Paper to Practice. Int. J. Comput. Vis. 2021, 129, 517–547.
32. Moulon, P.; Monasse, P.; Perrot, R.; Marlet, R. OpenMVG: Open Multiple View Geometry. In Proceedings of the Reproducible Research in Pattern Recognition, Virtual Event, 11 January 2017; Springer: Cham, Switzerland, 2017; pp. 60–74.
33. Mikolajczyk, K.; Schmid, C. A performance evaluation of local descriptors. IEEE Trans. Pattern Anal. Mach. Intell. 2005, 27, 1615–1630.
34. Bian, J.W.; Lin, W.Y.; Matsushita, Y.; Yeung, S.K.; Nguyen, T.D.; Cheng, M.M. GMS: Grid-based Motion Statistics for Fast, Ultra-robust Feature Correspondence. In Proceedings of the CVPR, Honolulu, HI, USA, 21–26 July 2017; pp. 2828–2837.
35. Sarlin, P.-E.; DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperGlue GitHub repository. Available online: https://github.com/magicleap/SuperGluePretrainedNetwork (accessed on 5 December 2022).
36. Efe, U.; Ince, K.G.; Alatan, A.A. DFM GitHub repository. Available online: https://github.com/ufukefe/DFM (accessed on 5 December 2022).
37. Ju, Y.; Shi, B.; Jian, M.; Qi, L.; Dong, J.; Lam, K.-M. NormAttention-PSN: A High-frequency Region Enhanced Photometric Stereo Network with Normalized Attention. Int. J. Comput. Vis. 2022, 130, 3014–3034.
| Dataset | Aerial Sensor | Ground Sensor | Ground POS: GPS | Ground POS: Attitude |
|---|---|---|---|---|
| Dort-ZECHE | SONY Nex-7 | SONY Nex-7 | ○ | ○ |
| Dort-CENTER | SONY Nex-7 | SONY Nex-7 | ○ | ○ |
| SDUST-AB | DJI_FC6310S | SONY Ilce-7 | ○ | × |
| SDUST-HOSP | DJI_FC6310R | iPad Air 3 | × | × |
| Image Pair | ASIFT (Ni/Nt) | GMS (Ni/Nt) | DFM (Ni/Nt) | SuperGlue (Ni/Nt) | Proposed (Ni/Nt) |
|---|---|---|---|---|---|
| Pair 1 | 22/35 | 0/35 | 55/64 | 184/239 | 515/517 |
| Pair 2 | 2/17 | 0/77 | 119/142 | 146/164 | 392/395 |
| Pair 3 | 0/5 | 0/76 | 0/5 | 2/4 | 404/409 |
| Pair 4 | 0/7 | 0/17 | 12/14 | 3/6 | 354/359 |
| Pair 5 | 0/16 | 0/44 | 0/5 | 0/1 | 309/311 |
| MMA | 0.3 | 0 | 0.88 | 0.81 | 0.99 |
| MNI | 4.8 | 0 | 37.2 | 67 | 394.8 |

Ni/Nt: number of correct (inlier) matches / total number of matches per pair; MMA: mean matching accuracy; MNI: mean number of inliers over the five pairs.
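The MMA and MNI rows are consistent with computing total inliers over total matches and the mean inlier count per pair, respectively (this reproduces, e.g., the ASIFT and SuperGlue columns above); the authors' exact definition is given in the paper body. A small sketch of that computation:

```python
def mma_mni(pairs):
    """pairs: list of (Ni, Nt) per image pair, i.e., inliers / total matches.
    Returns (MMA, MNI): overall inlier ratio and mean inlier count."""
    total_ni = sum(ni for ni, _ in pairs)
    total_nt = sum(nt for _, nt in pairs)
    return total_ni / total_nt, total_ni / len(pairs)

# ASIFT column of the table above:
mma, mni = mma_mni([(22, 35), (2, 17), (0, 5), (0, 7), (0, 16)])
print(round(mma, 2), mni)  # -> 0.3 4.8
```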
| Image Pair | ASIFT (Ni/Nt) | GMS (Ni/Nt) | DFM (Ni/Nt) | SuperGlue (Ni/Nt) | Proposed (Ni/Nt) |
|---|---|---|---|---|---|
| Pair 6 | 0/23 | 0/0 | 1/4 | 175/190 | 543/547 |
| Pair 7 | 0/11 | 0/15 | 3/8 | 359/374 | 489/495 |
| Pair 8 | 0/32 | 0/8 | 5/16 | 162/195 | 446/450 |
| Pair 9 | 0/17 | 0/6 | 0/2 | 141/171 | 532/537 |
| Pair 10 | 0/18 | 0/44 | 1/8 | 64/78 | 455/457 |
| MMA | 0 | 0 | 0.26 | 0.89 | 0.99 |
| MNI | 0 | 0 | 2 | 180.2 | 493 |
| Image Pair | ASIFT (Ni/Nt) | GMS (Ni/Nt) | DFM (Ni/Nt) | SuperGlue (Ni/Nt) | Proposed (Ni/Nt) |
|---|---|---|---|---|---|
| Pair 11 | 4/23 | 0/21 | 71/75 | 110/126 | 576/584 |
| Pair 12 | 0/15 | 0/8 | 65/83 | 146/152 | 406/417 |
| Pair 13 | 2/21 | 0/0 | 16/24 | 35/40 | 374/377 |
| Pair 14 | 0/16 | 0/14 | 89/104 | 197/211 | 511/518 |
| Pair 15 | 3/18 | 0/4 | 33/48 | 20/25 | 506/523 |
| MMA | 0.097 | 0 | 0.82 | 0.92 | 0.99 |
| MNI | 1.8 | 0 | 54.8 | 101.6 | 474.6 |
| Image Pair | ASIFT (Ni/Nt) | GMS (Ni/Nt) | DFM (Ni/Nt) | SuperGlue (Ni/Nt) | Proposed (Ni/Nt) |
|---|---|---|---|---|---|
| Pair 16 | 3/12 | 0/35 | 0/3 | 7/14 | 48/53 |
| Pair 17 | 0/11 | 0/10 | 0/9 | 0/8 | 40/43 |
| Pair 18 | 0/9 | 0/16 | 0/5 | 0/9 | 39/41 |
| Pair 19 | 0/7 | 0/12 | 0/6 | 2/18 | 33/33 |
| Pair 20 | 0/10 | 0/9 | 0/3 | 0/11 | 37/39 |
| MMA | 0.07 | 0 | 0 | 0.15 | 0.94 |
| MNI | 0.6 | 0 | 0 | 1.8 | 39.4 |