PM-Net: A Multi-Level Keypoints Detector and Patch Feature Learning Network for Optical and SAR Image Matching
Abstract
1. Introduction
- We present a multi-level keypoints detector, called MKD, which fuses low-level and high-level feature maps and detects keypoints at their joint peak values, exploiting richer image information to obtain more robust keypoints (a minimal sketch of this idea follows this list).
- We propose a two-channel patch matching network for optical and SAR images to improve patch matching performance. Once trained, the network directly determines the similarity between optical and SAR image patches without manually designed features or descriptors, and by processing the two patches jointly it effectively reduces the training cost (see the sketch under Section 3.3 below).
- We propose PM-Net, which integrates the two components above to address the difficulty of acquiring robust keypoints and performing accurate patch matching between optical and SAR images.
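As a concrete illustration of the multi-level idea, the following is a minimal sketch, assuming one low-level and one high-level feature map, a bilinear upsampling step, and an illustrative fusion weight `alpha`; it is not the authors' exact MKD architecture.

```python
import torch
import torch.nn.functional as F

def joint_peak_keypoints(low_feat, high_feat, alpha=0.5, k=512):
    """Sketch of multi-level keypoint detection: fuse a low-level and a
    high-level feature map and keep locations that are local peaks of the
    fused response. `alpha` (fusion weight) and `k` (number of keypoints)
    are illustrative choices, not values from the paper.

    low_feat:  (1, C1, H, W)     early-layer feature map
    high_feat: (1, C2, H/s, W/s) deeper, lower-resolution feature map
    """
    # Bring the high-level map to the low-level resolution.
    high_up = F.interpolate(high_feat, size=low_feat.shape[-2:],
                            mode="bilinear", align_corners=False)

    # Per-level response: channel-wise maximum activation.
    low_resp = low_feat.max(dim=1, keepdim=True).values   # (1, 1, H, W)
    high_resp = high_up.max(dim=1, keepdim=True).values   # (1, 1, H, W)

    # Weighted combination of the two response maps.
    fused = alpha * low_resp + (1.0 - alpha) * high_resp

    # A location is a peak if it equals the max of its 3x3 neighborhood.
    local_max = F.max_pool2d(fused, kernel_size=3, stride=1, padding=1)
    peaks = (fused == local_max).float() * fused

    # Keep the k strongest peaks as keypoints (row, col coordinates).
    scores, idx = peaks.flatten().topk(k)
    width = fused.shape[-1]
    rows = torch.div(idx, width, rounding_mode="floor")
    cols = idx % width
    return torch.stack([rows, cols], dim=1), scores
```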
2. Related Work
2.1. Area-Based Methods
2.2. Feature-Based Methods
2.3. Learning-Based Methods
3. Method
3.1. Overview
- Multi-level keypoints detector
- Image patch feature learning
3.2. Multi-Level Keypoints Detector
3.3. Image Patch Feature Learning
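The following is a minimal sketch of the two-channel design, not the authors' exact architecture: stacking the optical and SAR patches as the two input channels follows the paper's description, while the layer widths, the 64×64 patch size, and the pooling choices are illustrative assumptions.

```python
import torch
import torch.nn as nn

class TwoChannelMatchNet(nn.Module):
    """Sketch of a two-channel patch matching network: an optical patch and
    a SAR patch are stacked as the two input channels and the network
    predicts a match score directly, without separate descriptors."""

    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(2, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                   # 64x64 -> 32x32
            nn.Conv2d(64, 128, 3, padding=1), nn.ReLU(inplace=True),
            nn.MaxPool2d(2),                   # 32x32 -> 16x16
            nn.Conv2d(128, 256, 3, padding=1), nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),           # -> (256, 1, 1)
        )
        self.classifier = nn.Linear(256, 1)    # match / non-match logit

    def forward(self, optical, sar):
        # optical, sar: (B, 1, 64, 64) grayscale patches
        x = torch.cat([optical, sar], dim=1)   # (B, 2, 64, 64)
        x = self.features(x).flatten(1)        # (B, 256)
        return self.classifier(x).squeeze(1)   # (B,) similarity logits
```

Because the two modalities are processed jointly from the first layer, cross-modal interactions are learned directly rather than through separately extracted descriptors, which is consistent with the paper's claim that joint processing of patch pairs reduces training cost.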
3.4. Loss Design
3.4.1. Hardest-Contrastive Loss
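For reference, a minimal sketch of a hardest-contrastive loss in the style of FCGF (Choy et al.), assuming L2-normalized descriptors, given positive index pairs, and illustrative margins; the hardest negative is mined in one direction only for brevity.

```python
import torch

def hardest_contrastive_loss(desc_a, desc_b, pos_pairs, m_pos=0.1, m_neg=1.4):
    """Sketch of a hardest-contrastive loss (after FCGF, Choy et al.).
    desc_a, desc_b: (N, D) L2-normalized descriptors from the two images.
    pos_pairs: (P, 2) indices (i, j) of corresponding descriptors.
    m_pos, m_neg: illustrative margins, not values from the paper.
    """
    i, j = pos_pairs[:, 0], pos_pairs[:, 1]
    d_pos = (desc_a[i] - desc_b[j]).norm(dim=1)      # positive distances

    # Pairwise distances between anchors and all candidates in image B.
    dist = torch.cdist(desc_a[i], desc_b)            # (P, N)
    # Exclude the true positive before mining the hardest negative.
    dist[torch.arange(len(i)), j] = float("inf")
    d_neg = dist.min(dim=1).values                   # hardest negatives

    # Pull positives inside m_pos, push hardest negatives beyond m_neg.
    loss_pos = torch.clamp(d_pos - m_pos, min=0).pow(2).mean()
    loss_neg = torch.clamp(m_neg - d_neg, min=0).pow(2).mean()
    return loss_pos + loss_neg
```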
3.4.2. Cross-Entropy Loss
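For the cross-entropy term, a minimal usage sketch with PyTorch's `BCEWithLogitsLoss`, assuming the two-channel network outputs raw logits and labels are 1 for matching optical-SAR pairs and 0 otherwise:

```python
import torch
import torch.nn as nn

# Binary cross-entropy on the network's raw logits: label 1 for a true
# optical-SAR correspondence, 0 for a non-matching pair.
bce = nn.BCEWithLogitsLoss()

logits = torch.randn(8)                     # stand-in for network output
labels = torch.randint(0, 2, (8,)).float()  # stand-in match labels
loss = bce(logits, labels)
```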
4. Experiments
4.1. Implementation Details
4.1.1. Training Dataset
4.1.2. Training Details
4.1.3. Test Dataset
4.2. Comparison Experiments
4.2.1. Qualitative Experiments
4.2.2. Quantitative Experiments
1. The NCM (number of correct matches) is determined by checking each output matching point: a point is counted as a correct match when its matching error is below the given threshold (a sketch of all three metrics follows this list).
2. The F1-measure captures the suitability of the matched points by jointly considering Recall and Precision, and is calculated as follows [47]:
$$F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$$
3. The root mean square error (RMSE) of the matching points reflects the accuracy of the points derived by the matching algorithm and is calculated as:
$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[\left(x_i - x_i'\right)^2 + \left(y_i - y_i'\right)^2\right]}$$
where $(x_i, y_i)$ and $(x_i', y_i')$ denote the coordinates of the $i$-th pair of matched points (the former mapped by the ground-truth transformation) and $N$ is the number of matches.
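As referenced above, a minimal sketch of computing NCM, F1, and RMSE from predicted matches; the ground-truth mapping `warp`, the Recall denominator convention, and the 3-pixel threshold are illustrative assumptions.

```python
import numpy as np

def evaluate_matches(pts_opt, pts_sar, warp, n_gt_correspondences, thresh=3.0):
    """Compute NCM, F1, and RMSE for a set of predicted matches.

    pts_opt, pts_sar: (N, 2) matched point coordinates in the two images.
    warp: callable mapping optical coordinates into the SAR image under the
          ground-truth transformation (assumed available for the test data).
    n_gt_correspondences: Recall denominator (an illustrative convention).
    thresh: error threshold in pixels (3.0 is an assumption).
    """
    residuals = np.linalg.norm(warp(pts_opt) - pts_sar, axis=1)
    correct = residuals < thresh

    ncm = int(correct.sum())                   # number of correct matches
    precision = ncm / max(len(residuals), 1)
    recall = ncm / max(n_gt_correspondences, 1)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)

    # RMSE over the correct matches only (illustrative convention).
    rmse = float(np.sqrt(np.mean(residuals[correct] ** 2))) if ncm else float("nan")
    return ncm, f1, rmse
```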
4.2.3. Ablation Experiments
- Performance of MKD
- Performance of the two-channel matching network
5. Discussion
6. Conclusions
Author Contributions
Funding
Data Availability Statement
Conflicts of Interest
References
1. Suri, S.; Reinartz, P. Mutual-information-based registration of TerraSAR-X and Ikonos imagery in urban areas. IEEE Trans. Geosci. Remote Sens. 2009, 48, 939–949.
2. Li, G.; Lin, Y.; Qu, X. An infrared and visible image fusion method based on multi-scale transformation and norm optimization. Inf. Fusion 2021, 71, 109–129.
3. Ma, J.; Tang, L.; Xu, M.; Zhang, H.; Xiao, G. STDFusionNet: An infrared and visible image fusion network based on salient target detection. IEEE Trans. Instrum. Meas. 2021, 70, 1–13.
4. Sahin, G.; Cabuk, S.N.; Cetin, M. The change detection in coastal settlements using image processing techniques: A case study of Korfez. Environ. Sci. Pollut. Res. 2022, 29, 15172–15187.
5. Hou, B.; Liu, Q.; Wang, H.; Wang, Y. From W-Net to CDGAN: Bitemporal change detection via deep learning techniques. IEEE Trans. Geosci. Remote Sens. 2019, 58, 1790–1802.
6. Zhang, H.; Lei, L.; Ni, W.; Tang, T.; Wu, J.; Xiang, D.; Kuang, G. Explore better network framework for high-resolution optical and SAR image matching. IEEE Trans. Geosci. Remote Sens. 2021, 60.
7. Fan, B.; Huo, C.; Pan, C.; Kong, Q. Registration of optical and SAR satellite images by exploring the spatial relationship of the improved SIFT. IEEE Geosci. Remote Sens. Lett. 2012, 10, 657–661.
8. Xiang, Y.; Wang, F.; You, H. OS-SIFT: A robust SIFT-like algorithm for high-resolution optical-to-SAR image registration in suburban areas. IEEE Trans. Geosci. Remote Sens. 2018, 56, 3078–3090.
9. Gong, M.; Zhao, S.; Jiao, L.; Tian, D.; Wang, S. A novel coarse-to-fine scheme for automatic image registration based on SIFT and mutual information. IEEE Trans. Geosci. Remote Sens. 2013, 52, 4328–4338.
10. Cui, S.; Ma, A.; Wan, Y.; Zhong, Y.; Luo, B.; Xu, M. Cross-modality image matching network with modality-invariant feature representation for airborne-ground thermal infrared and visible datasets. IEEE Trans. Geosci. Remote Sens. 2021, 60.
11. Merkle, N.; Luo, W.; Auer, S.; Müller, R.; Urtasun, R. Exploiting deep matching and SAR data for the geo-localization accuracy improvement of optical satellite images. Remote Sens. 2017, 9, 586.
12. Zhang, H.; Ni, W.; Yan, W.; Xiang, D.; Wu, J.; Yang, X.; Bian, H. Registration of multimodal remote sensing image based on deep fully convolutional neural network. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2019, 12, 3028–3042.
13. Hughes, L.H.; Schmitt, M.; Mou, L.; Wang, Y.; Zhu, X.X. Identifying corresponding patches in SAR and optical images with a pseudo-Siamese CNN. IEEE Geosci. Remote Sens. Lett. 2018, 15, 784–788.
14. Zhu, H.; Jiao, L.; Ma, W.; Liu, F.; Zhao, W. A novel neural network for remote sensing image matching. IEEE Trans. Neural Netw. Learn. Syst. 2019, 30, 2853–2865.
15. Jiang, X.; Ma, J.; Xiao, G.; Shao, Z.; Guo, X. A review of multimodal image matching: Methods and applications. Inf. Fusion 2021, 73, 22–71.
16. Parmehr, E.G.; Zhang, C.; Fraser, C.S. Automatic registration of multi-source data using mutual information. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2012, 7, 301–308.
17. Liang, J.; Liu, X.; Huang, K.; Li, X.; Wang, D.; Wang, X. Automatic registration of multisensor images using an integrated spatial and mutual information (SMI) metric. IEEE Trans. Geosci. Remote Sens. 2013, 52, 603–615.
18. Xu, X.; Li, X.; Liu, X.; Shen, H.; Shi, Q. Multimodal registration of remotely sensed images based on Jeffrey’s divergence. ISPRS J. Photogramm. Remote Sens. 2016, 122, 97–115.
19. Wang, S.; Quan, D.; Liang, X.; Ning, M.; Guo, Y.; Jiao, L. A deep learning framework for remote sensing image registration. ISPRS J. Photogramm. Remote Sens. 2018, 145, 148–164.
20. Lowe, D.G. Distinctive image features from scale-invariant keypoints. Int. J. Comput. Vis. 2004, 60, 91–110.
21. Xu, C.; Sui, H.; Li, D.; Sun, K.; Liu, J. An automatic optical and SAR image registration method using iterative multi-level and refinement model. Int. Arch. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, 7, 593–600.
22. Ma, J.; Jiang, X.; Fan, A.; Jiang, J.; Yan, J. Image matching from handcrafted to deep features: A survey. Int. J. Comput. Vis. 2021, 129, 23–79.
23. LeCun, Y.; Bengio, Y.; Hinton, G. Deep learning. Nature 2015, 521, 436–444.
24. Luo, Z.; Shen, T.; Zhou, L.; Zhang, J.; Yao, Y.; Li, S.; Fang, T.; Quan, L. ContextDesc: Local descriptor augmentation with cross-modality context. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 2527–2536.
25. Dusmanu, M.; Rocco, I.; Pajdla, T.; Pollefeys, M.; Sivic, J.; Torii, A.; Sattler, T. D2-Net: A trainable CNN for joint detection and description of local features. arXiv 2019, arXiv:1905.03561.
26. Revaud, J.; De Souza, C.; Humenberger, M.; Weinzaepfel, P. R2D2: Reliable and repeatable detector and descriptor. In Proceedings of the Advances in Neural Information Processing Systems, Vancouver, BC, Canada, 8–14 December 2019; Volume 32.
27. Luo, Z.; Zhou, L.; Bai, X.; Chen, H.; Zhang, J.; Yao, Y.; Li, S.; Fang, T.; Quan, L. ASLFeat: Learning local features of accurate shape and localization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 6589–6598.
28. Yang, Z.; Dan, T.; Yang, Y. Multi-temporal remote sensing image registration using deep convolutional features. IEEE Access 2018, 6, 38544–38555.
29. Ye, F.; Su, Y.; Xiao, H.; Zhao, X.; Min, W. Remote sensing image registration using convolutional neural network features. IEEE Geosci. Remote Sens. Lett. 2018, 15, 232–236.
30. Ma, W.; Zhang, J.; Wu, Y.; Jiao, L.; Zhu, H.; Zhao, W. A novel two-step registration method for remote sensing images based on deep and local features. IEEE Trans. Geosci. Remote Sens. 2019, 57, 4834–4843.
31. Fischler, M.A.; Bolles, R.C. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 1981, 24, 381–395.
32. Simo-Serra, E.; Trulls, E.; Ferraz, L.; Kokkinos, I.; Fua, P.; Moreno-Noguer, F. Discriminative learning of deep convolutional feature point descriptors. In Proceedings of the IEEE International Conference on Computer Vision, Washington, DC, USA, 7–13 December 2015; pp. 118–126.
33. Ahmed, E.; Jones, M.; Marks, T.K. An improved deep learning architecture for person re-identification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 3908–3916.
34. He, H.; Chen, M.; Chen, T.; Li, D. Matching of remote sensing images with complex background variations via Siamese convolutional neural network. Remote Sens. 2018, 10, 355.
35. Zagoruyko, S.; Komodakis, N. Learning to compare image patches via convolutional neural networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Boston, MA, USA, 7–12 June 2015; pp. 4353–4361.
36. Zhang, L.; Rusinkiewicz, S. Learning to detect features in texture images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 6325–6333.
37. Tian, Y.; Fan, B.; Wu, F. L2-Net: Deep learning of discriminative patch descriptor in Euclidean space. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 661–669.
38. Dai, J.; Qi, H.; Xiong, Y.; Li, Y.; Zhang, G.; Hu, H.; Wei, Y. Deformable convolutional networks. In Proceedings of the IEEE International Conference on Computer Vision, Venice, Italy, 22–29 October 2017; pp. 764–773.
39. Zhu, X.; Hu, H.; Lin, S.; Dai, J. Deformable ConvNets v2: More deformable, better results. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 9308–9316.
40. Choy, C.; Park, J.; Koltun, V. Fully convolutional geometric features. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27 October–2 November 2019; pp. 8958–8966.
41. Ruby, U.; Yendapalli, V. Binary cross entropy with deep learning technique for image classification. Int. J. Adv. Trends Comput. Sci. Eng. 2020, 9.
42. Shen, T.; Luo, Z.; Zhou, L.; Zhang, R.; Zhu, S.; Fang, T.; Quan, L. Matchable image retrieval by learning from surface reconstruction. In Proceedings of the Asian Conference on Computer Vision, Perth, Australia, 2–6 December 2018; pp. 415–431.
43. Huang, M.; Xu, Y.; Qian, L.; Shi, W.; Zhang, Y.; Bao, W.; Wang, N.; Liu, X.; Xiang, X. The QXS-SAROPT dataset for deep learning in SAR-optical data fusion. arXiv 2021, arXiv:2103.08259.
44. Zhao, L.; Zhang, Q.; Li, Y.; Qi, Y.; Yuan, X.; Liu, J.; Li, H. China's Gaofen-3 satellite system and its application and prospect. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2021, 14, 11019–11028.
45. Bottou, L. Stochastic gradient descent tricks. In Neural Networks: Tricks of the Trade; Springer: Berlin/Heidelberg, Germany, 2012; pp. 421–436.
46. Dogo, E.; Afolabi, O.; Nwulu, N.; Twala, B.; Aigbavboa, C. A comparative analysis of gradient descent-based optimization algorithms on convolutional neural networks. In Proceedings of the 2018 International Conference on Computational Techniques, Electronics and Mechanical Systems (CTEMS), Belgaum, India, 21–22 December 2018; pp. 92–99.
47. Nunes, C.F.; Pádua, F.L. A local feature descriptor based on log-Gabor filters for keypoint matching in multispectral images. IEEE Geosci. Remote Sens. Lett. 2017, 14, 1850–1854.
48. Ye, Y.; Shan, J.; Hao, S.; Bruzzone, L.; Qin, Y. A local phase based invariant feature for remote sensing image matching. ISPRS J. Photogramm. Remote Sens. 2018, 142, 205–221.
NCM (number of correct matches) of each method on test image pairs A–F:

| Image Pair | Contextdesc | R2D2 | D2-Net | ASLFeat | PM-Net |
|---|---|---|---|---|---|
| A | 140 | 129 | 265 | 243 | 305 |
| B | 105 | 45 | 155 | 140 | 259 |
| C | 7 | 34 | 81 | 101 | 238 |
| D | 3 | 12 | 113 | 16 | 216 |
| E | 6 | 5 | 124 | 94 | 298 |
| F | 2 | 62 | 165 | 116 | 276 |
F1-measure of each method on test image pairs A–F:

| Image Pair | Contextdesc | R2D2 | D2-Net | ASLFeat | PM-Net |
|---|---|---|---|---|---|
| A | 0.054 | 0.116 | 0.487 | 0.213 | 0.359 |
| B | 0.042 | 0.049 | 0.342 | 0.183 | 0.208 |
| C | 0.074 | 0.023 | 0.155 | 0.313 | 0.388 |
| D | 0.041 | 0.051 | 0.184 | 0.082 | 0.297 |
| E | 0.025 | 0.055 | 0.163 | 0.057 | 0.219 |
| F | 0.058 | 0.103 | 0.370 | 0.078 | 0.434 |
RMSE (pixels) of each method on test image pairs A–F ("/" indicates that no valid result was obtained):

| Image Pair | Contextdesc | R2D2 | D2-Net | ASLFeat | PM-Net |
|---|---|---|---|---|---|
| A | 3.084 | 1.358 | 2.465 | 1.557 | 1.043 |
| B | 2.691 | 3.480 | 2.151 | 2.377 | 1.101 |
| C | 3.671 | 3.062 | 2.590 | 3.372 | 1.166 |
| D | / | 3.237 | 3.149 | 3.671 | 1.797 |
| E | 3.196 | 3.839 | 3.226 | 3.504 | 1.989 |
| F | / | 3.592 | 3.319 | 3.746 | 2.810 |
Repeatability (%Rep., in %) in the MKD ablation; Tests 1–5 are shown, and the last row averages Tests 1–10:

| Test Number | SIFT + MM | D2-Net + MM | MKD (w/o Peak Values) + MM | MKD (w/o Weighted Combination) + MM | MKD + MM |
|---|---|---|---|---|---|
| Test 1 | 44.18 | 62.70 | 26.00 | 30.00 | 61.06 |
| Test 2 | 49.56 | 47.27 | 28.24 | 42.86 | 53.33 |
| Test 3 | 21.13 | 31.46 | 45.28 | 33.33 | 48.05 |
| Test 4 | 41.57 | 44.72 | 26.09 | 39.13 | 55.07 |
| Test 5 | 38.51 | 63.56 | 35.91 | 39.89 | 59.87 |
| Average (1–10) | 38.70 | 46.64 | 30.91 | 38.18 | 53.19 |
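For context on the %Rep. metric above, a minimal sketch of one common repeatability convention; the ground-truth mapping `warp`, the pixel tolerance, and the one-directional counting are assumptions, not the paper's exact protocol.

```python
import numpy as np

def repeatability(kpts_opt, kpts_sar, warp, tol=3.0):
    """Sketch of a keypoint repeatability score: the fraction of optical
    keypoints that, once mapped into the SAR image by the ground-truth
    transformation `warp`, fall within `tol` pixels of some SAR keypoint.

    kpts_opt, kpts_sar: (N, 2) and (M, 2) keypoint coordinates.
    """
    mapped = warp(kpts_opt)                                  # (N, 2)
    # Distance from each mapped optical keypoint to the nearest SAR keypoint.
    d = np.linalg.norm(mapped[:, None, :] - kpts_sar[None, :, :], axis=2)
    nearest = d.min(axis=1)                                  # (N,)
    return 100.0 * float((nearest < tol).mean())             # %, as in %Rep.
```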
Precision (%) in the matching-network ablation (PSN: pseudo-Siamese network; SN: Siamese network; 2chN: the proposed two-channel network); Tests 1–5 are shown, and the last row averages Tests 1–10:

| Test Number | MKD + PSN | MKD + SN | MKD + 2chN |
|---|---|---|---|
| Test 1 | 38.72 | 40.89 | 57.05 |
| Test 2 | 31.93 | 61.69 | 66.36 |
| Test 3 | 36.84 | 51.49 | 45.38 |
| Test 4 | 34.13 | 22.98 | 38.89 |
| Test 5 | 41.19 | 35.64 | 49.87 |
| Average (1–10) | 38.01 | 40.69 | 48.28 |