Stereo Radargrammetry Using Deep Learning-Based Image Matching with Fine-Tuned Model on Synthetic Aperture Radar Images
Highlights
- A Transformer-based model (RoMa) fine-tuned on a newly constructed SAR dataset significantly outperforms both conventional image matching methods and pre-trained deep learning models.
- Establishing direct matching on slant-range images avoids the image quality degradation caused by traditional ground-range projection, achieving highly accurate and dense 3D elevation measurements.
- The proposed framework successfully bridges the domain gap between optical and SAR images, enabling robust 3D elevation measurement in mountainous and forested terrains with large geometric modulations.
- Eliminating the need for ground-range projection prevents the loss of high-frequency components, making it possible to generate dense and accurate Digital Surface Models (DSMs) while maintaining the original SAR image resolution.
Abstract
1. Introduction
- Dataset generation: We develop a fully automated pipeline to construct a large-scale, patch-based SAR image dataset. By back-calculating the true disparities using a reference DSM and a rigorous SAR projection model, we overcome the lack of training data in the SAR domain.
- Model adaptation: We fine-tune RoMa, a Transformer-based dense matching model, on our SAR dataset. This explicitly adapts the network to capture complex, SAR-specific geometric modulations.
- Slant-range matching: We establish a framework that performs matching directly on slant-range images. By eliminating the conventional ground-range projection step, our method preserves high-frequency components and avoids interpolation-induced image quality degradation.
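The dataset-generation idea can be illustrated with a minimal sketch. It assumes straight flight tracks and zero-Doppler imaging, which is a simplification of the rigorous projection model mentioned above; the function names, the sensor dictionaries, and the parameters are hypothetical and not taken from the paper. Each DSM point is projected into both slant-range images, and the difference between the two pixel positions gives the ground-truth disparity.

```python
import numpy as np


def project_zero_doppler(point, track_origin, track_dir,
                         near_range, range_spacing, azimuth_spacing):
    """Project a 3D point to (azimuth, range) pixel coordinates.

    Assumption: the platform flies along a straight line and the point is
    imaged at its closest approach (zero Doppler). All names are illustrative.
    """
    u = np.asarray(track_dir, dtype=float)
    u = u / np.linalg.norm(u)                      # unit along-track vector
    rel = np.asarray(point, dtype=float) - np.asarray(track_origin, dtype=float)
    along = rel @ u                                # along-track position of closest approach
    slant = np.linalg.norm(rel - along * u)        # slant range at closest approach
    azimuth_px = along / azimuth_spacing
    range_px = (slant - near_range) / range_spacing
    return np.array([azimuth_px, range_px])


def true_disparity(point, sensor_a, sensor_b):
    """Disparity of one DSM point between two slant-range images (b minus a)."""
    pa = project_zero_doppler(point, **sensor_a)
    pb = project_zero_doppler(point, **sensor_b)
    return pb - pa
```

Applying `true_disparity` to every cell of a reference DSM would yield a dense disparity map that could serve as supervision for a patch pair, provided the two sensor descriptions are accurate.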
2. Materials and Methods
2.1. Geometric Model of Stereo Radargrammetry
2.2. Overview of the Proposed Method
1. Introducing RoMa [12], a deep learning model robust against complex geometric deformations, as the image correspondence method.
2. Developing an SAR image dataset construction method that automatically calculates ground truth disparities using a DSM and the projection model, enabling the application of deep learning models to SAR image processing.
3. Eliminating the ground projection process for SAR images, performing matching directly using slant-range images as input.
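To see why ground-range projection involves resampling, consider a minimal sketch under a flat-terrain assumption, where a slant range r and a sensor altitude H give a ground range of sqrt(r^2 - H^2). The function names and values are hypothetical and this is not the projection used in the paper. Uniformly spaced slant-range samples land at non-uniform ground positions, so producing a ground-range image on a regular grid requires interpolation.

```python
import numpy as np


def slant_to_ground_range(slant_range, altitude):
    """Ground range over flat terrain for a given slant range and sensor altitude."""
    slant_range = np.asarray(slant_range, dtype=float)
    return np.sqrt(slant_range**2 - altitude**2)


def ground_pixel_spacing(near_range, range_spacing, n_samples, altitude):
    """Ground distance between consecutive, uniformly spaced slant-range samples."""
    slant = near_range + range_spacing * np.arange(n_samples)
    return np.diff(slant_to_ground_range(slant, altitude))
```

The returned spacing varies across the swath (coarser at near range), which is the non-uniformity that a ground-range resampling step has to interpolate away.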
2.3. Construction of SAR Image Dataset
2.4. Fine-Tuning of RoMa for SAR Images
2.5. Overall Pipeline of 3D Measurement
3. Results
3.1. Experimental Setup
3.1.1. Study Area and Data
3.1.2. Dataset Preparation
3.1.3. Implementation Details
3.2. Matching Performance
3.2.1. Evaluation Metrics and Test Patches
- Patch 1: The crossing angle of the flight paths is relatively small, and the terrain is generally flat. Textures such as agricultural boundaries are clearly visible, making image matching relatively straightforward.
- Patch 2: The crossing angle is larger, and the area is located on a mountainside with steep terrain. The severe geometric image modulations make accurate correspondence much more difficult than in Patch 1.
- Patch 3: In addition to a large crossing angle, the trees across the image appear to lean in different directions due to SAR-specific geometric modulations (e.g., layover). Thus, similarly to Patch 2, accurate matching is highly challenging.
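Matching accuracy on patches like these is commonly scored as the share of pixels whose disparity error stays within a pixel threshold. The sketch below assumes that definition, with thresholds of 1, 3, 5, and 10 pixels and an inclusive comparison; the exact metric is not given in this excerpt, and the function name and arguments are hypothetical.

```python
import numpy as np


def threshold_accuracy(pred_disp, true_disp, thresholds=(1, 3, 5, 10), valid=None):
    """Percentage of pixels whose disparity error is within each threshold.

    pred_disp, true_disp: arrays of shape (H, W, 2) holding 2D disparities.
    valid: optional boolean mask of shape (H, W) selecting pixels to score.
    """
    pred = np.asarray(pred_disp, dtype=float)
    true = np.asarray(true_disp, dtype=float)
    err = np.linalg.norm(pred - true, axis=-1)     # per-pixel Euclidean error
    if valid is not None:
        err = err[np.asarray(valid, dtype=bool)]
    else:
        err = err.ravel()
    return {t: 100.0 * float(np.mean(err <= t)) for t in thresholds}
```

Passing a validity mask restricts the score to pixels where a reference disparity exists, which matters when the reference DSM does not cover the whole patch.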

3.2.2. Quantitative Results
3.2.3. Qualitative Results
3.3. 3D Measurement Accuracy
3.3.1. Evaluation Metrics and Test Areas
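As a minimal sketch, assuming elevation accuracy is summarized as the mean and standard deviation of the difference between a measured DSM and a reference DSM, the statistics could be computed as follows. The function name, the `coverage_percent` field, and the use of NaN to mark unmeasured pixels are assumptions made for illustration, not definitions from the paper.

```python
import numpy as np


def elevation_error_stats(measured_dsm, reference_dsm, valid=None):
    """Mean and standard deviation of elevation error over measured pixels.

    Pixels that are NaN in either DSM are treated as unmeasured and skipped.
    """
    measured = np.asarray(measured_dsm, dtype=float)
    reference = np.asarray(reference_dsm, dtype=float)
    mask = np.isfinite(measured) & np.isfinite(reference)
    if valid is not None:
        mask &= np.asarray(valid, dtype=bool)
    if not mask.any():
        raise ValueError("no valid pixels to evaluate")
    err = measured[mask] - reference[mask]         # signed elevation error
    return {
        "mean": float(err.mean()),
        "std": float(err.std()),
        "coverage_percent": 100.0 * float(mask.mean()),
    }
```

Reporting the share of measured pixels alongside the error matters because a method can look accurate simply by discarding most of its difficult pixels.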
3.3.2. Quantitative Results
3.3.3. Qualitative Results
4. Discussion
4.1. Mechanism Analysis: Transformer- vs. CNN-Based Matching
4.2. Impact of SAR-Specific Geometric Distortions
4.3. Reliability and Confidence-Aware Filtering
4.4. Error Propagation and System Limitations
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Definition |
|---|---|
| DKM | Deep Kernelized Dense Geometric Matching |
| DSM | Digital Surface Model |
| GCP | Ground Control Point |
| InSAR | Interferometric Synthetic Aperture Radar |
| LiDAR | Light Detection and Ranging |
| NICT | National Institute of Information and Communications Technology |
| POC | Phase-Only Correlation |
| RESTEC | Remote Sensing Technology Center of Japan |
| RoMa | Robust Dense Feature Matching |
| SAR | Synthetic Aperture Radar |
References
1. Richards, J.A. Remote Sensing with Imaging Radar; Springer: Berlin/Heidelberg, Germany, 2009.
2. Kerle, N. Encyclopedia of Natural Hazards; Springer: Dordrecht, The Netherlands, 2013.
3. Lillesand, T.M.; Kiefer, R.W.; Chipman, J.W. Remote Sensing and Image Interpretation; John Wiley & Sons: New York, NY, USA, 2015.
4. Rosen, P.; Hensley, S.; Joughin, I.; Li, F.; Madsen, S.; Rodriguez, E.; Goldstein, R. Synthetic aperture radar interferometry. Proc. IEEE 2000, 88, 333–382.
5. Toutin, T.; Gray, L. State-of-the-art of elevation extraction from satellite SAR data. ISPRS J. Photogramm. Remote Sens. 2000, 55, 13–33.
6. Leberl, F.W. Radargrammetric Image Processing; Artech House: Norwood, MA, USA, 1990.
7. Raggam, H.; Gutjahr, K.; Perko, R.; Schardt, M. Assessment of the Stereo-Radargrammetric Mapping Potential of TerraSAR-X Multibeam Spotlight Data. IEEE Trans. Geosci. Remote Sens. 2010, 48, 971–977.
8. Maruki, D.; Sakai, S.; Ito, K.; Aoki, T.; Uemoto, J.; Uratsuka, S. Stereo radargrammetry using airborne SAR images without GCP. In Proceedings of the 2015 IEEE International Conference on Image Processing (ICIP), Quebec City, QC, Canada, 27–30 September 2015; pp. 3585–3589.
9. Insfran, K.; Ito, K.; Aoki, T. Accurate 3D measurement from two SAR images without prior knowledge of scene. In Proceedings of the 2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS, Brussels, Belgium, 11–16 July 2021; pp. 4814–4817.
10. Takita, K.; Muquit, M.; Aoki, T.; Higuchi, T. A sub-pixel correspondence search technique for computer vision applications. IEICE Trans. Fundam. 2004, E87-A, 1913–1923.
11. Goodfellow, I.; Bengio, Y.; Courville, A. Deep Learning; The MIT Press: Cambridge, MA, USA, 2016.
12. Edstedt, J.; Sun, Q.; Bökman, G.; Wadenbäck, M.; Felsberg, M. RoMa: Robust dense feature matching. In Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 19790–19800.
13. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16x16 words: Transformers for image recognition at scale. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, New York, NY, USA, 23–26 June 2021.
14. Li, H.L.; Chen, S.W. Polyhedral Corner Reflectors Multidomain Joint Characterization With Fully Polarimetric Radar. IEEE Trans. Antennas Propag. 2025, 73, 10679–10693.
15. Li, X.; Liu, L.; Wan, G.; Zheng, F.; Guo, S.; Sun, G.; Wang, Z.; Liu, X. Physics-Driven SAR Target Detection: A Review and Perspective. Remote Sens. 2026, 18, 200.
16. Li, H.L.; Chen, S.W. General Polarimetric Correlation Pattern: A Visualization and Characterization Tool for Target Joint-Domain Scattering Mechanisms Investigation. IEEE Trans. Geosci. Remote Sens. 2026, 64, 5200417.
17. Sasayama, T.; Ito, S.; Ito, K.; Aoki, T. Stereo Radargrammetry Using Deep Learning from Airborne SAR Images. In Proceedings of the 2025 IEEE International Geoscience and Remote Sensing Symposium, Brisbane, Australia, 3–8 August 2025; pp. 7904–7908.
18. Oquab, M.; Darcet, T.; Moutakanni, T.; Vo, H.V.; Szafraniec, M.; Khalidov, V.; Fernandez, P.; Haziza, D.; Massa, F.; El-Nouby, A.; et al. DINOv2: Learning robust visual features without supervision. arXiv 2024, arXiv:2304.07193.
19. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2015, arXiv:1409.1556.
20. Edstedt, J.; Athanasiadis, I.; Wadenbäck, M.; Felsberg, M. DKM: Dense kernelized feature matching for geometry estimation. In Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 17765–17775.
21. Nadai, A.; Uratsuka, S.; Umehara, T.; Matsuoka, T.; Satake, M. Development of X-band airborne polarimetric and interferometric SAR with submeter spatial resolution. In Proceedings of the 2009 IEEE International Geoscience and Remote Sensing, Cape Town, South Africa, 12–17 July 2009; Volume 2, pp. 913–916.
22. Takaku, J.; Tadono, T.; Tsutsui, K.; Ichikawa, M. Validation of ‘AW3D’ global DSM generated from ALOS PRISM. ISPRS Ann. Photogramm. Remote Sens. Spat. Inf. Sci. 2016, III-4, 25–31.
23. Li, Z.; Snavely, N. MegaDepth: Learning single-view depth prediction from internet photos. In Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 2041–2050.
24. Loshchilov, I.; Hutter, F. Decoupled weight decay regularization. In Proceedings of the International Conference on Learning Representations (ICLR 2019), New Orleans, LA, USA, 6–9 May 2019; pp. 1–19.
25. Jannati, H.; Zoej, H.J.V.; Ghaderpour, E.; Mazzanti, P. Dense Matching with Low Computational Complexity for Disparity Estimation in the Radargrammetric Approach of SAR Intensity Images. Remote Sens. 2025, 17, 2693.
26. Dong, Y.; Li, Y.; Zhao, J.; Sun, Y.; Liao, M. Deep Learning for Radargrammetric DSM Generation: A StereoSAR Dataset and Multiscale Fusion Network. IEEE Trans. Geosci. Remote Sens. 2026, 64, 5205115.










| Method | Patch Size | Training Patch Pairs | Validation Patch Pairs | Test Patch Pairs | Total Patch Pairs |
|---|---|---|---|---|---|
| DKM [20] | | 3001 | 1019 | 372 | 4392 |
| RoMa [12] | | 1656 | 552 | 207 | 2415 |

| Matching Method | 1 pixel | 3 pixels | 5 pixels | 10 pixels |
|---|---|---|---|---|
| POC [9] | 0.05 | 4.71 | 14.54 | 38.97 |
| DKM (M) [20] | 0.30 | 2.46 | 9.50 | 47.12 |
| DKM (S) | 16.73 | 46.06 | 62.10 | 79.54 |
| RoMa (M) [12] | 0.08 | 1.73 | 11.73 | 60.30 |
| Proposed (RoMa (S)) | 15.93 | 48.13 | 65.08 | 82.86 |

| Test Area | POC [9] | DKM (M) [20] | DKM (S) | RoMa (M) [12] | Proposed (RoMa (S)) |
|---|---|---|---|---|---|
| Area 1 | 0.90 ± 21.2 | −5.33 ± 40.0 | −9.83 ± 8.4 | 14.40 ± 73.3 | −1.24 ± 10.0 |
| | (21.1) | (16.2) | (12.5) | (16.2) | (42.2) |
| Area 2 | −0.56 ± 9.9 | −5.59 ± 19.9 | −5.96 ± 7.0 | 2.15 ± 20.8 | −1.62 ± 6.5 |
| | (47.9) | (36.1) | (19.6) | (56.5) | (60.5) |
| Area 3 | −0.14 ± 12.1 | −4.50 ± 32.6 | −5.60 ± 5.6 | 2.37 ± 28.9 | −1.56 ± 4.3 |
| | (47.1) | (33.6) | (25.0) | (66.5) | (74.1) |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2026 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license.
Ito, K.; Sasayama, T.; Ito, S.; Iwasa, H.; Aoki, T.; Uemoto, J. Stereo Radargrammetry Using Deep Learning-Based Image Matching with Fine-Tuned Model on Synthetic Aperture Radar Images. Remote Sens. 2026, 18, 1662. https://doi.org/10.3390/rs18101662

