# Multi-Class Double-Transformation Network for SAR Image Registration


## Abstract


## 1. Introduction

- We treat each key point directly as a class to build a multi-class model of SAR image registration, which avoids the difficulty of constructing positive instances (matched-point pairs) required by the traditional two-class registration model.
- We design a double-transformation network with a coarse-to-precise structure, in which the key points of the two images are used, respectively, to train two sub-networks that alternately predict the matched key points in the other image. This addresses the inconsistency of categories between the training and testing sets.
- We design a precise-matching module that modifies the predictions of the two sub-networks to obtain consistent matched-points, introducing the nearest points of each key point to refine the predicted matched-points.
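As a hypothetical sketch of how "consistent matched-points" between the two sub-networks could be checked, the following cycle-consistency filter keeps a key point only when mapping it through one branch and back through the other returns close to where it started. This is a simplification for illustration; `net_rs`, `net_sr`, and the tolerance are illustrative stand-ins for the trained R-S and S-R branches, not the paper's exact module.

```python
import numpy as np

def cycle_consistent(ref_pts, net_rs, net_sr, tol=2.0):
    """Keep only matched-points that survive a round trip: a reference key
    point mapped to the sensed image (R-S branch) and back (S-R branch)
    must land within `tol` pixels of its starting position."""
    sensed = net_rs(ref_pts)   # R-S branch predictions, shape (N, 2)
    back = net_sr(sensed)      # S-R branch predictions, shape (N, 2)
    keep = np.linalg.norm(back - ref_pts, axis=1) <= tol
    return ref_pts[keep], sensed[keep]
```

Points whose two predictions disagree are dropped, leaving a consistent set of matched pairs for estimating the transformation.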

## 2. Related Works

#### 2.1. The Attention Mechanism

#### 2.2. The Transformer Model

## 3. The Proposed Method

#### 3.1. The Multi-Class Double-Transformation Networks

#### 3.1.1. Constructing Samples-Based Key Points

#### 3.1.2. Multi-Class Double-Transformation Networks

#### 3.2. The Precise-Matching Module

## 4. Experiments and Analyses

1. $RMS_{all}$ is the root mean square error of the registration result; $RMS_{all}\le 1$ means the registration reaches sub-pixel accuracy.
2. $N_{red}$ is the number of matched-point pairs. A higher value may be beneficial for obtaining a transformation matrix with better registration performance.
3. $RMS_{LOO}$ is the root mean square error obtained with the Leave-One-Out strategy: for each of the $N_{red}$ points, the error is computed from the remaining $N_{red}-1$ points, and $RMS_{LOO}$ is the average of all these errors.
4. $P_{quad}$ detects whether the retained feature points are evenly distributed over the quadrants; its value should be less than $95\%$.
5. $BPP(r)$ is the bad-point proportion among the obtained matched-point pairs, where a point whose residual exceeds a threshold $r$ is called a bad point.
6. $S_{kew}$ is the absolute value of the computed correlation coefficient. Note that the Spearman correlation coefficient is used when $N_{red}<20$; otherwise, the Pearson correlation coefficient is applied.
7. $S_{cat}$ is a statistical evaluation of the feature-point distribution over the entire image [43], which should be less than $95\%$.
8. $\varphi$ is a linear combination of the above seven indicators, calculated by
$$\varphi =\frac{1}{12}\left[2\times \left(\frac{1}{N_{red}}+RMS_{LOO}+BPP(1.0)+S_{cat}\right)+RMS_{all}+1.5\times \left(P_{quad}+S_{kew}\right)\right].$$
When $N_{red}<20$, $P_{quad}$ is not used, and the formula simplifies to
$$\varphi =\frac{1}{10.5}\left[2\times \left(\frac{1}{N_{red}}+RMS_{LOO}+BPP(1.0)+S_{cat}\right)+RMS_{all}+1.5\times S_{kew}\right].$$
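The combined indicator $\varphi$ can be computed directly from the seven metrics above; the sketch below encodes both variants of the formula, with the $N_{red}$ threshold deciding whether $P_{quad}$ enters.

```python
def combined_indicator(n_red, rms_all, rms_loo, bpp, s_cat, s_kew, p_quad=None):
    """Combined registration-quality indicator (lower is better)."""
    # Weighted sum shared by both variants of the formula.
    base = 2.0 * (1.0 / n_red + rms_loo + bpp + s_cat) + rms_all
    if n_red >= 20 and p_quad is not None:
        # Full formula: the quadrant-distribution term P_quad is available.
        return (base + 1.5 * (p_quad + s_kew)) / 12.0
    # Simplified formula used when N_red < 20 (P_quad not computed).
    return (base + 1.5 * s_kew) / 10.5
```

Plugging in, e.g., the STDT-Net row ($N_{red}=78$) or the DNN + RANSAC row ($N_{red}=8$) of the first comparison table reproduces the reported $\varphi$ values of 0.4122 and 0.4484.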

#### 4.1. Comparison and Analysis of the Experimental Results

**SIFT** mainly matches features by using the Euclidean distance ratio between the nearest and second-nearest neighbors of the corresponding features.
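This nearest/second-nearest criterion is Lowe's ratio test, which can be sketched as follows; the descriptor arrays and the 0.8 threshold are illustrative, not SIFT's full pipeline.

```python
import numpy as np

def ratio_test_match(desc_a, desc_b, ratio=0.8):
    """Lowe-style ratio test: accept (i, j) only when descriptor i's nearest
    neighbour in desc_b is markedly closer than its second-nearest."""
    matches = []
    for i, d in enumerate(desc_a):
        dists = np.linalg.norm(desc_b - d, axis=1)  # Euclidean distances
        order = np.argsort(dists)
        nearest, second = order[0], order[1]
        if dists[nearest] < ratio * dists[second]:
            matches.append((i, int(nearest)))
    return matches
```

Ambiguous features, whose two closest candidates are nearly equidistant, are rejected rather than matched.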

**SAR-SIFT** is an improvement of the SIFT method that is more consistent with the characteristics of SAR images.

**VGG16-LS, ResNet50-LS and ViT-LS** are deep-learning-based classification methods.

**DNN + RANSAC** [12] constructs the training sample set by self-learning, and then uses a DNN to obtain matched image pairs.

**MSDF-Net** [16] uses a deep forest to construct multiple matching models based on multi-scale fusion to obtain the matched-point pairs, and then uses RANSAC to calculate the transformation matrix.
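Estimating a transformation matrix from matched-point pairs with RANSAC can be sketched in a few lines of NumPy; this is a generic illustration, not the implementation used in [16], and `n_iter`/`thresh` are illustrative choices.

```python
import numpy as np

def fit_affine(src, dst):
    """Least-squares affine transform mapping src -> dst (both (N, 2)).
    Returns a (3, 2) matrix M so that [x, y, 1] @ M = [x', y']."""
    A = np.hstack([src, np.ones((len(src), 1))])
    M, *_ = np.linalg.lstsq(A, dst, rcond=None)
    return M

def ransac_affine(src, dst, n_iter=200, thresh=1.0, seed=0):
    """RANSAC: repeatedly fit an affine model to a minimal 3-point sample,
    keep the model with the most inliers, then refit on all inliers."""
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    A = np.hstack([src, np.ones((len(src), 1))])
    for _ in range(n_iter):
        idx = rng.choice(len(src), 3, replace=False)
        M = fit_affine(src[idx], dst[idx])
        inliers = np.linalg.norm(A @ M - dst, axis=1) < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers
```

Outlier pairs (bad matches) are excluded from the final least-squares fit, which is what makes the estimated transformation robust.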

**AdaSSIR** [25] is an adaptive self-supervised SAR image registration method that treats registration as a self-supervised learning problem and regards each key point as a category-independent instance, constructing a contrastive model to search for accurate matched points.

#### 4.2. The Visual Results of SAR Image Registration

#### 4.3. Analyses on the Precise-Matching Module

#### 4.4. Analyses on the Double-Transformation Network

## 5. Discussion

## 6. Conclusions

## Author Contributions

## Funding

## Data Availability Statement

## Conflicts of Interest

## References

1. Fitch, J.P. Synthetic Aperture Radar; Springer Science & Business Media: New York, NY, USA, 2012.
2. Joyce, K.E.; Samsonov, S.V.; Levick, S.R.; Engelbrecht, J.; Belliss, S. Mapping and monitoring geological hazards using optical, LiDAR, and synthetic aperture RADAR image data. Nat. Hazards 2014, 73, 137–163.
3. Quartulli, M.; Olaizola, I.G. A review of EO image information mining. ISPRS J. Photogramm. Remote Sens. 2013, 75, 11–28.
4. Wang, Y.; Du, L.; Dai, H. Unsupervised SAR image change detection based on SIFT keypoints and region information. IEEE Geosci. Remote Sens. Lett. 2016, 13, 931–935.
5. Poulain, V.; Inglada, J.; Spigai, M. High-resolution optical and SAR image fusion for building database updating. IEEE Trans. Geosci. Remote Sens. 2011, 49, 2900–2910.
6. Byun, Y.; Choi, J.; Han, Y. An area-based image fusion scheme for the integration of SAR and optical satellite imagery. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2013, 6, 2212–2220.
7. Moser, G.; Serpico, S.B. Unsupervised change detection from multichannel SAR data by Markovian data fusion. IEEE Trans. Geosci. Remote Sens. 2009, 47, 2114–2128.
8. Song, T.; Yi, S. Fast and accurate target detection based on multiscale saliency and active contour model for high-resolution SAR images. IEEE Trans. Geosci. Remote Sens. 2016, 54, 5729–5744.
9. Giusti, E.; Ghio, S.; Oveis, A.H.; Martorella, M. Proportional similarity-based Openmax classifier for open set recognition in SAR images. Remote Sens. 2022, 14, 4665.
10. Schwind, P.; Suri, S.; Reinartz, P.; Siebert, A. Applicability of the SIFT operator to geometric SAR image registration. Int. J. Remote Sens. 2010, 31, 1959–1980.
11. Wang, S.H.; You, H.J.; Fu, K. BFSIFT: A novel method to find feature matches for SAR image registration. IEEE Geosci. Remote Sens. Lett. 2012, 9, 649–653.
12. Wang, S.; Quan, D.; Liang, X.; Ning, M.; Guo, Y.; Jiao, L. A deep learning framework for remote sensing image registration. ISPRS J. Photogramm. Remote Sens. 2018, 145, 148–164.
13. Dellinger, F.; Delon, J.; Gousseau, Y.; Michel, J.; Tupin, F. SAR-SIFT: A SIFT-like algorithm for SAR images. IEEE Trans. Geosci. Remote Sens. 2015, 53, 453–466.
14. Wu, Y.; Miao, Q.; Ma, W.; Gong, M.; Wang, S. PSOSAC: Particle swarm optimization sample consensus algorithm for remote sensing image registration. IEEE Geosci. Remote Sens. Lett. 2018, 15, 242–246.
15. Liu, Y.; Zhou, Y.; Zhou, Y.; Ma, L.; Wang, B.; Zhang, F. Accelerating SAR image registration using swarm-intelligent GPU parallelization. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2020, 13, 5694–5703.
16. Mao, S.; Yang, J.; Gou, S.; Jiao, L.; Xiong, T.; Xiong, L. Multi-scale fused SAR image registration based on deep forest. Remote Sens. 2021, 13, 2227.
17. Zhang, S.; Sui, L.; Zhou, R.; Xun, Z.; Du, C.; Guo, X. Mountainous SAR image registration using image simulation and an L2E robust estimator. Sustainability 2022, 14, 9315.
18. Gong, M.; Zhao, S.; Jiao, L.; Tian, D.; Wang, S. A novel coarse-to-fine scheme for automatic image registration based on SIFT and mutual information. IEEE Trans. Geosci. Remote Sens. 2013, 52, 4328–4338.
19. Sollers, J.J.; Buchanan, T.W.; Mowrer, S.M.; Hill, L.K.; Thayer, J.F. Comparison of the ratio of the standard deviation of the RR interval and the root mean squared successive differences (SD/rMSSD) to the low frequency-to-high frequency (LF/HF) ratio in a patient population and normal healthy controls. Biomed. Sci. Instrum. 2007, 43, 158–163.
20. Ma, W.; Zhang, J.; Wu, Y.; Jiao, L.; Zhu, H.; Zhao, W. A novel two-step registration method for remote sensing images based on deep and local features. IEEE Trans. Geosci. Remote Sens. 2019, 57, 4834–4843.
21. Quan, D.; Wang, S.; Ning, M.; Xiong, T.; Jiao, L. Using deep neural networks for synthetic aperture radar image registration. In Proceedings of the 2016 IEEE International Geoscience and Remote Sensing Symposium (IGARSS), Beijing, China, 10–15 July 2016; pp. 2799–2802.
22. Ye, F.; Su, Y.; Xiao, H.; Zhao, X.; Min, W. Remote sensing image registration using convolutional neural network features. IEEE Geosci. Remote Sens. Lett. 2018, 15, 232–236.
23. Mu, J.; Gou, S.; Mao, S.; Zheng, S. A stepwise matching method for multi-modal image based on cascaded network. In Proceedings of the 29th ACM International Conference on Multimedia, Nice, France, 21–25 October 2021; pp. 1284–1292.
24. Zou, B.; Li, H.; Zhang, L. Self-supervised SAR image registration with SAR-SuperPoint and transformation aggregation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5201115.
25. Mao, S.; Yang, J.; Gou, S.; Lu, K.; Jiao, L.; Xiong, T.; Xiong, L. Adaptive self-supervised SAR image registration with modifications of alignment transformation. IEEE Trans. Geosci. Remote Sens. 2023, 61, 5203715.
26. Niu, Z.; Zhong, G.; Yu, H. A review on the attention mechanism of deep learning. Neurocomputing 2021, 452, 48–62.
27. Chaudhari, S.; Mithal, V.; Polatkan, G.; Ramanath, R. An attentive survey of attention models. ACM Trans. Intell. Syst. Technol. 2021, 12, 53.
28. Kim, Y.; Lee, J.; Lee, E.B.; Lee, J.H. Application of natural language processing (NLP) and text-mining of big-data to engineering-procurement-construction (EPC) bid and contract documents. In Proceedings of the 6th Conference on Data Science and Machine Learning Applications (CDMA), Riyadh, Saudi Arabia, 4–5 March 2020; pp. 123–128.
29. Xu, K.; Ba, J.; Kiros, R.; Cho, K.; Courville, A.; Salakhudinov, R.; Zemel, R.; Bengio, Y. Show, attend and tell: Neural image caption generation with visual attention. In Proceedings of the International Conference on Machine Learning, Lille, France, 6–11 July 2015; pp. 2048–2057.
30. Guo, H.; Zheng, K.; Fan, X.; Yu, H.; Wang, S. Visual attention consistency under image transforms for multi-label image classification. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Long Beach, CA, USA, 15–20 June 2019; pp. 729–739.
31. Haut, J.M.; Paoletti, M.E.; Plaza, J.; Plaza, A.; Li, J. Visual attention-driven hyperspectral image classification. IEEE Trans. Geosci. Remote Sens. 2019, 57, 8065–8080.
32. Li, W.; Liu, K.; Zhang, L.; Cheng, F. Object detection based on an adaptive attention mechanism. Sci. Rep. 2020, 10, 11307.
33. Zhu, Y.; Zhao, C.; Guo, H.; Wang, J.; Zhao, X.; Lu, H. Attention CoupleNet: Fully convolutional attention coupling network for object detection. IEEE Trans. Image Process. 2018, 28, 113–126.
34. Vaswani, A.; Shazeer, N.; Parmar, N.; Uszkoreit, J.; Jones, L.; Gomez, A.N.; Kaiser, Ł.; Polosukhin, I. Attention is all you need. In Proceedings of the 31st Conference on Neural Information Processing Systems (NIPS 2017), Long Beach, CA, USA, 4–9 December 2017; pp. 5998–6008.
35. Devlin, J.; Chang, M.W.; Lee, K.; Toutanova, K. BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv 2018, arXiv:1810.04805.
36. Han, K.; Wang, Y.; Chen, H.; Chen, X.; Guo, J.; Liu, Z.; Tang, Y.; Xiao, A.; Xu, C.; Xu, Y.; et al. A survey on vision transformer. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 87–110.
37. Dai, Z.; Yang, Z.; Yang, Y.; Carbonell, J.; Le, Q.V.; Salakhutdinov, R. Transformer-XL: Attentive language models beyond a fixed-length context. arXiv 2019, arXiv:1901.02860.
38. Chen, J.; Lu, Y.; Yu, Q.; Luo, X.; Adeli, E.; Wang, Y.; Lu, L.; Yuille, A.L.; Zhou, Y. TransUNet: Transformers make strong encoders for medical image segmentation. arXiv 2021, arXiv:2102.04306.
39. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929.
40. Liu, Z.; Lin, Y.; Cao, Y.; Hu, H.; Wei, Y.; Zhang, Z.; Lin, S.; Guo, B. Swin Transformer: Hierarchical vision transformer using shifted windows. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Montreal, BC, Canada, 11–17 October 2021; pp. 10012–10022.
41. Mansard, E.P.; Funke, E.R. The measurement of incident and reflected spectra using a least squares method. Coast. Eng. Proc. 1980, 17, 8.
42. Touvron, H.; Cord, M.; Douze, M.; Massa, F.; Sablayrolles, A.; Jégou, H. Training data-efficient image transformers & distillation through attention. In Proceedings of the 38th International Conference on Machine Learning, PMLR, Virtual Event, 18–24 July 2021; pp. 10347–10357.
43. Goncalves, H.; Goncalves, J.A.; Corte-Real, L. Measures for an objective evaluation of the geometric correction process quality. IEEE Geosci. Remote Sens. Lett. 2009, 6, 292–296.
44. Lowe, D.G. Object recognition from local scale-invariant features. In Proceedings of the IEEE International Conference on Computer Vision, 1999; Volume 2, pp. 1150–1157.
45. Simonyan, K.; Zisserman, A. Very deep convolutional networks for large-scale image recognition. arXiv 2014, arXiv:1409.1556.
46. He, K.; Zhang, X.; Ren, S.; Sun, J. Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 770–778.

**Figure 2.** A visual example of the eight near-points around a key point from the sensed image at an offset of k pixels, where $k=5$; the predictions are obtained by the R-S branch ($Net_{R}$).
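Assuming the eight near-points form the 8-neighbourhood of the key point on a grid spaced k pixels apart (one reading of the figure; the exact layout is an assumption), they can be enumerated as:

```python
def near_points(x, y, k=5):
    """Hypothetical layout of the eight near-points: the 8-neighbourhood
    of the key point (x, y) at an offset of k pixels."""
    return [(x + dx, y + dy)
            for dx in (-k, 0, k)
            for dy in (-k, 0, k)
            if (dx, dy) != (0, 0)]  # exclude the key point itself
```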

**Figure 3.** Reference and sensed images of Wuhan data. The image size is $400\times 400$ pixels and the resolution is 10 m.

**Figure 4.**Reference and sensed images of Australia-Yama data. The image size is $650\times 350$ pixels.

**Figure 5.**Reference and sensed images of YellowR1 data. The image size is $700\times 700$ pixels and the resolution is 8 m.

**Figure 6.**Reference and sensed images of YellowR2 data. The image size is $1000\times 1000$ pixels and the resolution is 8 m.

**Figure 11.** Comparison of the sub-images corresponding to the matched-points obtained by the proposed method without and with the precise-matching module. For each point, the left sub-image corresponds to that point, the top-right sub-image to its matched-point obtained without the precise-matching module, and the bottom-right sub-image (labeled in the red box) to the matched result with the precise-matching module.

**Figure 12.**The comparison of the proposed double-transformation network to two single branches (the R-S branch and the S-R branch).

| Methods | $N_{red}$ | $RMS_{all}$ | $RMS_{LOO}$ | $P_{quad}$ | $BPP(r)$ | $S_{kew}$ | $S_{cat}$ | $\varphi$ |
|---|---|---|---|---|---|---|---|---|
| SIFT | 17 | 1.2076 | 1.2139 | — | 0.6471 | 0.1367 | 0.9991 | 0.7048 |
| SAR-SIFT | 66 | 1.2455 | 1.2491 | 0.6300 | 0.6212 | 0.1251 | 0.9961 | 0.6784 |
| VGG16-LS | 58 | 0.5611 | 0.5694 | 0.6665 | 0.2556 | 0.0389 | 1.0000 | 0.4420 |
| ResNet50-LS | 68 | 0.4818 | 0.4966 | 0.7162 | 0.2818 | 0.1943 | 0.9766 | 0.4489 |
| ViT-LS | 64 | 0.5218 | 0.5304 | 0.6101 | 0.2330 | 0.1072 | 1.0000 | 0.4296 |
| DNN + RANSAC | 8 | 0.6471 | 0.6766 | — | 0.1818 | 0.0943 | 0.9766 | 0.4484 |
| MSDF-Net | 39 | 0.4345 | 0.4893 | 0.6101 | 0.3124 | 0.1072 | 1.0000 | 0.4304 |
| AdaSSIR | 47 | 0.4217 | 0.4459 | 0.6254 | 0.3377 | 0.1165 | 1.0000 | 0.4287 |
| STDT-Net (Ours) | **78** | 0.4490 | 0.4520 | 0.6254 | 0.2277 | 0.1165 | 1.0000 | **0.4122** |
| Rank/All | 1/10 | 3/10 | 2/10 | 2/7 | 2/10 | 4/10 | 4/4 | 1/10 |

| Methods | $N_{red}$ | $RMS_{all}$ | $RMS_{LOO}$ | $P_{quad}$ | $BPP(r)$ | $S_{kew}$ | $S_{cat}$ | $\varphi$ |
|---|---|---|---|---|---|---|---|---|
| SIFT | 69 | 1.1768 | 1.1806 | 0.9013 | 0.6812 | **0.0975** | 0.9922 | 0.7010 |
| SAR-SIFT | **151** | 1.2487 | 1.2948 | 0.6016 | 0.6755 | 0.1274 | 0.9980 | 0.6910 |
| VGG16-LS | 112 | 0.5604 | 0.5685 | 0.6150 | 0.3621 | 0.1271 | 1.0000 | 0.4626 |
| ResNet50-LS | 120 | 0.4903 | 0.5064 | **0.5873** | 0.2515 | 0.1027 | 1.0000 | 0.4215 |
| ViT-LS | 109 | 0.5276 | 0.5371 | 0.7162 | 0.2529 | 0.1105 | 1.0000 | 0.4472 |
| DNN + RANSAC | 8 | 0.7293 | 0.7582 | — | 0.5000 | 0.1227 | 0.9766 | 0.5365 |
| MSDF-Net | 12 | 0.4645 | 0.4835 | — | 0.4000 | 0.1175 | 0.9999 | 0.4356 |
| AdaSSIR | 71 | 0.4637 | 0.4707 | 0.6013 | 0.4545 | 0.1072 | 1.0000 | 0.4504 |
| STDT-Net (Ours) | 115 | **0.4604** | 0.4732 | 0.6740 | **0.2173** | 0.1175 | 1.0000 | **0.4205** |
| Rank/All | 3/9 | 1/9 | 2/9 | 5/7 | 2/9 | 4/9 | 4/4 | 1/9 |

| Methods | $N_{red}$ | $RMS_{all}$ | $RMS_{LOO}$ | $P_{quad}$ | $BPP(r)$ | $S_{kew}$ | $S_{cat}$ | $\varphi$ |
|---|---|---|---|---|---|---|---|---|
| SIFT | 11 | 0.9105 | 0.9436 | — | 0.5455 | 0.1055 | 0.9873 | 0.5908 |
| SAR-SIFT | **31** | 1.1424 | 1.2948 | 0.5910 | 0.7419 | 0.0962 | 1.0000 | 0.6636 |
| VGG16-LS | 19 | 0.6089 | 0.6114 | — | 0.4211 | 0.1061 | 1.0000 | 0.4703 |
| ResNet50-LS | 25 | 0.5725 | 0.5889 | 0.5814 | 0.6058 | 0.1387 | 1.0000 | 0.5102 |
| ViT-LS | 20 | 0.5986 | 0.5571 | 0.5821 | 0.5875 | 0.1266 | 1.0000 | 0.5118 |
| DNN + RANSAC | 10 | 0.8024 | 0.8518 | — | 0.6000 | 0.1381 | 0.9996 | 0.5821 |
| MSDF-Net | 11 | 0.5923 | 0.6114 | — | 0.4351 | 0.0834 | 0.9990 | 0.4753 |
| AdaSSIR | 20 | 0.5534 | 0.5720 | 0.5395 | 0.4444 | 0.1086 | 1.0000 | 0.4715 |
| STDT-Net (Ours) | 24 | **0.5487** | **0.5531** | 0.5486 | 0.4038 | 0.1088 | 1.0000 | **0.4610** |
| Rank/All | 3/9 | 1/9 | 1/9 | 2/7 | 1/9 | 6/9 | 4/4 | 1/9 |

| Methods | $N_{red}$ | $RMS_{all}$ | $RMS_{LOO}$ | $P_{quad}$ | $BPP(r)$ | $S_{kew}$ | $S_{cat}$ | $\varphi$ |
|---|---|---|---|---|---|---|---|---|
| SIFT | 88 | 1.1696 | 1.1711 | 0.6399 | 0.7841 | 0.1138 | **0.9375** | 0.6757 |
| SAR-SIFT | **301** | 1.1903 | 1.1973 | 0.8961 | 0.8671 | 0.1318 | 1.0000 | 0.7390 |
| VGG16-LS | 54 | 0.5406 | 0.5504 | 0.6804 | 0.3187 | 0.1277 | 1.0000 | 0.4607 |
| ResNet50-LS | 70 | 0.5036 | 0.5106 | 0.7162 | 0.2778 | 0.1208 | 0.9999 | 0.4470 |
| ViT-LS | 67 | 0.5015 | 0.5095 | **0.6000** | 0.2925 | 0.1281 | 1.0000 | 0.4356 |
| DNN + RANSAC | 10 | 0.5784 | 0.5906 | — | 0.0000 | 0.1308 | 0.9999 | 0.3946 |
| MSDF-Net | 52 | 0.5051 | 0.5220 | 0.6112 | 0.7692 | 0.1434 | 1.0000 | 0.5215 |
| AdaSSIR | 68 | 0.4858 | 0.4994 | 0.6013 | 0.5714 | 0.1149 | 1.0000 | 0.4776 |
| STDT-Net (Ours) | 79 | 0.4808 | 0.4954 | 0.6740 | 0.2692 | 0.1134 | 1.0000 | 0.4347 |
| Rank/All | 3/9 | 1/9 | 1/9 | 5/7 | 2/9 | 1/9 | 4/4 | 2/9 |

| Datasets | Branch | Without Precise-Matching | With Precise-Matching |
|---|---|---|---|
| Wuhan | R→S | 0.4598 | 0.4579 |
| | S→R | 0.4620 | 0.4590 |
| YellowR1 | R→S | 0.5798 | 0.5525 |
| | S→R | 0.5585 | 0.5535 |
| YAMBA | R→S | 0.4788 | 0.4960 |
| | S→R | 0.4858 | 0.4763 |
| YellowR2 | R→S | 0.5253 | 0.5185 |
| | S→R | 0.5093 | 0.4960 |

| Datasets | Performance | VGG16 | ResNet50 | ViT | Swin-Transformer |
|---|---|---|---|---|---|
| YellowR1 | $Acc$ (%) | 87.13 | 89.32 | 89.59 | 92.74 |
| | $Time$ (m) | 47 | 38 | 42 | 31 |
| Wuhan | $Acc$ (%) | 89.26 | 92.71 | 91.10 | 94.83 |
| | $Time$ (m) | 19 | 13 | 28 | 10 |


© 2023 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Deng, X.; Mao, S.; Yang, J.; Lu, S.; Gou, S.; Zhou, Y.; Jiao, L.
Multi-Class Double-Transformation Network for SAR Image Registration. *Remote Sens.* **2023**, *15*, 2927.
https://doi.org/10.3390/rs15112927
