Article

VimGeo: An Efficient Visual Model for Cross-View Geo-Localization

1 School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
2 Yangzhou Petroleum Branch, Yangzhou 225002, China
3 School of Information Technology, Murdoch University, Perth 6150, Australia
* Author to whom correspondence should be addressed.
Electronics 2025, 14(19), 3906; https://doi.org/10.3390/electronics14193906
Submission received: 25 August 2025 / Revised: 24 September 2025 / Accepted: 25 September 2025 / Published: 30 September 2025

Abstract

Cross-view geo-localization is challenging because the appearance of a target scene changes significantly across viewpoints. Most existing methods adopt Transformers or ConvNeXt as backbone models, but they often incur high computational costs and lose accuracy in complex scenarios. This paper therefore proposes a Vision Mamba framework based on the state-space model (SSM) for cross-view geo-localization. Compared with existing methods, Vision Mamba is more efficient in modeling and memory usage, and it achieves more efficient cross-view matching by combining a shared-weight twin (Siamese) architecture with multiple mixed losses. The paper also introduces Dice Loss to handle scale differences and imbalance between cross-view images. Extensive experiments on the public cross-view dataset University-1652 demonstrate that Vision Mamba not only performs excellently in UAV target localization tasks but also attains the highest efficiency with lower memory consumption. This work offers a novel solution for cross-view geo-localization and shows strong potential as a backbone model for the next generation of cross-view geo-localization methods.
Keywords: cross-view; Vision Mamba; Dice Loss; image retrieval
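
As a rough illustration of the Dice Loss named in the abstract, the sketch below computes the generic soft Dice loss between two equal-length feature vectors. The abstract does not say how the loss is applied to cross-view features, so the function name `dice_loss`, the squared-norm denominator, and the smoothing term `eps` are all assumptions made for illustration and are not taken from the paper.

```python
from typing import Sequence


def dice_loss(a: Sequence[float], b: Sequence[float], eps: float = 1e-6) -> float:
    """Generic soft Dice loss: 1 - 2<a, b> / (|a|^2 + |b|^2).

    Hypothetical helper for illustration only; it is not the paper's
    exact formulation. `eps` guards against division by zero.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # Overlap between the two vectors (inner product).
    intersection = sum(x * y for x, y in zip(a, b))
    # Sum of squared magnitudes of both vectors.
    denom = sum(x * x for x in a) + sum(y * y for y in b)
    return 1.0 - (2.0 * intersection + eps) / (denom + eps)


if __name__ == "__main__":
    # Identical vectors overlap fully, so the loss is close to 0.
    print(dice_loss([1.0, 0.0, 1.0], [1.0, 0.0, 1.0]))
    # Vectors with no overlap give a loss close to 1.
    print(dice_loss([1.0, 0.0], [0.0, 1.0]))
```

Because the loss is a ratio of overlap to total magnitude, it depends on how much two responses coincide rather than on how many elements are active, which is why Dice-style losses are commonly used when the quantities being compared are imbalanced.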

Share and Cite

MDPI and ACS Style

Yang, K.; Zhang, Y.; Wang, L.; Muzahid, A.A.M.; Sohel, F.; Wu, F.; Wu, Q. VimGeo: An Efficient Visual Model for Cross-View Geo-Localization. Electronics 2025, 14, 3906. https://doi.org/10.3390/electronics14193906

