This is an early access version; the complete PDF, HTML, and XML versions will be available soon.
Article

ChangeVLM: A Language-Guided Semantic Alignment Framework for Binary Remote Sensing Change Detection

1
School of Electronic Information, Xijing University, Chang’an District, Xi’an 710123, China
2
School of Computer Science and Engineering, Xi’an University of Technology, Xi’an 710048, China
3
School of Computer Science, Northwestern Polytechnical University, Chang’an District, Xi’an 710129, China
4
Information and Navigation College, Air Force Engineering University, No. 1 Fenggao Road, Lianhu District, Xi’an 710077, China
*
Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(10), 1671; https://doi.org/10.3390/rs18101671
Submission received: 23 March 2026 / Revised: 18 May 2026 / Accepted: 20 May 2026 / Published: 21 May 2026
(This article belongs to the Special Issue Foundation Model-Based Multi-Modal Data Fusion in Remote Sensing)

Abstract

High-resolution remote sensing imagery contains complex ground features and strong spectral heterogeneity. Traditional change detection methods lack sufficient semantic understanding of such scenes, while existing vision–language change detection models suffer from low efficiency, poor spatial localization, and decoupled detection–description pipelines. To overcome these limitations, this paper proposes ChangeVLM, a language-guided semantic alignment framework for binary remote sensing change detection that enables end-to-end, prompt-free, efficient, and interpretable change detection. Its key advantages are as follows: (1) higher detection accuracy, with F1 scores of 91.52%, 83.56%, and 75.29% on the LEVIR-CD, SYSU-ChangeDet, and HRCUS datasets, respectively, outperforming 18 state-of-the-art methods; (2) stronger edge integrity and small-object detection capability; (3) practical deployment efficiency: the end-to-end cost is 560.7 GFLOPs, and under an optimized inference setting with pre-extracted features the effective computation can be reduced to 13.05 GFLOPs; and (4) language-guided semantic regularization that enhances visual discrimination without requiring external text prompts. The Asymmetric Fusion Module (AFM), the lightweight ChangeHead, and the Change-Aware Cross-Modal Fusion Module (CACMF) jointly enhance spatial precision, efficiency, and interpretability. Extensive experiments validate that ChangeVLM achieves a superior accuracy–efficiency trade-off. The method provides an effective, deployable solution for high-resolution binary remote sensing change detection, in which the language branch acts only as a regularization signal.
Keywords: remote sensing change detection; high-resolution remote sensing imagery; vision–language foundation model; deep learning; cross-modal fusion; lightweight model; spatial prior guidance; semantic regularization
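
For readers unfamiliar with the headline metric, the following is a minimal sketch of how an F1 score is commonly computed for binary change masks, together with the ratio between the two FLOPs figures quoted in the abstract. It assumes F1 is taken over the "changed" class (label 1), which is the usual convention in change detection but is not stated in the abstract. The function name `binary_f1` and the toy masks are hypothetical and are not taken from the paper.

```python
from typing import Sequence


def binary_f1(pred: Sequence[Sequence[int]], target: Sequence[Sequence[int]]) -> float:
    """F1 for the 'changed' class (label 1) of two equally sized binary masks.

    Assumption: F1 = 2*TP / (2*TP + FP + FN), the standard definition.
    Returns 0.0 when neither mask contains a changed pixel.
    """
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == 1 and t == 1:
                tp += 1  # changed pixel correctly detected
            elif p == 1 and t == 0:
                fp += 1  # false alarm
            elif p == 0 and t == 1:
                fn += 1  # missed change
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


# Toy 3x3 masks (hypothetical data, not from the paper).
pred = [[1, 1, 0],
        [0, 1, 0],
        [0, 0, 0]]
target = [[1, 0, 0],
          [0, 1, 1],
          [0, 0, 0]]
f1 = binary_f1(pred, target)  # TP=2, FP=1, FN=1

# Plain arithmetic on the two FLOPs figures quoted in the abstract.
flops_ratio = 560.7 / 13.05
print(f"F1 = {f1:.4f}, FLOPs ratio = {flops_ratio:.1f}x")
```

On these toy masks the counts are TP = 2, FP = 1, FN = 1, so F1 = 4/6. The ratio of the two reported costs is roughly 43, which applies only to the setting with pre-extracted features described in the abstract.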

Share and Cite

MDPI and ACS Style

Li, D.; Chu, P.; Yang, C.; Wang, Z.; Dai, C. ChangeVLM: A Language-Guided Semantic Alignment Framework for Binary Remote Sensing Change Detection. Remote Sens. 2026, 18, 1671. https://doi.org/10.3390/rs18101671

AMA Style

Li D, Chu P, Yang C, Wang Z, Dai C. ChangeVLM: A Language-Guided Semantic Alignment Framework for Binary Remote Sensing Change Detection. Remote Sensing. 2026; 18(10):1671. https://doi.org/10.3390/rs18101671

Chicago/Turabian Style

Li, Dongxu, Peng Chu, Chen Yang, Zhen Wang, and Chuanjin Dai. 2026. "ChangeVLM: A Language-Guided Semantic Alignment Framework for Binary Remote Sensing Change Detection" Remote Sensing 18, no. 10: 1671. https://doi.org/10.3390/rs18101671

APA Style

Li, D., Chu, P., Yang, C., Wang, Z., & Dai, C. (2026). ChangeVLM: A Language-Guided Semantic Alignment Framework for Binary Remote Sensing Change Detection. Remote Sensing, 18(10), 1671. https://doi.org/10.3390/rs18101671
