Open Access Article
LMFusion: Breaking the Computational Barrier for Multimodal Classification in Remote Sensing
by Shenbo Zhou, Sibo He, Daixun Li, Weiying Xie and Yunsong Li *
State Key Laboratory of Integrated Services Networks, Xidian University, Xi’an 710071, China
* Author to whom correspondence should be addressed.
Remote Sens. 2026, 18(12), 1972; https://doi.org/10.3390/rs18121972 (registering DOI)
Submission received: 30 April 2026 / Revised: 6 June 2026 / Accepted: 9 June 2026 / Published: 13 June 2026
Abstract
Multimodal land cover classification plays an important role in remote sensing applications such as urban monitoring and environmental analysis. By integrating complementary information from hyperspectral imagery (HSI) and LiDAR data, multimodal learning can significantly improve classification performance. However, existing Transformer-based fusion methods often suffer from high computational complexity and inefficient cross-modal interaction modeling, which limits their applicability in resource-constrained scenarios. To address these challenges, we propose LMFusion, an efficient framework for multimodal feature learning. Specifically, LMFusion performs bidirectional feature interaction through a linear-complexity cross-attention mechanism and strengthens long-range spatial-spectral representation learning with Mamba-based state space modeling, so that multimodal dependencies are modeled at linear computational cost. In addition, a selective quantization-aware optimization strategy supports multiple bit-width settings (down to 1-bit), yielding a more compact and efficient model while improving representation robustness under low-bit constraints. Extensive experiments on the Houston2013, MUUFL, and Augsburg datasets demonstrate the effectiveness of LMFusion: it achieves overall accuracies of 95.84%, 94.95%, and 99.05%, respectively, consistently outperforming representative multimodal classification methods and showing strong potential for accurate and efficient multimodal remote sensing classification.
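To illustrate how cross-attention between two modalities can scale linearly with the number of tokens, the following is a minimal NumPy sketch of generic kernelized (linear) attention. It is not the LMFusion implementation: the abstract does not specify the mechanism, so the `elu(x) + 1` feature map, the absence of learned projections, and all function names here are assumptions made for illustration only. The key idea is to summarize the key/value tokens into a small `d × d` matrix first, so the `n × m` attention matrix is never formed.

```python
import numpy as np

def feature_map(x):
    # Positive feature map elu(x) + 1 (an assumed choice, common in
    # generic linear attention; not taken from the paper).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_cross_attention(q_tokens, kv_tokens, eps=1e-6):
    """Queries from one modality (n, d) attend to another modality (m, d).

    Cost is O((n + m) * d^2) rather than O(n * m * d).
    """
    q = feature_map(q_tokens)      # (n, d)
    k = feature_map(kv_tokens)     # (m, d)
    kv = k.T @ kv_tokens           # (d, d) summary of keys and values
    z = k.sum(axis=0)              # (d,) normalizer summary
    return (q @ kv) / (q @ z + eps)[:, None]

def quadratic_reference(q_tokens, kv_tokens, eps=1e-6):
    # Same computation with the explicit (n, m) attention matrix,
    # included only to check the linear form against.
    w = feature_map(q_tokens) @ feature_map(kv_tokens).T
    w = w / (w.sum(axis=1, keepdims=True) + eps)
    return w @ kv_tokens

# Hypothetical token sets standing in for HSI and LiDAR features.
rng = np.random.default_rng(0)
hsi = rng.standard_normal((7, 4))
lidar = rng.standard_normal((5, 4))

# Bidirectional interaction: each modality queries the other.
hsi_from_lidar = linear_cross_attention(hsi, lidar)
lidar_from_hsi = linear_cross_attention(lidar, hsi)
```

Because matrix multiplication is associative, the linear form gives the same result as the quadratic reference up to floating-point error. Only the order of operations, and therefore the cost, changes.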