Open Access Article
Domain-Adapted MLLMs for Interpretable Road Traffic Accident Analysis Using Remote Sensing Imagery
by Bing He 1, Wei He 1,*, Qing Chang 2, Wen Luo 1 and Lingli Xiao 1
1 School of Computer Science, Chengdu University of Information Technology, Chengdu 610225, China
2 Sain Associates, Inc., 5021 Technology Drive Northwest, Suite B2, Huntsville, AL 35805, USA
* Author to whom correspondence should be addressed.
ISPRS Int. J. Geo-Inf. 2026, 15(1), 8; https://doi.org/10.3390/ijgi15010008 (registering DOI)
Submission received: 1 November 2025 / Revised: 7 December 2025 / Accepted: 15 December 2025 / Published: 21 December 2025
Abstract
Traditional road traffic accident analysis has long relied on structured data, which makes it difficult to integrate high-dimensional heterogeneous information such as remote sensing imagery and leaves the understanding of accident scene environments incomplete. This study proposes a road traffic accident analysis framework based on Multimodal Large Language Models (MLLMs). The approach integrates high-resolution remote sensing imagery with structured accident data through a three-stage progressive training pipeline. Specifically, we fine-tune three open-source vision–language models using Low-Rank Adaptation (LoRA) to sequentially optimize each model’s capabilities in visual environmental description, multi-task accident classification, and Chain-of-Thought (CoT)-driven causal reasoning. We constructed a multimodal dataset containing remote sensing image descriptions, accident classification labels, and interpretable reasoning chains. Experimental results show that the fine-tuned models improved the CIDEr score on the image description task. In the joint classification task of accident severity and duration, the model achieved an accuracy of 71.61% and an F1-score of 0.8473. In the CoT reasoning task, both METEOR and CIDEr scores improved significantly. These results validate the effectiveness of structured reasoning mechanisms in multimodal fusion for transportation applications and provide a feasible path toward interpretable, intelligent analysis for real-world traffic management.
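For readers unfamiliar with LoRA, the following is a minimal sketch of the general low-rank update it relies on, not the authors' training code. A frozen weight matrix W is adapted as W + (alpha / r) · B · A, where only the small matrices A and B are trained. The matrix sizes, the rank `r`, the scaling factor `alpha`, and the helper names `matmul`, `lora_effective_weight`, and `trainable_parameters` are illustrative assumptions; the abstract does not state the configuration used.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_effective_weight(w, lora_a, lora_b, alpha):
    """Return W + (alpha / r) * B @ A, the weight a LoRA-adapted layer applies.

    w       -- frozen pretrained weight, shape d_out x d_in
    lora_a  -- trainable down-projection A, shape r x d_in
    lora_b  -- trainable up-projection B, shape d_out x r
    alpha   -- scaling hyperparameter (assumed value, not from the paper)
    """
    r = len(lora_a)              # rank of the update
    scale = alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
        for w_row, d_row in zip(w, delta)
    ]


def trainable_parameters(d_out, d_in, r):
    """Count the parameters in A and B; the frozen W contributes none."""
    return r * (d_in + d_out)


if __name__ == "__main__":
    w = [[1.0, 0.0], [0.0, 1.0]]   # toy 2 x 2 frozen weight
    lora_a = [[3.0, 4.0]]          # rank r = 1
    lora_b = [[1.0], [2.0]]
    print(lora_effective_weight(w, lora_a, lora_b, alpha=2.0))
    # Hypothetical 4096 x 4096 projection with rank 8
    print(trainable_parameters(4096, 4096, 8))
```

In standard LoRA, B is initialized to zero, so the adapted layer starts out identical to the pretrained one and fine-tuning moves it away only as far as the low-rank update allows. This is what makes it practical to adapt the same base model in several successive stages.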
Share and Cite
MDPI and ACS Style
He, B.; He, W.; Chang, Q.; Luo, W.; Xiao, L. Domain-Adapted MLLMs for Interpretable Road Traffic Accident Analysis Using Remote Sensing Imagery. ISPRS Int. J. Geo-Inf. 2026, 15, 8. https://doi.org/10.3390/ijgi15010008