Sparse Self-Prompt-Guided Stereo Matching for Real-World Generalization

Li, Hangbiao; Mo, Haojun; Li, Xing; Fang, Tao; Liu, Sikun; Yu, Shuzhen; Rao, Zhibo

doi:10.3390/s26103173

This is an early access version; the complete PDF, HTML, and XML versions will be available soon.

Open Access Article

by Hangbiao Li ¹,², Haojun Mo ², Xing Li ¹, Tao Fang ², Sikun Liu ³, Shuzhen Yu ¹,⁴ and Zhibo Rao ¹,*

¹ School of Information and Engineering, Nanchang Hangkong University, Nanchang 330063, China

² North Lian Chuang Communication Co., Ltd., Nanchang 330096, China

³ School of Computer Science and Technology, University of Science and Technology of China, Hefei 230026, China

⁴ School of Instrumentation and Optoelectronic Engineering, Beihang University, Beijing 100191, China

* Author to whom correspondence should be addressed.

Sensors 2026, 26(10), 3173; https://doi.org/10.3390/s26103173

Submission received: 4 April 2026 / Revised: 8 May 2026 / Accepted: 14 May 2026 / Published: 17 May 2026

(This article belongs to the Section Sensing and Imaging)

Abstract

Stereo matching has witnessed rapid advances on curated benchmarks, yet deploying models in unconstrained real-world environments remains a fundamental challenge. This paper presents a sparse self-prompt-guided network (SSPGNet) for stereo matching with strong generalization across diverse environments. Our core innovation lies in a sparse self-prompt guidance mechanism: (1) a sparse disparity map, used as a prompt, is self-estimated from visual foundation model features via cost aggregation; (2) the sparse disparity is progressively refined into dense disparity maps through cross-attention-based stereo feature interaction, enabling sparse-to-dense disparity prediction. Additionally, we collected a diverse set of indoor and outdoor stereo pairs by using a ZED 2 camera to assess the real-world performance of our model. Extensive experiments demonstrate that the proposed sparse-to-dense prompt mechanism not only preserves the semantic awareness of visual foundation models but also enhances stereo correspondence reasoning, achieving strong performance on public benchmarks and our in-the-wild dataset. Specifically, under the cross-domain (zero-shot) protocol, the proposed SSPGNet achieves bad-pixel error rates of 3.6% on KITTI 2012 (>3 px), 4.4% on KITTI 2015 (>3 px), 7.6% on Middlebury (>2 px), and 2.1% on ETH3D (>1 px), ranking first on three of the four public benchmarks. These results highlight the potential of SSPGNet for direct deployment in real-world stereo perception systems. The code is publicly available at GitHub.
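For readers unfamiliar with the bad-pixel error rates quoted above, the following is a minimal sketch of how such a metric is commonly computed: the percentage of valid pixels whose absolute disparity error exceeds a threshold in pixels. It is not taken from the SSPGNet code. The function name `bad_pixel_rate`, the `invalid_value` convention for missing ground truth, and the toy disparity values are all assumptions made for illustration. Individual benchmarks may apply further rules (for example, occlusion masks or an additional relative-error criterion) that this sketch omits.

```python
from typing import Optional, Sequence


def bad_pixel_rate(
    pred: Sequence[Sequence[float]],
    gt: Sequence[Sequence[float]],
    threshold_px: float,
    invalid_value: float = 0.0,
) -> Optional[float]:
    """Percentage of valid pixels with |pred - gt| > threshold_px.

    Pixels whose ground-truth disparity equals invalid_value are skipped
    (an assumed convention for sparse ground truth, not the paper's).
    Returns None when no pixel is valid.
    """
    bad = 0
    valid = 0
    for pred_row, gt_row in zip(pred, gt):
        for d_pred, d_gt in zip(pred_row, gt_row):
            if d_gt == invalid_value:
                continue  # no ground truth at this pixel
            valid += 1
            if abs(d_pred - d_gt) > threshold_px:
                bad += 1
    if valid == 0:
        return None
    return 100.0 * bad / valid


# Toy 2x3 disparity maps (made-up values, not results from the paper).
gt = [[10.0, 20.0, 0.0], [30.0, 40.0, 50.0]]
pred = [[10.5, 24.0, 7.0], [30.0, 36.5, 51.5]]

rate_3px = bad_pixel_rate(pred, gt, threshold_px=3.0)  # KITTI-style threshold
rate_1px = bad_pixel_rate(pred, gt, threshold_px=1.0)  # ETH3D-style threshold
print(rate_3px, rate_1px)
```

A tighter threshold can only keep or raise the rate for the same prediction, so the figures reported for different benchmarks (>3 px, >2 px, >1 px) are not directly comparable with one another.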

Keywords: stereo matching; domain generalization; vision foundation models; sparse prompt; real-world perception; disparity estimation
