Article

Improvement of SAM2 Algorithm Based on Kalman Filtering for Long-Term Video Object Segmentation

1 School of Computer Science and Technology, Zhejiang University, Hangzhou 310058, China
2 Zhejiang Dahua Technology Co., Ltd., Hangzhou 310053, China
* Author to whom correspondence should be addressed.
Sensors 2025, 25(13), 4199; https://doi.org/10.3390/s25134199 (registering DOI)
Submission received: 24 May 2025 / Revised: 26 June 2025 / Accepted: 3 July 2025 / Published: 5 July 2025

Abstract

The Segment Anything Model 2 (SAM 2) has achieved state-of-the-art performance in pixel-level object segmentation for both static and dynamic visual content. Its streaming memory architecture maintains spatial context across video sequences, yet it struggles with long-term tracking because its inference framework is static. SAM 2’s fixed temporal window retains historical frames indiscriminately, without accounting for frame quality or dynamic motion patterns. This leads to error propagation and tracking instability in challenging scenarios involving fast-moving objects, partial occlusions, or crowded environments. To overcome these limitations, this paper proposes SAM2Plus, a zero-shot enhancement framework that integrates Kalman filter prediction, dynamic quality thresholds, and adaptive memory management. The Kalman filter models object motion under physical constraints to predict trajectories and dynamically refine segmentation states, mitigating positional drift during occlusions or velocity changes. Dynamic thresholds, combined with multi-criteria evaluation metrics (e.g., motion coherence, appearance consistency), prioritize high-quality frames while adaptively balancing confidence scores and temporal smoothness, which reduces ambiguity among similar objects in complex scenes. SAM2Plus further employs an optimized memory system that prunes outdated or low-confidence entries and retains temporally coherent context, so that computational cost stays constant even for arbitrarily long videos. Extensive experiments on two video object segmentation (VOS) benchmarks demonstrate SAM2Plus’s superiority over SAM 2. It achieves an average improvement of 1.0 point in J&F across all 24 direct comparisons, with gains exceeding 2.3 points on the SA-V and LVOS datasets for long-term tracking. The method delivers real-time performance and strong generalization without fine-tuning or additional parameters, effectively addressing occlusion recovery and viewpoint changes. By unifying motion-aware, physics-based prediction with spatial segmentation, SAM2Plus bridges the gap between static and dynamic reasoning, offering a scalable solution for real-world applications such as autonomous driving and surveillance systems.
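To make the two mechanisms named in the abstract more concrete, the following is a minimal illustrative sketch, not the SAM2Plus implementation. It assumes a constant-velocity motion model on a single object coordinate and a memory bank that filters entries by score and age. All class names, parameters, and default values (`ConstantVelocityKalman1D`, `MemoryBank`, `q`, `r`, `min_score`, `max_age`, `capacity`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ConstantVelocityKalman1D:
    """Kalman filter with state [position, velocity] and a position-only measurement.

    Hypothetical sketch: the noise levels q and r are illustrative assumptions.
    """
    pos: float
    vel: float = 0.0
    dt: float = 1.0   # time step between frames
    q: float = 1e-2   # process noise added to each diagonal term of P
    r: float = 1.0    # measurement noise variance
    P: list = field(default_factory=lambda: [[1.0, 0.0], [0.0, 1.0]])

    def predict(self) -> float:
        """Propagate the state one frame: x = F x, P = F P F^T + Q."""
        dt = self.dt
        (p00, p01), (p10, p11) = self.P
        self.pos += dt * self.vel
        self.P = [
            [p00 + dt * (p01 + p10) + dt * dt * p11 + self.q, p01 + dt * p11],
            [p10 + dt * p11, p11 + self.q],
        ]
        return self.pos

    def update(self, z: float) -> float:
        """Correct the state with an observed position z (H = [1, 0])."""
        (p00, p01), (p10, p11) = self.P
        s = p00 + self.r                 # innovation variance
        k0, k1 = p00 / s, p10 / s        # Kalman gain
        y = z - self.pos                 # innovation
        self.pos += k0 * y
        self.vel += k1 * y
        self.P = [
            [(1 - k0) * p00, (1 - k0) * p01],
            [p10 - k1 * p00, p11 - k1 * p01],
        ]
        return self.pos


def track(measurements):
    """Run the filter over a sequence; None marks an occluded frame (predict only)."""
    kf = ConstantVelocityKalman1D(pos=measurements[0])
    estimates = []
    for z in measurements[1:]:
        kf.predict()
        if z is not None:
            kf.update(z)
        estimates.append(kf.pos)
    return estimates


@dataclass
class MemoryBank:
    """Bounded memory of (frame_idx, score) pairs, pruned by score and age."""
    capacity: int = 7
    min_score: float = 0.5
    max_age: int = 30
    entries: list = field(default_factory=list)

    def add(self, frame_idx: int, score: float) -> None:
        if score < self.min_score:       # reject low-confidence frames outright
            return
        self.entries.append((frame_idx, score))
        # Drop entries that are too old relative to the newest frame.
        self.entries = [e for e in self.entries if frame_idx - e[0] <= self.max_age]
        # Enforce the size bound by evicting the lowest-scoring entry.
        if len(self.entries) > self.capacity:
            self.entries.remove(min(self.entries, key=lambda e: e[1]))
```

In this sketch an occluded frame is passed as `None`, so the filter only extrapolates from its velocity estimate; for example, `track([0, 2, 4, 6, None, None])` keeps advancing the position through the last two frames. Because `MemoryBank` never holds more than `capacity` entries, its cost per frame does not grow with video length, which is the property the abstract attributes to the memory system.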
Keywords: SAM 2; long-term video; SA-V; LVOS; Kalman filter

Share and Cite

MDPI and ACS Style

Yin, J.; Wu, F.; Su, H.; Huang, P.; Qixuan, Y. Improvement of SAM2 Algorithm Based on Kalman Filtering for Long-Term Video Object Segmentation. Sensors 2025, 25, 4199. https://doi.org/10.3390/s25134199

AMA Style

Yin J, Wu F, Su H, Huang P, Qixuan Y. Improvement of SAM2 Algorithm Based on Kalman Filtering for Long-Term Video Object Segmentation. Sensors. 2025; 25(13):4199. https://doi.org/10.3390/s25134199

Chicago/Turabian Style

Yin, Jun, Fei Wu, Hao Su, Peng Huang, and Yuetong Qixuan. 2025. "Improvement of SAM2 Algorithm Based on Kalman Filtering for Long-Term Video Object Segmentation" Sensors 25, no. 13: 4199. https://doi.org/10.3390/s25134199

APA Style

Yin, J., Wu, F., Su, H., Huang, P., & Qixuan, Y. (2025). Improvement of SAM2 Algorithm Based on Kalman Filtering for Long-Term Video Object Segmentation. Sensors, 25(13), 4199. https://doi.org/10.3390/s25134199
