OcclusionTrack: Multi-Object Tracking in Dense Scenes
Abstract
1. Introduction
- (1) Vulnerability to Partial Occlusions: Partial occlusions degrade detection quality, but the KF’s update stage typically applies a uniform measurement noise to every detection and therefore fails to account for this variability.
- (2) Sensitivity to Camera Motion: The performance of IoU-based tracking depends heavily on the accuracy of the Kalman filter’s predicted boxes. Camera motion shifts image coordinates between frames, which offsets the predictions and causes subsequent association failures.
- (3) Ineffective Association under Occlusion: Occlusion arises from overlapping targets, so the positions and sizes of their bounding boxes become extremely close. Conventional matching algorithms cannot reliably separate such targets during the association stage.
- (4) Vulnerability to Complete Occlusions: Complete occlusions cause tracks to be lost because the detector can no longer find the target in the image. During this period the KF continues to predict without correction, accumulating significant error that hinders tracking after the target is rematched.
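The error accumulation described in challenge (4) can be illustrated with a minimal sketch. It assumes a 1D constant-velocity state [position, velocity] and a scalar process-noise level `q` added to both diagonal entries; the function names and values are hypothetical and are not taken from the paper.

```python
def predict_covariance(P, q, dt=1.0):
    """One Kalman prediction step, P' = F P F^T + Q, for F = [[1, dt], [0, 1]].

    P is a 2x2 covariance given as nested lists. Q = q * I is an
    illustrative assumption, not the paper's noise model.
    """
    (p00, p01), (p10, p11) = P
    n00 = p00 + dt * (p01 + p10) + dt * dt * p11 + q
    n01 = p01 + dt * p11
    n10 = p10 + dt * p11
    n11 = p11 + q
    return [[n00, n01], [n10, n11]]


def position_variance_while_lost(frames_lost, p0=1.0, q=0.01):
    """Position variance after each prediction-only frame (no update step)."""
    P = [[p0, 0.0], [0.0, p0]]
    history = [P[0][0]]
    for _ in range(frames_lost):
        P = predict_covariance(P, q)
        history.append(P[0][0])
    return history


if __name__ == "__main__":
    for frame, var in enumerate(position_variance_while_lost(5)):
        print(frame, round(var, 3))
```

With no measurement to correct it, the position variance grows every frame, which is why a track that reappears after a long occlusion may be predicted far from its true location.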
- We introduce a confidence-based Kalman filter (CBKF) that dynamically adjusts the measurement noise based on detection confidence, improving robustness to partial occlusions.
- We incorporate camera motion compensation (CMC) [5] to align frames, thereby rectifying the prediction offsets in the KF and boosting the reliability of IoU-based association.
- We integrate depth–cascade-matching (DCM) [7] into the association phase. By leveraging relative depth information to partition targets and perform matching within the same depth level, DCM effectively resolves ambiguities caused by inter-object occlusions.
- For tracks recovering from complete occlusion, we employ the CMC-detection-based Re-activate method to correct accumulated errors in the track state and KF parameters, which ensures that the target can be successfully associated in subsequent frames.
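To make the first contribution concrete, the following sketch shows how scaling the measurement noise by detection confidence changes a scalar Kalman update. The linear mapping `(1 - score)` is a simple stand-in chosen for illustration; the paper’s own mapping is non-linear and is not reproduced here. The names `confidence_scaled_noise`, `base_r`, and `floor` are hypothetical.

```python
def confidence_scaled_noise(base_r, score, floor=0.05):
    """Map detection confidence to a measurement-noise variance.

    Assumption: a linear stand-in, r = base_r * max(1 - score, floor).
    Lower confidence gives a larger r, so the detection is trusted less.
    """
    score = min(max(score, 0.0), 1.0)
    return base_r * max(1.0 - score, floor)


def kalman_update_1d(x, p, z, r):
    """Scalar Kalman update: returns the new state, new variance, and gain."""
    k = p / (p + r)
    x_new = x + k * (z - x)
    p_new = (1.0 - k) * p
    return x_new, p_new, k


if __name__ == "__main__":
    prior_x, prior_p, measurement = 100.0, 1.0, 110.0
    for score in (0.9, 0.3):
        r = confidence_scaled_noise(1.0, score)
        x, p, k = kalman_update_1d(prior_x, prior_p, measurement, r)
        print(f"score={score} gain={k:.3f} state={x:.2f}")
```

A partially occluded, low-confidence detection therefore pulls the state less strongly than a clean, high-confidence one.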
2. Related Works
3. Methods
3.1. Confidence-Based Kalman Filter
3.2. Camera Motion Compensation
3.3. Depth–Cascade-Matching
3.4. CMC-Detection-Based Re-Activate Method
3.5. Tracking Pipeline
Algorithm 1: Proposed OcclusionTrack Pipeline for Frame t.

    Input:  Current frame I_t, Previous tracks T_{t-1}, Previous image I_{t-1}
    Output: Updated tracks T_t

    # Step 1: Object Detection and Classification
    D_all  = detector(I_t)                                      // Get all detections {bbox, score}
    D_high = {d ∈ D_all | d.score ≥ τ_high}                     // High-score detections
    D_low  = {d ∈ D_all | d.score < τ_high ∧ d.score ≥ τ_low}   // Low-score detections

    # Step 2: Kalman Filter Prediction with CMC
    H = estimate(I_{t-1}, I_t)                                  // Camera motion compensation
    for each track τ in T_{t-1} do
        if τ.state ∈ {“active”, “lost”} then
            τ.pred_box = KalmanPredict(τ)                       // Standard prediction
            τ.pred_box = compensate(H, τ.pred_box)              // Apply CMC
        end if
    end for

    # Step 3: First Association (High-score detections)
    T_candidates = {τ ∈ T_{t-1} | τ.state ∈ {“active”, “lost”}} // All tracks
    matches_high, unmatched_tracks_1, unmatched_dets_high = DCM_Association(D_high, T_candidates)

    # Step 4: Second Association (Low-score detections)
    matches_low, unmatched_tracks_2, unmatched_dets_low = DCM_Association(D_low, unmatched_tracks_1)

    # Step 5: Merge Matching Results
    all_matches      = matches_high ∪ matches_low
    unmatched_tracks = unmatched_tracks_2
    unmatched_dets   = unmatched_dets_high ∪ unmatched_dets_low

    # Step 6: Track Update and Management
    for each match (τ, d) in all_matches do
        if τ.state == “lost” then
            τ = Re_activate(τ, d, CMC)                          // Re-activate for lost tracks
        end if

        # CBKF update with non-linear measurement noise
        R = compute_measurement_noise(d.score, R)               // Measurement noise based on confidence
        KalmanUpdate(τ, d.bbox, R)
        τ.state = “active”
        τ.lost_count = 0                                        // Reset lost counter
    end for

    # Step 7: Handle Unmatched Tracks
    for each track τ in unmatched_tracks do
        τ.lost_count += 1
        if τ.lost_count > max_age then
            τ.state = “removed”                                 // Delete track
        else
            τ.state = “lost”
        end if
    end for

    # Step 8: Initialize New Tracks
    for each detection d in unmatched_dets do
        if d.score ≥ τ_init then
            create_new_track(d)
        end if
    end for

    # Step 9: Output Results
    T_t = {τ ∈ T_{t-1} ∪ new_tracks | τ.state ≠ “removed”}
    return T_t
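Steps 1 and 7 of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch only: the `Detection` and `Track` classes, the function names, and the default values for `tau_high`, `tau_low`, and `max_age` are assumptions, not the settings used in the paper.

```python
from dataclasses import dataclass


@dataclass
class Detection:
    bbox: tuple  # (x1, y1, x2, y2)
    score: float


@dataclass
class Track:
    track_id: int
    state: str = "active"
    lost_count: int = 0


def split_detections(detections, tau_high=0.6, tau_low=0.1):
    """Step 1: split detections into high-score and low-score groups.

    The threshold defaults are placeholders chosen for illustration.
    """
    high = [d for d in detections if d.score >= tau_high]
    low = [d for d in detections if tau_low <= d.score < tau_high]
    return high, low


def age_unmatched_tracks(tracks, max_age=30):
    """Step 7: mark unmatched tracks as lost, or removed once too old."""
    for t in tracks:
        t.lost_count += 1
        t.state = "removed" if t.lost_count > max_age else "lost"
    return [t for t in tracks if t.state != "removed"]
```

Detections below `tau_low` fall into neither group and are discarded, while a track is only deleted after it has stayed unmatched for more than `max_age` consecutive frames.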
4. Experiments
4.1. Datasets
4.2. Metrics
4.3. Implementation Details
4.4. Comparison on MOT17, MOT20, and DanceTrack
4.5. Ablation Studies of Different Components
4.6. The Impact of Different Values
4.7. The Impact of the Max Age
4.8. The Impact of Different Detectors
4.9. Visualization
5. Conclusions
- (1) Objective 1 (Handling Partial Occlusion Challenges during Kalman Filter Update): To mitigate the impact of degraded detection quality during partial occlusion, the CBKF module was designed. Experiments confirm its efficacy: introducing it yields a +0.3 IDF1 improvement on both the MOT17 and MOT20 validation sets. This gain is attributed to the module’s ability to dynamically reweight detections during the Kalman update, which prevents trajectory drift.
- (2) Objective 2 (Compensation for Camera Motion): To stabilize the predictions on which IoU matching relies, a CMC module was incorporated. It yields a +0.9 IDF1 increase on the MOT17 validation set, the largest improvement among the four components. More importantly, CMC is a prerequisite for the reactivation mechanism to function effectively, so its value to the system extends beyond its direct performance gain.
- (3) Objective 3 (Robust Association under Occlusion): To resolve ambiguities in dense, overlapping scenes, the DCM algorithm was integrated. DCM boosts IDF1 by +0.4 on the MOT20 validation set, which confirms that depth-level partitioning effectively addresses complex inter-object interactions.
- (4) Objective 4 (Correction for Complete Occlusion): To address error accumulation in lost tracks, a Re-activate method was proposed. Although its isolated impact on the MOT20 validation set is smaller (+0.1 HOTA), its role in maintaining long-term identity consistency is critical.
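The depth-level partitioning mentioned under Objective 3 can be sketched as below. The sketch assumes a pseudo-depth defined as the distance from the bottom edge of a box to the bottom of the image, in the spirit of the pseudo-depth idea cited as [7], with equal-width depth levels; the function names and `num_levels` are hypothetical, and the paper’s actual partitioning may differ.

```python
def pseudo_depth(box, image_height):
    """Assumed pseudo-depth: distance from the box bottom to the image bottom.

    box is (x1, y1, x2, y2) in pixels with the origin at the top-left corner.
    """
    return image_height - box[3]


def partition_by_depth(boxes, image_height, num_levels=3):
    """Group box indices into equal-width depth levels, nearest level first."""
    levels = [[] for _ in range(num_levels)]
    if not boxes:
        return levels
    depths = [pseudo_depth(b, image_height) for b in boxes]
    lo, hi = min(depths), max(depths)
    span = (hi - lo) or 1.0  # avoid division by zero when all depths are equal
    for idx, d in enumerate(depths):
        level = min(int((d - lo) / span * num_levels), num_levels - 1)
        levels[level].append(idx)
    return levels
```

Matching would then run level by level, so that two overlapping targets standing at different depths are never compared within the same assignment problem.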
- (1) Dependence on Detector: Experiments demonstrate that TBD trackers depend heavily on detector performance, with higher-quality detectors significantly improving tracking accuracy. In addition, the CBKF module relies on the detector’s confidence scores being reasonably calibrated to detection quality, so performance may degrade if confidence scores are not predictive of localization accuracy.
- (2) Assumption for Camera Motion Scenario: The CMC module employs a global 2D model, which assumes dominant, low-frequency camera motion. Its effectiveness may diminish in scenarios with highly dynamic, non-planar backgrounds or very high-frequency jitter. Moreover, a 2D model is not suited to 3D scenes.
- (3) Scene-Type Specificity: The current motion model retains a linear constant-velocity assumption. Consequently, tracking performance can decline in sequences with extremely low frame rates or highly non-linear, agile object motion. Furthermore, the framework was primarily designed for and tested on pedestrians, and extending it to objects with highly deformable shapes may require adjustments.
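The global 2D model named in limitation (2) can be illustrated with a short sketch that applies a single 2x3 affine matrix to a predicted box. The matrix layout and the function name `warp_box` are assumptions for illustration; in practice the compensation is usually applied to the Kalman filter state rather than to a bare box.

```python
def warp_box(box, affine):
    """Apply a 2x3 affine matrix [[a, b, tx], [c, d, ty]] to a box.

    box is (x1, y1, x2, y2). All four corners are transformed and the
    axis-aligned bounding box of the result is returned.
    """
    (a, b, tx), (c, d, ty) = affine
    x1, y1, x2, y2 = box
    corners = [(x1, y1), (x2, y1), (x1, y2), (x2, y2)]
    warped = [(a * x + b * y + tx, c * x + d * y + ty) for x, y in corners]
    xs = [p[0] for p in warped]
    ys = [p[1] for p in warped]
    return (min(xs), min(ys), max(xs), max(ys))
```

Because one matrix is shared by every box in the frame, the model cannot represent parallax between near and far objects, which is the source of the limitation in non-planar scenes.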
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
Abbreviations
| Abbreviation | Full Term |
|---|---|
| OCCTrack | OcclusionTrack |
| MOT | Multi-Object Tracking |
| CBKF | Confidence-Based Kalman Filter |
| CMC | Camera Motion Compensation |
| DCM | Depth–Cascade-Matching |
| KF | Kalman Filter |
| TBD | Tracking-By-Detection |
References
- Bewley, A.; Ge, Z.; Ott, L.; Ramos, F.; Upcroft, B. Simple Online and Realtime Tracking. In Proceedings of the 2016 IEEE International Conference on Image Processing (ICIP), Phoenix, AZ, USA, 25–28 September 2016; pp. 3464–3468.
- Wojke, N.; Bewley, A.; Paulus, D. Simple online and realtime tracking with a deep association metric. In Proceedings of the IEEE International Conference on Image Processing, Beijing, China, 17–20 September 2017; pp. 3645–3649. Available online: https://ieeexplore.ieee.org/document/8296962 (accessed on 11 October 2023).
- Zhang, Y.; Sun, P.; Jiang, Y.; Yu, D.; Weng, F.; Yuan, Z.; Luo, P.; Liu, W.; Wang, X. Bytetrack: Multi-object tracking by associating every detection box. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 1–21.
- Kalman, R.E. A New Approach to Linear Filtering and Prediction Problems; Wiley Press: New York, NY, USA, 2001; Volume 82D, pp. 35–45. Available online: https://ieeexplore.ieee.org/document/5311910 (accessed on 12 October 2023).
- Aharon, N.; Orfaig, R.; Bobrovsky, B.Z. BoT-SORT: Robust Associations Multi-Pedestrian Tracking. arXiv 2022, arXiv:2206.14651.
- Maggiolino, G.; Ahmad, A.; Cao, J.; Kitani, K. Deep OC-SORT: Multi-pedestrian tracking by adaptive re-identification. In Proceedings of the 2023 IEEE International Conference on Image Processing (ICIP), Kuala Lumpur, Malaysia, 8–11 October 2023; pp. 3025–3029. Available online: https://ieeexplore.ieee.org/document/10222576 (accessed on 8 January 2025).
- Liu, Z.; Wang, X.; Wang, C.; Liu, W.; Bai, X. Sparsetrack: Multi-object tracking by performing scene decomposition based on pseudo-depth. IEEE Trans. Circuits Syst. Video Technol. 2025, 35, 4870–4882. Available online: https://ieeexplore.ieee.org/document/10819455 (accessed on 4 September 2025).
- Du, Y.; Zhao, Z.; Song, Y.; Zhao, Y.; Su, F.; Gong, T.; Meng, H. Strongsort: Make deepsort great again. IEEE Trans. Multimed. 2023, 25, 8725–8737. Available online: https://ieeexplore.ieee.org/document/10032656 (accessed on 22 November 2024).
- Cao, J.; Pang, J.; Weng, X.; Khirodkar, R.; Kitani, K. Observation-Centric SORT: Rethinking SORT for Robust Multi-Object Tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Vancouver, BC, Canada, 17–24 June 2023; pp. 9686–9696.
- Lv, W.; Huang, Y.; Zhang, N.; Lin, R.S.; Han, M.; Zeng, D. Diffmot: A real-time diffusion-based multiple object tracker with non-linear prediction. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 16–22 June 2024; pp. 19321–19330.
- Hong, J.; Li, Y.; Yan, J.; Wei, X.; Xian, W.; Qin, Y. KalmanFormer: Integrating a Deep Motion Model into SORT for Video Multi-Object Tracking. Appl. Sci. 2025, 15, 9727. Available online: https://www.mdpi.com/2076-3417/15/17/9727 (accessed on 25 October 2025).
- Bradski, G. The opencv library. Dr. Dobb’s J. Softw. Tools Prof. Program. 2000, 25, 122–125. Available online: https://www.researchgate.net/publication/233950935_The_Opencv_Library (accessed on 17 August 2025).
- Shi, J.; Tomasi, C. Good features to track. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 21–23 June 1994; pp. 593–600. Available online: https://ieeexplore.ieee.org/abstract/document/323794 (accessed on 17 August 2025).
- Fischler, M.A.; Bolles, R.C. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 1981, 24, 381–395. Available online: https://dl.acm.org/doi/10.1145/358669.358692 (accessed on 17 August 2025).
- Milan, A.; Leal-Taixé, L.; Reid, I.; Roth, S.; Schindler, K. Mot16: A benchmark for multi-object tracking. arXiv 2016, arXiv:1603.00831.
- Dendorfer, P.; Rezatofighi, H.; Milan, A.; Shi, J.; Cremers, D.; Reid, I.; Roth, S.; Schindler, K.; Leal-Taixé, L. Mot20: A benchmark for multi object tracking in crowded scenes. arXiv 2020, arXiv:2003.09003.
- Sun, P.; Cao, J.; Jiang, Y.; Yuan, Z.; Bai, S.; Kitani, K.; Luo, P. DanceTrack: Multi-Object Tracking in Uniform Appearance and Diverse Motion. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022. Available online: https://ieeexplore.ieee.org/document/9879192 (accessed on 3 September 2024).
- Bernardin, K.; Stiefelhagen, R. Evaluating multiple object tracking performance: The clear mot metrics. EURASIP J. Image Video Process. 2008, 2008, 246309. Available online: https://jivp-eurasipjournals.springeropen.com/articles/10.1155/2008/246309 (accessed on 5 September 2025).
- Luiten, J.; Osep, A.; Dendorfer, P.; Torr, P.; Geiger, A.; Leal-Taixé, L.; Leibe, B. Hota: A higher order metric for evaluating multi-object tracking. Int. J. Comput. Vis. 2021, 129, 548–578.
- Ristani, E.; Solera, F.; Zou, R.; Cucchiara, R.; Tomasi, C. Performance measures and a data set for multi-target, multi-camera tracking. In Proceedings of the European Conference on Computer Vision, Amsterdam, The Netherlands, 8–10 and 15–16 October 2016; pp. 17–35. Available online: https://link.springer.com/chapter/10.1007/978-3-319-48881-3_2 (accessed on 5 September 2025).
- Pang, B.; Li, Y.; Zhang, Y.; Li, M.; Lu, C. Tubetk: Adopting tubes to track multi-object in a one-step training model. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 6308–6318.
- Zhou, X.; Koltun, V.; Krähenbühl, P. Tracking objects as points. In Proceedings of the European Conference on Computer Vision, Glasgow, UK, 23–28 August 2020; pp. 474–490.
- Xu, Y.; Ban, Y.; Delorme, G.; Gan, C.; Rus, D.; Alameda-Pineda, X. TransCenter: Transformers with Dense Representations for Multiple-Object Tracking. IEEE Trans. Pattern Anal. Mach. Intell. 2023, 45, 7820–7835.
- Zhang, Y.; Wang, C.; Wang, X.; Zeng, W.; Liu, W. Fairmot: On the fairness of detection and re-identification in multiple object tracking. Int. J. Comput. Vis. 2021, 129, 3069–3087.
- Chu, P.; Wang, J.; You, Q.; Ling, H.; Liu, Z. Transmot: Spatial-temporal graph transformer for multiple object tracking. arXiv 2021, arXiv:2104.00194. Available online: https://ieeexplore.ieee.org/abstract/document/10030267 (accessed on 5 October 2025).
- Stadler, D.; Beyerer, J. Modelling ambiguous assignments for multi-person tracking in crowds. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, Waikoloa, HI, USA, 4–8 January 2022; pp. 133–142. Available online: https://ieeexplore.ieee.org/document/9707576 (accessed on 7 December 2025).
- Yang, M.; Han, G.; Yan, B.; Zhang, W.; Qi, J.; Lu, H.; Wang, D. Hybrid-sort: Weak cues matter for online multi-object tracking. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; pp. 6504–6512.
- Yi, K.; Luo, K.; Luo, X.; Huang, J.; Wu, H.; Hu, R.; Hao, W. Ucmctrack: Multi-object tracking with uniform camera motion compensation. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; pp. 6702–6710.
- Zhang, Y.; Sheng, H.; Wu, Y.; Wang, S.; Ke, W.; Xiong, Z. Multiplex labeling graph for near-online tracking in crowded scenes. IEEE Internet Things J. 2020, 7, 7892–7902. Available online: https://ieeexplore.ieee.org/document/9098857 (accessed on 11 October 2025).
- Sun, P.; Cao, J.; Jiang, Y.; Zhang, R.; Xie, E.; Yuan, Z.; Wang, C.; Luo, P. Transtrack: Multiple-object tracking with transformer. arXiv 2020, arXiv:2012.15460.
- Li, W.; Xiong, Y.; Yang, S.; Xu, M.; Wang, Y.; Xia, W. Semi-TCL: Semi-supervised track contrastive representation learning. arXiv 2021, arXiv:2107.02396.
- Yu, E.; Li, Z.; Han, S.; Wang, H. RelationTrack: Relation aware multiple object tracking with decoupled representation. arXiv 2021, arXiv:2105.04322. Available online: https://ieeexplore.ieee.org/document/9709649 (accessed on 11 October 2025).
- Qin, Z.; Zhou, S.; Wang, L.; Duan, J.; Hua, G.; Tang, W. MotionTrack: Learning Robust Short-term and Long-term Motions for Multi-Object Tracking. arXiv 2023, arXiv:2303.10404.
- Wu, J.; Cao, J.; Song, L.; Wang, Y.; Yang, M.; Yuan, J. Track to detect and segment: An online multi-object tracker. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 12352–12361. Available online: https://ieeexplore.ieee.org/document/9578864 (accessed on 11 October 2025).
- Zeng, F.; Dong, B.; Zhang, Y.; Wang, T.; Zhang, X.; Wei, Y. MOTR: End-to-end multiple-object tracking with transformer. In Proceedings of the European Conference on Computer Vision, Tel Aviv, Israel, 23–27 October 2022; pp. 659–675.

| Method | Key Characteristics and Limitations |
|---|---|
| SORT | SORT pioneered the tracking-by-detection paradigm and provides a strong baseline for subsequent research. However, it includes no modules designed specifically to handle occlusion. |
| DeepSORT | DeepSORT adds an appearance feature extraction module to SORT and introduces appearance similarity into the association stage. Under severe occlusion, however, the module may fail to extract accurate appearance features, which can introduce additional errors into the association stage. |
| ByteTrack | ByteTrack uses confidence scores during the association stage: manually set thresholds split detections into high-score and low-score groups, with high-score detections used in the first association and low-score detections in the second. This two-stage association effectively reduces the interference of occlusion during association. However, ByteTrack does not address camera motion or the impact of occlusion on other components. |

| Methods | HOTA | MOTA | IDF1 |
|---|---|---|---|
| Baseline | 67.9 | 77.8 | 79.9 |
| Linear | 68.0 | 77.8 | 79.8 |
| Sigmoid | 67.6 | 77.8 | 79.4 |
| CBKF | 68.1 | 77.8 | 80.3 |

| Methods | HOTA | MOTA | IDF1 |
|---|---|---|---|
| BB and same conf | 49.9 | 56.8 | 68.3 |
| Improved method | 51.1 | 59.0 | 69.1 |

| Tracker | HOTA↑ | MOTA↑ | IDF1↑ | ID Switch↓ | FP↓ | FN↓ | FPS↑ |
|---|---|---|---|---|---|---|---|
| Tube_TK [21] | 48.0 | 63.0 | 58.6 | 4137 | 27,060 | 177,483 | 3.0 |
| CenterTrack [22] | 52.2 | 67.8 | 64.7 | 3039 | 18,498 | 160,332 | 17.5 |
| TransCenter [23] | 54.5 | 73.2 | 62.2 | 4614 | 23,112 | 123,738 | 1.0 |
| FairMOT [24] | 59.3 | 73.7 | 72.3 | 3303 | 27,507 | 117,477 | 25.9 |
| TransMOT [25] | 61.7 | 76.7 | 75.1 | 2346 | 36,231 | 93,150 | 9.6 |
| MAATrack [26] | 62.0 | 79.4 | 75.9 | 1452 | 37,320 | 77,661 | 189.1 |
| ByteTrack [3] | 63.1 | 80.3 | 77.3 | 2196 | 25,491 | 83,721 | 29.6 |
| OC-SORT [9] | 63.2 | 78.0 | 77.5 | 1950 | 15,129 | 107,055 | 29.0 |
| Hybrid-SORT [27] | 63.6 | 79.3 | 78.4 | \ | \ | \ | \ |
| UCMCTrack [28] | 64.3 | 79.0 | 79.0 | \ | \ | \ | \ |
| StrongSORT [8] | 64.4 | 79.6 | 79.5 | 1194 | 27,876 | 86,205 | 7.1 |
| DiffMOT [10] | 64.5 | 79.8 | 79.3 | \ | \ | \ | \ |
| BoT-SORT [5] | 64.6 | 80.6 | 79.5 | 1257 | 22,524 | 85,398 | 6.6 |
| Deep OC-SORT [6] | 64.9 | 79.4 | 80.6 | 1023 | 16,572 | 98,796 | 28.1 |
| OCCTrack | 64.9 | 80.9 | 79.7 | 1269 | 23,421 | 83,199 | 12.5 |

| Tracker | HOTA↑ | MOTA↑ | IDF1↑ | ID Switch↓ | FP↓ | FN↓ | FPS↑ |
|---|---|---|---|---|---|---|---|
| MLT [29] | 43.2 | 48.9 | 54.6 | 2187 | 45,660 | 216,803 | 3.7 |
| TransTrack [30] | 48.5 | 65.0 | 59.4 | 3608 | 27,197 | 150,197 | 7.2 |
| FairMOT [24] | 54.6 | 61.8 | 67.3 | 5243 | 103,440 | 88,901 | 13.2 |
| Semi-TCL [31] | 55.3 | 65.2 | 70.1 | 4139 | 61,209 | 114,709 | \ |
| RelationTrack [32] | 56.5 | 67.2 | 70.5 | 4243 | 61,134 | 104,597 | 2.7 |
| MAATrack [26] | 57.3 | 73.9 | 71.2 | 1331 | 24,942 | 108,744 | 14.7 |
| ByteTrack [3] | 61.3 | 77.8 | 75.2 | 1223 | 26,249 | 87,594 | 17.5 |
| OC-SORT [9] | 62.4 | 75.7 | 76.3 | 942 | 19,067 | 105,894 | 18.7 |
| Hybrid-SORT [27] | 62.5 | 76.4 | 76.2 | \ | \ | \ | \ |
| StrongSORT [8] | 62.6 | 73.8 | 77.0 | 770 | 16,632 | 117,920 | 1.4 |
| BoT-SORT [5] | 62.6 | 77.7 | 76.3 | 1212 | 22,521 | 86,037 | 6.6 |
| MotionTrack [33] | 62.8 | 78.0 | 76.5 | 1165 | 28,629 | 84,152 | 9.0 |
| UCMCTrack [28] | 62.8 | 75.6 | 77.4 | 1335 | 28,678 | 96,198 | 44.8 |
| OCCTrack | 63.2 | 76.9 | 77.5 | 916 | 23,042 | 95,412 | 8.5 |

| Tracker | HOTA↑ | MOTA↑ | IDF1↑ | AssA↑ | DetA↑ |
|---|---|---|---|---|---|
| FairMOT [24] | 39.7 | 82.2 | 40.8 | 23.8 | 66.7 |
| CenterTrack [22] | 41.8 | 86.8 | 35.7 | 22.6 | 78.1 |
| TraDes [34] | 43.3 | 86.2 | 41.2 | 25.4 | 74.5 |
| TransTrack [30] | 45.5 | 88.4 | 45.2 | 27.5 | 75.9 |
| ByteTrack [3] | 47.7 | 89.6 | 53.9 | 32.1 | 71.0 |
| MotionTrack [33] | 52.9 | 91.3 | 53.8 | 34.7 | 80.9 |
| MOTR [35] | 54.2 | 79.7 | 51.5 | 40.2 | 73.5 |
| BoT-SORT [5] | 54.7 | 91.3 | 56.0 | 37.8 | 79.6 |
| OC-SORT [9] | 55.1 | 89.4 | 54.2 | 38.0 | 80.3 |
| SparseTrack [7] | 55.5 | 91.3 | 58.3 | 39.1 | 78.9 |
| StrongSORT [8] | 55.6 | 91.1 | 55.2 | 38.6 | 80.7 |
| OCCTrack | 57.5 | 91.4 | 58.4 | 40.9 | 81.0 |

| Method | CBKF | DCM | CMC | Re-Activate | HOTA | MOTA | IDF1 |
|---|---|---|---|---|---|---|---|
| Baseline | | | | | 68.1 | 77.7 | 80.0 |
| Baseline + 1 | √ | | | | 68.1 | 77.8 | 80.3 |
| Baseline + 2 | | √ | | | 67.8 | 77.6 | 79.5 |
| Baseline + 3 | | | √ | | 68.4 | 78.1 | 80.9 |
| Baseline + 1–2 | √ | √ | | | 68.1 | 77.5 | 80.3 |
| Baseline + 1–3 | √ | √ | √ | | 69.1 | 77.9 | 82.1 |
| Baseline + 1–4 | √ | √ | √ | √ | 69.1 | 78.0 | 82.1 |

| Method | CBKF | DCM | CMC | Re-Activate | HOTA | MOTA | IDF1 |
|---|---|---|---|---|---|---|---|
| Baseline | | | | | 69.5 | 87.9 | 84.4 |
| Baseline + 1 | √ | | | | 69.6 | 87.9 | 84.7 |
| Baseline + 2 | | √ | | | 69.6 | 87.8 | 84.7 |
| Baseline + 3 | | | √ | | 69.4 | 87.9 | 84.4 |
| Baseline + 1–2 | √ | √ | | | 69.6 | 87.8 | 85.1 |
| Baseline + 1–3 | √ | √ | √ | | 69.7 | 87.8 | 85.3 |
| Baseline + 1–4 | √ | √ | √ | √ | 69.8 | 87.8 | 85.3 |

| Values | HOTA | MOTA | IDF1 |
|---|---|---|---|
| 2 | 68.9 | 77.9 | 81.9 |
| 4 | 69.0 | 77.9 | 82.0 |
| 6 | 69.1 | 78.0 | 82.1 |
| 8 | 69.1 | 78.0 | 82.1 |
| 10 | 69.1 | 77.9 | 82.0 |

| Detector | HOTA | MOTA | IDF1 |
|---|---|---|---|
| YOLOX-X | 79.9 | 90.3 | 89.0 |
| YOLOX-Ablation | 69.1 | 78.0 | 82.1 |

| Method | CBKF | DCM | CMC | Re-Activate | HOTA | MOTA | IDF1 |
|---|---|---|---|---|---|---|---|
| Baseline | | | | | 77.9 | 90.1 | 86.4 |
| Baseline + 1 | √ | | | | 78.4 | 90.2 | 87.0 |
| Baseline + 2 | | √ | | | 78.2 | 89.6 | 86.8 |
| Baseline + 3 | | | √ | | 78.9 | 90.5 | 87.5 |
| Baseline + 1–4 | √ | √ | √ | √ | 79.9 | 90.3 | 89.0 |

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.
© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Chen, Y.; Meng, F.; Chen, Z. OcclusionTrack: Multi-Object Tracking in Dense Scenes. Appl. Sci. 2025, 15, 13030. https://doi.org/10.3390/app152413030
