Article

KOM-SLAM: A GNN-Based Tightly Coupled SLAM and Multi-Object Tracking Framework

1 Graduate School of Information Science and Technology, The University of Tokyo, Tokyo 113-0033, Japan
2 Institute of Industrial Science, The University of Tokyo, Tokyo 153-8505, Japan
3 Graduate School of Advanced Science and Engineering, Hiroshima University, Hiroshima 739-8527, Japan
* Author to whom correspondence should be addressed.
Sensors 2026, 26(1), 128; https://doi.org/10.3390/s26010128
Submission received: 19 November 2025 / Revised: 21 December 2025 / Accepted: 23 December 2025 / Published: 24 December 2025
(This article belongs to the Section Intelligent Sensors)

Abstract

Coupled simultaneous localization and mapping (SLAM) and multi-object tracking have attracted growing attention in recent years. Although existing approaches achieve promising results, they mostly associate keypoints and objects across frames separately, which limits their robustness in complex dynamic scenes. To overcome this limitation, we propose KOM-SLAM, a tightly coupled SLAM and multi-object tracking framework based on a Graph Neural Network (GNN), which jointly learns keypoint and object associations across frames while estimating ego-poses in a differentiable manner. The framework constructs a spatiotemporal graph over keypoints and object detections for association, and employs a multilayer perceptron (MLP) followed by a sigmoid activation that adaptively adjusts association thresholds based on ego-motion and spatial context. We apply a soft assignment on keypoints to ensure differentiable pose estimation, enabling the pose loss to supervise the association learning directly. Experiments on the KITTI Tracking dataset demonstrate that our method achieves improved performance in both localization and object tracking.

1. Introduction

Simultaneous localization and mapping (SLAM) and multi-object tracking are two essential tasks for autonomous systems. SLAM estimates the ego-pose (the current position and orientation of the sensor/robot) and builds a map of the surrounding environment [1,2], while multi-object tracking follows the motion of detected objects to support safe decision-making [3,4]. Although these tasks have traditionally been developed independently, they are naturally complementary: SLAM provides accurate ego-pose estimates for tracking, while object tracking helps identify dynamic objects, improving SLAM robustness in dynamic environments.
Tightly coupled SLAM and multi-object tracking systems usually build a factor graph of odometry and object states, and a joint optimization is applied to improve the accuracy and robustness of localization and object tracking using temporal and spatial constraints [5,6]. However, most existing approaches rely on manually designed heuristics for data association, such as descriptor similarity for keypoints and spatial proximity for objects. These heuristic associations can be brittle in cluttered or ambiguous environments, particularly when multiple dynamic objects occlude or overlap. Additionally, treating keypoint and object associations independently prevents full exploitation of the shared information between feature extraction and object detection.
Graph Neural Networks (GNNs) have emerged as powerful tools for learning associations in structured data by leveraging message passing to aggregate information from connected neighbors [7,8]. While GNNs have been applied to tasks like keypoint matching and object tracking independently [9,10], they have not been fully explored in a tightly coupled SLAM and multi-object tracking framework, where keypoint association and object association could be learned jointly. Moreover, most existing GNN-based approaches are not designed to support differentiable pose estimation, limiting their integration into end-to-end learning pipelines.
In response to these challenges, we propose KOM-SLAM, a tightly coupled SLAM and multi-object tracking framework, which jointly matches keypoints and objects across frames with a GNN. Our approach builds a graph that includes both keypoints and object detections across frames, enabling the GNN to simultaneously find the association between keypoints and between objects. Furthermore, to improve robustness under varying motion patterns, we apply a Multilayer Perceptron (MLP)–Sigmoid gating layer to restrict the association dynamically based on ego-motion and the spatial distance of each keypoint from the ego-pose. In addition, we adopt a soft assignment scheme over the keypoint similarity matrix, which enables differentiable pose estimation and allows the final odometry loss to directly supervise the GNN association learning.
The main contributions of the proposed GNN-based, tightly coupled SLAM and multi-object tracking system are the following:
1. We propose KOM-SLAM, a GNN-based, tightly coupled SLAM and multi-object tracking framework, where the GNN jointly learns associations between keypoints and objects. A soft assignment mechanism is applied to backpropagate the pose estimation loss through the GNN. To the best of our knowledge, this is the first learning-based system that tightly integrates SLAM and multi-object tracking.
2. We embed the ego-motion and the spatial distance between each keypoint and the ego-pose in the network, allowing dynamic adjustment of the keypoint matching range.
3. We validate the effectiveness of KOM-SLAM on the KITTI Tracking dataset, demonstrating improved performance in both pose estimation and object tracking.
The rest of the paper is organized as follows. Section 2 reviews the related work. Section 3 presents the proposed GNN-based, tightly coupled SLAM and multi-object tracking framework in detail. Experimental results and evaluations are demonstrated in Section 4. Finally, Section 5 concludes the paper and discusses directions of future work.

2. Related Work

2.1. Coupled SLAM and Multi-Object Tracking

Traditional SLAM systems rely on feature extraction and frame-to-frame correspondence to estimate ego-motion, using corner keypoints in visual inputs [11] or edge and planar points in Light Detection and Ranging (LiDAR) data [12]. However, including keypoints on dynamic objects in pose estimation introduces errors due to unmodeled motion. To address this, dynamic observation methods have been proposed to filter out potentially dynamic points—either by removing features associated with detected objects [13,14] or by segmenting dynamic regions using photometric or geometric cues [15]. The robustness of these methods is heavily dependent on the quality of the road scene understanding. Recent benchmarks, such as RSUD20K [16], have advanced the state-of-the-art in road scene understanding by providing diverse and high-resolution data for object detection. While effective in avoiding dynamic noise, these approaches often lack the ability to model or track object motion, limiting the SLAM system’s robustness in dynamic environments.
To improve robustness in dynamic environments, loosely coupled systems combine SLAM and multi-object tracking, where each component operates independently but shares information to enhance overall performance. In these frameworks, object tracking typically helps identify and remove dynamic points from sensor data, while SLAM uses the filtered static features to perform accurate ego-pose estimation. MaskFusion [17] and DOT [18], visual dynamic object tracking methods for SLAM, rely on reprojection or photometric errors to track object motion with object masks and to identify static keypoints. Instead of tracking image masks, ClusterSLAM [19] and ClusterVO [20] project keypoints into 3D, track 3D clusters of points, and estimate camera poses based on static features. LiDAR-based methods take advantage of accurate depth measurements and object detection, often using Kalman filtering and point cloud registration to track objects [21]. Hybrid LiDAR-visual systems, such as Dynam-LVIO [22], benefit from sensor redundancy and use both the iterative closest point (ICP) error and the reprojection error to update a Kalman filter that tracks objects. While these loosely coupled designs are modular and flexible, their ego-pose estimation relies only on static features, which may limit their effectiveness in highly dynamic scenes where static features are scarce.
In contrast to loosely coupled methods, tightly coupled SLAM and multi-object tracking frameworks typically jointly optimize ego-poses, object poses, and object motions with a factor graph in the backend. This joint optimization allows dynamic object observations and motion models to directly support ego-pose estimation, which is particularly beneficial when static features are sparse or occluded. Visual systems usually calculate reprojection errors as constraint functions in the joint optimization. DynaSLAM II [5] and CubeSLAM [23] detect 2D boxes and optimize object points with factor graphs, while segmentation-based approaches like TwistSLAM [24] apply optical flow to track the object points. Recent advances also explore richer representations, including quadrics [6] and articulated models for non-rigid bodies [25], improving robustness across object types. The LiDAR-based methods, such as DL-SLOT [26] and LIMOT [27], benefit from accurate 3D detections and pay more attention to the object trajectory prediction and association. Instead of applying constant motion constraints, LIO-LOT [28] applies a constant acceleration and angular velocity motion model for objects. IMM-SLAMMOT [29] includes multiple motion models in the factor graph to be more adaptive to the complex real-world dynamic scenarios. The switching-coupled systems combine the loosely coupled factors to avoid unreliable objects causing performance degradation [30,31]. Despite their effectiveness, most existing tightly coupled methods rely on heuristic-based association strategies, leaving room for learning-based techniques to better capture complex correspondence patterns.

2.2. GNN-Based Matching

Graph Neural Networks (GNNs) have emerged as a powerful tool for learning associations in structured data, such as object tracking and keypoint matching across frames. In multi-object tracking, Wang et al. use 2D appearance features as graph nodes, connect nodes across frames within pixel-distance thresholds, and apply GraphConv as their aggregation method [32]. Similarly, PTP [33] connects edges based on 3D distance and applies GraphConv; however, it utilizes Long Short-Term Memory (LSTM) and MLP modules to encode 3D detection results into positional features as GNN nodes. Instead of using GraphConv, GNN3DMOT [3] improves tracking performance by applying attention-based weights to aggregate feature differences between nodes and their neighbors. Brasó and Leal-Taixé build a graph over multiple frames and design a time-aware message-passing network to update the node features [34]. Following this work, SUSHI [10] and Bilgi et al. [35] perform detection association over different hierarchies of timespans to improve long-term association, applying a shared-weights GNN at each hierarchical level. Chen et al. apply a GNN to multi-object tracking in satellite videos, focusing on tiny object tracking [36]. Instead of using heuristic methods to define GNN edges, Gao et al. design NodeNet, constructed from an encoder and a decoder, to generate the edge connections for the GNN [37].
In keypoint matching, SuperGlue [9] represents keypoints as graph nodes with both descriptor and positioning features. It uses alternating self-attention layers, where nodes in the same frame are fully connected, and cross-attention layers, where nodes are fully connected across frames. Following SuperGlue, LightGlue [38] uses the positioning information to calculate the attention scores in self-attention layers and uses descriptor similarity in cross-attention layers. To improve the time efficiency of SuperGlue, SGMNet [39] applies a seeding module to generate a small set of matches as seeds. Subsequently, a Seeded GNN is applied to utilize seed matches to pass messages across frames and within frames through attentional aggregation. Similarly, ClusterGNN [40] reduces the number of edges by operating GNN on clusters of keypoints. They use the K-means algorithm to cluster the query and key vectors, and then apply a cluster-based sparse attention to aggregate the features. Furthermore, LoFTR [41] combines the GNN with a local feature detector to build an end-to-end trainable model. It enables the matching model to supervise the local feature convolutional neural network (CNN), which outperforms using a pre-built keypoint detector. These works collectively demonstrate GNNs’ strength in capturing contextual cues and finding correspondence in matching tasks. However, most existing works perform the object association and keypoint association independently, and the GNN approaches are not fully explored in the coupled object and keypoint matching.

3. Method

3.1. System Architecture

The overall structure of the proposed GNN-based coupled SLAM and multi-object tracking system is illustrated in Figure 1. The system takes the extracted keypoints and object detections from two frames of perception data at timestamps A and B as input. Its objective is to establish associations between keypoints and between objects across frames and to estimate the relative transformation $T_{AB} \in SE(3)$, which maps coordinates from frame A to frame B. In practice, we use frames A and B that are temporally adjacent (i.e., consecutive frames), as this ensures sufficient spatial overlap and stable geometric consistency for both keypoint and object associations. The framework does not strictly require consecutive frames, but larger temporal gaps generally increase motion magnitude and reduce association reliability. The impact of reduced frame density and increased time intervals is empirically evaluated in Section 4.4.3.
Each keypoint $i$ in frame $t \in \{A, B\}$ is represented as a tuple $(\mathbf{p}_i^t, \mathbf{d}_i^t)$, where $\mathbf{p}_i^t \in \mathbb{R}^3$ denotes the 3D position, and $\mathbf{d}_i^t \in \mathbb{R}^{F_k}$ is the corresponding descriptor. Similarly, each object detection $j$ in frame $t$ is defined as $(\mathbf{b}_j^t, \mathbf{f}_j^t)$, where $\mathbf{b}_j^t = (x_j, y_j, z_j, l_j, w_j, h_j, \theta_j)$ denotes the 3D bounding box parameters (position, size, and yaw angle) and $\mathbf{f}_j^t \in \mathbb{R}^{F_o}$ is the associated appearance feature.
In practice, the input data $(\mathbf{p}_i^t, \mathbf{d}_i^t)$ and $(\mathbf{b}_j^t, \mathbf{f}_j^t)$ are derived from stereo image inputs. Specifically, 2D keypoints and their descriptors are extracted from the left images using SuperPoint [11], and their 3D coordinates $\mathbf{p}_i^t$ are estimated by triangulation using depth inferred from the stereo pair and the camera intrinsic matrix. Object detections are generated by PointRCNN [42], which provides the 3D bounding box parameters $\mathbf{b}_j^t$. The associated appearance features $\mathbf{f}_j^t$ are extracted by applying a pretrained ResNet backbone to the cropped image region of each detected object.
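For readers unfamiliar with the lifting step, the following minimal NumPy sketch shows how a 2D keypoint with a known depth is back-projected to a 3D point under the pinhole model, $\mathbf{p} = z \cdot K^{-1}[u, v, 1]^\top$. The intrinsic values here are purely hypothetical; the paper uses the KITTI calibration and depth inferred from stereo, not these numbers.

```python
import numpy as np

def backproject(u, v, depth, K):
    """Lift a 2D keypoint (u, v) with known depth to a 3D point in the
    camera frame via the pinhole model: p = depth * K^{-1} [u, v, 1]^T."""
    uv1 = np.array([u, v, 1.0])
    return depth * np.linalg.inv(K) @ uv1

# Hypothetical intrinsics (fx = fy = 700, principal point at (320, 240)).
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])

# A keypoint at the principal point with 5 m depth maps to (0, 0, 5).
p = backproject(320.0, 240.0, 5.0, K)
```

A real pipeline would batch this over all keypoints and use the depth produced by stereo matching.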
The features and positioning information of keypoint and object are encoded as nodes in an attention-based GNN, including self-attention, keypoint–object, and cross-attention layers. In the GNN layers, the positioning attributes remain fixed and are used to guide the attention mechanism, while the feature vectors are the only components updated. The updated features are then used to compute matching score matrices across frames. For keypoints, an auxiliary matrix is additionally introduced to guide the similarity computation by incorporating the estimated relative transformation and spatial distances between the keypoints and the ego-poses.
Let $N_A$ and $N_B$ denote the number of keypoints extracted at timestamps A and B, respectively. Soft assignment is applied to the keypoint matching score matrix $P^{\mathrm{kp}} \in \mathbb{R}^{N_A \times N_B}$ to obtain the corresponding keypoints $\{\mathbf{p}_k^A, \mathbf{p}_k^B\}$. These correspondences are then used to estimate the relative transformation $T_{AB}$ via a differentiable optimization module. For object associations, a set of object correspondences $(i, j)$ is established through the object matching score matrix $P^{\mathrm{obj}} \in \mathbb{R}^{M_A \times M_B}$, where $M_A$ and $M_B$ are the numbers of detected objects in frames A and B, respectively. Unlike keypoints, soft assignment is not applied to objects due to their discrete and non-repeatable nature.
Finally, ego-pose optimization is performed by minimizing the alignment error of the matched keypoints. To further refine both pose and object trajectories, a factor graph is constructed over a sliding window to jointly optimize relative transformations, object poses, and object motions. The use of soft associations for keypoints, together with differentiable optimization, ensures that the entire framework is trainable.

3.2. Graph Neural Network

As illustrated in Figure 2, the proposed attention-based GNN jointly processes keypoints and object detections from two frames by propagating and refining node features through geometric and appearance-based relationships both within and across timestamps.
Each graph node is represented as a pair $n = (\mathbf{x}, \mathbf{g})$, where $\mathbf{x}$ denotes the learnable feature vector (keypoint descriptor $\mathbf{d}_i$ or object appearance feature $\mathbf{f}$), and $\mathbf{g}$ denotes the corresponding positional attribute (keypoint 3D position $\mathbf{p}$ or object bounding box parameters $\mathbf{b}$). While $\mathbf{g}$ remains fixed throughout message passing, it is used in the attention mechanism to construct queries and keys, thereby guiding how information flows between nodes. Only the feature $\mathbf{x}$ is updated layer by layer.
GNN edges are constructed differently across three layers to capture complementary contextual relationships. In the self-attention layer, keypoint nodes are connected to other keypoints within a predefined distance threshold in the same frame. Similarly, object nodes are connected to nearby object nodes in the same frame. This layer captures local spatial relationships. In the keypoint–object layer, the keypoint nodes and object nodes in the same frame are connected if a keypoint lies inside the 3D bounding box of the object, allowing keypoints to incorporate higher-level object context. Lastly, in the cross-attention layer, keypoints are connected to keypoints in the other frame within a spatial threshold. Likewise, object nodes are connected across frames based on spatial distance. This layer facilitates temporal correspondence reasoning across frames.
At each GNN layer, node features are updated through attention-based message passing from connected neighbors. Specifically, the updated feature $\mathbf{x}_i'$ for node $i$ is calculated by combining the original feature $\mathbf{x}_i$ with the aggregated message $\mathbf{m}_i$ via a 2-layer MLP, as defined in Equation (1):
$$\mathbf{x}_i' = \mathbf{x}_i + \mathrm{MLP}([\mathbf{x}_i \,\|\, \mathbf{m}_i]),$$
where $n_i' = (\mathbf{x}_i', \mathbf{g}_i)$ is the updated GNN node, and $\|$ denotes the concatenation operation. The 2-layer architecture is chosen to provide sufficient expressive power for feature refinement while maintaining computational efficiency and preventing over-fitting.
The aggregated message $\mathbf{m}_i$ is defined as a weighted sum over the neighbors $\mathcal{N}(i)$, calculated by Equations (2) and (3):
$$\mathbf{m}_i = \sum_{j \in \mathcal{N}(i)} \alpha_{ij} \mathbf{v}_j,$$
$$\alpha_{ij} = \mathrm{softmax}_k (\mathbf{q}_i^\top \mathbf{k}_k)_j,$$
where $\mathbf{v}$, $\mathbf{q}$, and $\mathbf{k}$ denote the value, query, and key vectors, respectively. These vectors are obtained by applying different linear layers to the node features. The specific choice of input features depends on the type of layer, as shown in Figure 2. In the self-attention layers, keypoints embed positioning information for queries and keys, while their descriptors are applied to generate value vectors. Object nodes follow the same pattern, using bounding boxes to generate queries and keys. In the keypoint–object layers, queries are derived from keypoint positions and keys from object bounding boxes, while the values are taken from either keypoint descriptors or object features, enabling information exchange between keypoints and objects. Finally, in the cross-attention layers, queries, keys, and values are all constructed from descriptors (for keypoints) or appearance features (for objects), so that message passing emphasizes similarity in feature space across frames.
Each GNN layer is applied sequentially, and multiple iterations of message passing can be performed to allow contextual information to propagate throughout the graph. The updated keypoint and object features are then used for computing similarity matrices and downstream estimation tasks.
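The update of Equations (1)–(3) can be sketched in a few lines of NumPy. This is an illustrative, single-head, loop-based toy (random weights, a toy two-layer MLP, dense neighborhoods), not the paper's trained network; it shows a self-attention-style layer where queries and keys come from the fixed positional attributes $\mathbf{g}$ and values from the learnable features $\mathbf{x}$.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def attention_update(x, g, Wq, Wk, Wv, mlp, edges):
    """One round of attention-based message passing (Eqs. (1)-(3)).
    x: (N, F) learnable node features; g: (N, G) fixed positional attributes;
    edges: dict mapping node i -> list of neighbour indices N(i)."""
    q = g @ Wq          # queries from positional attributes
    k = g @ Wk          # keys from positional attributes
    v = x @ Wv          # values from learnable features
    x_new = x.copy()
    for i, nbrs in edges.items():
        if not nbrs:
            continue
        scores = np.array([q[i] @ k[j] for j in nbrs])
        alpha = softmax(scores)                             # Eq. (3)
        m_i = (alpha[:, None] * v[nbrs]).sum(axis=0)        # Eq. (2)
        x_new[i] = x[i] + mlp(np.concatenate([x[i], m_i]))  # Eq. (1)
    return x_new

rng = np.random.default_rng(0)
F, G = 4, 3
Wq = rng.standard_normal((G, G))
Wk = rng.standard_normal((G, G))
Wv = rng.standard_normal((F, F))
W1 = rng.standard_normal((2 * F, 8))
W2 = rng.standard_normal((8, F))
mlp = lambda z: np.maximum(z @ W1, 0.0) @ W2  # toy 2-layer MLP with ReLU

x = rng.standard_normal((5, F))
g = rng.standard_normal((5, G))
edges = {i: [j for j in range(5) if j != i] for i in range(5)}
x_updated = attention_update(x, g, Wq, Wk, Wv, mlp, edges)
```

In the cross-attention layers the same update applies with queries, keys, and values all drawn from the descriptor features instead of $\mathbf{g}$.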

3.3. Matching Score Matrix Calculation

To determine correspondences between keypoints and between objects across two frames A and B, we compute two matching score matrices for keypoints and objects based on the updated node features obtained from the GNN. Since the processes are similar, we describe the calculation for keypoints, while the object case follows identically.
Let $D^A \in \mathbb{R}^{N_A \times F_k}$ and $D^B \in \mathbb{R}^{N_B \times F_k}$ denote the updated descriptor matrices for keypoints in frames A and B, respectively, where each row corresponds to $D_{i\cdot}^t := \tilde{\mathbf{d}}_i^t$, the updated feature of keypoint $i$ in frame $t$. We first calculate a similarity matrix $S \in \mathbb{R}^{N_A \times N_B}$ using the inner product of the feature matrices, as shown in Equation (4):
$$S = D^A (D^B)^\top.$$
We define the raw matching score matrix $P^{\mathrm{raw}} \in \mathbb{R}^{N_A \times N_B}$ as the sum of the row-wise and column-wise softmax normalizations of $S$ in Equation (5):
$$P_{ij}^{\mathrm{raw}} = \mathrm{softmax}_k (S_{ik})_j + \mathrm{softmax}_k (S_{kj})_i.$$
In addition to descriptor similarity, we introduce an auxiliary matrix $P^{\mathrm{aux}} \in [0, 1]^{N_A \times N_B}$ based on keypoint positions and the estimated ego-motion. As shown in Equation (6), we compute a learned threshold $\tau_{ij}$ using a lightweight network, which estimates, for each keypoint, the range in which its corresponding keypoint might lie:
$$\tau_{ij} = \mathrm{MLP}\big(\big[\, \|\mathbf{p}_i^A\|_2 \,\|\, \mathbf{v}_{\mathrm{ego}} \,\|\, \boldsymbol{\omega}_{\mathrm{ego}} \,\big]\big),$$
where $\|\mathbf{p}_i^A\|_2$ is the Euclidean distance from keypoint $i$ to the ego-pose in frame A, and $\mathbf{v}_{\mathrm{ego}}$ and $\boldsymbol{\omega}_{\mathrm{ego}}$ represent the estimated translational and rotational velocities of the ego-motion between frames A and B. The auxiliary matrix, which acts as a learned spatial prior, is then defined using the sigmoid function $\sigma$ in Equation (7):
$$P_{ij}^{\mathrm{aux}} = \sigma(\tau_{ij} - \delta_{ij}),$$
where $\delta_{ij} = \|\mathbf{p}_i^A - \mathbf{p}_j^B\|_2$ is the distance between the keypoint pair $(i, j)$.
Finally, the keypoint matching score matrix $P \in \mathbb{R}^{N_A \times N_B}$ is obtained by element-wise multiplication of the raw matching score matrix $P^{\mathrm{raw}}$ and the auxiliary matrix $P^{\mathrm{aux}}$, as shown in Equation (8):
$$P = P^{\mathrm{raw}} \odot P^{\mathrm{aux}}.$$
This fused matrix P is used for soft keypoint matching and subsequent relative pose estimation. Note that for objects, the raw matching score matrix is directly used as the matching score matrix.
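The score pipeline of Equations (4), (5), (7), and (8) is straightforward to prototype. The sketch below is a NumPy illustration with random descriptors; for brevity, a fixed scalar threshold stands in for the learned MLP of Equation (6), so the values are illustrative only.

```python
import numpy as np

def softmax_rows(S):
    """Numerically stable row-wise softmax."""
    e = np.exp(S - S.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def matching_scores(DA, DB, dist, tau):
    """Keypoint matching scores (Eqs. (4), (5), (7), (8)).
    DA: (NA, F), DB: (NB, F) updated descriptors;
    dist: (NA, NB) pairwise distances delta_ij;
    tau: (NA, NB) thresholds (learned by an MLP in the paper)."""
    S = DA @ DB.T                                   # Eq. (4): inner products
    P_raw = softmax_rows(S) + softmax_rows(S.T).T   # Eq. (5): row + column softmax
    P_aux = 1.0 / (1.0 + np.exp(-(tau - dist)))     # Eq. (7): sigmoid gate
    return P_raw * P_aux                            # Eq. (8): element-wise product

rng = np.random.default_rng(1)
DA = rng.standard_normal((4, 8))
DB = rng.standard_normal((6, 8))
dist = rng.uniform(0.0, 5.0, size=(4, 6))
tau = np.full((4, 6), 2.0)   # stand-in for the learned threshold of Eq. (6)
P = matching_scores(DA, DB, dist, tau)
```

Note how pairs with $\delta_{ij}$ far beyond the threshold are suppressed toward zero regardless of descriptor similarity, which is exactly the intended gating effect.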

3.4. Correspondence Association and Joint Optimization

Given the matching score matrices computed from the updated keypoint and object features, our framework identifies inter-frame correspondences and jointly optimizes the relative ego-motion and object states.
Object correspondences are first determined through an argmax operation combined with a mutual consistency check. Let $P^{\mathrm{obj}} \in \mathbb{R}^{M_A \times M_B}$ denote the object-level matching score matrix between the $M_A$ and $M_B$ detected objects in frames A and B, respectively. To ensure that both objects select each other as the most similar match, a candidate correspondence $(i, j)$ is selected only if both conditions in Equations (9) and (10) are met:
$$j = \operatorname*{argmax}_k P_{ik}^{\mathrm{obj}},$$
$$i = \operatorname*{argmax}_k P_{kj}^{\mathrm{obj}}.$$
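The mutual consistency check of Equations (9) and (10) amounts to keeping only pairs that are each other's row-wise and column-wise argmax. A minimal NumPy sketch with a hypothetical score matrix:

```python
import numpy as np

def mutual_matches(P):
    """Object correspondences via mutual argmax (Eqs. (9)-(10)):
    (i, j) is kept only if j is i's best match and i is j's best match."""
    best_B = P.argmax(axis=1)  # best frame-B candidate for each object i in A
    best_A = P.argmax(axis=0)  # best frame-A candidate for each object j in B
    return [(i, j) for i, j in enumerate(best_B) if best_A[j] == i]

# Toy score matrix: object 2 in frame A prefers column 1, but column 1
# prefers object 1, so (2, 1) is rejected by the mutual check.
P_obj = np.array([[0.9, 0.1, 0.2],
                  [0.3, 0.8, 0.7],
                  [0.2, 0.6, 0.1]])
matches = mutual_matches(P_obj)  # -> [(0, 0), (1, 1)]
```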
Keypoint correspondences are derived using soft assignment. Given the keypoint score matrix $P^{\mathrm{kp}} \in \mathbb{R}^{N_A \times N_B}$, the corresponding location in frame B for each keypoint $\mathbf{p}_i^A$ in frame A is estimated as a weighted average, as defined in Equation (11):
$$\hat{\mathbf{p}}_i^B = \sum_{j=1}^{N_B} \frac{\exp(P_{ij})}{\sum_{k=1}^{N_B} \exp(P_{ik})} \cdot \mathbf{p}_j^B.$$
Crucially, this soft assignment mechanism is designed to maintain differentiability, allowing the matching scores to be directly supervised by the downstream odometry loss.
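Equation (11) is a row-wise softmax over the scores followed by a weighted average of frame-B positions, which is why gradients flow from the estimated locations back into $P$. A minimal NumPy sketch with toy scores and positions:

```python
import numpy as np

def soft_assign(P, pB):
    """Soft keypoint assignment (Eq. (11)): each frame-A keypoint is matched to
    a softmax-weighted average of frame-B keypoint positions, keeping the
    operation differentiable with respect to the scores P."""
    W = np.exp(P - P.max(axis=1, keepdims=True))  # stable row-wise softmax
    W = W / W.sum(axis=1, keepdims=True)
    return W @ pB                                 # (NA, 3) estimated locations

# Toy example: sharply peaked scores make the soft assignment approach
# a hard one-to-one matching.
P = np.array([[10.0, 0.0],
              [0.0, 10.0]])
pB = np.array([[1.0, 0.0, 0.0],
               [0.0, 1.0, 0.0]])
p_hat = soft_assign(P, pB)  # rows are close to pB[0] and pB[1], respectively
```

With flatter score rows, the estimate blends several candidate positions, which is what allows ambiguous matches to be supervised smoothly by the downstream pose loss.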
The resulting set of soft correspondences $(\mathbf{p}_i^A, \hat{\mathbf{p}}_i^B)$ is then used to estimate the relative transformation $T_{AB} \in SE(3)$ by minimizing the sum of squared alignment errors $E_{\mathrm{rel}}$ defined in Equation (12):
$$E_{\mathrm{rel}} = \sum_i \big\| \hat{\mathbf{p}}_i^B - T_{AB} \cdot \mathbf{p}_i^A \big\|_2^2.$$
This optimization is solved using the Levenberg–Marquardt (LM) algorithm [43], implemented in a differentiable form with Theseus [44]. This transformation aligns keypoints from frame A to frame B and provides an initial estimate of ego-motion. For the initial guess of T A B in the optimization, we use a constant motion model. For the first frame, T A B is initialized as an identity transformation.
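For intuition, the least-squares objective of Equation (12) also admits a closed-form solution via the Kabsch algorithm when correspondences are fixed. The sketch below is a reference implementation of that closed form on synthetic data, not the paper's method: KOM-SLAM uses differentiable Levenberg-Marquardt (Theseus) precisely so that gradients can flow through the soft correspondences.

```python
import numpy as np

def align_kabsch(pA, pB):
    """Closed-form least-squares rigid alignment (Kabsch): finds R, t minimising
    sum_i || pB_i - (R @ pA_i + t) ||^2, i.e. the objective of Eq. (12)."""
    cA, cB = pA.mean(axis=0), pB.mean(axis=0)
    H = (pA - cA).T @ (pB - cB)                 # cross-covariance of centred sets
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))      # guard against reflections
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = cB - R @ cA
    return R, t

# Synthetic check: rotate random points 30 degrees about z and translate.
theta = np.pi / 6
R_gt = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                 [np.sin(theta),  np.cos(theta), 0.0],
                 [0.0,            0.0,           1.0]])
t_gt = np.array([1.0, -2.0, 0.5])
pA = np.random.default_rng(2).standard_normal((20, 3))
pB = pA @ R_gt.T + t_gt
R, t = align_kabsch(pA, pB)   # recovers R_gt, t_gt exactly
```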
To further refine the ego-motion estimate and incorporate object dynamics, we employ a joint optimization framework with a factor graph following [27]. The graph consists of ego-pose nodes, object pose nodes, and object motion nodes spanning a fixed-length sliding window (set to 5 frames). Let $X$ denote the set of all variables in the graph. The following factors are defined: (i) odometry constraints $E_{\mathrm{odom}}$ between consecutive ego-poses; (ii) observation constraints $E_{\mathrm{obs}}$ linking ego-poses to observed object poses; (iii) motion constraints $E_{\mathrm{motion}}$ connecting consecutive object poses and motions; and (iv) constant-velocity constraints $E_{\mathrm{const}}$ between object motion nodes, as shown in Equation (13):
$$X^* = \operatorname*{argmin}_X \left( E_{\mathrm{odom}} + E_{\mathrm{obs}} + E_{\mathrm{motion}} + E_{\mathrm{const}} \right),$$
where $X^*$ denotes the optimized factor graph nodes. The optimization is again performed using the LM algorithm [43] with Theseus [44], maintaining full differentiability. This joint optimization allows for mutual refinement of ego-motion and object tracking, enhancing both consistency and accuracy. Crucially, both the soft assignment mechanism and the factor graph optimization are implemented in a differentiable manner, forming the foundation for our learning pipeline.
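To make the structure of Equation (13) concrete, here is a deliberately simplified 1-D linear toy: three ego poses, one tracked object over three frames, and one shared velocity variable (sharing the velocity across steps plays the role of $E_{\mathrm{const}}$). All measurement values are hypothetical, and linear least squares replaces the SE(3) Levenberg-Marquardt solve of the actual system.

```python
import numpy as np

# Variables theta = [x0, x1, x2, o0, o1, o2, v]: ego poses x, object positions o,
# and a single constant object velocity v. Each factor is one row of A @ theta = b.
u = [1.0, 1.0]          # odometry measurements (E_odom): x_{k+1} - x_k = u_k
z = [4.0, 3.5, 3.0]     # relative object observations (E_obs): o_k - x_k = z_k

rows, rhs = [], []
def add_factor(coeffs, value):
    row = np.zeros(7)
    for idx, c in coeffs:
        row[idx] = c
    rows.append(row)
    rhs.append(value)

add_factor([(0, 1.0)], 0.0)                            # prior anchoring x0 = 0
for k in range(2):                                     # E_odom factors
    add_factor([(k + 1, 1.0), (k, -1.0)], u[k])
for k in range(3):                                     # E_obs factors
    add_factor([(3 + k, 1.0), (k, -1.0)], z[k])
for k in range(2):                                     # E_motion: o_{k+1} - o_k - v = 0
    add_factor([(4 + k, 1.0), (3 + k, -1.0), (6, -1.0)], 0.0)

A, b = np.array(rows), np.array(rhs)
theta, *_ = np.linalg.lstsq(A, b, rcond=None)  # X* = argmin of the summed residuals
```

Because the toy measurements are self-consistent, the solve recovers ego poses $[0, 1, 2]$, object positions $[4, 4.5, 5]$, and velocity $0.5$ with zero residual; with noisy measurements the factors trade off against each other, which is the point of the joint optimization.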

3.5. Training Loss

To supervise the learning process of KOM-SLAM, we design a composite loss function covering object association, keypoint correspondence estimation, and ego-motion prediction. The overall loss is defined in Equation (14):
$$L = \lambda_{\mathrm{obj}} L_{\mathrm{obj}} + \lambda_{\mathrm{kp}} L_{\mathrm{kp}} + \lambda_{\mathrm{odom}} L_{\mathrm{odom}},$$
where $\lambda_{\mathrm{obj}}$, $\lambda_{\mathrm{kp}}$, and $\lambda_{\mathrm{odom}}$ are weighting coefficients.
Ground-truth tracking IDs are used to establish object correspondences between frames A and B. Let $\hat{P}^{\mathrm{obj}}$ denote the ground-truth object matching score matrix, where $\hat{P}_{ij}^{\mathrm{obj}} = 1$ for objects with the same tracking ID and $\hat{P}_{ij}^{\mathrm{obj}} = 0$ for objects with different tracking IDs. Following PTP [33], we apply the binary cross-entropy (BCE) loss to each entry of $P^{\mathrm{obj}}$ and the cross-entropy (CE) loss to each column of $P^{\mathrm{obj}}$, as shown in Equation (15):
$$L_{\mathrm{obj}} = L_{\mathrm{BCE}} + L_{\mathrm{CE}}.$$
Ground-truth matched keypoint pairs are identified using both spatial distance and descriptor similarity after aligning keypoints in frame A to the coordinate frame of B using the ground-truth relative transformation $T_{AB,\mathrm{gt}}$. A pair $(i, j)$ is considered positive if both the spatial distance (Equation (16)) and descriptor similarity (Equation (17)) criteria are met:
$$\big\| T_{AB,\mathrm{gt}} \cdot \mathbf{p}_i^A - \mathbf{p}_j^B \big\|_2 < \delta_{\mathrm{dist}},$$
$$\cos(\mathbf{d}_i^A, \mathbf{d}_j^B) > \delta_{\mathrm{sim}},$$
where $\delta_{\mathrm{dist}}$ and $\delta_{\mathrm{sim}}$ are the distance and descriptor similarity thresholds, respectively. Let $P_{\mathrm{gt}}^{\mathrm{kp}}$ denote the set of ground-truth matched keypoint pairs. As shown in Equation (18), following LightGlue [38], the keypoint matching loss is formulated as the mean negative log-likelihood over the ground-truth matched pairs:
$$L_{\mathrm{kp}} = -\frac{1}{|P_{\mathrm{gt}}^{\mathrm{kp}}|} \sum_{(i,j) \in P_{\mathrm{gt}}^{\mathrm{kp}}} \log P_{ij}^{\mathrm{kp}}.$$
To supervise ego-motion estimation, we penalize both translation and rotation errors between the predicted transformation $T_{AB}$ and the ground truth $T_{AB,\mathrm{gt}}$. Let $\mathbf{q}_{\mathrm{est}}, \mathbf{q}_{\mathrm{gt}} \in \mathbb{R}^4$ denote the estimated and ground-truth unit quaternions, and $\mathbf{t}_{\mathrm{est}}, \mathbf{t}_{\mathrm{gt}} \in \mathbb{R}^3$ denote the translation vectors. The transformation loss $L_{\mathrm{odom}}$ is defined as the sum of the rotation and translation errors, as shown in Equation (19):
$$L_{\mathrm{odom}} = \beta \cdot \| \mathbf{q}_{\mathrm{est}} - \mathbf{q}_{\mathrm{gt}} \|_2 + \| \mathbf{t}_{\mathrm{est}} - \mathbf{t}_{\mathrm{gt}} \|_2,$$
where $\beta$ is a weighting factor that balances the contributions of rotational and translational errors. The factor $\beta$ performs scale normalization, which is necessary because the rotational error (the distance between unit quaternions) and the translational error (in meters) typically have different magnitudes. This ensures that both components contribute equitably to the overall loss gradient during training.
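The loss terms of Equations (18) and (19) are simple enough to verify numerically. The following NumPy sketch uses toy inputs (identity-like quaternions, a 0.1 m translation error, a 2x2 score matrix) purely to illustrate the formulas:

```python
import numpy as np

def kp_matching_loss(P_kp, gt_pairs):
    """Keypoint loss (Eq. (18)): mean negative log-likelihood over the
    ground-truth matched pairs."""
    return -np.mean([np.log(P_kp[i, j]) for (i, j) in gt_pairs])

def odom_loss(q_est, q_gt, t_est, t_gt, beta=200.0):
    """Odometry loss (Eq. (19)): beta re-scales the (numerically smaller)
    quaternion error against the translation error in metres."""
    return beta * np.linalg.norm(q_est - q_gt) + np.linalg.norm(t_est - t_gt)

P_kp = np.array([[0.9, 0.2],
                 [0.1, 0.8]])
L_kp = kp_matching_loss(P_kp, [(0, 0), (1, 1)])

# Perfect rotation, 0.1 m translation error -> L_odom = 0.1.
L_od = odom_loss(np.array([1.0, 0.0, 0.0, 0.0]),
                 np.array([1.0, 0.0, 0.0, 0.0]),
                 np.array([0.1, 0.0, 0.0]),
                 np.array([0.0, 0.0, 0.0]))
```

Note how a small quaternion error would be multiplied by $\beta$ before being compared against metric translation error, matching the scale-normalization argument above.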

4. Experiments

We conduct experiments on the KITTI Tracking dataset [45] to evaluate the effectiveness of KOM-SLAM. Section 4.1 describes the implementation details and dataset setup. The experimental evaluation is organized into two main parts: Section 4.2 presents the results for odometry estimation, while Section 4.3 evaluates the multi-object tracking performance.

4.1. Experimental Details

For keypoint extraction, we apply SuperPoint [11], a self-supervised keypoint detector and descriptor, to generate keypoints from the images of the KITTI Tracking dataset. To obtain 3D object detections, we adopt PointRCNN [42], a two-stage 3D object detector that operates on LiDAR point clouds. For both models, we use the publicly available pretrained weights. To fully evaluate KOM-SLAM on all sequences of the KITTI Tracking dataset, we train two models: one on sequences 0000–0010, and the other on sequences 0011, 0013, 0014, 0018, 0019, and 0020. This separation is necessary because we avoid evaluating a sequence with a model that has been trained on it. By training two models on complementary subsets, we ensure that every sequence can be evaluated without overlapping with its training data. Following prior works [5,24], sequences 0012, 0015, 0016, and 0017 are excluded from the evaluation because they contain little or no camera motion. All models are trained using the Adam optimizer with a learning rate of $1 \times 10^{-4}$ for 80 epochs and a batch size of 1. The GNN module consists of two layers, each integrating self-attention, cross-attention, and keypoint–object interaction mechanisms, with a feature dimension of 256. The training loss is a weighted combination of object association loss, keypoint association loss, and odometry consistency loss, with weights set to $\lambda_{\mathrm{obj}} = 0.8$, $\lambda_{\mathrm{kp}} = 1.0$, and $\lambda_{\mathrm{odom}} = 0.6$, respectively. The overall balance factor is set to $\beta = 200.0$. These hyperparameters were determined through a grid search. We observed that the system maintains stable performance within a $\pm 10\%$ range of the specified $\lambda$ values. However, we found that reducing the factor $\beta$ increases the rotation error, because rotational errors are numerically smaller than translational errors.
The proposed framework runs at approximately 49 ms per frame, where the GNN module takes approximately 22 ms per frame, and the backend optimization costs approximately 24 ms per frame. All experiments are conducted on a desktop system equipped with an NVIDIA GeForce RTX 4070 GPU. Figure 3 illustrates an example of the joint keypoint and object association results generated by KOM-SLAM on the KITTI Tracking dataset, highlighting successful correspondence in a challenging dynamic scene.

4.2. Odometry Estimation

This subsection presents the evaluation of ego-pose estimation performance using KOM-SLAM on the KITTI Tracking dataset. Specifically, sequences 0000–0010 are evaluated using a model trained on sequences 0011, 0013, 0014, 0018, 0019, and 0020, while the remaining sequences are tested using the model trained on 0000–0010. This split ensures that the test sequences are not seen during training, enabling a fair evaluation of generalization performance. We use the Relative Pose Error (RPE) as the evaluation metric to assess the accuracy of the estimated ego motion. RPE measures the local accuracy of the transformation between two consecutive frames and is particularly suitable for evaluating methods that emphasize pairwise frame association, such as ours.
Table 1 compares our method with three established baselines: ORB-SLAM2 [1], DynaSLAM [46], and DynaSLAM2 [5]. KOM-SLAM outperforms the baselines in both translational (RPEt) and rotational (RPER) components on the majority of sequences. Notably, it achieves the lowest mean RPEt of 0.041 m/frame and the lowest mean RPER of 0.028°/frame across all test sequences. This demonstrates the effectiveness of incorporating both keypoint- and object-level associations for robust ego-motion estimation in dynamic environments. Additionally, KOM-SLAM achieves the lowest standard deviation in translation and a standard deviation in rotation comparable to the best-performing method (0.020 vs. 0.015 of DynaSLAM [46]), demonstrating stable and consistent pose estimates across different sequences.

4.3. Multi-Object Tracking

This subsection presents the evaluation of multi-object tracking performance using KOM-SLAM on the KITTI Tracking dataset. Following the evaluation protocol of DynaSLAM2 [5], we compare object pose estimation accuracy across the 12 longest object trajectories of the KITTI Tracking dataset. We adopt standard evaluation metrics to assess tracking performance: True Positives (TP) and Multi-Object Tracking Precision (MOTP), computed in 2D image space, Bird's-Eye View (BV), and full 3D space. TP reflects the percentage of correctly tracked frames with respect to ground-truth annotations, while MOTP measures the average spatial alignment of matched objects; higher TP and MOTP indicate better tracking continuity and localization accuracy. Table 2 summarizes the comparison between KOM-SLAM and DynaSLAM2. KOM-SLAM achieves higher TP and MOTP across all three evaluation spaces (2D, BV, and 3D), demonstrating substantial improvements in both tracking continuity and pose accuracy.
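The two metrics can be summarized as follows. This is a generic sketch: the exact matching criterion (e.g., an IoU or distance threshold in each evaluation space) is not restated here, and the function names are illustrative:

```python
def tp_rate(n_tracked, n_gt):
    """TP: percentage of ground-truth frames in which the object
    was correctly tracked."""
    return 100.0 * n_tracked / n_gt

def motp(match_scores):
    """MOTP: mean alignment score of matched object pairs, expressed
    as a percentage. `match_scores` holds one per-frame alignment
    score in [0, 1] (e.g., an overlap measure) for each correctly
    tracked frame; the paper's exact definition may differ."""
    if not match_scores:
        return 0.0
    return 100.0 * sum(match_scores) / len(match_scores)
```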

4.4. Ablation Study

To systematically analyze the contribution of each key component in KOM-SLAM, we conduct an ablation study focusing on the keypoint–object layer, attention-based GNN, the MLP–Sigmoid gating layer, the keypoint soft assignment mechanism, and the robustness under varying keypoint densities and temporal spacing.

4.4.1. Ego Pose Estimation Ablation

Table 3 presents the ego-pose estimation results for several ablated variants on the KITTI Tracking dataset. Specifically, we evaluate the following: (1) No Keypoint–Object Layer, where the keypoint–object interaction layer in the GNN is removed and only the self-attention and cross-attention layers are used; (2) No GNN, where the attention-based GNN is removed and associations are computed directly from raw keypoint features; (3) No Gating Layer, where the MLP–Sigmoid gating mechanism is disabled and association scores are taken directly from the raw matching score matrix P_raw; (4) No Soft Assignment, where hard assignment based on the matching score matrix is used instead of soft assignment; and (5) KOM-SLAM, the full KOM-SLAM framework.
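The difference between variants (4) and (5) can be illustrated as follows. This is a minimal sketch assuming a row-wise softmax relaxation; the actual system may use a different normalization:

```python
import numpy as np

def soft_assignment(scores, tau=1.0):
    """Differentiable row-wise softmax over the matching score matrix,
    which lets a downstream pose loss back-propagate into association.
    The temperature tau is an illustrative assumption."""
    z = scores / tau
    z = z - z.max(axis=1, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def hard_assignment(scores):
    """Non-differentiable argmax used by the 'No Soft Assignment'
    variant: gradients cannot flow through the selection."""
    out = np.zeros_like(scores)
    out[np.arange(scores.shape[0]), scores.argmax(axis=1)] = 1.0
    return out
```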
Across most sequences, the translational error (RPEt) remains relatively stable among different variants, indicating that coarse geometric alignment can still be obtained without learned association refinement. In contrast, the rotational error (RPER) exhibits a clear sensitivity to the association strategy. In particular, removing the keypoint–object layer leads to noticeably higher rotation errors and variance, highlighting its importance for stable and accurate pose estimation.
The full model consistently achieves the lowest mean rotation error and standard deviation, confirming that the combination of keypoint–object interaction, GNN-based feature refinement, MLP–Sigmoid gating, and differentiable soft assignment is essential for robust ego-motion estimation. These results validate the necessity of our core design choices rather than relying on any single component alone.

4.4.2. Object Tracking Association Ablation

To evaluate the effectiveness of the GNN-based association module for object tracking, we compare three variants using standard Multi-Object Tracking metrics: Multi-Object Tracking Accuracy (MOTA) and ID F1 Score (IDF1), as shown in Table 4. The compared methods include the following: (1) a Naive Approach, which associates objects solely based on the minimum Euclidean distance between the bounding box centers of the current detection and the predicted objects from the previous information; (2) No GNN, which uses raw appearance features without GNN-based refinement; (3) No Keypoint–Object Layer, which removes the keypoint–object interaction layer in GNN and only uses self-attention and cross-attention layers; and (4) KOM-SLAM, the full KOM-SLAM association pipeline.
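The Naive Approach in (1) can be sketched as greedy nearest-center matching between frames; the gating threshold `max_dist` is an illustrative assumption, not a reported parameter:

```python
import numpy as np

def naive_associate(prev_centers, curr_centers, max_dist=5.0):
    """Greedy nearest-center association sketch: each current detection
    is linked to the closest unmatched previous object within max_dist
    (in meters). Appearance is ignored, which is why identities break
    under occlusion and close interaction."""
    matches, used = {}, set()
    for j, c in enumerate(curr_centers):
        dists = [np.linalg.norm(c - p) for p in prev_centers]
        for i in np.argsort(dists):          # try nearest candidates first
            if i not in used and dists[i] <= max_dist:
                matches[j] = int(i)
                used.add(int(i))
                break
    return matches
```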
For this ablation, ground-truth object detections are used to isolate the influence of the association mechanism and to eliminate variance introduced by the upstream 3D detector. The naive distance-based approach performs poorly in sequences with frequent occlusions and interacting objects, leading to low MOTA and IDF1 scores. The No GNN and No Keypoint–Object Layer variants improve performance in most sequences by leveraging appearance cues, but they still struggle with identity consistency under complex interactions.
The full model achieves the highest MOTA and IDF1 scores in most sequences, demonstrating that the GNN architecture effectively captures relationships between objects. This results in more robust identity preservation and significantly improved tracking precision compared with heuristic distance-based association or raw appearance-feature matching.

4.4.3. Data Density and Time Gap Analysis

To assess the robustness of KOM-SLAM under practical sensing constraints, we further analyze the impact of reduced keypoint density and increased temporal spacing between frames, with results summarized in Table 5. For the Sparser Keypoints variant, 50% of the detected keypoints are randomly retained in each frame. For the Every 2 Frames variant, only every second frame in each sequence is used for pose estimation.
When keypoint density is reduced, KOM-SLAM maintains comparable mean and standard deviation of the relative pose error, demonstrating strong robustness to sparse visual features. This indicates that the learned association mechanism can effectively find correspondences even with limited observations.
In contrast, increasing the temporal gap between frames results in noticeable degradation, particularly in rotation error. This effect is most pronounced in sequences with sharp turns (e.g., 0000, 0004, 0007), where larger inter-frame motion violates the small-motion assumption implicitly used in pairwise matching. These results highlight the importance of temporal resolution for accurate motion estimation, while also showing that, in sequences without large inter-frame motion, KOM-SLAM degrades gracefully rather than failing catastrophically.

5. Conclusions and Future Work

In this work, we introduced KOM-SLAM, a GNN-based framework that tightly couples odometry estimation and multi-object tracking. Our method constructs a spatiotemporal graph over keypoints and objects across frames, enabling the GNN to jointly update their features for robust association. A differentiable soft assignment mechanism is integrated with odometry estimation, allowing pose loss to directly supervise the learning process. To further enhance the robustness of keypoint association, we incorporate ego-motion and spatial priors into the matching process through an MLP–Sigmoid gating layer. We evaluated KOM-SLAM on the KITTI Tracking dataset, where it consistently outperformed strong baselines. The results demonstrate notable gains in both odometry accuracy and multi-object tracking precision, particularly in challenging sequences with frequent object motion and occlusion.
While KOM-SLAM demonstrates improvements in ego-motion estimation and multi-object tracking, the current framework presents several limitations that motivate future work.
Firstly, the system relies on high-quality external perception modules (SuperPoint for keypoints and PointRCNN for 3D detection). This dependency means the system’s overall performance is bounded by the accuracy and completeness of these detectors. Our learning pipeline currently mainly supervises the association, leaving the feature extraction and detection unsupervised.
Additionally, the system relies on frame-independent object detection and feature extraction, which can suffer from temporal inconsistency in bounding box parameters. The inherent redundancy of keypoints makes ego-pose estimation robust to this inconsistency, but it severely challenges the factor graph optimization: the constant-velocity constraints applied to dynamic objects struggle to reconcile detection noise with the smooth motion model, risking degraded object trajectory accuracy and potentially destabilizing the coupled ego-pose estimate.
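To make this tension concrete, a constant-velocity factor penalizes deviation from uniform motion across consecutive object states. The following is a simplified positional sketch under the assumption of uniform frame timing; the actual factor also involves orientation:

```python
import numpy as np

def constant_velocity_residual(p_prev, p_curr, p_next, w=1.0):
    """Residual of a constant-velocity factor on three consecutive
    object positions: with uniform timing, (p_next - p_curr) should
    equal (p_curr - p_prev). Noisy, frame-independent detections
    inflate this residual, pulling the smooth motion model away from
    the measurements during optimization."""
    return w * ((p_next - p_curr) - (p_curr - p_prev))
```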
Furthermore, the system’s reliance on consecutive frame matching and a constant motion model for initialization makes the method sensitive to large motion changes. As demonstrated in our ablation study (matching every 2 frames), extremely high-speed scenarios or sequences with significant rotational velocity changes between frames can degrade performance, as the geometric prediction becomes unreliable.
Finally, while the framework demonstrates strong RPE performance, it currently lacks long-term robustness mechanisms such as global mapping consistency and loop closure. Integrating GNN-based re-localization will be necessary to prevent long-term drift in the ego-pose and object trajectories.
Building upon these limitations, future efforts will concentrate on three main directions: (1) integrating the front-end keypoint extraction and temporally consistent object detection modules into the differentiable pipeline for a truly end-to-end trainable system; (2) exploring extensions to global consistency and long-term tracking, potentially by introducing map nodes into the GNN and designing a global loop-closure factor for the optimization backend; and (3) expanding the evaluation and robustness analysis by applying KOM-SLAM to other diverse and challenging datasets, such as nuScenes [47] or Waymo [48], and comparing with additional state-of-the-art visual odometry methods, such as DROID-SLAM [49] and DeepV2D [50].

Author Contributions

Conceptualization, J.L. and Y.T.; methodology, J.L.; software, J.L.; validation, J.L.; formal analysis, J.L.; investigation, J.L.; resources, S.K.; data curation, J.L.; writing—original draft preparation, J.L.; writing—review and editing, J.L., Y.T., Y.G. and S.K.; visualization, J.L.; supervision, Y.T., Y.G. and S.K.; project administration, S.K. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

The original data presented in the study are openly available at https://www.cvlibs.net/datasets/kitti/eval_tracking.php (accessed on 8 March 2025).

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Mur-Artal, R.; Tardós, J.D. ORB-SLAM2: An open-source SLAM system for monocular, stereo, and RGB-D cameras. IEEE Trans. Robot. 2017, 33, 1255–1262. [Google Scholar] [CrossRef]
  2. Shan, T.; Englot, B.; Meyers, D.; Wang, W.; Ratti, C.; Rus, D. LIO-SAM: Tightly-Coupled Lidar Inertial Odometry via Smoothing and Mapping. In Proceedings of the 2020 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Las Vegas, NV, USA, 25–29 October 2020; pp. 5135–5142. [Google Scholar]
  3. Weng, X.; Wang, Y.; Man, Y.; Kitani, K.M. GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking with 2D-3D Multi-Feature Learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 6499–6508. [Google Scholar]
  4. Nagy, M.; Werghi, N.; Hassan, B.; Dias, J.; Khonji, M. RobMOT: 3D Multi-Object Tracking Enhancement Through Observational Noise and State Estimation Drift Mitigation in LiDAR Point Clouds. IEEE Trans. Intell. Transp. Syst. 2025, 26, 16047–16059. [Google Scholar] [CrossRef]
  5. Bescos, B.; Campos, C.; Tardós, J.D.; Neira, J. DynaSLAM II: Tightly-Coupled Multi-Object Tracking and SLAM. IEEE Robot. Autom. Lett. 2021, 6, 5191–5198. [Google Scholar] [CrossRef]
  6. Tian, R.; Zhang, Y.; Yang, L.; Zhang, J.; Coleman, S.; Kerr, D. DynaQuadric: Dynamic Quadric SLAM for Quadric Initialization, Mapping, and Tracking. IEEE Trans. Intell. Transp. Syst. 2024, 25, 17234–17246. [Google Scholar] [CrossRef]
  7. Shen, Y.; Li, H.; Yi, S.; Chen, D.; Wang, X. Person Re-Identification with Deep Similarity-Guided Graph Neural Network. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 486–504. [Google Scholar]
  8. Li, Y.; Gu, C.; Dullien, T.; Vinyals, O.; Kohli, P. Graph Matching Networks for Learning the Similarity of Graph Structured Objects. In Proceedings of the International Conference on Machine Learning (ICML), Long Beach, CA, USA, 9–15 June 2019; pp. 3835–3845. [Google Scholar]
  9. Sarlin, P.-E.; DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperGlue: Learning Feature Matching with Graph Neural Networks. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 4938–4947. [Google Scholar]
  10. Cetintas, O.; Brasó, G.; Leal-Taixé, L. Unifying Short and Long-Term Tracking with Graph Hierarchies. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 18–22 June 2023; pp. 22877–22887. [Google Scholar]
  11. DeTone, D.; Malisiewicz, T.; Rabinovich, A. SuperPoint: Self-Supervised Interest Point Detection and Description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 224–236. [Google Scholar]
  12. Zhang, J.; Singh, S. LOAM: Lidar Odometry and Mapping in Real-Time. In Proceedings of the Robotics: Science and Systems, Berkeley, CA, USA, 12–16 July 2014; pp. 1–9. [Google Scholar]
  13. Kannapiran, S.; Bendapudi, N.; Yu, M.-Y.; Parikh, D.; Berman, S.; Vora, A.; Pandey, G. Stereo Visual Odometry with Deep Learning-Based Point and Line Feature Matching Using an Attention Graph Neural Network. In Proceedings of the 2023 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Detroit, MI, USA, 1–5 October 2023; pp. 3491–3498. [Google Scholar]
  14. Cui, J.; Chen, J.; Li, L. SAGE-ICP: Semantic Information-Assisted ICP. In Proceedings of the 2024 IEEE International Conference on Robotics and Automation (ICRA), Yokohama, Japan, 13–17 May 2024; pp. 8537–8543. [Google Scholar]
  15. Koledić, K.; Cvišić, I.; Marković, I.; Petrović, I. MOFT: Monocular Odometry Based on Deep Depth and Careful Feature Selection and Tracking. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 6175–6181. [Google Scholar]
  16. Zunair, H.; Khan, S.; Hamza, A.B. RSUD20K: A dataset for road scene understanding in autonomous driving. In Proceedings of the 2024 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 27–30 October 2024; pp. 708–714. [Google Scholar]
  17. Runz, M.; Buffier, M.; Agapito, L. MaskFusion: Real-Time Recognition, Tracking and Reconstruction of Multiple Moving Objects. In Proceedings of the 2018 IEEE International Symposium on Mixed and Augmented Reality (ISMAR), Munich, Germany, 16–20 October 2018; pp. 10–20. [Google Scholar]
  18. Ballester, I.; Fontán, A.; Civera, J.; Strobl, K.H.; Triebel, R. DOT: Dynamic Object Tracking for Visual SLAM. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 11705–11711. [Google Scholar]
  19. Huang, J.; Yang, S.; Zhao, Z.; Lai, Y.-K.; Hu, S.-M. ClusterSLAM: A SLAM Backend for Simultaneous Rigid Body Clustering and Motion Estimation. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Seoul, Republic of Korea, 27 October–2 November 2019; pp. 5875–5884. [Google Scholar]
  20. Huang, J.; Yang, S.; Mu, T.-J.; Hu, S.-M. ClusterVO: Clustering Moving Instances and Estimating Visual Odometry for Self and Surroundings. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 2168–2177. [Google Scholar]
  21. Moosmann, F.; Stiller, C. Joint Self-Localization and Tracking of Generic Objects in 3D Range Data. In Proceedings of the 2013 IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, 6–10 May 2013; pp. 1146–1152. [Google Scholar]
  22. Shi, J.; Wang, W.; Qi, M.; Li, X.; Yan, Y. DYNAM-LVIO: A Dynamic-Object-Aware Lidar Visual Inertial Odometry in Dynamic Urban Environments. IEEE Trans. Instrum. Meas. 2024, 73, 1–19. [Google Scholar] [CrossRef]
  23. Yang, S.; Scherer, S. CubeSLAM: Monocular 3-D Object SLAM. IEEE Trans. Robot. 2019, 35, 925–938. [Google Scholar] [CrossRef]
  24. Gonzalez, M.; Marchand, E.; Kacete, A.; Royan, J. TwistSLAM: Constrained SLAM in Dynamic Environment. IEEE Robot. Autom. Lett. 2022, 7, 6846–6853. [Google Scholar] [CrossRef]
  25. Qiu, Y.; Wang, C.; Wang, W.; Henein, M.; Scherer, S. AIRDOS: Dynamic SLAM Benefits from Articulated Objects. In Proceedings of the 2022 IEEE International Conference on Robotics and Automation (ICRA), Philadelphia, PA, USA, 23–27 May 2022; pp. 8047–8053. [Google Scholar]
  26. Tian, X.; Zhu, Z.; Zhao, J.; Tian, G.; Ye, C. DL-SLOT: Tightly-Coupled Dynamic Lidar SLAM and 3D Object Tracking Based on Collaborative Graph Optimization. IEEE Trans. Intell. Veh. 2023, 9, 1017–1027. [Google Scholar] [CrossRef]
  27. Zhu, Z.; Zhao, J.; Huang, K.; Tian, X.; Lin, J.; Ye, C. LIMOT: A Tightly-Coupled System for Lidar-Inertial Odometry and Multi-Object Tracking. IEEE Robot. Autom. Lett. 2024, 9, 6600–6607. [Google Scholar] [CrossRef]
  28. Li, X.; Yan, Z.; Feng, S.; Xia, C.; Li, S.; Zhou, Y. LIO-LOT: Tightly-Coupled Multi-Object Tracking and Lidar-Inertial Odometry. IEEE Trans. Intell. Transp. Syst. 2024, 26, 742–756. [Google Scholar] [CrossRef]
  29. Ying, Z.; Li, H. IMM-SLAMMOT: Tightly-Coupled SLAM and IMM-Based Multi-Object Tracking. IEEE Trans. Intell. Veh. 2023, 9, 3964–3974. [Google Scholar] [CrossRef]
  30. Liu, Y.; Liu, J.; Hao, Y.; Deng, B.; Meng, Z. A Switching-Coupled Backend for Simultaneous Localization and Dynamic Object Tracking. IEEE Robot. Autom. Lett. 2021, 6, 1296–1303. [Google Scholar] [CrossRef]
  31. Lin, Y.-K.; Lin, W.-C.; Wang, C.-C. Asynchronous State Estimation of Simultaneous Ego-Motion Estimation and Multiple Object Tracking for Lidar-Inertial Odometry. In Proceedings of the 2023 IEEE International Conference on Robotics and Automation (ICRA), London, UK, 29 May–2 June 2023; pp. 10616–10622. [Google Scholar]
  32. Wang, Y.; Kitani, K.; Weng, X. Joint Object Detection and Multi-Object Tracking with Graph Neural Networks. In Proceedings of the 2021 IEEE International Conference on Robotics and Automation (ICRA), Xi’an, China, 30 May–5 June 2021; pp. 13708–13715. [Google Scholar]
  33. Weng, X.; Yuan, Y.; Kitani, K. PTP: Parallelized Tracking and Prediction with Graph Neural Networks and Diversity Sampling. IEEE Robot. Autom. Lett. 2021, 6, 4640–4647. [Google Scholar] [CrossRef]
  34. Brasó, G.; Leal-Taixé, L. Learning a Neural Solver for Multiple Object Tracking. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 6247–6257. [Google Scholar]
  35. Bilgi, H.Ç.; Alatan, A.A. Bi-Directional Tracklet Embedding for Multi-Object Tracking. In Proceedings of the 2024 IEEE International Conference on Image Processing (ICIP), Abu Dhabi, United Arab Emirates, 27–30 October 2024; pp. 4035–4041. [Google Scholar]
  36. Chen, H.; Li, N.; Li, D.; Lv, J.; Zhao, W.; Zhang, R.; Xu, J. Multiple Object Tracking in Satellite Video with Graph-Based Multi-Clue Fusion Tracker. IEEE Trans. Geosci. Remote Sens. 2024, 62, 5639914. [Google Scholar]
  37. Gao, Y.; Xu, H.; Li, J.; Wang, N.; Gao, X. Multi-Scene Generalized Trajectory Global Graph Solver with Composite Nodes for Multiple Object Tracking. In Proceedings of the AAAI Conference on Artificial Intelligence, Vancouver, BC, Canada, 20–27 February 2024; pp. 1842–1850. [Google Scholar]
  38. Lindenberger, P.; Sarlin, P.-E.; Pollefeys, M. LightGlue: Local Feature Matching at Light Speed. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Paris, France, 4–6 October 2023; pp. 17627–17638. [Google Scholar]
  39. Chen, H.; Luo, Z.; Zhang, J.; Zhou, L.; Bai, X.; Hu, Z.; Tai, C.-L.; Quan, L. Learning to Match Features with Seeded Graph Matching Network. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), Montreal, QC, Canada, 11–17 October 2021; pp. 6301–6310. [Google Scholar]
  40. Shi, Y.; Cai, J.-X.; Shavit, Y.; Mu, T.-J.; Feng, W.; Zhang, K. ClusterGNN: Cluster-Based Coarse-to-Fine Graph Neural Network for Efficient Feature Matching. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, LA, USA, 18–24 June 2022; pp. 12517–12526. [Google Scholar]
  41. Sun, J.; Shen, Z.; Wang, Y.; Bao, H.; Zhou, X. LoFTR: Detector-Free Local Feature Matching with Transformers. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Nashville, TN, USA, 20–25 June 2021; pp. 8922–8931. [Google Scholar]
  42. Shi, S.; Wang, X.; Li, H. PointRCNN: 3D Object Proposal Generation and Detection from Point Cloud. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Long Beach, CA, USA, 15–20 June 2019; pp. 770–779. [Google Scholar]
  43. Moré, J.J. The Levenberg-Marquardt Algorithm: Implementation and Theory. In Numerical Analysis: Proceedings of the Biennial Conference Held at Dundee, Dundee, UK, 28 June–1 July 1977; Springer: Berlin/Heidelberg, Germany, 2006; pp. 105–116. [Google Scholar]
  44. Pineda, L.; Fan, T.; Monge, M.; Venkataraman, S.; Sodhi, P.; Chen, R.T.; Ortiz, J.; DeTone, D.; Wang, A.; Anderson, S.; et al. Theseus: A Library for Differentiable Nonlinear Optimization. Adv. Neural Inf. Process. Syst. 2022, 35, 3801–3818. [Google Scholar]
  45. Geiger, A.; Lenz, P.; Stiller, C.; Urtasun, R. Vision Meets Robotics: The KITTI Dataset. Int. J. Robot. Res. 2013, 32, 1231–1237. [Google Scholar] [CrossRef]
  46. Bescos, B.; Fácil, J.M.; Civera, J.; Neira, J. DynaSLAM: Tracking, Mapping, and Inpainting in Dynamic Scenes. IEEE Robot. Autom. Lett. 2018, 3, 4076–4083. [Google Scholar] [CrossRef]
  47. Caesar, H.; Bankiti, V.; Lang, A.H.; Vora, S.; Liong, V.E.; Xu, Q.; Krishnan, A.; Pan, Y.; Baldan, G.; Beijbom, O. nuScenes: A multimodal dataset for autonomous driving. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 11621–11631. [Google Scholar]
  48. Sun, P.; Kretzschmar, H.; Dotiwalla, X.; Chouard, A.; Patnaik, V.; Tsui, P.; Guo, J.; Zhou, Y.; Chai, Y.; Caine, B.; et al. Scalability in perception for autonomous driving: Waymo open dataset. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, WA, USA, 13–19 June 2020; pp. 2446–2454. [Google Scholar]
  49. Teed, Z.; Deng, J. DROID-SLAM: Deep visual SLAM for monocular, stereo, and RGB-D cameras. Adv. Neural Inf. Process. Syst. 2021, 34, 16558–16569. [Google Scholar]
  50. Teed, Z.; Deng, J. DeepV2D: Video to depth with differentiable structure from motion. arXiv 2018, arXiv:1812.04605. [Google Scholar]
Figure 1. Overall architecture of KOM-SLAM. The figure illustrates the tightly coupled Graph Neural Network (GNN) framework. It shows the flow from keypoints and object inputs through the GNN association module to the joint optimization backend for estimating ego-pose and object states.
Figure 2. (a) Overall architecture of the proposed GNN attention module. (b) Query, key, and value generation in the self-attention layer for intra-entity, intra-frame feature refinement. (c) Query, key, and value generation in the keypoint–object layer for cross-entity, intra-frame message passing between keypoints and objects. (d) Query, key, and value generation in the cross-attention layer for intra-entity, inter-frame feature exchange.
Figure 3. Keypoint and object correspondence example of our proposed framework on the KITTI Tracking dataset. Green dots and lines indicate matched keypoints and their correspondences; red rectangles and red lines represent detected objects and their associations.
Table 1. Ego-pose estimation results on the KITTI Tracking dataset (RPEt in m/frame, RPER in °/frame).

| seq | ORB-SLAM2 [1] RPEt | ORB-SLAM2 [1] RPER | DynaSLAM [46] RPEt | DynaSLAM [46] RPER | DynaSLAM2 [5] RPEt | DynaSLAM2 [5] RPER | KOM-SLAM RPEt | KOM-SLAM RPER |
|------|------|------|------|------|------|------|------|------|
| 0000 | 0.04 | 0.06 | 0.04 | 0.06 | 0.04 | 0.06 | 0.04 | 0.07 |
| 0001 | 0.05 | 0.04 | 0.05 | 0.04 | 0.05 | 0.04 | 0.04 | 0.03 |
| 0002 | 0.04 | 0.03 | 0.04 | 0.03 | 0.04 | 0.02 | 0.03 | 0.00 |
| 0003 | 0.07 | 0.04 | 0.07 | 0.04 | 0.06 | 0.04 | 0.05 | 0.02 |
| 0004 | 0.07 | 0.06 | 0.07 | 0.06 | 0.07 | 0.06 | 0.06 | 0.07 |
| 0005 | 0.06 | 0.03 | 0.06 | 0.03 | 0.06 | 0.03 | 0.05 | 0.01 |
| 0006 | 0.02 | 0.04 | 0.02 | 0.04 | 0.02 | 0.01 | 0.01 | 0.01 |
| 0007 | 0.05 | 0.07 | 0.05 | 0.07 | 0.05 | 0.07 | 0.04 | 0.03 |
| 0008 | 0.08 | 0.04 | 0.08 | 0.04 | 0.10 | 0.04 | 0.07 | 0.03 |
| 0009 | 0.06 | 0.05 | 0.06 | 0.05 | 0.06 | 0.06 | 0.04 | 0.02 |
| 0010 | 0.07 | 0.04 | 0.07 | 0.04 | 0.07 | 0.03 | 0.06 | 0.02 |
| 0011 | 0.04 | 0.03 | 0.04 | 0.03 | 0.04 | 0.03 | 0.03 | 0.04 |
| 0013 | 0.04 | 0.05 | 0.04 | 0.05 | 0.04 | 0.04 | 0.03 | 0.02 |
| 0014 | 0.03 | 0.08 | 0.03 | 0.08 | 0.03 | 0.08 | 0.03 | 0.05 |
| 0018 | 0.05 | 0.03 | 0.05 | 0.03 | 0.05 | 0.02 | 0.04 | 0.00 |
| 0019 | 0.05 | 0.03 | 0.05 | 0.03 | 0.05 | 0.02 | 0.03 | 0.02 |
| 0020 | 0.11 | 0.07 | 0.05 | 0.04 | 0.07 | 0.04 | 0.04 | 0.01 |
| mean | 0.055 | 0.046 | 0.051 | 0.045 | 0.053 | 0.041 | 0.041 | 0.028 |
| std | 0.021 | 0.016 | 0.015 | 0.015 | 0.018 | 0.019 | 0.014 | 0.020 |

Bold values indicate the best performance for each metric in the corresponding row.
Table 2. Object pose estimation comparison on the KITTI Tracking dataset.

| seq/obj.id/class | DynaSLAM2 [5] 2D TP | DynaSLAM2 [5] 2D MOTP | DynaSLAM2 [5] BV TP | DynaSLAM2 [5] BV MOTP | DynaSLAM2 [5] 3D TP | DynaSLAM2 [5] 3D MOTP | KOM-SLAM 2D TP | KOM-SLAM 2D MOTP | KOM-SLAM BV TP | KOM-SLAM BV MOTP | KOM-SLAM 3D TP | KOM-SLAM 3D MOTP |
|------|------|------|------|------|------|------|------|------|------|------|------|------|
| 03/1/car | 50.00 | 71.79 | 39.34 | 56.61 | 38.53 | 48.20 | 95.90 | 86.85 | 95.90 | 75.52 | 97.54 | 66.19 |
| 05/31/car | 28.96 | 60.30 | 14.48 | 46.84 | 11.45 | 34.20 | 82.09 | 83.63 | 81.42 | 76.97 | 83.78 | 66.15 |
| 10/0/car | 81.63 | 73.51 | 70.41 | 47.60 | 68.37 | 40.28 | 98.29 | 89.54 | 98.29 | 76.98 | 98.63 | 68.84 |
| 11/0/car | 72.65 | 74.78 | 61.66 | 50.74 | 52.58 | 47.35 | 97.85 | 87.71 | 97.85 | 84.42 | 98.12 | 74.04 |
| 11/35/car | 53.17 | 65.25 | 19.05 | 31.95 | 6.35 | 26.02 | 77.34 | 84.09 | 71.09 | 84.74 | 76.56 | 74.43 |
| 18/2/car | 86.36 | 74.81 | 67.05 | 45.47 | 62.12 | 34.80 | 93.18 | 89.84 | 93.18 | 80.92 | 93.18 | 74.46 |
| 18/3/car | 53.33 | 70.94 | 21.75 | 41.45 | 16.84 | 35.80 | 85.96 | 86.20 | 85.61 | 77.06 | 85.96 | 64.50 |
| 19/63/car | 35.26 | 63.50 | 29.48 | 45.69 | 26.48 | 33.89 | 54.91 | 90.89 | 53.76 | 84.65 | 55.49 | 77.93 |
| 19/72/car | 29.11 | 62.59 | 29.43 | 55.48 | 29.43 | 39.81 | 12.34 | 75.67 | 12.03 | 73.30 | 12.03 | 61.74 |
| 20/0/car | 63.68 | 78.54 | 43.78 | 45.00 | 31.84 | 46.15 | 85.00 | 90.33 | 85.50 | 74.98 | 86.00 | 65.06 |
| 20/12/car | 42.77 | 76.77 | 37.64 | 49.29 | 36.23 | 40.81 | 91.91 | 88.24 | 90.05 | 73.67 | 91.76 | 62.03 |
| 20/122/car | 34.90 | 78.76 | 34.51 | 48.05 | 29.02 | 44.43 | 90.98 | 83.21 | 77.25 | 72.15 | 91.37 | 60.27 |
| mean | 52.65 | 70.96 | 39.05 | 47.01 | 34.10 | 39.31 | 80.48 | 86.35 | 78.94 | 77.95 | 80.87 | 67.97 |
| std | 19.01 | 6.20 | 17.82 | 6.13 | 28.30 | 6.36 | 23.49 | 4.09 | 23.47 | 4.40 | 23.69 | 5.63 |

Bold values indicate the best performance for each metric in the corresponding row.
Table 3. Ablation study on ego pose estimation (RPEt in m/frame, RPER in °/frame).

| seq | No Keypoint–Object Layer RPEt | No Keypoint–Object Layer RPER | No GNN RPEt | No GNN RPER | No Gating Layer RPEt | No Gating Layer RPER | No Soft Assignment RPEt | No Soft Assignment RPER | KOM-SLAM RPEt | KOM-SLAM RPER |
|------|------|------|------|------|------|------|------|------|------|------|
| 0000 | 0.04 | 0.07 | 0.04 | 0.10 | 0.04 | 0.08 | 0.04 | 0.09 | 0.04 | 0.07 |
| 0001 | 0.04 | 0.15 | 0.04 | 0.01 | 0.04 | 0.09 | 0.04 | 0.02 | 0.04 | 0.03 |
| 0002 | 0.03 | 0.00 | 0.03 | 0.06 | 0.03 | 0.03 | 0.03 | 0.01 | 0.03 | 0.00 |
| 0003 | 0.06 | 0.02 | 0.06 | 0.04 | 0.06 | 0.06 | 0.06 | 0.03 | 0.05 | 0.02 |
| 0004 | 0.06 | 0.08 | 0.06 | 0.07 | 0.07 | 0.07 | 0.07 | 0.10 | 0.06 | 0.07 |
| 0005 | 0.05 | 0.10 | 0.05 | 0.01 | 0.05 | 0.03 | 0.05 | 0.14 | 0.05 | 0.01 |
| 0006 | 0.01 | 0.06 | 0.01 | 0.06 | 0.01 | 0.01 | 0.02 | 0.11 | 0.01 | 0.01 |
| 0007 | 0.04 | 0.10 | 0.04 | 0.18 | 0.04 | 0.29 | 0.05 | 0.09 | 0.04 | 0.03 |
| 0008 | 0.07 | 0.03 | 0.07 | 0.07 | 0.07 | 0.05 | 0.07 | 0.04 | 0.07 | 0.03 |
| 0009 | 0.04 | 0.03 | 0.04 | 0.05 | 0.04 | 0.03 | 0.04 | 0.05 | 0.04 | 0.02 |
| 0010 | 0.06 | 0.15 | 0.06 | 0.04 | 0.06 | 0.03 | 0.06 | 0.05 | 0.06 | 0.02 |
| 0011 | 0.03 | 0.15 | 0.03 | 0.12 | 0.03 | 0.03 | 0.04 | 0.20 | 0.03 | 0.04 |
| 0013 | 0.03 | 0.08 | 0.03 | 0.05 | 0.03 | 0.04 | 0.03 | 0.06 | 0.03 | 0.02 |
| 0014 | 0.03 | 0.09 | 0.03 | 0.09 | 0.03 | 0.07 | 0.03 | 0.10 | 0.03 | 0.05 |
| 0018 | 0.04 | 0.11 | 0.04 | 0.03 | 0.04 | 0.03 | 0.04 | 0.03 | 0.04 | 0.00 |
| 0019 | 0.03 | 0.17 | 0.03 | 0.11 | 0.03 | 0.11 | 0.03 | 0.05 | 0.03 | 0.02 |
| 0020 | 0.04 | 0.03 | 0.04 | 0.00 | 0.04 | 0.02 | 0.04 | 0.01 | 0.04 | 0.01 |
| mean | 0.041 | 0.084 | 0.041 | 0.064 | 0.042 | 0.062 | 0.043 | 0.069 | 0.041 | 0.028 |
| std | 0.015 | 0.050 | 0.015 | 0.044 | 0.015 | 0.062 | 0.014 | 0.049 | 0.014 | 0.020 |

Bold values indicate the best performance for each metric in the corresponding row.
Table 4. Ablation study on object tracking.

| seq | Naive Approach MOTA | Naive Approach IDF1 | No GNN MOTA | No GNN IDF1 | No Keypoint–Object Layer MOTA | No Keypoint–Object Layer IDF1 | KOM-SLAM MOTA | KOM-SLAM IDF1 |
|------|------|------|------|------|------|------|------|------|
| 0000 | 1.00 | 0.97 | 0.99 | 0.98 | 1.00 | 1.00 | 1.00 | 1.00 |
| 0001 | 0.94 | 0.91 | 0.96 | 0.91 | 0.99 | 0.94 | 0.99 | 0.96 |
| 0002 | 0.91 | 0.90 | 0.92 | 0.68 | 0.98 | 0.88 | 0.99 | 0.91 |
| 0003 | 0.59 | 0.59 | 0.97 | 0.89 | 0.97 | 0.95 | 1.00 | 1.00 |
| 0004 | 0.70 | 0.70 | 0.99 | 0.97 | 0.98 | 0.98 | 1.00 | 0.99 |
| 0005 | 0.77 | 0.74 | 0.97 | 0.92 | 0.97 | 0.95 | 1.00 | 0.99 |
| 0006 | 0.25 | 0.25 | 0.99 | 0.95 | 0.99 | 0.85 | 0.99 | 0.87 |
| 0007 | 0.95 | 0.95 | 0.99 | 0.96 | 1.00 | 0.99 | 1.00 | 1.00 |
| 0008 | 0.32 | 0.32 | 0.97 | 0.80 | 0.85 | 0.71 | 1.00 | 0.92 |
| 0009 | 0.83 | 0.80 | 0.97 | 0.87 | 0.98 | 0.93 | 0.99 | 0.94 |
| 0010 | 0.72 | 0.70 | 0.94 | 0.88 | 0.96 | 0.96 | 1.00 | 1.00 |
| 0011 | 0.87 | 0.87 | 0.96 | 0.83 | 0.98 | 0.93 | 0.99 | 0.95 |
| 0013 | 1.00 | 0.99 | 0.95 | 0.85 | 0.91 | 0.84 | 0.95 | 0.92 |
| 0014 | 0.89 | 0.88 | 0.89 | 0.82 | 0.90 | 0.85 | 0.92 | 0.88 |
| 0018 | 0.73 | 0.69 | 0.99 | 0.81 | 0.96 | 0.87 | 0.99 | 0.88 |
| 0019 | 0.99 | 0.99 | 0.95 | 0.75 | 0.96 | 0.75 | 0.97 | 0.82 |
| 0020 | 0.92 | 0.92 | 0.98 | 0.92 | 0.99 | 0.91 | 0.99 | 0.97 |
| mean | 0.79 | 0.77 | 0.96 | 0.88 | 0.96 | 0.90 | 0.99 | 0.94 |
| std | 0.22 | 0.21 | 0.027 | 0.077 | 0.039 | 0.079 | 0.021 | 0.054 |

Bold values indicate the best performance for each metric in the corresponding row.
Table 5. Ablation study on data density and time gap (RPEt in m/frame, RPER in °/frame).

| seq | Sparser Keypoints RPEt | Sparser Keypoints RPER | Every 2 Frames RPEt | Every 2 Frames RPER | KOM-SLAM RPEt | KOM-SLAM RPER |
|------|------|------|------|------|------|------|
| 0000 | 0.04 | 0.08 | 0.07 | 0.28 | 0.04 | 0.07 |
| 0001 | 0.04 | 0.02 | 0.08 | 0.14 | 0.04 | 0.03 |
| 0002 | 0.03 | 0.06 | 0.04 | 0.05 | 0.03 | 0.00 |
| 0003 | 0.06 | 0.13 | 0.09 | 0.12 | 0.05 | 0.02 |
| 0004 | 0.07 | 0.08 | 0.12 | 0.31 | 0.06 | 0.07 |
| 0005 | 0.05 | 0.01 | 0.06 | 0.01 | 0.05 | 0.01 |
| 0006 | 0.02 | 0.01 | 0.04 | 0.10 | 0.01 | 0.01 |
| 0007 | 0.05 | 0.08 | 0.14 | 0.33 | 0.04 | 0.03 |
| 0008 | 0.07 | 0.05 | 0.09 | 0.13 | 0.07 | 0.03 |
| 0009 | 0.05 | 0.05 | 0.10 | 0.25 | 0.04 | 0.02 |
| 0010 | 0.06 | 0.09 | 0.08 | 0.09 | 0.06 | 0.02 |
| 0011 | 0.03 | 0.01 | 0.05 | 0.13 | 0.03 | 0.04 |
| 0013 | 0.03 | 0.05 | 0.05 | 0.19 | 0.03 | 0.02 |
| 0014 | 0.03 | 0.09 | 0.06 | 0.14 | 0.03 | 0.05 |
| 0018 | 0.04 | 0.00 | 0.08 | 0.04 | 0.04 | 0.00 |
| 0019 | 0.03 | 0.07 | 0.06 | 0.07 | 0.03 | 0.02 |
| 0020 | 0.04 | 0.07 | 0.09 | 0.06 | 0.04 | 0.01 |
| mean | 0.043 | 0.055 | 0.076 | 0.14 | 0.041 | 0.028 |
| std | 0.015 | 0.034 | 0.026 | 0.094 | 0.014 | 0.020 |

Bold values indicate the best performance for each metric in the corresponding row.
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Liu, J.; Tian, Y.; Gu, Y.; Kamijo, S. KOM-SLAM: A GNN-Based Tightly Coupled SLAM and Multi-Object Tracking Framework. Sensors 2026, 26, 128. https://doi.org/10.3390/s26010128

