Traditional Person Re-identification (ReID) methods mainly focus on cross-camera scenarios, while identifying a person in the same video/camera from adjacent subsequent frames is also an important question, for example, in human tracking and pose tracking. We try to address this unexplored in-video ReID problem with a new large-scale video-based ReID dataset called PoseTrack-ReID with full images available and a new network structure called ReID-Head, which can extract multi-person features efficiently in real time and can be integrated with both one-stage and two-stage human or pose detectors. A new loss function is also required to solve this new in-video problem. Hence, a triplet-based loss function with an online hard example mining designed to distinguish persons in the same video/group is proposed, called instance hard triplet loss, which can be applied in both cross-camera ReID and in-video ReID. Compared with the widely-used batch hard triplet loss, our proposed loss achieves competitive performance and saves more than 30% of the training time. We also propose an automatic reciprocal identity association method, so we can train our model in an unsupervised way, which further extends the potential applications of in-video ReID. The PoseTrack-ReID dataset and code will be publicly released.
This is an open access article distributed under the Creative Commons Attribution License
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited