One of the biggest challenges in training deep neural networks is the need for massive data annotation. Training a neural network for object detection requires millions of annotated images. However, no large-scale thermal image datasets currently exist that could be used to train state-of-the-art neural networks, whereas voluminous RGB image datasets are readily available. This paper presents a method that makes it possible to create hundreds of thousands of annotated thermal images using an object detector pre-trained on RGB images. A dataset created in this way can be used to train object detectors with improved performance. The main contribution of this work is a novel method for fully automatic thermal image labeling. The proposed system uses an RGB camera, a thermal camera, a 3D LiDAR, and a pre-trained neural network that detects objects in the RGB domain. With this setup, a fully automated process annotates the thermal images and produces an annotated thermal training dataset. As a result, we created a dataset containing hundreds of thousands of annotated objects. This approach makes it possible to train deep learning models whose performance is similar to that of models trained on human-annotated data. This paper also proposes several improvements that fine-tune the results with minimal human intervention. Finally, the evaluation of the proposed solution shows that the method gives significantly better results than training the neural network on standard small-scale hand-annotated thermal image datasets.
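The abstract names the sensors but not the exact mechanism by which a detection in the RGB image becomes a label in the thermal image. The following minimal sketch shows one plausible way such a transfer could work, under stated assumptions: the LiDAR points that project inside an RGB bounding box are re-projected into the thermal image, and their extent there becomes the thermal bounding box. The function names (`project`, `transfer_box`), the pinhole camera model, the calibration matrices, and the `min_points` threshold are all hypothetical and are not taken from the paper.

```python
import numpy as np

def project(points_xyz, K, T):
    """Project Nx3 LiDAR points into an image.

    K is an assumed 3x3 pinhole intrinsic matrix, T an assumed 4x4
    LiDAR-to-camera transform. Returns Nx2 pixel coordinates and a
    boolean mask of points lying in front of the camera.
    """
    homo = np.hstack([points_xyz, np.ones((len(points_xyz), 1))])
    cam = (T @ homo.T).T[:, :3]
    in_front = cam[:, 2] > 0
    # Replace non-positive depths with 1.0 so the division is safe;
    # those points are excluded via the mask anyway.
    z = np.where(in_front, cam[:, 2], 1.0)
    uv = (K @ cam.T).T[:, :2] / z[:, None]
    return uv, in_front

def transfer_box(box_rgb, points_xyz, K_rgb, T_rgb, K_th, T_th, min_points=5):
    """Map an RGB box (x1, y1, x2, y2) to a thermal box via LiDAR points.

    Returns None when too few LiDAR points support the detection
    (hypothetical rejection rule).
    """
    x1, y1, x2, y2 = box_rgb
    uv_rgb, front_rgb = project(points_xyz, K_rgb, T_rgb)
    inside = (
        front_rgb
        & (uv_rgb[:, 0] >= x1) & (uv_rgb[:, 0] <= x2)
        & (uv_rgb[:, 1] >= y1) & (uv_rgb[:, 1] <= y2)
    )
    uv_th, front_th = project(points_xyz, K_th, T_th)
    keep = inside & front_th
    if keep.sum() < min_points:
        return None
    sel = uv_th[keep]
    # The thermal box is the tight extent of the re-projected points.
    return (float(sel[:, 0].min()), float(sel[:, 1].min()),
            float(sel[:, 0].max()), float(sel[:, 1].max()))
```

In this sketch the LiDAR supplies the depth needed to account for the parallax between the two cameras; a real pipeline would also have to handle background points that fall inside the RGB box, time synchronization, and lens distortion, none of which is modeled here.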
This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.