Face with Mask Detection in Thermal Images Using Deep Neural Networks
Abstract
1. Introduction
1.1. Literature Review
1.2. Contribution
2. Materials and Methods
2.1. Description of Cameras
2.2. Image Acquisition Methods
2.3. Dataset
2.4. Annotations of Images
2.5. Data Preprocessing
2.6. Adaptation of Deep Learning Models
2.7. Models Testing Scenario
2.8. Metrics
3. Results
4. Discussion
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Acknowledgments
Conflicts of Interest
References
- Kwaśniewska, A.; Rumiński, J.; Rad, P. Deep features class activation map for thermal face detection and tracking. In Proceedings of the 2017 10th International Conference on Human System Interactions (HSI), Ulsan, Korea, 17–19 July 2017; pp. 41–47.
- Wu, Z.; Peng, M.; Chen, T. Thermal face recognition using convolutional neural network. In Proceedings of the 2016 International Conference on Optoelectronics and Image Processing (ICOIP), Warsaw, Poland, 10–12 June 2016; pp. 6–9.
- Ruminski, J.; Nowakowski, A.; Kaczmarek, M.; Hryciuk, M. Model-based parametric images in dynamic thermography. Pol. J. Med. Phys. Eng. 2000, 6, 159–164.
- Sonkusare, S.; Ahmedt-Aristizabal, D.; Aburn, M.J.; Nguyen, V.T.; Pang, T.; Frydman, S.; Denman, S.; Fookes, C.; Breakspear, M.; Guo, C.C. Detecting changes in facial temperature induced by a sudden auditory stimulus based on deep learning-assisted face tracking. Sci. Rep. 2019, 9, 1–11.
- Ruminski, J.; Kwasniewska, A. Evaluation of respiration rate using thermal imaging in mobile conditions. In Application of Infrared to Biomedical Sciences; Springer Nature: Singapore, 2017; pp. 311–346.
- Rumiński, J. Analysis of the parameters of respiration patterns extracted from thermal image sequences. Biocybern. Biomed. Eng. 2016, 36, 731–741.
- Kwasniewska, A.; Ruminski, J.; Szankin, M.; Kaczmarek, M. Super-resolved thermal imagery for high-accuracy facial areas detection and analysis. Eng. Appl. Artif. Intell. 2020, 87, 103263.
- Kwasniewska, A.; Szankin, M.; Ruminski, J.; Sarah, A.; Gamba, D. Improving Accuracy of Respiratory Rate Estimation by Restoring High Resolution Features with Transformers and Recursive Convolutional Models. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 19–25 June 2021; pp. 3857–3867.
- Reese, K.; Zheng, Y.; Elmaghraby, A. A comparison of face detection algorithms in visible and thermal spectrums. In Proceedings of the International Conference on Advances in Computer Science and Application, Amsterdam, The Netherlands, 7–8 June 2012.
- Friedrich, G.; Yeshurun, Y. Seeing people in the dark: Face recognition in infrared images. In International Workshop on Biologically Motivated Computer Vision; Springer: Cham, Switzerland, 2002; pp. 348–359.
- Viola, P.; Jones, M.J. Robust real-time face detection. Int. J. Comput. Vis. 2004, 57, 137–154.
- Zhang, S.; Zhu, X.; Lei, Z.; Shi, H.; Wang, X.; Li, S.Z. FaceBoxes: A CPU real-time face detector with high accuracy. In Proceedings of the 2017 IEEE International Joint Conference on Biometrics (IJCB), Denver, CO, USA, 1–4 October 2017; pp. 1–9.
- Yang, W.; Jiachun, Z. Real-time face detection based on YOLO. In Proceedings of the 2018 1st IEEE International Conference on Knowledge Innovation and Invention (ICKII), Jeju, Korea, 23–27 July 2018; pp. 221–224.
- Yang, S.; Luo, P.; Loy, C.C.; Tang, X. WIDER FACE: A face detection benchmark. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 5525–5533.
- Pang, L.; Ming, Y.; Chao, L. F-DR Net: Face detection and recognition in One Net. In Proceedings of the 2018 14th IEEE International Conference on Signal Processing (ICSP), Beijing, China, 12–16 August 2018; pp. 332–337.
- Hussain, S.; Yu, Y.; Ayoub, M.; Khan, A.; Rehman, R.; Wahid, J.A.; Hou, W. IoT and Deep Learning Based Approach for Rapid Screening and Face Mask Detection for Infection Spread Control of COVID-19. Appl. Sci. 2021, 11, 3495.
- Kopaczka, M.; Nestler, J.; Merhof, D. Face Detection in Thermal Infrared Images: A Comparison of Algorithm- and Machine-Learning-Based Approaches. In Advanced Concepts for Intelligent Vision Systems; Springer: Cham, Switzerland, 2017; pp. 518–529.
- Dalal, N.; Triggs, B. Histograms of Oriented Gradients for Human Detection. Comput. Vis. Pattern Recognit. 2005, 1, 886–893.
- Redmon, J.; Farhadi, A. Yolov3: An incremental improvement. arXiv 2018, arXiv:1804.02767.
- Silva, G.; Monteiro, R.; Ferreira, A.; Carvalho, P.; Corte-Real, L. Face Detection in Thermal Images with YOLOv3. In International Symposium on Visual Computing; Springer: Cham, Switzerland, 2019; pp. 89–99.
- Szegedy, C.; Vanhoucke, V.; Ioffe, S.; Shlens, J.; Wojna, Z. Rethinking the inception architecture for computer vision. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 27–30 June 2016; pp. 2818–2826.
- Peng, M.; Wang, C.; Chen, T.; Liu, G. NIRFaceNet: A Convolutional Neural Network for Near-Infrared Face Identification. Information 2016, 7, 61.
- Szegedy, C.; Liu, W.; Jia, Y.; Sermanet, P.; Reed, S.; Anguelov, D.; Erhan, D.; Vanhoucke, V.; Rabinovich, A. Going deeper with convolutions. In Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Boston, MA, USA, 7–12 June 2015; pp. 1–9.
- Krizhevsky, A.; Sutskever, I.; Hinton, G.E. ImageNet Classification with Deep Convolutional Neural Networks. In Advances in Neural Information Processing Systems; Pereira, F., Burges, C.J.C., Bottou, L., Weinberger, K.Q., Eds.; Curran Associates, Inc.: New York, NY, USA, 2012; Volume 25, pp. 1097–1105.
- Li, S.Z.; Chu, R.; Liao, S.; Zhang, L. Illumination Invariant Face Recognition Using Near-Infrared Images. IEEE Trans. Pattern Anal. Mach. Intell. 2007, 29, 627–639.
- Sayed, M.; Baker, F. Thermal Face Authentication with Convolutional Neural Network. J. Comput. Sci. 2018, 14, 1627–1637.
- Nikisins, O.; Nasrollahi, K.; Greitans, M.; Moeslund, T.B. RGB-D-T Based Face Recognition. In Proceedings of the 2014 22nd International Conference on Pattern Recognition, Stockholm, Sweden, 24–28 August 2014; pp. 1716–1721.
- Su, J.W.; Chu, H.K.; Huang, J.B. Instance-aware image colorization. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 13–19 June 2020; pp. 7968–7977.
- Ultra-Lightweight Face Detection Model. Available online: https://github.com/Linzaer/Ultra-Light-Fast-Generic-Face-Detector-1MB (accessed on 5 January 2021).
- Deng, J.; Guo, J.; Yuxiang, Z.; Yu, J.; Kotsia, I.; Zafeiriou, S. RetinaFace: Single-stage Dense Face Localisation in the Wild. arXiv 2019, arXiv:1905.00641.
- RetinaFace in PyTorch. Available online: https://github.com/biubug6/Pytorch_Retinaface (accessed on 10 January 2021).
- He, Y.; Xu, D.; Wu, L.; Jian, M.; Xiang, S.; Pan, C. LFFD: A Light and Fast Face Detector for Edge Devices. arXiv 2019, arXiv:1904.10633.
- Schütze, H.; Manning, C.D.; Raghavan, P. Introduction to Information Retrieval; Cambridge University Press: Cambridge, UK, 2008; Volume 39.
- Powers, D.M. Evaluation: From precision, recall and F-measure to ROC, informedness, markedness and correlation. arXiv 2020, arXiv:2010.16061.
- Li, D.; Zhu, X.; Chen, X.; Tian, D.; Hu, X.; Qin, G. Thermal Imaging Face Detection Based on Transfer Learning. In Proceedings of the 2021 6th International Conference on Smart Grid and Electrical Automation (ICSGEA), Kunming, China, 29–30 May 2021; pp. 263–266.
- Mallat, K.; Dugelay, J.L. A benchmark database of visible and thermal paired face images across multiple variations. In Proceedings of the 2018 International Conference of the Biometrics Special Interest Group (BIOSIG), Darmstadt, Germany, 26–28 September 2018; pp. 1–5.
| Model | Manufacturer | Spatial Resolution | Dynamic Range | Frame Rate |
|---|---|---|---|---|
| A320G | FLIR Systems | 320 × 240 | 16 bit | 60 fps |
| A655SC | FLIR Systems | 640 × 480 | 16 bit | 50 fps |
| SC3000 | FLIR Systems | 320 × 240 | 14 bit | 50 fps |
| Boson | FLIR Systems | 640 × 512 | 14 bit | 8 fps |
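The cameras listed above output 14- or 16-bit frames, whereas the detectors evaluated below expect 8-bit images; the "with CLAHE" variants in the result tables refer to contrast-limited adaptive histogram equalization applied to such rescaled frames. A minimal preprocessing sketch using OpenCV, assuming a raw 16-bit frame and illustrative CLAHE parameters (the paper's exact normalization and settings are not reproduced here):

```python
import cv2
import numpy as np

def thermal_to_uint8(raw: np.ndarray) -> np.ndarray:
    """Linearly rescale a 14/16-bit thermal frame to the 0-255 range."""
    raw = raw.astype(np.float32)
    lo, hi = float(raw.min()), float(raw.max())
    return ((raw - lo) / max(hi - lo, 1e-6) * 255.0).astype(np.uint8)

def enhance_with_clahe(gray8: np.ndarray, clip_limit: float = 2.0, tile: int = 8) -> np.ndarray:
    """Contrast-limited adaptive histogram equalization (parameters chosen for illustration)."""
    clahe = cv2.createCLAHE(clipLimit=clip_limit, tileGridSize=(tile, tile))
    return clahe.apply(gray8)

# Example with a synthetic 16-bit frame at the A655SC resolution:
raw_frame = np.random.randint(0, 2**16, size=(480, 640), dtype=np.uint16)
enhanced = enhance_with_clahe(thermal_to_uint8(raw_frame))
```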
| Model Name | Base Model | Number of Epochs | Batch Size | Optimizer | Initial Learning Rate |
|---|---|---|---|---|---|
| UltraLight | version-slim | 200 | 24 | SGD | 0.01 |
| | version-RBF | 200 | 24 | SGD | 0.01 |
| RetinaFace | MobileNet-0.25 | 250 | 32 | SGD | 0.001 |
| | ResNet-50 | 200 | 24 | SGD | 0.001 |
| Yolov3 | – | 200 | 16 | SGD | 0.01 |
| LFFD | – | | 32 | SGD | 0.1 |
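As an illustration of the settings listed above, the UltraLight version-slim configuration uses SGD with an initial learning rate of 0.01 and a batch size of 24 for 200 epochs. A minimal PyTorch training-loop sketch with a placeholder model, data, and loss standing in for the actual detector and thermal images (illustrative only, not the authors' training code):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder data and model: single-channel 320x240 "thermal" tensors and a toy network.
images = torch.randn(96, 1, 240, 320)
targets = torch.zeros(96, 1)
loader = DataLoader(TensorDataset(images, targets), batch_size=24, shuffle=True)  # batch size from the table

model = torch.nn.Sequential(torch.nn.Conv2d(1, 4, 3), torch.nn.AdaptiveAvgPool2d(1),
                            torch.nn.Flatten(), torch.nn.Linear(4, 1))
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)  # optimizer and initial LR from the table
loss_fn = torch.nn.MSELoss()                              # placeholder loss, not a detection loss

for epoch in range(200):                                  # number of epochs from the table
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```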
Trained on WIDER FACE

| Model | Base Model | Dataset | mAP | Precision | Recall |
|---|---|---|---|---|---|
| UltraLight | version-slim | original | 0.165 | 0.514 | 0.144 |
| | | with CLAHE | 0.207 | 0.514 | 0.196 |
| | | with colorization | 0.216 | 0.530 | 0.209 |
| | | original with mask | 0.107 | 0.436 | 0.105 |
| | | original without mask | 0.267 | 0.598 | 0.210 |
| | version-RBF | original | 0.166 | 0.514 | 0.180 |
| | | with CLAHE | 0.200 | 0.535 | 0.192 |
| | | with colorization | 0.222 | 0.539 | 0.260 |
| | | original with mask | 0.096 | 0.398 | 0.121 |
| | | original without mask | 0.286 | 0.640 | 0.281 |
| RetinaFace | MobileNet-0.25 | original | 0.315 | 0.565 | 0.285 |
| | | with CLAHE | 0.337 | 0.594 | 0.297 |
| | | with colorization | 0.296 | 0.475 | 0.325 |
| | | original with mask | 0.218 | 0.487 | 0.209 |
| | | original without mask | 0.467 | 0.648 | 0.416 |
| | ResNet-50 | original | 0.233 | 0.464 | 0.245 |
| | | with CLAHE | 0.274 | 0.528 | 0.261 |
| | | with colorization | 0.231 | 0.434 | 0.254 |
| | | original with mask | 0.125 | 0.353 | 0.172 |
| | | original without mask | 0.392 | 0.597 | 0.373 |
| Yolov3 | – | original | 0.994 | 0.638 | 0.997 |
| | | with CLAHE | 0.996 | 0.634 | 0.997 |
| | | with colorization | 0.996 | 0.621 | 0.997 |
| | | original with mask | 0.994 | 0.625 | 0.998 |
| | | original without mask | 0.994 | 0.663 | 0.996 |
| LFFD | – | original | 0.163 | 0.461 | 0.168 |
| | | with CLAHE | 0.220 | 0.562 | 0.206 |
| | | with colorization | 0.172 | 0.439 | 0.193 |
| | | original with mask | 0.090 | 0.360 | 0.119 |
| | | original without mask | 0.287 | 0.570 | 0.252 |
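The mAP, precision, and recall values reported in this and the following tables follow the usual detection definitions: precision = TP / (TP + FP) and recall = TP / (TP + FN), with a predicted box counted as a true positive when its IoU with a ground-truth box exceeds a threshold. A minimal sketch of that computation for a single image (the 0.5 IoU threshold and greedy matching are common defaults, not necessarily the exact evaluation protocol used in the paper):

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def precision_recall(detections, ground_truth, iou_thr=0.5):
    """Greedily match detections (assumed sorted by confidence) to ground truth
    and return (precision, recall) for one image."""
    matched, tp = set(), 0
    for det in detections:
        candidates = [(iou(det, gt), j) for j, gt in enumerate(ground_truth) if j not in matched]
        if candidates:
            best_iou, best_j = max(candidates)
            if best_iou >= iou_thr:
                matched.add(best_j)
                tp += 1
    fp = len(detections) - tp
    fn = len(ground_truth) - tp
    precision = tp / (tp + fp) if detections else 0.0
    recall = tp / (tp + fn) if ground_truth else 0.0
    return precision, recall

# Example: one correct and one spurious detection for a single ground-truth face.
print(precision_recall([(10, 10, 60, 60), (100, 100, 120, 120)], [(12, 8, 58, 62)]))
```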
Transfer Learning—Original Training Set

| Model | Base Model | Dataset | mAP | Precision | Recall |
|---|---|---|---|---|---|
| UltraLight | version-slim | original | 0.839 | 0.802 | 0.829 |
| | | with CLAHE | 0.795 | 0.788 | 0.749 |
| | | with colorization | 0.828 | 0.799 | 0.829 |
| | | original with mask | 0.836 | 0.793 | 0.834 |
| | | original without mask | 0.844 | 0.820 | 0.822 |
| | version-RBF | original | 0.829 | 0.826 | 0.819 |
| | | with CLAHE | 0.764 | 0.818 | 0.713 |
| | | with colorization | 0.836 | 0.835 | 0.818 |
| | | original with mask | 0.831 | 0.810 | 0.826 |
| | | original without mask | 0.827 | 0.855 | 0.807 |
| RetinaFace | MobileNet-0.25 | original | 0.473 | 0.674 | 0.444 |
| | | with CLAHE | 0.347 | 0.543 | 0.353 |
| | | with colorization | 0.399 | 0.655 | 0.375 |
| | | original with mask | 0.333 | 0.577 | 0.352 |
| | | original without mask | 0.662 | 0.798 | 0.609 |
| | ResNet-50 | original | 0.516 | 0.755 | 0.515 |
| | | with CLAHE | 0.435 | 0.655 | 0.478 |
| | | with colorization | 0.474 | 0.721 | 0.496 |
| | | original with mask | 0.437 | 0.703 | 0.433 |
| | | original without mask | 0.618 | 0.823 | 0.654 |
| Yolov3 | – | original | 0.996 | 0.716 | 0.997 |
| | | with CLAHE | 0.994 | 0.727 | 0.997 |
| | | with colorization | 0.996 | 0.718 | 0.997 |
| | | original with mask | 0.996 | 0.691 | 0.998 |
| | | original without mask | 0.993 | 0.764 | 0.996 |
| LFFD | – | original | 0.688 | 0.748 | 0.667 |
| | | with CLAHE | 0.608 | 0.743 | 0.555 |
| | | with colorization | 0.685 | 0.753 | 0.659 |
| | | original with mask | 0.722 | 0.749 | 0.715 |
| | | original without mask | 0.645 | 0.743 | 0.585 |
Transfer Learning—Training Set with CLAHE

| Model | Base Model | Dataset | mAP | Precision | Recall |
|---|---|---|---|---|---|
| UltraLight | version-slim | original | 0.780 | 0.805 | 0.778 |
| | | with CLAHE | 0.780 | 0.802 | 0.794 |
| | | with colorization | 0.771 | 0.803 | 0.775 |
| | | original with mask | 0.795 | 0.805 | 0.792 |
| | | original without mask | 0.759 | 0.803 | 0.758 |
| | version-RBF | original | 0.843 | 0.852 | 0.770 |
| | | with CLAHE | 0.843 | 0.839 | 0.786 |
| | | with colorization | 0.823 | 0.847 | 0.741 |
| | | original with mask | 0.840 | 0.840 | 0.771 |
| | | original without mask | 0.850 | 0.872 | 0.776 |
| RetinaFace | MobileNet-0.25 | original | 0.506 | 0.704 | 0.472 |
| | | with CLAHE | 0.507 | 0.700 | 0.476 |
| | | with colorization | 0.494 | 0.716 | 0.437 |
| | | original with mask | 0.386 | 0.622 | 0.385 |
| | | original without mask | 0.675 | 0.805 | 0.623 |
| | ResNet-50 | original | 0.469 | 0.729 | 0.494 |
| | | with CLAHE | 0.565 | 0.767 | 0.562 |
| | | with colorization | 0.438 | 0.745 | 0.450 |
| | | original with mask | 0.385 | 0.666 | 0.416 |
| | | original without mask | 0.605 | 0.815 | 0.629 |
| Yolov3 | – | original | 0.994 | 0.727 | 0.997 |
| | | with CLAHE | 0.995 | 0.725 | 0.997 |
| | | with colorization | 0.994 | 0.731 | 0.997 |
| | | original with mask | 0.994 | 0.711 | 0.998 |
| | | original without mask | 0.994 | 0.761 | 0.996 |
| LFFD | – | original | 0.511 | 0.729 | 0.497 |
| | | with CLAHE | 0.671 | 0.743 | 0.682 |
| | | with colorization | 0.507 | 0.726 | 0.499 |
| | | original with mask | 0.492 | 0.703 | 0.496 |
| | | original without mask | 0.556 | 0.779 | 0.499 |
Transfer Learning—Training Set with Colorization

| Model | Base Model | Dataset | mAP | Precision | Recall |
|---|---|---|---|---|---|
| UltraLight | version-slim | original | 0.813 | 0.822 | 0.796 |
| | | with CLAHE | 0.758 | 0.844 | 0.684 |
| | | with colorization | 0.800 | 0.821 | 0.787 |
| | | original with mask | 0.817 | 0.810 | 0.801 |
| | | original without mask | 0.810 | 0.843 | 0.790 |
| | version-RBF | original | 0.833 | 0.815 | 0.795 |
| | | with CLAHE | 0.779 | 0.825 | 0.702 |
| | | with colorization | 0.828 | 0.818 | 0.787 |
| | | original with mask | 0.831 | 0.800 | 0.800 |
| | | original without mask | 0.834 | 0.840 | 0.790 |
| RetinaFace | MobileNet-0.25 | original | 0.484 | 0.695 | 0.454 |
| | | with CLAHE | 0.375 | 0.534 | 0.393 |
| | | with colorization | 0.468 | 0.702 | 0.438 |
| | | original with mask | 0.335 | 0.592 | 0.345 |
| | | original without mask | 0.687 | 0.807 | 0.644 |
| | ResNet-50 | original | 0.532 | 0.707 | 0.541 |
| | | with CLAHE | 0.403 | 0.587 | 0.459 |
| | | with colorization | 0.537 | 0.724 | 0.560 |
| | | original with mask | 0.406 | 0.627 | 0.458 |
| | | original without mask | 0.691 | 0.825 | 0.682 |
| Yolov3 | – | original | 0.995 | 0.643 | 0.997 |
| | | with CLAHE | 0.993 | 0.639 | 0.997 |
| | | with colorization | 0.995 | 0.642 | 0.997 |
| | | original with mask | 0.994 | 0.616 | 0.998 |
| | | original without mask | 0.995 | 0.699 | 0.996 |
| LFFD | – | original | 0.568 | 0.729 | 0.546 |
| | | with CLAHE | 0.568 | 0.729 | 0.546 |
| | | with colorization | 0.665 | 0.733 | 0.667 |
| | | original with mask | 0.677 | 0.749 | 0.690 |
| | | original without mask | 0.682 | 0.794 | 0.589 |
Inference Time in Milliseconds—640 × 480 Images

| Model | Base Model | Jetson Nano | DGX-1 Station |
|---|---|---|---|
| UltraLight | version-slim | 28.57 | 5.80 |
| | version-RBF | 37.42 | 6.99 |
| RetinaFace | MobileNet-0.25 | 37.16 | 29.74 |
| | ResNet-50 | 58.52 | 20.07 |
| Yolov3 | – | 685.11 | 3.82 |
| LFFD | – | 69.04 | 20.18 |
Inference Time in Milliseconds—320 × 240 Images

| Model | Base Model | Jetson Nano | DGX-1 Station |
|---|---|---|---|
| UltraLight | version-slim | 34.44 | 7.34 |
| | version-RBF | 35.50 | 9.10 |
| RetinaFace | MobileNet-0.25 | 35.63 | 9.04 |
| | ResNet-50 | 57.95 | 15.26 |
| Yolov3 | – | 325.47 | 4.85 |
| LFFD | – | 63.23 | 16.83 |
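Latency figures like those in the two tables above are usually obtained by averaging many forward passes after a warm-up phase, with explicit synchronization so that asynchronous GPU work is included in the measurement. A minimal PyTorch timing sketch with a placeholder network (the authors' exact measurement procedure, batch size, and precision settings are not described here):

```python
import time
import torch

def mean_latency_ms(model, input_shape=(1, 3, 480, 640), warmup=10, runs=100):
    """Average single-image forward-pass time in milliseconds."""
    device = "cuda" if torch.cuda.is_available() else "cpu"
    model = model.to(device).eval()
    x = torch.randn(*input_shape, device=device)
    with torch.no_grad():
        for _ in range(warmup):            # untimed warm-up passes
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()       # make sure all queued kernels have finished
        start = time.perf_counter()
        for _ in range(runs):
            model(x)
        if device == "cuda":
            torch.cuda.synchronize()
    return (time.perf_counter() - start) / runs * 1000.0

# Example with a placeholder backbone rather than one of the evaluated detectors:
print(mean_latency_ms(torch.nn.Conv2d(3, 16, 3, padding=1)))
```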
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Share and Cite
Głowacka, N.; Rumiński, J. Face with Mask Detection in Thermal Images Using Deep Neural Networks. Sensors 2021, 21, 6387. https://doi.org/10.3390/s21196387
Głowacka N, Rumiński J. Face with Mask Detection in Thermal Images Using Deep Neural Networks. Sensors. 2021; 21(19):6387. https://doi.org/10.3390/s21196387
Chicago/Turabian Style: Głowacka, Natalia, and Jacek Rumiński. 2021. "Face with Mask Detection in Thermal Images Using Deep Neural Networks" Sensors 21, no. 19: 6387. https://doi.org/10.3390/s21196387
APA Style: Głowacka, N., & Rumiński, J. (2021). Face with Mask Detection in Thermal Images Using Deep Neural Networks. Sensors, 21(19), 6387. https://doi.org/10.3390/s21196387