Enhancing Obscured Regions in Thermal Imaging: A Novel GAN-Based Approach for Efficient Occlusion Inpainting
Abstract
1. Introduction
2. Related Work
2.1. Image Segmentation
2.2. Image Inpainting
2.2.1. GAN Data Generation
2.2.2. Pix2Pix Key Concepts
2.2.3. Relevance
3. Methodology
3.1. Data Collection
3.1.1. Occlusion Classification
3.1.2. KAIST Thermal Suburban Dataset
3.2. Occlusion Two-Stage Removal Model Architecture
3.2.1. Mask Generation Using U-Net
3.2.2. Image Inpainting Using Pix2Pix
3.3. Training and Validation
3.4. Evaluation Metrics
3.4.1. Segmentation Evaluation Metrics
F1 Score
Intersection over Union (IoU)
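As a brief illustration only (not code from the paper), the sketch below shows how IoU and the F1 score can be computed from a predicted and a ground-truth binary mask with NumPy; the function name, thresholding convention, and epsilon are assumptions made for this example.

```python
import numpy as np

def segmentation_metrics(pred: np.ndarray, truth: np.ndarray, eps: float = 1e-8):
    """Compute IoU and F1 (Dice) for two binary masks of equal shape.

    Illustrative sketch only: assumes masks are already thresholded to {0, 1}.
    """
    pred = pred.astype(bool)
    truth = truth.astype(bool)

    tp = np.logical_and(pred, truth).sum()    # true positives
    fp = np.logical_and(pred, ~truth).sum()   # false positives
    fn = np.logical_and(~pred, truth).sum()   # false negatives

    iou = tp / (tp + fp + fn + eps)           # intersection over union
    precision = tp / (tp + fp + eps)
    recall = tp / (tp + fn + eps)
    f1 = 2 * precision * recall / (precision + recall + eps)
    return iou, f1
```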
3.4.2. Inpainting Evaluation Metrics
Structural Similarity (SSIM)
The SSIM between two images $x$ and $y$ is defined as
$$ \mathrm{SSIM}(x, y) = \frac{(2\mu_x\mu_y + c_1)(2\sigma_{xy} + c_2)}{(\mu_x^2 + \mu_y^2 + c_1)(\sigma_x^2 + \sigma_y^2 + c_2)} $$
where:
- $x$ and $y$ are the two images being compared.
- $\mu_x$ and $\mu_y$ are the average pixel values of images $x$ and $y$, respectively.
- $\sigma_x$ and $\sigma_y$ are the standard deviations of pixel values of images $x$ and $y$, respectively.
- $\sigma_{xy}$ is the covariance of pixel values of images $x$ and $y$.
- $c_1$ and $c_2$ are constants used to stabilize the division with a weak denominator, defined as $c_1 = (k_1 L)^2$ and $c_2 = (k_2 L)^2$, where $L$ is the dynamic range of pixel values (255 for 8-bit images), and $k_1 = 0.01$ and $k_2 = 0.03$ are the default values.
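For reference, a minimal NumPy sketch of this global (single-window) form of SSIM follows; it is an illustration under the definitions above, not the implementation used in the paper, which could equally rely on a windowed library routine such as skimage.metrics.structural_similarity.

```python
import numpy as np

def global_ssim(x: np.ndarray, y: np.ndarray, data_range: float = 255.0,
                k1: float = 0.01, k2: float = 0.03) -> float:
    """Global (single-window) SSIM following the formula defined above.

    Sketch only: library implementations typically use a sliding window
    and average the local SSIM values instead of global statistics.
    """
    x = x.astype(np.float64)
    y = y.astype(np.float64)

    mu_x, mu_y = x.mean(), y.mean()              # mean intensities
    sigma_x, sigma_y = x.std(), y.std()          # standard deviations
    sigma_xy = ((x - mu_x) * (y - mu_y)).mean()  # covariance

    c1 = (k1 * data_range) ** 2                  # stabilizing constants
    c2 = (k2 * data_range) ** 2

    num = (2 * mu_x * mu_y + c1) * (2 * sigma_xy + c2)
    den = (mu_x**2 + mu_y**2 + c1) * (sigma_x**2 + sigma_y**2 + c2)
    return num / den
```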
Peak Signal to Noise Ratio (PSNR)
The PSNR between an original image and its reconstruction is defined as
$$ \mathrm{PSNR} = 10 \log_{10}\!\left(\frac{\mathrm{MAX}_I^2}{\mathrm{MSE}}\right) $$
where:
- $\mathrm{MAX}_I$ is the maximum possible pixel value of the image. For instance, if the image has 8 bits per pixel, $\mathrm{MAX}_I$ is 255.
- MSE stands for Mean Squared Error, which is calculated as
$$ \mathrm{MSE} = \frac{1}{mn} \sum_{i=0}^{m-1} \sum_{j=0}^{n-1} \big[ I(i,j) - K(i,j) \big]^2 $$
- $I$ is the original image.
- $K$ is the reconstructed (compressed or distorted) image.
- $m$ and $n$ are the dimensions of the images (height and width).
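A minimal NumPy sketch of this computation is shown below; it is an illustration of the formula above (the function name and default peak value are assumptions), not the evaluation code used in the paper.

```python
import numpy as np

def psnr(original: np.ndarray, reconstructed: np.ndarray,
         max_val: float = 255.0) -> float:
    """PSNR in dB from the MSE between an original and a reconstructed image.

    Sketch only: assumes both images share the same shape and pixel range.
    """
    original = original.astype(np.float64)
    reconstructed = reconstructed.astype(np.float64)

    mse = np.mean((original - reconstructed) ** 2)  # mean squared error
    if mse == 0:
        return float("inf")                         # identical images
    return 10.0 * np.log10((max_val ** 2) / mse)
```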
4. Experimental Results
4.1. Qualitative
4.2. Quantitative
5. Discussions and Future Work
6. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
| Model | IoU | F1 Score | Accuracy | Precision | Recall | Time (ms) |
|---|---|---|---|---|---|---|
| Reduced U-Net | 0.66 | 0.80 | 0.87 | 0.76 | 0.71 | 25 |
| Pix2Pix Backbone | 0.75 | 0.85 | 0.92 | 0.88 | 0.83 | 180 |
| Method | PSNR (dB) | SSIM | Run Time (s) |
|---|---|---|---|
| BM3D | 20 | 0.75 | 2 |
| BM4D | 22 | 0.78 | 3 |
| Dark Channel Prior | 25 | 0.85 | 5 |
| Atmospheric Light Estimation | 23 | 0.80 | 4 |
| Transmission Map Refinement | 28 | 0.90 | 6 |
| Pix2Pix on KAIST | 47.3 | 0.90 | 0.2 |
| Pix2Pix on our dataset | 35.7 | 0.89 | 0.2 |
| Our Approach | 32.5 | 0.89 | 0.3 |