Fusion of Visible and Infrared Aerial Images from Uncalibrated Sensors Using Wavelet Decomposition and Deep Learning
Abstract
1. Introduction
2. Related Work
2.1. Multi-Modal Image Matching
2.1.1. Classical Methods
2.1.2. Handcrafted Feature Vector-Based Methods
2.1.3. DNN-Based Image Feature Matching
2.2. Multi-Modal Image Fusion
2.3. Fusion Performance Metrics
3. DeepFusion Pipeline
3.1. Wavelet Spectral Decomposition (WSD)-Based Multi-Modal Image Matching
3.2. Image Fusion
3.3. Keypoint-Based Quality Analysis
4. Results and Discussion
4.1. Dataset and Implementation Details
4.1.1. Dataset
4.1.2. Implementation Details
4.2. Experiments
Image Registration
4.3. Performance Analysis
5. Conclusions
Author Contributions
Funding
Institutional Review Board Statement
Informed Consent Statement
Data Availability Statement
Conflicts of Interest
References
- Zhang, H.; Zhang, L.; Zhuo, L.; Zhang, J. Object tracking in RGB-T videos using modal-aware attention network and competitive learning. Sensors 2020, 20, 393. [Google Scholar] [CrossRef]
- Lan, X.; Ye, M.; Zhang, S.; Zhou, H.; Yuen, P.C. Modality-correlation-aware sparse representation for RGB-infrared object tracking. Pattern Recognit. Lett. 2020, 130, 12–20. [Google Scholar] [CrossRef]
- Wang, Y.; Wei, X.; Tang, X.; Shen, H.; Zhang, H. Adaptive fusion CNN features for RGBT object tracking. IEEE Trans. Intell. Transp. Syst. 2021, 23, 7831–7840. [Google Scholar] [CrossRef]
- Al-Shakarji, N.; Gao, K.; Bunyak, F.; Aliakbarpour, H.; Blasch, E.; Narayaran, P.; Seetharaman, G.; Palaniappan, K. Impact of georegistration accuracy on wide area motion imagery object detection and tracking. In Proceedings of the IEEE 24th International Conference on Information Fusion (FUSION), Sun City, South Africa, 1–4 November 2021; pp. 1–8. [Google Scholar]
- Sun, Y.; Zuo, W.; Yun, P.; Wang, H.; Liu, M. FuseSeg: Semantic segmentation of urban scenes based on RGB and thermal data fusion. IEEE Trans. Autom. Sci. Eng. 2020, 18, 1000–1011. [Google Scholar] [CrossRef]
- Guo, Z.; Li, X.; Xu, Q.; Sun, Z. Robust semantic segmentation based on RGB-thermal in variable lighting scenes. Measurement 2021, 186, 110176. [Google Scholar] [CrossRef]
- Fu, Y.; Chen, Q.; Zhao, H. CGFNet: Cross-guided fusion network for RGB-thermal semantic segmentation. Vis. Comput. 2022, 38, 3243–3252. [Google Scholar] [CrossRef]
- Song, X.; Wu, X.J.; Li, H. A medical image fusion method based on MDLatLRRv2. arXiv 2022, arXiv:2206.15179. [Google Scholar]
- Negishi, T.; Abe, S.; Matsui, T.; Liu, H.; Kurosawa, M.; Kirimoto, T.; Sun, G. Contactless vital signs measurement system using RGB-thermal image sensors and its clinical screening test on patients with seasonal influenza. Sensors 2020, 20, 2171. [Google Scholar] [CrossRef] [PubMed]
- Maurya, L.; Mahapatra, P.; Chawla, D. Non-contact breathing monitoring by integrating RGB and thermal imaging via RGB-thermal image registration. Biocybern. Biomed. Eng. 2021, 41, 1107–1122. [Google Scholar] [CrossRef]
- Marais-Sicre, C.; Queguiner, S.; Bustillo, V.; Lesage, L.; Barcet, H.; Pelle, N.; Breil, N.; Coudert, B. Sun/Shade Separation in Optical and Thermal UAV Images for Assessing the Impact of Agricultural Practices. Remote Sens. 2024, 16, 1436. [Google Scholar] [CrossRef]
- Fevgas, G.; Lagkas, T.; Argyriou, V.; Sarigiannidis, P. Detection of biotic or abiotic stress in vineyards using thermal and RGB images captured via IoT sensors. IEEE Access 2023, 11, 105902–105915. [Google Scholar] [CrossRef]
- Iwashita, Y.; Nakashima, K.; Rafol, S.; Stoica, A.; Kurazume, R. MU-Net: Deep Learning-Based Thermal IR Image Estimation From RGB Image. In Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Long Beach, CA, USA, 16–17 June 2019; pp. 1022–1028. [Google Scholar]
- Zhou, T.; Cheng, Q.; Lu, H.; Li, Q.; Zhang, X.; Qiu, S. Deep learning methods for medical image fusion: A review. Comput. Biol. Med. 2023, 160, 106959. [Google Scholar] [CrossRef]
- Zhang, T.; Guo, H.; Jiao, Q.; Zhang, Q.; Han, J. Efficient rgb-t tracking via cross-modality distillation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Vancouver, BC, Canada, 17–24 June 2023; pp. 5404–5413. [Google Scholar]
- Barbedo, J.G. A review on the combination of deep learning techniques with proximal hyperspectral images in agriculture. Comput. Electron. Agric. 2023, 210, 107920. [Google Scholar] [CrossRef]
- Farmonov, N.; Amankulova, K.; Szatmári, J.; Sharifi, A.; Abbasi-Moghadam, D.; Nejad, S.M.; Mucsi, L. Crop type classification by DESIS hyperspectral imagery and machine learning algorithms. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2023, 16, 1576–1588. [Google Scholar] [CrossRef]
- Curcio, A.C.; Barbero, L.; Peralta, G. UAV-hyperspectral imaging to estimate species distribution in salt marshes: A case study in the Cadiz Bay (SW Spain). Remote Sens. 2023, 15, 1419. [Google Scholar] [CrossRef]
- Ye, Y.; Bruzzone, L.; Shan, J.; Bovolo, F.; Zhu, Q. Fast and robust matching for multi-modal remote sensing image registration. IEEE Trans. Geosci. Remote Sens. 2019, 57, 9059–9070. [Google Scholar] [CrossRef]
- Gao, K.; Aliakbarpour, H.; Seetharaman, G.; Palaniappan, K. DCT-based local descriptor for robust matching and feature tracking in wide area motion imagery. IEEE Geosci. Remote Sens. Lett. 2020, 18, 1441–1446. [Google Scholar] [CrossRef]
- Lowe, D.G. Distinctive Image Features from Scale-Invariant Keypoints. Int. J. Comput. Vis. 2004, 60, 91–110. [Google Scholar]
- Rublee, E.; Rabaud, V.; Konolige, K.; Bradski, G. ORB: An efficient alternative to SIFT or SURF. In Proceedings of the International Conference on Computer Vision, Barcelona, Spain, 6–13 November 2011; pp. 2564–2571. [Google Scholar]
- Oyallon, E.; Rabin, J. An analysis of the SURF method. Image Process. Line 2015, 5, 176–218. [Google Scholar] [CrossRef]
- Song, H.; Liu, Q.; Wang, G.; Hang, R.; Huang, B. Spatiotemporal satellite image fusion using deep convolutional neural networks. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens. 2018, 11, 821–829. [Google Scholar] [CrossRef]
- Liu, Y.; Chen, X.; Peng, H.; Wang, Z. Multi-focus image fusion with a deep convolutional neural network. Inf. Fusion 2017, 36, 191–207. [Google Scholar] [CrossRef]
- Sun, Y.; Zuo, W.; Liu, M. Rtfnet: Rgb-thermal fusion network for semantic segmentation of urban scenes. IEEE Robot. Autom. Lett. 2019, 4, 2576–2583. [Google Scholar] [CrossRef]
- Pielawski, N.; Wetzer, E.; Öfverstedt, J.; Lu, J.; Wählby, C.; Lindblad, J.; Sladoje, N. CoMIR: Contrastive multi-modal image representation for registration. Adv. Neural Inf. Process. Syst. 2020, 33, 18433–18444. [Google Scholar]
- Arar, M.; Ginger, Y.; Danon, D.; Bermano, A.H.; Cohen-Or, D. Unsupervised multi-modal image registration via geometry preserving image-to-image translation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, Seattle, WA, USA, 14–19 June 2020; pp. 13410–13419. [Google Scholar]
- Jiang, X.; Ma, J.; Xiao, G.; Shao, Z.; Guo, X. A review of multi-modal image matching: Methods and applications. Inf. Fusion 2021, 73, 22–71. [Google Scholar] [CrossRef]
- DeTone, D.; Malisiewicz, T.; Rabinovich, A. Superpoint: Self-supervised interest point detection and description. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Salt Lake City, UT, USA, 18–22 June 2018; pp. 224–236. [Google Scholar]
- Roszyk, K.; Nowicki, M.R.; Skrzypczyński, P. Adopting the YOLOv4 architecture for low-latency multispectral pedestrian detection in autonomous driving. Sensors 2022, 22, 1082. [Google Scholar] [CrossRef] [PubMed]
- Zhu, Y.; Li, C.; Tang, J.; Luo, B.; Wang, L. RGBT tracking by trident fusion network. IEEE Trans. Circuits Syst. Video Technol. 2021, 32, 579–592. [Google Scholar] [CrossRef]
- Peng, J.; Zhao, H.; Hu, Z. Dynamic fusion network for RGBT tracking. IEEE Trans. Intell. Transp. Syst. 2022, 24, 3822–3832. [Google Scholar] [CrossRef]
- Zhang, Y.; Liu, Y.; Sun, P.; Yan, H.; Zhao, X.; Zhang, L. IFCNN: A general image fusion framework based on convolutional neural network. Inf. Fusion 2020, 54, 99–118. [Google Scholar] [CrossRef]
- Liu, Y.; Liu, Y.; Yan, S.; Chen, C.; Zhong, J.; Peng, Y.; Zhang, M. A multi-view thermal–visible image dataset for cross-spectral matching. Remote Sens. 2022, 15, 174. [Google Scholar] [CrossRef]
- Cui, S.; Ma, A.; Wan, Y.; Zhong, Y.; Luo, B.; Xu, M. Cross-modality image matching network with modality-invariant feature representation for airborne-ground thermal infrared and visible datasets. IEEE Trans. Geosci. Remote Sens. 2021, 60, 1–4. [Google Scholar] [CrossRef]
- Cheng, K.S.; Lin, H.Y. Automatic target recognition by infrared and visible image matching. In Proceedings of the 14th IAPR International Conference on Machine Vision Applications (MVA), Tokyo, Japan, 18–22 May 2015; pp. 312–315. [Google Scholar]
- Choi, M.; Kim, R.Y.; Nam, M.R.; Kim, H.O. Fusion of multispectral and panchromatic satellite images using the curvelet transform. IEEE Geosci. Remote Sens. Lett. 2005, 2, 136–140. [Google Scholar] [CrossRef]
- Meng, F.; Song, M.; Guo, B.; Shi, R.; Shan, D. Image fusion based on object region detection and non-subsampled contourlet transform. Comput. Electr. Eng. 2017, 62, 375–383. [Google Scholar] [CrossRef]
- Yin, H. Tensor sparse representation for 3-D medical image fusion using weighted average rule. IEEE Trans. Biomed. Eng. 2018, 65, 2622–2633. [Google Scholar] [CrossRef]
- He, C.; Liu, Q.; Li, H.; Wang, H. Multi-Modal medical image fusion based on IHS and PCA. Procedia Eng. 2010, 7, 280–285. [Google Scholar] [CrossRef]
- Zhao, J.; Zhou, Q.; Chen, Y.; Feng, H.; Xu, Z.; Li, Q. Fusion of visible and infrared images using saliency analysis and detail preserving based image decomposition. Infrared Phys. Technol. 2013, 56, 93–99. [Google Scholar] [CrossRef]
- Han, J.; Pauwels, E.J.; De Zeeuw, P. Fast saliency-aware multi-modality image fusion. Neurocomputing 2013, 111, 70–80. [Google Scholar] [CrossRef]
- Liu, Y.; Chen, X.; Ward, R.K.; Wang, Z.J. Image fusion with convolutional sparse representation. IEEE Signal Process. Lett. 2016, 23, 1882–1886. [Google Scholar] [CrossRef]
- Ma, J.; Yu, W.; Liang, P.; Li, C.; Jiang, J. FusionGAN: A generative adversarial network for infrared and visible image fusion. Inf. Fusion 2019, 48, 11–26. [Google Scholar] [CrossRef]
- Mo, Y.; Kang, X.; Duan, P.; Sun, B.; Li, S. Attribute filter based infrared and visible image fusion. Inf. Fusion 2021, 75, 41–54. [Google Scholar] [CrossRef]
- Li, L.; Shi, Y.; Lv, M.; Jia, Z.; Liu, M.; Zhao, X.; Zhang, X.; Ma, H. Infrared and Visible Image Fusion via Sparse Representation and Guided Filtering in Laplacian Pyramid Domain. Remote Sens. 2024, 16, 3804. [Google Scholar] [CrossRef]
- Chen, W.; Miao, L.; Wang, Y.; Zhou, Z.; Qiao, Y. Infrared–Visible Image Fusion through Feature-Based Decomposition and Domain Normalization. Remote Sens. 2024, 16, 969. [Google Scholar] [CrossRef]
- Shahsavarani, S.; Lopez, F.; Ibarra-Castanedo, C.; Maldague, X.P. Robust Multi-Modal Image Registration for Image Fusion Enhancement in Infrastructure Inspection. Sensors 2024, 24, 3994. [Google Scholar] [CrossRef] [PubMed]
- Wang, C.; Wu, J.; Zhu, Z.; Chen, H. MSFNet: Multistage fusion network for infrared and visible image fusion. Neurocomputing 2022, 507, 26–39. [Google Scholar] [CrossRef]
- Lan, X.; Gu, X.; Gu, X. MMNet: Multi-modal multi-stage network for RGB-T image semantic segmentation. Appl. Intell. 2022, 52, 5817–5829. [Google Scholar] [CrossRef]
- Zhang, L.; Danelljan, M.; Gonzalez-Garcia, A.; Van De Weijer, J.; Shahbaz, K.F. Multi-modal fusion for end-to-end RGB-T tracking. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Seoul, Republic of Korea, 27–28 October 2019. [Google Scholar]
- Li, H.; Wu, X.J.; Kittler, J. RFN-Nest: An end-to-end residual fusion network for infrared and visible images. Inf. Fusion 2021, 73, 72–86. [Google Scholar] [CrossRef]
- Zhang, X.; Ye, P.; Xiao, G. VIFB: A visible and infrared image fusion benchmark. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, Seattle, WA, USA, 14–19 June 2020. [Google Scholar]
- Pereira, A.; Warwick, S.; Moutinho, A.; Suleman, A. Infrared and visible camera integration for detection and tracking of small UAVs: Systematic evaluation. Drones 2024, 8, 650. [Google Scholar] [CrossRef]
- Zhang, Y.; Yu, H.; He, Y.; Wang, X.; Yang, W. Illumination-guided RGBT object detection with inter-and intra-modality fusion. IEEE Trans. Instrum. Meas. 2023, 72, 1–13. [Google Scholar] [CrossRef]
- Li, G.; Lin, Y.; Ouyang, D.; Li, S.; Luo, X.; Qu, X.; Pi, D.; Li, S.E. A RGB-thermal image segmentation method based on parameter sharing and attention fusion for safe autonomous driving. IEEE Trans. Intell. Transp. Syst. 2023, 25, 5122–5137. [Google Scholar] [CrossRef]
- Choi, Y.; Kim, N.; Hwang, S.; Park, K.; Yoon, J.S.; An, K.; Kweon, I.S. KAIST multi-spectral day/night data set for autonomous and assisted driving. IEEE Trans. Intell. Transp. Syst. 2018, 19, 934–948. [Google Scholar] [CrossRef]
- Eltahan, M.; Elsayed, K. Enhancing Autonomous Driving By Exploiting Thermal Object Detection Through Feature Fusion. Int. J. Intell. Transp. Syst. Res. 2024, 22, 146–158. [Google Scholar] [CrossRef]
- Wang, Z.; Bovik, A.C.; Sheikh, H.R.; Simoncelli, E.P. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. 2004, 13, 600–612. [Google Scholar] [CrossRef] [PubMed]
- Yoo, J.C.; Han, T.H. Fast normalized cross-correlation. Circuits Syst. Signal Process. 2009, 28, 819–843. [Google Scholar] [CrossRef]
- Cui, G.; Feng, H.; Xu, Z.; Li, Q.; Chen, Y. Detail preserved fusion of visible and infrared images using regional saliency extraction and multi-scale image decomposition. Opt. Commun. 2015, 341, 199–209. [Google Scholar] [CrossRef]
- Xydeas, C.S.; Petrovic, V. Objective image fusion performance measure. Mil. Tech. Cour. 2000, 36, 308–309. [Google Scholar] [CrossRef]
- Qu, G.; Zhang, D.; Yan, P. Information measure for performance of image fusion. Electron. Lett. 2002, 38, 313–315. [Google Scholar] [CrossRef]
- Eskicioglu, A.M.; Fisher, P.S. Image quality measures and their performance. IEEE Trans. Commun. 1995, 43, 2959–2965. [Google Scholar] [CrossRef]
- Kassim, Y.M.; Palaniappan, K.; Yang, F.; Poostchi, M.; Palaniappan, N.; Maude, R.J.; Antani, S.; Jaeger, S. Clustering-Based Dual Deep Learning Architecture for Detecting Red Blood Cells in Malaria Diagnostic Smears. IEEE J. Biomed. Health Inf. 2021, 25, 1735–1746. [Google Scholar] [CrossRef]
- Sun, J.; Shen, Z.; Wang, Y.; Bao, H.; Zhou, X. LoFTR: Detector-free local feature matching with transformers. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Nashville, TN, USA, 20–25 June 2021; pp. 8922–8931. [Google Scholar]
- Lindenberger, P.; Sarlin, P.E.; Pollefeys, M. Lightglue: Local feature matching at light speed. In Proceedings of the IEEE International Conference on Computer Vision (ICCV), Paris, France, 1–6 October 2023; pp. 17627–17638. [Google Scholar]
- Chandrakanth, V.; Murthy, V.S.; Channappayya, S.S. Siamese cross-domain tracker design for seamless tracking of targets in RGB and thermal videos. IEEE Trans. Artif. Intell. 2022, 4, 161–172. [Google Scholar]
- Zhang, D. Fundamentals of Image Data Mining: Analysis, Features, Classification and Retrieval; Springer Nature: Cham, Switzerland, 2019. [Google Scholar]
- Skodras, A.; Christopoulos, C.; Ebrahimi, T. The JPEG 2000 still image compression standard. IEEE Signal Process. Mag. 2001, 18, 36–58. [Google Scholar] [CrossRef]
- Jagalingam, P.; Hegde, A.V. A review of quality metrics for fused image. Aquat. Procedia 2015, 4, 133–142. [Google Scholar] [CrossRef]
- Chen, Y.; Blum, R.S. A new automated quality assessment algorithm for image fusion. Image Vis. Comput. 2009, 27, 1421–1432. [Google Scholar] [CrossRef]
- Chen, H.; Varshney, P.K. A human perception inspired quality metric for image fusion based on regional information. Inf. Fusion 2007, 8, 193–207. [Google Scholar] [CrossRef]
- Li, Y.; Wang, Y.; Huang, W.; Zhang, Z. Automatic image stitching using SIFT. In Proceedings of the International Conference on Audio, Language and Image Processing, Shanghai, China, 7–9 July 2008; pp. 568–571. [Google Scholar]
- Zhang, J.; Xiu, Y. Image stitching based on human visual system and SIFT algorithm. Vis. Comput. 2024, 40, 427–439. [Google Scholar] [CrossRef]
- Wang, X.; Zhang, H. Realization of 3D Reconstruction Algorithm Based on 2D Video. In Proceedings of the 33rd Chinese Control and Decision Conference (CCDC), Kunming, China, 22–24 May 2021; pp. 7299–7304. [Google Scholar]
- Gao, L.; Zhao, Y.; Han, J.; Liu, H. Research on multi-view 3D reconstruction technology based on SFM. Sensors 2022, 22, 4366. [Google Scholar] [CrossRef]
- Palaniappan, K.; Rao, R.M.; Seetharaman, G. Wide-area persistent airborne video: Architecture and challenges. In Distributed Video Sensor Networks; Springer: London, UK, 2011; pp. 349–420. [Google Scholar]
- Blasch, E.; Seetharaman, G.; Suddarth, S.; Palaniappan, K.; Chen, G.; Ling, H.; Basharat, A. Summary of methods in wide-area motion imagery (WAMI). In Geospatial InfoFusion and Video Analytics IV and Motion Imagery for ISR and Situational Awareness II; SPIE: Bellingham, WA, USA, 2014; Volume 9089, pp. 91–100. [Google Scholar]
- AliAkbarpour, H.; Palaniappan, K.; Seetharaman, G. Fast structure from motion for sequential and wide area motion imagery. In Proceedings of the IEEE International Conference on Computer Vision Workshops, Santiago, Chile, 7–13 December 2015; pp. 34–41. [Google Scholar]
- Aliakbarpour, H.; Palaniappan, K.; Seetharaman, G. Robust camera pose refinement and rapid SfM for multiview aerial imagery—Without RANSAC. IEEE Geosci. Remote Sens. Lett. 2015, 12, 2203–2230. [Google Scholar] [CrossRef]
- AliAkbarpour, H.; Palaniappan, K.; Seetharaman, G. Parallax-tolerant aerial image georegistration and efficient camera pose refinement—Without piecewise homographies. IEEE Trans. Geosci. Remote Sens. 2017, 55, 4618–4655. [Google Scholar] [CrossRef]
- Pelapur, R.; Candemir, S.; Bunyak, F.; Poostchi, M.; Seetharaman, G.; Palaniappan, K. Persistent target tracking using likelihood fusion in wide-area and full motion video sequences. In Proceedings of the 15th IEEE International Conference on Information Fusion, Singapore, 9–12 July 2012; pp. 2420–2427. [Google Scholar]
- Li, H.; Wu, X.J.; Kittler, J. Infrared and visible image fusion using a deep learning framework. In Proceedings of the 24th IEEE International Conference on Pattern Recognition (ICPR), Beijing, China, 20–24 August 2018; pp. 2705–2710. [Google Scholar]
- Li, C.; Liang, X.; Lu, Y.; Zhao, N.; Tang, J. RGB-T object tracking: Benchmark and baseline. Pattern Recognit. 2019, 96, 106977. [Google Scholar] [CrossRef]
- Xu, H.; Ma, J.; Le, Z.; Jiang, J.; Guo, X. Fusiondn: A unified densely connected network for image fusion. Proc. AAAI Conf. Artif. Intell. 2020, 34, 12484–12491. [Google Scholar] [CrossRef]
- Davis, J.W.; Sharma, V. Background-subtraction using contour-based fusion of thermal and visible imagery. Comput. Vis. Image Underst. 2007, 106, 162–182. [Google Scholar] [CrossRef]
- Lee, C.; Anderson, M.; Raganathan, N.; Zuo, X.; Do, K.; Gkioxari, G.; Chung, S.J. CART: Caltech Aerial RGB-Thermal Dataset in the Wild. arXiv 2024, arXiv:2403.08997. [Google Scholar]
- Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980. [Google Scholar]
- Naidu, V.P. Image fusion technique using multi-resolution singular value decomposition. Def. Sci. J. 2011, 61, 479–484. [Google Scholar] [CrossRef]
- Li, S.; Kang, X.; Hu, J. Image fusion with guided filtering. IEEE Trans. Image Process. 2013, 22, 2864–2875. [Google Scholar] [PubMed]
- Liu, Y.; Liu, S.; Wang, Z. A general framework for image fusion based on multi-scale transform and sparse representation. Inf. Fusion 2015, 24, 147–164. [Google Scholar] [CrossRef]
- Bavirisetti, D.P.; Dhuli, R. Two-scale image fusion of visible and infrared images using saliency detection. Infrared Phys. Technol. 2016, 76, 52–64. [Google Scholar] [CrossRef]
- Zhou, Z.; Wang, B.; Li, S.; Dong, M. Perceptual fusion of infrared and visible images through a hybrid multi-scale decomposition with Gaussian and bilateral filters. Inf. Fusion 2016, 30, 15–26. [Google Scholar] [CrossRef]
- Zhou, Z.; Dong, M.; Xie, X.; Gao, Z. Fusion of infrared and visible images for night-vision context enhancement. Appl. Opt. 2016, 55, 6480–6490. [Google Scholar] [CrossRef] [PubMed]
- Ma, J.; Chen, C.; Li, C.; Huang, J. Infrared and visible image fusion via gradient transfer and total variation minimization. Inf. Fusion 2016, 31, 100–109. [Google Scholar] [CrossRef]
- Zhang, Y.; Zhang, L.; Bai, X.; Zhang, L. Infrared and visual image fusion through infrared feature extraction and visual information preservation. Infrared Phys. Technol. 2017, 83, 227–237. [Google Scholar] [CrossRef]
- Ma, J.; Zhou, Z.; Wang, B.; Zong, H. Infrared and visible image fusion based on visual saliency map and weighted least square optimization. Infrared Phys. Technol. 2017, 82, 8–17. [Google Scholar] [CrossRef]
- Bavirisetti, D.P.; Xiao, G.; Liu, G. Multi-sensor image fusion based on fourth order partial differential equations. In Proceedings of the 20th IEEE International Conference on Information Fusion (Fusion), Xi’an, China, 10–13 July 2017; pp. 1–9. [Google Scholar]
- Liu, Y.; Chen, X.; Cheng, J.; Peng, H.; Wang, Z. Infrared and visible image fusion with convolutional neural networks. Int. J. Wavelets Multiresolut. Inf. Process. 2018, 16, 1850018. [Google Scholar] [CrossRef]
- Li, H.; Wu, X.J. Infrared and visible image fusion using latent low-rank representation. arXiv 2018, arXiv:1804.08992. [Google Scholar]
- Xu, H.; Ma, J.; Jiang, J.; Guo, X.; Ling, H. U2Fusion: A unified unsupervised image fusion network. IEEE Trans. Pattern Anal. Mach. Intell. 2020, 44, 502–518. [Google Scholar] [CrossRef] [PubMed]
- Tang, W.; He, F.; Liu, Y. YDTR: Infrared and visible image fusion via Y-shape dynamic transformer. IEEE Trans. Multimed. 2022, 25, 5413–5428. [Google Scholar] [CrossRef]
- Tang, L.; Yuan, J.; Ma, J. Image fusion in the loop of high-level vision tasks: A semantic-aware real-time infrared and visible image fusion network. Inf. Fusion 2022, 82, 28–42. [Google Scholar] [CrossRef]
- Ma, J.; Tang, L.; Fan, F.; Huang, J.; Mei, X.; Ma, Y. SwinFusion: Cross-domain long-range learning for general image fusion via Swin transformer. IEEE/CAA J. Autom. Sin. 2022, 9, 1200–1217. [Google Scholar] [CrossRef]
- Bulanon, D.M.; Burks, T.; Alchanatis, V. Image fusion of visible and thermal images for fruit detection. Biosyst. Eng. 2009, 103, 12–22. [Google Scholar] [CrossRef]
- Roberts, J.W.; Van Aardt, J.A.; Ahmed, F.B. Assessment of image fusion procedures using entropy, image quality, and multispectral classification. J. Appl. Remote Sens. 2008, 2, 023522. [Google Scholar]
- Rajalingam, B.; Priya, R. Hybrid multi-modality medical image fusion technique for feature enhancement in medical diagnosis. Int. J. Eng. Sci. Invent. 2018, 2, 52–60. [Google Scholar]
- Tanchenko, A. Visual-PSNR measure of image quality. J. Vis. Commun. Image Represent. 2014, 25, 874–878. [Google Scholar] [CrossRef]
Metric / Network (→) | DeepFusion (Ours) | SeAFusion | SwinFusion | U2Fusion | YDTR |
---|---|---|---|---|---|
PSNR ↑ | 62.2142 | 63.6008 | 66.1444 | 62.0929 | 63.5828 |
RMSE ↓ | 0.2642 | 0.2213 | 0.226 | 0.1754 | 0.1866 |
SSIM ↑ | 1.3252 | 1.0737 | 1.1487 | 1.1992 | 1.2241 |
Variance ↑ | 40.8556 | 40.7661 | 36.3481 | 21.8177 | 22.9194 |
Mutual Information ↑ | 0.7628 | 1.6846 | 1.8256 | 1.0549 | 1.2813 |
Entropy ↑ | 7.1268 | 6.9883 | 6.7096 | 6.1869 | 6.178 |
Cross Entropy ↓ | 1.964 | 1.6284 | 1.3206 | 1.3869 | 1.053 |
Average Gradient ↑ | 6.1181 | 5.644 | 4.752 | 4.0065 | 3.0561 |
Edge Intensity ↑ | 57.3388 | 55.051 | 46.7207 | 38.9731 | 30.2318 |
Q^(AB/F) ↑ | 0.1414 | 0.3728 | 0.3896 | 0.3121 | 0.2837 |
Chen–Blum ↑ | 0.3485 | 0.4348 | 0.4416 | 0.4524 | 0.4376 |
Chen–Varshney ↓ | 1685.9 | 748 | 767.1 | 839.2 | 802.7 |
Spatial Frequency ↑ | 17.9543 | 14.826 | 12.3796 | 10.1304 | 7.8594 |
Top-1 Count (of 13 metrics) ↑ | 6/13 | 1/13 | 3/13 | 2/13 | 1/13 |
Metric / Dataset (→) | Jesse Hall (Gaussian, ) | Kittler | VIFB | Rolla | CART |
---|---|---|---|---|---|
PSNR ↑ | 61.7586 | 62.2142 | 62.1937 | 62.5957 | 62.3601 |
RMSE ↓ | 0.2668 | 0.2642 | 0.3102 | 0.2812 | 0.2669 |
SSIM ↑ | 1.1752 | 1.3252 | 1.2034 | 0.6392 | 0.9818 |
Variance ↑ | 30.0103 | 40.2556 | 54.4031 | 42.3442 | 48.2999 |
Mutual Information ↑ | 0.7451 | 0.7628 | 0.882 | 0.4608 | 0.7971 |
Entropy ↑ | 6.8014 | 6.7268 | 6.5993 | 6.6426 | 6.6503 |
Cross Entropy ↓ | 1.9155 | 1.964 | 0.7745 | 0.5571 | 0.0302 |
Average Gradient ↑ | 6.9558 | 6.1181 | 6.7735 | 9.9162 | 7.7682 |
Edge Intensity ↑ | 68.2244 | 57.3388 | 67.9655 | 92.069 | 71.5114 |
Q^(AB/F) ↑ | 0.1337 | 0.1414 | 0.102 | 0.1392 | 0.1446 |
Chen–Blum ↑ | 0.3427 | 0.3485 | 0.295 | 0.3935 | 0.3712 |
Chen–Varshney ↓ | 1778.6 | 1685.9 | 2989.1025 | 1235.8 | 2134.1 |
Spatial Frequency ↑ | 19.5658 | 17.9543 | 25.1461 | 28.083 | 22.7736 |
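The two tables above report standard no-reference fusion quality statistics. For readers reproducing the evaluation, the sketch below shows how four of the tabulated quantities (entropy, average gradient, spatial frequency, and edge intensity) are commonly computed from a fused grayscale image. It follows the textbook definitions of these metrics rather than the authors' evaluation code, and the `fused` array in the usage stub is a synthetic placeholder.

```python
import numpy as np
from scipy import ndimage


def entropy(img, bins=256):
    """Shannon entropy (bits) of the grey-level histogram."""
    hist, _ = np.histogram(img, bins=bins, range=(0, 255))
    p = hist.astype(np.float64) / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-np.sum(p * np.log2(p)))


def average_gradient(img):
    """Mean magnitude of horizontal/vertical finite differences."""
    img = img.astype(np.float64)
    gx = np.diff(img, axis=1)[:-1, :]  # crop so gx and gy share the same shape
    gy = np.diff(img, axis=0)[:, :-1]
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2.0)))


def spatial_frequency(img):
    """SF = sqrt(row frequency^2 + column frequency^2)."""
    img = img.astype(np.float64)
    rf = np.sqrt(np.mean(np.diff(img, axis=1) ** 2))  # horizontal differences
    cf = np.sqrt(np.mean(np.diff(img, axis=0) ** 2))  # vertical differences
    return float(np.sqrt(rf ** 2 + cf ** 2))


def edge_intensity(img):
    """Mean Sobel gradient magnitude."""
    img = img.astype(np.float64)
    sx = ndimage.sobel(img, axis=1)
    sy = ndimage.sobel(img, axis=0)
    return float(np.mean(np.sqrt(sx ** 2 + sy ** 2)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    fused = rng.integers(0, 256, size=(256, 256), dtype=np.uint8)  # stand-in for a fused image
    print(entropy(fused), average_gradient(fused),
          spatial_frequency(fused), edge_intensity(fused))
```

Reference-based metrics in the tables (PSNR, RMSE, SSIM, mutual information, and the Chen–Blum and Chen–Varshney measures) additionally require the registered visible and infrared source images and follow the cited definitions [Wang et al.; Qu et al.; Chen–Blum; Chen–Varshney].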
Image Pair (VIFB) | DeepFusion (Ours) Prec. | DeepFusion Re. | DeepFusion F1 | SeAFusion Prec. | SeAFusion Re. | SeAFusion F1 | SwinFusion Prec. | SwinFusion Re. | SwinFusion F1 | U2Fusion Prec. | U2Fusion Re. | U2Fusion F1 |
---|---|---|---|---|---|---|---|---|---|---|---|---|
Car Light | 0.440 | 0.535 | 0.483 | 0.522 | 0.574 | 0.547 | 0.600 | 0.529 | 0.562 | 0.611 | 0.534 | 0.571 |
Car Shadow | 0.656 | 0.623 | 0.639 | 0.681 | 0.695 | 0.688 | 0.714 | 0.567 | 0.632 | 0.716 | 0.690 | 0.703 |
Car White | 0.573 | 0.530 | 0.551 | 0.548 | 0.454 | 0.497 | 0.528 | 0.546 | 0.537 | 0.625 | 0.380 | 0.472 |
Elec Bike | 0.410 | 0.479 | 0.442 | 0.402 | 0.604 | 0.483 | 0.396 | 0.656 | 0.494 | 0.402 | 0.792 | 0.534 |
Fight | 0.782 | 0.751 | 0.766 | 0.716 | 0.744 | 0.730 | 0.812 | 0.666 | 0.732 | 0.627 | 0.714 | 0.668 |
Kettle | 0.359 | 0.435 | 0.393 | 0.438 | 0.584 | 0.501 | 0.447 | 0.547 | 0.492 | 0.365 | 0.435 | 0.397 |
Lab Man | 0.500 | 0.536 | 0.517 | 0.500 | 0.579 | 0.537 | 0.529 | 0.583 | 0.555 | 0.518 | 0.588 | 0.551 |
Man | 0.392 | 0.793 | 0.526 | 0.407 | 0.612 | 0.489 | 0.444 | 0.607 | 0.512 | 0.436 | 0.707 | 0.539 |
Man Call | 0.421 | 0.708 | 0.528 | 0.370 | 0.554 | 0.444 | 0.371 | 0.629 | 0.465 | 0.443 | 0.507 | 0.473 |
Man Car | 0.574 | 0.522 | 0.547 | 0.593 | 0.528 | 0.559 | 0.621 | 0.602 | 0.611 | 0.622 | 0.510 | 0.561 |
Man Light | 0.750 | 0.631 | 0.685 | 0.671 | 0.622 | 0.646 | 0.671 | 0.581 | 0.623 | 0.589 | 0.649 | 0.617 |
Man Walking | 0.457 | 0.540 | 0.495 | 0.410 | 0.530 | 0.462 | 0.455 | 0.630 | 0.528 | 0.478 | 0.560 | 0.516 |
Man with Bag | 0.372 | 0.657 | 0.474 | 0.433 | 0.562 | 0.489 | 0.359 | 0.428 | 0.391 | 0.413 | 0.482 | 0.445 |
Night Car | 0.466 | 0.583 | 0.518 | 0.387 | 0.580 | 0.465 | 0.408 | 0.559 | 0.471 | 0.543 | 0.530 | 0.536 |
People Shadow | 0.736 | 0.670 | 0.694 | 0.735 | 0.610 | 0.667 | 0.661 | 0.598 | 0.628 | 0.607 | 0.662 | 0.635 |
Running | 0.681 | 0.565 | 0.618 | 0.721 | 0.540 | 0.618 | 0.723 | 0.584 | 0.646 | 0.654 | 0.493 | 0.563 |
Snow | 0.625 | 0.532 | 0.573 | 0.520 | 0.530 | 0.525 | 0.542 | 0.635 | 0.585 | 0.426 | 0.425 | 0.425 |
Tricycle | 0.659 | 0.550 | 0.600 | 0.610 | 0.622 | 0.616 | 0.635 | 0.581 | 0.607 | 0.625 | 0.673 | 0.648 |
Walking | 0.690 | 0.645 | 0.667 | 0.588 | 0.561 | 0.574 | 0.551 | 0.609 | 0.579 | 0.630 | 0.640 | 0.635 |
Walking2 | 0.705 | 0.598 | 0.647 | 0.750 | 0.616 | 0.676 | 0.674 | 0.649 | 0.661 | 0.602 | 0.639 | 0.620 |
Walking Night | 0.500 | 0.590 | 0.541 | 0.625 | 0.644 | 0.634 | 0.563 | 0.658 | 0.607 | 0.711 | 0.479 | 0.572 |
Average | 0.559 | 0.594 | 0.576 | 0.554 | 0.588 | 0.564 | 0.557 | 0.592 | 0.572 | 0.554 | 0.575 | 0.561 |
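The precision/recall/F1 values above stem from the keypoint-based quality analysis of Section 3.3. The sketch below illustrates one plausible way such scores can be computed with OpenCV's ORB detector: keypoints found in a registered reference image serve as ground truth, and a keypoint detected in the fused image counts as a true positive when it lies within a small pixel tolerance of an unmatched reference keypoint. The tolerance `tol`, the choice of ORB, the greedy one-to-one assignment, and the file names in the usage stub are illustrative assumptions, not the paper's exact protocol.

```python
import cv2
import numpy as np


def keypoint_prf(reference_img, fused_img, tol=3.0, n_features=500):
    """Precision/recall/F1 of fused-image keypoints against reference-image keypoints.

    Both images are assumed to be grayscale and already registered (same geometry).
    A fused keypoint is a true positive when it falls within `tol` pixels of a
    not-yet-matched reference keypoint (greedy one-to-one assignment).
    """
    orb = cv2.ORB_create(nfeatures=n_features)
    ref_kp = orb.detect(reference_img, None)
    fus_kp = orb.detect(fused_img, None)
    if not ref_kp or not fus_kp:
        return 0.0, 0.0, 0.0

    ref_pts = np.array([k.pt for k in ref_kp], dtype=np.float64)
    matched = np.zeros(len(ref_pts), dtype=bool)
    tp = 0
    for kp in fus_kp:
        d = np.linalg.norm(ref_pts - np.asarray(kp.pt), axis=1)
        d[matched] = np.inf          # each reference keypoint may be matched once
        j = int(np.argmin(d))
        if d[j] <= tol:
            matched[j] = True
            tp += 1

    precision = tp / len(fus_kp)
    recall = tp / len(ref_kp)
    f1 = 2 * precision * recall / max(precision + recall, 1e-12)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical file names; any pair of registered grayscale images will do.
    ref = cv2.imread("visible_registered.png", cv2.IMREAD_GRAYSCALE)
    fused = cv2.imread("fused.png", cv2.IMREAD_GRAYSCALE)
    print(keypoint_prf(ref, fused))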
Share and Cite
Vipparla, C.; Krock, T.; Nouduri, K.; Fraser, J.; AliAkbarpour, H.; Sagan, V.; Cheng, J.-R.C.; Kannappan, P. Fusion of Visible and Infrared Aerial Images from Uncalibrated Sensors Using Wavelet Decomposition and Deep Learning. Sensors 2024, 24, 8217. https://doi.org/10.3390/s24248217