# Wavelet-Based Visible and Infrared Image Fusion: A Comparative Study


## Abstract


## 1. Introduction

## 2. Wavelet-Based Image Fusion

#### 2.1. Discrete Wavelet Transform (DWT)

**Decimated**: In this case the approximation and detail images are downsampled after each decomposition level (in the case of multi-level decomposition), keeping one out of every two rows and columns. As previously mentioned, the wavelet and scaling functions can be viewed as high- and low-pass filters, respectively. Because these filters are one-dimensional, when dealing with 2D images the process consists of applying the two filters first to the rows and then to the columns. Figure 3 illustrates this concatenation of high- (h) and low-pass (l) filters applied to the rows and columns of a given image in order to obtain the approximation (A) and details (HD, VD, DD).
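Assuming Haar filters (pairwise averages as the low-pass, pairwise differences as the high-pass), the rows-then-columns process can be sketched in plain Python. This is a minimal illustration, not the paper's implementation; the names `haar_pair` and `dwt2_haar` are ours.

```python
# One level of a decimated 2D Haar DWT on a grayscale image stored
# as a list of lists. Decimation keeps one out of every two samples.

def haar_pair(signal):
    """Apply the Haar low/high-pass pair to a 1D signal, decimating by two."""
    low  = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    high = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]
    return low, high

def transpose(m):
    return [list(c) for c in zip(*m)]

def dwt2_haar(img):
    """Rows first, then columns: yields the approximation A and the
    horizontal (HD), vertical (VD) and diagonal (DD) detail bands."""
    lows, highs = [], []
    for row in img:                      # filter the rows
        l, h = haar_pair(row)
        lows.append(l)
        highs.append(h)
    A, HD, VD, DD = [], [], [], []
    for col in transpose(lows):          # filter the columns of the low-pass image
        l, h = haar_pair(col)
        A.append(l)
        HD.append(h)
    for col in transpose(highs):         # filter the columns of the high-pass image
        l, h = haar_pair(col)
        VD.append(l)
        DD.append(h)
    return transpose(A), transpose(HD), transpose(VD), transpose(DD)

img = [[10, 10, 20, 20],
       [10, 10, 20, 20],
       [30, 30, 40, 40],
       [30, 30, 40, 40]]
A, HD, VD, DD = dwt2_haar(img)
# Each sub-band is half the size of the input in each dimension.
```

On this piecewise-constant test image all detail bands come out zero and the approximation is a 2 × 2 downsampled copy, which is the expected behavior of the low/high-pass split.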

**Undecimated**: In this case, instead of downsampling the resulting approximation and detail images, the filters are upsampled. This produces approximation and detail images of the same size as the original ones but with half the resolution. When computing the inverse transform, the filters are downsampled accordingly.
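The key property of the undecimated case, that the outputs keep the input size, can be shown with a one-level 1D Haar sketch. This is illustrative only; the circular boundary handling and the function name are our assumptions.

```python
# One level of an undecimated (shift-invariant) Haar decomposition of a
# 1D signal: no sample is discarded, so approximation and detail have
# the same length as the input. Boundaries simply wrap around here.

def undecimated_haar(signal):
    n = len(signal)
    approx = [(signal[i] + signal[(i + 1) % n]) / 2 for i in range(n)]
    detail = [(signal[i] - signal[(i + 1) % n]) / 2 for i in range(n)]
    return approx, detail

s = [4, 8, 6, 2]
a, d = undecimated_haar(s)
# len(a) == len(d) == len(s), unlike the decimated case,
# where each level halves the number of samples.
```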

**Non-separated**: The bias toward horizontally and vertically oriented features is due to the fact that rows and columns are filtered separately. A solution to this problem is to use a two-dimensional wavelet filter derived from the scaling function. This results in an approximation image obtained from the filtering and a single detail image obtained as the difference between the original image and the approximation image. The results are similar to those obtained with the undecimated approach, in the sense that the resolution decreases with each level of decomposition because the filter is upsampled.

#### 2.2. Fusion Strategies

**Substitutive wavelet fusion**: in this scheme the information from one image is completely replaced with information from the other image. In other words, the approximation from one image is merged with the details of the other image. In the current work the two possible configurations have been considered: (${A}_{VS},{D}_{IR}$) and (${A}_{IR},{D}_{VS}$). Once the information is merged, the inverse transform is computed to obtain ${I}_{F}$.

**Additive wavelet fusion**: as indicated by the name, in this scheme the approximations from one image are added to those of the other; the same happens for the detail information. If multiple decompositions were applied, the details at each resolution level are added. Finally, after merging the information, the inverse transform is performed, resulting in the sought ${I}_{F}$. In our implementation this scheme considers the mean value of the coefficients instead of just the result of the addition.

**Weighted models**: in this scheme a user-tuned merging strategy is applied. Depending on the application and the kind of input images, approximations and details are combined according to some statistical values ($\mu ,\sigma $) or to some other relevant criteria. In the current work, since the input images are of the same resolution and we intend to evaluate the performance of DWT-based fusion of infrared and visible images in a general way, this scheme is not considered.
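The substitutive and additive schemes can be sketched as simple operations on the coefficient bands. The inputs below stand for the approximation (A) and one detail band (D) of each source image as flat lists; the inverse DWT that would follow is omitted, and the function names are ours.

```python
# Substitutive fusion: the (A_VS, D_IR) configuration keeps the visible
# approximation and replaces the details with the infrared ones.
def substitutive(A_vs, D_vs, A_ir, D_ir):
    return A_vs, D_ir

# Additive fusion, in the averaged variant: mean of corresponding
# approximation coefficients and mean of corresponding detail coefficients.
def additive(A_vs, D_vs, A_ir, D_ir):
    A_f = [(a + b) / 2 for a, b in zip(A_vs, A_ir)]
    D_f = [(a + b) / 2 for a, b in zip(D_vs, D_ir)]
    return A_f, D_f

A_vs, D_vs = [10, 20], [1, -1]
A_ir, D_ir = [30, 40], [3, 5]
print(substitutive(A_vs, D_vs, A_ir, D_ir))  # ([10, 20], [3, 5])
print(additive(A_vs, D_vs, A_ir, D_ir))      # ([20.0, 30.0], [2.0, 2.0])
```

The min/max/rand strategies listed later in the experimental setup replace the averaging line with the corresponding element-wise selection.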

## 3. Evaluation Metrics

**Fused Peak Signal to Noise Ratio (FPSNR)**: is based on the widely used PSNR metric, which is computed from the ratio between the square of the number of gray levels (L) in the image and the mean squared error (MSE) between the intensity values of the fused image and the reference one. In our case, since there is no reference image, this value is computed twice, once with the visible image and once with the infrared image used as reference. Then, the average value is considered:

$$FPSNR=\frac{1}{2}\left(10\log_{10}\frac{L^{2}}{MSE({I}_{F},{I}_{VS})}+10\log_{10}\frac{L^{2}}{MSE({I}_{F},{I}_{IR})}\right)$$
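The twice-computed-then-averaged PSNR can be sketched as follows, assuming 8-bit images (L = 256 gray levels) given as flat intensity lists. Whether the peak term uses L or L − 1 depends on convention, so treat the constant as an assumption.

```python
import math

def mse(x, y):
    """Mean squared error between two equally sized intensity lists."""
    return sum((a - b) ** 2 for a, b in zip(x, y)) / len(x)

def psnr(fused, ref, levels=256):
    """Standard PSNR against a reference image, in dB."""
    return 10 * math.log10(levels ** 2 / mse(fused, ref))

def fpsnr(fused, visible, infrared):
    """No reference image exists, so PSNR is computed against each
    input image and the two values are averaged."""
    return (psnr(fused, visible) + psnr(fused, infrared)) / 2

vs = [100, 120, 130, 140]
ir = [110, 130, 120, 150]
fused = [105, 125, 125, 145]
score = fpsnr(fused, vs, ir)  # roughly 34.2 dB for this toy data
```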

**Fused Mutual Information (FMI)**: has been proposed in [29] and later improved in [22], where a faster approach is presented (the acronym FMI in the original paper refers to Feature Mutual Information, but here we adapt it to our notation). This metric evaluates the performance of the fusion algorithm by measuring the amount of information carried from the source images to the fused image by means of mutual information (MI). MI measures the degree of dependency between two variables A and B by computing the distance between the joint distribution ${p}_{AB}(a,b)$ and the distribution associated with the case of complete independence, ${p}_{A}(a)\cdot {p}_{B}(b)$, by means of the relative entropy (see [29] for more details):

$$MI(A,B)=\sum_{a,b}{p}_{AB}(a,b)\log\frac{{p}_{AB}(a,b)}{{p}_{A}(a)\,{p}_{B}(b)}$$
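The underlying MI computation can be sketched over joint intensity histograms. Note that the actual FMI metric of [22,29] operates on image features (e.g., gradients) rather than raw intensities, so this only illustrates the relative-entropy measure itself.

```python
import math
from collections import Counter

def mutual_information(x, y):
    """Mutual information (in bits) between two equally sized intensity
    lists, estimated from their joint and marginal histograms."""
    n = len(x)
    p_xy = Counter(zip(x, y))   # joint histogram
    p_x = Counter(x)            # marginal histograms
    p_y = Counter(y)
    mi = 0.0
    for (a, b), count in p_xy.items():
        p_ab = count / n
        mi += p_ab * math.log2(p_ab / ((p_x[a] / n) * (p_y[b] / n)))
    return mi

a = [0, 0, 1, 1]
b = [0, 0, 1, 1]   # identical to a: MI equals the entropy of a (1 bit here)
c = [0, 1, 0, 1]   # independent of a: MI is 0
mi_ab = mutual_information(a, b)
mi_ac = mutual_information(a, c)
```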

**Fused Structural Similarity (FSS)**: is based on the work presented in [30]; the structural similarity between two images ${I}_{1}$ and ${I}_{2}$ is defined as $SSIM({I}_{1},{I}_{2})={\displaystyle \frac{1}{N}}{\sum}_{j=1}^{N}SSIM({a}_{j},{b}_{j})$, where ${a}_{j}$ and ${b}_{j}$ are the corresponding local windows of the two images and SSIM is the Structural SIMilarity index proposed in [21]. Hence, the FSS is computed as:

$$FSS=\frac{1}{2}\left(SSIM({I}_{F},{I}_{VS})+SSIM({I}_{F},{I}_{IR})\right)$$
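A global SSIM between two flat intensity lists can be sketched as below. SSIM as used in [21,30] is computed over local windows and averaged; computing it over the whole image at once, as here, is a deliberate simplification, and the constants C1, C2 follow the customary choice for 8-bit data.

```python
import math

def ssim_global(x, y, L=255):
    """Single-window SSIM over two equally sized intensity lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n                       # means
    vx = sum((a - mx) ** 2 for a in x) / n                # variances
    vy = sum((b - my) ** 2 for b in y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2             # stability constants
    return ((2 * mx * my + c1) * (2 * cov + c2)) / \
           ((mx ** 2 + my ** 2 + c1) * (vx + vy + c2))

def fss(fused, visible, infrared):
    """Average structural similarity of the fused image against each input."""
    return (ssim_global(fused, visible) + ssim_global(fused, infrared)) / 2

img = [100, 120, 130, 140]
# An image is maximally similar to itself.
assert abs(ssim_global(img, img) - 1.0) < 1e-9
```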

**Fused S-CIELAB (FS-CIELAB)**: is based on the spatial CIELAB (S-CIELAB) approach presented in [31]. Although this approach was originally proposed for measuring color reproduction errors in digital images, it has also been used for measuring fusion results in color images [32]. It is computed as:

## 4. Experimental Results

(1) **One level of decomposition** is enough for the fusion of images; even though in some cases another level may perform similarly or slightly better, the very small difference in the measured value does not justify the use of further decomposition levels.

(2) The **reverse biorthogonal wavelet family** is the one that appears most often in the set of best configurations, independently of the metric selected for the evaluation; from the reverse biorthogonal family, rbio5.5 was the best one. Surprisingly, when counting the number of times each family appears in the worst configurations (we carried out a similar study with the worst 3% of configurations), the reverse biorthogonal family also appears most often. This behavior can be easily understood in combination with the next point (selection of fusion strategy).

(3) Regarding the fusion strategy, **the approximation weighs much more than the details**, as expected, and the best approximation strategy varies according to the metric selected for evaluating the results. For FS-CIELAB and FSS, the mean of both approximation images (NIR and RGB) was always the best selection; for FPSNR, it was distributed almost evenly between the minimum and the maximum of both approximation images; in other words, independently of the selection (min or max), a good result is obtained. Finally, for FMI, the minimum was always the best choice. The worst configurations correspond to the random selection of approximation coefficients, which leads to the conclusion that this is what really makes a configuration score poorly: when coefficients are randomly selected, the performance is always bad for every metric, independently of the selected wavelet family. In summary, the reverse biorthogonal wavelet family is the best option for decomposing the images, independently of the metric selected for the evaluation; regarding the fusion strategy, there is a correlation between the best option and the selected evaluation metric, as indicated above.

## 5. Conclusions

## Acknowledgments

## Author Contributions

## Conflicts of Interest

## References

- Cui, G.; Feng, H.; Xu, Z.; Li, Q.; Chen, Y. Detail preserved fusion of visible and infrared images using regional saliency extraction and multi-scale image decomposition. Opt. Commun. **2015**, 341, 199–209.
- Dong, L.; Yang, Q.; Wu, H.; Xiao, H.; Xu, M. High quality multi-spectral and panchromatic image fusion technologies based on Curvelet transform. Neurocomputing **2015**, 159, 268–274.
- Du, P.; Liu, S.; Xia, J.; Zhao, Y. Information fusion techniques for change detection from multi-temporal remote sensing images. Inf. Fusion **2013**, 14, 19–27.
- Menze, B.; Ur, J. Multitemporal Fusion for the Detection of Static Spatial Patterns in Multispectral Satellite Images—With Application to Archaeological Survey. IEEE J. Sel. Top. Appl. Earth Observ. Remote Sens. **2014**, 7, 3513–3524.
- Ricaurte, P.; Chilán, C.; Aguilera-Carrasco, C.A.; Vintimilla, B.X.; Sappa, A.D. Feature point descriptors: Infrared and visible spectra. Sensors **2014**, 14, 3690–3701.
- Li, Y.; Shi, X.; Wei, L.; Zou, J.; Chen, F. Assigning Main Orientation to an EOH Descriptor on Multispectral Images. Sensors **2015**, 15, 15595–15610.
- Aguilera, C.; Barrera, F.; Lumbreras, F.; Sappa, A.D.; Toledo, R. Multispectral image feature points. Sensors **2012**, 12, 12661–12672.
- Choi, Y.; Kim, N.; Park, K.; Hwang, S.; Yoon, J.; Kweon, I. All-Day Visual Place Recognition: Benchmark Dataset and Baseline. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), Boston, MA, USA, 8–10 June 2015.
- Denman, S.; Lamb, T.; Fookes, C.; Chandran, V.; Sridharan, S. Multi-spectral fusion for surveillance systems. Comp. Electr. Eng. **2010**, 36, 643–663.
- Shah, P.; Reddy, B.C.S.; Merchant, S.N.; Desai, U.B. Context enhancement to reveal a camouflaged target and to assist target localization by fusion of multispectral surveillance videos. Signal Image Video Process. **2013**, 7, 537–552.
- Bourlai, T.; Kalka, N.; Ross, A.; Cukic, B.; Hornak, L. Cross-spectral face verification in the short wave infrared (SWIR) band. In Proceedings of the 20th International Conference on IEEE Pattern Recognition (ICPR), Istanbul, Turkey, 23–26 August 2010; pp. 1343–1347.
- Borrmann, D.; Nüchter, A.; Ðakulović, M.; Maurović, I.; Petrović, I.; Osmanković, D.; Velagić, J. A mobile robot based system for fully automated thermal 3D mapping. Adv. Eng. Inform. **2014**, 28, 425–440.
- Vidas, S.; Moghadam, P.; Bosse, M. 3D thermal mapping of building interiors using an RGB-D and thermal camera. In Proceedings of the 2013 IEEE International Conference on Robotics and Automation (ICRA), Karlsruhe, Germany, 6–10 May 2013; pp. 2311–2318.
- Poujol, J.; Aguilera, C.; Danos, E.; Vintimilla, B.; Toledo, R.; Sappa, A.D. Visible-Thermal Fusion based Monocular Visual Odometry. In ROBOT’2015: Second Iberian Robotics Conference; Springer-Verlag: Lisbon, Portugal, 2015; pp. 517–528.
- Cui, G.; Feng, H.; Xu, Z.; Li, Q.; Chen, Y. Detail preserved fusion of visible and infrared images using regional saliency extraction and multi-scale image decomposition. Opt. Commun. **2015**, 341, 199–209.
- Wang, Z.; Ziou, D.; Armenakis, C.; Li, D.; Li, Q. A comparative analysis of image fusion methods. IEEE Trans. Geosci. Remote Sens. **2005**, 43, 1391–1402.
- Gharbia, R.; Azar, A.T.; Baz, A.H.E.; Hassanien, A.E. Image Fusion Techniques in Remote Sensing. CoRR **2014**, abs/1403.5473.
- RGB-NIR color image fusion: Metric and psychophysical experiments. In Proceedings of the Image Quality and System Performance XII, San Francisco, CA, USA, 8 February 2015; Volume 9396.
- Jagalingam, P.; Hegde, A.V. A Review of Quality Metrics for Fused Image. Aquat. Procedia **2015**, 4, 133–142.
- Yang, C.; Zhang, J.Q.; Wang, X.R.; Liu, X. A novel similarity based quality metric for image fusion. Inf. Fusion **2008**, 9, 156–160.
- Zhou, W.; Bovik, A.; Sheikh, H.; Simoncelli, E. Image quality assessment: From error visibility to structural similarity. IEEE Trans. Image Process. **2004**, 13, 600–612.
- Haghighat, M.; Razian, M. Fast-FMI: Non-reference image fusion metric. In Proceedings of the IEEE 8th International Conference on Application of Information and Communication Technologies, Astana, Kazakhstan, 15–17 October 2014; pp. 1–3.
- Haghighat, M.B.A.; Aghagolzadeh, A.; Seyedarabi, H. A non-reference image fusion metric based on mutual information of image features. Comput. Electr. Eng. **2011**, 37, 744–756.
- Lang, M.; Guo, H.; Odegard, J.E.; Burrus, C.S.; Wells, R., Jr. Noise reduction using an undecimated discrete wavelet transform. IEEE Signal Process. Lett. **1996**, 3, 10–12.
- Chang, T.; Kuo, C.J. Texture analysis and classification with tree-structured wavelet transform. IEEE Trans. Image Process. **1993**, 2, 429–441.
- Amolins, K.; Zhang, Y.; Dare, P. Wavelet based image fusion techniques—An introduction, review and comparison. ISPRS J. Photogramm. Remote Sens. **2007**, 62, 249–263.
- González-Audícana, M.; Otazu, X.; Fors, O.; Seco, A. Comparison between Mallat’s and the Atrous’ discrete wavelet transform based algorithms for the fusion of multispectral and panchromatic images. Int. J. Remote Sens. **2005**, 26, 595–614.
- Mehra, I.; Nishchal, N.K. Wavelet-based image fusion for securing multiple images through asymmetric keys. Opt. Commun. **2015**, 335, 153–160.
- Haghighat, M.B.A.; Aghagolzadeh, A.; Seyedarabi, H. A non-reference image fusion metric based on mutual information of image features. Comput. Electr. Eng. **2011**, 37, 744–756.
- Xu, W.; Mulligan, J. Performance evaluation of color correction approaches for automatic multi-view image and video stitching. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), San Francisco, CA, USA, 13–18 June 2010; pp. 263–270.
- Zhang, X.; Wandell, B.A. A spatial extension of CIELAB for digital color-image reproduction. J. Soc. Inf. Disp. **1997**, 5, 61–63.
- Oliveira, M.; Sappa, A.D.; Santos, V. A Probabilistic Approach for Color Correction in Image Mosaicking Applications. IEEE Trans. Image Process. **2015**, 24, 508–523.
- Brown, M.; Süsstrunk, S. Multispectral SIFT for Scene Category Recognition. In Proceedings of the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), Colorado Springs, CO, USA, 21–23 June 2011; pp. 177–184.

**Figure 1.** (**Left**) Pair of images (VS-IR) to be fused; (**Right**) DWT decompositions (one level) of the input images.

**Figure 3.** Two-dimensional wavelet decomposition scheme (l: low pass filter; h: high pass filter; dec: decimation).

**Figure 4.** Results sorted according to the metric used for the evaluation (note that FS-CIELAB is a dissimilarity measure, meaning that the smaller the score, the better the fusion quality).

**Figure 7.** Four pairs of images from the subset used for validation: (**Top**) Visible spectrum images; (**Bottom**) NIR images.

**Figure 8.** Three pairs of cross-spectral images of the same scene at different times of day (images from [8]): (**Left**) VS; (**Right**) LWIR.

Variable | Comments | Values |
---|---|---|
Wavelet family | Family of wavelet used for both DWT and I-DWT | Haar, Daubechies, Symlets, Coiflets, Biorthogonal, Reverse Biorthogonal, Discrete Meyer Approx. |
Level | Level of decomposition | 1, 2 and 3 |
Fusion strategy (approx.) | Strategy used to merge approximation coefficients from both images | mean, max, min, rand |
Fusion strategy (details) | Strategy used to merge detail coefficients from both images | mean, max, min, rand |

Wavelet Name | Comments | Setups |
---|---|---|
Haar (haar) | Orthogonal wavelet with linear phase. | haar |
Daubechies (dbN) | Daubechies’ extremal phase wavelets. N refers to the number of vanishing moments. | db1, db2, ..., db8 |
Symlets (symN) | Daubechies’ least asymmetric wavelets. N refers to the number of vanishing moments. | sym2, sym3, ..., sym8 |
Coiflets (coifN) | In this family, N is the number of vanishing moments for both the wavelet and the scaling function. | coif1, coif2, ..., coif5 |
Biorthogonal (biorNr.Nd) | Biorthogonal wavelets with linear phase. They feature a pair of scaling functions (with associated wavelet filters), one for decomposition and one for reconstruction, which can have different numbers of vanishing moments; Nr and Nd represent these numbers, respectively. | bior1.1, bior1.3, bior1.5, bior2.2, bior2.4, bior2.6, bior2.8, bior3.1, bior3.3, bior3.5, bior3.7, bior3.9, bior4.4, bior5.5, bior6.8 |
Reverse Biorthogonal (rbioNr.Nd) | Reverse of the biorthogonal wavelets explained above. | rbio1.1, rbio1.3, rbio1.5, rbio2.2, rbio2.4, rbio2.6, rbio2.8, rbio3.1, rbio3.3, rbio3.5, rbio3.7, rbio3.9, rbio4.4, rbio5.5, rbio6.8 |
Discrete Meyer Approximation (dmey) | Approximation of the Meyer wavelet leading to FIR filters that can be used in the DWT. | dmey |

**Table 3.** Performance decrease (percentage) with respect to the best one according to the four evaluation metrics (see Figure 4).

 | 3% Best FPSNR | 3% Best FMI | 3% Best FSS | 3% Best FS-CIELAB |
---|---|---|---|---|
FPSNR | 0.26% | 1.17% | 16.79% | 16.97% |
FMI | 2.96% | 1.05% | 2.24% | 3.24% |
FSS | 6.39% | 6.46% | 0.04% | 0.17% |
FS-CIELAB | 2.37% | 1.52% | 0.008% | 0.006% |

**Table 4.** Best setups according to the evaluation metric for the pair of cross-spectral images presented in Figure 8.

Day-Time | Evaluation Metric | Wavelet Family | Level | Fusion Strategy (approx. coef.) | Fusion Strategy (details coef.) |
---|---|---|---|---|---|
Figure 8 (Top) | FPSNR | bior5.5 | 1 | min | max |
Figure 8 (Middle) | FPSNR | bior5.5 | 1 | mean | mean |
Figure 8 (Bottom) | FPSNR | bior5.5 | 1 | min | max |
Figure 8 (Top) | FMI | rbio2.8 | 1 | min | mean |
Figure 8 (Middle) | FMI | rbio2.8 | 1 | mean | mean |
Figure 8 (Bottom) | FMI | rbio2.8 | 1 | min | max |

© 2016 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC-BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Sappa, A.D.; Carvajal, J.A.; Aguilera, C.A.; Oliveira, M.; Romero, D.; Vintimilla, B.X.
Wavelet-Based Visible and Infrared Image Fusion: A Comparative Study. *Sensors* **2016**, *16*, 861.
https://doi.org/10.3390/s16060861
