# FuseVis: Interpreting Neural Networks for Image Fusion Using Per-Pixel Saliency Visualization

## Abstract

## 1. Introduction

- We trained several state-of-the-art unsupervised fusion CNNs under the same learning objective. Because understanding fusion black boxes matters in life-critical domains such as medical imaging, we focused on interpreting the trained neural networks specifically for MRI-PET medical image fusion.
- We performed fast computation of per-pixel Jacobian-based saliency maps of the fused image with respect to the input image pairs. To the best of our knowledge, this is the first technique that visualizes fusion networks through their backpropagation heuristics, which makes them more transparent in a real-time setup.
- We constructed guidance images for each input modality from the gradients of each fused pixel with respect to the input pixel at the corresponding location in the input image. We also related the gradient values in each guidance image to the grayscale intensities of the fused image by combining the images in the color channels of an RGB image.
- We computed scatter plots between the gradients of the guidance images, which provide a visual overview of the correlation between the influences of the input modalities. For example, a positive correlation shows that the input modalities influence the fused image equally.
- We developed an interactive graphical user interface (GUI) named FuseVis that combines all the visual interpretation tools efficiently. On mouseover, FuseVis computes the saliency maps in real time for the pixel under the mouse pointer. Our code is available at https://github.com/nish03/FuseVis.
- Finally, we performed clinical case studies on MRI-PET image pairs using our FuseVis tool and visually interpreted the fusion results obtained from several different neural networks. We showed the usefulness of FuseVis in identifying the capability of the evaluated neural networks to solve clinically relevant problems.
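The Jacobian, guidance-image, and scatterplot ideas listed above can be illustrated with a small sketch. It is not the FuseVis implementation: `toy_fuse_pixel` is a hypothetical hand-written fusion rule standing in for a trained network, and the gradients are estimated with central finite differences instead of backpropagation, purely so that the example runs without a deep-learning framework. All function and variable names are assumptions made for illustration.

```python
def toy_fuse_pixel(mri, pet, r, c):
    """Hypothetical fusion rule (an assumption, not a network from the paper)."""
    h, w = len(mri), len(mri[0])
    # PET contributes through the mean of the pixel and its in-bounds 4-neighbours.
    nbrs = [(r, c), (r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
    vals = [pet[i][j] for i, j in nbrs if 0 <= i < h and 0 <= j < w]
    return 0.5 * mri[r][c] ** 2 + 0.5 * sum(vals) / len(vals)


def partial(mri, pet, out_rc, wrt, in_rc, eps=1e-4):
    """Central-difference estimate of d fused[out_rc] / d input[in_rc]."""
    target = mri if wrt == "mri" else pet
    (r, c), (i, j) = out_rc, in_rc
    original = target[i][j]
    target[i][j] = original + eps
    plus = toy_fuse_pixel(mri, pet, r, c)
    target[i][j] = original - eps
    minus = toy_fuse_pixel(mri, pet, r, c)
    target[i][j] = original  # restore the input image
    return (plus - minus) / (2 * eps)


def jacobian_image(mri, pet, out_rc, wrt):
    """Gradient of ONE fused pixel with respect to EVERY pixel of one input."""
    h, w = len(mri), len(mri[0])
    return [[partial(mri, pet, out_rc, wrt, (i, j)) for j in range(w)]
            for i in range(h)]


def guidance_image(mri, pet, wrt):
    """Gradient of each fused pixel with respect to the input pixel at the same location."""
    h, w = len(mri), len(mri[0])
    return [[partial(mri, pet, (i, j), wrt, (i, j)) for j in range(w)]
            for i in range(h)]


def pearson(xs, ys):
    """Pearson correlation between two equally long sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


# Small synthetic 4x4 "images" (made-up values, not clinical data).
mri = [[(4 * i + j) / 15 for j in range(4)] for i in range(4)]
pet = [[((i + j) % 3) / 2 for j in range(4)] for i in range(4)]

jac_mri = jacobian_image(mri, pet, (1, 1), "mri")
jac_pet = jacobian_image(mri, pet, (1, 1), "pet")
guide_mri = guidance_image(mri, pet, "mri")
guide_pet = guidance_image(mri, pet, "pet")

# One scatter point per pixel: (guidance MRI gradient, guidance PET gradient).
flat_mri = [g for row in guide_mri for g in row]
flat_pet = [g for row in guide_pet for g in row]
corr = pearson(flat_mri, flat_pet)
print("correlation between guidance gradients:", corr)
```

In this toy rule the Jacobian MRI image is non-zero only at the selected pixel, while the Jacobian PET image spreads over the 4-neighbourhood, which is the kind of neighbourhood influence the Jacobian images are meant to expose. With a real network, the finite-difference loop would be replaced by automatic differentiation.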

## 2. Related Work

#### 2.1. Classical Image-Fusion Approaches

#### 2.2. Mixture of Classical and Deep Learning-Based Fusion Approaches

#### 2.3. Unsupervised End-to-End Deep Learning-Based Fusion Approaches

#### 2.4. Visualization Techniques

## 3. Method

#### 3.1. Visual Analysis Goals

#### 3.2. Visualization Concepts

#### 3.2.1. Jacobian Images

#### 3.2.2. Guidance Images

#### 3.2.3. Guidance RGB Images

#### 3.2.4. Scatterplot

#### 3.2.5. Gamma Correction

#### 3.3. Overview of FuseVis Tool

## 4. Experimental Setup

#### 4.1. Fusion Networks

#### 4.2. Hardware-Software Setup

#### 4.3. Training Dataset

The MRI-T2 images were N3m MPRAGE sequences, while the PET images were co-registered, averaged, and standardized to a uniform resolution for each subject. We aligned the MRI-PET image pairs using the affine transformation tool of the 3D Slicer registration library.

#### 4.4. Network Architectures

#### 4.5. Loss Function

#### 4.6. Hyperparameter Tuning

## 5. Medical Case Studies

#### 5.1. Glioma and Its Pathological Features

#### 5.2. Clinical Test Examples

#### 5.3. Visualization Requirements

- A fusion approach should help clinicians visualize the extent of hyper-dark PET regions, which resemble a necrotic core with no blood flow, superimposed on the bright anatomical boundary of the whole tumor mass. This information is important for clinicians when estimating how much of the tumor must be resected. For example, in the first and fifth columns of Figure 4, the principal pixel was chosen in this very dark PET region for visual analysis.
- A fusion approach should preserve the very bright PET features, which convey high blood perfusion and normal metabolism in healthy brain tissue, because they help clinicians visualize the regions with high brain activity due to external stimuli at a particular time. For example, in the third and seventh columns of Figure 4, the principal pixel was chosen in the bright PET region for visual analysis.
- A fusion approach should be stable and less sensitive to changes in the input features of a clinically less significant modality. For example, a change in the grayscale MRI intensities within the necrotic core should not strongly influence the fused grayscale intensities, because it might corrupt the clinically important dark PET features. The MaskNet and DeepPedestrian networks, for instance, are less sensitive to changes in the MRI features, as can be seen in the guidance MRI and guidance PET images of these networks in Figure 5.
- A fusion approach should be less sensitive to changes in the grayscale pixel intensities in one sub-region of the glioma (say, the enhancing tumor) when the principal pixel lies in another sub-region (say, the necrotic core). In other words, when the principal pixel is inside a local feature, the neighborhood pixels outside that feature should have a negligible influence on it. For example, fusion methods such as Weighted Averaging and MaskNet have no or very low gradients in the neighborhood pixels outside the very dark PET features resembling the necrotic core, as shown in the Jacobian images in Figure 7.

## 6. Results and Discussion

#### 6.1. Fused Images

#### Summary

#### 6.2. Guidance Images

#### 6.2.1. Weighted Averaging

#### 6.2.2. FunFuseAn

#### 6.2.3. DeepFuse

#### 6.2.4. MaskNet and DeepPedestrian

#### 6.2.5. Summary

#### 6.3. Jacobian Images

#### 6.3.1. Weighted Averaging

#### 6.3.2. FunFuseAn

#### 6.3.3. MaskNet

#### 6.3.4. DeepFuse

#### 6.3.5. DeepPedestrian

#### 6.3.6. Summary

#### 6.4. Scatterplots

#### Summary

#### 6.5. Memory and Frame Rates

## 7. Conclusions

## Author Contributions

## Funding

## Acknowledgments

## Conflicts of Interest

## Appendix A

**Figure 2.** The first and second rows show the effect of the gamma correction on the Jacobian PET and guidance PET images of the MaskNet network, with $\gamma_{corr1}$ and $\gamma_{corr2}$ varying between 0.1 and 2.0.
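To make the role of the gamma parameters in Figure 2 concrete, the following minimal sketch applies a standard power-law gamma correction to a small gradient map. The exact formula used in FuseVis is not given in this section, so the normalization by the largest absolute gradient and the function name `gamma_correct` are assumptions for illustration only.

```python
def gamma_correct(gradients, gamma):
    """Normalize gradient magnitudes to [0, 1], then apply out = value ** gamma.

    Assumption: a plain power-law correction; the tool's own formula may differ.
    """
    peak = max(abs(g) for row in gradients for g in row)
    if peak == 0:
        return [[0.0 for _ in row] for row in gradients]
    return [[(abs(g) / peak) ** gamma for g in row] for row in gradients]


# Made-up gradient values spanning several orders of magnitude.
grads = [[0.0, 0.01], [-0.25, 1.0]]

low = gamma_correct(grads, 0.3)   # gamma < 1 brightens small gradients
high = gamma_correct(grads, 2.0)  # gamma > 1 suppresses small gradients

for name, img in (("gamma=0.3", low), ("gamma=2.0", high)):
    print(name, [[round(v, 4) for v in row] for row in img])
```

Under this assumption, values of gamma below 1 lift weak gradients so that they become visible next to the dominant ones, while values above 1 leave only the strongest gradients bright.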

**Figure 4.** The figure shows the fusion results for each of the fusion methods. The zoomed image of the clinical region of interest is always placed to the right of the unzoomed image.

**Figure 5.** The figure shows the guidance MRI and guidance PET images for each of the fusion methods in the clinical regions of interest. The $\gamma_{corr2}$ was fixed at 0.5 for all the guidance images.

**Figure 7.** The figure shows the Jacobian MRI and Jacobian PET images for each of the fusion methods in the clinical regions of interest. The $\gamma_{corr1}$ was fixed at 0.3 for all the Jacobian images.

**Figure 8.** The figure shows the scatterplots between the gradients of the guidance MRI and guidance PET images for each of the fusion methods. The green scatter points are the gradients for the pixels in the zoomed region of interest.

**Table 1.** The table shows the partial loss values of the trained fusion networks after 200 epochs and the fine-tuned hyperparameter configurations.

| Network | $\lambda$ | $\gamma_{ssim}$ | $\gamma_{\ell_2}$ | $L_{SSIM}^{MRI}$ | $L_{SSIM}^{PET}$ | $L_{\ell_2}^{MRI}$ | $L_{\ell_2}^{PET}$ |
|---|---|---|---|---|---|---|---|
| FunFuseAn | 0.99 | 0.47 | 0.5 | 0.2524 | 0.2147 | 0.0148 | 0.0094 |
| MaskNet | 0.99 | 0.5 | 0.494 | 0.2208 | 0.2236 | 0.0109 | 0.0115 |
| DeepFuse | 0.99 | 0.497 | 0.5 | 0.2824 | 0.2830 | 0.0385 | 0.0338 |
| DeepPedestrian | 0.99 | 0.52 | 0.5 | 0.2164 | 0.2219 | 0.0148 | 0.0175 |

**Table 2.** The table shows the timing and frame rate results for each of the fusion-based neural networks.

| Setting | FunFuseAn | MaskNet | DeepFuse | DeepPedestrian |
|---|---|---|---|---|
| Jacobian computations | 0.003 s | 0.004 s | 0.003 s | 0.005 s |
| FuseVis-Jacobian images | 20 fps | 15 fps | 20 fps | 10 fps |
| Guidance images | 198 s | 265 s | 195 s | 360 s |
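A short back-of-the-envelope calculation helps relate the rows of Table 2 to one another. The sketch below uses only the numbers from the table; the dictionary names are illustrative, and the reading of the ratio as "number of Jacobian evaluations per guidance image" is an inference, not something stated in this section.

```python
# Values copied from Table 2 (seconds and frames per second).
jacobian_seconds = {"FunFuseAn": 0.003, "MaskNet": 0.004,
                    "DeepFuse": 0.003, "DeepPedestrian": 0.005}
guidance_seconds = {"FunFuseAn": 198, "MaskNet": 265,
                    "DeepFuse": 195, "DeepPedestrian": 360}
measured_fps = {"FunFuseAn": 20, "MaskNet": 15,
                "DeepFuse": 20, "DeepPedestrian": 10}

ratios = {}
fps_bound = {}
for name in jacobian_seconds:
    # How many single-pixel Jacobian computations fit into one guidance image.
    ratios[name] = guidance_seconds[name] / jacobian_seconds[name]
    # Frame rate if the Jacobian computation were the only per-frame cost.
    fps_bound[name] = 1.0 / jacobian_seconds[name]
    print(f"{name}: ratio ~ {ratios[name]:.0f}, "
          f"Jacobian-only bound ~ {fps_bound[name]:.0f} fps, "
          f"measured {measured_fps[name]} fps")
```

The ratios fall between roughly 65,000 and 72,000, which would be consistent with one Jacobian evaluation per pixel for an image of about 256 × 256 = 65,536 pixels; the image resolution is not given here, so this remains an assumption. The measured frame rates are also well below the Jacobian-only bound, which suggests that rendering and other per-frame overhead, rather than the gradient computation itself, dominate the interactive frame rate.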

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Kumar, N.; Gumhold, S. FuseVis: Interpreting Neural Networks for Image Fusion Using Per-Pixel Saliency Visualization. *Computers* **2020**, *9*, 98.
https://doi.org/10.3390/computers9040098
