# Interactive Visual Analysis of Mass Spectrometry Imaging Data Using Linear and Non-Linear Embeddings


## Abstract


## 1. Introduction

- the detection of image regions based on both the relevant information contained in the spectra and the spatial distribution in the imaging space, and
- the analysis of the detected image regions, comparing different regions with respect to which compounds are decisive for forming them and for contrasting them against other regions.

## 2. Background

## 3. Requirement Analysis

**Data description.** Using a mathematical notation, the given data can be described as a field with a 2D domain $D\subset {\mathbb{R}}^{2}$ and a range $R\subset {\mathbb{R}}^{n}$ that consists of spectra. The field is sampled equidistantly in both spatial dimensions. At each sample ${\mathbf{p}}_{ij}$ for $(i,j)\in \{1,\dots ,{r}_{x}\}\times \{1,\dots ,{r}_{y}\}$, where ${r}_{x}$ and ${r}_{y}$ describe the spatial resolutions, a spectrum $\mathbf{s}\left({\mathbf{p}}_{ij}\right)\in R$ is given by an $n$-dimensional vector of scalar values, which represent the peak intensities. As additional meta information, we hold the $m/z$-ratios of the $n$ selected peaks. The dimensionality $n$ of the range is commonly in the hundreds.
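The data layout described above can be illustrated with a small sketch. It is not taken from the authors' system: the helper names (`make_field`, `flatten`, `pixel_of`), the 0-based indexing, the row-major flattening, and the toy $m/z$ values are assumptions made for illustration only.

```python
from typing import List, Tuple

Spectrum = List[float]  # n peak intensities for one pixel


def make_field(r_x: int, r_y: int, n: int) -> List[List[Spectrum]]:
    """Allocate an r_x-by-r_y grid whose entries are n-dimensional spectra (all zeros)."""
    return [[[0.0] * n for _ in range(r_y)] for _ in range(r_x)]


def flatten(field: List[List[Spectrum]]) -> List[Spectrum]:
    """List of all spectra; pixel (i, j) (0-based) lands at flat index i * r_y + j."""
    return [spectrum for column in field for spectrum in column]


def pixel_of(index: int, r_y: int) -> Tuple[int, int]:
    """Inverse of the flattening: recover the 0-based (i, j) of a flat index."""
    return divmod(index, r_y)


# Toy example: a 3 x 2 image with n = 4 peaks per pixel.
field = make_field(3, 2, 4)
field[2][1] = [0.1, 0.0, 3.5, 0.7]   # spectrum s(p_ij) at 0-based (i, j) = (2, 1)
mz = [700.5, 760.6, 810.6, 846.6]    # made-up m/z-ratios, one per peak (meta information)
points = flatten(field)              # r_x * r_y points in the n-dimensional spectral space
```

The flat list is the form in which the spectra would be handed to an embedding method, while `pixel_of` keeps the link back to image space.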

**Analysis tasks.** A requirement analysis was performed with the data providers (co-authors of the paper), who are also the target users of the interactive visual analysis system.

- (T1) Which image regions form homogeneous areas with respect to the entire spectral information?
- (T2) Which peaks are most descriptive to distinguish image regions?
- (T3) What are the distributions of descriptive peaks within an image region?
- (T4) How do the spectra/descriptive peaks compare for different image regions?

## 4. Related Work

## 5. Methodology

#### 5.1. Interactive Cluster Generation

**Non-linear embedding.** For the visual encoding of the spectral space, the peaks can be interpreted as spectral dimensions, i.e., each MSI pixel ${\mathbf{p}}_{ij}$ can be represented as a point $\mathbf{s}\left({\mathbf{p}}_{ij}\right)$ in an $n$-dimensional spectral space, where $n$ is the number of considered peaks. A typical pre-processing step before computing lower-dimensional embeddings is a normalization of the dimensions [38], which we also perform here.
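As an illustration of the normalization step, the following sketch rescales each spectral dimension independently. The text does not state which normalization is used, so min-max scaling to $[0,1]$ is an assumption here, and `normalize_dimensions` is a hypothetical helper name.

```python
from typing import List


def normalize_dimensions(points: List[List[float]]) -> List[List[float]]:
    """Rescale every spectral dimension (peak) to [0, 1] across all pixels.

    Dimensions with constant intensity are mapped to 0.0 to avoid division by zero.
    """
    n = len(points[0])
    lows = [min(p[k] for p in points) for k in range(n)]
    highs = [max(p[k] for p in points) for k in range(n)]
    return [
        [
            (p[k] - lows[k]) / (highs[k] - lows[k]) if highs[k] > lows[k] else 0.0
            for k in range(n)
        ]
        for p in points
    ]


# Three pixels with three peaks each; the third peak is constant.
spectra = [[10.0, 0.2, 5.0], [30.0, 0.4, 5.0], [20.0, 0.8, 5.0]]
scaled = normalize_dimensions(spectra)
```

After such a rescaling, peaks with generally high intensities no longer dominate the distances on which the embedding is based.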

**Coordinated interaction with image-space view.** Defining clusters in the 2D embedding considers only spectral information. To relate the clusters back to the image, we use coordinated views. For the image-space visualization, we simply highlight selections made in the 2D embedding by color-coding the respective pixels at their locations in image-space. For example, the yellow cluster selected in Figure 4a relates to the yellow pixels in Figure 4c. Similarly, multiple cluster selections are shown in image-space by assigning different colors to different clusters, see Figure 3c,d.
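The coordination between the two views amounts to mapping selected point indices back to pixel locations. A minimal sketch, assuming that the point with flat index $k$ belongs to the 0-based pixel $(i,j)$ with $k = i\cdot r_y + j$; the function name `color_image` and the background color are hypothetical.

```python
from typing import Dict, Iterable, List, Tuple

Color = Tuple[int, int, int]  # RGB


def color_image(
    r_x: int,
    r_y: int,
    clusters: Dict[Color, Iterable[int]],
    background: Color = (0, 0, 0),
) -> List[List[Color]]:
    """Color each selected pixel with the color of the cluster it was assigned to.

    `clusters` maps a cluster color to the flat indices of the points selected
    in the 2D embedding; unselected pixels keep the background color.
    """
    image = [[background] * r_y for _ in range(r_x)]
    for color, indices in clusters.items():
        for index in indices:
            i, j = divmod(index, r_y)
            image[i][j] = color
    return image


yellow, red = (255, 255, 0), (255, 0, 0)
image = color_image(2, 3, {yellow: [0, 4], red: [5]})
```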

**Integrated view.** Another observation we can make from Figure 4 is that the selected cluster actually forms multiple image regions. We could use the lasso tool in the coordinated image-space view to separate them. However, we also propose an integrated view that allows for splitting the cluster in the point-based view, which may allow for more efficient and intuitive interactions. The integrated view uses both spectral and image information. It uses a 3D layout, where two dimensions are used for the 2D embedding and the third dimension reflects the proximity of pixels in image-space.
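One way to obtain a third coordinate that reflects image-space proximity is the position of a pixel along a space-filling curve, as mentioned in the caption of Figure 4. The sketch below assumes a Hilbert curve on a square grid whose side length is a power of two; the choice of curve, the grid restriction, and the names `hilbert_index` and `integrated_points` are assumptions made for illustration.

```python
from typing import List, Tuple


def hilbert_index(side: int, x: int, y: int) -> int:
    """Position of pixel (x, y) along a Hilbert curve covering a side-by-side grid.

    `side` must be a power of two; x and y are 0-based pixel coordinates.
    """
    d = 0
    s = side // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate/flip the quadrant so that the sub-curve is oriented consistently.
        if ry == 0:
            if rx == 1:
                x, y = side - 1 - x, side - 1 - y
            x, y = y, x
        s //= 2
    return d


def integrated_points(
    embedding: List[Tuple[float, float]],
    pixels: List[Tuple[int, int]],
    side: int,
) -> List[Tuple[float, float, float]]:
    """Combine 2D embedding coordinates (u, v) with the curve index as third coordinate."""
    return [
        (u, v, float(hilbert_index(side, x, y)))
        for (u, v), (x, y) in zip(embedding, pixels)
    ]
```

Pixels that are consecutive along the curve are always adjacent in the image, so a connected image region tends to occupy contiguous intervals along the third axis.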

#### 5.2. Cluster Analysis

**Linear embedding.** The goal was to find a linear embedding that best separates the clusters of a given labeled multi-dimensional data set. Automatic approaches exist, such as variants of linear discriminant analysis (LDA), e.g., [50], but we are interested in a user-centric approach, where the user can interactively select which clusters to separate, possibly giving the separation of one cluster more weight than the separation of others, possibly focusing only on two clusters, or possibly focusing on one cluster against the rest. Hence, we want to provide an interactive cluster separation approach.
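A linear embedding into 2D can be written as one 2D axis vector per spectral dimension, which is also the star-coordinates reading used in Figure 7. The following sketch shows only the projection and the ranking of dimensions by axis length; how the axis vectors are updated from the interactive control points is not covered, and the names `project` and `rank_dimensions` as well as the example axes are hypothetical.

```python
import math
from typing import List, Tuple

Vec2 = Tuple[float, float]


def project(point: List[float], axes: List[Vec2]) -> Vec2:
    """Linear projection: sum over dimensions of intensity times 2D axis vector."""
    u = sum(x * a[0] for x, a in zip(point, axes))
    v = sum(x * a[1] for x, a in zip(point, axes))
    return (u, v)


def rank_dimensions(axes: List[Vec2]) -> List[int]:
    """Dimension indices sorted by decreasing axis-vector length.

    Long axis vectors mark dimensions that contribute most to the layout.
    """
    return sorted(range(len(axes)), key=lambda k: math.hypot(*axes[k]), reverse=True)


# Three spectral dimensions with made-up axis vectors.
axes = [(1.0, 0.0), (0.0, 0.1), (-0.5, 2.0)]
embedded = project([2.0, 10.0, 1.0], axes)
ranking = rank_dimensions(axes)
```

In this toy configuration, the third dimension has the longest axis vector and would be the first candidate for a discriminant peak.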

**Statistical graphics.** The selected peaks can be further investigated using statistical plots. Since we have detected the peaks that best discriminate a cluster of interest, we are interested in investigating the intensity values of the detected peaks for the cluster (Task T3) and comparing them against other clusters or against the rest (Task T4). A widely used approach for such comparisons of groups is to use statistical plots such as boxplots. Boxplots summarize the distribution of the intensity values of a group by revealing the main statistical values, including the median, the interquartile range, and the min–max range in the form of whiskers. We propose to have one statistical plot with juxtaposed boxplots for each of the selected peaks in the given order, see Figure 9. For the boxplots, we report the original intensities, i.e., without normalization, since they additionally show whether values are generally high for an $m/z$-ratio or not.
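The statistics behind such juxtaposed boxplots can be sketched as follows. The quartile convention (the "inclusive" method of Python's `statistics.quantiles`) is an assumption, as are the function names and the toy intensities; the whiskers span the full min–max range as described above.

```python
import statistics
from typing import Dict, List


def boxplot_summary(values: List[float]) -> Dict[str, float]:
    """Median, quartiles, interquartile range, and min-max whiskers of one group."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
        "iqr": q3 - q1,
    }


def juxtaposed_summaries(
    intensities: Dict[str, Dict[str, List[float]]]
) -> Dict[str, Dict[str, Dict[str, float]]]:
    """One summary per (peak, cluster); input is intensities[peak][cluster]."""
    return {
        peak: {cluster: boxplot_summary(vals) for cluster, vals in groups.items()}
        for peak, groups in intensities.items()
    }


# Made-up raw (non-normalized) intensities of one peak in two clusters.
data = {"peak A": {"red": [5.0, 1.0, 4.0, 2.0, 3.0], "yellow": [10.0, 30.0, 20.0]}}
summaries = juxtaposed_summaries(data)
```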

**Image-space visualization.** Having identified peaks that allow for distinguishing selected image regions, it is of interest to visualize the intensity distributions of these peaks in image space. This allows for validating the peak selection and observing where in image space intensities differ. For the image-space visualization of the intensity distribution of a selected peak, we use a typical color mapping of the intensities. Different color maps can be chosen, such as the multi-hue luminance color map shown in Figure 10c. Using this color map, Figure 10a,b shows examples for the selected peaks. We can observe that these peaks indeed exhibit regional differences. Moreover, we observe that the two peaks exhibit different intensity patterns.
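A color mapping of this kind can be sketched as a piecewise-linear interpolation through a few color stops. The stops below are placeholders and not the colors of the map in Figure 10c; `colormap` and `intensity_to_color` are hypothetical helper names.

```python
from typing import List, Tuple

Color = Tuple[int, int, int]  # RGB

# Placeholder stops from dark to bright with changing hue (assumed, not from the paper).
STOPS: List[Color] = [(0, 0, 64), (0, 128, 128), (255, 255, 128)]


def colormap(t: float, stops: List[Color] = STOPS) -> Color:
    """Piecewise-linear interpolation through evenly spaced RGB stops, t in [0, 1]."""
    t = min(max(t, 0.0), 1.0)
    pos = t * (len(stops) - 1)
    k = min(int(pos), len(stops) - 2)
    f = pos - k
    return tuple(round(a + f * (b - a)) for a, b in zip(stops[k], stops[k + 1]))


def intensity_to_color(value: float, low: float, high: float) -> Color:
    """Map a peak intensity from [low, high] to a color; values outside are clamped."""
    if high <= low:
        return colormap(0.0)
    return colormap((value - low) / (high - low))
```

Applying `intensity_to_color` to the intensities of one peak at every pixel yields an image of the kind shown in Figure 10a,b.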

## 6. Results

**Mouse cerebellum in negative ion mode.** The first MSI data set we considered is that of a mouse cerebellum in negative ion mode with $112\times 138=15,456$ pixels and, for each pixel, a spectrum of 500 peaks after peak picking. We start our analysis by visualizing the spectral information in the non-linear embedding. We can immediately observe four clusters, which we select interactively as shown in Figure 11b. By visual inspection, one can observe that the corresponding image regions in Figure 11a already match the given ground truth in Figure 11d quite well. The main regions such as white matter (WM), granular layer (GL), and molecular layer (ML) were well segmented. As the histological images are not registered with the MSI data, we only perform a qualitative analysis here. We further analyze the data set by selecting each of the four clusters and generating a non-linear embedding for each of them individually. Figure 3 shows that the upper cluster actually splits into two clusters when only this sub-population is considered for a re-configured non-linear embedding. If homogeneous regions in spectral space still form multiple connected regions in image-space, we can further split them using the space-filling curve approach, as shown in Figure 5. The resulting image regions can be cleaned by adjusting the clustering decisions for noisy pixels using image-space operations, as shown in Figure 4. The overall result of the interactive cluster generation is shown in Figure 11c. When compared to the ground truth in Figure 11d, this outcome can be considered of high quality. (Please note that the ground truth image is missing a part on the right when compared to the MS image.)

**Mouse cerebellum in positive ion mode.** The second MSI data set we considered is that of a mouse cerebellum in positive ion mode with $114\times 115=13,110$ pixels and, for each pixel, a spectrum of 500 peaks after peak picking. We conducted an interactive visual analysis following the same workflow as above. Figure 12a shows the non-linear embedding and Figure 12b the interactive selection of clusters in the non-linear embedding. Figure 12c shows the respective image regions. Selecting the red cluster and computing a non-linear embedding for the sub-population allows us to split the cluster into two meaningful structures, see Figure 12e. Figure 12f shows the final result of the interactive cluster generation, which again matches the ground truth in Figure 12d very well. In particular, we can relate the red cluster to the region annotated as granular layer (GL) and the orange cluster to the region annotated as molecular layer (ML).

**Rat testis.** The third MSI data set we considered is that of a rat testis in positive ion mode with $196\times 195=38,220$ pixels and, for each pixel, a spectrum of 500 peaks after peak picking. We again followed the same analytical workflow. Here, the clusters were not as well separated in the non-linear embedding, but using transparency for point rendering we were able to select three somewhat separated clusters, see Figure 16b. The respective image-space visualization in Figure 16a nevertheless conveys structures that match the ground truth in Figure 16c very well. Cluster analysis via linear embeddings and star coordinates (SC, see Figure 17) delivers the most discriminant peaks, as shown in Figure 18 for the red vs. the yellow cluster. The image-space visualizations of selected peaks in Figure 19 again exhibit significant structures, but also the necessity of analyzing multiple peaks simultaneously.

## 7. Discussion and User Feedback

**Parameters and reproducibility.** The quality of the non-linear embedding is crucial for cluster detection. The t-SNE approach is generally applicable, but requires us to choose an appropriate perplexity parameter, which determines the width of the Gaussian kernels used in the neighborhood distributions. We used a perplexity of 50 when considering the full data sets and 25 when considering a sub-population. We investigated the influence of the perplexity value on the outcome and found the results to be quite robust against changes, so that parameter tuning was not necessary. Moreover, the t-SNE approach requires a random initialization, which influences the outcome of the 2D embedding computation. However, for all MSI data sets considered, the 2D embeddings were generated very robustly and results were easily and reliably reproducible. Hence, the initial configuration did not influence the analysis outcome in a noticeable manner.
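The relation between perplexity and kernel width can be made concrete with the standard t-SNE definition: for a point $i$, the conditional neighbor probabilities follow a Gaussian of width $\sigma_i$, and the perplexity is $2^{H}$ with $H$ the Shannon entropy of that distribution in bits. The sketch below only evaluates this relation for a given width; t-SNE implementations search for the width that matches the requested perplexity, which is not shown. The function names and the toy distances are hypothetical.

```python
import math
from typing import List


def conditional_probabilities(sq_dists: List[float], sigma: float) -> List[float]:
    """Gaussian neighbor probabilities of one point, given squared distances to the others."""
    weights = [math.exp(-d / (2.0 * sigma * sigma)) for d in sq_dists]
    total = sum(weights)
    return [w / total for w in weights]


def perplexity(probs: List[float]) -> float:
    """Perplexity 2^H, where H is the Shannon entropy of the distribution in bits."""
    entropy = -sum(p * math.log2(p) for p in probs if p > 0.0)
    return 2.0 ** entropy


sq_dists = [1.0, 4.0, 9.0, 16.0]  # made-up squared distances to four other points
narrow = perplexity(conditional_probabilities(sq_dists, sigma=0.5))
wide = perplexity(conditional_probabilities(sq_dists, sigma=10.0))
```

A narrow kernel concentrates the probability on the nearest neighbor (perplexity close to 1), whereas a wide kernel treats all four points almost equally (perplexity close to 4). The perplexity can thus be read as an effective number of neighbors.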

**Detection of discriminant peaks.** In the cluster analysis step, an alternative to our peak selection mechanism via the linear embeddings would have been to choose the dimensions with the lowest computed p-values. However, we observed that some peaks happen to have low p-values for selected clusters, while the image-space visualization revealed no obvious structure. The reason for this is that the values for such a peak were generally low for all structures, such that the statistically significant increase was irrelevant. With our selection of discriminant peaks via the separation in linear embeddings, we did not encounter such cases. All peaks that were considered discriminant indeed showed relevant structures. Of course, some of the selected peaks exposed similar structures.
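The effect described above can be reproduced with a toy example: with many pixels per group, even a minute difference in mean intensity yields a very small p-value. The text does not state which statistical test was used; the large-sample two-sided z-test below and the made-up intensities are assumptions for illustration only.

```python
import math
from statistics import NormalDist, mean, variance
from typing import List


def two_sample_p_value(a: List[float], b: List[float]) -> float:
    """Two-sided p-value of a large-sample z-test on the difference of group means."""
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    z = (mean(a) - mean(b)) / standard_error
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


# Made-up intensities of one peak: 600 pixels in the cluster, 600 in the rest.
cluster = [0.020, 0.022, 0.018] * 200
rest = [0.010, 0.012, 0.008] * 200

p_value = two_sample_p_value(cluster, rest)   # extremely small
difference = mean(cluster) - mean(rest)       # about 0.01, negligible in absolute terms
```

The difference is statistically significant, yet both groups have intensities close to zero, so a color-coded image of this peak would show no visible structure.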

**User feedback.** In interactive sessions with domain experts, the experts were very positive about the tool. They rated it as very useful and stated that their community would be interested in using it. We have not yet made our tool publicly available, but based on this positive feedback we would like to do so soon. The experts were particularly positive about the cluster analysis/interpretation step. They are currently using the SCiLS tool [25] in their laboratory and report that the automatic clustering produces satisfactory results, but that an interactive cluster analysis as we propose is not provided. Cluster analysis in the SCiLS tool uses an ROC analysis, i.e., there are no interaction mechanisms for refining the clustering. The interactivity was appreciated as a unique selling point of our tool. They also pointed us to the papers by Abdelmoula et al. [26,27], which use automatic clustering in a 2D t-SNE embedding. This work has gained attention in their community due to good clustering results, but its main drawback is that it does not reveal what drives the cluster formation. Using our linear embedding, we are able to provide such information in the form of the most discriminant peaks.

## 8. Conclusions

## Author Contributions

## Funding

## Conflicts of Interest

## References

1. Yuste, R. Fluorescence microscopy today. Nat. Methods **2005**, 2, 902.
2. Buchberger, A.R.; DeLaney, K.; Johnson, J.; Li, L. Mass Spectrometry Imaging: A Review of Emerging Advancements and Future Insights. Anal. Chem. **2018**, 90, 240–265.
3. Cole, L.M. (Ed.) Imaging Mass Spectrometry; Springer: New York, NY, USA, 2017.
4. Ifa, D.R.; Wu, C.; Ouyang, Z.; Cooks, R.G. Desorption electrospray ionization and other ambient ionization methods: Current progress and preview. Analyst **2010**, 135, 669–681.
5. Aichler, M.; Walch, A. MALDI Imaging mass spectrometry: Current frontiers and perspectives in pathology research and practice. Lab. Investig. **2015**, 95, 422–431.
6. Kompauer, M.; Heiles, S.; Spengler, B. Atmospheric pressure MALDI mass spectrometry imaging of tissues and cells at 1.4-μm lateral resolution. Nat. Methods **2016**, 14, 90–96.
7. Bouschen, W.; Schulz, O.; Eikel, D.; Spengler, B. Matrix vapor deposition/recrystallization and dedicated spray preparation for high-resolution scanning microprobe matrix-assisted laser desorption/ionization imaging mass spectrometry (SMALDI-MS) of tissue and single cells. Rapid Commun. Mass Spectrom. **2010**, 24, 355–364.
8. Dreisewerd, K. The Desorption Process in MALDI. Chem. Rev. **2003**, 103, 395–426.
9. Soltwisch, J.; Kettling, H.; Vens-Cappell, S.; Wiegelmann, M.; Muthing, J.; Dreisewerd, K. Mass spectrometry imaging with laser-induced postionization. Science **2015**, 348, 211–215.
10. Ellis, S.R.; Bruinen, A.L.; Heeren, R.M.A. A critical evaluation of the current state-of-the-art in quantitative imaging mass spectrometry. Anal. Bioanal. Chem. **2014**, 406, 1275–1289.
11. Ràfols, P.; Vilalta, D.; Brezmes, J.; Cañellas, N.; del Castillo, E.; Yanes, O.; Ramírez, N.; Correig, X. Signal preprocessing, multivariate analysis and software tools for MA(LDI)-TOF mass spectrometry imaging for biological applications. Mass Spectrom. Rev. **2016**, 37, 281–306.
12. Kriegsmann, J.; Kriegsmann, M.; Casadonte, R. MALDI TOF imaging mass spectrometry in clinical pathology: A valuable tool for cancer diagnostics (Review). Int. J. Oncol. **2014**, 46, 893–906.
13. Stoeckli, M.; Staab, D.; Schweitzer, A. Compound and metabolite distribution measured by MALDI mass spectrometric imaging in whole-body tissue sections. Int. J. Mass Spectrom. **2007**, 260, 195–202.
14. WATERS. The Science of What’s Possible. Available online: http://www.waters.com/waters/en_GB/SYNAPT-G2-Si-High-Definition-Mass-Spectrometry/nav.htm?cid=134740622&locale=en_GB (accessed on 30 November 2020).
15. Klinkert, I.; Chughtai, K.; Ellis, S.R.; Heeren, R.M. Methods for full resolution data exploration and visualization for large 2D and 3D mass spectrometry imaging datasets. Int. J. Mass Spectrom. **2014**, 362, 40–47.
16. Avtonomov, D.M.; Raskind, A.; Nesvizhskii, A.I. BatMass: A Java Software Platform for LC–MS Data Visualization in Proteomics and Metabolomics. J. Proteome Res. **2016**, 15, 2500–2509.
17. Paschke, C.; Leisner, A.; Hester, A.; Maass, K.; Guenther, S.; Bouschen, W.; Spengler, B. Mirion—A software package for automatic processing of mass spectrometric images. J. Am. Soc. Mass Spectrom. **2013**, 24, 1296–1306.
18. Martin, R.; Markus, S. BioMap. Available online: https://ms-imaging.org/wp/biomap/ (accessed on 30 November 2020).
19. Bokhart, M.T.; Nazari, M.; Garrard, K.P.; Muddiman, D.C. MSiReader v1.0: Evolving Open-Source Mass Spectrometry Imaging Software for Targeted and Untargeted Analyses. J. Am. Soc. Mass Spectrom. **2018**, 29, 8–16.
20. Hayakawa, E.; Fujimura, Y.; Miura, D. MSIdV: A versatile tool to visualize biological indices from mass spectrometry imaging data. Bioinformatics **2016**, 32, 3852–3854.
21. Wijetunge, C.D.; Saeed, I.; Boughton, B.A.; Spraggins, J.M.; Caprioli, R.M.; Bacic, A.; Roessner, U.; Halgamuge, S.K. EXIMS: An improved data analysis pipeline based on a new peak picking method for EXploring Imaging Mass Spectrometry data. Bioinformatics **2015**, 31, 3198–3206.
22. Albert-Jan, Y.; Joris, B.; Marilou, D.; Marnix, K.; Michel, C.; Roeland, L.; Steven, B.; Taco, W. MS Spectre: Mass Spectrometry Analysis Software. Available online: http://ms-spectre.sourceforge.net/ (accessed on 30 November 2020).
23. Bemis, K.D.; Harry, A.; Eberlin, L.S.; Ferreira, C.; van de Ven, S.M.; Mallick, P.; Stolowitz, M.; Vitek, O. Cardinal: An R package for statistical analysis of mass spectrometry-based imaging experiments. Bioinformatics **2015**, 31, 2418–2420.
24. Goracci, L.; Tortorella, S.; Tiberi, P.; Pellegrino, R.M.; Di Veroli, A.; Valeri, A.; Cruciani, G. Lipostar, a Comprehensive Platform-Neutral Cheminformatics Tool for Lipidomics. Anal. Chem. **2017**, 89, 6257–6264.
25. Zweigniederlassung Bremen der Bruker Daltonik GmbH, University of Bremen. SCiLS. Available online: https://scils.de/ (accessed on 30 November 2020).
26. Abdelmoula, W.M.; Balluff, B.; Englert, S.; Dijkstra, J.; Reinders, M.J.; Walch, A.; McDonnell, L.A.; Lelieveldt, B.P. Data-driven identification of prognostic tumor subpopulations using spatially mapped t-SNE of mass spectrometry imaging data. Proc. Natl. Acad. Sci. USA **2016**, 113, 12244–12249.
27. Abdelmoula, W.M.; Pezzotti, N.; Höllt, T.; Dijkstra, J.; Vilanova, A.; McDonnell, L.A.; Lelieveldt, B. Interactive Visual Exploration of 3D Mass Spectrometry Imaging Data Using Hierarchical Stochastic Neighbor Embedding Reveals Spatiomolecular Structures at Full Data Resolution. J. Proteome Res. **2018**, 17, 1054–1064.
28. Nunes, M.; Laruelo, A.; Ken, S.; Laprie, A.; Bühler, K.A. A Survey on Visualizing Magnetic Resonance Spectroscopy Data. In Eurographics Workshop on Visual Computing for Biology and Medicine; Viola, I., Bühler, K., Ropinski, T., Eds.; The Eurographics Association: Geneva, Switzerland, 2014.
29. Jawad, M.; Molchanov, V.; Linsen, L. Coordinated Image and Featurespace Visualization for Interactive Magnetic Resonance Spectroscopy Imaging Data Analysis. Int. Conf. Inf. Vis. Theory Appl. **2019**, 10, 118–128.
30. Garrison, L.; Vašíček, J.; Grüner, R.; Smit, N.N.; Bruckner, S. SpectraMosaic: An Exploratory Tool for the Interactive Visual Analysis of Magnetic Resonance Spectroscopy Data. In Eurographics Workshop on Visual Computing for Biology and Medicine; Kozlíková, B., Linsen, L., Vázquez, P.P., Lawonn, K., Raidou, R.G., Eds.; The Eurographics Association: Geneva, Switzerland, 2019.
31. Murchie, S.L.; Seelos, F.P.; Hash, C.D.; Humm, D.C.; Malaret, E.; McGovern, J.A.; Choo, T.H.; Seelos, K.D.; Buczkowski, D.L.; Morgan, M.F.; et al. Compact Reconnaissance Imaging Spectrometer for Mars investigation and data set from the Mars Reconnaissance Orbiter’s primary science phase. J. Geophys. Res. Planets **2009**, 114, E00D07.
32. Pelkey, S.M.; Mustard, J.F.; Murchie, S.; Clancy, R.T.; Wolff, M.; Smith, M.; Milliken, R.; Bibring, J.P.; Gendrin, A.; Poulet, F.; et al. CRISM multispectral summary products: Parameterizing mineral diversity on Mars from reflectance. J. Geophys. Res. Planets **2007**, 112.
33. Blaas, J.; Botha, C.P.; Post, F.H. Interactive Visualization of Multi-Field Medical Data Using Linked Physical and Feature-Space Views. In Proceedings of the 9th Joint Eurographics/IEEE VGTC Conference on Visualization, Norrkoping, Sweden; Eurographics Association: Goslar, Germany, 2007; pp. 123–130.
34. Linsen, L.; Long, T.V.; Rosenthal, P. Linking Multidimensional Feature Space Cluster Visualization to Multifield Surface Extraction. IEEE Comput. Graph. Appl. **2009**, 29, 85–89.
35. He, X.; Tao, Y.; Wang, Q.; Lin, H. Multivariate Spatial Data Visualization: A Survey. J. Vis. **2019**, 22, 897–912.
36. van der Maaten, L.; Hinton, G. Visualizing data using t-SNE. J. Mach. Learn. Res. **2008**, 9, 2579–2605.
37. Pezzotti, N.; Lelieveldt, B.P.F.; van der Maaten, L.; Höllt, T.; Eisemann, E.; Vilanova, A. Approximated and User Steerable tSNE for Progressive Visual Analytics. IEEE Trans. Vis. Comput. Graph. **2017**, 23, 1739–1752.
38. Rubio-Sánchez, M.; Sanchez, A. Axis Calibration for Improving Data Attribute Estimation in Star Coordinates Plots. Vis. Comput. Graph. IEEE Trans. **2014**, 20, 2013–2022.
39. Elmqvist, N.; Dragicevic, P.; Fekete, J. Rolling the Dice: Multidimensional Visual Exploration using Scatterplot Matrix Navigation. IEEE Trans. Vis. Comput. Graph. **2008**, 14, 1539–1548.
40. Inselberg, A. The plane with parallel coordinates. Vis. Comput. **1985**, 1, 69–91.
41. Cox, T.F.; Cox, M.A.A. Multidimensional Scaling; Chapman and Hall: London, UK, 1994.
42. van der Maaten, L. Accelerating t-SNE using tree-based algorithms. J. Mach. Learn. Res. **2014**, 15, 3221–3245.
43. Peano, G. Sur une courbe, qui remplit toute une aire plane. Math. Ann. **1890**, 36, 157–160.
44. Sierpínski, W. Sur une nouvelle courbe continue qui remplit toute une aire plane. Bull. Acad. Sci. Crac. **1912**, 462–478.
45. Hilbert, D. Über die stetige Abbildung einer Linie auf ein Flächenstück. In Dritter Band: Analysis· Grundlagen der Mathematik· Physik Verschiedenes; Springer: Berlin/Heidelberg, Germany, 1935; pp. 1–2.
46. Pascucci, V.; Laney, D.E.; Frank, R.J.; Scorzelli, G.; Linsen, L.; Hamann, B.; Gygi, F. Real-time Monitoring of Large Scientific Simulations. In Proceedings of the 2003 ACM Symposium on Applied Computing, Melbourne, FL, USA; ACM: New York, NY, USA, 2003; pp. 194–198.
47. Weissenböck, J.; Fröhler, B.; Gröller, E.; Kastner, J.; Heinzl, C. Dynamic Volume Lines: Visual Comparison of 3D Volumes through Space-filling Curves. IEEE Trans. Vis. Comput. Graph. **2019**, 25, 1040–1049.
48. Holzmüller, D. Efficient neighbor-finding on space-filling curves. arXiv **2017**, arXiv:1710.06384.
49. Skubalska-Rafajłowicz, E. Applications of the space—filling curves with data driven measure—Preserving property. Nonlinear Anal. Theory Methods Appl. **1997**, 30, 1305–1310.
50. Ye, J. Least Squares Linear Discriminant Analysis. In Proceedings of the 24th International Conference on Machine Learning, Corvalis, OR, USA; ACM: New York, NY, USA, 2007; pp. 1087–1093.
51. Molchanov, V.; Linsen, L. Interactive Design of Multidimensional Data Projection Layout. In EuroVis-Short Papers; Elmqvist, N., Hlawitschka, M., Kennedy, J., Eds.; The Eurographics Association: Geneva, Switzerland, 2014.
52. Kandogan, E. Star coordinates: A multi-dimensional visualization technique with uniform treatment of dimensions. In Proceedings of the IEEE Information Visualization Symposium, Salt Lake City, UT, USA, 8–13 October 2000; Volume 650, p. 22.
53. Teoh, S.T.; Ma, K.L. StarClass: Interactive Visual Classification using Star Coordinates. In Proceedings of the 2003 SIAM International Conference on Data Mining, San Francisco, CA, USA, 1–3 May 2003; pp. 178–185.
54. Chen, K. Optimizing star-coordinate visualization models for effective interactive cluster exploration on big data. Intell. Data Anal. **2014**, 18, 117–136.
55. Khalid, N.E.A.; Yusoff, M.; Kamaru-Zaman, E.A.; Kamsani, I.I. Multidimensional Data Medical Dataset Using Interactive Visualization Star Coordinate Technique. Procedia Comput. Sci. **2014**, 42, 247–254.
56. Kiyadeh, A.P.H.; Zamiri, A.; Yazdi, H.S.; Ghaemi, H. Discernible visualization of high dimensional data using label information. Appl. Soft Comput. **2015**, 27, 474–486.

**Figure 1.** (**a**) Mass spectrum from the rat testis data set with the 500 most intense peaks, depicted as intensities over $m/z$-ratios for the selected pixel indicated in (**b**). (**b**) Grayscale values of the average peak intensity over all 500 peaks for each pixel. (**c**) Color-coded spatial intensity distribution in image-space for the selected peak with $m/z$-ratio 846.5753. Data was taken from [9].

**Figure 2.** Analytical workflow: Clusters in MS spectra that form image regions are interactively generated and their spectral properties are interactively analyzed. Both steps are performed using coordinated views of (non-linear or linear) embeddings of spectral information and image-space visualizations.

**Figure 3.** Sub-population selection in non-linear embedding (**b**) for further analysis in a re-configured non-linear embedding (**d**) allows refinement of clusters. Respective image regions are highlighted in the coordinated image-space views (**a**,**c**). Methodology applied to mouse cerebellum in negative ion mode data set [9].

**Figure 4.** Non-linear embedding of spectral information (**a**) allows for interactive cluster detection, e.g., the one highlighted in yellow. Coordinated image-space views (**c**) highlight the respective image region. Integrated view (**b**) of non-linear embedding with space-filling curve allows for detection of image regions within a cluster. Coordinated interaction in spectral and image views allows for generation of suitable image regions (**d**–**f**). Methodology applied to mouse cerebellum in negative ion mode data set [9].

**Figure 5.** Non-linear embedding of sub-population with horizontal axis being replaced by space-filling curve (**a**) allows for separation of continuous image regions even in rather nested cases (**b**). Methodology applied to mouse cerebellum in negative ion mode data set [9].

**Figure 6.** Linear embedding of labeled data, where color-coded labels correspond to interactively generated clusters in Figure 11b. Medians of labeled groups highlighted by black frames serve as control points for interactive cluster separation.

**Figure 7.** Star coordinates (SCs) of the linear embedding in Figure 6 reveal which dimensions were given more weight to separate clusters in the form of long dimension axis vectors.

**Figure 9.** Juxtaposed boxplot visualization for discriminant peaks selected in Figure 8 allows for quantitative comparative analyses of peak intensity distributions between selected image regions. Computed p-values for each peak are reported after its $m/z$-ratio in the labels of the horizontal axis.

**Figure 11.** Mouse cerebellum in negative ion mode data set [9]: Interactive selection of clusters in non-linear embedding (**b**) reveals image regions (**a**) close to ground truth (**d**). Further interactive cluster refinement delivered an enhanced result (**c**).

**Figure 12.** Mouse cerebellum in positive ion mode data set [9]: Non-linear embedding of spectral information (**a**) and interactive selection of clusters in the non-linear embedding (**b**) reveal image regions (**c**) close to ground truth (**d**). Selecting the red cluster in (**b**) for generating a non-linear embedding of the sub-population (**e**) allows for a refined and improved result (**f**).

**Figure 16.** Rat testis data set [9]: Interactive selection of clusters in non-linear embedding (**b**) reveals image regions (**a**) close to ground truth (**c**).

**Figure 18.** Juxtaposed boxplot visualization allows for quantitative comparative analyses of peak intensity distributions of selected image regions (cf. Figure 17) for selected discriminant peaks.

**Figure 20.** PCA for linear embeddings does not allow for observing/separating clusters: (**a**) mouse cerebellum in negative ion mode, (**b**) mouse cerebellum in positive ion mode, and (**c**) rat testis data sets [9].


© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Jawad, M.; Soltwisch, J.; Dreisewerd, K.; Linsen, L.
Interactive Visual Analysis of Mass Spectrometry Imaging Data Using Linear and Non-Linear Embeddings. *Information* **2020**, *11*, 575.
https://doi.org/10.3390/info11120575
