Integrated Multi-Omics Maps of Lower-Grade Gliomas

Simple Summary Data from multiple omics domains were increasingly generated in large-scale tumour studies to enhance our understanding of molecular mechanisms of cancer. We present an integrated cartography of three omics layers combining the transcriptome, methylome, and genome (copy number variations) into a unique mapping scheme which enabled us to decipher functional links within and between the omics domains. Application to lower grade gliomas reveals distinct networks governed either by methylation or copy number variations, both affecting transcriptomics modes of cell activity. The integrated maps provide an intuitive view on tumour heterogeneity across the omics layers distinguishing, e.g., astrocytoma- and oligodendroglioma-like glioma types. In a wider sense, multi-omics cartography deciphers the effect of different omes on tumour phenotypes and their molecular hallmarks with individual resolution. Abstract Multi-omics high-throughput technologies produce data sets which are not restricted to only one but consist of multiple omics modalities, often as patient-matched tumour specimens. The integrative analysis of these omics modalities is essential to obtain a holistic view on the otherwise fragmented information hidden in this data. We present an intuitive method enabling the combined analysis of multi-omics data based on self-organizing maps machine learning. It “portrays” the expression, methylation and copy number variations (CNV) landscapes of each tumour using the same gene-centred coordinate system. It enables the visual evaluation and direct comparison of the different omics layers on a personalized basis. We applied this combined molecular portrayal to lower grade gliomas, a heterogeneous brain tumour entity. It classifies into a series of molecular subtypes defined by genetic key lesions, which associate with large-scale effects on DNA methylation and gene expression, and in final consequence, drive with cell fate decisions towards oligodendroglioma-, astrocytoma- and glioblastoma-like cancer cell lineages with different prognoses. Consensus modes of concerted changes of expression, methylation and CNV are governed by the degree of co-regulation within and between the omics layers. The method is not restricted to the triple-omics data used here. The similarity landscapes reflect partly independent effects of genetic lesions and DNA methylation with consequences for cancer hallmark characteristics such as proliferation, inflammation and blocked differentiation in a subtype specific fashion. It can be extended to integrate other omics features such as genetic mutation, protein expression data as well as extracting prognostic markers.


Introduction
Multi-omics high-throughput technologies are producing a steeply increasing number of data sets which are not restricted to only one but consist of multiple omics modalities

Gene Expression, Methylation, and Copy Number Data of Gliomas
We here studied 122 WHO grade II and III adult-type gliomas (lower grade gliomas, LGG) collected from patients which were previously analysed by microarray-based gene expression profiling [22] and array-based DNA methylation profiling (Illumina 450K arrays) [25,35]. The tumours were stratified into genetic groups according to the WHO 2021-classification scheme [36], namely astrocytomas (IDH-A) carrying mutation in the IDH1 or IDH2 gene (IDH-mut) and no co-deletion of Chr1p and Chr19q (Chr1/Chr19 intact), oligodendrogliomas (IDH-O) with IDH-mut and Chr1p/19q codeletions (codel) and IDH-wt astrocytoma/glioblastoma-like tumours. In addition, we collected IDH-A gliomas with single Chr19q deletions without Chr1p co-deletions into a separate class IDH-A'. A schematic workflow of the data analysis is provided in Figure 1. LGG subtyping is illustrated in Figure 2.

Preprocessing and Multi-Omics CombiSOM Portrayal
Gene-centric expression (Gex), methylation (Dme), and copy number data (CNV) were preprocessed as described previously [25] and then centralized by subtracting the respective log-mean averaged over all gliomas (see Appendix A, Equation (A1)). The obtained omics scores define profiles given as data vector for each gene with the sample-related values as elements (Appendix A, Equation (A2)). Subsequently, the score values from the different omics modalities were harmonized to make the different numerical scales comparable (Appendix A, Equation (A3)). After centralization and harmonization, we merged the Gex, Dme, and CNV scores of each gene into one combined multi-omics score where the different omics data were combined with different mutual weights, w e , w m , w c , for Gex, Dme, and CNV data, respectively. They were chosen from the interval (0,1) and meet the condition w e + w m + w c = 1 (Figure 1, Appendix A, Equation (A4)).

Figure 1.
Workflow of integral multi-omics portrayal using combiSOM analysis. Different omics data layers were combined using weight factors and then trained together into one combined selforganizing map (combiSOM). After training, the three layers were decomposed to provide one separate portrait for each of them. Genes are located identically in all three portraits. Comparison of the three profiles reveals virtually anticorrelation between Gex and Dme for the red group and correlation between CNV and Gex for the cyan and blue groups. Downstream analysis considers clinical data to generate prognostic maps, to extract the functional context, and the relatedness between the omes and tumours. Symbols: wi with i=e, m, c denotes the weighting factor for combining the expression, methylation and CNV domains and e, m and c their feature values (see text).

Figure 1.
Workflow of integral multi-omics portrayal using combiSOM analysis. Different omics data layers were combined using weight factors and then trained together into one combined selforganizing map (combiSOM). After training, the three layers were decomposed to provide one separate portrait for each of them. Genes are located identically in all three portraits. Comparison of the three profiles reveals virtually anticorrelation between Gex and Dme for the red group and correlation between CNV and Gex for the cyan and blue groups. Downstream analysis considers clinical data to generate prognostic maps, to extract the functional context, and the relatedness between the omes and tumours. Symbols: w i with i = e, m, c denotes the weighting factor for combining the expression, methylation and CNV domains and ∆e, ∆m and ∆c their feature values (see text).
The combined profile vectors were then clustered into so-called metagene profiles (Appendix A, Equation (A5)) by applying machine learning using Self-Organizing Maps (SOM) using our implementation in oposSOM [19] as a combiSOM version [34]. The method uses an iterative training algorithm to ensure that the metagene profiles optimally cover the omics data space. SOM training arranges the metagenes in a quadratic grid used for visualization of the data in metagene space. We applied a SOM of size 45 × 45 and default parametrization of SOM training. After training, the combined multi-omics scores are back transformed into their original single omics components for visualization and further downstream analysis. Accordingly, each tumour sample studied is characterized by its state in each of the omics domains and visualized by a separate image, namely a Gex, a Dme and a CNV portrait, respectively. Importantly, each metagene and the associated single genes are located at the same position in each of the portraits enabling their direct comparison, e.g., to compare their expression, methylation, and copy number levels. Because SOM training applies to the combined multimodal vectors the topology of the resulting map is governed by the weighting factors, which, in turn, define the degree of couplings between the different omics layers. We here applied equal weights as default setting (w e = w m = w c = 1/3), ensuring balanced couplings between the Gex, Dme and CNV domains in the resulting SOM. Alternatively, one can apply "dominant weights" (e.g., w e = 0.99, w m = w c = 0.0005, for expression dominance) resulting in a single-ome topology of the resulting SOM.
organizing map (combiSOM). After training, the three layers were decomposed to provide one separate portrait for each of them. Genes are located identically in all three portraits. Comparison of the three profiles reveals virtually anticorrelation between Gex and Dme for the red group and correlation between CNV and Gex for the cyan and blue groups. Downstream analysis considers clinical data to generate prognostic maps, to extract the functional context, and the relatedness between the omes and tumours. Symbols: wi with i=e, m, c denotes the weighting factor for combining the expression, methylation and CNV domains and e, m and c their feature values (see text). LGG were classified according to their IDH mutation status (IDH-mutated tumours were subsumed as GCIMP) and the co-deletion status of Chr.1p/19q, and single deletion of Chr.19q and sorted in each of the groups with increasing GCIMP-methylation score [37]. Selected features such as TERT promoter mutation, CNV of Chr.7+ (gains) and Chr.10-(loss), WHO grade (II or III), and prognosis (hazard ratio) differ between the genetic groups. Expression (E1-E8) and methylation (M1-M6) groups as defined in [35] enrich in different genetic groups. The red lines serve as guide for the eye to show trends of GCIMP and G-protein coupled receptor (GPCR) DNA methylation.

ScoV (Signed Square Root Covariance) Maps and Mean Portraits
Co-variances between the different omics domains can be estimated by calculating the signed square root covariance (ScoV) between their metagene profiles in a pairwise fashion, e.g., between the Gex and Dme, the Gex and CNV; and between the Dme and CNV profiles (Appendix A, Equations (A6) and (A7)). SOM portraits of the three omics domains and of the three combined ScoV were obtained for each sample. For the sample groups, we calculated mean portraits by averaging the respective metagene values over all individual sample portraits of the respective group.

Spot Module Selection and Functional Analysis
Due to the self-organizing properties of the SOM, neighboured metagenes tend to be coloured similarly because of their similar profiles. In consequence, the obtained mosaic images show a smooth texture with red and blue spot-like regions referring to clusters of increased or decreased omics scores in the respective tumour. These "spots" represent clusters of co-expressed, co-methylated, and co-aberrant genes in the Gex, Dme, and CNV domains, respectively. Genes in the spots were identified by applying a threshold (usually 90% of maximum) to the respective omics score [38].
Function mining was performed using a repository of about 6000 gene sets implemented in oposSOM, which refer to different functional context and which were taken from literature, gene ontology and other sources [19]. The gene set score (GSZ) estimates the normalized mean log expression of the member genes of a set in each of the tumour samples (Appendix A) [39]. For Dme and CNV data, the GSZ score is calculated analogously by substituting the expression value by the respective methylation and CNV values. For ternary diagrams, GSZ scores were transformed into percentages of the modality values (Appendix A). Diversity analysis in sample and gene state space as well as function mining by means of gene set analysis was performed using the standard options provided by oposSOM [19,40] (Appendix A).

Genetic Stratification of LGG
The LGG cases under study were classified into four groups based on genetic characteristics agreeing with WHO 2021 stratification of gliomas [36] (Figure 2): The IDH-wt group collects unmutated (wild type) IDH1 and/or IDH2 (IDH) gliomas.
LGG carrying IDH-mutations split into the IDH-O (oligodendroglioma) group with co-deletions at Chro-mosome1p and Chr.19q (Chr1/19codel), the IDH-A and IDH-A' (IDH-mut astrocytomas) groups without Chr1/19codel where IDH-A' in contrast to IDH-A carries a deletion at Chr.19q. Combined gains of Chr.7 (Chr7+) and losses of Chr.10 (Chr10-) constitute a characteristic of WHO grade IV glioblastomas (GBM). These tumours accumulate in IDH-wt LGG (12 out of 19 cases, 63%). In IDH-mut, LGG Chr7+ and Chr10-appear mostly uncombined in IDH-A and IDH-A' where the number of tumours with Chr7+ exceed that with Chr10in IDH-A (16 versus 6 out of 54 cases, 30% versus 11%). IDH-O LGG are characterized by a high amount of TERT promoter methylation. The tumours in each LGG group were ranked with increasing score of the CpG-Island Phenotype (GCIMP)-signature taken from [37] ( Figure 2). The mean methylation values averaged over the tumours of each group increase in the order IDH-wt< IDH-A/A'< IDH-O. The overall methylation follows that of the GCIMP signature, showing that GCIMP characterizes the hypermethylation of IDH-mut LGG. GCIMP LGG have better prognosis than IDH-wt in terms of the hazard ratio (HR), whereas the prognosis of IDH-O is slightly better than that of IDH-A/A'. Group averaged methylation of G-protein receptors (GPCR) shows similar relations between the groups as the GCIMP signature. The GCIMP and GPCR methylation of the tumours in each group however show different slopes (red curves) We previously classified LGG into eight expression subtypes E1-E8 and six methylation subtypes M1-M6 based on their expression and methylation characteristics where both omics domains were considered independently [25,35]. The E-and M-subtypes accumulate to different degrees in the genetic groups, namely E1 and M1 in IDH-wt, E6 and M5 in IDH-O, and E2-E4 and M2-M4 in IDH-A/A' (see the colour bars in Figure 2). LGG of the E7, E8, and M6 subtypes are subsumed as neuronal (NL) tumours constituting samples of reduced tumour cell content [25]. NL-type LGG distributes over all genetic groups and partly reflect molecular characteristics of a healthy brain [35]. The present study applies an integrative multi-omics approach to this glioma data in order to illustrate the performance of combined SOM portrayal of omics landscapes in an integrative analysis setting.

Transcriptome, Methylome, and Genome Similarity Patterns of LGG Are Different
Next, we processed integral combiSOM portrayal using gene expression (Gex), methylation (Dme), and CNV data as described in the Methods section. It generated one combi-SOM per tumour with one image for each of the three omics realms (Appendix B, Figure A1). Pairwise correlation heatmaps between these portraits characterize the co-variance landscapes of the three omics data (Figure 3a). The maroon-coloured squares along the diagonals mark clusters of tumours with correlated molecular portraits in each of the heatmaps. They mostly agree with the genetic groups, thus reflecting their associations with the Gex, Dme and CNV landscapes. Substructures within the correlation clusters of the genetic groups are indicative of a fine structure of tumour heterogeneity previously resolved in terms of eight expression (E1-E8) and six methylation (M1-M6) glioma subtypes [25,35]. For example, subtypes E7-E8 and M6 can be related to neuronal (NL)-tumours which distribute over all genetic groups but show distinct features differing from other IDH-wt, IDH-A and IDH-O LGG due to their low tumour cell content. Subtypes M2 and E3 induce other sub-patterns in the genetic IDH-A group because of their decreased methylation level forming a separate GCIMP-low methylator type [35]. The Dme heatmap reveals another GCIMP-O methylator type of IDH-O tumours differing from the GCIMP patterns of IDH-A gliomas [35]. The detailed portraits of the E-and M-subtypes in the combiSOM are discussed below.
Cancers 2022, 14, x FOR PEER REVIEW 8 of 32 Figure 3. Similarity analysis of LGG using Gex, Dme and CNV data using pairwise similarity provides heatmaps of the SOM portraits (part (a)) and similarity network presentation (part (b)) (each circle indicates one tumour). CNV shows the clearest separation between genetic groups mainly along two axes referring to the Chr. 1p/19q codeletion and Chr. 7 + status, followed by Dme distributing the tumours along one axis according to their methylation level. Gex produces more diverse, multidimensional patterns resembling a closed circular net with strong intermixing of IDH-A, IDH-A and IDH-wt tumours. For methods description, see Appendix B.

Integrated Portrayal of LGG Reveals Orthogonal Effects of Methylation and CNV
The similarity landscapes shown in the previous section describe the state space of the LGG tumours. The SOM portrait, conversely, characterizes the omics state space of the genes chosen as the "atomic" unit of genomics regulation. According to the genetic strata, we generated group averaged Gex, Dme, and CNV portraits (Figure 4a). Red and blue spot-like areas indicate high and low values of the respective omics score, meaning up and down regulated expression, promoter hyper-and hypo-methylation, and copy number gains and losses of the respective genes, respectively. Importantly, our combined portrayal locates each gene at the same position in each of the maps, which enables their direct comparison, e.g., regarding the mutual effects of expression, methylation, and CNV on a certain gene or a group of genes collected in a spot. Overall, the spots refer to combined feature profiles meeting the condition of small mutual Euclidian distance between the equally weighted sums of Gex, Dme and CNV features as described in the Methods section (see Appendix B, Equation (5)). As a result, a spot can collect genes strongly correlated in one of the omics domains but virtually weakly or not correlated in the two other ones, and/or co-correlated in two or all three omics data. For estimating and visualizing the effect of pairwise combinations of different omics features, we generated signed square root covariance (ScoV) maps which colour code negative covariances in blue and positive covariances in maroon (Figure 4b). The Gex versus Dme ScoV maps show mostly blue spots reflecting the repressive mutual effect between gene expression and promoter methylation. In contrast, the Gex versus CNV ScoV maps are dominated by red spots because aberrant copy numbers typically affect expression via a direct dose-response relationship [35]. . Similarity analysis of LGG using Gex, Dme and CNV data using pairwise similarity provides heatmaps of the SOM portraits (part (a)) and similarity network presentation (part (b)) (each circle indicates one tumour). CNV shows the clearest separation between genetic groups mainly along two axes referring to the Chr. 1p/19q codeletion and Chr. 7 + status, followed by Dme distributing the tumours along one axis according to their methylation level. Gex produces more diverse, multidimensional patterns resembling a closed circular net with strong intermixing of IDH-A, IDH-A and IDH-wt tumours. For methods description, see Appendix B.
Network views as implemented in oposSOM [19] visualize topological properties of the sample similarity landscapes: The Dme-network topology is virtually one-dimensional, pointing from low methylation in IDH-wt tumours towards highest methylation in GCIMP-O seen in IDH-O type tumours, with IDH-A and GCIMP-low LGG in between, forming a transition range between low and high methylated gliomas (Figure 3b). The CNV network overall resembles a T-like structure which distributes along two major axes, one spanned by the Chr.1/19 codel versus -intact status and the other which is governed, first of all, by Chr7+ and Chr10-CNVs. The Gex network reflects a more complex, donut-like internal structure with similarity connections between IDH-A/A' and IDH-wt LGG and separated clouds formed by IDH-O and NL tumour. The three different omics similarity nets clearly separate the IDH-O LGG from the other types, which reflect specific properties of Gex, Dme and CNV data. The other IDH-A/A' and IDH-wt groups partly mix to different degrees because of overlapping molecular features presumably related to their astrocytomalike characteristics. In summary, the tumour similarity landscapes of the three omics realms reflect the genetic groups but they also show different complexities ranging from more one-dimensional methylation via a two-dimensional CNV towards a more complex Gex topology.

Integrated Portrayal of LGG Reveals Orthogonal Effects of Methylation and CNV
The similarity landscapes shown in the previous section describe the state space of the LGG tumours. The SOM portrait, conversely, characterizes the omics state space of the genes chosen as the "atomic" unit of genomics regulation. According to the genetic strata, we generated group averaged Gex, Dme, and CNV portraits (Figure 4a). Red and blue spot-like areas indicate high and low values of the respective omics score, meaning up and down regulated expression, promoter hyper-and hypo-methylation, and copy number gains and losses of the respective genes, respectively. Importantly, our combined portrayal locates each gene at the same position in each of the maps, which enables their direct comparison, e.g., regarding the mutual effects of expression, methylation, and CNV on a certain gene or a group of genes collected in a spot. Overall, the spots refer to combined feature profiles meeting the condition of small mutual Euclidian distance between the equally weighted sums of Gex, Dme and CNV features as described in the Methods section (see Appendix B, Equation (A5)). As a result, a spot can collect genes strongly correlated in one of the omics domains but virtually weakly or not correlated in the two other ones, and/or co-correlated in two or all three omics data. For estimating and visualizing the effect of pairwise combinations of different omics features, we generated signed square root covariance (ScoV) maps which colour code negative covariances in blue and positive covariances in maroon ( Figure 4b). The Gex versus Dme ScoV maps show mostly blue spots reflecting the repressive mutual effect between gene expression and promoter methylation. In contrast, the Gex versus CNV ScoV maps are dominated by red spots because aberrant copy numbers typically affect expression via a direct dose-response relationship [35].
So-called supporting maps provide additional information about the gene-state space. The variance maps reveal that genes showing large variations of expression, methylation and copy numbers localize in different, only partly overlapping regions of the SOM which reflects partial independence of the different omics domains (Figure 4c). Dme mainly varies along the diagonal axis between the lower left and the right upper corner showing a relatively smooth spot pattern while CNV varies mainly along the other diagonal between the left upper and the right lower corner of the SOM showing a relatively rugged pattern of a larger number of small spots. Variance of Gex more overlaps with that of Dme but partly with that of CNV in distinct spots of high variance due to interactions between the respective omics modalities. The spot map selects distinct areas of high variance, thus defining modules of co-regulated genes in one, two or in all three omics domains using a percentile threshold as described previously [38] (Figure 4d). Overall, eleven major spots were detected which were labelled with capital letters A-K. The population map counts the genes per pixel of the SOM. It reveals that the areas of high Dme variance are occupied by genes continuously distributed across the metagenes while the regions of high CNV variance were occupied more by discontinuously spread "islands" of genes separated by empty (white) metagenes (pixels) in-between ( Figure 4e). This difference in gene distribution reflects the continuous feature space of Dme which contrasts the more discontinuous CNV feature space referring to a trinary loss-intact-gain metric.
The spot profiles comprise Gex, Dme and CNV score values across the LGG-samples studied (Figure 4f). They were dominated by alterations of Gex and Dme (green marks), CNV (yellow), or Gex only (red) which associates with the localization of the respective spots in different regions of the SOM (compare with Figure 4d). They can be assigned either to functional gene sets using knowledge mining (mostly Gex dominated spots), to methylation modes (Dme-dominated spots), or copy number aberrations at specific chromosomes (CNV dominated spots). Correlation plots of selected spots show negative correlations between Gex and Dme (spots B and G) and positive correlations between Gex and CNV (spots E and J) (Figure 4g). Ternary diagrams visualize the signal composition in terms of percentages of Gex, Dme and CNV in units GSZ using the spot genes as signature. Accumulation of tumours (dots in the diagrams) along the left Gex axis are indicative of strictly methylation driven gene expression (first row in Figure 4h) while tumours accumulating near the lower CNV axis are governed by a CNV Gex dose-response relation which, however, resembles more an on-off binary switch lacking fine tuning along the CNV axis (second row in Figure 4h). In summary, the combiSOM clusters genes into consensus modes of concerted changes of expression, methylation, and/or CNV and visualizes them in mutually linked landscapes. Their topologies are governed by the degree of co-regulation and the particular interactions between the omics realms studied. Hereby Dme and CNV mainly change along two perpendicular axes reflecting their partial independence while Gex is affected by both of them.

Cartography of Features, Functions, and of Their Prognostic Impact
The metagene covariance maps in Figure 5a visualize the mean deviation between the gene and the metagene profiles in each of the pixels [38]. The analysis generates patterns which merge properties of the variance map with that of the population map (compare with Figure 4d,e). Gene set analysis reveals that the highly resolved spot patterns in the different omics maps associate with functional contexts related to gene expression of healthy brain, to proneural, inflammation, and EMT (epithelial-mesenchymal transition) transcriptional signatures in the Gex map, to targets of the polycomb repressive complex 2 (PRC2) related to neural development and to GCIMP, GCIMP-O, GPCR, and RTKII (receptor tyrosine kinase type II methylation subtype) methylation modes in the Dme map and to individual chromosomes in the CNV map where key chromosomal aberrations in gliomas were marked as red chromosome numbers. For example, Chr.19-losses distribute over two neighbouring spots in the right lower corner of the map, one related to Chr.1/19codel in IDH-O and the other one to single Chr.19 deletions in IDH-A'. Glioma key genes were localized across the map (Figure 5a, map below) showing, e.g., an association of IDH1 and TP53 with proliferative activity and of CIC located at Chr.19 with chromosomal losses of this chromosome in IDH-O and IDH-A'. Methylation landscapes show lowest heterogeneity presumably because DNA methylation forms a coarser layer of genomics regulation preserving footprints of the respective cell of origin [42]. The CNV landscape is highly diverse where, however, only a few aberrations, such as Chr.1/19 codeletions or Chr.7+ and Chr.10-, seem to be key effectors of LGG heterogeneity considered in the genetic WHO classes. Hence, combined SOM portrayal links the molecular landscapes of expression, methylation, and genetic features with their prognostic impact, which, in turn, associates with key genes, functional and structural signatures of LGG pathogenesis. The prognostic map colour codes the overall survival (OS) hazard ratio (HR, compared with the mean OS averaged over the whole data set) between maximum (red, furious prognosis) and minimum (blue, good prognosis) HR. HR values are calculated metagene-wise by selecting tumours showing omics scores exceeding one standard deviation in positive direction (i.e., with high values of the score, see [35] for details). (c) Diversity analysis includes spot frequency distributions and profiles of the mean omics score and variance per tumour. The row below maps the key genes driving the different genetic subtypes (see Discussion section).

Profiling and Mapping Functional Signatures
For further knowledge mining in the combined data, we selected gene sets from different categories such as gene ontology biological process (GO BP), signature genes from previous omics studies (mostly Gex), as well as genes located at a certain chromosome. The prognostic map colour codes the overall survival (OS) hazard ratio (HR, compared with the mean OS averaged over the whole data set) between maximum (red, furious prognosis) and minimum (blue, good prognosis) HR. HR values are calculated metagene-wise by selecting tumours showing omics scores exceeding one standard deviation in positive direction (i.e., with high values of the score, see [35] for details). (c) Diversity analysis includes spot frequency distributions and profiles of the mean omics score and variance per tumour. The row below maps the key genes driving the different genetic subtypes (see Discussion section).
The prognostic maps relate Gex, Dme and CNV metagene values across the SOM to the hazard ratio (HR) of patients showing high values of the respective omics score in the respective metagene (Figure 5b, for a detailed description of the method see [35]). For example, the red region in the Gex prognostic map refers to high expression of genes amplified by Chr7 + CNV gains, genes not "suffering" from deletions at Chr1-and genes upregulated in an inflammatory context. These red high-HR areas refer to IDH-wt and partly to IDH-A and GCIMP-low LGG. The blue low-HR areas in this map indicate underexpressed genes in the respective groups. The Dme-prognostic map just shows partly colour-inverted HR-patterns compared with the Gex map where red substitutes blue and vice versa. It can be simply explained by the mostly anticorrelated relation between methylation and expression meaning that high methylation in IDH-wt, e.g., in the right upper corner of the SOM, associates with decreased expression levels. The CNV prognostic map shows similarities with the Gex map due to the correlated dose-response relation between both omics scores. Especially spot H enriching genes of Chr.7 + associates with the most furious prognosis in the CNV and Gex maps where the latter indicates bad prognosis for genes upregulated in the inflammatory context presumably in the tumour microenvironment (spot B) [35]. Notably, the most furious prognosis is predicted by hyper-methylation of spot G (HR > 7) observed specifically in IDH-wt LGG (see the spot profile in Figure 4f) and assigned to the GBM RTKI/II hypermethylation signatures [24,41].
Diversity analysis provided information about the heterogeneity of the molecular patterns in the different omics domains (Figure 5c). The spot frequency distributions count the fraction of tumours with a certain number of spots in their portraits, which, in turn, relates to the molecular heterogeneity of the respective subtypes. Most spots were found in the Gex portraits (up to 3), while Dme portraits show usually only one spot except that of IDH-O LGG with up to three spots. IDH-A shows the most diverse CNV patterns while IDH-O is the less diverse one, mostly related to Chr.1/19 codeletions. The mean metagene expression and methylation profiles (calculated per tumour) confirm their anti-correlated relation.
In summary, covariance summary maps provide highly resolved feature landscapes related to cellular functions, methylation modes, chromosomal aberrations, and key genes. Prognostic information with metagene resolution can be extracted after translating expression, methylation, and CNV metrics into HR-scale with possible impact for marker selection from regions of maximum or minimum HR values. Gene expression landscapes are overall the most diverse ones reflecting largest heterogeneity of cellular processes. Methylation landscapes show lowest heterogeneity presumably because DNA methylation forms a coarser layer of genomics regulation preserving footprints of the respective cell of origin [42]. The CNV landscape is highly diverse where, however, only a few aberrations, such as Chr.1/19 codeletions or Chr.7+ and Chr.10-, seem to be key effectors of LGG heterogeneity considered in the genetic WHO classes. Hence, combined SOM portrayal links the molecular landscapes of expression, methylation, and genetic features with their prognostic impact, which, in turn, associates with key genes, functional and structural signatures of LGG pathogenesis.

Profiling and Mapping Functional Signatures
For further knowledge mining in the combined data, we selected gene sets from different categories such as gene ontology biological process (GO BP), signature genes from previous omics studies (mostly Gex), as well as genes located at a certain chromosome. We plotted their ranked profiles, locations of the genes in the SOM as signature maps as well as ternary diagrams in Gex-Dme-CNV space (Figures 6a and A2 for a larger overview). For example, genes from Chr. 1 show low expression in IDH-O tumours due to the loss of copy numbers of Chr. 1, which is associated with slightly decreased methylation levels. These Chr. 1 genes accumulate in and near spot J referring to samples with codeletion together with Chr. 19, but also in a second area which refers to single deletions at Chr. 1 (see red circles in Figure 6a, first raw). Genes from Chr. 19 accumulate in other areas of the map, first of all, in spot I due to frequent codeletion together with Chr.1 in IDH-O as well as in a second spot for LGG with single deletions. CNV dominated alterations of LGG spread roughly along the CNV axis of the ternary diagrams (Figure 6b). GCIMP genes and PRC2-targets accumulate along the Gex axis, which is indicative for Dme-driven expression changes. The distributions of gene signatures related to inflammatory response and of GPCR encoding genes are similar and reflect a strong effect of methylation with subtle differences between IDH-O and IDH-A tumours (see for details [25,35]). Contrarily, cycling genes accumulate around spot F in the map. They form a cloud distributing along the bisectrix between the Gex and CNV axes in the ternary diagram, which indicates small effects of Dme and CNV on Gex, which, in turn, is more governed by transcription factor (TF) network, e.g., via Myc-targets distributing similarly as cycling genes (Figures 6 and A2b). Importantly, most functional signatures show rather continuous distributions despite the systematic differences between the subtypes. In summary, profiling and mapping of gene signatures provide combined information about the mutual effect of gene expression, methylation, and copy number variation on functional modes enabling to evaluate their impact on tumour heterogeneity. codeletion together with Chr. 19, but also in a second area which refers to single deletions at Chr. 1 (see red circles in Figure 6a, first raw). Genes from Chr. 19 accumulate in other areas of the map, first of all, in spot I due to frequent codeletion together with Chr.1 in IDH-O as well as in a second spot for LGG with single deletions. CNV dominated alterations of LGG spread roughly along the CNV axis of the ternary diagrams ( Figure 6b). GCIMP genes and PRC2-targets accumulate along the Gex axis, which is indicative for Dme-driven expression changes. The distributions of gene signatures related to inflammatory response and of GPCR encoding genes are similar and reflect a strong effect of methylation with subtle differences between IDH-O and IDH-A tumours (see for details [25,35]). Contrarily, cycling genes accumulate around spot F in the map. They form a cloud distributing along the bisectrix between the Gex and CNV axes in the ternary diagram, which indicates small effects of Dme and CNV on Gex, which, in turn, is more governed by transcription factor (TF) network, e.g., via Myc-targets distributing similarly as cycling genes (Figures 6 and Figure A2b). Importantly, most functional signatures show rather continuous distributions despite the systematic differences between the subtypes. In summary, profiling and mapping of gene signatures provide combined information about the mutual effect of gene expression, methylation, and copy number variation on functional modes enabling to evaluate their impact on tumour heterogeneity. Figure 6. Sets of signature genes were analysed in terms of ranked profiles, maps, and ternary diagrams using the gene set Z score (GSZ) of expression, methylation, and CNV values. (a) LGG were ranked with increasing GSZ from left to right. Tumours of different genetic groups accumulate at low or high GSZ values as indicated (LGG-bars are coloured according to their genetic group; see Figure 2 for assignment). The signature maps show the distribution of signature genes in the map. Their accumulation in selected spot areas is indicated by red circles (see Figure 4d). (b) Ternary diagrams show mutual dependencies where Gex of the respective set is driven via dose response by CNV (see chromosomal sets), via repression by Dme (see GCIMP set), by combinations of both or none of both (see the legend in the right part). A fine structure spans in direction (blue arrow) perpendicular to the major axis of variation (red arrow). Gene sets were taken from [43,44]. For a larger collection of gene signatures see Figure A2.

Integrative Portrayal of the LGG Subtype Diversity-Beyond the Genetic Classes
The pairwise correlation maps in Figure 3 reveal a fine structure of glioma heterogeneity not resolved by the genetic groups. Our previous LGG classification resolved this heterogeneity into eight expression (E1-E8) and six methylation (M1-M6) subtypes [25]. The E-and M-subtypes mutually overlap to 70-90% of the tumours which reflects inter-omics regulatory modes between gene expression and DNA methylation modalities, however, the absence of a strong one-to-one relationship [25,35] (see the colour bars above the heatmaps in Figure 2, which assign the LGG to the three different classification schemes). Here we compared the genetic groups with these E-and M groups by means of their combined portraits in the three omics realms in order to identify overlapping and disjunct modes of genomics regulation.
The flow diagram illustrates the balance of tumours between the genetic groups applied here and the E-and M groups of the previous, independent classifications (Figure 7a). The Gex, Dme and CNV portraits of the IDH-wt genetic group closely resemble those of the E1-expression and M1-methylation subtypes, while the portraits of the IDH-O genetic group are very similar to those of the E6 and M5 subtypes because of the large degree of overlap (see thick flows in Figure 7a). The IDH-A tumours decompose into several substrata (E2-E5 on expression side and M2-M4 on methylation side) where the majority accumulates in the core astrocytoma subtypes E4 and M4 [35]. The NL-like subtypes (E7, E8 and M6) distribute virtually over all genetic groups, which confirms their "technical" origin due to low tumour cell content [25]. Their portraits consequently show partly healthy brain properties not explicitly resolved in the portraits of the genetic groups.

Reweighting the Modalities-Single Omics Dominated Maps
Our integrative SOM approach combined the Gex, Dme and CNV modalities applying equal weight to each of them (we = wm = wc = 1/3; see Figure 1 for illustration). This balanced combiSOM generated a landscape which is affected by all three omics data domains. The alteration of the weight factors under the condition we + wm + wc = 1 would increase or to decrease the relative effect of each of the different modalities. In a previous two-modality (Gex and Dme) combiSOM approach on WHO grade IV glioblastoma data, we tuned the weighting factor between dominant weighting of Gex (we = 0.99) and Dme (wm = 0.99) [34]. We found that the respective major weight component determines whether gene expression or methylation data show largest variance in the resulting virtually "univariate" SOM. For the LGG data, we here calculated three such weight- For the largely overlapping groups, the respective omics portraits virtually agree, namely between IDH-wt, E1, and M1 and between IDH-O, E6, and M5. Conversely, IDH-A/A' split into a series of subtypes, differing in the expression activity of cell cycle as well as inflammatory genes (increase from E4/M4 towards E2,3/M3,2) paralleled by decaying methylation in the respective spots (Figure 7a,b) [35]. Hence, re-stratification of the LGG into the E-and M groups resolves heterogeneity with higher granularity showing partly clear differences, e.g., between the IDH-A/A' subgroups and between the NL-tumours and the rest ( Figure A3).
Profiles of the most pronounced spots illustrate the variations of Gex, Dme and CNV features across the LGG groups and subtypes, e.g., antagonistic overexpression and hypermethylation of IDH-wt which allocates the affected genes in opposite corners (lower left versus upper right) of the map (Figure 7b). This anti-correlation becomes even better resolved in the respective ScoV map (Figure 7c) revealing, e.g., hypermethylation modes in IDH-mut (GCIMP) and IDH-wt (proneural, PN) tumours by blue spots (see [24]). The Gex versus CNV ScoV-profiles indicate direct correlations (red spots) between gene expression and CNV in IDH-wt (E1, M1) in the left upper corner mainly due to Chr10-losses as well as in IDH-O (E6, M5) in the right lower corner due to Chr1/19 codeletions. Comparison of the Gex versus Dme ScoV-profiles, in turn, indicate negative correlations (blue spots) which refer to GCIMP and anti-GCIMP methylation modes (Figure 7c). In summary, resolution of the genetic groups into expression and methylation subtypes further refines the combined omics landscapes, especially of IDH-mut astrocytomas, IDH-A/A', in terms of functional modes such as alterations of the cell cycle activity and inflammatory response governed by combined changes of methylation and/or CNV modalities. Combined portrayal resolves these coupled modes with high granularity and visualizes them in an intuitive fashion.

Reweighting the Modalities-Single Omics Dominated Maps
Our integrative SOM approach combined the Gex, Dme and CNV modalities applying equal weight to each of them (w e = w m = w c = 1/3; see Figure 1 for illustration). This balanced combiSOM generated a landscape which is affected by all three omics data domains. The alteration of the weight factors under the condition w e + w m + w c = 1 would increase or to decrease the relative effect of each of the different modalities. In a previous two-modality (Gex and Dme) combiSOM approach on WHO grade IV glioblastoma data, we tuned the weighting factor between dominant weighting of Gex (w e = 0.99) and Dme (w m = 0.99) [34]. We found that the respective major weight component determines whether gene expression or methylation data show largest variance in the resulting virtually "univariate" SOM. For the LGG data, we here calculated three such weight-dominated combiSOM by setting w e = 0.99 (and w m = w c = 0.005) for the Gex dominant SOM and analogously w m = 0.99 or w c = 0.99 for Dme-and CNV dominance, respectively. As the main result, we found that group-averaged portraits and variance maps of the Gex and Dmedominated SOM show mostly similar topologies governed by the high variant modules of gene expression and DNA methylation, respectively ( Figure A4). In contrast, the topology of the CNV dominated SOM is governed by copy number aberrations, e.g., at Chr. 1, 19, 7 and 10 and the respective dose-response relationships between CNV and Gex ( Figure A4). In general, tuning the weight factors thus enables tuning the SOM topologies between high Gex variant, high Dme-variant and high CN-variant spot patterns where overlap regions are indicative for couplings between the omics modalities such as repressive interactions between Dme and Gex and dose-response relationships between CNV and Gex. Each of these single-omics dominant SOMs can be used to study these situations in detail. However, each of them provides a separate SOM-topology with different distributions of genes, which requires a new and separate orientation and interpretation of the respective map. Conversely, the equally weighted SOM combines the omics modalities in a balanced fashion, which makes this setting a preferential option for balanced multi-modal omics landscapes. Finally, setting weights in a virtually bi-variant fashion (e.g., w e = w m = 0.5; w c = 0) would generate SOM landscapes governed by the co-variance between two of the omics domains, thus providing a third option for studying relationships between the omes.

Multi-Omics Cartography of LGG
DNA methylation of CpGs in gene promoters, CNV, and gene expression are mutually dependent effects that affect activity of cellular programs. We here presented a method enabling the combined analysis of multi-omics data based on SOM machine learning. It "portrays" the expression, methylation, and CNV landscapes using the same "gene-centred" coordinate system which enables their combined visual evaluation and direct comparison on a personalized and class-related basis for each tumour and subtype, respectively. We applied this combined SOM portrayal to LGG, a relatively well characterized tumour entity which classifies into a series of molecular subtypes defined by genetic key aberrations such as the mutation of the IDH-gene and/or co-losses on chromosomes 1 and 19. These changes together cause large-scale effects on DNA methylation and gene expression, and in final consequence, associate with cell fate decisions towards oligodendroglioma-(IDH-O) or astrocytoma-(IDH-A), as well as GBM(IDH-wt)-like tumour cell lineages with different prognostic impact [45].
Our multi-omics cartography visualizes three layers of molecular landscapes which are linked by gene-centred Dme (promoter methylation), Gex (gene expression), and CNV (copy number variation) features ( Figure 8). Our method segments the different omics landscapes into modules of co-methylated, co-expressed, and co-aberrant genes which are visualized as peaks in the three dimensional and as spots in the two-dimensional maps, respectively. They reflect the underlying network of regulatory modes of cell activity within each of the omics layers and between them. For example, GCIMP-methylation due to the IDH mutation generates such peaks in the left lower and right upper part of the map which hypermethylate in tumours carrying either the mutated or the intact IDH gene, respectively. Conversely, key CNV associating with IDH-wt and IDH-O tumours more distribute along the other diagonal from the left upper towards the right lower part of the map. This virtually orthogonal distribution of DNA methylation and genetic CNV effects reflects their partial independence. In contrast, variant transcriptome modules spread throughout the whole map because gene expression is modulated by methylation as well as by CNVs preferentially via repressive and direct interactions, respectively. Certain modules such as the G-protein coupled receptor (GPCR) that peak strongly protrude only in the methylation map, thus illustrating that variations in one of the omes does not necessarily transform into variations in another omics layer (see, e.g., [35] for a detailed discussion). The different modes associate with different functional contexts ( Figure 4) and prognoses, which can be visualized in terms of functional and hazard ratio maps, respectively (Figure 5a,b) and which reflect aspects of pathogenesis of LGG.

LGG Pathogenesis Is Governed by Genetic and Epigenetic Factors along Subtype Specific Paths
Particularly, mutations in the IDH1/2 gene(s) are the driving event behind IDH-mut LGG [35,45]. The resulting misfunction of mutated IDH1/2 proteins in the TCA cycle induces widespread alterations of DNA methylation via an onco-metabolic mechanism. The aberrant oncometabolite 2-hydroxyglutarate (2-HG) inhibits the activity of a series of enzymes erasing or writing methylation marks at the DNA and at histone side chains, which deregulates the epigenetic machinery of the cells, making them more plastic and forming a common progenitor of both IDH-A and IDH-O subtypes. IDH-O gliomas arise after the Chr. 1p/19q co-deletion as genetic hallmark. It deactivates CIC (at Chr. 19q), a transcriptional repressor, which in consequence promotes proliferation, blocks differentiation and, in combination with activating mutations in the promoter of the TERT-gene, supporting survival of IDH-O tumour cells via a telomerase-driven telomere maintenance mechanism (TEL-TMM). IDH-A, in contrast, lacks the Chr. 1p/19q co-deletion, but instead, its pathogenesis is driven by a triple hit mechanism. First, deactivating mutations of the TP53 gene (hit 1), a stereotypical tumour suppressor deactivated in more than 50% of all cancers, accumulate excessive mutations and genomics damage in the progressing cells. Second, deactivating mutations of ATRX (hit 2), encoding a chromatin remodelling enzyme of the SWI/SNF family, alter the chromatin structure in the telomeric regions and promotes cell survival by an alternative telomere maintenance mechanism (ALT-TMM). Third, hit 3 downregulates transcription of SOX2, a transcription factor essential for cell differentiation, via a promoter-enhancer dissociation mechanism induced by aberrant local DNA-hypermethylation, which disrupts CTNF-induced chromatin looping. In consequence, hit 3 blocks differentiation of the neuronal progenitors. Pathogenesis of IDH-wt gliomas is affected by blocked differentiation driven by key CNV at Chr. 7 and 10 and aberrant function of PRC2 and of its components such as EZH2. necessarily transform into variations in another omics layer (see, e.g., [35] for a detailed discussion). The different modes associate with different functional contexts ( Figure 4) and prognoses, which can be visualized in terms of functional and hazard ratio maps, respectively (Figure 5a,b) and which reflect aspects of pathogenesis of LGG.

LGG Pathogenesis Is Governed by Genetic and Epigenetic Factors along Subtype Specific Paths
Particularly, mutations in the IDH1/2 gene(s) are the driving event behind IDH-mut LGG [35,45]. The resulting misfunction of mutated IDH1/2 proteins in the TCA cycle induces widespread alterations of DNA methylation via an onco-metabolic mechanism. The aberrant oncometabolite 2-hydroxyglutarate (2-HG) inhibits the activity of a series of enzymes erasing or writing methylation marks at the DNA and at histone side chains, which deregulates the epigenetic machinery of the cells, making them more plastic and forming a common progenitor of both IDH-A and IDH-O subtypes. IDH-O gliomas arise after the Chr. 1p/19q co-deletion as genetic hallmark. It deactivates CIC (at Chr. 19q), a transcriptional repressor, which in consequence promotes proliferation, blocks differentiation and, in combination with activating mutations in the promoter of the TERT-gene, supporting survival of IDH-O tumour cells via a telomerase-driven telomere maintenance mechanism (TEL-TMM). IDH-A, in contrast, lacks the Chr. 1p/19q co-deletion, but instead, its pathogenesis is driven by a triple hit mechanism. First, deactivating mutations of the TP53 gene (hit 1), a stereotypical tumour suppressor deactivated in more than 50% of all cancers, accumulate excessive mutations and genomics damage in the progressing cells. Second, Hence, overall pathogenesis of LGG is governed by a complex interplay of genetic and epigenetic mechanisms giving rise to different modes of genomics regulation which associate with key functions such as proliferation, inflammation, and cell differentiation in a subtype specific fashion. Our combiSOM identifies mutual relations between these modes in and between the omics layers. For example, one finds that IDH-wt gliomas selectively share activated proliferation with IDH-O while activated inflammatory features are shared with IDH-A. Increasing inflammatory characteristics of IDH-A associates with decaying GCIMP and overall DNA methylation while, conversely, GPCR methylation increases. Our combiSOM maps directly visualize the mutual association between the Gex and Dme layers, which in independent single omics analytics, are not obvious and required additional analytic efforts for their identification. The prognostic maps link molecular features in all three omics layers with potential impact for molecular marker selection.

What Modality Is the Best?
Our analysis provides intuitive answers to important questions such as "What is the best?", which is frequently asked about multi-omics settings.
What omics modality is the best? CNV and DNA methylation are "structural" modalities affecting genes with or without explicit functional consequences, e.g., by copy number losses of whole chromosome arms, e.g., of Chr. 10p, 1p or 19q in LGG [22] or by large-scale hypermethylation affecting, e.g., the olfactory subgenome including GPCR as a functional category [25]. These "structural" lesions only partly associate with the transcriptome leading to a lack of one-to-one correspondence as demonstrated previously [34,35]. In consequence, co-methylated or co-aberrant genes selected from the Dme or CNV domains, respectively, usually don't strongly enrich functional gene signatures due to their "structural" origin which dilutes the functional signature genes often in a "sea" of nonfunctional ones. However, they can provide footprints of cancerogenesis such as cell of origin characteristics maintained in the Dme patterns [42,46]. Contrarily, co-expressed genes selected from the Gex domain often directly reflect regulatory networks of distinct functional pathways. In consequence, the transcriptome domain seems advantageous for extracting functional information based on the "guilt by association" principle [47] and eventually can be used as filter to extract genes with functional relevance from genetic and methylation patterns.
What modality provides the best markers? Correlation of characteristic genetic markers with clinical outcome defined three major prognostic groups [22] considered in the LGG classification proposed by WHO [48] and here as IDH-A (and IDH-A'), IDH-O, and IDH-wt genetic groups. It stratified patients into prognostically distinct classes better than histological classes. The addition of gene expression markers to this genomics classifier did not further improve prognosis [22]. This result simply shows that due to interactions between the omics modalities their individual prognostic power is partly redundant and their combination doesn't markedly improve prognosis. However, the HR maps (Figure 5b) reveal that the (red and blue) areas of prognostic impact only partly overlap between the modalities meaning that suited single marker genes from different omics domains can diverge because of their different responsiveness. For example, the IDH gene, although probably the most distinctive genetic marker, shows only weak differential expression and methylation between the IDH-mut and IDH-wt groups, while conversely, CIC located at Chr. 19 is a suited expression marker to discriminate between IDH-A and IDH-O. Multi-omics cartography in terms of prognostic maps provides a tool to extract such gene signatures of maximum/minimum HR from the different omics domains. Novel genetic and functional markers and signatures for brain tumours are under discussion [49]. Note that other tumour types such as colon cancer [50] or B-cell lymphomas [51] show less clear overall omics landscapes with respect to clinical outcomes which suggests that amendments can be reached presumably by combining different omics modalities.
What subtyping is the best? We independently used expression and methylation data, to refine the genetic groups regarding neuronal-type (NL) LGG of low tumour content distributed over all genetic groups and regarding three to four IDH-A subtypes differing in immunogenic properties partly resembling that observed in pilocytic or mesenchymal astrocytomas [35] with possible impact for treatment resistance and prognosis [25,52]. Multiomics cartography well illustrates subtle differences between the molecular landscapes of the different subtypes (Figure 7a). Expression, methylation, and genetic subtypes largely overlap but show differences in 10-20% of the number of tumours in each of the classes [35]. Consenting subtypes from the different omes would simplify the picture but conversely, would remove important domain-specific details such as methylation of the GPCR genes with possible impact on immunogenicity of the tumour microenvironment. We therefore advocate that omic-specific subtyping should complement consensus classification schemes in order to more specifically considers functional details.
What integration method is the best? Our combiSOM method enables integrating the different domains with variable weights. Equal-weighting combines the omics layers in a balanced fashion, thus revealing mutually integrating features in the gene state space reflecting the inter-omics network of genomics regulation. Tuning the weights towards single-ome dominated landscapes conversely reveals co-variant genes regarding the dominant layer and particularly, non-covariant ones, e.g., if GPCR expression on the average is weakly affected by methylation. We recently used these dominant-omics settings to decipher the cooperation between epigenetics and transcription factor networks for cell fate decisions [53].
Hence, the overall answer on all these questions is a Solomonic "it depends" one, where the multi-omics view provides more flexibility and a wider, holistic view in data space for selecting markers and subtypes. More importantly, multi-omics data are mandatory for the comprehensive understanding of the whole repertoire of genomics regulation underlying cancer genesis and development. A similar "Solomonic" conclusion was drawn in a systematic benchmark study comparing multi-omics integration methods on cancer data [54]. The effect of different omics data types varies and can improve the outcome in terms of both clustering and clinical metrics, but it can have even a negative effect if too many omics layers are integrated, thus refuting the intuition that incorporating more types of omics data always helps produce better results. As possible reasons for this counterintuitive observation, the authors see redundancy in the information content of different types of omics data, especially between Gex and Dme data, as well as suboptimal bioinformatics processing in the integration step. These reasons, why one integrative method behaves this way and another in another way, are not clear, mainly because of the "black box" character of the kernel algorithms making it difficult to follow the integral processing step which combines the different domain data. Our integral cartography, conversely, applies a simple weighted combination of the omics domain data together with their intuitive visualization which supports the interpretation of the relatedness between the omics layers and to adjust their combination for the particular question under study.

Limitations and Future Applications
Our publication illustrates the potential of multi-omics SOM cartography. We selected a hitherto well-characterized LGG data set as a worked example for illustration [22,25,35]. Our study is limited by the relatively small sample size of 122 tumours which doesn't resolve molecular details of part of subtypes. We have previously shown that a larger set of more than 400 LGG available in the TCGA data portal (The Cancer Genome Atlas) well fits into the classification scheme proposed based on our smaller data [25], which supports the reliability of the landscapes presented here. Other limitations are the restriction to methylation of the gene promoter region and to CNV as a specific feature of the genetic domain. We see future applications, for example, by considering explicitly the mutation domain, methylation in the gene body and/or upstream enhancer regions as well as chromatin accessibility as measured by the ATAC (Assay for Transposase-Accessible Chromatin using sequencing) technique often in combination with gene expression data in single-cell settings. Another application of combiSOM addressed layers of different histone modifications obtained by means of ChIP-seq measurements [32].

Conclusions
Multi-omics SOM cartography allows for disentangling the diversity of regulatory modes of cell functions in terms of easy-to-interpret gene-centric data landscapes. They visualize aberrant changes related to complex diseases such as cancer. The method is not restricted to the triple-omics data used here. It can be extended to integrate other omics features such as genetic mutation or protein expression data. "Phenotype" association with survival of the patients and other clinical characteristics potentially extends the visualization options of the method. Due to the growing use of multi-omics data, we expect that these options will become important for future progress in cancer bioinformatics.  Funding: This study was supported by the Science Committee of RA in the frames of the projects 21AG-1F021 and 21SC-BRFFR-1F020 (to A.A.).

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable here.
Data Availability Statement: Gene expression and methylation data are available in the gene expression omnibus (GEO) database under accession number GSE61374 (LGG expression [22]) and GSE129477 (LGG methylation [25]). The R-object is available in the Leipzig Health Atlas under ID 8CUCNFVX6N-6 (see Appendix B).

Acknowledgments:
We acknowledge the German Glioma Network (GGN, Coordinator: M. Weller, Zuerich) for kindly providing molecular and clinical data of all glioma cases reported in this publication. This work is partly supported by the Federal Ministry of Education and Research (BMBF) project grant Leipzig Health Atlas.

Conflicts of Interest:
The authors declare no conflict of interest.

Centralization and Harmonization of the Omics Scores
Gene-centric expression, methylation and copy number data (E ns , M ns and C ns , respectively) were preprocessed as described previously [24] and then centralized by subtracting the respective log-mean averaged over all gliomas as follows.
where n = 1 . . . N is the gene index and s = 1 . . . S is the sample (tumour) index. For methylation measures, we used "M"-scale (ratio of methylated to unmethylated CpG) instead of beta-scale (fraction of methylated CpG) where methylation was integrated over the promoter region of each gene (from 2000 bp upstream to 200 bp downstream). The centralized expression, methylation and copy number values thus define "omics scores" referring to the differential levels of each gene n in the data set studied relative to the respective mean averaged over all tumours. If not stated otherwise, we used the terms over-and underexpression throughout the paper for ∆e ns > 0 and ∆e ns < 0, respectively; the terms hyper-and hypo-methylation for ∆m ns > 0 and ∆m ns < 0, respectively, and CNV gains and losses for ∆c ns > 0 and ∆c ns < 0. We define "profiles" of the omics scores given as data vector for each gene with the sample-related values as elements: ∆e n• = (∆e n1 , . . . , ∆e nS ), ∆m n• = (∆m n1 , . . . , ∆m nS ) and ∆c n• = (∆c n1 , . . . , ∆c nS ) (A4) and "states", given as data vector for each sample with the gene-related values as elements, ∆e •s = (∆e 1s , . . . , ∆e Ns ), ∆m •s = (∆m 1s , . . . , ∆m Ns ) and ∆c •s = (∆c 1s , . . . , ∆c Ns ) (A5) For combined analysis, data were transformed into a unique, "harmonized" scale by normalizing them with respect to the mean absolute value averaged over all data

CombiSOM Training
After centralization and harmonization, we merged expression and methylation profiles into combined profiles where the data were combined with different mutual weights chosen from the data intervals w e , w m , w e = (0,1) meeting the condition w e + w m + w e = 1 (see Figure 1 for a schematic overview of the data processing pipeline). The combined profile vectors are then clustered into so-called metagenes by applying machine learning using Self Organizing Maps (SOM) as implemented in oposSOM [19] in the combiSOM version [34]. The metagenes, in analogy to the input profiles, split into the three metagene vectors of gene expression (Gex), DNA methylation (Dme) and CNV, with k = 1, . . . K where K is the total number of metagenes. The method uses an iterative training algorithm to ensure that the metagene profiles optimally cover the data space. SOM training is performed in two dimensions by arranging the metagenes in a quadratic grid of size x,y = 1 . . . √ K used for visualization of the data landscapes in metagene space. We used a SOM of size 45 × 45 and default parametrization of SOM training as implemented in [19].
After SOM training, meta-and single Gex, Dme and CNV data are back transformed into their original scales for visualization and further downstream analysis. Accordingly, each sample studied is characterized by its state of metagene expression, methylation and CNV, respectively. The Gex, Dme and CNV states of each tumour sample were visualized by colour coding the metagene values in the quadratic mosaic grid between red (high scores) to blue (low scores) in maximum-minimum metrics for each sample. This way, one obtains three images per sample which "portrait" its expression, methylation and CNV landscapes separately ( Figure 1). Importantly, each metagene is located at the same position in all three maps. It is associated with the same cluster of single genes because of the joint training of Gex, Dme and CNV data. Therefore, all three maps can be directly compared with another, e.g., to identify regions of specific combinations of expression and methylation data.

Signed Square Root Covariance (ScoV) Maps
Associations between the different omics domains can be estimated by calculating the signed square root covariance (ScoV) between their metagene profiles in a pairwise fashion, e.g., between the Gex and Dme profiles

The Gene Set Z score (GSZ) and Ternary GSZ-Diagrams
The gene set score (GSZ) estimates the normalized mean log-expression of the member genes of a set in each of the tumour samples. It is calculated as GSZ s (set) = ∆e ns n=set SD n=set,all s , where normalization is provided as Z score by using the standard deviation (SD) of the expression of the set genes in all samples [20,39]. For Dme and CNV data, the GSZ score is calculated analogously by substituting the expression value by the respective methylation and CNV values. For ternary diagrams, GSZ scores were transformed into percentages of the domain values %GSZ s (set) = GSZ s (set) with GSZ s (set) = GSZ s (set) − min(GSZ(set)).
∑ GSZ s (set) denotes the sum over the Gex, Dme and CNV GSZ'values of the respective tumour samples.

Tumour Similarity Analysis, Supporting and Prognostic Maps
Similarity analysis compares the SOM portraits of the tumour samples by means of Pearson's correlation coefficient of their Gex, Dme and CNV values using meta-genes instead of single genes, which has the advantage of improving the representativeness and resolution of the results [20]. The correlation matrix was visualized using pairwise correlation maps ( Figure 3a) and correlation net (Figure 3b) representations. The correlation net constructs an unweighted graph by connecting the nodes (samples) whose pairwise correlation coefficient exceeds a given threshold (0.5 here, see [20]).

Data and Program Availability
The R-workspace of the combiSOM application is available at the Leipzig Health Atlas (LHA) repository under the link https://www.health-atlas.de/data_files/583 (LHA-ID: 8CUCNFVX6N-6, provided 6 May 2022). The quick-start guide is available under https://www.health-atlas.de/presentations/4 (see Supplement File S2).    which is partly observed for G-protein coupled receptors (GPCR), thus suggesting their role in this process. Targets of the polycomb repressive complex (PRC2) and so-called low transcription factor activity genes (low TF) are mostly regulated by epigenetic mechanisms in the context of differentiation and development of healthy tissues and cancer [43,56]. Genes from these sets split into two major populations found near opposite corners of the map, namely the lower-left one (Gex_UP in LGG) and the upper right corner (Gex_DN in LGG). This split indicates that the respective genes are affected by an LGG group specific hyper-methylation patterns which activates them either in healthy brain or in brain cancer. Conversely, so-called high-TF genes associate with regulatory modes governed by TF-networks. They accumulate in regions of increased proliferative activity observed for MYC-targets. (c) Glioma-related signatures were taken from [22,24,37,41]. They show mostly pronounced enrichment in spot areas and clearly associate with group-wise up-and/or down-regulation in the different omics domains. They were adequately extracted from glioma expression data sets. Therefore, these signatures give rise to the spread of the LGG along the Gex axis of the ternary diagram, however with segregation between subtypes, e.g., between IDH-A and IDH-O. Genes with functions in inflammatory response are downregulated in IDH-O due to their hypermethylation, a pattern, which is partly observed for G-protein coupled receptors (GPCR), thus suggesting their role in this process. Targets of the polycomb repressive complex (PRC2) and so-called low transcription factor activity genes (low TF) are mostly regulated by epigenetic mechanisms in the context of differentiation and development of healthy tissues and cancer [43,56]. Genes from these sets split into two major populations found near opposite corners of the map, namely the lower-left one (Gex_UP in LGG) and the upper right corner (Gex_DN in LGG). This split indicates that the respective genes are affected by an LGG group specific hyper-methylation patterns which activates them either in healthy brain or in brain cancer. Conversely, so-called high-TF genes associate with regulatory modes governed by TF-networks. They accumulate in regions of increased proliferative activity observed for MYC-targets. (c) Glioma-related signatures were taken from [22,24,37,41]. They show mostly pronounced enrichment in spot areas and clearly associate with group-wise up-and/or down-regulation in the different omics domains. They were adequately extracted from glioma expression data sets. Therefore, these signatures give rise to the spread of the LGG along the Gex axis of the ternary diagram, however with segregation between subtypes, e.g., between IDH-A and IDH-O.  Particularly, the Gex governed SOM (w e = 0.99, first row of maps) reveals a few areas collecting genes of highly variant expression which overlap with variant methylation areas due to the interactions between Dme and Gex. The spot related to cell cycle genes appears virtually only in the Gex map, thus indicating a weak association with Dme (and CNV). Hence, the expression of these genes is only partly affected by differential methylation. Analogously, the most variant areas in the CNV map (caused by Chr. 1/19 codel) only partly match with the expression spots, which indicates that the respective modes are affected not only by CNV.

Appendix B. Additional Figures and Tables
The Dme-governed SOM (w m = 0.99, second row of maps) is driven by differential methylation in four major areas which transform into areas of differential expression due to mostly repressive interactions. Genes related to cell cycling accumulate in the middle of the map. Their expression is driven by transcription factor networks such as Myc driven targets and virtually not by methylation and not by CNV.
The CNV governed SOM (w C = 0.99, third row of maps) shows a pattern of copy number aberrations which largely directly transforms into the Gex variance map. Compare the "dominantly weighted" maps with the "equally weighted" maps in Figure 4a,c. The variance maps identify spot-regions of co-expressed (Gex map), co-methylated (Dme) and co-aberrant (CNV) genes which partly mutually overlap between the different modalities. (c) Distribution of genes in each of the SOM is different. It is governed by the dominantly weighted omics domain giving rise to clusters of highly variant gene expression (Gex), methylation (Dme) or copy numbers. Selected gene signatures consequently distribute differently in the different SOM. GCIMP genes were collected by combined hypermethylation and underexpression [37]. They consequently accumulate in certain areas in the Gex and Dmedominated maps but diffusely spread over different chromosomal locations in the CNV dominated map. Inflammatory genes [44] show the strongest local enrichment in the Gex dominated SOM but are more distributed in the two other SOM. Genes encoding G-protein coupled receptors (GPCR) accumulate in the middle of the Gex dominated SOM, an area of low variant expression, while in the Dme-dominated SOM these genes associate with higher variance, meaning that to a higher degree with changing methylation than with expression. Low expression transcription factor genes associate more with epigenetic modes of genomics regulation, while high expression TF associate more with transcription factor networks [56].
Particularly, the Gex governed SOM (we = 0.99, first row of maps) reveals a few areas collecting genes of highly variant expression which overlap with variant methylation areas due to the interactions between Dme and Gex. The spot related to cell cycle genes appears virtually only in the Gex map, thus indicating a weak association with Dme (and CNV). Hence, the expression of these genes is only partly affected by differential methylation. Analogously, the most variant areas in the CNV map (caused by Chr. 1/19 codel) only partly match with the expression spots, which indicates that the respective modes are affected not only by CNV.
The Dme-governed SOM (wm = 0.99, second row of maps) is driven by differential methylation in four major areas which transform into areas of differential expression due to mostly repressive interactions. Genes related to cell cycling accumulate in the middle The variance maps identify spotregions of co-expressed (Gex map), co-methylated (Dme) and co-aberrant (CNV) genes which partly mutually overlap between the different modalities. (c) Distribution of genes in each of the SOM is different. It is governed by the dominantly weighted omics domain giving rise to clusters of highly variant gene expression (Gex), methylation (Dme) or copy numbers. Selected gene signatures consequently distribute differently in the different SOM. GCIMP genes were collected by combined hypermethylation and underexpression [37]. They consequently accumulate in certain areas in the Gex and Dme-dominated maps but diffusely spread over different chromosomal locations in the CNV dominated map. Inflammatory genes [44] show the strongest local enrichment in the Gex dominated SOM but are more distributed in the two other SOM. Genes encoding G-protein coupled receptors (GPCR) accumulate in the middle of the Gex dominated SOM, an area of low variant expression, while in the Dme-dominated SOM these genes associate with higher variance, meaning that to a higher degree with changing methylation than with expression. Low expression transcription factor genes associate more with epigenetic modes of genomics regulation, while high expression TF associate more with transcription factor networks [56].  Table S1 for the full gene list, gene names and further details. (b) Gene sets were implemented in oposSOM ( [19,39] and references cited therein). Enrichment was calculated using Fisher's exact test [57]. Only gene sets with enrichment p < 10 −6 were considered.