An Exploratory Application of Multilayer Networks and Pathway Analysis in Pharmacogenomics

Over the years, network analysis has become a promising strategy for analysing complex systems, i.e., systems composed of a large number of interacting elements. In particular, multilayer networks have emerged as a powerful framework for modelling and analysing complex systems with multiple types of interactions. Network analysis can be applied to pharmacogenomics to gain insights into the interactions between genes, drugs, and diseases. By integrating network analysis techniques with pharmacogenomic data, the goal is to uncover complex relationships and to identify key genes for use in pathway enrichment analysis, which reveals the biological pathways involved in drug response and adverse reactions. In this study, we modelled omics, disease, and drug data together through a multilayer network representation. Then, we mined the multilayer network with a community detection algorithm to obtain the top communities. After that, we used the list of genes identified from the communities to perform pathway enrichment analysis (PEA) to determine the biological functions affected by the selected genes. The results show that the genes forming the top community have multiple roles through different pathways.


Introduction
Network analysis is a branch of network science that deals with the study of complex networks. To investigate complex relationships, network analysis adopts theories and methods typical of several research areas [1]. Networks and network analysis methods are a keystone in computational biology and bioinformatics and are increasingly being used to study biological and clinical data in an integrated way. In detail, network analysis consists of a collection of techniques with a shared methodological perspective, which allows the depiction of relations among entities and the analysis of the structures that emerge from the recurrence of these relations. The basic assumption is that better explanations of different phenomena are yielded by the analysis of the relations among entities. A classical network analysis method is community detection [2]. Community detection is one of the most popular research areas across various complex systems, such as biology, sociology, medicine, and transportation systems [3,4]. The reason for this is that community structures, defined as groups of nodes that are more densely connected to each other than to the rest of the network, represent significant characteristics for understanding the functionalities and organization of complex systems modelled as networks [2]. Communities are expected to play significant roles in the structure-function relationship. For example, in biological networks such as protein-protein interaction (PPI) networks, communities represent proteins involved in a similar function; in neuroscience, the communities detected in brain networks correspond to regions of interest (ROI) that are active during tasks. In social networks, communities can be groups of friends or colleagues. In the World Wide Web, communities represent web pages sharing the same topic [5]. Thus, the discovery of communities in these systems has become an interesting approach to figuring out how network structure relates to system behaviour.
In recent years, network analysis has become an essential tool in pharmacogenomics [6,7]. By providing a powerful framework to model data, network analysis has allowed researchers to analyse and interpret complex interactions between genes, proteins, and drugs. This makes it possible to uncover underlying biological mechanisms, identify potential drug targets and biomarkers, facilitate drug repurposing efforts, and enable personalized medicine approaches. In particular, network analysis can identify potential drug targets by constructing biological networks that integrate various data sources, such as protein-protein interactions, gene expression data, and pathway information. Furthermore, by analysing the network topology and identifying key nodes or modules, it is possible to pinpoint genes or proteins that play crucial roles in disease pathways or drug responses [8][9][10]. This information can guide the development of targeted therapies. Network analysis can also aid in the discovery of genetic biomarkers that predict drug response or adverse reactions. By integrating genomic and clinical data, researchers can construct networks that capture the relationships between genetic variations, clinical phenotypes, and drug response [11]. Network-based approaches can identify modules or subnetworks that are highly associated with specific drug responses, enabling the discovery of potential biomarkers for personalized medicine. In addition, network analysis can uncover the interconnected pathways and biological processes affected by genetic variations or drug treatments. By mapping genetic variants onto biological networks, pathways that are significantly enriched for these variants can be identified [12]. This knowledge helps in understanding the molecular mechanisms underlying drug response and identifying potential targets for intervention.
By analysing the interactions between drugs, genes, and diseases in a network context, potential off-target effects or repurposed drugs for different indications can be revealed [13]. Finally, network analysis can contribute to personalized medicine approaches by integrating patient-specific genetic and clinical data into networks.
Recently, the need to investigate more complicated frameworks than classical networks has led to the introduction of a multilayer approach as an extension of graph theory. The reason for this is that many real networks cannot be exhaustively explained with a classical network approach, but need more complex structures [14,15]. Multilayer networks provide a more comprehensive and realistic representation of complex systems where multiple types of relationships coexist. They allow the dynamics and behaviour of interconnected entities to be analysed and understood in a more nuanced manner [16].
Multilayer network analysis enables the study of various properties and phenomena that are not easily captured by traditional network analysis approaches. It allows for the examination of interdependencies, correlations, and patterns that emerge across different layers. This can provide insights into how different layers influence each other, the resilience of the system, the spread of information or diseases, and the identification of key nodes or communities in the network.
Starting from these considerations, in this work we present an application of network analysis in pharmacogenomics to demonstrate how network analysis methods can extract hidden relationships and discover novel knowledge, i.e., identify key genes in biological pathways involved in drug response and adverse reactions. To this end, we built a biological multilayer network comprising genes, drugs, diseases, and their associations extracted from a public database. Then, we analysed the multilayer network by applying a community detection algorithm, enabling the identification of essential genes from gene-disease-drug communities. After that, we used the list of genes identified from the communities to perform pathway enrichment analysis (PEA) to determine the biological functions affected by the selected genes. On their own, the identified genes are detached from their biological context, making it impossible to know in which biological mechanisms and functions they are involved. To understand which biological mechanisms are affected by these communities of essential genes, it is necessary to link each gene to the appropriate biological reference context by means of PEA. PEA links genes and groups of genes to the biological pathways that influence disease development, adverse drug reactions, and the different overall survival rates of patients treated with the same drugs. Thus, the new knowledge can support the development of new treatments that are more effective than drug repositioning strategies, as well as drugs better suited to reducing or, ideally, eliminating the onset of adverse drug reactions.

Background on Multilayer Networks
Multilayer networks have emerged as a powerful framework for modelling and analysing complex systems with multiple types of interactions. Unlike classical networks, which only consider one type of relationship between nodes, multilayer networks represent the interdependencies between the entities of a system and the interacting layers [17]; see Figure 1 for a complete example. The first example (a) reports three different networks, i.e., a gene-gene interaction network, a disease-disease interaction network, and a drug-drug interaction network. In the second example (b), each of those networks represents a distinct layer in the multilayer network. The nodes of the multilayer network are the genes, the diseases, and the drugs, each distinguished by the layer to which it belongs. The intra-edges represent the gene-gene, drug-drug, and disease-disease associations, while the inter-edges are the gene-disease, gene-drug, and disease-drug associations.
Formally, a multilayer network can be traced back to a set of nodes, edges, and layers that take into account the physical and functional relationships between them [16,18].
In particular, each layer in a multilayer network represents a specific aspect or type of relationship between nodes. For example, in a social network, different layers can represent friendships, professional connections, or family relationships. Each layer can have its own set of nodes and edges, and there can be connections between nodes across different layers.
These networks provide a realistic representation of complex real-world systems and have found practical applications in various domains, such as social network analysis, transport organization, biological systems, and technological networks [15,19]. Formally, a multilayer network may be described as a tuple $G_{ml} = (V_L, E^{intra}_L, E^{inter}_{L \times L})$, where $G_{ml}$ is the multilayer graph, $V_L$ is the set of nodes belonging to each layer, $E^{intra}_L$ is the set of intralayer edges belonging to each layer, $E^{inter}_{L \times L}$ is the set of interlayer edges between pairs of layers, and $L = \{0, 1, \ldots, l\}$ is the set of layers. For each layer $k$, we have a graph $(V_k, E^{intra}_k)$ (intralayer edges), and for each pair of layers $k, h$, we have a set of interlayer edges $E^{inter}_{k \times h}$ connecting nodes of layers $k$ and $h$ [20].
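The formal definition above can be illustrated with a small data structure. The following is a minimal sketch, not the representation used in this study: the class name `MultilayerNetwork`, its methods, and the toy node labels are hypothetical and chosen only to mirror the sets of per-layer nodes, intralayer edges, and interlayer edges.

```python
from collections import defaultdict


class MultilayerNetwork:
    """Toy container mirroring the tuple (V_L, E_intra, E_inter)."""

    def __init__(self):
        self.nodes = defaultdict(set)  # layer -> nodes of that layer (V_k)
        self.intra = defaultdict(set)  # layer -> undirected edges inside it
        self.inter = defaultdict(set)  # {layer_k, layer_h} -> edges across them

    def add_intra_edge(self, layer, u, v):
        self.nodes[layer].update((u, v))
        self.intra[layer].add(frozenset((u, v)))

    def add_inter_edge(self, layer_u, u, layer_v, v):
        if layer_u == layer_v:
            raise ValueError("an interlayer edge must join two different layers")
        self.nodes[layer_u].add(u)
        self.nodes[layer_v].add(v)
        pair = frozenset((layer_u, layer_v))
        self.inter[pair].add(frozenset(((layer_u, u), (layer_v, v))))

    def counts(self):
        """Return (number of nodes, intralayer edges, interlayer edges)."""
        n_nodes = sum(len(s) for s in self.nodes.values())
        n_intra = sum(len(s) for s in self.intra.values())
        n_inter = sum(len(s) for s in self.inter.values())
        return n_nodes, n_intra, n_inter


# Hypothetical toy network with one intralayer edge per layer
# and one interlayer edge per pair of layers.
net = MultilayerNetwork()
net.add_intra_edge("gene", "G1", "G2")
net.add_intra_edge("disease", "D1", "D2")
net.add_intra_edge("drug", "Dr1", "Dr2")
net.add_inter_edge("gene", "G1", "disease", "D1")
net.add_inter_edge("gene", "G2", "drug", "Dr1")
net.add_inter_edge("disease", "D2", "drug", "Dr2")
print(net.counts())
```

Keying the interlayer edges by the unordered pair of layers keeps the gene-disease, gene-drug, and disease-drug associations separate, which matches the way the definition indexes interlayer edge sets by pairs of layers.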
Examples of multilayer networks come from many different fields, from social network analysis to biological networks. For instance, Figure 1 shows an example of a biological multilayer network representing the interplay among diseases, genes, and drugs. Multilayer networks provide a powerful tool for analysing complex genetic and clinical data in pharmacogenomics, enabling better predictions of drug response, identification of drug targets, and acceleration of drug discovery and development processes. Multilayer networks can be used for general data analysis in pharmacogenomics. They can integrate and analyse large-scale genomic, transcriptomic, and proteomic data to uncover hidden relationships between genes, proteins, and drug responses. This can lead to a better understanding of the underlying mechanisms of drug response and aid in the development of personalized treatment strategies.
Furthermore, multilayer networks can be used to predict how individuals will respond to specific drugs based on their genetic information. By training the network on a dataset of patients' genetic profiles and corresponding drug responses, it can learn complex patterns and make predictions for new patients. This can help identify individuals who are likely to experience adverse drug reactions or those who are more likely to respond positively to a particular medication. Multilayer networks can help identify genetic markers associated with the risk of adverse drug reactions (ADRs). By integrating genetic and clinical data, the network can identify patterns that link specific genetic variations to ADRs. This information can be used to develop personalized medicine approaches, where patients at higher risk of ADRs can be identified and alternative treatment options can be explored. By analysing genetic data, multilayer networks can also aid in the identification of potential drug targets and drug candidates, assisting drug discovery and repurposing efforts. By integrating information from various sources such as gene expression, protein-protein interactions, and biological pathways, the network can identify key genes or proteins that play a crucial role in disease development or drug response.

Materials and Methods
To apply the multilayer network formalism and pathway enrichment analysis with the goal of improving knowledge in the pharmacogenomics field, we designed a methodology comprising four steps:

• Building a biological multilayer network comprising genes, drugs, diseases, and their associations extracted from the BioSNAP database;
• Analysing the multilayer network by applying a community detection algorithm;
• Identifying essential genes from gene-disease-drug communities;
• Performing pathway enrichment analysis (PEA) to determine the biological functions affected by the selected genes.

Case Study
We considered the following datasets from the Stanford Biomedical Network Dataset Collection (BioSNAP) [21]:

1. Drug-Drug Interaction (DrDrI) network of interactions between drugs approved by the U.S. Food and Drug Administration (FDA): 1514 nodes and 48,514 edges.
2. Disease-Disease (DD) network of interactions: 6878 inherited nodes and 6877 inherited edges.
3. Gene-Gene (GG) network of interactions: 25,825 inherited nodes and 208,836,746 inherited edges. The nodes are given by NCBI Entrez Gene IDs.
4. Disease-Drug Association (DDrA) network, a set of curated relationships between diseases and drugs: 5535 disease nodes, 1662 drug nodes, and 466,656 edges. The diseases are given by DOIDs, i.e., Disease Ontology terms.
5. Gene-Disease Association (GDA) network, a set of relationships between genes and diseases: 7294 gene nodes, 519 disease nodes, and 21,357 edges.
6. Gene-Drug Interaction (GDrI) network, a set of relationships between genes and drugs: 3648 gene nodes, 284 drug nodes, and 18,690 edges.
We built a multilayer network with three layers obtained from the DrDrI, DD, and GG datasets. Then, we added interlayer edges by considering the DDrA, GDA, and GDrI datasets. The resulting multilayer network, which for convenience we call the GDD multilayer network, consisted of 52,640 nodes, 208,892,137 intralayer interactions, and 506,703 interlayer edges. First, we performed a topological analysis of the GDD multilayer network using the multinet R package [22] (for complete details on multilayer network analysis, see [22]). Table 1 summarizes the topological measures computed on the GDD multilayer network for each layer. Table 2 summarizes the measures computed on the GDD multilayer network for layer comparison; the first part of each measure name indicates the type of comparison function (Jaccard, Coverage, Simple Matching, Russell Rao, Kulczynski, Hamann), and the second part indicates the configurations to which the comparison function is applied. Table 3 summarizes the distribution dissimilarity computed on the GDD multilayer network (note that these are dissimilarity functions: 0 means the highest similarity). Table 4 summarizes the statistical degree correlations computed on the GDD multilayer network.
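As a quick sanity check on the network size, the edge counts of the six BioSNAP datasets listed above can be summed. The sketch below assumes the totals are plain sums with no deduplication; the dictionary names are illustrative. Under that assumption, the three layer datasets sum to 208,892,137 edges and the three association datasets sum to 506,703 edges, matching the two edge figures quoted for the GDD network.

```python
# Edge counts as listed for the BioSNAP datasets in the case study.
intra_edges = {"DrDrI": 48_514, "DD": 6_877, "GG": 208_836_746}
inter_edges = {"DDrA": 466_656, "GDA": 21_357, "GDrI": 18_690}

# Assumption: totals are simple sums, with no duplicate edges removed.
total_intra = sum(intra_edges.values())
total_inter = sum(inter_edges.values())

print(f"intralayer edges: {total_intra:,}")  # 208,892,137
print(f"interlayer edges: {total_inter:,}")  # 506,703
```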

Network Measure    Value
Pearson degree     0.153
rho degree         0.220
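To make the two degree correlation measures concrete, the sketch below computes a Pearson coefficient and a rank-based coefficient on two aligned degree vectors. It is an illustration only: we assume that "rho" denotes Spearman's rank correlation, the degree vectors are hypothetical toy data, and the helper functions are not part of the multinet package.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical degrees of the same five entities in two layers.
degree_layer_a = [1, 2, 3, 4, 5]
degree_layer_b = [2, 1, 4, 3, 5]
print(pearson(degree_layer_a, degree_layer_b))
print(spearman(degree_layer_a, degree_layer_b))
```

A positive value indicates that entities with a high degree in one layer tend to have a high degree in the other; the rank-based version is less sensitive to a few very high-degree nodes.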

Community Detection on GDD Multilayer Network
Once the GDD multilayer network was built, we analysed it by applying one of the most useful exploratory techniques for network analysis, i.e., community detection. Community detection is considered a first step in understanding network structure: communities, defined as groups of nodes that are more densely connected to each other than to the rest of the network, represent significant characteristics for understanding the functionalities and organization of complex systems modelled as networks. Thus, community extraction identifies densely connected nodes within multilayer networks that play significant roles in the structure-function relationship. For this study, we selected Infomap [23] because, according to the literature, it outperforms other community detection methods for multilayer networks [24].
Then, we applied Infomap to the GDD multilayer network, obtaining 153 communities. Infomap extracted three typologies: (i) communities containing genes, diseases, and drugs; (ii) communities containing diseases and drugs; and (iii) communities containing only genes. For our aims, we focused on the first typology, i.e., communities containing genes, diseases, and drugs. Among these, we selected the top 10 communities, i.e., the communities comprising interlayer relations such as gene-drug and gene-disease relations. Table 5 reports the list of genes belonging to the top 10 communities.
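The grouping of communities into typologies can be sketched as a simple post-processing step on the output of any community detection algorithm. The following is a minimal illustration under stated assumptions: the two input mappings (node to community, node to layer), the function names, and the toy labels are hypothetical, and the sketch does not reproduce Infomap itself.

```python
from collections import defaultdict


def community_typology(membership, layer_of):
    """Map each community id to the set of layers its nodes come from.

    membership: node -> community id
    layer_of:   node -> layer name ("gene", "disease", or "drug")
    """
    layers = defaultdict(set)
    for node, community in membership.items():
        layers[community].add(layer_of[node])
    return {community: frozenset(found) for community, found in layers.items()}


def mixed_communities(membership, layer_of, required=("gene", "disease", "drug")):
    """Return ids of communities that contain every required layer."""
    typology = community_typology(membership, layer_of)
    return sorted(c for c, found in typology.items() if set(required) <= found)


# Hypothetical toy partition with one community of each typology.
membership = {"G1": 0, "D1": 0, "Dr1": 0, "D2": 1, "Dr2": 1, "G2": 2, "G3": 2}
layer_of = {
    "G1": "gene", "G2": "gene", "G3": "gene",
    "D1": "disease", "D2": "disease",
    "Dr1": "drug", "Dr2": "drug",
}
print(mixed_communities(membership, layer_of))
```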

Table 5. Top ten communities.
Community Genes

Pathway Enrichment Analysis
Pathway enrichment analysis (PEA) helps researchers comprehend the biological meaning of gene lists obtained from high-throughput experiments, such as RNA sequencing, genome-wide association studies, or proteomics. These experiments identify genes (as well as proteins and metabolites) that differ between the conditions of interest. However, such a gene list alone is insufficient to understand the biological differences between these conditions. Therefore, PEA assists researchers in interpreting large gene lists and developing hypotheses about the underlying biology [25].
To identify the biological mechanisms and/or functions affected by the identified communities of genes, we used the Reactome pathway database [26]. In particular, we describe the enrichment performed on the community with identifier 10 using the software tool BiopaxParser (BiP-v.1) [27].
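To illustrate what an enrichment computation involves, the sketch below ranks pathways by a one-sided hypergeometric test, a common choice for over-representation analysis. This is an assumption made for illustration: the statistic actually used by BiP is not described here, and the pathway names, gene labels, and background size are hypothetical toy values.

```python
from math import comb


def hypergeom_tail(k, n, K, N):
    """P(X >= k) when drawing n genes from N, K of which are in the pathway."""
    total = comb(N, n)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, min(n, K) + 1))
    return hits / total


def enrich(gene_list, pathways, background_size):
    """Rank pathways by over-representation p-value (smallest first).

    Assumes every gene in gene_list and in each pathway belongs to the
    background set of size background_size.
    """
    genes = set(gene_list)
    results = []
    for name, members in pathways.items():
        overlap = len(genes & members)
        p = hypergeom_tail(overlap, len(genes), len(members), background_size)
        results.append((name, overlap, p))
    return sorted(results, key=lambda row: row[2])


# Hypothetical toy data: two pathways of five genes, a background of 20 genes.
pathways = {
    "pathway_A": {"g1", "g2", "g3", "g4", "g5"},
    "pathway_B": {"g6", "g7", "g8", "g9", "g10"},
}
community_genes = ["g1", "g2", "g3", "g6"]
for name, overlap, p in enrich(community_genes, pathways, background_size=20):
    print(f"{name}: overlap={overlap}, p={p:.4f}")
```

In practice, the p-values would also be corrected for multiple testing across all pathways, a step omitted here for brevity.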
Table 6 reports the pathways enriched using the list of proteins belonging to community 10. Next, we used BiP to determine which pathways are influenced by each input gene. Table 7 presents the relation between genes and affected pathways. Inside community 10, a total of 36 genes are members of both layer 5, i.e., the disease-gene layer, and layer 6, i.e., the drug-gene layer, revealing the multiple roles of a gene through different pathways. Analysing Table 7, it is worth noting that the activity of the metabolism of proteins pathway, a well-known pathway related to adverse or normal drug responses, as well as to disease progression or decline, is regulated by the interactions of several multilayer genes, namely P43088, P15170, P18509, P05546, Q9Y277, Q02817, Q13285, O75976, and P15328, reinforcing the benefits of the multilayer formalism for representing complex networks.
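The gene-to-pathway view described above can be obtained by inverting a pathway-to-gene mapping. The sketch below is a minimal illustration with hypothetical pathway names and gene labels; it does not use the Reactome data or the BiP tool.

```python
from collections import defaultdict


def genes_to_pathways(pathway_members, genes_of_interest):
    """Invert pathway -> genes into gene -> sorted list of pathways."""
    index = defaultdict(set)
    for pathway, members in pathway_members.items():
        for gene in members & genes_of_interest:
            index[gene].add(pathway)
    return {gene: sorted(found) for gene, found in index.items()}


# Hypothetical toy annotation.
pathway_members = {
    "pathway_A": {"g1", "g2"},
    "pathway_B": {"g2", "g3"},
    "pathway_C": {"g2", "g4"},
}
community_genes = {"g1", "g2", "g3"}

gene_index = genes_to_pathways(pathway_members, community_genes)
# Genes that take part in more than one pathway.
multi_pathway_genes = sorted(g for g, found in gene_index.items() if len(found) > 1)
print(gene_index)
print(multi_pathway_genes)
```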

Results and Discussion
Pharmacogenomics is a complex field in which the drug response of a living organism results from the interactions of several different biological entities, such as genes, enzymes, and small and large molecules, that cooperate in a synchronized fashion. A multilayer network representation allows for more comprehensive and realistic modelling of these heterogeneous interactions than traditional representations [28]. In addition, multilayer networks enable the identification of multilayer communities, i.e., groups of genes that are more densely connected to each other, together with their correlations across the different layers; this information can be used to perform PEA to understand the underlying biological mechanisms affected.
Performing PEA using the genes detected in community 10 enriches several biological pathways, as reported in Table 6. Analysing the content of Table 6, it is worth noting that the enriched pathways are intertwined in multiple ways, some more explicit than others. We conducted a literature search to explore the possible connections between the results of the pathway enrichment analysis. According to Bhardwaj et al. [29], Leishmania alters various signalling pathways to survive, which is in line with the other enriched pathways 6, 7, 8, and 9. Additionally, Kaiser [30] found that cyclic nucleotides such as cAMP and cGMP are crucial for parasitic proliferation and regulate functions such as auditory and olfactory senses [31]. Also, Rho GTPases play a role in host-pathogen interaction by controlling innate and adaptive immune responses. Pathway 1 in Table 6 is another signalling pathway that Leishmania affects, as described in [31]. Moreover, Schlessinger et al. [32] explain the vital role of the mediator of Rho GTPases in the WNT signalling pathway. Finally, Kikuchi et al. [33] describe the regulation of WNT signalling pathways through post-translational modifications, while Li et al. [34] provide details on the role of RUNX1 in promoting tumour metastasis by activating WNT.
Table 7 displays the association between genes and pathways, emphasizing that a single gene can be involved in multiple pathways. The enrichment was calculated by implicitly incorporating topological and structural network properties, resulting in improved enrichment outcomes compared with using more generic gene lists, as described in [27].
In Figure 3, community 1 is represented as a network. The green nodes with red labels in the network correspond to the genes listed in Table 7. These genes play a crucial role in the network's connectivity and are known as hub genes in the literature. For instance, if we remove O95182, P15328, and P09012, the network loses its complete connectivity. To learn more about the role of the three hub proteins, we searched the Reactome database. Using the Reactome web pathway browser, we found that the three hub proteins are part of the metabolism pathway. Specifically, all three proteins have an impact on the citric acid (TCA) cycle and respiratory electron transport pathway. Moreover, proteins P09012 and O95182 regulate the respiratory electron transport, ATP synthesis via chemiosmotic coupling, and heat production by uncoupling proteins pathway. This highlights the importance of multilayer modelling, which enables the selection of more relevant genes from the network for performing PEA. In addition, Table 7 includes some genes that are not part of community 1, featured in Figure 4.
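The observation that removing hub genes breaks the connectivity of a community can be checked with a simple graph traversal. The sketch below is an illustration on a hypothetical toy graph, not on the community shown in the figure; the function names and node labels are assumptions.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from a list of (u, v) pairs."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def is_connected(adj, removed=frozenset()):
    """True if the graph stays connected after deleting the removed nodes."""
    nodes = [n for n in adj if n not in removed]
    if not nodes:
        return True
    seen = {nodes[0]}
    queue = deque([nodes[0]])
    while queue:  # breadth-first search over the surviving nodes
        u = queue.popleft()
        for v in adj[u]:
            if v not in removed and v not in seen:
                seen.add(v)
                queue.append(v)
    return len(seen) == len(nodes)


def critical_nodes(adj):
    """Nodes whose individual removal disconnects the graph."""
    return [n for n in sorted(adj) if not is_connected(adj, {n})]


# Hypothetical toy community: "b" and "c" hold the graph together.
adj = build_adjacency([("a", "b"), ("b", "c"), ("c", "d"), ("b", "e")])
print(is_connected(adj))
print(critical_nodes(adj))
```

The same check can be run with a set of several nodes passed as `removed` to test whether deleting a group of candidate hubs together disconnects the community.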
If we rely solely on a traditional network representation, we may overlook crucial information, such as the fact that gene P18509 does not belong to community 1. However, through a multilayer representation, we can observe that gene P18509 interacts with P10109, as illustrated in Figure 3. The reason for this is that both genes affect the same category of pathway related to metabolism.
In conclusion, the use of multilayer networks to represent interactions among heterogeneous data is a novel approach, especially in the field of omics. A multilayer approach can help researchers capture more information and obtain a more accurate understanding of gene interactions. In the literature, Shang et al. [28] propose a multilayer network representation learning method for predicting drug-target interactions. This method integrates information from different networks, reduces noise, and learns the feature vectors of drugs and targets, overcoming the challenges of integrating multiple data types and managing network noise. Using a multilayer network to infer new relationships among genes, diseases, and drugs is at an early stage and is a continuously developing field. This limits the possibility of validating the proposed method by comparing it with existing methodologies. Using the proposed method, we discovered potential new relationships between Leishmania and different signalling pathways, results that are possible only through the multilayer representation. This could help researchers to identify drugs targeting specific biological functions affected by the enriched pathways. Investigating Leishmania is particularly important in the context of travel medicine. Berman reviewed several aspects of diagnosis and treatment for Leishmania in [35]. With our method, we could determine which drugs could counteract the damage caused by Leishmania infection.

Conclusions
In this work, we explored the application of network analysis in the pharmacogenomics field. In particular, we used a multilayer network representation to model the interactions among genes, drugs, diseases, and their associations. Then, we analysed the network by applying a community detection algorithm to discover the top communities. Finally, we used the list of genes identified from the communities to perform pathway enrichment analysis (PEA) to determine the biological functions affected by the selected genes. The results show that the genes forming the communities extracted from the multilayer network regulate the activity of the protein metabolism pathway, which is related to adverse or normal drug response as well as to the progression or decline of disease, demonstrating the advantages of the multilayer formalism for representing pharmacogenomic domains.

Figure 1. Examples of classical and multilayer networks. The figure shows two toy examples: classical biological networks (a) and a multilayer network (b). The first example (a) reports three different networks, i.e., a gene-gene interaction network, a disease-disease interaction network, and a drug-drug interaction network. In the second example (b), each of those networks represents a distinct layer in the multilayer network. The nodes of the multilayer network are the genes, the diseases, and the drugs, each distinguished by the layer to which it belongs. The intra-edges represent the gene-gene, drug-drug, and disease-disease associations, while the inter-edges are the gene-disease, gene-drug, and disease-drug associations.

Figure 2. Summary of all steps.

Figure 3. Interactions among genes in community 2. The figure displays a network diagram depicting the interactions among the genes that belong to community 2.

Figure 4. Interactions among genes in community 1. The figure displays a network diagram depicting the interactions among the genes that belong to community 1. In the network, green nodes with red labels indicate the genes listed in Table 7.

Table 1. Topological measures computed on the GDD multilayer network for each layer.

Table 2. Topological measures computed on the GDD multilayer network for layer comparison.

Table 3. Distribution dissimilarity computed on the GDD multilayer network.

Table 4. Statistical degree correlation computed on the GDD multilayer network.

Table 6. The first 10 enriched pathways, in order of statistical relevance, obtained using the gene list of community 10 as input data.

Table 7. The 20 multilayer genes and their affected biological pathways.