In Silico Pleiotropy Analysis in KEGG Signaling Networks Using a Boolean Network Model

Pleiotropy, which refers to the ability of different mutations on the same gene to cause different pathological effects in human genetic diseases, is important in understanding system-level biological diseases. Although some biological experiments have been proposed, still little is known about pleiotropy on gene–gene dynamics, since most previous studies have been based on correlation analysis. Therefore, a new perspective is needed to investigate pleiotropy in terms of gene–gene dynamical characteristics. To quantify pleiotropy in terms of network dynamics, we propose a measure called in silico Pleiotropic Scores (sPS), which represents how much a gene is affected against a pair of different types of mutations on a Boolean network model. We found that our model can identify more candidate pleiotropic genes that are not known to be pleiotropic than the experimental database. In addition, we found that many types of functionally important genes tend to have higher sPS values than other genes; in other words, they are more pleiotropic. We investigated the relations of sPS with the structural properties in the signaling network and found that there are highly positive relations to degree, feedback loops, and centrality measures. This implies that the structural characteristics are principles to identify new pleiotropic genes. Finally, we found some biological evidence showing that sPS analysis is relevant to the real pleiotropic data and can be considered a novel candidate for pleiotropic gene research. Taken together, our results can be used to understand the dynamics pleiotropic characteristics in complex biological systems in terms of gene–phenotype relations.


Introduction
Pleiotropy is the phenomenon in which one gene can result in multiple phenotypes or traits [1][2][3]. In human genetic diseases, it means that different mutations within the same gene cause different pathological effects [4,5]. This becomes an important contributor in identifying a novel function of individual genes with respect to gene-gene interactions [6,7] in system-level biological diseases [8,9]. In this regard, many methods have been modeled to understand the pleiotropy. For example, an experimental study [10] performed several laboratory cultures with the nath-10 polymorphism and explained its pleiotropic role in the evolution of a cryptic genetic variation in C. elegans. In another study, a statistical analysis [11] using canonical correlation analysis identified a novel candidate pleiotropic associations between genetic variants and phenotypes. In addition, a few computational models [12,13] deployed a pairwise combination of genome-wide association data from the complex disease pleiotropy analysis and modular gene expression analysis. Another study using a metabolic model [14] conducted constraint-based simulations for E. coli and S. cerevisiae and found that pleiotropy is an emergent property of metabolic network.
Finally, protein-protein interaction network analysis was also widely used, and some studies [15][16][17] confirmed pleiotropic effects in biological molecular function, which lead to complex diseases. Despite the interesting observations in previous studies, most of the previous approaches focused on the pleiotropy analysis induced by undirected molecular correlation networks. Therefore, a new approach is needed to investigate the pleiotropy induced by a directed signaling network because it can explain the pleiotropy caused by a gene-gene dynamical relationship.
To quantify the pleiotropic degree of a gene in terms of network dynamics, we proposed in silico pleiotropic score (sPS), which is a measure to represent how differently a gene affects the dynamics of other genes against different mutations, such as knockout [18,19] and over-expression [20] mutations. In this study, we employed the Boolean network model [21,22] to simulate the network dynamics. A Boolean network model implicitly assumes that all biological components are described by binary values, and their interactions represented by Boolean regulatory functions [23], and it is well-known to capture the silent dynamical properties of biological networks [23,24]. For example, it has been used to analyze oncogene rules in Non-small cell lung cancer [25], to model the C. albicans yeast for hyphal transition [26], to a matrix cell density sensing to contact inhibition, proliferation, migration, and apoptosis [27], or to illustrate the regulatory effects in cervical cancer [28]. A previous study showed that the dynamics influence of a gene to another genes has some interesting structural characteristics in the signaling network [29]. This study can be extended because the pleiotropy is understood as the difference of the dynamics influence against different mutation types. In this study, through intensive investigations with a signaling network, we observed that most of the dynamically affected genes were related to the experimentally proved pleiotropic genes [30]. Moreover, we found that sPS is negatively correlated with the previous standardized method pleiotropy [2]. Further, we investigated the relationships of sPS with structural properties and found that they have highly positive correlations with degree/in-degree/out-degree, feedback loops, and centrality measures such as closeness, betweenness, stress, and eigenvector, in the signaling network. This implies that the more central of genes, the more pleiotropic. Finally, we found some biological evidence confirming that sPS analysis is relevant to the experimental pleiotropic database, and it can be used for novel candidate pleiotropic gene characteristics. Through a network visualization, we observed that most novel candidate pleiotropic genes are closely located to the known pleiotropic genes. Taken together, these results help to understand the importance of dynamics pleiotropic in complex biological systems in terms of gene-phenotype relations.

Datasets
To examine the in silico dynamics-based pleiotropy, we employed a dataset of a cellular signaling network with 1659 genes and 7964 interactions [31], which was constructed from Kyoto Encyclopedia of Genes and Genomes network (KEGG) database [32]. We then retrieved the list of the pleiotropic genes from human phenotype ontology (HPO) database [30,33], where (a) a gene v i is associated with n phenotypes, and (b) a phenotype p is associated with n genes. In our analysis, this definition was used to sort KEGG genes that are associated with any phenotypes in (a) and (b), and to compute their degree of pleiotropy by the number of phenotypes of those genes. Further, we classified all KEGG genes with respect to the functional importance genes by annotating them with cancer genes, drug-target genes, essential genes, tumor-suppressor genes, oncogenes, and disease genes based on TCGA CBioportal [34,35], DrugBank [35,36], DEG [37], TSGene [38,39], ONGene [40], and DisGenet [41,42] databases, respectively (See Tables S1-S3 in Supplementary Materials). We note that this study is not limited to the pleiotropic genes identified from the public databases, because there are other ways to explore the pleiotropy, for example, lab experiments [43] using high-throughput morphometric analysis of hundreds of thousands of single cells in the budding yeast Saccharomyces cerevisiae or experimental design in Drosophila simulans with genome sequencing [44] to study pleiotropy.

Boolean Network Model
To examine pleiotropic dynamics of genes induced by different types of mutations in a large-scale network, we applied a synchronous Boolean network model [23], which is one of the simplest computational methods to elucidate the network dynamics [22] and has been used to examine complicated behaviors of biological networks [45][46][47]. A Boolean network is represented by a directed graph G(V, A), where V = {v 1 , v 2 , . . . , v N } is a set of nodes, and A ⊆ V × V is a set of directed links. Each v i ∈ V has a value of 1 (on) or 0 (off), which means the possible states of the corresponding elements. A directed link v i , v j represents a positive (activating) or a negative (inhibiting) relationship from v i to v j (v i and v j are called source and target nodes of the link, respectively). Let v(t) denote the state of node v at time step t. When a state of v i at time t + 1 is determined by the values Then, all nodes are synchronously updated, and here, we implemented a nested canalyzing functions (NCFs) model [48,49] to describe an update rule f i as follows: where I m and O m (m = 1, 2, . . . , k i ) represent the canalyzing and canalyzed Boolean values, respectively, and O de f is generally set to 1 − O k i . In addition, we specified all I m and O m values independently and uniformly at random between 0 and 1. We note that many biological networks were successfully represented by NCFs [50][51][52], and NCFs also properly fit biological experiments' data [49] including pleiotropy analysis [11]. Those support that NCFs can describe the network dynamics considerably similarly to those real biological networks.
In a Boolean network, a network state at time t can be denoted by a list of state values of all nodes, v(t) = [v 1 (t), v 2 (t), . . . , v N (t)] ∈ {0, 1} N . Next, every network state transits to another network state through a set of Boolean update functions F = { f 1 , f 2 , . . . , f N } and eventually converges to either a fixed point or a limit-cycle attractor starting from its initial state. This attractor represents the diverse biological network behaviors such as homeostasis or oscillation. The definition of the attractor is defined as follows.
In this study, the examination of attractors is needed to find the affected genes. The affected genes were obtained based on our previous work about gene-gene dynamics influence networks [29]. To implement this, we specified a set of initial states, S, and computed a state trajectory starting at every v(0) ∈ S until an attractor is found. We note that the network dynamics can depend on the initial network states.

Computation of In-Silico Pleiotropic Scores (sPS)
Given a gene subject to different types of mutations, we propose sPS of the gene to represent how much the other genes are differentially affected in terms of the dynamics that can be used to deepen the study of existing measures of pleiotropy analysis. Specifically, we considered two mutations, knockout [19,53] and overexpression [54] mutations. The knockout mutation represents the effect of suppressing the expression of a gene or the pharmaceutical inhibition of the secondary messenger production or kinase/phosphate activity [18,55]. On the other hand, the overexpression mutation represents the effect of gene expression change [20,56]. In the Boolean network model, these mutations describe scenarios where the state of mutated gene is frozen to 0 (off) state and 1 (on) state, respectively, during a mutation duration time T. In this study, T is a parameter to denote the mutation duration time, and thus a mutation is effective only for t ≤ T. This mutation duration time is considered important since it can affect the mutation process of molecular interaction networks [57,58]. Taken together, these mutations can be implemented by changing F into F for t ≤ T as follows: where α is a set to 0 and 1 in the case of the knockout and the overexpression mutations, respectively. Note that the update rule of v i is restored to f i after the time step T.
To compute sPS of a gene, we employed the notion of the gene-gene dynamics influence used in the previous work [29]. Given a Boolean network G(V, A) with a set of nodes V = {v 1 , v 2 , . . . , v N } specified by a set of corresponding update rules F = { f 1 , f 2 , . . . , f N }, we generate a set of random initial states S. We first define the dynamics influence of gene v i on gene v j , which represents how much the states sequence of v j is changed by a mutation subject to v i as follows (see Figure S1 in the Supplementary Information for an illustrative example):

1.
For each initial state v(0) ∈ S, we obtain two attractors G, F, v(0) and G, F , v(0) in the wild-type and the v i -mutant networks, respectively. For convenience, let We compute a distance between G, F, v(0) j and G, F , v(0) j defined as follows: where c and e are the least common multiple and the greatest common divisor, respectively, of p and p , and I(condition) is an indicator function where outputs 1 if condition is true, and 0 otherwise. As a result, d v(0), v i , v j represents the minimum ratio of a bitwise difference between the states sequence of v j in the wild-type and the v i -mutant attractors over the least common period (c) of the two attractors.

3.
Lastly, we compute the dynamics influence of v i on v j denoted by µ v i , v j by averaging out d v(0), v i , v j over a set of initial states in S as follows: Then, let v i an arbitrary source gene in V. Based on the dynamics influence, we can denote a set of affected genes as v j ∈ V µ v i , v j > 0 . Let V k and V o the sets of affected genes with respect to the knockout and the overexpression mutations, respectively. We then define sPS of a gene v i as follows: It represents the proportion of the genes that are included in the symmetric difference of V k and V O among their union. Figure 1 shows an illustrative example of computing sPS. Let v 1 a node subject to the knockout or the overexpression mutation. Through a Biomolecules 2022, 12, 1139 5 of 16 computation of the dynamics influence from v 1 to v 2 , v 3 and v 4 , we obtained the sets of affected genes regarding the knockout and the overexpression mutations, Biomolecules 2022, 12, x FOR PEER REVIEW 5 of 17 It represents the proportion of the genes that are included in the symmetric difference of and among their union. Figure 1 shows an illustrative example of computing . Let a node subject to the knockout or the overexpression mutation. Through a computation of the dynamics influence from to , and , we obtained the sets of affected genes regarding the knockout and the overexpression mutations, = , , and = , , respectively. Thus, ( ) = 1/3 which implies that the node is pleiotropic because ( ) > 0. where the state value of is frozen to 0 and 1, respectively, for ≤ . The sets of genes whose dynamics are influenced by the mutations are identified as and , respectively. Accordingly, of is computed as the ratio of the symmetric difference of and over the union of them.

A Standardized Measure of Degree of Pleiotropy
A previous research used a standardized pleiotropic measure [2] to compute phenotypic effects in the baker's yeast S. cerevisiae. Given a gene, they examined the average and the standard deviation of the number of transformed traits from wild-type cells, denoted by and , respectively, and the number of transformed traits from a cell deficient of the gene, denoted by . In addition, they defined a standardized measure by the ztransformed pleiotropic score ( ) as follows: We note that this notion of pleiotropy is different from that in . The former focused on the standardized cardinality of the set of affected genes. To examine the notion of in our work, we set to the number of affected genes (i.e., | | or | |). In addition, we specified and as the average and the standard deviation of the number of associated phenotypes or traits of HPO database for every KEGG gene. In this way, we calculated of the KEGG associated with HPO database and compared with our of them. The mutations change F to F where the state value of v 1 is frozen to 0 and 1, respectively, for t ≤ T. The sets of genes whose dynamics are influenced by the mutations are identified as V k and V o , respectively. Accordingly, sPS of v 1 is computed as the ratio of the symmetric difference of V k and V o over the union of them.

A Standardized Measure of Degree of Pleiotropy
A previous research used a standardized pleiotropic measure [2] to compute phenotypic effects in the baker's yeast S. cerevisiae. Given a gene, they examined the average and the standard deviation of the number of transformed traits from wild-type cells, denoted by m wt and SD, respectively, and the number of transformed traits from a cell deficient of the gene, denoted by m d . In addition, they defined a standardized measure by the z-transformed pleiotropic score (zPS) as follows: We note that this notion of pleiotropy is different from that in sPS. The former focused on the standardized cardinality of the set of affected genes. To examine the notion of zPS in our work, we set m d to the number of affected genes (i.e., |V k | or |V O |). In addition, we specified m wt and SD as the average and the standard deviation of the number of associated phenotypes or traits of HPO database for every KEGG gene. In this way, we calculated zPS of the KEGG associated with HPO database and compared with our sPS of them.

Structural Characteristics of Pleiotropic Genes
It is known that the structural characteristics of genes in biological network related to their dynamics stability [59,60]. Here, we considered the following structural properties to investigate the relations to sPS.

•
A feedback loop (FBL) means a sequence chain of nodes where any node is not repeated except the starting and the end nodes [59,61]. In a given network G(V, E), an FBL is a closed simple cycle in which all nodes except the starting and ending nodes are not revisited; in other words, a path P of a length L(≥ 1) is represented by a sequence of ordered nodes u 1 → u 2 → . . . → u L+1 with no repeated nodes except u 1 and u L+1 . Hence, the P is called a feedback loop if u 1 = u L+1 . It was known that FBLs play important roles for controlling dynamics behavior of signaling networks [61][62][63][64]. • Centrality properties including the closeness defined as the reciprocal of the average distance from a node to every other node [4], the betweenness defined as the ability of a gene to control communication between genes through the shortest paths [65], the stress based on enumeration of the shortest paths [66], and the eigenvector represented by the principal eigenvector of the adjacency matrix of the network, where each node affects all of its neighbors [52]. • Degree of nodes represents the number of edge upon a gene link to another gene [4,64].
In addition, in-degree and out-degree mean the degree of the incoming and the outgoing links only, respectively.

Random Network Generation
To verify that the results of sPS in the real molecular interaction networks are relevant with randomly structured networks, we generated random networks using the Barabási Albert (BA) model [67], which is a kind of network growth model with a preferential attachment scheme, or called a probabilistic mechanism, where a new node is free to connect to any node in the network, whether it is a hub or just has a single link.

Parallel Computation
For efficient in silico simulations, we implemented the computational program using PANET [46], which is an analysis tool of the network dynamics using the OpenCL library (The recent version is available at http://panet-csc.sourceforge.net/, accessed on 26 November 2019). This allows us to compute large number of attractors in parallel by assigning each initial random state in Equations (3)-(6) to a processing unit of CPUs and/or GPUs.

Statistical Analysis
In this work, we conducted all statistical analysis using the Pearson correlation coefficient and the Student t-test using MedCalc Statistical Software version 13.0.6 (MedCalc Software bvba, Ostend, Belgium; http://www.medcalc.org; 2014) [68].

Results
In this work, we simulated sPS of all genes in the KEGG signaling networks using a Boolean network model (see Section 2). A total of 1000 initial states were randomly generated to calculate sPS, and the mutation duration time T was varied from 14 to 20 considering the network size of KEGG (|V| = 1659).

Comparison of sPS with the Observational Pleiotropy
To show that our approach is relevant to the real phenotype data, we plot a contingency table between the degree of pleiotropy between our in silico model and HPO (Table 1). We first listed every KEGG gene that is associated with any phenotypes in the HPO database (see Supplementary Table S2). Next, we specified the degree of pleiotropy with respect to the HPO database as the number of phenotypes or traits of the gene in HPO ('HPOassociated'). Further, we specified the degree of pleiotropy with respect to our in silico model (the mutation duration time T was set to 20) as the number of KEGG affected genes by the knockout and the over-expression mutations in sPS ('sPS-associated'). Table 1. The list of genes compared between the degree of pleiotropy by HPO and sPS. The top 10 genes with the highest degree of pleiotropy are chosen by the number of phenotypes in HPO database ('HPO-associated') and the number of KEGG genes affected by the knockout and the over-expression mutations in sPS ('sPS -associated').

No.
Gene Name HPO-Associated sPS-Associated We then selected top 10 HPO-associated genes that are also related to sPS. As shown in the table, there are some genes that show a high degree of HPO-associated pleiotropy but a low degree of sPS-associated pleiotropy. This shows that our method 'sPS-associated' supported the HPO-associated genes and can be used to analyze whose pleiotropic phenomenon is known in terms of network dynamics.
Next, we investigated the relationship between sPS and zPS by varing the mutation duration time (see Figure 2). As shown in Figure 2, there was a negative correlation between them irrespective of the mutation duration time T (All p-values < 0.0001 using t-test). This implies that our measure is different from the previous measure. This is because zPS focused on the influential extent of the mutation, whereas sPS considers the degree of influential difference caused by a pair of differences. In this regard, sPS can convey a novel viewpoint of pleiotropy from the previous approach.

Relation of sPS and the Functional Importance Genes
Some pleiotropic genes are relevant to many functionally important genes such as cancer genes, drug targets, essential genes, tumor suppressors, oncogenes, and disease genes. For example, it is known that cancer is one of the lead causes of death in human population [69,70], which comes from the accumulation of sequential mutations resulting from cell abnormalities or genetic instability [71,72]. Accordingly, it is not surprising that pleiotropic analysis has become very common in explore different cancer phenotypes [73]. Another examples is the investigation of drug-target genes through network-based analysis [74,75], which shows a significantly different connectivity, more feedback loops, and more evolutionary than non-drug target genes [76]. Thus, drug targets were considered potential cancer therapeutics [77], and recently, they have been recognized as new therapeutic targets for pleiotropic genes [78]. Furthermore, it has been identified that the deletion of any essential genes can lead to death or infertility [79] and tend to be associated with human disease genes [5,80]. Inspired by those results, we investigated the relationship of sPS with functionally important genes. Firstly, we specified every KEGG gene into 'cancer', 'drug-target (DT)', 'essential', 'tumor-suppressors (TSG)', 'oncogene (OCG)', and 'disease gene (DG)' (see Supplementary Table S3 in detail). Secondly, we examined their sPS values and classified them into two subset groups, 'non-Zero sPS' and 'Zero-sPS'. We compared the proportion of the functionally important genes between the two groups. Figure 3a-f shows the result of cancer genes, drug targets, essential genes, tumor suppressors, oncogenes, and disease genes, respectively. As shown in the figure, the ratio in the non-zero sPS group is significantly larger than that in the zero-sPS group for all types of functionally important genes and all mutation duration time T. In other words, the functionally important genes tend to be more pleiotropic than the other genes based on our in silico model. It is interesting that this result is relevant with some previous studies. For example, the gene TERT was found to be a pleiotropic cancer gene associated with 12 different cancer types [81], while PTPN2 was confirmed as a pleiotropic gene associated with several autoimmune diseases [82]. This result indicates the promising usefulness of sPS to predict the unknown pleiotropic role of functionally important genes.

Relation of and the Functional Importance Genes
Some pleiotropic genes are relevant to many functionally important genes such as cancer genes, drug targets, essential genes, tumor suppressors, oncogenes, and disease genes. For example, it is known that cancer is one of the lead causes of death in human population [69,70], which comes from the accumulation of sequential mutations resulting from cell abnormalities or genetic instability [71,72]. Accordingly, it is not surprising that pleiotropic analysis has become very common in explore different cancer phenotypes [73]. Another examples is the investigation of drug-target genes through network-based analysis [74,75], which shows a significantly different connectivity, more feedback loops, and more evolutionary than non-drug target genes [76]. Thus, drug targets were considered potential cancer therapeutics [77], and recently, they have been recognized as new therapeutic targets for pleiotropic genes [78]. Furthermore, it has been identified that the deletion of any essential genes can lead to death or infertility [79] and tend to be associated with human disease genes [5,80]. Inspired by those results, we investigated the relationship of with functionally important genes. Firstly, we specified every KEGG gene into 'cancer', 'drug-target (DT)', 'essential', 'tumor-suppressors (TSG)', 'oncogene (OCG)', and 'disease gene (DG)' (see Supplementary Table S3 in detail). Secondly, we examined their values and classified them into two subset groups, 'non-Zero ' and 'Zero-'. We compared the proportion of the functionally important genes between the two groups. Figure 3a-f shows the result of cancer genes, drug targets, essential genes, tumor suppressors, oncogenes, and disease genes, respectively. As shown in the figure, the ratio in the non-zero group is significantly larger than that in the zero-group for all types of functionally important genes and all mutation duration time . In other words, the functionally important genes tend to be more pleiotropic than the other genes based on our in silico model. It is interesting that this result is relevant with some previous studies. For example, the gene TERT was found to be a pleiotropic cancer gene associated with 12 different cancer types [81], while PTPN2 was confirmed as a pleiotropic gene associated

Relation of sPS and the Structural Characteristics
Some previous studies have shown that the structural characteristics of a gene in signaling networks are related to its dynamical behavior [59,60]. In this regard, we examined the relation of sPS with the structural characteristics, specifically, the degree, the involvement of feedback loops, and some centrality measures (Figure 4). For every KEGG genes, we first compared the correlation coefficients between degree/in-degree/out-degree of genes with sPS values.
As shown in Figure 4a, all degrees showed significantly positive correlations, irrespective of the mutation duration time T. It means that the sPS values tend to be larger as the number of degree/in-degree/out-degree gets larger. In addition, we observed that the correlation coefficient of sPS with in-degree was relatively lower than those with degree and out-degree. In fact, the difference of the average sPS between two gene groups classified by in-degree values was not as large as that when two gene groups were classified by degree or out-degree values (see Figure S2 in Supplementary Information). It is interesting that degree shows more apparent relation than either specific sub degree type. Next, we examined the relation of sPS with the involvement of feedback loops. Thus, we classified each gene into 'FBL' and 'No FBL' groups if the gene was involved with any feedback loops or not, respectively, and then compared the average sPS values of the groups. As shown in Figure 4b, the average sPS value of 'FBL' group is significantly larger than that of 'No FBL' group. This implies that a gene tends to be more pleiotropic when it is involved with feedback loops. Moreover, we computed the correlation coefficient between the number of feedback loops and sPS values and found significant positive relations (see Figure S3 in Supplementary Information; All p-values < 0.0001 using t-test). This implies that feedback loops can play an important role in pleiotropy, as indicated in a previous study [83]. This result is intriguing because many previous studies have shown the relation of feedback loops with various dynamical behavior of biological networks [62,84]. For example, the FBL plays role in amplifying (positive feedback loop) or inhibiting (negative feedback loop) of the intracellular signals [62,84], related to disease comorbidity [85], or protein-protein interaction [64]. Thus, this result can add the importance of feedback loop structure in terms of the network dynamics. Finally, we examined the relations of centrality measures such as closeness, betweenness, stress, and eigenvector with sPS values and found that all of them have positive correlations (all p-values < 0.0001 using t-test), irrespective of the mutation duration time (Figure 4c). In other words, it is likely that the more central gene in signaling networks shows a higher sPS value. In particular, the correlation coefficient of closeness, which indicates how closely a gene is located to other genes in a network, was the largest. On the other hand, the correlation coefficient of stress was the lowest. It is interesting that our centrality result is consistent with some previous results showing that pleiotropic genes were more central in protein interaction networks [4,17]. In addition, we examined the correlation coefficient of sPS values with degree/in-degree/out-degree of nodes, feedback loops, and centrality measures in the BA random networks and could observe consistent results (see Supplementary Figure S4). This implies that our results are principles not only in real networks but also in artificially structured networks. with several autoimmune diseases [82]. This result indicates the promising usefulness of to predict the unknown pleiotropic role of functionally important genes. (a-f) Results of cancer genes, drug targets, essential genes, tumor suppressors, oncogenes, and disease genes, respectively. In each subfigure, all genes were classified into 'non-zero ' and 'zero-' groups. Yaxis is the ratio of the functionally important genes among the total genes in the group. The mutation duration time was varied from 14 to 20.

Relation of and the Structural Characteristics
Some previous studies have shown that the structural characteristics of a gene in signaling networks are related to its dynamical behavior [59,60]. In this regard, we examined the relation of with the structural characteristics, specifically, the degree, the involvement of feedback loops, and some centrality measures (Figure 4). For every KEGG genes, we first compared the correlation coefficients between degree/in-degree/out-degree of genes with values. (a-f) Results of cancer genes, drug targets, essential genes, tumor suppressors, oncogenes, and disease genes, respectively. In each subfigure, all genes were classified into 'non-zero sPS' and 'zero-sPS' groups. Y-axis is the ratio of the functionally important genes among the total genes in the group. The mutation duration time T was varied from 14 to 20.

Relation of and the Structural Characteristics
Some previous studies have shown that the structural characteristics of a gene in signaling networks are related to its dynamical behavior [59,60]. In this regard, we examined the relation of with the structural characteristics, specifically, the degree, the involvement of feedback loops, and some centrality measures (Figure 4). For every KEGG genes, we first compared the correlation coefficients between degree/in-degree/out-degree of genes with values.

Biological Evidence of Pleiotropic Genes Based on sPS
To reveal novel candidate pleiotropic dynamics by the sPS measure, we profiled the genes with high sPS values from the KEGG network. A total of 29 genes are shown in Table 2. Among them, ten genes were known to be the real pleiotropic genes, for example, the PIK3CA (PIK3) gene, which is found to be most pleiotropic among targets of drugs abuse in pharmacological experiments [86], and the ABL1 gene, which is expressed in all tissues of mice pleiotropic phenotype T-cell signaling [87]. In addition, it was reported that EGFR gene is a marker of pleiotropic effects in underlying kidney function and cardiovascular disease [88]. It was clear that those genes were involved with many cancer types, associated with drug targets, essentials, and disease genes. On the other hand, we found 19 novel candidate genes in Table 2 that are not known in the experimental pleiotropic database. This suggests that sPS can predict the unknown pleiotropic genes. As shown in the table, those genes are very interesting because most of them are associated with various functionally important genes such as cancer, drug target, oncogenes, tumor suppressors, essentials, or disease genes. Specifically, all such genes were associated with disease genes. This implies that disease genes tend to be pleiotropic. Further, we map the listed genes in Table 1 into KEGG sub-network ( Figure 5; see Figure S5 in the Supplementary Materials for original network) for visualization. We note that only the listed genes and their neighbors were included in the sub-network. In the figure, the known pleiotropic genes and the novel candidate pleiotropic genes were marked by a blue and a yellow circle, respectively. Interestingly, the novel candidate pleiotropic genes were located closely to the known pleiotropic genes (most of the yellow circle was located from one of blue circle with length 1 in the network). This implies that the novel candidate pleiotropic genes tend to closely interact with the known pleiotropic genes in the signaling network. This was relevant to a study that found that larger effects of pleiotropy can also be caused by correlated effects among traits [89] or the regulatory networks that are so highly interconnected influence neighbor genes to have effects on the core disease genes [90]. In addition, most novel candidate pleiotropic genes were involved in feedback loops in the network. Moreover, their in-degree tends to be smaller than their out-degree. Hence, we can conclude that novel candidate pleiotropic genes are not only associated with certain functional influence genes but also have certain structural characteristics. Taken together, our sPS measure can be more efficiently used to reveal novel candidate pleiotropic genes when it is combined with structural characteristics index. 2528 HPO = 1 means the gene was confirmed in the real observational pleiotropic database, otherwise 0. DT = 1, ES = 1, TSG = 1, OCG = 1, DG = 1 means the gene involves with drug-target, essential, tumor-suppressors, oncogenes, or disease genes, respectively, otherwise 0. NuCancer abbreviates the number of associated cancer types. Deg/In-Deg/Out-Deg denote the values of node degree/indegree/out-degree. NuFBL abbreviates the number of feedback loops involving gene.

Discussion
In this study, we defined pleiotropic scores ( ), which represents how much the dynamics of a gene is affected against a couple of different mutations, herein the knockout and the overexpression mutations, using a Boolean network model. We investigated our approach using the KEGG signaling network. It was interesting to observe that the affected genes with high values were related to the real pleiotropic data and can ex-

Discussion
In this study, we defined pleiotropic scores (sPS), which represents how much the dynamics of a gene is affected against a couple of different mutations, herein the knockout and the overexpression mutations, using a Boolean network model. We investigated our approach using the KEGG signaling network. It was interesting to observe that the affected genes with high sPS values were related to the real pleiotropic data and can explain the dynamical behavior of pleiotropic genes. More interestingly, the various functionally important genes were related to pleiotropic genes. Further, we also found that novel candidate pleiotropic genes tend to closely interact with the known pleiotropic genes in the signaling network. These results will enhance the understanding of dynamical effects on pleiotropic genes, especially in large-scale biological systems. Despite the usefulness of our approach, there are some limitations caused by the Boolean network model used in this study. The first concern is the use of the nested canalyzing function to represent a update of a gene status. However, some previous studies have proven its usefulness in gene regulatory interactions. For example, 133 out of 139 rules compiled from a dataset about a transcriptional regulatory network [50], or 39 out of 42 rules inferred from a dataset about signaling pathways [91] can be classified into the nested canalyzing functions. Another concern is the synchronous update scheme, which is less realistic than the asynchronous update scheme. In fact, it is likely that the genes in the real signaling networks are regulated in an asynchronous update rule. However, it is required to properly specify some unknown parameters to implement the asynchronous scheme. For example, the asynchronous update assumes that only one node can change state at any given moment, and each node has the same probability of being updated [26]. This implies that the asynchronous update is valid only when a correct strategy to choose an update sequence is known. In this regard, a future study will include an approach to infer the update rule from real biological data instead of generating random update rules. In addition, it will extend to a more generalized analysis considering various mutation types.

Conclusions
Pleiotropy refers to the ability of different mutations within the same gene to cause different pathological effects, and many computational methods have been suggested to unravel the dynamics of the pleiotropy. However, little is known in identifying more complicated dynamical relations of gene pleiotropy, since most of them focused on undirected molecular interaction networks. Therefore, a new perspective is needed to investigate the dynamical characteristics induced by gene-gene pleiotropic. In this work, we proposed a measure to compute gene-gene in silico pleiotropic scores (sPS) representing how much the gene is affected against the different type of mutations on dynamics using a Boolean network model. We considered knockout and overexpression mutations to compute sPS values. Through intensive investigations, we found that some functional importance genes such as cancer, drug-target genes, tumor suppressors, oncogenes, essential genes, and disease genes tend to have non-zero sPS values than other genes. Next, we investigated the relationships of sPS and structural properties and found that there are positive correlations with the number of nodes' degree/in-degree/out-degree, feedback loops, and centrality measures such as closeness, betweenness, stress, and eigenvector. More interestingly, we were able to find some biological evidence confirming that sPS is relevant to real pleiotropic data and can be used to find novel candidate pleiotropic gene characteristics. Overall, our results suggest the usefulness of sPS in understanding the dynamics pleiotropic in complex biological systems.