The Eminence of Co-Expressed Ties in Schizophrenia Network Communities

Abstract: Exploring gene networks is crucial for identifying significant biological interactions occurring in a disease condition. These interactions can be characterized by modeling the tie structure of networks. Such tie orientations are often detected within embedded community structures. However, most of the prevailing community detection methods are designed to capture information from nodes and their attributes, usually ignoring the ties. In this study, a modularity maximization algorithm is proposed based on a nonlinear representation of local tangent space alignment (LTSA). Initially, the tangent coordinates are computed locally to identify the k-nearest neighbors across the genes. These local neighbors are further optimized by generating a nonlinear network embedding function for detecting gene communities based on eigenvector decomposition. Experimental results suggest that this algorithm detects gene modules with a better modularity index of 0.9256, compared to other traditional community detection algorithms. Furthermore, co-expressed genes across these communities are identified by discovering the characteristic tie structures. These detected ties are known to have substantial biological influence in the progression of schizophrenia, thereby signifying the influence of tie patterns in biological networks. This technique can be extended logically to other disease networks for detecting substantial gene "hotspots".


Introduction
Schizophrenia is a multifaceted disorder characterized as a dysfunctional psychiatric illness. This condition occurs across 1.5% of the world population, prominently leading to cognitive impairment and thought delusions [1]. Its manifold neurochemical symptoms make it all the more demanding to devise advanced treatments for this disorder. Furthermore, studies have highlighted the correlations between aberrant brain interactions and the occurrence of the first episode of schizophrenia [2]. While imaging and spectroscopic techniques reveal structural abnormalities associated with the disorder, their impact on brain function is still unknown to a certain extent [3,4]. Consequently, several past studies have failed to identify the fundamental phenomenon responsible for a dysfunctional brain [5]. In this context, comparative analysis of numerous psychiatric conditions, including schizophrenia, depressive disorder, bipolar disorder and treatment-resistant schizophrenia (TRS), revealed that the TRS subtype is associated with severe cognitive and psychopathological impairments requiring specialized treatment measures [6]. Hence, managing the illness requires treatments specific to the different variants of schizophrenia. Currently, antipsychotic drugs are widely used for the treatment of schizophrenia. However, the safety and efficacy of these medications remain questionable [7,8]. Alternative mechanisms are therefore being explored for discovering the pathological, etiological and physiological impacts of this illness. In this direction, computational methods are seen as better alternatives. These techniques have paved the way for recognizing the functioning of the brain at diverse orientations [9][10][11].
Centered on this idea, modeling schizophrenia as a computational network has received widespread attention in recent years due to its prognostic proficiency compared to conventional techniques such as magnetic spectroscopy and imaging [12][13][14]. Computational approaches are capable of associating genomic information across neural circuits to identify functional phenotypes expressed in the disorder. Such network-based approaches further analyze topological features of a disease, oriented as a modular unit. The biological modules so derived characterize patterns of interactions across several psychopathological, cognitive and psychological factors responsible for schizophrenia [15]. Numerous computational techniques have been adopted in previous studies for identifying functional network modules from biological entities [16,17]. Some of the popular ones include greedy algorithms, network propagation techniques and co-clustering methodologies [18]. Apart from these techniques, biological modules are also discovered using community detection. Community detection techniques are preferred in the case of biological networks because they are effective in distinguishing the functional components within networks [19]. Communities obtained from such networks usually have denser interconnections among their internal nodes than with other nodes. These communities are further essential for identifying the dynamics and topological features of the entire network. The assorted connections identified across communities eventually help in exploring the interrelationships across nodes and their influences on other nodes. For instance, analysis of communities in biological networks identifies the associations across multiple genetic factors responsible for epidemics of a disease.
Furthermore, the tie arrangement spanning a network can be analyzed by taking into account the community structure. These ties are oriented as strong or weak based on the strength of interactions across the nodes [20]. Identifying such ties reveals the integrity of a network across its neighboring nodes [21].
Taking into account the benefits of community structures, this study is intended to ascertain the impact of the tie structure in the schizophrenia gene network. In particular, the work identifies some relevant research questions in this direction:
Question 1: What is the influence of community structure in gene networks?
Question 2: How does the tie structure influence the orientation of gene modules in schizophrenia?
Question 3: Specifically, what tie category influences the functioning of the schizophrenia gene network?

Related Work
This section highlights significant contributions over the years pertaining to the application of network approaches towards knowledge discovery from biological information.

Network Approach for Disease Modeling
Modelling diseases as a network has helped in understanding the dynamic interactions across biological entities. Some of the popular biological networks include protein-protein interaction networks, gene regulatory networks, and parasite and pathogen networks, to name a few [22]. These networks are oriented as nodes and edges representing a multitude of biological entities and their interactions [23]. Comparable to other diseases, schizophrenia is modelled as a network in various studies. Some of the significant studies are listed in Table 1.

Table 1 (excerpt)
No. | Reference | Study | Outcome
... | ... | ... | The unitary mechanism of the disease is identified in cognitive, negative and positive domains
6 | [28] | Several drugs including dopaminergic, cholinergic, glutamatergic, GABA (Gamma-Aminobutyric Acid), kappa opioid, cannabinoid and serotonergic are evaluated to understand their interaction patterns in schizophrenia | The stimulants impacting progression of schizophrenia are identified from the drug models
7 | [12] | Multiple alterations in brain disorders are identified using a network model | The network model detected the positive symptoms of diseases using an integrated approach from social, biological and psychological factors
8 | [29] | Predictive model is developed based on functional network patterns to detect schizophrenia | Sparse multivariate regression model applied on whole-brain functionality resulted in 74% accuracy for predicting schizophrenia
9 | [30] | Magnetic resonance imaging data is utilized for mapping differences in brain structure | Overlapping regions of 2% are observed in cerebral, frontal and temporal regions
10 | [31] | Differentially expressed schizophrenia transcripts are identified using dysregulated genes | Two markers, RGS1 and CCL4, are identified with 97% accuracy from 27% of the patient subset

Prominence of Community Detection in Biological Networks
Community detection is a mechanism for visualization of connections across different modules in a network [19]. Community detection is widely used in biological networks for detection of functional components. In this context, many algorithms have been devised for community detection. Some of the prominent ones are discussed in Table 2.

Tie Structure Analysis
Tie structure detection is performed to detect macro- and micro-level interactions in networks. A tie is a structure that captures relevant information from a network. The definition of these ties varies across networks. In biological networks, a tie represents significant biological details in the form of genes, proteins, enzymes, drugs, etc. Such ties are of two types, namely strong and weak ties. A strong tie captures associations across two closely held biological entities, while weak ties are spread across the entire network to discover prominent associations. These weak ties are also called bridges, as they maintain the global connectivity of the entire network. Both kinds of ties are significant in detecting topological and functional features of a network. Hence, capturing information from tie structures reveals deeper insights about the dynamics of the underlying network. Some of the prominent tie detection studies are shown in Table 3. Based on these studies, it is concluded that tie structure identification is substantial for analyzing network orientations. Currently, no study has focused on tie structure analysis in the schizophrenia gene network. Hence, this study focuses on recognizing relevant gene connections across the disease network.

Methods
This section highlights the methodology adopted for elucidating relevant gene entities from schizophrenia network.

Collecting Gene Data
Schizophrenia gene data is gathered from multiple biological repositories including DisGeNET [44], SZDB [45] and SZGR2.0 [46]. Such an integrated dataset includes all the essential genes expressed in the pathology of schizophrenia. This dataset is further validated by linking the genes with the schizophrenia pathway information mined from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [47].

Identifying Functional Modules and Creating the Gene Network
From the thousands of genes collected in the previous step, functional modules are identified based on the biological processes expressed in schizophrenia. Biological processes are identified by Gene Ontology (GO) search [48] and literature analysis [49][50][51][52][53][54]. Centered on these essential processes in schizophrenia, the genes are distributed across different modules. These modules are further employed for constructing the schizophrenia gene network. This network is undirected, as there are no paths between the gene modules.

Categorizing the Gene Components
Gene modules within the schizophrenia network are classified based on the biological processes they belong to. It is observed that a gene can be part of more than one biological process. Gene modules are categorized based on joint modelling of genes and their biological processes using topic modelling strategies. Topics are pre-defined as labeled attributes using biological processes. Supervised topic modeling is suitable for such tasks, as a response variable for each term exists in the data. Out of the numerous algorithms available for supervised modelling, the supervised Latent Dirichlet Allocation (sLDA) algorithm [55] is appropriate for the dataset, resulting in enriched topic-specific gene associations. These associations are derived by estimating the maximum likelihood of gene modules within the network.
Data 2019, 4, 149 6 of 23

Modularity Based Community Detection
Following module classification, the community structure is to be identified from the gene dataset. Out of several standard metrics available for computing the quality and strength of communities, the modularity index is chosen. Modularity is known to perform better for biological networks when there are multiple interactions across genetic networks [56,57]. Modularity compares the edge density of the clusters in a given graph G with the edge density of the same clusters in a random graph G'. The larger the difference between these edge densities, the better the communities are clustered. Mathematically, the modularity function is represented as Q, defined by:

$$Q = \frac{1}{2m} \sum_{i,j} \left( a_{ij} - \frac{k_i k_j}{2m} \right) \delta(c_i, c_j)$$

Here, $a_{ij}$ represents the edges between vertex i and vertex j, and $k_i k_j / 2m$ denotes the expected number of edges between vertex i and vertex j when edges are placed at random; $k_i$ is the degree of vertex i, m is the total number of edges, and $\delta(c_i, c_j)$ equals 1 when both vertices belong to the same community and 0 otherwise. The elements within the modularity function are indicated using the matrix M. It is defined as $M = [m_{ij}]$, where $m_{ij}$ is denoted as:

$$m_{ij} = a_{ij} - \frac{k_i k_j}{2m}$$

when G is undirected. By substituting the value of $m_{ij}$, the modularity function Q becomes, for a division into two groups with $s_i = \pm 1$,

$$Q = \frac{1}{4m} s^{T} M s$$

Here, $s = (s_1, \ldots, s_n)^{T}$ is the vector indicating the community membership in quadratic form. Initially, gene communities are detected using this modularity metric. However, it resulted in poorly identified clusters with no relevant interactions. Hence, this metric is to be optimized for deriving better partitioned gene communities.
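As a minimal sketch of how the modularity index can be evaluated for a given partition, the function below computes Q for an undirected, unweighted edge list using the standard Newman definition. The function name, the toy network and the node labels (g1 to g6) are hypothetical placeholders and are not taken from the schizophrenia dataset.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q of an undirected, unweighted graph.

    edges:     iterable of (u, v) pairs, each undirected edge listed once
    community: dict mapping every node to a community label
    """
    degree = defaultdict(int)      # k_i
    adjacency = defaultdict(int)   # a_ij, stored symmetrically
    m = 0                          # total number of edges
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
        adjacency[(u, v)] += 1
        adjacency[(v, u)] += 1
        m += 1

    q = 0.0
    for i in community:
        for j in community:
            if community[i] != community[j]:
                continue  # delta(c_i, c_j) = 0
            # m_ij = a_ij - k_i * k_j / (2m)
            q += adjacency.get((i, j), 0) - degree[i] * degree[j] / (2 * m)
    return q / (2 * m)


# Hypothetical toy network: two triangles joined by a single edge.
toy_edges = [("g1", "g2"), ("g2", "g3"), ("g1", "g3"),
             ("g4", "g5"), ("g5", "g6"), ("g4", "g6"),
             ("g3", "g4")]
toy_partition = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1}
print(modularity(toy_edges, toy_partition))
```

Placing every node in one community gives Q = 0 with this definition, which is why a partition is only considered informative when Q is clearly positive.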

Optimizing Modularity Using Nonlinear Embedding
Of the several optimizations available for the modularity metric, nonlinear embedding is preferred over linear methods because it preserves the inherent patterns [58]. Nonlinear embedding maps a manifold lying in a high-dimensional space to a low-dimensional embedding by representing the data instances through their nearest neighbors. Such dimensional adaptations in tangent space are known to preserve the data points with minimal reconstruction errors.
The local tangent space alignment (LTSA) algorithm is a type of nonlinear embedding deduced in tangent space by examining the overlapping substructures within the local coordinates [59]. Based on these local embeddings, global coordinates are aligned across the network. Such an embedding is suitable for community detection, as it retains the intrinsic network alignment by separating nodes of different communities and drawing together nodes of the same community [60]. The unweighted variant of the LTSA function is considered for deriving the communities, as the gene network is undirected in this case.

Implementing LTSACom for Community Detection
Employing the LTSA function for gene community detection yields an implementation referred to as LTSACom. This algorithm attempts to represent the local structure of gene communities in tangent space to explore possible gene interactions [61].
An x-dimensional manifold is to be embedded in a y-dimensional space such that x < y. Initially, the modularity matrix M is to be sampled from an m-dimensional orientation to s dimensions with local embedding. The fixed data point in M is denoted by a, while the tangent space attached to a is denoted by $T_1 A$. The algorithm includes three basic steps:
Step 1: Extracting Local Coordinates
The first step is performed to identify local coordinates in the modularity function based on their orientation in the manifold. Any function representing a manifold can be expressed using a first-order Taylor series expansion. The expansion of the manifold is denoted as: Here, m denotes the s-dimensional data point such that $m \in \mathbb{R}^{s}$. The tangent vector is employed at this point to provide the representation of a in tangent space with minimal error. This is followed by computing the local orthogonal basis of the modularity matrix. This orthogonality is significant for deriving mutually orthogonal vectors $Q_M$, based on the Jacobian function $J_f$.
The next objective is to identify the local transformation metric L for the global space coordinate M such that it minimizes the local mapping function. It is defined as:
Step 2: Alignment-Based Feature Extraction
For the gene data instances $x_i$, the optimal approximation in the s-dimensional subspace is computed by Principal Component Analysis (PCA) at local orientation. PCA reduces the dimensions of the m orthonormal attributes in M into an s-dimensional space (such that s < m). This is followed by computing the covariance matrix C, which includes the covariance metrics of the matrix M for all the data attributes. Furthermore, performing eigenvalue decomposition on the matrix C results in a list of eigenvectors and their respective eigenvalues. These eigenvectors indicate the components present in the reduced subspace s, sorted by their eigenvalues. Based on this decomposition, the r vectors having maximal variance are derived for the k neighbors in low dimension. The metric defined for selecting local coordinates is denoted as: Here, M includes n orthonormal attributes, with $\bar{x}_i$ defining the average of all the $X_i$ values. $\theta$ is given by $Q_i^{T}(x_{ij} - \bar{x}_i)$. Q in the $\theta$ function denotes the matrix of r singular vectors corresponding to the n largest singular values derived from PCA.
Outcomes from PCA are reflected in the alignment matrix constructed from the local coordinates. It is important to derive this alignment matrix while keeping the reconstruction error as small as possible. The alignment matrix so constructed includes all the coordinates in low dimension.
Step 3: Global Alignment of Coordinates
Based on the eigenvectors computed in the previous step, the matrices with the smallest n + 1 eigenvectors are selected. This selection spans the coordinates from the second eigenvector matrix to the smallest (n + 1)th eigenvector. This selection identifies the global coordinates of matrix M, such that the overall reconstruction error is minimal. The characteristic equation for error analysis is given by: Here, $\theta_i^{+}$ denotes the generalized inverse of $\theta_i$. Grounded on these three principles, the LTSACom algorithm is designed for analyzing the schizophrenia gene communities.
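For orientation, the three steps can be related to the standard LTSA formulation, sketched below in LaTeX. The notation ($X_i$, $Q_i$, $\Theta_i$, $T_i$, $e$, $k$) follows the general LTSA literature and is an assumption here; it may not coincide symbol for symbol with the notation used by LTSACom.

```latex
% Step 1: first-order Taylor expansion of the manifold map f around a fixed point a
f(m) = f(a) + J_f(a)\,(m - a) + O\!\left(\lVert m - a \rVert^{2}\right)

% Step 2: local PCA fit over the k nearest neighbours x_{i1}, \dots, x_{ik} of x_i
\min_{x,\,Q,\,\theta} \sum_{j=1}^{k} \bigl\lVert x_{ij} - (x + Q\,\theta_j) \bigr\rVert_2^{2},
\qquad
\theta_j^{(i)} = Q_i^{T}\,(x_{ij} - \bar{x}_i)

% Step 3: local reconstruction error and global alignment
% (e is the all-ones vector of length k, \Theta_i^{+} the Moore-Penrose inverse of \Theta_i)
E_i = T_i \Bigl(I - \tfrac{1}{k}\, e e^{T}\Bigr)\bigl(I - \Theta_i^{+}\Theta_i\bigr),
\qquad
\min_{T\,T^{T} = I} \sum_i \lVert E_i \rVert_F^{2}
```

In this formulation the minimizer of the last problem is spanned by the eigenvectors of the alignment matrix associated with its 2nd to (d + 1)th smallest eigenvalues, which matches the eigenvector selection described in Step 3.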

Validation of Gene Communities
It is important to validate the underlying community structure obtained in the previous step to affirm its structural orientation. In this context, a random number of edges are removed from each community to examine the structural variations. The network obtained after removing random edges is subjected to community detection using the LTSACom algorithm. The modularity function is recomputed to compare the observations before and after removing the edges. The performance of LTSACom is further evaluated using the mixing parameter (µ), which is an influential metric for the evaluation of community structure. It is defined as follows: Here, the external and total degrees are estimated for every node n in the network, denoted as $d_n^{ext}$ and $d_n^{tot}$ respectively [62]. The external degree of a node is the number of links connecting it to vertices in other communities. Several studies have reported that network communities are well depicted when µ lies between 0 and 1, while a value of 0.5 depicts well connected structures. Based on the computed values of the mixing parameter, the performance of the algorithm in detecting gene communities is adjudicated.
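The validation procedure can be illustrated with a short sketch. It assumes the aggregated form of the mixing parameter, namely the sum of external degrees divided by the sum of total degrees over all nodes; a per-node definition would be computed differently. The function names, the removal fraction and the toy network are hypothetical.

```python
import random


def mixing_parameter(edges, community):
    """Aggregated mixing parameter: sum of external degrees / sum of total degrees.

    Assumption: mu is taken over the whole network rather than per node.
    """
    external = 0
    total = 0
    for u, v in edges:
        total += 2              # each edge adds one to the degree of both endpoints
        if community[u] != community[v]:
            external += 2       # and to the external degree of both endpoints
    return external / total if total else 0.0


def remove_random_edges(edges, fraction, seed=0):
    """Return a copy of the edge list with a random fraction of edges removed."""
    rng = random.Random(seed)
    edges = list(edges)
    keep = len(edges) - int(round(fraction * len(edges)))
    return rng.sample(edges, keep)


# Hypothetical toy network: two triangles joined by a single edge.
toy_edges = [("g1", "g2"), ("g2", "g3"), ("g1", "g3"),
             ("g4", "g5"), ("g5", "g6"), ("g4", "g6"),
             ("g3", "g4")]
toy_partition = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1}

print(mixing_parameter(toy_edges, toy_partition))
perturbed = remove_random_edges(toy_edges, fraction=0.3)
print(mixing_parameter(perturbed, toy_partition))
```

In practice, community detection would be rerun on the perturbed edge list and the modularity recomputed, so that the values before and after edge removal can be compared.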

Discovering the Tie Structure from Communities
Corresponding genes across the network communities are identified based on their associations. Furthermore, these associations reveal the inherent tie structure (i.e., strong and weak ties) within these communities. Strong ties are formed within gene communities, while weak ties are formed across different gene communities. Weak ties act as co-expressive ties in the case of biological networks due to multiple interactions between any two vertices. The genes spread across such ties are further examined to validate their biological relevance in the progression of the disorder.
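The distinction between strong and weak ties described above can be expressed as a simple rule once community labels are available. The sketch below is illustrative only: the function name and the node labels are hypothetical, and it assumes that every tie is classified purely by whether its two endpoints share a community.

```python
def classify_ties(edges, community):
    """Split edges into strong ties (same community) and weak ties (different communities)."""
    strong, weak = [], []
    for u, v in edges:
        if community[u] == community[v]:
            strong.append((u, v))
        else:
            weak.append((u, v))
    return strong, weak


# Hypothetical toy network: two triangles joined by a single edge.
toy_edges = [("g1", "g2"), ("g2", "g3"), ("g1", "g3"),
             ("g4", "g5"), ("g5", "g6"), ("g4", "g6"),
             ("g3", "g4")]
toy_partition = {"g1": 0, "g2": 0, "g3": 0, "g4": 1, "g5": 1, "g6": 1}

strong_ties, weak_ties = classify_ties(toy_edges, toy_partition)
print(len(strong_ties), weak_ties)
```

The genes appearing at the endpoints of the weak ties are the candidates that would then be checked against the literature for biological relevance.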

Multiple Correspondence Analysis
The relationship between gene communities and their embedded genes is discovered using multiple correspondence analysis (MCA). MCA is a multivariate technique which is the generalized extension of principal component analysis (PCA) for detecting the associations among multiple categorical variables [63]. This technique identifies the correlations among gene modules across the network based on their orientation.

Results
This section highlights the observations for the schizophrenia gene dataset based on community detection and tie analysis. The dataset utilized for this study is provided as a Supplementary file.

Description of the Gene Dataset
The annotated schizophrenia gene dataset is categorized based on the substantial biological processes identified from GO and literature analysis. Six prominent processes are identified for schizophrenia, namely inflammation, immune response, genetic factors, neurotransmitters, metabolism and stress inducers. Based on these categories, the genes are grouped into six different modules. The genes spread across each category are presented in Table 4; the stress inducers module, for instance, contains 36 genes. The orientation of these genes at different phases is reflected in Figure 1.

Supervised LDA for Topic Modeling
The six modules identified in the previous step are subjected to topic modeling for assessing the nature of association among genes. Supervised topic modeling is implemented owing to the labeled gene dataset. The supervised Latent Dirichlet Allocation (sLDA) algorithm is tested on this dataset using its functionality available in the R programming language [64]. Gibbs sampling is performed initially using the sLDA function by taking the gene data as input. The latent parameters alpha, eta and variance are set to 1.0, 0.5 and 0.3 respectively, after trial and error analysis for defining their values. The computed model results in a topic matrix for each gene category, predicting their associations. This step is iterated ten times to cross validate the outcome from the model, resulting in a tenfold cross validation. This model reveals stronger connections between modules 1, 2 and 3, which are inflammatory, immune and genetic factors respectively. These associations are calculated based on a connectivity score ranging between 0 and 1. The stronger the association, the higher this score. Relations captured within these modules are used to identify mutual genes across these schizophrenia gene categories. The pictorial representation in Figure 2 highlights gene modules along with associations between three significant gene modules for a smaller data instance.

Modularity-Based Community Detection
Building on the association between genes and their categories, communities are to be detected. Initially, the traditional modularity metric is applied on these genes to detect communities. This, however, resulted in two communities with a low modularity index of 0.239. To discover enhanced communities, maximization of the modularity function Q is performed using the nonlinear embedding of the LTSA algorithm. This algorithm, LTSACom, comprises three steps for community detection:
Step 1: The modularity matrix M is given as input to the algorithm. This is followed by extracting local information from gene modules based on their k nearest neighbors. For each of the six modules, the nearest neighbors are computed.
Step 2: The alignment matrix A is constructed based on the largest eigenvector computed using PCA in the low-dimensional subspace. The local coordinates obtained from these vector matrices are further summed.
Step 3: All the minimum (d + 1) eigenvectors for the modularity metric are computed by observing the inherent tie structure in the network. The eigenvector matrices corresponding to the 2nd up to the (d + 1)th smallest eigenvalues are selected to detect the global coordinates of M.
The algorithm of LTSACom is represented as Algorithm 1.

Algorithm 1: LTSACom for community detection
Input: The modularity matrix M derived from the schizophrenia gene dataset for detection of gene communities
Step 1: Compute the nearest neighbors using the local information among genes in tangent space
Step 2: Construct the unweighted alignment matrix A based on the embedded vectors in the matrix M
Step 3: Perform global optimization of A based on local tangents using eigenvector decomposition
Output: Compute the modularity index for the dataset to identify gene communities

Performing these steps in nonlinear embedding generates the network structure for the schizophrenia gene dataset, resulting in six diverse communities. The modularity index for these communities is found to be 0.9256, which is considerably superior to the initial value of 0.239.

Validating the Community Structure
The gene communities detected by LTSACom need to be authenticated to ensure that the modules are not formed by random chance. For this purpose, the mixing parameter is computed by altering the link structure in the network. It is observed that as the value of µ increases, the modularity function decreases gradually. When µ is found to be 0.56, the algorithm detects relevant gene communities with six diverse clusters representing the six classes of genes. This value reflects the stronger interconnections across the gene communities.
For each of these communities, some of the centrality metrics are also computed and represented pictorially as distributions in Figure 3. As observed in the figure, betweenness centrality and closeness centrality highlight the importance of certain nodes within a network. These nodes act as influential connections across the network based on the centrality index. Furthermore, distribution plots for eccentricity and modularity are presented as eccentricity distribution and size distribution plots respectively. The eccentricity distribution is used to identify the distances across any two genes of interest in a network. The distribution of communities based on their modularity index is displayed in the size distribution plot. All these distributions are derived from the network visualization software Gephi 0.9.2 [65].
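To make the distance-based metrics concrete, the sketch below computes eccentricity and closeness for every node with a breadth-first search. It assumes a connected, unweighted, undirected network and defines closeness as the inverse of the average shortest-path distance; Gephi's own conventions and normalization may differ, and the function names and toy network are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Shortest-path distances (in hops) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def eccentricity_and_closeness(edges):
    """Return {node: (eccentricity, closeness)} for a connected undirected graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    result = {}
    for node in adj:
        dist = bfs_distances(adj, node)
        others = [d for n, d in dist.items() if n != node]
        ecc = max(others)                       # largest distance to any other node
        closeness = len(others) / sum(others)   # inverse of the average distance
        result[node] = (ecc, closeness)
    return result


# Hypothetical toy network: two triangles joined by a single edge.
toy_edges = [("g1", "g2"), ("g2", "g3"), ("g1", "g3"),
             ("g4", "g5"), ("g5", "g6"), ("g4", "g6"),
             ("g3", "g4")]
metrics = eccentricity_and_closeness(toy_edges)
print(metrics["g3"], metrics["g1"])
```

In the toy network the two nodes on the connecting edge have the smallest eccentricity and the highest closeness, which is the pattern expected of genes that bridge communities.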

Performance Analysis of LTSACom
The performance of the LTSACom algorithm is further examined through comparative analysis with other state-of-the-art modularity maximization algorithms. The algorithms used for this purpose include the spectral algorithm (SP) [66], the Fast-Newman (FN) algorithm [67], the Finding and Extracting a Community (FEC) algorithm [68], the Fast Unfolding Algorithm (FUA) [69], the Multi-layer Ant Based Algorithm (MABA) [70] and the InfoMap algorithm [71].
These algorithms are tested on the schizophrenia gene dataset for 200 iterations and the modularity index is calculated at each trial. Computations revealed that the LTSACom algorithm maintains a better modularity index with increasing trials compared to the other algorithms. Observations from this computation are shown in Table 5, clearly indicating better performance of LTSACom in detecting gene communities for schizophrenia data. The communities are also depicted pictorially, highlighting their gene assemblies, in Figures 4 and 5 respectively. As observed from these figures, the stronger the association between the communities, the more resilient their interactions.

Identifying Tie Structure from Gene Communities
Interconnected tie structure is examined across the network communities to identify influential genes in schizophrenia. For the six gene communities, several ties are detected which co-occur across different modules. These ties, spanning the communities, tend to have fragile connections within the network, forming weak ties. Furthermore, the relevance of these ties in schizophrenia is evaluated using literature analysis. Based on this analysis, substantial gene ties are identified and highlighted in Table 6. The tie structure spanning the entire network for different communities is shown in Figure 6 at different stages. This figure highlights the influence of intrinsic genes and ties based on the network size. Initially, the network comprises genes and their categories oriented among each other. These genes represent the core connections which persist in the network irrespective of its size. As the network size increases, further genes are added to the network, highlighting the communities and tie structure.
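One simple reading of a tie that spans communities is an edge whose two endpoints are assigned to different modules. The sketch below extracts such edges and counts them per pair of modules. The edge list, the gene labels g1 to g5 and the function names are hypothetical, and the module names are borrowed from the figure captions only to make the example readable. This is not the procedure used to build Table 6.

```python
from collections import Counter


def spanning_ties(edges, module_of):
    """Return the edges whose endpoints lie in different modules."""
    return [(u, v) for u, v in edges if module_of[u] != module_of[v]]


def ties_per_module_pair(edges, module_of):
    """Count spanning ties for each unordered pair of modules."""
    counts = Counter()
    for u, v in spanning_ties(edges, module_of):
        pair = tuple(sorted((module_of[u], module_of[v])))
        counts[pair] += 1
    return counts


# Hypothetical toy data, not taken from the schizophrenia dataset.
toy_ties = [("g1", "g2"), ("g2", "g3"), ("g3", "g4"), ("g4", "g5"), ("g1", "g5")]
toy_modules = {
    "g1": "inflammation",
    "g2": "inflammation",
    "g3": "immune response",
    "g4": "immune response",
    "g5": "genetic factors",
}

print(spanning_ties(toy_ties, toy_modules))
print(ties_per_module_pair(toy_ties, toy_modules))
```

Edges inside a module (here g1-g2 and g3-g4) are filtered out, leaving only the ties that connect one module to another.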

Multiple Correspondence Analysis
The relationship between the categorical network modules and their gene instances is detected using the MCA technique. This analysis exposes concealed patterns within the schizophrenia network. The FactoMineR package available in the R programming language is used to perform MCA [72]. First, the variation across data instances is calculated using the eigenvalues. Based on these variations, individual modules are identified and visualized along with their gene associations. The quality of the associations is assessed using the cos2 metric, which is calculated for the genes to judge the strength of their associations. The higher the value of cos2, the better the association across the entities. The genes corresponding to different gene modules are shown in Figure 7 as an MCA plot. This figure highlights the interactions based on the cos2 index.
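The cos2 metric has a simple geometric meaning: for a point in the factor space, the cos2 on a given dimension is its squared coordinate on that dimension divided by its squared distance from the origin, so the values over all dimensions sum to one. The sketch below computes it in Python from a vector of factor coordinates. The coordinates and the function name are hypothetical, and the study itself relies on FactoMineR in R for these values.

```python
def cos2_per_dimension(coords):
    """Squared cosine of a point on each dimension.

    `coords` holds the point's coordinates on all factor dimensions.
    Each returned value is coord_k^2 / sum_j coord_j^2.
    """
    squared_distance = sum(c * c for c in coords)
    if squared_distance == 0:
        # A point at the origin is not represented on any dimension.
        return [0.0 for _ in coords]
    return [c * c / squared_distance for c in coords]


# Hypothetical coordinates of one gene on two MCA dimensions.
gene_coords = [3.0, 4.0]
quality = cos2_per_dimension(gene_coords)
print(quality)
```

A value close to one on a dimension means the point is well represented by that dimension, which is why higher cos2 values indicate better associations in the MCA plot.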

Conclusions
This study explores the importance of tie structure in gene networks, inspired by the findings of Granovetter [21]. The study initially frames a few exploratory questions pertaining to gene interactions in schizophrenia, and the relevant findings are discussed in this section. The first question concerns the influence of community structure in the schizophrenia gene network. The analysis shows that the inherent modular structure of gene networks can be discovered effectively by detecting communities. These modular structures are then used to ascertain hidden associations in the network. Hence, communities are found to be suitable entities for quantifying biological properties from the gene network. The second question builds on the outcome of the first. Since communities are considered crucial in networks, it examines the impact of tie structure within these communities. The observations show that ties highlight the inherent associations among different genes, and these associations further reveal the strength of network connectivity. The patterns of these ties within gene communities could also highlight the influence of a gene across the entire network. Hence, the study claims that integral ties are essential for ascertaining the functioning of a gene network. The third question is framed to discover the relevance of strong and weak ties in the gene network. Investigating the gene network revealed that the entire network is composed of co-expressed ties scattered across different biological modules. These co-expressed ties span two or more communities, representing weaker connections. However, it is these frail interactions, rather than the strong ties, that maintain the global connectivity of the gene network. Hence, the study affirms that weak ties influence the functioning of the schizophrenia gene network both locally and globally.
In brief, the study has two main outcomes: (i) a novel modularity maximization algorithm, LTSACom, based on the LTSA function for detecting gene communities; and (ii) detection of the influence of weak ties in the schizophrenia network. The modularity-based algorithm reveals diverse gene communities across the network with an increased modularity value of 0.9256. Furthermore, these communities disclose co-expressed interactive ties across the network, throwing light on the relevance of embedded ties. These ties influence the global connectivity of the network through feeble interactions across prominent genes. Identifying such genes helps in ascertaining the "dominant hotspots" that influence the progression of a disease. Additionally, the study performs some fundamental analyses, including topic modeling, centrality distributions, community validation and MCA testing, for identifying the dynamics of the inherent gene network.
This study is novel in several respects. The research is innovative in discovering the impact of weak ties in schizophrenia gene networks. Previous studies have focused on network-based approaches for recognizing gene expression in schizophrenia networks [73,74]. However, these studies have not acknowledged the relevance of tie structures within schizophrenia gene networks. Furthermore, the computational gene modeling employed in this study can be a promising technique for ascertaining the micro- and macro-level interactions across the network. These patterns of interaction can be scrutinized to uncover the expression of a particular gene of interest. Such pattern-based gene expression can promote drug design towards a susceptible gene target rather than a much more complex protein. However, this study is an initial attempt in this direction, as further investigation is required to reconstruct the schizophrenia gene network on a large scale for measuring gene expression for targeted therapy. The dataset adopted in this study focuses on genes alone, while proteins and drugs are other significant entities to be considered for modelling the disease. Identifying interactions across protein networks and drug molecules would uncover hidden functional implications of the disorder. Hence, such a dataset needs to be constructed to aid in modelling the disorder at the modular level. Furthermore, gene communities are discovered by optimizing the modularity metric, which is often subject to resolution limits resulting in local and global deviations that need to be addressed [75]. In this context, it is necessary to evaluate the performance of optimized modularity on local and global scales prior to tie structure analysis.
Despite these limitations, the current study identifies significant associations among multiple mechanisms that contribute to the progression of the illness. These outcomes have significant implications for designing targeted therapies against schizophrenia. Such targeted therapeutics can be adopted in conjunction with other medications to combat the disorder at the genomic level. Furthermore, contributions from the current research help in detecting complex interactions among genetic, inflammatory, immune and environmental factors based on the modularity metric derived from the LTSACom algorithm. Unraveling such complex associations with the help of tie interactions helps in designing personalized medications on an individual basis, in contrast to universal treatment procedures for schizophrenia. Hence, the current approach looks promising for detecting functional entities within the gene network. Furthermore, this technique could be extended in future to expose common functional modules across groups of psychiatric disorders, including schizophrenia, bipolar disorder, paraphrenia and other psychotic conditions.

Figure 1. The orientation of schizophrenia genes at different phases: (a) the initial gene network comprising all schizophrenia genes as a cluster; (b) the gene network highlighting the nodes after calculating their degrees; (c) the genes oriented based on their biological modules; (d) genes revealing the tie structure across the modules. The figure is generated using the Gephi tool.

Figure 2. The topics spanning the gene modules, revealing a strong association between topic 1 (i.e., inflammation), topic 2 (i.e., immune response) and topic 3 (i.e., genetic factors). The colored pattern indicates the type of gene mechanisms on the X-axis (lr) with the estimate of their occurrence on the Y-axis (density). The black colored patterns indicate gene mechanisms as topics. Topic 1 corresponds to module 1 (inflammation), topic 2 to module 2 (immune response) and topic 3 to module 3 (genetic factors). The figure is generated using the R programming language.

Figure 3. Centrality, eccentricity and size distributions for the gene communities.

Figure 4. The gene communities highlighting significant biological processes in schizophrenia: (a) backbone community structure; (b) embedded connections in the community structure.

Figure 5. The interactions between different gene communities: (a) the interactions between inflammatory and immune response gene modules; (b) the interactions between inflammatory, immune response and genetic factor modules; (c) the interactions between genetic, neurotransmitter and metabolic gene modules.

Figure 6. The tie structure of the schizophrenia network: (a) the core genes expressed across different gene modules; (b) a few more genes obtained after increasing the network size; (c) the core genes forming a community structure displaying the inherent ties; (d) the community structure oriented across the core genes for the entire network.

Figure 7. The reflection of genes scattered across different modules based on the cos2 metric.

Table 1. Prominent studies modeling schizophrenia as a network.

Table 2. Studies highlighting community detection in biological networks.

Table 3. Substantial tie detection studies.

Table 4. The distribution of schizophrenia genes across the gene modules.

Table 5. Performance analysis of LTSACom for modularity maximization.

Table 6. Significant ties in schizophrenia gene network.