Application of Thermodynamics and Protein–Protein Interaction Network Topology for Discovery of Potential New Treatments for Temporal Lobe Epilepsy

Featured Application: This work describes the use of new methodology based on Gibbs homology analysis for the identiﬁcation of potential protein targets as well as their inhibitors for the development of therapeutic options for various diseases. In the past, similar approaches have been proposed and partially validated for various types of cancer. Here, we apply the method that combines thermodynamic measures with protein–protein interaction network topology to temporal lobe epilepsy. Our results identify a number of potential therapeutic targets. Abstract: In this paper, we propose a bioinformatics-based method, which introduces thermodynamic measures and topological characteristics aimed to identify potential drug targets for pharmaco-resistant epileptic patients. We apply the Gibbs homology analysis to the protein–protein interaction network characteristic of temporal lobe epilepsy. With the identiﬁcation of key proteins involved in the disease, particularly a number of ribosomal proteins, an assessment of their inhibitors is the next logical step. The results of our work offer a direction for future development of prospective therapeutic solutions for epilepsy patients, especially those who are not responding to the current standard of care.


Introduction
It has been estimated that about 50 million people worldwide suffer from epilepsy [1]. In 2015, about 3.4 million people had active epilepsy in the U.S. alone [2]. Epilepsy is one of the most common and most disabling neurological disorders, characterized by recurrent unprovoked excessive brain activity [3]. The current understanding of the neurophysiological mechanisms of epilepsy is largely based on extensive investigations of neuronal cells. However, glial cells have also been demonstrated to play a fundamental role in triggering seizures. Accurate diagnosis of epilepsy is challenging and therapeutic strategies span a range from single, to multiple drug courses of administration, to respective neurosurgical procedures and dietary therapy. In spite of the fact that several dozen antiepileptic drugs (AEDs) are available, there are still approximately 20-30% of patients who do not respond satisfactorily to these AEDs [3]. To develop new, effective treatments, studies focus on the development of the central nervous system (CNS) and neuronal activities in vivo to understand the causes and mechanisms of the disease initiation and progression [4]. However, to find effective treatments for patients who are refractory to current AEDs, it is necessary to study each case more precisely and individually. Hippocampal biopsy tissue of pharmaco-resistant Temporal Lobe Epileptic (TLE) patients is an extraordinarily useful substrate to study molecular mechanisms related to structural and cellular abnormalities in epilepsy. Several genes and signaling cascade alterations have already been reported in the literature based on the TLE hippocampus analysis [5]. The availability of gene expression profiling provides a detailed insight into the disease for each individual patient. Methods, such as microarray and RNA sequencing, developed in the recent decade are used to obtain genome-wide mRNA expression data. One of the advantages of using such methods is that this type of investigation is biomarker driven. Especially with whole genome-wide data, the comprehensive information on the status of each functioning unit, its interacting complex and interaction pathways is suitable for analysis, which can reveal potentially important molecular mechanisms and novel therapeutic targets. By investigating genomewide expression data, we can study the integrated results of all possible causative changes for each patient. Thus, the results of such an analysis are considered to be precise and biomarker-driven [6].
Epilepsy is a complex neuro-pathology that arises due to different etiologies, having various localizations, often occurring in conjunction with other diseases. Irrespective of what triggers the paroxysm, the electrical discharge during seizure is the common clinical manifestation for all forms of epilepsy, suggesting an underlying common molecular mechanism [7]. Although revealing the precise origin of epilepsy is still part of the ongoing investigations, it was shown that the cause may vary from de novo genetic mutations [8] to traumatic brain injury [9,10]. While gene mutations may naturally lead to altered downstream pathway behavior, brain injury was also shown to cause chronically altered gene expression signatures of genes that were linked to epilepsy [9]. Recent studies revealed some genetic causes of epilepsy, such as gene SCN1A mutations, which affect sodium channels [8] and tuberous sclerosis complex (TSC) mutations, which lead to the dysregulation of the mechanistic target of rapamycin (mTOR) pathway [11], both of which lead to epileptic conditions. Mechanisms of epileptogenesis consist of genetic and epigenetic alterations occurring in both neuronal and astroglia cells [12]. Abnormal activity of astrocytes during epileptic events has been extensively reported to play a major role due to their K+ buffering role in the extracellular milieu [13].
Clinical diagnosis of epilepsy starts with an identification of the type of seizure and proceeds with EEG and neuroimaging studies. Accurate diagnosis is pivotal for the adoption of an appropriate therapy. Anti-epileptic drugs (AEDs) represent first-line treatment for epilepsy and despite the availability of more than 20 such drugs, approximately 30% of patients do not respond to this type of therapy [12]. Among all epilepsy types, temporal lobe epilepsy (TLE) is the most common drug-resistant form of epilepsy in adults. Current AEDs mainly act by directing transmembrane ion channel function or by promoting γ-aminobutyric acid (GABA)-mediated inhibition to decrease the electrical activity of the brain [3,14]. For instance, phenytoin and lacosamide inhibit sodium channel activation; retigabine opens potassium channels; ethosuximide and lamotrigine block calcium channels; tiagabine inhibits GABA reuptake and phenobarbital and benzodiazepines enhance GABA receptors [3]. The AEDs generally focus on stabilizing and elevating the threshold of the CNS against hyperexcitability. On the other hand, the role of GABA neurotransmitters and GABA receptors in inhibiting activity in the central nervous system [15], and the role of glutamate neurotransmitters in aberrant hyperexcitability [16] have been shown to represent possible mechanisms which can be explored in order to develop new treatments. With extensive research focused on each of the above approaches, great progress has been made in deciphering epilepsy's pathophysiology at a molecular level. The better the understanding of the complex molecular mechanisms involved in the disease initiation and progression, the more unrealistic it appears that a drug affecting a single neurotransmitter receptor or an ion channel will be found to effectively treat epilepsy. Instead, combinations of pharmacological agents designed for an individual patient with a known expression profile should lead to better personalized clinical outcomes. In order to optimize such drug combinations, sophisticated quantitative analyses of protein-protein interaction networks should be implemented involving the key biomarker proteins of interest and the results of these analysis validated experimentally. In this paper we develop such a computational modelling effort for epilepsy that is based on thermodynamic measures of protein-protein network characterization, having previously provided a general approach [17] and also applied it specifically to other diseases such as various types and stages of cancer [18][19][20][21][22][23][24][25][26].

Materials and Methods
The theoretical underpinnings for our understanding of the thermodynamics and bioenergetics of brain development started by investigating the molecular biology of human diseases from a systems and network biology perspective. These studies were developed over a several-year period [17][18][19][20][21][22][23][24][25][26]. Here, we only provide a brief summary of these approaches. The transcriptome and other -omic (e.g., proteomic, genomic, metabolomic) measures can represent the collective energetic state of a cell. By the use of the word "energetic", we mean it from a thermodynamics perspective where one uses thermodynamic functions of state such as entropy, Gibbs free energy, enthalpy, and internal energy. In particular, for open thermodynamic systems such as the human body Gibbs free energy is a suitable thermodynamic function of state, which can be computed from the so-called chemical potential for the statistical system such as a network of proteins expressed by a living cell. There, a chemical potential can be found for all interacting molecules in a cell, in particular, a chemical potential of all the proteins that interact with each other. This can be imagined to represent a rugged landscape, not dissimilar to Waddington's epigenetic landscape [27,28]. We provide mathematical expressions for the Gibbs free energy of a cellular protein-protein interaction network below.
To perform these calculations, we need input data and a method of calculating the Gibbs free energy. The method we propose uses mRNA transcriptome data or RNAseq data as a surrogate for actual measurements of protein concentration values. This assumption is largely valid since Kim et al. [29] and Wihelm et al. [30] have shown an 83% correlation between mass spectrometry proteomic information and transcriptomic information for multiple tissue types. Further, Guo et al. [31] found a Spearman correlation of 0.8 in comparing RNAseq and mRNA transcriptome from TCGA human cancer data [32]. Therefore, we have decided to use this highly correlated proxy for protein expression data in our calculations.
Given a set of transcriptome data as representative of protein concentration values, we overlay that on the graph of the human protein-protein interaction network from BioGrid [33]. This means we assign to each protein representing a node of the network, the scaled (between 0 and 1), transcriptome value (or RNAseq value). The edges in this network correspond to protein-protein interactions and they define a unique topology for a given protein-protein interaction network. As shown in our previous work [18][19][20][21][22][23][24][25], each disease studied so far is characterized by a unique network topology. From the data extracted for a given protein-protein interaction network, we compute the Gibbs free energy of each protein-protein interaction using the relation: where c i is the "concentration" of the protein i, normalized, or rescaled, to be between 0 and 1. The sum in the denominator is taken over all protein neighbors (i.e., those that interact with it) of i, and including i. Therefore, the denominator can be considered related to a degree-entropy, although its functional form is much simplified since it does not include logarithmic terms. Carrying out this mathematical operation essentially transforms the "concentration" value assigned to each protein to a corresponding contribution to the Gibbs free energy. Thus, we replace the scalar value of transcriptome to a scalar function the Gibbs free energy. Thus, the equation represents the relationship between concentrations of proteins and the corresponding Gibbs energy.
The Gibbs free energy is a negative number, thus associated with each protein on the network that is a negative potential energy well. When plotted in 3D space where the vertical axis corresponds to the Gibbs free energy and the points in the horizontal plane represent protein coordinates, this results in a rugged energy landscape shown schematically in Figure 1. If we use what is called a topological filtration on this landscape, we essentially move a filtration plane up from the deepest energy well. As the filtration plane is moved up, larger-and-larger energetic subnetworks are captured. For convenience, we stop the filtration at energy threshold 32 meaning 32 nodes in the energetic subnetwork are retained. We call these subnetworks Gibbs-homology networks. This is not a magic number. The threshold of 32 was selected for convenience in showing networks visually. Incidentally, if we attempted to build a "network" by ranking the gene expression values, we would find disconnected nodes and not a connected network.
the "concentration" value assigned to each protein to a corresponding contribution to th Gibbs free energy. Thus, we replace the scalar value of transcriptome to a scalar functio the Gibbs free energy. Thus, the equation represents the relationship between concentr tions of proteins and the corresponding Gibbs energy.
The Gibbs free energy is a negative number, thus associated with each protein on th network that is a negative potential energy well. When plotted in 3D space where th vertical axis corresponds to the Gibbs free energy and the points in the horizontal plan represent protein coordinates, this results in a rugged energy landscape shown schema cally in Figure 1. If we use what is called a topological filtration on this landscape, w essentially move a filtration plane up from the deepest energy well. As the filtration plan is moved up, larger-and-larger energetic subnetworks are captured. For convenience, w stop the filtration at energy threshold 32 meaning 32 nodes in the energetic subnetwo are retained. We call these subnetworks Gibbs-homology networks. This is not a mag number. The threshold of 32 was selected for convenience in showing networks visuall Incidentally, if we attempted to build a "network" by ranking the gene expression value we would find disconnected nodes and not a connected network. We now compute the Betti centrality, a topological measure, on the 32-node energet networks as described in detail by Benzekry et al. [20]. The concept is easily explained follows. In networks, there are holes, or rings, of various sizes. In these energetic pat ways, protein-protein interaction networks, the proteins form interaction rings. densely connected, but not fully connected, networks the rings, or holes, may consist triangles and larger rings of interaction. To find the Betti centrality we ask ourselves th following question. Which protein when removed from the network will change the ove all total number of rings the most? The total number of rings is called the Betti numbe Given a network G consisting of edges, e, and vertices, v, the Betti centrality is given by ( Hence, the difference between the total Betti number B(G) and the Betti number the network after removing node i, gives the Betti centrality for node labeled i. We com pute this for all nodes in the threshold-32 energetic network. Often there will be two o more proteins in the network that have equivalent Betti centrality. We discuss this equi alence and the Betti centrality with respect to the brain region data below in this man script. We now compute the Betti centrality, a topological measure, on the 32-node energetic networks as described in detail by Benzekry et al. [20]. The concept is easily explained as follows. In networks, there are holes, or rings, of various sizes. In these energetic pathways, protein-protein interaction networks, the proteins form interaction rings. In densely connected, but not fully connected, networks the rings, or holes, may consist of triangles and larger rings of interaction. To find the Betti centrality we ask ourselves the following question. Which protein when removed from the network will change the overall total number of rings the most? The total number of rings is called the Betti number. Given a network G consisting of edges, e, and vertices, v, the Betti centrality is given by Hence, the difference between the total Betti number B(G) and the Betti number of the network after removing node i, gives the Betti centrality for node labeled i. We compute this for all nodes in the threshold-32 energetic network. Often there will be two or more proteins in the network that have equivalent Betti centrality. We discuss this equivalence and the Betti centrality with respect to the brain region data below in this manuscript.

Data Sources
Patient information for TLE is available in ref. [34] using Dataset GSE63808. Note that the data from GSE63808 only include epilepsy patients and did not include healthy controls. Obviously, it would be unethical to collect temporal lobe tissue form healthy people.
Computing the Betti centrality for energy threshold 32, we find eleven proteins as the most energetically significant overall. These are ranked in the Pareto chart shown in Figure 2. (Note that Pareto ranking is a common statistical method for displaying differences in data. A Pareto diagram is a simple bar chart that ranks related measures in decreasing order of occurrence, Pareto ranking is based on the principle of non-dominated sorting also called Pareto dominance, which can clearly be seen in the present case). Since there can be one or two (sometimes three) Betti centrality nodes with equivalent energies, the number of centrality nodes in the Pareto chart adds up to greater than 131 (which is the total number of patients whose data have been accessed in this study).

Data Sources
Patient information for TLE is available in ref. [34] using Dataset GSE63808. Note that the data from GSE63808 only include epilepsy patients and did not include healthy controls. Obviously, it would be unethical to collect temporal lobe tissue form healthy people.
Computing the Betti centrality for energy threshold 32, we find eleven proteins as the most energetically significant overall. These are ranked in the Pareto chart shown in Figure 2. (Note that Pareto ranking is a common statistical method for displaying differences in data. A Pareto diagram is a simple bar chart that ranks related measures in decreasing order of occurrence, Pareto ranking is based on the principle of non-dominated sorting also called Pareto dominance, which can clearly be seen in the present case). Since there can be one or two (sometimes three) Betti centrality nodes with equivalent energies, the number of centrality nodes in the Pareto chart adds up to greater than 131 (which is the total number of patients whose data have been accessed in this study).

Results
By using the Pareto ranking, the most important node in the network is found to be CD81. (See Figure S1 for Pareto ranking of control patients.) A Gibbs Homology network at threshold 32 in which CD81 has the highest Betti centrality is shown in Figure 3 (highlighted in yellow) and the nearest-neighbor nodes with smaller, though important Gibbs energies, are shown highlighted in green.
Importantly, CD81 is a transmembrane protein that has tissue specific expression in various tissues including: tonsil, cerebral cortex, lymph node, smooth muscle, and reproductive tissues [35]. Shown in mice, CD81 regulates neuron-induced astrocytic differenti-

Results
By using the Pareto ranking, the most important node in the network is found to be CD81. (See Figure S1 for Pareto ranking of control patients.) A Gibbs Homology network at threshold 32 in which CD81 has the highest Betti centrality is shown in Figure 3 (highlighted in yellow) and the nearest-neighbor nodes with smaller, though important Gibbs energies, are shown highlighted in green. elevated when cellular stress is present, including in leukemia, several types of cancer, T cell under certain stimulation, and individuals suffering from chronic obstructive pulmonary disease (COPD) [44]. In rats, it was found that HSP90 protein levels in piriform cortex decreased after status epilepticus. The degradation was related to neuronal vulnerability to status epilepticus insult [45]. The third most frequent protein, excluding ribosomal proteins, according to the Pareto ranking of Betti centrality nodes is EEF1A1 (eukaryotic translation elongation factor 1, alpha 1). It plays an important role in the cellular translation process. Although EEF1A1 was not associated with neurological diseases, its isoform EEF1A2 was identified to be related. The two proteins are 92% similar in their amino acid sequences but exhibit nonoverlapping expression patterns [46]. The difference in functions that the two isoforms perform might be due to post-translational modifications [47]. In wild-type mice, expression of A2 takes over from A1 starting at 21 days after birth; at the same time, a deletion or biallelic mutation of EEF1A2 gene was found responsible for the early-onset neurological abnormalities and early death in mice [48,49]. In human patients, EEF1A2 missense mutation was associated to early-onset epilepsy, severe intellectual disabilities and specific subtle facial dysmorphic features [50][51][52].
TUBA1A (Tubulin Alpha 1a) is one of the components that make up the cytoskeleton. In particular, together with beta tubulin, it forms a stable tubulin heterodimer that is a building block of microtubules. Microtubules play key roles in mitosis of dividing cells where they form mitotic spindles. In non-dividing cells such as neurons, microtubules form parallel bundles in axons and dendrites providing pathways for axoplasmic Importantly, CD81 is a transmembrane protein that has tissue specific expression in various tissues including: tonsil, cerebral cortex, lymph node, smooth muscle, and reproductive tissues [35]. Shown in mice, CD81 regulates neuron-induced astrocytic differentiation [36]. Three alleles of the mice CD81 in seven genetic backgrounds were associated to abnormal brain development, hematopoietic system, and/or immune cells development and behavior [37]. In CD81-null mice, astrocyte and microglia cell numbers were upregulated, which lead to a 30% increase in brain size [38]. However, in human patients, deficiency of CD81 proteins was only associated to defected B cells [39]. As shown in mice and rats, expression of CD81 gene was up-regulated after seizure in three hippocampal cell layers: DGCL, CA1pyr, and CA3pyr [40,41]. In addition, the gene coding for the CD81 protein was recognized as a tumor-suppressor gene [42].
The second most frequent protein in the Pareto ranking of Betti centrality nodes is HSP90AA1 (heat shock protein 90 alpha family class A member 1). It is a stress induced isoform of the molecular chaperone Hsp90. In human, the gene has ubiquitous expression in brain and testis and 25 other tissues [43]. The expression of HSP90AA1 is known to be elevated when cellular stress is present, including in leukemia, several types of cancer, T cell under certain stimulation, and individuals suffering from chronic obstructive pulmonary disease (COPD) [44]. In rats, it was found that HSP90 protein levels in piriform cortex decreased after status epilepticus. The degradation was related to neuronal vulnerability to status epilepticus insult [45].
The third most frequent protein, excluding ribosomal proteins, according to the Pareto ranking of Betti centrality nodes is EEF1A1 (eukaryotic translation elongation factor 1, alpha 1). It plays an important role in the cellular translation process. Although EEF1A1 was not associated with neurological diseases, its isoform EEF1A2 was identified to be related. The two proteins are 92% similar in their amino acid sequences but exhibit nonoverlapping expression patterns [46]. The difference in functions that the two isoforms perform might be due to post-translational modifications [47]. In wild-type mice, expression of A2 takes over from A1 starting at 21 days after birth; at the same time, a deletion or biallelic mutation of EEF1A2 gene was found responsible for the early-onset neurological Appl. Sci. 2021, 11, 8059 7 of 11 abnormalities and early death in mice [48,49]. In human patients, EEF1A2 missense mutation was associated to early-onset epilepsy, severe intellectual disabilities and specific subtle facial dysmorphic features [50][51][52].
TUBA1A (Tubulin Alpha 1a) is one of the components that make up the cytoskeleton. In particular, together with beta tubulin, it forms a stable tubulin heterodimer that is a building block of microtubules. Microtubules play key roles in mitosis of dividing cells where they form mitotic spindles. In non-dividing cells such as neurons, microtubules form parallel bundles in axons and dendrites providing pathways for axoplasmic transport. Tubulin's gene is overexpressed in the brain's spinal cord and other tissues [53]. In human patients, mutations of this gene are expressed over a wide spectrum of phenotypes including lissencephaly, microcephaly, and early-onset epileptic seizures caused by defective neuronal migration [54,55]. Beta tubulin isotypes, TUBB2A, TUBB3, and TUBB4B have all been found over-expressed in post-traumatic brain injury patients and it has been therefore suggested that these tubulins along with CD44 may be appropriate targets for treatment [9], especially since there are numerous tubulin-binding pharmacological agents available [56]. However, virtually all of the approved and investigational tubulin-binding agents interact with beta and not alpha tubulin. We should point out that the Lipponen et al. study [9] was on rats and the data are available from GSE80174.

Ribosomal Proteins
The following analysis of ribosomal proteins is by no means exhaustive. First, we note that in the entire population of 130 patients, we found 34 ribosomal proteins in the Gibbs-homology networks at energy threshold 35. Of those 34 proteins, 10 were found in the literature to be related in some way to epilepsy, seizures, synaptic transmission, voltage-gated channels, and/or cytoskeleton. These 10 are RPS1, RPS4, RPS10, RPS11, RPS15, RPS27, RPL10A, RPL18, RPL32 [57]. Marrone et al. [58] discussed RPL32 in regard to seizures and aberrant cellular homeostasis. RPS6 is involved in co-expression of cyclin D1 in hemimegalencephaly [59] and is a key player in the mTOR signaling pathway and a contributor to epilepsy [60]. RPS6 is also implicated in X-chromosome brain diseases, as is RPS4, RPS1, RPL10 [61]. Notably, RPL6 is involved in a molecular pathway has been linked to temporal lobe epilepsy in childhood [62].

Discussion
Several studies based on the analysis of microarrays of epileptic tissue (human and animal models) reported both transcriptional and epigenetic alterations. In particular, the altered molecular pathways result in a variety of modifications in voltage-gated and receptor gated ion channels that lead to a perturbation of dendritic excitability [12]. Neuroproteomic studies on epilepsy revealed a significant contribution of proteins involved in energy metabolism, oxidative stress, inflammation, and excitatory imbalance [63]. Here, we report the results of our systems-biology based investigation, which analyzed the Gibbs Homology network for protein-protein interaction epilepsy data at threshold 32. The results of our investigations show that proteins with the highest Betti centrality are mainly transmembrane proteins, heat shock proteins, as well as neuronal elongation and cytoskeleton component proteins (alpha and beta tubulin isotypes). Importantly, by using the Pareto ranking, we also found significant ribosomal protein overexpression involving a number of these proteins. In the Supplementary Material, an additional analysis is provided showing similar results for a control group of 55 subjects.
Pires et al. [64] examining the proteome of brain samples from epilepsy found an overexpression of ribosomal proteins indicating an increased translational machinery. Increased ribosomal activity from microglia has been linked to neurological inflammatory conditions [65]. Gliosis as inflammatory responses, is frequently found in epilepsy patients and animal models. Moreover, the main histopathological features of TLE are hippocampal neuronal cell loss and gliosis with extensive synaptic rearrangement ('mossy fiber' innervation). The question if inflammation is the effect or the cause of seizure has been potentially answered in a mice model. The genetic deletion of Beta1-integrin leads to the development of astrogliosis that lead to spontaneous seizures [66]. Changes in glial cell phenotype have also been shown in specimens from patients with pharmaco-resistant temporal lobe epilepsy. Interestingly, astroglia molecular abnormalities have been revealed with the altered expression, localization, and function of the K+ and water channels [67]. These findings suggest that astroglia cells play a pivotal role in epilepsy and should be considered as promising targets for new therapeutic strategies.
This study identified a number of proteins that appear to play major roles in epilepsy and can, therefore, become attractive targets for pharmacological inhibition. The main proteins of interest are ribosomal proteins RPS1, RPS4, RPS10, RPS11, RPS15, RPS27, and RPL10A, RPL18, and RPL32. Unfortunately, only two of these protein structures have been solved and can be found in the Protein Data Bank (https://www.rcsb.org/ accessed on 30 August 2021), namely, RPS15 (PDB id: 1G1X) and RPL18 (PDB id: 1ILY). To the best of our knowledge, there are no known specific and selective inhibitors of the above-listed protein.
However, there exist broad inhibitors of ribosomal activity, such as antibiotic molecules hygromyacin B, tetracycline, and pactamycin [68]. Future pharmacological development of specific inhibitors of the identified ribosomal proteins of interest could lead to important advances in this field.