Distinguishing and Biochemical Phenotype Analysis of Epilepsy Patients Using a Novel Serum Profiling Platform

Diagnosis of non-symptomatic epilepsy includes a history of two or more seizures and brain imaging to rule out structural changes such as trauma, tumor, or infection. Such analysis can be problematic. It is important to develop capabilities to help identify non-symptomatic epilepsy in order to better monitor and understand the condition. This understanding could lead to improved diagnostics and therapeutics. Serum mass peak profiling was performed using electrospray ionization mass spectrometry (ESI-MS). A comparison of sera mass peaks between epilepsy and control groups was performed via leave one [serum sample] out cross-validation (LOOCV). MS/MS peptide analysis was performed on serum mass peaks to compare epilepsy patient and control groups. LOOCV identified significant differences between the epilepsy patient group and control group (p = 10⁻²²). This value became non-significant (p = 0.10) when the samples were randomly allocated between the groups and reanalyzed by LOOCV. LOOCV was thus able to distinguish a non-symptomatic epilepsy patient group from a control group based on physiological differences and underlying phenotype. MS/MS was able to identify potential peptide/protein changes involved in this epilepsy versus control comparison, with 70% of the top 100 proteins indicating overall neurologic function. Specifically, peptide/protein sera changes suggested neuro-inflammatory, seizure, ion-channel, synapse, and autoimmune pathways changing between epilepsy patients and controls.


Introduction
Epilepsies are brain disorders characterized by recurrent seizures coupled with reduced thresholds for such seizures [1,2]. Approximately 3 percent of the world's population will experience seizures

Study Participants
This study was approved by the Human Subjects Institutional Review Boards of CMC Vellore and of the University of Oklahoma Health Sciences Center, Oklahoma, USA. All subjects gave their informed consent for inclusion before they participated in the study. The study was conducted in accordance with the Declaration of Helsinki, and the protocol was approved by the University of Oklahoma Health Sciences Center Institutional Review Board (IRB, Project identification code #16125). Study participants sought care at the Christian Medical College (CMC) and Hospital, Vellore, India between January 2013 and October 2014 for seizure or headache issues. Written informed consent was obtained from study participants prior to blood specimen retrieval and any treatments. Seizure patients had at least two seizures in the last ten years and at least one seizure in the previous 7 months, indicating an epilepsy disorder [5]. Seizure patients (N = 29) were found to have no evidence of parasite, tumor, or other structural brain lesions on MRI (N = 21) or CT (N = 8) imaging, and were seronegative for antigens and antibodies to the larval stages of T. solium and T. saginata. These seizure patients (N = 29) were diagnosed as having non-symptomatic epilepsy of unknown etiology by the CMC Clinical Staff.
The mean age of the epilepsy patients was 27, with 65% being male (Table 1). The mean age of the control subjects was 34, and 59% were female (Table 1). The control subjects had normal brain imaging and no history of seizures, brain tumor, head trauma, HIV (human immunodeficiency virus), HBV (hepatitis B virus), or HCV (hepatitis C virus). This group was designated as presenting with idiopathic headaches (Table 1). All MRI and CT images were read by one of the authors (V.R.). Patients had not taken anti-inflammatory drugs for at least 7 days prior to enrollment and were not visibly ill at the time of blood collection. Sera were obtained from peripheral blood at the CMC Hospital, Vellore, according to standard procedures [20]. Aliquots (250 µL) were frozen at −80 °C and not reused after initial freezing and thawing. Other socio-demographic and clinical characteristics are summarized in Table 1.
Table 1 (surviving row): Partial then generalized, 4 (14%); NA.
Table 1 footnotes: (a) Unknown etiology. (b) These questions were only asked to male patients. (c) Associated with possible neurocysticercosis infection (a parasitic infection of T. solium and T. saginata prevalent to the area and having seizure as a clinical symptom). All subjects were without edema, EITB negative, and AgELISA negative [21,22].

Direct Electrospray Mass Spectrometry (ESI-MS) of Sera from Patients With Epilepsy and Control Subjects
An LCQ ADVANTAGE ion-trap electrospray MS instrument (ThermoFisher) was used for obtaining serum MS spectra and for tandem MS/MS peptide/protein identifications. The spectral data were analysed as described previously [8,14]. Briefly, three high-resolution mass spectra were obtained from each serum sample over a mass-to-charge (m/z) range of 400 to 2000. Spectral data were recorded and extracted using the manufacturer's software (Qual Browser, version 1.4SR1) and normalized to a sum value of 100 intensity units in non-overlapping segments of 10 m/z. Mass peak areas were recorded and transformed into centroid m/z peak area values (valley to valley) using Mariner Data Explorer 4.0.0.1 software (Applied BioSystems). All serum samples were processed in an identical fashion. The same serum samples (epilepsy or control) were also analysed on a lower resolution, compact desk-top single quadrupole ESI-MS instrument (Expression CMS, Advion, Inc., Ithaca, NY, USA) as described previously [8].
For tandem MS/MS peptide/protein identifications, ions in the range of 900-1008 m/z were analysed in 10 epilepsy patient and 10 control serum samples that were age and sex matched as closely as possible. This m/z range was chosen based on empirically determined optimal machine performance for MS/MS analysis of unfractionated serum samples. A 35% ionization energy was applied to each m/z unit parent ion, and fragmentation was observed for 5 min. The MS/MS signal data analysis utilized ThermoFisher Proteome Discoverer 1.0 SP1 with human and T. solium non-redundant databases downloaded from the National Center for Biotechnology Information (NCBI). On average, serum samples contained 1.95 (range: 0-5) parent ions with significant differences, as determined by standard deviation of MS spectral data, between the pre and post MS/MS scans. The following MS/MS search settings were applied: enzyme name = no-enzyme (no digest); precursor mass tolerance = 1.8 Da; fragment mass tolerance = 0.8 Da; "b" and "y" ions scored; and dynamic modifications for oxidation (C and M amino acids), phosphorylation (S, T, and Y), and methylation (C), with a maximum of 4 modifications per peptide. Protein identifications required a minimum of two unique peptides and a cross-correlation score (Xcorr) ≥ 1.7, in line with previous studies [13,14]. The identified sequences were also searched using the NCBI Basic Local Alignment Search Tool (BLAST). Each sample was scanned multiple times for a total duration of 5 min at each m/z. A "hit" in the protein database search was scored for each MS/MS scan in which the Xcorr identifying a peptide sequence was higher than or equal to 1.7. The number of patient sera samples out of ten indicating the presence of the same peptide/protein is reported.
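The two acceptance criteria stated above (Xcorr ≥ 1.7 per scan and at least two unique peptides per protein) can be illustrated with a minimal sketch. The record layout (protein, peptide, Xcorr), the function name, and the example records are hypothetical and do not represent the Proteome Discoverer export format or any data from this study.

```python
from collections import defaultdict

XCORR_MIN = 1.7           # threshold stated in the text
MIN_UNIQUE_PEPTIDES = 2   # minimum unique peptides per protein, as stated


def identify_proteins(psms):
    """Return {protein: set of unique peptides} for proteins passing both criteria.

    `psms` is an iterable of (protein, peptide, xcorr) tuples; this layout is
    an assumption made for illustration only.
    """
    peptides_by_protein = defaultdict(set)
    for protein, peptide, xcorr in psms:
        if xcorr >= XCORR_MIN:  # a scan counts as a "hit" at Xcorr >= 1.7
            peptides_by_protein[protein].add(peptide)
    return {
        protein: peptides
        for protein, peptides in peptides_by_protein.items()
        if len(peptides) >= MIN_UNIQUE_PEPTIDES
    }


# Made-up records, for illustration only.
example_psms = [
    ("PROT_A", "PEPTIDEONE", 2.1),
    ("PROT_A", "PEPTIDETWO", 1.7),
    ("PROT_B", "PEPTIDEONE", 1.9),  # only one unique peptide: rejected
    ("PROT_C", "PEPTIDEONE", 1.2),  # below the Xcorr threshold: not a hit
    ("PROT_C", "PEPTIDETWO", 2.4),
]
print(sorted(identify_proteins(example_psms)))  # ['PROT_A']
```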
Identified protein names and the numbers of identified MS/MS sequence spectra ("hits") were imported as log (base 2) ratios of epilepsy/control into Ingenuity Pathway Analysis (IPA, QIAGEN) [23]. Proteins were inspected for protein function using the Medline and PubMed online databases.
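As a small worked illustration of the log (base 2) epilepsy/control ratio, the sketch below may help. The pseudocount is an assumption introduced here so that a zero count in either group remains computable; the text does not state how zero counts were handled, and the function name and input values are hypothetical.

```python
import math


def log2_hit_ratio(epilepsy_hits, control_hits, pseudocount=1.0):
    """log2(epilepsy/control) of MS/MS hit counts.

    ASSUMPTION: a pseudocount is added to both counts to avoid division by
    zero and log of zero; this is not described in the source.
    """
    return math.log2((epilepsy_hits + pseudocount) / (control_hits + pseudocount))


print(log2_hit_ratio(7, 1))  # (7 + 1) / (1 + 1) = 4, so 2.0
print(log2_hit_ratio(0, 3))  # (0 + 1) / (3 + 1) = 0.25, so -2.0
```

Positive values indicate more hits in the epilepsy group, and negative values indicate more hits in the control group.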

Statistical Analysis
Leave one [serum sample] out cross-validation (LOOCV) involving a novel peak classification value (PCV) procedure was used to help distinguish serum samples of epilepsy patients from control individuals (Figure 1) [8,14]. Triplicate averaged mass peak areas at individual m/z mass peaks from diluted sera were compared for significance between the epilepsy patients and controls using one-tailed Student's t-tests and used in the LOOCV analysis (Figure 1A). One serum sample and its mass peaks, from either the control or epilepsy group, is in turn "left out" to build a series of unique N-1 LOOCV "left in" datasets of significant (p < 0.05) mass peak area differences. The mass peaks of each "left out" sample were then compared to all the "left in" mass peaks in their unique N-1 LOOCV dataset. This comparison involves a peak classification value (PCV) used to classify each "left out" sample at each significant "left in" peak of each N-1 LOOCV dataset. Whether a "left out" peak area falls above or below the PCV determines whether it is classified into the epilepsy or control group. For example, peak 919 in Figure 1B is classified as an "epilepsy" peak in the "left in" database because the higher mean area is assigned to this group (dash). If the 919 peak from the "left out" sample has a peak area above the PCV, then it is classified as an "epilepsy" peak. If it falls below or equal to this PCV, then the "left out" peak is classified as a "control" peak. Peak classifications are performed for all "left out" peaks in all "left out" serum samples against their respective N-1 "left in" LOOCV mass peak databases, resulting in a summed % total LOOCV peak sera score (patient score). This procedure results in sera samples displaying a combination of "epilepsy-peaks" and "control-peaks".
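The LOOCV/PCV procedure described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: the PCV is taken to be the midpoint between the two "left in" group means (the text does not give its formula), the p < 0.05 filter is replaced by a fixed critical t value supplied by the caller, and all peak areas are synthetic.

```python
from math import sqrt
from statistics import mean, stdev


def t_statistic(a, b):
    """Pooled-variance (Student) t statistic for two independent samples."""
    na, nb = len(a), len(b)
    pooled_var = ((na - 1) * stdev(a) ** 2 + (nb - 1) * stdev(b) ** 2) / (na + nb - 2)
    if pooled_var == 0:
        return 0.0  # degenerate peak with no spread; treated as not significant
    return (mean(a) - mean(b)) / sqrt(pooled_var * (1 / na + 1 / nb))


def loocv_scores(samples, labels, t_crit):
    """Percent of significant peaks classified as 'epilepsy' per left-out sample.

    samples: list of dicts {m/z: peak area}; labels: 'epilepsy' or 'control'.
    """
    scores = []
    for i, left_out in enumerate(samples):
        left_in = [(s, g) for j, (s, g) in enumerate(zip(samples, labels)) if j != i]
        epilepsy_votes = 0
        n_significant = 0
        for mz, area in left_out.items():
            epi = [s[mz] for s, g in left_in if g == "epilepsy" and mz in s]
            ctl = [s[mz] for s, g in left_in if g == "control" and mz in s]
            if len(epi) < 2 or len(ctl) < 2:
                continue
            # Simplification: |t| against a critical value stands in for the
            # one-tailed p < 0.05 test; the direction comes from the group means.
            if abs(t_statistic(epi, ctl)) < t_crit:
                continue
            pcv = (mean(epi) + mean(ctl)) / 2  # ASSUMPTION: PCV = midpoint of group means
            epilepsy_is_higher = mean(epi) > mean(ctl)
            # Above the PCV: group with the higher mean; at or below: the other group.
            if (area > pcv) == epilepsy_is_higher:
                epilepsy_votes += 1
            n_significant += 1
        scores.append(100.0 * epilepsy_votes / n_significant if n_significant else float("nan"))
    return scores


# Synthetic peak areas: peak 919 separates the groups, peak 950 does not.
samples = [
    {919: 10.0, 950: 5.0}, {919: 11.0, 950: 4.0}, {919: 12.0, 950: 6.0}, {919: 11.5, 950: 5.5},
    {919: 2.0, 950: 5.2}, {919: 3.0, 950: 4.8}, {919: 2.5, 950: 5.6}, {919: 3.5, 950: 4.4},
]
labels = ["epilepsy"] * 4 + ["control"] * 4
# 2.015 is the one-tailed critical t at p = 0.05 for 5 degrees of freedom (N-1 = 7 samples).
print(loocv_scores(samples, labels, t_crit=2.015))
# [100.0, 100.0, 100.0, 100.0, 0.0, 0.0, 0.0, 0.0]
```

In this toy dataset only peak 919 passes the significance filter in every N-1 dataset, so each patient score is either 100% or 0%; with real spectra and over 100 significant peaks the scores would be expected to spread between these extremes.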
This % of total mass peaks classified as epilepsy against the "left in" dataset is assigned to each "left out" sample and plotted on the y-axis versus the individual serum samples on the x-axis in Figure 1C. To check for over-fitting [24-26] of the large datasets, sera samples were randomized to the control or epilepsy group (Figure 1D) using the RND (randomization) function in Excel and manually balanced to retain the gender and age ratios of the initial groups [8,14]. Upon randomization, LOOCV was performed again exactly as described above. Cohen's d values were calculated from the % LOOCV means and standard deviations to approximate the discriminatory effect size between the two groups [27]. Cohen's d is an effect size from which statistical power (the probability of avoiding a type II error, i.e., a false negative) is estimated for given sample sizes as described [27,28].
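A minimal sketch of a Cohen's d calculation from two sets of % LOOCV patient scores is given below. The pooled standard deviation form is assumed here, since the text states only that the means and standard deviations were used; the function name and score values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Cohen's d with a pooled standard deviation (assumed form)."""
    na, nb = len(group_a), len(group_b)
    pooled_sd = sqrt(
        ((na - 1) * stdev(group_a) ** 2 + (nb - 1) * stdev(group_b) ** 2) / (na + nb - 2)
    )
    return (mean(group_a) - mean(group_b)) / pooled_sd


# Synthetic scores: means 62 and 32, both SDs equal to 2.
print(cohens_d([60, 62, 64], [30, 32, 34]))  # 15.0
```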

Test Metrics
A group test metric "cut off" was calculated from the mean % LOOCV classified peaks of each group minus (epilepsy) or plus (control) an equivalent number of standard deviations (SD), as exhibited, e.g., in Figure 1C and as previously described [8,14]. LOOCV cut offs were used to determine false positive (FP), true positive (TP), false negative (FN), and true negative (TN) values for classifying controls and epilepsy patients into the proper group (Figures 1-3). Sensitivity is the percentage of epilepsy patients classified as epilepsy because their % of total LOOCV was above the epilepsy LOOCV cut off. Specificity is the percentage of control subjects classified as control because their % of total LOOCV was at or below the epilepsy LOOCV cut off.
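One reading of the cut off described above is the value lying the same number k of SDs below the epilepsy mean and above the control mean. Setting mean_e − k·sd_e = mean_c + k·sd_c gives k = (mean_e − mean_c) / (sd_e + sd_c). The sketch below applies this reading, together with the sensitivity and specificity definitions, to synthetic scores; the function names and values are hypothetical.

```python
from statistics import mean, stdev


def equidistant_cutoff(epilepsy_scores, control_scores):
    """Return (cutoff, k), with the cutoff k SDs from each group mean."""
    mean_e, sd_e = mean(epilepsy_scores), stdev(epilepsy_scores)
    mean_c, sd_c = mean(control_scores), stdev(control_scores)
    k = (mean_e - mean_c) / (sd_e + sd_c)
    return mean_e - k * sd_e, k


def sensitivity_specificity(epilepsy_scores, control_scores, cutoff):
    """Percent sensitivity and specificity at a given cutoff."""
    tp = sum(s > cutoff for s in epilepsy_scores)   # epilepsy scored above the cut off
    fn = len(epilepsy_scores) - tp
    tn = sum(s <= cutoff for s in control_scores)   # control scored at or below the cut off
    fp = len(control_scores) - tn
    return 100.0 * tp / (tp + fn), 100.0 * tn / (tn + fp)


epilepsy = [60.0, 62.0, 64.0]   # synthetic patient scores (% LOOCV)
control = [30.0, 32.0, 34.0]
cutoff, k = equidistant_cutoff(epilepsy, control)
print(cutoff, k)                                            # 47.0 7.5
print(sensitivity_specificity(epilepsy, control, cutoff))   # (100.0, 100.0)
```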

Distinguishing Sera of a Patient Group with Epilepsy from a Control Individual Group Using LOOCV/PCV ESI-MS
Distinguishing a patient group with epilepsy from a control group with this ESI-MS serum profiling platform is illustrated in Figure 1C. Patient scores (y axis) for the % of the total LOOCV classified serum mass peak dataset were used to compare epilepsy patients (dashes) and control individuals (open triangles, x axis). A clear demarcation/separation is observed between the epilepsy patients and control individuals in the % of epilepsy LOOCV classified patient serum mass peaks (y axis, 106-118 peaks utilized). The cut off of 41.68% epilepsy LOOCV classified mass peaks yielded a strong separation of the epilepsy group from the control group, as evidenced by the very low p-value (4.6 × 10⁻²²). The "Group Mean Test Metric Cut Off" represents the combined cut off values of both epilepsy and control groups and is set equidistant, at 2.77 SDs, from each respective group mean (−2.77 SD from the higher scoring group mean, +2.77 SD from the lower scoring group mean). These narrow standard deviations indicate considerable homogeneity in both of these groups despite their somewhat uneven nature in terms of sample numbers between the groups and a gender bias toward males within the groups (Table 1). An individual patient serum score > "Cut Off" value suggests higher identity to the epilepsy group, whereas a score ≤ "Cut Off" value suggests higher identity to the control group. The cut off in Panel C indicates there are no false positives or false negatives in this separation. The epilepsy vs. control comparison was performed previously with sera samples from patients recovered from neurocysticercosis (RNCC) epilepsy [8]. Sera mass peak groupings were different in the present Figure 1C, and this led to improved group p-value separation from 10⁻¹⁸ [8] to the present 10⁻²².
Brain Sci. 2020, 10, 504
When sera samples from these study subjects are randomized between the two groups and the same LOOCV/PCV process is repeated, nearly all samples are LOOCV classified as members of both the control and epilepsy groups with no separation, as exhibited in Figure 1D. This random grouping of patient serum samples demonstrates a lack of separation between patient groups. Specific group cut-off lines are shown and now represent different values, due to random patient grouping, even though the calculation methodology did not change. The lack of difference between the randomized groups is evidenced by a much larger p-value obtained (p = 0.10). These results are consistent with an expectation of minimal over-fitting and the potential presence of a physiological basis being responsible for the discrimination observed between these groups.
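The randomization check can be sketched as a shuffle of group labels that preserves group sizes, after which the same LOOCV would be re-run. The study used the Excel RND function followed by manual balancing of gender and age ratios; the Python version below is a swapped-in illustration that does not reproduce that balancing step, and the function name and seed are hypothetical. The group sizes (29 epilepsy, 17 control) follow from the sample counts given in the text.

```python
import random
from collections import Counter


def randomize_groups(labels, seed=0):
    """Return a shuffled copy of the group labels; group sizes are preserved."""
    shuffled = list(labels)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducibility
    return shuffled


labels = ["epilepsy"] * 29 + ["control"] * 17
randomized = randomize_groups(labels)
print(Counter(randomized))  # same group sizes as before, in shuffled order
```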

Distinguishing Blinded Epilepsy Sera Samples, and LOOCV with a Low-Cost Desk Top Mass Spectrometer
A blinded sample experiment was performed by randomly removing six epilepsy samples and four control samples from the dataset used in Figure 1C and re-running the LOOCV using the remaining 23 epilepsy and 13 control samples as a "training set" (Figure 2A). The training set distinguished samples from the two study groups with a p-value of 3.02 × 10⁻¹⁶; randomization resulted in a p-value of 0.054, suggesting minimal over-fitting in the training set. The 10 "blinded" samples were then classified using the epilepsy LOOCV test metric cut off determined by the training set (43%) and are displayed in Figure 2B. A total of 9 out of 10 samples were correctly classified; one blinded sample from the control group scored above the cut off value of the epilepsy training set and so would have been misclassified as a false-positive "epilepsy" sample. The analysis of additional samples in a future study could conceivably improve such blind sample discrimination.
[...] Figure 2C. The randomization p-value is higher with the CMS instrument vs. the LCQ instrument (0.37 vs. 0.10). These results suggest that a less accurate instrument with a reduced m/z range can still detect enough mass spectral signal differences between these two groups. This ability strengthens the conclusion that the biomolecules observable in the serum and differing among the study groups could possibly help in the diagnosis and monitoring of epilepsy.
Figure 3A exhibits where the "left out" sera mass peaks from 13 male United States Veteran patients (Xs) who suffered a mild traumatic brain injury (TBI; loss of consciousness 1-30 min; mean age = 39.9) group-classify using the LOOCV/PCV procedure employed in the Figure 1C separation of epilepsy patients from controls [14]. All 13 sera (Xs) segregate with the epilepsy samples (dashes) rather than the control samples (triangles), which is consistent with some physiological similarity between the two brain disorders when compared to control individuals with minor headaches. Importantly, this LOOCV/PCV procedure is able to distinguish the sera of these TBI patients (dark squares) from those of the epilepsy patients (dashes) with a low p-value (10⁻¹⁸, panel B). The RND-discrimination p-value is more significant for the epilepsy vs. TBI discrimination (0.0012, panel C) than for the epilepsy vs. control discrimination (0.10, Figure 1D). Despite this significant p-value for the randomization of the TBI and epilepsy patient samples, little if any group separation is observed (panel C).
Table 2 summarizes the test metrics for the LOOCV data in Figures 1-3. Patient groups tested are listed in the far-left column. % LOOCV classified mass peak mean values with standard deviations (SD) for the two comparative groups are given in panel I. The performance of ESI-MS to classify subjects into their true study group was high, with a sensitivity and a specificity of 100% and 97%, respectively, when the full LOOCV dataset was used for the epilepsy vs. control comparison (panel II). Cohen's d effect size values are provided in panel II. These values are calculated from the % LOOCV means and SDs to obtain a sense of the effect size, which is a measure of the size of the observed differences between the two groups under comparison [27]. Cohen's d is an indirect measure of statistical power (the ability to avoid type II errors, i.e., false negatives). The large Cohen's d values here have an estimated power of >0.90 and bolster the reliability of the sample sizes utilized here [27,28]. Test metrics using the lower resolution, lower m/z range, single quadrupole desk-top ESI mass spectrometer are exhibited in panels I and II. All metrics are diminished when compared to the LCQ instrument.
Figure 1. (A) Outline of the serum sample handling and mass spectrometry processing of the binary patient/subject group analysis. (B) Serum mass peak scoring of the LOOCV/PCV (leave [one serum sample] out cross-validation/peak classification value) procedure used to classify mass peaks either as "epilepsy" or as "control" for a "left out" sample (a limited sample range, 900-1008 m/z, of significant group discriminatory mass peaks is displayed). The PCV example for the peak at 919 is exhibited and used to classify "left out" peaks as either "epilepsy" (peak area above this PCV) or control (peak area at or below this PCV). (C) Serum discrimination of epilepsy patients (dashes) from controls (triangles) by a % of LOOCV classified mass peaks. A cut off value is indicated (− or + SDs from the seizure or control groups, respectively) to determine test metric values (e.g., true positives). (D) A lack of serum sample discrimination is demonstrated, which results when the two different sample groups are mixed together randomly followed by the same LOOCV mass peak analysis.

Phenotype Assessment of Epilepsy Patients versus Control Individuals Using TANDEM MS/MS of Serum Peptide/Proteins, and Bioinformatics Cell Pathway/Disease Mechanism Analysis
MS/MS analysis of 10 epilepsy patient sera and 10 control sera, in a range between 900 and 1008 m/z, was employed to examine peptide/protein differences between these two sera groups. This 900-1008 range was empirically shown previously to provide ample ionization for serum MS/MS peptide identification [14]. To focus on peptides/proteins with larger differences between the two groups, a subset of the top 100 peptides/proteins showing at least a two-fold difference in number of positive sera between epilepsy and control groups was chosen and is exhibited in Tables 3 and 4. A MS/MS "hit" (single peptide identification) ratio between the two groups of at least a rounded off value of 1.5 was also required. A total of 182 peptides/proteins meet these criteria and are exhibited in Supplemental Table S1. A PubMed/Medline search of the 100 differentially expressed peptides/proteins in Tables 3 and 4 showed 70% were related to neurological function, 45% to immune/inflammation, 32% to seizures/epilepsy, 24% to ion-channels, and 10% to the blood-brain barrier (BBB). Seizure/epilepsy related proteins (shaded cells in Tables 3 and 4) [...] Tables 3 and 4 are CACNA2D2 (Calcium Voltage-Gated Channel Auxiliary Subunit Alpha2 delta 2, [41]) and ASTN2 (Astrotactin 2, [42]). Both of these proteins have ion-channel functions, which have important roles in epileptogenesis [9]. It is noted that 16 out of 32 seizure/epilepsy related peptides/proteins in Table 3 (50%) have ion-channel relatedness.
The proteins/peptides expressed differently between these epilepsy and control groups were analyzed by Ingenuity Pathway Analysis (IPA) to potentially identify affected networks of cellular/biochemical/disease pathways/systems (Figure 4, 60 out of 100 proteins present in Tables 3 and 4).
IPA is used in this context to predict cellular pathways that are changing based on altered gene expression parameters [23], in this case serome peptidome changes, which are valid gene expression markers for disease states [43]. Pathways affected, and major proteins present in such pathways, include neuroinflammation (HDC, NOTCH, VWF), synapses and synaptic transmission (BSN, EPHB2), cognitive impairment (SLIT2, RYE2, NF1), behavior (PCM1, EPHB2), seizure/epilepsy (KCNQ2, CACNA2D2), blood-brain barrier (HDC and VWF), brain damage (EPHB2, VWF, CACNA2D2), brain cell death (ADNP and CACNA2D2), traumatic brain injury (TBI; RECK and MEGF8), ion-channels (KCNQ2, CACNA2D2), and autoimmunity (VWF, RECK, SCARF1). IPA of the top 182 proteins in Table S1 with autoimmune emphasis is exhibited in Figure S1 (74 proteins present).
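The two selection criteria stated earlier for Tables 3 and 4 (at least a two-fold difference in the number of positive sera, and a hit ratio of at least 1.5 after rounding) can be sketched as below. Rounding to one decimal place and treating a zero count in one group as an unbounded fold difference are assumptions made here; the function names and counts are hypothetical.

```python
def fold_difference(a, b):
    """Larger count divided by smaller count; inf if only one group has any."""
    high, low = max(a, b), min(a, b)
    if high == 0:
        return 0.0
    if low == 0:
        return float("inf")  # ASSUMPTION: handling of zero counts is not stated in the text
    return high / low


def passes_selection(epi_sera, ctl_sera, epi_hits, ctl_hits):
    """True if both the positive-sera and hit-ratio criteria are met."""
    sera_fold = fold_difference(epi_sera, ctl_sera)
    hit_fold = fold_difference(epi_hits, ctl_hits)
    # ASSUMPTION: "rounded off value of 1.5" is read as rounding to one decimal.
    hit_ok = hit_fold == float("inf") or round(hit_fold, 1) >= 1.5
    return sera_fold >= 2.0 and hit_ok


print(passes_selection(6, 2, 12, 5))   # 3.0-fold sera, 2.4 hit ratio: True
print(passes_selection(5, 4, 12, 5))   # 1.25-fold sera: False
print(passes_selection(2, 6, 4, 10))   # higher in controls, 3.0-fold and 2.5: True
```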
A line connection between two proteins or between a protein and a "hub" in Figure 4 (hub defined as a biological function or disease), for example, between ABL2 and EPHB2 or between EPHB2 and the hub synaptic transmission, represents published biological findings compiled in IPA software. Of note, some of the proteins listed in Table 3, Table 4 and Figure 4 may not have a direct connection to epilepsy, but serve as connections to other biomolecular pathways of possible interest, such as locomotion and amyloidosis. Finding proteins/peptides known to be related to seizure/epilepsy phenotypes in an epilepsy patient comparison with control individuals in Table 3, Table 4 and Figure 4 provides initial support for the ability of this mass profiling platform to serve as a potential monitoring platform and to help decipher the epilepsy pathology and phenotype. Future studies will examine larger numbers of serum samples in these contexts, and also test for peptide/protein presence in sera using immunoassays. Such analyses are complex since mostly peptides and not intact proteins are being identified by MS/MS. Matching available antibodies and their epitopes to peptides is a complex process. In some cases, this will likely involve de novo acquisition of peptide-specific antibodies.
The authors recognize the imbalance of male and female participants but were unable to demonstrate that any of the presented results were specifically associated with patient gender.
Figure 4. Ingenuity Pathway Analysis (IPA; Qiagen, Inc.) of the 100 peptides/proteins exhibited in Table 3 and Table 4 having a 2× difference in positive sera number between epilepsy and controls, and a 1.5× difference in MS/MS "hit" (single peptide identification) ratio between the two groups. The protein function legend is at the bottom of the figure.

Discussion
Biomarkers are excellent tools for monitoring and understanding conditions such as epileptogenesis and seizure as well as aiding in treatment [10]. Although biomarker progress on epilepsies has been slow, recent advances in large input/throughput approaches (genomic, transcriptomic, proteomic, metabolomic) show promise [10]. Because these diseases are complex, it is important in this "omic" context to examine the cellular/pathophysiological networks that may have roles in them. Developing such biomarker and cell network approaches using readily available bodily sources such as peripheral blood would be helpful. The purpose of the present study was to examine a novel methodology to see whether it could help identify and monitor patients with epilepsy, in this case of unknown etiology. The MS method uses unfractionated serum to help distinguish epilepsy patients from control individuals and to monitor epilepsy. The hypothesis of this approach is that epilepsy induces organs and tissues to release/shed into the peripheral blood specific biomolecules involved in the disease state as well as in specific systemic responses to that disease state. Examination of biomolecules in peripheral blood, e.g., peptides/proteins that change with epilepsy, has the potential to provide diagnostic, phenotypic, mechanistic, and therapeutic insights into this disorder. The serome peptidome is a valid gene expression entity for study and was shown to correlate with specific disease states [14,43]. This study utilized serum samples gathered from patients presenting with a history of seizure (seizure group) and patients presenting with headache (control group) who sought care at the Christian Medical College (CMC) and Hospital, Vellore, India between January 2013 and October 2014. Alternative explanations for seizure, as provided by standard of care, were excluded as described in the Materials and Methods.
MS profiling of sera identified mass peaks changing significantly upon comparison of patients with epilepsy versus control individuals who sought treatment for idiopathic headaches but had no brain lesions upon examination. Randomization of serum samples between these two groups followed by LOOCV mass peak analysis resulted in loss of group-specific discrimination ability, suggesting a physiological basis for the epilepsy vs. control discrimination. These positive results are likely due to the large number of different identifiers (106 to 118 mass peaks used in the Figure 1C group discrimination), as the larger the number of such identifiers, the greater the disease discriminatory capability of a platform. Results using ESI-MS were substantiated using an instrument with a different mass analyser of lower resolution; see Figure 2. These results support the hypothesis that epilepsy induces biomolecular alterations that are reflected in the peripheral blood and can play a role in identifying specific clinical groups. More specifically, these results indicate that the ESI-MS approach described has potential for monitoring and understanding epilepsy as well as aiding in therapeutic development, thus warranting further study. Of note, all of the LOOCV analyses presented in this study exhibit quite large "effect sizes" (proportional to mean differences and standard deviations of the group-specific % LOOCV classified mass peaks) in the exhibited binary comparisons. Such "effect sizes" are expressed as Cohen's d values, and larger values correspond to greater statistical power. The large Cohen's d values in this study (Table 2) yield an estimated power of >0.90 for these LOOCV binary comparisons and bolster the reliability of the sample sizes utilized here [27,28].
A noteworthy addition to this study is the inclusion, for comparative purposes, of a group of traumatic brain injury (TBI) patients without diagnosed epilepsy who were described in a previous study on TBI [14]. Others have noted that most epilepsy biomarker studies include only controls, without comparisons to other non-epileptic brain injuries, and that this is a weakness of these studies [7]. Using LOOCV, this group of TBI patients (N = 13) segregated with the epilepsy patients when compared to controls (Figure 3A). This suggests that the apparent physiological changes described above are responsible for these LOOCV serum sample separations, and that the TBI condition is more closely related to the epilepsy condition than to the idiopathic headache condition. Importantly, a separate LOOCV analysis was able to directly discriminate the TBI patients from the epilepsy patients (Figure 3B), suggesting that different disease conditions are present.
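The logic of a leave-one-out comparison with a label-randomization control, as used in the group discriminations discussed above, can be sketched generically as follows. This is not the authors' peak-by-peak LOOCV procedure: it assumes a simple nearest-centroid classifier, and the function name, group labels, and toy peak intensities are all hypothetical.

```python
import random


def nearest_centroid_loocv(samples, labels):
    """Fraction of samples classified correctly when each one is held out
    in turn and assigned to the nearest class centroid."""
    correct = 0
    for i, (held_out, true_label) in enumerate(zip(samples, labels)):
        # Build class centroids from every sample except the held-out one.
        sums, counts = {}, {}
        for j, (x, y) in enumerate(zip(samples, labels)):
            if j == i:
                continue
            acc = sums.setdefault(y, [0.0] * len(x))
            for k, v in enumerate(x):
                acc[k] += v
            counts[y] = counts.get(y, 0) + 1
        centroids = {y: [s / counts[y] for s in sums[y]] for y in sums}
        # Assign the held-out sample to the closest centroid (squared Euclidean).
        predicted = min(
            centroids,
            key=lambda y: sum((a - b) ** 2 for a, b in zip(held_out, centroids[y])),
        )
        correct += predicted == true_label
    return correct / len(samples)


# Toy, made-up intensities for two peaks; not data from this study.
rng = random.Random(0)
group_a = [[rng.gauss(10, 1), rng.gauss(5, 1)] for _ in range(12)]
group_b = [[rng.gauss(14, 1), rng.gauss(8, 1)] for _ in range(12)]
samples = group_a + group_b
labels = ["A"] * 12 + ["B"] * 12

true_acc = nearest_centroid_loocv(samples, labels)

# Randomization control: shuffle the group labels and repeat the analysis.
shuffled = labels[:]
rng.shuffle(shuffled)
shuffled_acc = nearest_centroid_loocv(samples, shuffled)

print(f"true labels: {true_acc:.2f}, shuffled labels: {shuffled_acc:.2f}")
```

If the separation reflects a real group difference, accuracy with the true labels should be high, while accuracy after shuffling should fall toward chance; the same function can be applied to any pair of groups, such as a TBI-like group against an epilepsy-like group.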
Besides these disease group discriminations, the MS methodology employed here can assist in understanding biochemical mechanisms as well as in identifying potential novel biomarkers and therapeutic targets. Many of the serum mass peaks analyzed in this study, for example in Figure 1C, lie between approximately 500 and 1200 m/z, and likely include lower-mass peptide "serome" biomolecules resulting from host tissue/organ exoprotease activities and other cell/tissue signaling activities [17,18]. To aid in identifying physiological differences in this complex biomolecular milieu, MS/MS structure determinations were performed. At the ionization energies employed here, intact larger proteins are not likely to be fragmented; only existing peptides and polypeptides are. The identification of differentially present peptides and biochemical pathways could be helpful in understanding underlying disease mechanisms and developing novel diagnostic biomarkers and therapeutics [43].
For these purposes, a range analysis (900-1008 m/z) was conducted, revealing a prominent epilepsy phenotype in which 32 of the 100 different proteins (32%) have known associations with seizure/epilepsy; see Tables 3 and 4. A prominent epilepsy-associated protein that appeared in this analysis is the potassium ion-channel protein KCNQ2, whose gene, when mutated, plays important roles in the development of infancy epilepsies such as benign familial neonatal convulsions (BFNC) [9]. Since epilepsy likely encompasses unknown genetic changes, the finding of KCNQ2 effects in epilepsy indicates overlap between known gene changes in infantile epilepsy and unknown gene changes in epilepsy. Another prominent epilepsy-related ion-channel protein found in this analysis is CACNA2D2, a calcium channel protein involved in small-molecule ligand interactions as well as neuronal cell death pathways [44].
These two proteins constitute two major hubs in the IPA cellular/biochemical pathway analysis, overlapping a variety of epilepsies, ion-channels and synaptic transmission, and neurodegeneration and cognitive issues; see Figure 4. Other prominent hubs in the IPA using the 100 peptides/proteins listed in Tables 3 and 4 include the blood-brain barrier (BBB), neuro-inflammation, and autoimmunity. The autoimmunity connection is of interest because it could help explain the recurrence and cyclical nature of seizures in epilepsy. Recent work indicates an autoimmune association with epilepsy [19]. IPA with autoimmune emphasis using the expanded list of proteins in Table S1 is exhibited in Figure S1. Encephalitis and encephalomyelitis pathways appear in this analysis, suggestive of viral infection origins for some epilepsy.

Conclusions
Our results demonstrate, for the patient groups analyzed, the ability of this ESI-MS platform to distinguish/monitor sera of patients having epilepsy, based on their respective serum biomolecule mass peak profiles. In addition, a set of TBI patient serum samples segregated with epilepsy patient samples when compared with controls. Bioinformatics/systems biology analysis of MS/MS-deduced peptide/protein changes associated with epilepsy suggested that the cell pathways/biochemical systems affected/altered include neuro-inflammation, seizure/epilepsy, synaptic transmission, cognitive impairment, behavior, TBI, brain damage, blood-brain barrier, autoimmunity, and ion-channels.