Comprehensive Meta-Analysis of Differentially Expressed Proteins in Cerebrospinal Fluid Associated with Multiple Sclerosis

Sakiz, Elif; Amanzadeh Jajin, Elnaz; Cubeddu, Liza; Gamsjaeger, Roland; Avsar, Timucin

doi:10.3390/ijms26136171

by Elif Sakiz ^1,2,†, Elnaz Amanzadeh Jajin ^3,†, Liza Cubeddu ^1, Roland Gamsjaeger ^1 and Timucin Avsar ^2,4,*

^1 School of Science, Western Sydney University, Sydney, NSW 2751, Australia

^2 Neuro-Oncology Laboratory, School of Medicine, Bahcesehir University, Istanbul 34734, Türkiye

^3 Functional Neurosurgery Research Centre, Shohada Tajrish Comprehensive Neurosurgical Centre of Excellence, Shahid Beheshti University of Medical Sciences, Tehran P.O. Box 1988873554, Iran

^4 Department of Medical Biology, Bahçeşehir University School of Medicine, Istanbul 34734, Türkiye

^* Author to whom correspondence should be addressed.

^† These authors contributed equally to this work.

Int. J. Mol. Sci. 2025, 26(13), 6171; https://doi.org/10.3390/ijms26136171

Submission received: 23 March 2025 / Revised: 17 April 2025 / Accepted: 28 May 2025 / Published: 26 June 2025

(This article belongs to the Special Issue Molecular Insights into Multiple Sclerosis)

Abstract

To advance our understanding of multiple sclerosis (MS), accurate identification of protein expression profiles in cerebrospinal fluid (CSF) that can serve as biomarkers is critical. However, proteomic studies of MS have yielded inconsistent findings because of variability in sample sizes, diagnostic criteria, and data processing methods. We aimed to address these challenges by performing a meta-analysis of proteomics datasets sourced from multiple independent studies. We searched the literature systematically using appropriate keywords, screened articles against defined inclusion and exclusion criteria, and finally included six studies. We retrieved and combined data from five CSF datasets for discovery and two additional datasets for validation, covering 368 MS patients and controls. After data preprocessing, we calculated Z-scores for each dataset and for the integrated dataset. We then fitted logistic regression models on the training datasets and evaluated them on the validation datasets. We identified 11 differentially expressed proteins in the integrated dataset, revealing significant alterations in key pathways involved in immune response, neuroinflammation, and synaptic function. Notably, IGKC exhibited strong diagnostic potential, with an AUROC of 0.81. These findings highlight the value of re-analysing publicly available proteomics data to develop robust biomarker panels for MS diagnosis.

Keywords: cerebrospinal fluid; CSF; meta-analyses; multiple sclerosis; neurodegeneration; proteomics

1. Introduction

Multiple sclerosis (MS) is a chronic and multifaceted neurodegenerative disease characterised by a wide range of neurological symptoms, including motor dysfunction, cognitive decline, and sensory disturbances [1]. Immune cell infiltration in the central nervous system (CNS) is the hallmark of MS, leading to local inflammation and demyelination of neurons [2,3]. MS is a progressive disease in which symptoms often become apparent only at later stages. Therefore, the identification of novel biomarkers for early diagnosis of MS would help clinicians apply appropriate interventions and improve patients' quality of life. Furthermore, early diagnosis of MS via protein biomarkers could reduce the financial burden on health organisations.

Despite numerous studies on the underlying mechanisms of MS and the identification of prognostic markers, many questions remain unanswered [4]. The variability in proteomic findings across studies complicates the identification of consistent biomarkers [5,6] and impedes the development of reliable diagnostic tools for MS. This complexity underscores the critical need for reliable biomarkers to aid in diagnosis, prognosis, and therapeutic monitoring. In recent years, researchers have focused on the characterisation of protein and RNA biomarkers of MS in cerebrospinal fluid (CSF) [7].

Due to its close interaction with the CNS, CSF is a valuable source for identifying biomarkers in MS [8,9,10]. In healthy subjects, the influx of leukocytes from the blood into the CSF is tightly restricted [11]. Accordingly, CSF specimens are examined for protein content and cell count. In this regard, the 2017 revision of the McDonald criteria allows CSF-specific IgG oligoclonal bands to substitute for the demonstration of dissemination in time [12].

Proteomic studies have identified proteins potentially linked to MS in CSF [13,14,15,16]. However, only some of these findings have been consistently validated across independent cohorts. Reproducibility remains a crucial challenge, driven by small sample sizes, inconsistent diagnostic criteria, and variability in sample processing, data acquisition, and analysis. These factors contribute to the difficulty in establishing robust biomarkers for MS. Liquid chromatography–mass spectrometry (LC-MS)-based approaches are the main methods used in this context to measure biomolecules, including amino acids and proteins, in body fluids [17].

Meta-analyses are powerful tools that overcome limitations in statistical power and reproducibility, and are used across biomedical research fields. Yet they remain underused in proteomics, especially where established biomarkers are lacking. Existing proteomics meta-analyses often rely on published results without accounting for the heterogeneity introduced by different databases and analytical strategies. We hypothesised that performing an unbiased meta-analysis of proteomics datasets could reveal a reliable set of MS-associated proteins with higher reproducibility and clinical relevance. To this end, we identified MS-focused CSF proteomic LC-MS datasets and re-analysed them using the analytical approach introduced by van Zalm and colleagues [18]. Using this approach, we identified biomarkers that were differentially expressed in CSF samples of MS subjects compared to control subjects. We analysed the datasets separately and then as one integrated sample. Enrichment analysis revealed related signalling pathways and processes. We then developed a model to evaluate the association between the identified proteins and MS. To mitigate the risk of overfitting, we used a logistic regression model and validated it in two steps using two additional, recently published cohorts.

2. Methods

2.1. Literature Review

A comprehensive search was conducted in the PubMed database to find MS-related CSF datasets. PubMed was prioritised because it is a comprehensive database that covers MS proteomics studies, and it was supplemented by PRIDE (PRoteomics IDEntifications Database) for raw data. For this purpose, three groups of keywords were used in different combinations: “MS” or “multiple sclerosis” as the first group; “proteomics”, “proteome”, or “protein” as the second group; and “CSF” or “cerebrospinal fluid” as the third group. Due to the limited number of available datasets, brain and spinal cord analyses were not included in this study.

The inclusion criteria covered proteomic analyses of the brain, the spinal cord, CSF, and other relevant tissues from MS patients and control subjects, with studies comparing various MS subtypes against controls. Additionally, the proteomic profiling had to use LC-MS/MS in data-dependent acquisition (DDA) mode with high-resolution, high-accuracy instrumentation. Exclusion criteria were review studies, studies with no raw LC-MS proteomics data, studies written in languages other than English, studies with non-human samples, studies with no MS-related samples, studies with no description of sample preparation and mass spectrometry techniques, and proteomic studies using non-DDA methods (e.g., SRM, MRM, PRM, Western blot, 2D gel electrophoresis).

Data availability for the included articles was assessed in PRIDE and ProteomeXchange (https://proteomecentral.proteomexchange.org, accessed on 5 August 2024). Data were retrieved from the available datasets and used for further analysis. Additionally, we searched the same databases for all proteomics datasets available for MS patients. Ultimately, we included eligible datasets with available proteome data (n = 6) and one dataset from the PRIDE database with no associated published article.

The present systematic review and meta-analysis included all the datasets with corresponding papers and PubMed identifiers. The PubMed database was accessed online through the National Center for Biotechnology Information (NCBI) website (https://pubmed.ncbi.nlm.nih.gov/, accessed on 3 August 2024). One unpublished dataset in the PRIDE database was added to the datasets in this study.

2.2. Proteomics Analysis

All raw LC-MS data were retrieved and analysed using FragPipe 22.0, searching against Homo sapiens protein sequences from the UniProtKB/Swiss-Prot database, which included 40,936 proteins in total and was downloaded on 10 July 2024. The search settings allowed a maximum of two missed tryptic cleavages and a peptide length of 7 to 50 amino acids. The modification of cysteine residues was set as fixed, whereas acetylation of the protein N-terminus and methionine oxidation were set as variable modifications, and the maximum number of variable modifications was three. In tandem mass tag (TMT) studies, modification of the peptide N-terminus and of lysine residues was set as a fixed modification. A 1% false discovery rate (FDR) was applied for both Percolator and ProteinProphet [19]. For IonQuant [20], match-between-runs was enabled, and at least one ion was required for peptide quantification.

2.3. Data Analysis

Protein identification data were imported into R (v.4.4.2) via RStudio for preprocessing, statistical analysis, and visualisation. The source code for this study is publicly available in the following GitHub repository: https://github.com/ElnazAmanzadeh/MS-Proteome-meta-analysis, accessed on 5 August 2024.

All datasets were normalised in R (v.4.4.2), using one of two methods depending on whether the dataset contained reference samples. In both cases, a normalisation factor (NF) was determined and used to calculate standardised intensities for all samples. For datasets without reference samples, the sum of intensities was calculated for each sample, and the median of these summed intensities was calculated for each dataset separately. The NF of each sample was then obtained by dividing the dataset's median summed intensity by the summed intensity of that sample. For datasets with reference samples, the intensities of the reference samples were used to calculate the NF; in studies with several reference samples, the NF was based on the median intensity of the reference samples. This factor was then used to normalise the protein intensities of each sample.
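The median-of-sums normalisation for datasets without reference samples can be illustrated with a minimal Python sketch. It is not the authors' R code: the function names, the dictionary layout, and the toy intensities are assumptions made for illustration only.

```python
from statistics import median


def normalisation_factors(intensities):
    """Return one normalisation factor (NF) per sample.

    intensities: dict mapping sample name -> dict of protein -> raw intensity.
    NF = median of all summed sample intensities / summed intensity of the sample.
    """
    sums = {sample: sum(proteins.values()) for sample, proteins in intensities.items()}
    median_sum = median(sums.values())
    return {sample: median_sum / total for sample, total in sums.items()}


def normalise(intensities):
    """Multiply every protein intensity by the NF of its sample."""
    nf = normalisation_factors(intensities)
    return {
        sample: {protein: value * nf[sample] for protein, value in proteins.items()}
        for sample, proteins in intensities.items()
    }


# Hypothetical toy dataset (values are illustrative, not from the study).
dataset = {
    "s1": {"IGKC": 100.0, "APOE": 50.0, "VGF": 50.0},    # sum = 200
    "s2": {"IGKC": 200.0, "APOE": 100.0, "VGF": 100.0},  # sum = 400
    "s3": {"IGKC": 300.0, "APOE": 300.0, "VGF": 200.0},  # sum = 800
}
factors = normalisation_factors(dataset)
normalised = normalise(dataset)
```

After normalisation, every sample in this toy dataset has the same summed intensity, which is the property the NF is designed to enforce.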

In the present study, we used the method by van Zalm and colleagues [18]. This method does not rely on principal component analysis (PCA) to identify and remove outliers but rather uses a standardised outlier identification approach to optimise the results. However, a PCA plot was used to visualise the results before and after identifying and removing outliers. First, the normalised datasets were used to create a theoretical median sample for each dataset from the median intensities of the individual proteins. Then, the Pearson correlation between each sample and the theoretical median sample was calculated. The mean and standard deviation of these correlation coefficients were computed, and samples whose correlation lay more than three standard deviations from the mean were defined as outliers and removed.
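The correlation-based outlier rule can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the implementation of van Zalm and colleagues: the function names and the toy data are hypothetical, and the sample standard deviation is assumed.

```python
from math import sqrt
from statistics import mean, median, stdev


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def find_outliers(samples, n_sd=3.0):
    """Flag samples whose correlation with the median sample is extreme.

    samples: dict mapping sample name -> list of normalised protein
    intensities (same protein order in every sample).
    Returns (list of outlier names, dict of correlation coefficients).
    """
    n_proteins = len(next(iter(samples.values())))
    # Theoretical median sample: per-protein median across all samples.
    median_sample = [median(v[i] for v in samples.values()) for i in range(n_proteins)]
    r = {name: pearson(values, median_sample) for name, values in samples.items()}
    mu, sd = mean(r.values()), stdev(r.values())
    flagged = [name for name, value in r.items() if abs(value - mu) > n_sd * sd]
    return flagged, r


# Hypothetical data: 19 samples sharing one profile and one inverted profile.
base = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
samples = {f"s{i}": [x * (1 + 0.01 * i) for x in base] for i in range(19)}
samples["bad"] = list(reversed(base))
outliers, correlations = find_outliers(samples)
```

Note that a three-standard-deviation rule can only flag a sample when the dataset is reasonably large; with about ten samples or fewer, no single sample can lie that far from the mean.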

In the present study, the included datasets differed in CSF sample preparation, experimental set-up, and protein profiling methods. Therefore, the CSF data were first adjusted to remove batch effects. To this end, we calculated Z-scores within each dataset, which made intensities comparable across datasets. To calculate the Z-score, a log2 transformation was applied to the intensities of all samples. Then, for each protein, the mean and standard deviation were calculated from the intensities of the control samples. Next, the Z-score of that protein was calculated in every sample, and this was repeated for all proteins. Ultimately, all datasets were merged into one large dataset for further analysis.
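For a single protein, the control-referenced Z-score described above corresponds to z = (log2(x) − μ_control) / σ_control. The following Python sketch illustrates it; the function name and toy values are hypothetical, and the use of the sample standard deviation is an assumption.

```python
from math import log2
from statistics import mean, stdev


def control_based_z_scores(intensities, is_control):
    """Z-score one protein across samples, using controls as the reference.

    intensities: normalised intensities of one protein, one value per sample.
    is_control: booleans marking which samples are controls.
    """
    logs = [log2(value) for value in intensities]
    controls = [x for x, flag in zip(logs, is_control) if flag]
    mu = mean(controls)
    sd = stdev(controls)  # sample standard deviation (an assumption)
    return [(x - mu) / sd for x in logs]


# Hypothetical protein: three controls followed by one MS sample.
z = control_based_z_scores([2.0, 4.0, 8.0, 32.0], [True, True, True, False])
```

Because the reference mean and standard deviation come from controls only, the control samples of every dataset are centred at zero, which is what allows datasets to be merged afterwards.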

MS and control samples were identified using the information in metadata including demographic data and clinical criteria.

Statistical analyses were performed in RStudio. Fisher’s exact test was used to compare MS and control samples and identify proteins whose presence differed significantly between the groups. To filter out low-frequency proteins, those detected in fewer than 70% of samples were removed. Next, the non-parametric Mann–Whitney U test was used because the data did not meet the assumptions of parametric tests. The Benjamini–Hochberg procedure was used to correct the obtained p-values for multiple testing. In addition, quantitative expression analysis was performed for the proteins in each dataset individually, using Student’s t-test with Benjamini–Hochberg correction, so that the results could be compared with those of the meta-analysis.
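The Benjamini–Hochberg adjustment applied to the per-protein p-values can be written in a few lines. The Python sketch below is illustrative only (the study used R); the function name and the example p-values are hypothetical, and the raw p-values would in practice come from a Mann–Whitney U test.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four proteins.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

Each raw p-value is scaled by m/rank, and the running minimum ensures that a protein with a smaller raw p-value never receives a larger adjusted p-value than one ranked after it.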

During the meta-analysis, functional enrichment analysis was performed to find the signalling pathways that include the significantly differentially expressed proteins. To accomplish this, the clusterProfiler package for R was used to visualise significant signalling pathways using the MSigDB Hallmark gene sets obtained via the msigdbr package. Additionally, Metascape pathway analysis [21] and Enrichr [22,23] were employed for comprehensive Gene Ontology (GO) analysis, focusing on biological processes and molecular functions. These combined approaches provided an in-depth understanding of the biological significance and functional roles of the identified proteins.

Enrichr generated ranked lists of enriched terms for each selected gene set library, using both adjusted p-values (e.g., Benjamini–Hochberg correction) and combined scores (which incorporate p-values and z-scores) to prioritise biologically and clinically relevant findings. This ensured that the top-ranked terms reflected the most reliable and meaningful associations with the observed proteomic alterations in MS.
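The p-value behind an over-representation result of this kind is typically a one-sided Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The Python sketch below illustrates that calculation only; it does not reproduce Enrichr's combined score, and the function name and the counts are hypothetical.

```python
from math import comb


def enrichment_p_value(overlap, n_input, n_term, n_background):
    """One-sided over-representation p-value, P(X >= overlap).

    X follows a hypergeometric distribution: n_input proteins are drawn
    from n_background, of which n_term are annotated with the term.
    """
    upper = min(n_input, n_term)
    total = comb(n_background, n_input)
    tail = sum(
        comb(n_term, i) * comb(n_background - n_term, n_input - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 2 of 4 input proteins fall in a 5-protein term,
# with a background of 20 proteins.
p = enrichment_p_value(overlap=2, n_input=4, n_term=5, n_background=20)
```

The p-values obtained for all tested terms would then be adjusted for multiple testing before ranking.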

2.4. Model Development and Validation

Datasets from Timirci (2019) [24] and Comabella (2021) [25] were used as validation datasets for CSF samples. Tabular quantification data for the proteins identified as biomarkers were used for primary model development. A logistic regression model was then fitted to the integrated biomarker panel. Logistic regression was selected for its interpretability and ease of inspection. The training models were developed using four CSF datasets. To validate the model, two independent models were then created using the validation datasets mentioned above. The identified markers and developed models were tested on the validation datasets to check for overfitting. The pROC R package was used to evaluate the models by receiver operating characteristic (ROC) curve analysis. The area under the receiver operating characteristic curve (AUROC) was then calculated for each protein to evaluate the performance of the biomarker candidates identified through the meta-analysis. Finally, using a brute-force search, all 3-protein combinations of the CSF biomarker candidates were tested in the two validation datasets.
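The AUROC calculation and the brute-force search over 3-protein panels can be sketched in Python. This is a simplified illustration, not the study's pipeline: the study fitted logistic regression models and used pROC, whereas here each panel is scored by the mean Z-score of its three proteins, which is an assumption made to keep the example short. The protein names P1 to P4 and all values are hypothetical.

```python
from itertools import combinations


def auroc(scores, labels):
    """AUROC as the probability that a case outscores a control (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def rank_triplets(z_scores, labels):
    """Score every 3-protein panel and return (AUROC, panel) pairs, best first.

    z_scores: dict mapping protein -> list of Z-scores, one per sample.
    The panel score is the mean Z-score of its proteins (a stand-in for a
    fitted logistic regression model).
    """
    results = []
    for panel in combinations(sorted(z_scores), 3):
        panel_score = [
            sum(z_scores[protein][i] for protein in panel) / 3
            for i in range(len(labels))
        ]
        results.append((auroc(panel_score, labels), panel))
    return sorted(results, reverse=True)


# Hypothetical validation data: three MS samples (1) and three controls (0).
labels = [1, 1, 1, 0, 0, 0]
single = auroc([0.9, 0.8, 0.4, 0.5, 0.2, 0.1], labels)
z_scores = {
    "P1": [2.0, 1.5, 1.0, 0.0, -0.5, 0.5],
    "P2": [1.0, 2.0, 1.5, 0.5, 0.0, -1.0],
    "P3": [1.5, 1.0, 2.0, -0.5, 0.5, 0.0],
    "P4": [-3.0, 0.0, 1.0, 3.0, 0.0, -1.0],
}
ranking = rank_triplets(z_scores, labels)
best_auroc, best_panel = ranking[0]
```

With 11 candidate proteins, an exhaustive search of this kind covers 165 three-protein panels, which is small enough to evaluate in full.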

3. Results

3.1. Study Selection

The search followed the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and initially identified 493 papers. Screening of titles led to the exclusion of 142 duplicate papers. Article-type screening resulted in the exclusion of 48 review articles. Abstracts and full-text articles were screened to find the documents that had no raw data available, used non-human samples, or used cell lines. This stage of screening led to the exclusion of 280 articles. Screening of studies with no available raw data, non-English-language studies, and studies using non-human samples excluded 4, 14, and 5 papers, respectively.

Of the eight datasets with available raw data, four included data from non-MS diseases or from MS patients treated with therapeutic compounds and methods. Thus, four eligible datasets were included in the present study. In addition, searching for MS datasets in databases yielded 51 datasets. However, 41 datasets were excluded since they included cell line studies, extracellular vesicle (EV) samples, treated MS samples, and blood and plasma samples. Further, the search yielded two studies with brain proteomics data and one dataset with oligodendrocyte proteomics data, which were excluded since four or more studies were needed to complete the meta-analysis and validation processes. In total, seven studies that used CSF samples for proteomic analysis were included in this systematic review and meta-analysis. The study selection process is shown in Figure 1.

3.2. Data Extraction

Table 1 presents the characteristics of the included studies and related proteomics datasets that used CSF and brain samples from MS patients.

3.3. Data Preprocessing

LC-MS data for each study were retrieved from the PRIDE database using the dataset IDs given in the articles, or from the supplementary data published with the articles. For the studies with available raw data, these data were searched against the UniProt human canonical protein sequence database (downloaded on 20 July 2024; 40,936 entries) using MSFragger/FragPipe. This approach minimised variability arising from differences in protein sequence databases, search algorithms, and quantification methods. In addition, using a canonical protein database avoided issues such as spurious peptide-spectrum matches and artificial isoforms. Data normalisation was performed using the median intensities of reference samples. For datasets with no reference sample, the median of the summed sample intensities was used to derive the NF. Outliers were identified using the method developed by van Zalm and colleagues (2023) [18]. This method assumes that protein expression levels correlate between samples; a sample that does not correlate indicates a problem in the experimental stages. Therefore, the protein intensities of each sample were correlated with the median intensities of all samples in the dataset. Outlier samples were identified using a threshold of three standard deviations from the mean correlation for each dataset. Six CSF samples were identified as outliers and removed (Figure S1A). In brain samples, no sample was identified as an outlier (Figure S1B). As van Zalm and colleagues [18] noted, their method provides fast and accurate data adjustment, which improves downstream analyses. In addition, Z-scores were calculated for each dataset individually and then for the integrated dataset, and were used to normalise protein intensities. This approach helped remove batch effects caused by differences in experimental steps, sample preparation, and measurement instruments. PCA plots of the CSF datasets before and after Z-score normalisation are shown in Figure 2.

3.4. Discovery of Biomarker Candidates

After removing outliers in the CSF group, the remaining samples included 127,083 proteins. Fisher’s exact test was then performed to find proteins expressed only in MS or only in control cases. However, this test did not yield a significant number of proteins at a cut-off of p < 0.05 (Figure S2A,B). Therefore, we retained all proteins for the subsequent analysis steps. Next, the non-parametric Mann–Whitney U test was applied, followed by Benjamini–Hochberg correction for multiple testing. This identified 73 proteins in CSF samples with a statistically significant increase in expression levels (Figure 3A). These proteins were used to identify functional interactions between potential biomarkers and for enrichment analysis using the Molecular Signatures Database (MSigDB) Hallmark dataset, with the highest statistical confidence level for both CSF and brain samples (Figure 3B). This analysis emphasises the importance of protein networks and metabolism in MS. Notably, some of the proteins found to be significantly altered in the meta-analysis did not show significant increases or decreases in expression levels in the individual dataset analyses. This finding suggests that meta-analysis can help uncover biomarker candidates that individual studies miss.

The network of enriched biological terms is shown in Figure 4, emphasising key functional clusters and their interconnections, which provide crucial insights into the molecular mechanisms of the disease. Functionally related terms are grouped into distinct clusters, such as “Platelet Degranulation”, “Complement Activation”, and “Vitamin D Receptor Pathway”, underscoring the interplay between immune regulation, coagulation, and metabolic processes. Terms like “Platelet Degranulation” and “Post-Translational Protein Phosphorylation” stand out for their strong statistical enrichment and extensive gene representation, highlighting their critical roles in processes such as cellular signalling, protein modification, and immune activation. The use of colour to represent statistical significance further emphasises the pathways most relevant to the disease, with darker nodes indicating higher significance. This network not only illustrates the complexity of the biological interactions but also identifies priority pathways that could serve as potential therapeutic targets or provide deeper insights into the disease’s progression and pathology.

Figure 5 highlights the most significantly enriched GO terms, providing insight into the biological processes (Figure 5A) and molecular functions (Figure 5B) associated with the differentially expressed proteins identified in this meta-analysis. In the biological processes category, pathways such as “Retina Homeostasis” (GO:0001895) and “Negative Regulation of Blood Coagulation” (GO:0030195) were highly enriched, indicating their potential relevance to maintaining tissue integrity and regulating pathological responses. Other enriched terms, including “High-Density Lipoprotein Particle Remodelling” (GO:0034375) and “Reverse Cholesterol Transport” (GO:0043691), suggest critical roles in lipid metabolism and transport, processes known to be implicated in neurodegenerative and systemic diseases [30,31].

Similarly, the molecular functions analysis revealed significant enrichment for activities such as “Cholesterol Transfer Activity” (GO:0120020) and “Amyloid-Beta Binding” (GO:0001540). These findings suggest involvement in cholesterol transport and amyloid protein interactions, both of which are crucial in cellular signalling and potentially linked to disease pathophysiology. Notably, “Lipoprotein Particle Receptor Binding” (GO:0070325) and “Phosphatidylcholine-Sterol O-acyltransferase Activator Activity” (GO:0060228) further highlight the relevance of lipid metabolism and receptor-mediated interactions in the context of MS. These enriched terms collectively provide evidence of the intricate involvement of lipid regulation, protein interactions, and homeostasis in the biological mechanisms underlying the studied condition.

The GO enrichment analysis of differentially expressed proteins (DEPs) identified significant associations with key biological processes (Figure 6), with immune system processes (GO:0002376) showing the highest enrichment (p < 10^−17), emphasising the pivotal role of immune dysregulation in the pathology of the disease. This highlights the involvement of DEPs in immune-related pathways, consistent with the inflammatory mechanisms commonly observed in the condition. The second most significant process, biological regulation (GO:0065007), reflects the role of DEPs in critical regulatory networks, including the modulation of cellular responses and protein activity, suggesting that disruptions in these pathways may drive disease progression. The third, localisation (GO:0051179), underscores the involvement of DEPs in intracellular and intercellular transport processes essential for cellular function, with potential implications for molecular trafficking disruptions in the disease. Collectively, these findings provide critical insights into the immune, regulatory, and transport-related roles of the identified DEPs, offering a foundation for understanding the disease’s molecular underpinnings and identifying therapeutic targets.

3.5. Development of a Model Using Biomarker Candidates

In further analysis, putative biomarkers from CSF samples were used to develop machine-learning models using logistic regression. Because in-house samples were not available for validation, we used two datasets for the two-step validation of our models. To this end, we used five CSF datasets for primary model development and two datasets, Timirci (2019) [24] and Comabella (2021) [25], for model validation. All steps, including pre-processing, Z-score calculation, normalisation, and exploratory analysis, were also performed on the validation datasets. In addition, the proteins shared between the test and validation datasets that contributed most to model accuracy were identified for each dataset. APOE, CD14, CNDP1, CTNT1, DKK3, IGHA1, IGHG3, IGKC, NPTXR, PTGDS, and VGF were identified for CSF samples, among which IGKC achieved the highest AUROC, 0.81 (Figure 7A–C).

4. Discussion

In this meta-analysis of CSF proteomes from MS patients, we identified several proteins that may play important roles in MS pathogenesis. Proteins such as APOE, CD14, CNDP1, CTNT1, DKK3, IGHA1, IGHG3, IGKC, NPTXR, PTGDS, and VGF provide valuable insights into the complex and interrelated mechanisms of immune regulation, neurodegeneration, and cellular signalling in MS.

APOE, traditionally recognised for its involvement in lipid metabolism, is implicated as a crucial modulator of inflammation in the CNS [32]. Beyond its lipid-related functions, APOE may influence T-cell proliferation, macrophage activity, and antigen presentation by CD1 molecules to natural killer T (NKT) cells [33]. Its potential role in dampening neuroinflammation positions APOE as a candidate modulator of MS-related immune responses, potentially shaping the immune landscape during disease progression [34].

CD14, a co-receptor for lipopolysaccharides, is another key protein identified in our analysis. Its upregulation in microglia and macrophages during active disease phases contributes to the production of proinflammatory cytokines, such as TNF-α and IL-1β, which drive demyelination and neurodegeneration [35,36]. Elevated CD14 levels in progressive MS forms highlight its association with disease severity [37,38], suggesting its potential as a marker of neuroinflammation and therapeutic targeting.

While no direct interaction between APOE and CD14 is confirmed in MS, both proteins modulate innate immune responses and may contribute to a shared inflammatory axis worth further investigation.

Immunoglobulins such as IGHA1, IGHG3, and IGKC were also identified as DEPs in the CSF of MS patients. These proteins are central to adaptive immune responses and have long been linked to MS pathology. The presence of oligoclonal bands (OCBs) in the CSF, indicative of intrathecal immunoglobulin production by B-cell clones, further supports the role of B-cell dysregulation in MS. Elevated levels of IGHG3 and IGKC, components of IgG molecules, suggest ongoing humoral immune responses in the CNS, aiding in the distinction of MS from other neurological disorders [39,40]. The identification of IGHA1, an immunoglobulin A heavy chain, is particularly intriguing, as it may reflect broader immune activity, although its direct role in MS pathology remains unclear.

Similarly, reductions in CNDP1 (carnosine dipeptidase 1) support its proposed protective role in MS pathology. As an enzyme that breaks down the neuroprotective dipeptide carnosine, CNDP1 helps buffer oxidative stress—a factor elevated in MS [41,42]. This protein’s function in reducing oxidative damage may also promote oligodendrocyte survival, linking it to remyelination and its relevance in relapsing–remitting MS (RRMS), where oxidative stress exacerbates disease relapses.

DKK3 (Dickkopf-3) was identified for its involvement in modulating Wnt/β-catenin signalling pathways and immune responses [43]. Emerging evidence suggests that DKK3 influences local T-cell polarisation and cytokine production, central to MS pathogenesis [44,45]. Its potential role in modulating immune tolerance within the CNS could open new avenues for therapeutic modulation of autoimmunity in MS.

NPTXR (neuronal pentraxin receptor) is another protein linked to neurodegenerative processes [46]. Its decreased levels in the CSF of MS patients, particularly in those with progressive disease, highlight its role in maintaining synaptic integrity [28,41,46]. Synaptic dysfunction, a hallmark of progressive MS, contributes to cognitive and motor decline, and NPTXR’s association with synaptic health suggests it could serve as a marker for neurodegeneration in MS.

PTGDS (prostaglandin D2 synthase) is highly expressed in oligodendrocytes and astrocytes, implicating it in remyelination and inflammatory modulation [47,48]. PTGDS exhibits both neuroprotective and proinflammatory activity, depending on disease stage [48,49,50], underscoring its complex involvement in MS pathophysiology.

VGF, a neuropeptide involved in synaptic plasticity and neuroprotection, was also significantly upregulated in MS patients. Its potential role in balancing excitatory and inhibitory signals in neurons makes it a key contributor to the neurodegenerative processes seen in MS, particularly cognitive decline [51,52]. VGF’s regulation by inflammatory signals further supports its involvement in neuroinflammation and neurodegeneration in MS.

Collectively, these proteins and their interactions underscore the potential for developing personalised therapeutic strategies for MS. Targeting the APOE-CD14 axis or modulating immunoglobulin production could provide tailored treatments for patients exhibiting particular immune profiles.

Beyond these individual proteins, integrative pathway enrichment analyses offer a broader, systems-level perspective. Clustering of terms such as “Platelet Degranulation,” “Complement Activation,” and “Vitamin D Receptor Pathway” highlights the complex interplay between immune responses, coagulation, and metabolic regulation in MS, and demonstrates the value of meta-analyses for heterogeneous diseases. The prominence of “Post-Translational Protein Phosphorylation” further points to intricate protein modifications orchestrating disease-associated signalling.

The enrichment of immune system processes among DEPs solidifies the centrality of immune dysregulation in MS. Additionally, dysregulations in biological regulation, intracellular transport, and regulatory networks may impede cellular homeostasis and molecular trafficking, contributing to disease progression. Processes like “Retina Homeostasis” and “Negative Regulation of Blood Coagulation” highlight the potential for both localised and systemic dysregulation, while molecular functions involving “Cholesterol Transfer Activity” and “Amyloid-Beta Binding” underscore the significance of lipid metabolism and protein aggregation in MS pathology.

In this study, we identified several pathways and biological processes that align with our previous findings [53]. Specifically, the complement and coagulation cascades emerged as a shared pathway, consistent with their central role in both immune response and neuroinflammation. Similarly, processes related to cholesterol metabolism, such as high-density lipoprotein particle remodelling, cholesterol efflux, and reverse cholesterol transport, align with previously reported pathways like the vitamin digestion and absorption pathway. Furthermore, amyloid-beta binding identified in this study complements the prion disease pathway observed in earlier findings, highlighting common mechanisms involving protein misfolding and aggregation. Finally, immune system-related processes, including complement activation and the negative regulation of biological processes, mirror pathways such as the NOD-like receptor signalling pathway reported in our prior work [53].

In conclusion, this meta-analysis highlights several CSF proteins and pathways of importance in MS pathogenesis. The identified proteins, including APOE, CD14, CNDP1, DKK3, NPTXR, PTGDS, and VGF, underscore the multifaceted nature of MS, encompassing immune dysregulation, neuroinflammation, and neurodegeneration. Furthermore, the elevated levels of immunoglobulins, including IGHA1, IGHG3, and IGKC, reinforce the importance of humoral immune responses in MS. Understanding the diverse roles these proteins play in MS not only deepens our knowledge of the disease mechanisms but also offers new potential targets for therapeutic intervention. Future research focused on the functional roles of these proteins in disease progression will be essential for advancing MS treatments.

5. Conclusions

This study presents a comprehensive meta-analysis of proteomics data from MS patients’ CSF, resulting in the identification of several essential proteins, including APOE, CD14, and PTGDS for neuroinflammation, CNDP1, NPTXR, and VGF for neurodegeneration, CTNT1 and DKK3 for repair mechanisms, and IGHA1, IGHG3, and IGKC as hallmarks of MS. By leveraging independent datasets, we provide more reliable biomarkers highlighting immune dysregulation, neuroinflammation, and neurodegeneration as central mechanisms in MS pathogenesis.

Our findings underscore the utility of combining proteomics data from diverse sources to improve the robustness of biomarker discovery. Future studies should focus on validating these proteins across different MS subtypes and other neurological conditions to further establish their clinical relevance. In addition, prognostic and diagnostic models should be developed for the early detection of MS onset and progression.

The identified proteins have the potential to be incorporated into diagnostic panels, allowing for earlier and more accurate detection of MS. Moreover, by identifying proteins linked to distinct pathological mechanisms, our study lays the groundwork for stratifying MS patients based on their molecular profiles, opening up possibilities for personalised treatment approaches while improving the understanding of MS pathophysiology.

6. Limitations

While this meta-analysis provides valuable insights into MS pathogenesis, it has several limitations. First, the datasets included were heterogeneous, with variability in sample preparation, data processing, proteomic platforms and patient characteristics, particularly regarding MS subtypes, disease stages and control group definitions. Such inconsistencies introduce technical and biological variability that may affect protein quantification and reproducibility. Although stringent normalisation and outlier removal procedures were applied, residual batch effects and methodological differences may still influence the results.

Secondly, the limited availability of detailed metadata, such as patient demographics, treatment history, and control group diagnoses, restricted the ability to conduct detailed subgroup analyses and may have masked protein expression patterns specific to disease stages or clinical phenotypes. Thirdly, differences in control groups across studies (e.g., healthy individuals vs. patients with other neurological conditions) may confound the interpretation of differentially expressed proteins. Finally, while this study focuses on CSF, it does not account for potential proteomic changes in other compartments, such as blood. Further validation in larger, more diverse cohorts and across different biofluids will be necessary to confirm the clinical relevance of these findings.

Supplementary Materials

The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/ijms26136171/s1.

Author Contributions

Conceptualisation, E.S.; methodology, E.S.; software, E.A.J.; validation, E.S. and E.A.J.; formal analysis, E.S. and E.A.J.; investigation, E.S. and E.A.J.; resources, E.S. and E.A.J.; data curation, E.A.J.; writing—original draft preparation, E.S.; writing—review and editing, E.A.J., T.A., L.C. and R.G.; visualisation, E.S. and E.A.J.; supervision, T.A.; project administration, E.S.; funding acquisition, E.S. All authors have read and agreed to the published version of the manuscript.

Funding

This research was supported by a Research Training Program (RTP) Scholarship provided by Western Sydney University.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Trapp, B.D.; Nave, K.-A. Multiple Sclerosis: An Immune or Neurodegenerative Disorder? Annu. Rev. Neurosci. 2008, 31, 247–269.
2. Magyari, M.; Sorensen, P.S. The changing course of multiple sclerosis: Rising incidence, change in geographic distribution, disease course, and prognosis. Curr. Opin. Neurol. 2019, 32, 320–326.
3. Reich, D.S.; Lucchinetti, C.F.; Calabresi, P.A. Multiple Sclerosis. N. Engl. J. Med. 2018, 378, 169–180.
4. van Langelaar, J.; Rijvers, L.; Smolders, J.; van Luijn, M.M. B and T Cells Driving Multiple Sclerosis: Identity, Mechanisms and Potential Triggers. Front. Immunol. 2020, 11, 760.
5. Sandi, D.; Kokas, Z.; Biernacki, T.; Bencsik, K.; Klivényi, P.; Vécsei, L. Proteomics in Multiple Sclerosis: The Perspective of the Clinician. Int. J. Mol. Sci. 2022, 23, 5162.
6. Åkesson, J.; Hojjati, S.; Hellberg, S.; Raffetseder, J.; Khademi, M.; Rynkowski, R.; Kockum, I.; Altafini, C.; Lubovac-Pilav, Z.; Mellergård, J.; et al. Proteomics reveal biomarkers for diagnosis, disease activity and long-term disability outcomes in multiple sclerosis. Nat. Commun. 2023, 14, 6903.
7. Elkjaer, M.L.; Nawrocki, A.; Kacprowski, T.; Lassen, P.; Simonsen, A.H.; Marignier, R.; Sejbaek, T.; Nielsen, H.H.; Wermuth, L.; Rashid, A.Y.; et al. CSF proteome in multiple sclerosis subtypes related to brain lesion transcriptomes. Sci. Rep. 2021, 11, 4132.
8. Teunissen, C.E.; Verheul, C.; Willemse, E.A.J. Chapter 1—The use of cerebrospinal fluid in biomarker studies. In Handbook of Clinical Neurology; Deisenhammer, F., Teunissen, C.E., Tumani, H., Eds.; Elsevier: Amsterdam, The Netherlands, 2018; pp. 3–20.
9. Cross, A.H.; Gelfand, J.M.; Thebault, S.; Bennett, J.L.; von Büdingen, H.C.; Cameron, B.; Carruthers, R.; Edwards, K.; Fallis, R.; Gerstein, R.; et al. Emerging Cerebrospinal Fluid Biomarkers of Disease Activity and Progression in Multiple Sclerosis. JAMA Neurol. 2024, 81, 373–383.
10. Huang, J.; Khademi, M.; Fugger, L.; Lindhe, Ö.; Novakova, L.; Axelsson, M.; Malmeström, C.; Constantinescu, C.; Lycke, J.; Piehl, F.; et al. Inflammation-related plasma and CSF biomarkers for multiple sclerosis. Proc. Natl. Acad. Sci. USA 2020, 117, 12952–12960.
11. Ransohoff, R.M.; Engelhardt, B. The anatomical and cellular basis of immune surveillance in the central nervous system. Nat. Rev. Immunol. 2012, 12, 623–635.
12. Thompson, A.J.; Banwell, B.L.; Barkhof, F.; Carroll, W.M.; Coetzee, T.; Comi, G.; Correale, J.; Fazekas, F.; Filippi, M.; Freedman, M.S.; et al. Diagnosis of multiple sclerosis: 2017 revisions of the McDonald criteria. Lancet Neurol. 2018, 17, 162–173.
13. Avsar, T.; Korkmaz, D.; Tütüncü, M.; Demirci, N.O.; Saip, S.; Kamasak, M.; Siva, A.; Turanli, E.T. Protein biomarkers for multiple sclerosis: Semi-quantitative analysis of cerebrospinal fluid candidate protein biomarkers in different forms of multiple sclerosis. Mult. Scler. 2012, 18, 1081–1091.
14. Mosleth, E.F.; Vedeler, C.A.; Liland, K.H.; McLeod, A.; Bringeland, G.H.; Kroondijk, L.; Berven, F.S.; Lysenko, A.; Rawlings, C.J.; Eid, K.E.-H.; et al. Cerebrospinal fluid proteome shows disrupted neuronal development in multiple sclerosis. Sci. Rep. 2021, 11, 4087.
15. Liu, H.; Wang, Z.; Li, H.; Li, M.; Han, B.; Qi, Y.; Wang, H.; Gao, J. Label-free Quantitative Proteomic Analysis of Cerebrospinal Fluid and Serum in Patients With Relapse-Remitting Multiple Sclerosis. Front. Genet. 2022, 13, 892491.
16. Held, F.; Makarov, C.; Gasperi, C.; Flaskamp, M.; Grummel, V.; Berthele, A.; Hemmer, B. Proteomics Reveals Age as Major Modifier of Inflammatory CSF Signatures in Multiple Sclerosis. Neurol. Neuroimmunol. Neuroinflamm. 2025, 12, e200322.
17. Kasakin, M.F.; Rogachev, A.D.; Predtechenskaya, E.V.; Zaigraev, V.J.; Koval, V.V.; Pokrovsky, A.G. Targeted metabolomics approach for identification of relapsing–remitting multiple sclerosis markers and evaluation of diagnostic models. MedChemComm 2019, 10, 1803–1809.
18. van Zalm, P.W.; Ahmed, S.; Fatou, B.; Schreiber, R.; Barnaby, O.; Boxer, A.; Zetterberg, H.; Steen, J.A.; Steen, H. Meta-analysis of published cerebrospinal fluid proteomics data identifies and validates metabolic enzyme panel as Alzheimer’s disease biomarkers. Cell Rep. Med. 2023, 4, 101005.
19. Nesvizhskii, A.I.; Keller, A.; Kolker, E.; Aebersold, R. A statistical model for identifying proteins by tandem mass spectrometry. Anal. Chem. 2003, 75, 4646–4658.
20. Yu, F.; Haynes, S.E.; Teo, G.C.; Avtonomov, D.M.; Polasky, D.A.; Nesvizhskii, A.I. Fast Quantitative Analysis of timsTOF PASEF Data with MSFragger and IonQuant. Mol. Cell. Proteom. 2020, 19, 1575–1585.
21. Zhou, Y.; Zhou, B.; Pache, L.; Chang, M.; Khodabakhshi, A.H.; Tanaseichuk, O.; Benner, C.; Chanda, S.K. Metascape provides a biologist-oriented resource for the analysis of systems-level datasets. Nat. Commun. 2019, 10, 1523.
22. Kuleshov, M.V.; Jones, M.R.; Rouillard, A.D.; Fernandez, N.F.; Duan, Q.; Wang, Z.; Koplev, S.; Jenkins, S.L.; Jagodnik, K.M.; Lachmann, A.; et al. Enrichr: A comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 2016, 44, W90–W97.
23. Chen, E.Y.; Tan, C.M.; Kou, Y.; Duan, Q.; Wang, Z.; Meirelles, G.V.; Clark, N.R.; Ma’ayan, A. Enrichr: Interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinform. 2013, 14, 128.
24. Timirci-Kahraman, O.; Karaaslan, Z.; Tuzun, E.; Kurtuncu, M.; Baykal, A.T.; Gunduz, T.; Tuzuner, M.B.; Akgun, E.; Gurel, B.; Eraksoy, M.; et al. Identification of candidate biomarkers in converting and non-converting clinically isolated syndrome by proteomics analysis of cerebrospinal fluid. Acta Neurol. Belg. 2019, 119, 101–111.
25. Comabella, M.; Sastre-Garriga, J.; Borras, E.; Villar, L.M.; Saiz, A.; Martínez-Yélamos, S.; García-Merino, J.A.; Pinteac, R.; Fissolo, N.; Sánchez López, A.J.; et al. CSF Chitinase 3-Like 2 Is Associated With Long-term Disability Progression in Patients with Progressive Multiple Sclerosis. Neurol. Neuroimmunol. Neuroinflamm. 2021, 8, e1082.
26. Stoop, M.P.; Runia, T.F.; Stingl, C.; van der Vuurst de Vries, R.M.; Luider, T.M.; Hintzen, R.Q. Decreased Neuro-Axonal Proteins in CSF at First Attack of Suspected Multiple Sclerosis. Proteom.–Clin. Appl. 2017, 11, 1700005.
27. Opsahl, J.A.; Vaudel, M.; Guldbrandsen, A.; Aasebø, E.; Van Pesch, V.; Franciotta, D.; Myhr, K.-M.; Barsnes, H.; Berle, M.; Torkildsen, Ø.; et al. Label-free analysis of human cerebrospinal fluid addressing various normalization strategies and revealing protein groups affected by multiple sclerosis. Proteomics 2016, 16, 1154–1165.
28. Kroksveen, A.C.; Guldbrandsen, A.; Vedeler, C.; Myhr, K.M.; Opsahl, J.A.; Berven, F.S. Cerebrospinal fluid proteome comparison between multiple sclerosis patients and controls. Acta Neurol. Scand. 2012, 126, 90–96.
29. Kroksveen, A.C.; Guldbrandsen, A.; Vaudel, M.; Lereim, R.R.; Barsnes, H.; Myhr, K.-M.; Torkildsen, Ø.; Berven, F.S. In-Depth Cerebrospinal Fluid Quantitative Proteome and Deglycoproteome Analysis: Presenting a Comprehensive Picture of Pathways and Processes Affected by Multiple Sclerosis. J. Proteome Res. 2017, 16, 179–194.
30. Estes, R.E.; Lin, B.; Khera, A.; Davis, M.Y. Lipid Metabolism Influence on Neurodegenerative Disease Progression: Is the Vehicle as Important as the Cargo? Front. Mol. Neurosci. 2021, 14, 788695.
31. Jones, L.; Holmans, P.A.; Hamshere, M.L.; Harold, D.; Moskvina, V.; Ivanov, D.; Pocklington, A.; Abraham, R.; Hollingworth, P.; Sims, R.; et al. Genetic evidence implicates the immune system and cholesterol metabolism in the aetiology of Alzheimer’s disease. PLoS ONE 2010, 5, e13950.
32. Guo, L.; LaDu, M.J.; Van Eldik, L.J. A dual role for apolipoprotein E in neuroinflammation. J. Mol. Neurosci. 2004, 23, 205–212.
33. Liu, C.-C.; Kanekiyo, T.; Xu, H.; Bu, G. Apolipoprotein E and Alzheimer disease: Risk, mechanisms and therapy. Nat. Rev. Neurol. 2013, 9, 106–118.
34. Zhang, H.-L.; Wu, J.; Zhu, J. The Immune-Modulatory Role of Apolipoprotein E with Emphasis on Multiple Sclerosis and Experimental Autoimmune Encephalomyelitis. J. Immunol. Res. 2010, 2010, 186813.
35. Fransson, J.; Bachelin, C.; Ichou, F.; Guillot-Noël, L.; Ponnaiah, M.; Gloaguen, A.; Maillart, E.; Stankoff, B.; Tenenhaus, A.; Fontaine, B.; et al. Multiple Sclerosis Patient Macrophages Impaired Metabolism Leads to an Altered Response to Activation Stimuli. Neurol. Neuroimmunol. Neuroinflamm. 2024, 11, e200312.
36. Fransson, J.; Bachelin, C.; Deknuydt, F.; Ichou, F.; Guillot-Noël, L.; Ponnaiah, M.; Gloaguen, A.; Maillart, E.; Stankoff, B.; Tenenhaus, A.; et al. Dysregulated functional and metabolic response in multiple sclerosis patient macrophages correlate with a more inflammatory state, reminiscent of trained immunity. bioRxiv 2021.
37. Walter, S.; Doering, A.; Letiembre, M.; Liu, Y.; Hao, W.; Diem, R.; Bernreuther, C.; Glatzel, M.; Engelhardt, B.; Fassbender, K. The LPS Receptor, CD14 in Experimental Autoimmune Encephalomyelitis and Multiple Sclerosis. Cell. Physiol. Biochem. 2006, 17, 167–172.
38. Lutterotti, A.; Kuenz, B.; Gredler, V.; Khalil, M.; Ehling, R.; Gneiss, C.; Egg, R.; Deisenhammer, F.; Berger, T.; Reindl, M. Increased serum levels of soluble CD14 indicate stable multiple sclerosis. J. Neuroimmunol. 2006, 181, 145–149.
39. Bogers, L.; Engelenburg, H.J.; Janssen, M.; Unger, P.-P.A.; Melief, M.-J.; Wierenga-Wolf, A.F.; Hsiao, C.-C.; Mason, M.R.; Hamann, J.; van Langelaar, J.; et al. Selective emergence of antibody-secreting cells in the multiple sclerosis brain. EBioMedicine 2023, 89, 104465.
40. Torkildsen, Ø.; Stansberg, C.; Angelskår, S.M.; Kooi, E.; Geurts, J.J.; Van Der Valk, P.; Myhr, K.; Steen, V.M.; Bø, L. Upregulation of Immunoglobulin-related Genes in Cortical Sections from Multiple Sclerosis Patients. Brain Pathol. 2010, 20, 720–729.
41. Wurtz, L.I.; Knyazhanskaya, E.; Sohaei, D.; Prassas, I.; Pittock, S.; Willrich, M.A.V.; Saadeh, R.; Gupta, R.; Atkinson, H.J.; Grill, D.; et al. Identification of brain-enriched proteins in CSF as biomarkers of relapsing remitting multiple sclerosis. Clin. Proteom. 2024, 21, 42.
42. Borràs, E.; Cantó, E.; Choi, M.; Villar, L.M.; Álvarez-Cermeño, J.C.; Chiva, C.; Montalban, X.; Vitek, O.; Comabella, M.; Sabidó, E. Protein-Based Classifier to Predict Conversion from Clinically Isolated Syndrome to Multiple Sclerosis. Mol. Cell. Proteom. 2016, 15, 318–328.
43. Mourtada, J.; Thibaudeau, C.; Wasylyk, B.; Jung, A.C. The Multifaceted Role of Human Dickkopf-3 (DKK-3) in Development, Immune Modulation and Cancer. Cells 2024, 13, 75.
44. Federico, G.; Meister, M.; Mathow, D.; Heine, G.H.; Moldenhauer, G.; Popovic, Z.V.; Nordström, V.; Kopp-Schneider, A.; Hielscher, T.; Nelson, P.J.; et al. Tubular Dickkopf-3 promotes the development of renal atrophy and fibrosis. JCI Insight 2016, 1, e84916.
45. Meister, M.; Papatriantafyllou, M.; Nordström, V.; Kumar, V.; Ludwig, J.; Lui, K.O.; Boyd, A.S.; Popovic, Z.V.; Fleming, T.H.; Moldenhauer, G.; et al. Dickkopf-3, a tissue-derived modulator of local T-cell responses. Front. Immunol. 2015, 6, 78.
46. Gómez de San José, N.; Massa, F.; Halbgebauer, S.; Oeckl, P.; Steinacker, P.; Otto, M. Neuronal pentraxins as biomarkers of synaptic activity: From physiological functions to pathological changes in neurodegeneration. J. Neural Transm. 2022, 129, 207–230.
47. Mohri, I.; Taniike, M.; Taniguchi, H.; Kanekiyo, T.; Aritake, K.; Inui, T.; Fukumoto, N.; Eguchi, N.; Kushi, A.; Sasai, H.; et al. Prostaglandin D2-mediated microglia/astrocyte interaction enhances astrogliosis and demyelination in twitcher. J. Neurosci. 2006, 26, 4383–4393.
48. Drake, S.S.; Zaman, A.; Simas, T.; Fournier, A.E. Comparing RNA-sequencing datasets from astrocytes, oligodendrocytes, and microglia in multiple sclerosis identifies novel dysregulated genes relevant to inflammation and myelination. WIREs Mech. Dis. 2023, 15, e1594.
49. Harrington, M.G.; Fonteh, A.N.; Biringer, R.G.; Hühmer, A.F.R.; Cowan, R.P. Prostaglandin D synthase isoforms from cerebrospinal fluid vary with brain pathology. Dis. Markers 2006, 22, 73–81.
50. Kihara, Y. Systematic Understanding of Bioactive Lipids in Neuro-Immune Interactions: Lessons from an Animal Model of Multiple Sclerosis. In The Role of Bioactive Lipids in Cancer Inflammation Related Diseases; Honn, K.V., Zeldin, D.C., Eds.; Springer International Publishing: Cham, Switzerland, 2019; pp. 133–148.
51. Woo, M.S.; Bal, L.C.; Winschel, I.; Manca, E.; Walkenhorst, M.; Sevgili, B.; Sonner, J.K.; Di Liberto, G.; Mayer, C.; Binkle-Ladisch, L.; et al. The NR4A2/VGF pathway fuels inflammation-induced neurodegeneration via promoting neuronal glycolysis. J. Clin. Investig. 2024, 134, e177692.
52. DeLuca, J.; Chiaravalloti, N.D.; Sandroff, B.M. Treatment and management of cognitive dysfunction in patients with multiple sclerosis. Nat. Rev. Neurol. 2020, 16, 319–332.
53. Avsar, T.; Durası, İ.M.; Uygunoğlu, U.; Tütüncü, M.; Demirci, N.O.; Saip, S.; Sezerman, O.U.; Siva, A.; Turanlı, E.T. CSF Proteomics Identifies Specific and Shared Pathways for Multiple Sclerosis Clinical Subtypes. PLoS ONE 2015, 10, e0122045.

Figure 1. PRISMA flow diagram of search and study selection process.

Figure 2. Z-score transformation for adjustment of data and removal of batch effects. (A) Pearson correlation was calculated between samples, and samples lying more than three standard deviations from the remaining samples were considered outliers and removed from downstream analysis. This process was repeated for each sample, resulting in a PCA plot. (B) Z-score transformation was applied to overcome variations between datasets, resulting in a homogeneous dataset. The datasets used in this analysis included Elkjaer et al., 2021 [7]; Kroksveen et al., 2012 [28]; Kroksveen et al., 2016 [29]; Opsahl et al., 2016 [27]; and Stoop et al., 2017 [26].
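The two steps named in this caption can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names, the toy intensities, and the per-sample correlation scores are all hypothetical, and the sketch assumes that Z-scores are computed per protein within each dataset and that outliers are flagged with a simple three-standard-deviation rule.

```python
from statistics import mean, stdev


def zscore_within_dataset(values):
    """Z-score one protein's intensities within a single dataset."""
    mu = mean(values)
    sd = stdev(values)  # sample standard deviation (n - 1 denominator)
    if sd == 0:
        return [0.0 for _ in values]  # constant protein: nothing to scale
    return [(v - mu) / sd for v in values]


def flag_outliers(scores, n_sd=3.0):
    """Flag entries lying more than n_sd standard deviations from the mean."""
    mu = mean(scores)
    sd = stdev(scores)
    return [sd > 0 and abs(s - mu) > n_sd * sd for s in scores]


# Hypothetical intensities of one protein in two studies on different scales.
study_a = [10.0, 12.0, 14.0]
study_b = [1000.0, 1200.0, 1400.0]
z_a = zscore_within_dataset(study_a)
z_b = zscore_within_dataset(study_b)

# Hypothetical per-sample correlation scores: 19 typical samples, one poor one.
correlations = [0.9] * 19 + [0.1]
outlier_flags = flag_outliers(correlations)
```

After the transformation both toy studies sit on the same unitless scale, which is the property that allows datasets from different platforms to be pooled. Only the last entry of `correlations` is flagged.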

Figure 3. Statistical analysis and enrichment analysis results. (A) Results of the Mann–Whitney U test with Benjamini–Hochberg multiple-comparison correction, which identified 73 biomarker candidates in CSF samples. Grey: proteins whose expression levels are not significantly changed; red: significantly upregulated proteins; blue: significantly downregulated proteins; green: upregulated proteins that do not meet the fold-change cutoffs. (B) Enrichment analysis of the CSF biomarker candidates using the MSigDB Hallmark database.
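As an illustration of the Benjamini–Hochberg correction mentioned in this caption, the following sketch adjusts a list of p-values. It assumes the raw p-values have already been obtained (for example from per-protein Mann–Whitney U tests); the function name and the example p-values are hypothetical and are not taken from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw p-values for four proteins.
raw_p = [0.01, 0.04, 0.03, 0.20]
adj_p = benjamini_hochberg(raw_p)
significant = [q < 0.05 for q in adj_p]
```

With these toy values only the first protein remains below 0.05 after adjustment, which shows how the correction trims candidates that pass an uncorrected threshold.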

Figure 4. Network of enriched terms: This network visualisation shows enriched biological terms, with nodes coloured by cluster ID to indicate functional groupings. Terms within the same cluster are positioned closer together, reflecting higher functional similarity. Node size represents the number of genes associated with each term.

Figure 5. Enriched GO terms for biological processes and molecular functions (2023). The bar charts display the most significantly enriched (A) biological processes and (B) molecular functions identified through GO analysis. Longer bars indicate higher levels of enrichment. GO terms and their identifiers are provided for each process and function, highlighting pathways and molecular activities relevant to this study.

Figure 6. The bar chart shows the top 20 enriched GO terms (1 per cluster) ranked by −log10(p-value), with higher bars indicating greater significance. The count represents the number of input genes associated with each term, while the percentage (%) reflects the proportion of genes linked to the term. Multi-test adjusted p-values (−log10(q)) ensure statistical robustness.

Figure 7. Biomarker validation cohorts. (A) Venn diagram showing 11 overlapping proteins (log2 fold change in parentheses) between the test and pruning cohorts, including APOE (3.614), CD14 (−2.050), CNDP1 (−1.548), CTN1 (1.379), DKK3 (1.183), IGHA1 (1.937), IGHG3 (−3.451), IGKC (−2.326), NPTXR (−2.852), PTGDS (2.162), and VGF (2.747). (B,D) Post-analysis of two datasets, Comabella et al. [25] and Timirci et al. [24], was used to evaluate the biomarker efficacy of these 11 biomarker candidates. Each of the 11 proteins revealed significant differences between MS and controls in both datasets. (C,E) Further validation was performed using logistic regression models built on the 11 biomarker candidates, which were trained and tested on the two validation cohorts (Comabella et al. [25] and Timirci et al. [24]). AUROCs for the individual proteins and for the two models are shown in the legend.
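For readers unfamiliar with the AUROC values reported in this figure, the sketch below computes an AUROC directly from its rank-based definition: the probability that a randomly chosen MS sample scores higher than a randomly chosen control, with ties counted as one half. The function name and the scores are hypothetical; in practice the scores could be a single protein's Z-score or the predicted probability from a logistic regression model.

```python
def auroc(pos_scores, neg_scores):
    """AUROC as the fraction of (positive, negative) pairs ranked correctly."""
    wins = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                wins += 1.0  # positive sample ranked above the negative one
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos_scores) * len(neg_scores))


# Hypothetical classifier scores for MS samples and controls.
ms_scores = [0.9, 0.8, 0.4]
control_scores = [0.5, 0.3, 0.2]
example_auroc = auroc(ms_scores, control_scores)
```

A value of 0.5 corresponds to chance-level separation and 1.0 to perfect separation. In the toy example, eight of the nine pairs are ranked correctly.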

Table 1. Summary of the included papers and datasets.

Sample ID	Author, Year	Number of Control Samples	Number of MS Samples	Sample Type
5	Timirci, 2019 [24]	19	23	CSF
6	Stoop, 2017 [26]	45	47	CSF
7	Opsahl, 2016 [27]	14	50	CSF
8	Kroksveen, 2012 [28]	17	17	CSF
PXD004572	Kroksveen, 2017 [29]	21	21	CSF
PXD017643	Elkjaer, 2021 [7]	33	103	CSF
PXD022958	Comabella, 2021 [25]	44	28	CSF
PXD004540	Kroksveen, 2016 [29]	21	216	CSF

Studies with no ID represent datasets retrieved from published articles and their supporting data, as they were unavailable via proteomics data repositories.

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
