Predictive Performance of Machine Learning with Evoked Potentials for SCI and MS Prognosis: A Meta-Analysis

Koutsojannis, Constantinos; Chrysanthakopoulou, Dionysia

doi:10.3390/ctn9020026

Open Access Review

by Constantinos Koutsojannis * and Dionysia Chrysanthakopoulou

Laboratory of Health Physics & Computational Intelligence, Department of Physiotherapy, School of Health Rehabilitation Sciences, University of Patras, 26504 Patras, Greece

* Author to whom correspondence should be addressed.

Clin. Transl. Neurosci. 2025, 9(2), 26; https://doi.org/10.3390/ctn9020026

Submission received: 14 May 2025 / Revised: 4 June 2025 / Accepted: 8 June 2025 / Published: 11 June 2025

Abstract

Evoked potentials (EPs), including somatosensory evoked potentials (SSEPs) and motor evoked potentials (MEPs), are used to assess neural conduction in spinal cord injury (SCI) and multiple sclerosis (MS), conditions marked by demyelination, inflammation, and axonal damage. Machine learning (ML), using data-driven algorithms, enhances EPs’ prognostic utility, but evidence synthesis is limited. This meta-analysis evaluated the predictive performance of EP-based ML models for SCI recovery (ASIA scale) and MS progression (EDSS) using a random-effects model. Five studies (n = 583) were included, and accuracy and area under the curve (AUC) were extracted from each. Pooled results showed high predictive accuracy of 77.7% (95% CI, 75.1–80.3%; I² = 57%) and an AUC of 0.82 (95% CI, 0.79–0.85; I² = 55%). Stratified analyses by disease type (SCI vs. MS) or injury severity were not feasible due to the limited number of studies. A sensitivity analysis excluding the rat model (n = 551) showed stable results (accuracy 76.9%; AUC 0.81). SSEP latency and MEP time series were key predictors, with amplitude critical in SCI and multimodal approaches enhancing performance. Moderate heterogeneity (I² = 55–57%) and the small number of studies constrain generalizability. This meta-analysis highlights EPs’ prognostic potential in ML-driven precision neurology and advocates for further human studies to validate multimodal approaches.

Keywords:

machine learning; evoked potentials; somatosensory evoked potentials; motor evoked potentials; spinal cord injury; multiple sclerosis; meta-analysis; prognosis; demyelination; inflammation; axonal damage; precision neurology

1. Introduction

Spinal cord injury (SCI) and multiple sclerosis (MS) are two highly debilitating neurological disorders affecting the central nervous system, causing significant sensory, motor, and autonomic impairments that drastically reduce quality of life [1,2]. SCI and MS were chosen due to their shared pathophysiology and the availability of EP-based ML studies, which are scarce for other neurological conditions. SCI, often resulting from trauma like motor vehicle accidents or falls, leads to immediate and frequently irreversible spinal cord damage [1]. This damage disrupts neural pathways, causing paralysis, sensory loss, and impaired autonomic functions. SCI severity ranges from incomplete lesions, where some function remains, to complete lesions with total functional loss below the injury site. The pathophysiology involves an initial mechanical injury followed by secondary injury processes, including inflammation, edema, and oxidative stress, which worsen neuronal damage and hinder recovery [1]. These complex mechanisms make predicting recovery and tailoring rehabilitation challenging, highlighting the need for accurate prognostic tools. Multiple sclerosis is a chronic autoimmune disease involving recurrent inflammatory attacks on the myelin sheath of the brain and spinal cord nerve fibers [2]. These attacks cause demyelination, axonal damage, and neurodegeneration, leading to diverse symptoms including muscle weakness, fatigue, and cognitive impairment. MS typically begins with a relapsing-remitting course, with periods of neurological dysfunction followed by partial or complete recovery; however, many patients transition to a secondary progressive phase with accumulating disability [2]. The unpredictable nature and heterogeneous presentation of MS complicate prognosis and treatment. Both SCI and MS share pathological features—demyelination, inflammation, and axonal damage—that disrupt neural conduction [1,2]. 
These shared mechanisms make them suitable for evaluation using electrophysiological techniques like evoked potentials (EPs) [3,4]. EPs, including somatosensory evoked potentials (SSEPs) and motor evoked potentials (MEPs), are crucial for assessing sensory and motor pathway integrity in SCI and MS [3,4]. SSEPs measure electrical activity in response to peripheral nerve stimulation, reflecting sensory pathway conduction efficiency. MEPs, elicited by transcranial magnetic stimulation, assess motor pathway function through muscle response recordings. In SCI, prolonged SSEP latencies or absent amplitudes indicate disrupted sensory conduction, often correlating with axonal damage, while MEP abnormalities reflect motor pathway impairment, aiding functional recovery predictions [3]. In MS, EP abnormalities such as delayed N20 or P40 latencies signal demyelination, and reduced MEP amplitudes suggest motor tract dysfunction, providing insights into disease progression [4]. These electrophysiological measures capture the functional consequences of pathological processes, making them potentially valuable prognostic biomarkers.

Integrating machine learning (ML), which learns patterns from data to build predictive models, with EPs offers a promising approach to improving prognostic accuracy in SCI and MS [5,6,7]. ML algorithms, such as Random Forests, Support Vector Machines, and Deep Learning, effectively analyze complex multimodal data, identifying patterns in EP metrics (latency, amplitude, time series) that correlate with clinical outcomes like the American Spinal Injury Association (ASIA) scale for SCI or the Expanded Disability Status Scale (EDSS) for MS [7,8]. By using large datasets, ML models can uncover subtle relationships between EP features and disease outcomes, providing personalized predictions that can surpass traditional statistical methods. For example, in SCI, ML might prioritize SSEP amplitude as a predictor of motor recovery, reflecting axonal integrity, while in MS, MEP time series might capture dynamic changes in motor function, informing disease trajectory. Despite the potential of EP-based ML models, their application in SCI and MS remains under-explored, with limited efforts to synthesize evidence across studies [9,10,11,12,13,14]. This meta-analysis addresses this gap by synthesizing evidence from five studies (n = 583) on EP-based ML models for predicting SCI recovery and MS progression [12,15]. Using a random-effects model to account for heterogeneity, the analysis evaluates the accuracy and area under the curve (AUC) of these models, providing a pooled estimate of their prognostic utility. Complementary imaging modalities, such as T2-weighted MRI and diffusion tensor imaging (DTI), can further enhance the prognostic value of EPs by providing structural insights into lesion burden and axonal disruption [5,6]. This study aims to highlight the potential of EP-based ML in precision neurology, guiding personalized rehabilitation for SCI and disease-modifying therapies for MS, while identifying areas for future research to improve clinical translation.

1.1. Pathophysiological Framework: Linking EPs to Disease Mechanisms

SCI and MS impair sensory and motor conduction through shared pathological processes, positioning EPs as vital biomarkers [1,2]. In SCI, trauma triggers oligodendrocyte apoptosis, reducing myelin basic protein and causing demyelination, which prolongs EP latencies, such as N20 or P40, reflecting slowed neural conduction [1,3]. Inflammation in SCI involves acute cytokine release, including TNF-α and IL-6, leading to reduced EP amplitudes that signal neuronal dysfunction [1]. Axonal damage in SCI results from calpain-mediated transection, with absent EP amplitudes indicating irreversible loss, often marked by reduced N-acetylaspartate levels [1]. Neurodegeneration and repair failure in SCI, driven by glial scar formation, result in persistent EP abnormalities, highlighting failed remyelination [1]. In MS, autoimmune attacks on myelin components, such as MBP and MOG, cause multifocal demyelination, detectable through prolonged EP latencies [2,4]. Chronic inflammation in MS, mediated by cytokines like TNF-α and IL-17, reduces EP amplitudes, reflecting ongoing neuronal impairment [2]. Axonal loss in MS, secondary to chronic demyelination, is marked by reduced N-acetylaspartate and absent EP amplitudes, indicating severe damage [2]. Impaired oligodendrocyte differentiation in MS hinders remyelination, leading to persistent EP abnormalities [2,6]. These mechanisms demonstrate EPs’ role in capturing disease processes, informing ML-based prognostic models [7,15,16].

1.2. Imaging Correlations: Complementing EPs with Structural Insights

Imaging modalities enhance the prognostic utility of EPs by providing structural insights into SCI and MS pathology [5,6,17]. T2-weighted MRI reveals edema or necrosis in SCI and demyelinating plaques in MS, with EP abnormalities correlating with lesion burden in both conditions [5,16,18]. Diffusion tensor imaging (DTI) shows reduced fractional anisotropy (FA), indicating axonal disruption, which aligns with EP changes in sensory and motor pathways [15,16,19]. Magnetization transfer imaging (MTI) detects decreased myelin content through reduced magnetization transfer ratios (MTR), complementing EP findings of demyelination [8,20,21]. Magnetic resonance spectroscopy (MRS) identifies reduced N-acetylaspartate (NAA) and elevated choline, signaling axonal loss and inflammation, respectively, which correspond to EP amplitude reductions [22,23,24]. In MS, lesion volume shows moderate positive correlations with SSEP latency (r ≈ 0.4–0.6, Yperman et al., 2020 [25,26]), while in SCI, lower FA is associated with reduced MEP amplitude (r ≈ 0.3–0.5, Yoo H.J. et al., 2024 [27]; Chrysanthakopoulou et al., 2025 [15]). These correlations vary due to differences in imaging protocols and EP metrics. The integration of EPs with imaging strengthens ML models by combining functional and structural data, improving prognostic accuracy [7,18].

2. Methods

2.1. Study Selection

The meta-analysis included studies that applied machine learning to evoked potentials, specifically SSEPs and MEPs, to predict SCI recovery (based on the ASIA scale), SCI location (as a proxy for prognosis), or MS progression (based on the EDSS) [12,15,25,26]. Studies primarily included human subjects. One animal study, an SCI rat model (n = 32), was included for its robust ML-EP methodology and direct relevance to EP-based ML prognosis [26]; other animal studies were excluded because they did not apply ML or did not address SCI/MS outcomes. A sensitivity analysis excluding this study was used to check robustness (Section 3.3). Five studies were selected via searches of PubMed, Scopus, and Web of Science (up to May 2025). The search combined keywords and MeSH terms, including “spinal cord injury,” “multiple sclerosis,” “evoked potentials,” “machine learning,” and “prognosis,” with filters applied to limit results to relevant study types (e.g., human studies, meta-analyses).

2.2. Included Studies

Chrysanthakopoulou et al. (2025): MS human cohort (n = 125), SSEP-based ML for EDSS progression [12].
Chrysanthakopoulou et al. (2025): SCI human cohort (n = 123), SSEP-based ML for ASIA recovery [15].
Cui et al. (2019): SCI rat model (n = 32), SSEP-based ML for injury location [24].
Yperman et al. (2020): MS human cohort (n = 223), MEP-based ML for EDSS progression [26].
Yoo H.J. et al. (2024): SCI human cohort (n = 80), SSEP-based ML for gait recovery [27].

Inclusion criteria required ML models to use EP metrics (latency, amplitude, or time series) and report predictive performance metrics (accuracy and AUC). Studies were excluded if they did not use ML, focused on non-EP biomarkers, or lacked performance metrics.

2.3. Data Extraction

Data extracted from the studies encompassed study characteristics, including design, sample size, population (SCI or MS, human or animal), EP type (SSEP or MEP), ML model (algorithm and validation method), outcomes (accuracy and AUC for ASIA improvement, injury location, or EDSS progression), and key features (e.g., latency, amplitude, time series contributions). Included studies used established ML validation methods (e.g., cross-validation, hold-out test sets) to separate training and test data, ensuring unbiased accuracy estimates. Covariates such as disease duration and lesion location were extracted based on their reported clinical relevance in primary studies for SCI and MS prognosis [12,15,25,26,27]. Details are summarized in Table 1 (see Results).

2.4. Statistical Analysis

Pooled estimates of accuracy and AUC were calculated using a random-effects model (DerSimonian-Laird method), accounting for within-study and between-study variability. This method assumes that true effect sizes vary across studies due to differences in populations or methods and estimates the between-study variance (τ²) using a moment-based approach. Each study’s weight is the inverse of the sum of its within-study variance (SE²) and τ²; the pooled effect is the resulting weighted average, with wider confidence intervals reflecting heterogeneity. Heterogeneity was assessed using the I² statistic. Stratified analyses by disease type (SCI vs. MS) or injury severity were not feasible due to the limited number of studies (n = 5). p-value corrections (e.g., Bonferroni) were not applied, as the meta-analysis pooled accuracy and AUC rather than testing individual EP features. A sensitivity analysis excluded the rat study (Cui et al., 2019) to evaluate human-only results.
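The pooling procedure described above can be sketched in a few lines of Python. This is a minimal illustration of the DerSimonian-Laird estimator, not the authors' analysis (which was run in R with the meta package); the function name `dersimonian_laird` and the example inputs are hypothetical placeholders, not values taken from the included studies.

```python
import math


def dersimonian_laird(effects, std_errors, z=1.96):
    """Pool per-study effects with the DerSimonian-Laird random-effects estimator.

    effects    : per-study estimates (e.g., accuracy or AUC)
    std_errors : per-study standard errors (SE)
    """
    k = len(effects)
    variances = [se ** 2 for se in std_errors]      # within-study variance, SE^2
    w = [1.0 / v for v in variances]                # fixed-effect weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    # Cochran's Q and the moment-based estimate of tau^2
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c) if c > 0 else 0.0

    # Random-effects weights: inverse of (SE^2 + tau^2)
    w_star = [1.0 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)
    se_pooled = math.sqrt(1.0 / sum(w_star))
    i2 = max(0.0, (q - (k - 1)) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci": (pooled - z * se_pooled, pooled + z * se_pooled),
        "tau2": tau2,
        "i2": i2,
    }


# Hypothetical inputs for illustration only (not data from the included studies).
example = dersimonian_laird([0.70, 0.75, 0.80, 0.85], [0.03, 0.04, 0.03, 0.05])
print(example)
```

When the estimated τ² is zero the random-effects weights reduce to the fixed-effect weights, so the two models give the same pooled value; a positive τ² flattens the weights and widens the interval.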

Analyses were conducted in R using the meta package. Publication bias was assessed using funnel plots and Egger’s test. Significance was set at p < 0.05. Enhanced forest plots were generated to visualize pooled estimates, incorporating study weights and pooled confidence interval diamonds.
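For readers unfamiliar with Egger's test, the sketch below shows the regression behind it: each study's standardized effect (effect divided by its SE) is regressed on its precision (1/SE), and an intercept far from zero points to funnel-plot asymmetry. This is an illustrative plain-Python version under ordinary least squares, not the routine the authors ran; the function name `egger_regression` and the example inputs are hypothetical.

```python
import math


def egger_regression(effects, std_errors):
    """Egger's regression: standardized effect (y/SE) on precision (1/SE).

    Returns (intercept, slope, t_statistic, degrees_of_freedom).
    """
    n = len(effects)
    if n < 3:
        raise ValueError("Egger's regression needs at least three studies")
    x = [1.0 / se for se in std_errors]                   # precision
    y = [e / se for e, se in zip(effects, std_errors)]    # standardized effect
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are equal; slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance and the OLS standard error of the intercept
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    resid_var = rss / (n - 2)
    se_intercept = math.sqrt(resid_var * (1.0 / n + mean_x ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, slope, t_stat, n - 2


# Hypothetical inputs for illustration only (not data from the included studies).
print(egger_regression([0.72, 0.78, 0.80, 0.83, 0.79], [0.03, 0.04, 0.05, 0.06, 0.02]))
```

The p-value comes from comparing the t statistic with a t distribution on n − 2 degrees of freedom. With five studies that leaves only three degrees of freedom, so the test has little power.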

3. Results

3.1. Study Characteristics

Five studies (N = 583) were included, covering SCI (n = 235, comprising 203 human and 32 animal subjects) and MS (n = 348, all human). Table 1 summarizes the study characteristics, including population, EP type, ML model, outcome, accuracy, and AUC. Studies used SSEPs (n = 360) or MEPs (n = 223), with ML algorithms including Random Forest and deep learning. From Yoo H.J. et al., 2024, the meta-analysis includes 80 patients, specifically those with a C5 neurological level of injury, to reflect the subset with complete SSEP data. Outcomes encompassed ASIA recovery and injury location for SCI, and EDSS progression for MS. Accuracy ranged from 70.0% to 84.7%, and AUC from 0.75 to 0.87; the accuracy for Yperman et al. (2020) and the AUC for Cui et al. (2019) were estimated [24,25,26].
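The subgroup totals quoted above follow directly from the per-study sample sizes listed in Section 2.2. The short tally below reproduces that arithmetic; the sample sizes come from the text, while the dictionary field names and the helper `total` are illustrative choices.

```python
# Sample sizes as listed in Section 2.2; the field names are illustrative.
studies = [
    {"study": "Chrysanthakopoulou et al. (2025), MS", "disease": "MS", "species": "human", "ep": "SSEP", "n": 125},
    {"study": "Chrysanthakopoulou et al. (2025), SCI", "disease": "SCI", "species": "human", "ep": "SSEP", "n": 123},
    {"study": "Cui et al. (2019)", "disease": "SCI", "species": "rat", "ep": "SSEP", "n": 32},
    {"study": "Yperman et al. (2020)", "disease": "MS", "species": "human", "ep": "MEP", "n": 223},
    {"study": "Yoo H.J. et al. (2024)", "disease": "SCI", "species": "human", "ep": "SSEP", "n": 80},
]


def total(key=None, value=None):
    """Sum sample sizes, optionally restricted to studies where study[key] == value."""
    return sum(s["n"] for s in studies if key is None or s[key] == value)


print("All subjects:", total())
print("SCI:", total("disease", "SCI"), "| MS:", total("disease", "MS"))
print("SSEP:", total("ep", "SSEP"), "| MEP:", total("ep", "MEP"))
print("Human only:", total("species", "human"))
```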

3.2. Pooled Predictive Performance

The pooled accuracy was 77.7% (95% CI, 75.1–80.3%; I² = 57%), indicating high predictive performance (Figure 1) [12,15,24,26,27]. The random-effects model accounted for moderate heterogeneity due to variations in EP types, populations, and ML approaches. The pooled AUC was 0.82 (95% CI, 0.79–0.85; I² = 55%), suggesting strong discriminatory ability (Figure 2). Key features driving model performance included SSEP latency and MEP time series, universally significant across SCI and MS, and SSEP amplitude, particularly influential in SCI. Multimodal integration (SSEP with MRI) enhanced SCI predictions [27], while MEP time-series analysis improved MS prognosis [25].
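As a reading aid, the reported intervals can be converted into approximate standard errors for the pooled estimates. The calculation below assumes the 95% confidence intervals are symmetric normal-approximation intervals on the raw scale, which the text does not state explicitly; $L$ and $U$ denote the lower and upper limits.

```latex
\begin{align*}
\mathrm{SE} &\approx \frac{U - L}{2 \times 1.96} \\
\mathrm{SE}_{\text{accuracy}} &\approx \frac{80.3 - 75.1}{3.92} \approx 1.33 \text{ percentage points} \\
\mathrm{SE}_{\text{AUC}} &\approx \frac{0.85 - 0.79}{3.92} \approx 0.015
\end{align*}
```

Both intervals are centered on their point estimates (77.7 ± 2.6 and 0.82 ± 0.03), which is consistent with the symmetry assumption.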

3.3. Sensitivity Analysis

Excluding the animal study by Cui et al. (2019) (n = 32), the human-only analysis (n = 551) produced a pooled accuracy of 76.6% (95% CI, 73.9–79.3%; I² = 55%), slightly lower due to the exclusion of the high-accuracy study (84.7%) [24]. The pooled AUC was 0.80 (95% CI, 0.77–0.83; I² = 53%), with reduced heterogeneity. Additional sensitivity analyses excluding MS-only or SCI-only studies showed stable results (accuracy 78.5–79.8%, AUC 0.80–0.83), confirming the robustness of the approach.
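A sensitivity analysis of this kind amounts to re-pooling after dropping one study at a time. The sketch below shows the pattern with a compact DerSimonian-Laird helper; the function names `pool_dl` and `leave_one_out`, the study labels, and the numeric inputs are hypothetical and do not reproduce the values reported here.

```python
def pool_dl(effects, std_errors):
    """Return the DerSimonian-Laird random-effects pooled estimate."""
    variances = [se ** 2 for se in std_errors]
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_star = [1.0 / (v + tau2) for v in variances]
    return sum(wi * yi for wi, yi in zip(w_star, effects)) / sum(w_star)


def leave_one_out(labels, effects, std_errors):
    """Re-pool the estimate with each study removed in turn."""
    results = {}
    for i, label in enumerate(labels):
        kept_effects = effects[:i] + effects[i + 1:]
        kept_errors = std_errors[:i] + std_errors[i + 1:]
        results[label] = pool_dl(kept_effects, kept_errors)
    return results


# Hypothetical inputs for illustration only (not data from the included studies).
labels = ["Study A", "Study B", "Study C", "Study D"]
for dropped, pooled in leave_one_out(labels, [0.72, 0.78, 0.85, 0.80], [0.03, 0.04, 0.05, 0.03]).items():
    print(f"without {dropped}: pooled = {pooled:.3f}")
```

If the pooled value moves little whichever study is removed, no single study is driving the result.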

3.4. Cross-Disease Insights

SSEP latency and MEP time series consistently predicted outcomes across SCI and MS, reflecting shared pathological processes of demyelination and inflammation [3,4]. Despite pathophysiological differences, disease-specific predictors (e.g., MEP amplitude for SCI, SSEP latency for MS) were identified, enhancing interpretability within the constraints of available studies [15,25]. Amplitude was a critical predictor in SCI, aligning with axonal loss [15,26,27], while MEP time-series features enhanced MS prognosis by capturing dynamic motor dysfunction [25,26]. Multimodal approaches integrating SSEPs with MRI improved SCI outcome prediction, particularly for surgical planning [27]. Moderate heterogeneity (I² = 55–57%) stemmed from differences in EP types, populations, and ML methods, justifying the random-effects model [12,15,24,26,27].
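To put the I² values in context, the standard definition relates I² to Cochran's Q and the number of studies k. The Q statistics are not reported in the text, so the figures below are only the values implied by that definition for k = 5, assuming the usual Higgins-Thompson form was used.

```latex
\begin{align*}
I^2 &= \max\!\left(0,\ \frac{Q - (k - 1)}{Q}\right) \times 100\%
\quad\Longrightarrow\quad
Q = \frac{k - 1}{1 - I^2} \\
I^2 = 57\%:\quad Q &\approx \frac{4}{0.43} \approx 9.30 \\
I^2 = 55\%:\quad Q &\approx \frac{4}{0.45} \approx 8.89
\end{align*}
```

With only four degrees of freedom, the confidence interval around any I² estimate is wide, which is one reason the heterogeneity figures should be read cautiously.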

Funnel plots for accuracy and AUC showed slight asymmetry, suggesting minimal publication bias, although the small number of studies limits definitive conclusions. Egger’s test was not significant (p = 0.12 for accuracy, p = 0.15 for AUC), supporting the robustness of the pooled estimates.

4. Discussion

This meta-analysis, using a random-effects model, synthesizes evidence from five studies (n = 583) on the application of machine learning to evoked potentials for prognostic modeling in SCI and MS [12,15,24,26,27]. The pooled accuracy (77.7%) and AUC (0.82) underscore the predictive utility of EP-based ML models, with SSEP latency and MEP time series emerging as universal predictors due to their sensitivity to demyelination [3,4]. Compared to prior reviews (accuracy ~70–75%), our results highlight ML’s ability to integrate EP features (SSEP latency, MEP time series) with imaging (T2 lesions, DTI FA) and clinical data (ASIA, EDSS). SSEPs, sensitive to dorsal column demyelination, and MEPs, reflecting corticospinal axonal integrity, align with pathophysiological mechanisms (MBP loss, TNF-α elevation, NAA reduction). The pooled accuracy and AUC support predictions of functional milestones, such as ASIA score improvements in SCI and EDSS progression in MS (see the hypothetical clinical scenarios later in this section), though minimal clinically important differences for EP-based outcomes remain undefined, warranting future research. SSEP latency’s prominence in MS likely reflects demyelination-induced conduction delays, while MEP amplitude drives SCI predictions due to axonal loss; future ML explainability studies (e.g., using SHAP values) could clarify feature prioritization [24,27]. Multimodal approaches (e.g., SSEP-MRI-gait in Yoo H.J. et al., 2024 [27]) improved SCI prognosis, but limited reporting of unimodal vs. multimodal AUCs prevented quantitative comparison, underscoring the need for future studies. The sensitivity analysis excluding the animal study (human-only, n = 551) yielded slightly lower accuracy (76.6%, I² = 55%), confirming stability without the rat model and supporting human applicability [25].
The higher accuracy in SCI studies compared to MS likely reflects the stronger association of EPs, particularly amplitude, with axonal damage in SCI, which is more directly tied to functional outcomes like motor recovery [15,25,27]. Table 1 clarifies the variability across studies, highlighting diverse ML approaches and EP types. Yperman et al. (2020) demonstrated that MEP time-series features improve EDSS progression prediction [26]. Yoo H.J. et al. (2024) showed that combining SSEPs with MRI data enhances recovery prediction [27]. These findings align with prior evidence that EPs are sensitive biomarkers; however, ML significantly enhances their prognostic power [7,8]. Precision neurology aims to deliver personalized diagnostic and therapeutic strategies [21].

This approach leverages advanced technologies, such as machine learning, to integrate diverse data sources (electrophysiological, imaging, and clinical) and generate precise predictions of disease trajectories and treatment responses. In this meta-analysis, EP-based ML models exemplify precision neurology by harnessing the functional insights provided by SSEPs and MEPs to predict outcomes in SCI and MS [12,15,24,26,27]. For SCI, the ability of ML to prioritize SSEP amplitude reflects its sensitivity to axonal integrity, a critical determinant of motor recovery potential [15,27]. In MS, the emphasis on MEP time series captures the dynamic interplay of demyelination and inflammation, enabling predictions of EDSS worsening [24,26]. The high predictive performance (accuracy 77.7%, AUC 0.82) demonstrates the potential of these models to inform individualized care, guide rehabilitation plans for SCI patients, or identify MS patients at risk of rapid progression for early intervention with disease-modifying therapies [21].

The integration of multimodal data is crucial in precision neurology, and this meta-analysis highlights the synergistic role of EPs and imaging in enhancing prognostic accuracy [5,6,18]. T2-weighted MRI, DTI, and MRS provide structural and biochemical insights into lesion burden, axonal disruption, and inflammation, complementing the functional data from EPs [5,6]. For instance, in SCI, reduced fractional anisotropy on DTI correlates with SSEP amplitude reductions, reflecting axonal loss, while in MS, T2 lesion load aligns with prolonged SSEP latencies, indicating demyelination [5,6]. ML models that combine these modalities can create comprehensive patient profiles, enabling more accurate predictions of outcomes [18,27]. Despite its promise, precision neurology faces challenges in the context of EP-based ML models, including variability in EP acquisition protocols, limited sample sizes, and the need for standardized multimodal data integration. Variability in EP acquisition protocols (e.g., stimulation parameters or electrode placement) contributes to the moderate heterogeneity observed (I² = 55–57%), which also reflects differences in EP types, populations, ML algorithms, and outcomes and complicates model generalizability [12,15,24,26,27]. The limited number of studies (n = 5) precluded subgroup analyses by EP methodology (SSEP vs. MEP) or ML algorithm type (e.g., Random Forest vs. deep learning); future studies with larger samples should explore these stratifications. The inclusion of one rodent study (Cui et al., 2019) introduces potential interspecies variability in neurophysiology, mitigated by the human-only sensitivity analysis (n = 551; Section 3.3), which gave results close to the full analysis [24].
The analysis of multiple EP features (e.g., latency, amplitude) may increase Type I error risk in primary studies, though their validation methods and our pooled metrics mitigate false discovery concerns. The study was not pre-registered in PROSPERO, but covariate selection was justified post hoc based on clinical relevance, minimizing selective reporting risks. The focus on short-term outcomes (e.g., 6-month ASIA recovery) limits insights into longitudinal EP variability (e.g., due to remyelination), necessitating future studies on chronic trajectories like 2-year EDSS progression. Standardizing EP acquisition protocols is essential to ensure consistency across studies and facilitate the development of robust, widely applicable models. Additionally, the reliance on retrospective data and the limited number of studies (n = 5) constrain statistical power and clinical applicability. Prospective, multicenter studies with larger cohorts are needed to validate EP-based ML models and establish their utility in real-world clinical settings. Furthermore, the integration of real-time EP monitoring could enhance the dynamic assessment of disease progression, allowing for adaptive prognostic models [21].

The findings of this meta-analysis position EP-based ML models as a cornerstone of precision neurology, with significant implications for clinical practice [21]. In SCI, these models can guide rehabilitation by identifying patients likely to achieve motor recovery [15,27]. In MS, they can inform treatment decisions by predicting progression risk [12,24,26]. The cost-effectiveness and non-invasive nature of EPs make them particularly suitable for widespread clinical adoption. Looking forward, the development of integrated platforms that combine EP data with imaging, genetic, and clinical variables could further advance precision neurology, creating a holistic approach to prognosis and management in SCI and MS [21]. To illustrate the practical utility of EP-based ML models, consider two hypothetical clinical scenarios.

In an SCI case, a 35-year-old patient with a C6 incomplete injury undergoes an SSEP assessment one month post-injury. An ML model, leveraging SSEP amplitude, predicts with 83% accuracy a high likelihood of ASIA score improvement within six months. The clinician uses this prognosis to prioritize intensive physical therapy and consider transcranial magnetic stimulation, optimizing resource allocation and patient motivation.

In an MS case, a 42-year-old patient with relapsing-remitting MS shows prolonged SSEP latency and variable MEP time series. An ML model predicts a 70% risk of EDSS worsening within two years. The neurologist initiates ocrelizumab earlier than planned, tailoring therapy to mitigate progression.

These scenarios demonstrate how the high predictive performance of EP-based ML models can guide personalized interventions, enhancing clinical decision-making in SCI and MS [12,15,21,24,26,27]. Spinal cord injury poses significant challenges due to its acute onset and lasting consequences [1]. The primary injury, often caused by mechanical trauma, severs axons and disrupts neural circuits, leading to immediate loss of function. The secondary injury phase, characterized by inflammation, oxidative stress, and excitotoxicity, amplifies tissue damage, creating a hostile environment for neural repair. This cascade of events results in glial scarring and cystic cavity formation, which impede axonal regeneration and remyelination. EPs capture these disruptions by detecting prolonged latencies (indicative of demyelination) and reduced amplitudes (reflecting axonal loss), providing a functional measure of injury severity [3]. The ability of ML to prioritize features like SSEP amplitude in SCI underscores its potential to predict recovery trajectories, such as improvements in the ASIA scale, guiding rehabilitation strategies tailored to individual patients [15,27]. The heterogeneity in SCI studies, driven by differences in injury levels and EP metrics, highlights the need for disease-specific interpretations [5]. Multiple sclerosis presents a dynamic and progressive challenge due to its autoimmune etiology [2]. The immune system’s attack on myelin leads to multifocal lesions, disrupting neural conduction and causing a spectrum of symptoms that evolve over time. Early in MS, inflammatory demyelination dominates, producing relapses that may resolve partially or fully. Over time, chronic demyelination and axonal loss drive progressive disability, often measured by the EDSS. EPs, particularly SSEP latency and MEP time series, detect these changes by quantifying conduction delays and motor dysfunction, offering insights into disease activity and progression [4]. 
ML models excel at integrating these temporal and multimodal EP features, predicting EDSS progression with high accuracy [12,24,26]. However, the lower performance in MS studies compared to SCI may reflect the disease’s heterogeneity, with variable lesion locations and clinical courses complicating prognostic modeling [12,24,26]. The heterogeneity in MS studies reflects the variability in EP outcomes and clinical domains [12]. A focus on long-term outcomes could further improve model precision [24]. The findings align with prior evidence that EPs are sensitive biomarkers, with ML enhancing predictive power [7]. The integration of multimodal EP data improves prognostic accuracy, supporting multimodal approaches [6]. The random-effects model’s flexibility accounts for variability in study populations and outcomes, supporting a robust evidence synthesis [12,27].

The moderate heterogeneity (I² = 55–57%) in the meta-analysis, driven by differences in EP types, populations, and ML methods, underscores the complexity of applying ML across these disorders [12,15,24,26,27]. The findings align with prior evidence that EPs are sensitive biomarkers of neural dysfunction, but the integration of ML significantly enhances their prognostic power over traditional methods [3,4,7,8]. The ability of ML to handle high-dimensional EP data, such as time series and amplitude profiles, enables the identification of subtle patterns that correlate with clinical outcomes. In SCI, the emphasis on amplitude reflects its direct link to axonal integrity, a critical determinant of motor recovery [15,26]. In MS, the focus on time series captures the dynamic interplay of demyelination and inflammation, which drive disease progression [12,25,27]. The random-effects model’s wider confidence intervals appropriately account for the variability introduced by animal model inclusion and differing outcomes (ASIA vs. EDSS vs. injury location), ensuring a robust synthesis of evidence [12,15,24,26,27].

The main limitation of the present study is the small number of included studies (n = 5), which restricts statistical power and the precision of the between-study variance (τ²) estimate. The inclusion of an animal model also limits clinical applicability, as rat models do not fully replicate human pathophysiology [25]. Hypothetical pooling assumptions, such as estimated accuracy for Yperman et al. (2020) and AUC for Cui et al. (2019), introduce potential bias [24,25]. The retrospective design of most studies further constrains causal inference. The random-effects model mitigates some of these issues by accommodating heterogeneity, but its wider confidence intervals reflect underlying uncertainty. Future meta-analyses should prioritize larger, prospective human studies to improve robustness. The cost-effectiveness and non-invasive nature of EPs, combined with ML’s analytical power, position EP-based models as practical tools for precision neurology [21]. In SCI, these models could inform rehabilitation by predicting motor recovery potential [15]. In MS, they could guide disease-modifying therapies by identifying patients at risk of rapid progression [12,26]. The integration of EPs with imaging, such as MRI or DTI, could further enhance model performance, offering a multimodal approach to prognosis [5,6,18,27]. This meta-analysis represents the first effort to quantify the performance of EP-based machine learning models across SCI and MS [12,15,26,27]. It establishes a pathophysiological linkage, demonstrating that EPs reflect demyelination through myelin basic protein loss, inflammation via cytokine activity, and axonal damage indicated by reduced N-acetylaspartate [1,2]. The analysis highlights cross-disease contributions, identifying shared predictors like latency and time series, as well as disease-specific predictors, such as amplitude in SCI [15,26,27].
It supports the clinical translation of EP-based ML tools for personalized prognosis, offering a framework for tailored interventions [21]. The study’s scientific impact lies in consolidating EPs’ role in ML-driven prognosis, while its clinical impact includes informing rehabilitation strategies for SCI and disease-modifying therapies for MS [15,26,27]. By setting a foundation for future multimodal ML meta-analyses, it advances the field of precision neurology [21].

5. Conclusions

This meta-analysis of five studies (583 subjects) used a random-effects model to evaluate EP-based ML models for SCI and MS prognosis, yielding a pooled accuracy of 77.7% (95% CI, 75.1–80.3%; I² = 57%) and a pooled AUC of 0.82 (95% CI, 0.79–0.85; I² = 55%) [12,27]. A sensitivity analysis excluding the rat model (N = 551) gave stable results (accuracy 76.9%; AUC 0.81) [25]. EP modalities and multimodal EP integration enhanced prognostic accuracy, particularly in SCI [12]. Despite moderate heterogeneity (I² = 55–57%) and the small number of studies, these findings highlight the potential of EP-based ML models in precision neurology [12,26]. Future ML studies should incorporate larger, longitudinal EP datasets to improve precision [25].

Author Contributions

Conceptualization, C.K.; methodology, C.K.; resources, D.C.; writing—original draft preparation, C.K.; writing—review and editing, D.C.; supervision, C.K. All authors have read and agreed to the published version of the manuscript.

Funding

The author Chrysanthakopoulou D. is financially supported by «Andreas Mentzelopoulos Foundation» as part of her PhD dissertation.

Acknowledgments

The authors declare the use of GenAI in writing.

Conflicts of Interest

The authors declare no conflict of interest.

References

1. Kakulas, B.A. Neuropathology: The foundation for new treatments in spinal cord injury. Spinal Cord 2004, 42, 604–610.
2. Compston, A.; Coles, A. Multiple sclerosis. Lancet 2008, 372, 1502–1517.
3. Curt, A.; Dietz, V. Traumatic cervical spinal cord injury: Relation between somatosensory evoked potentials, neurological deficit, and outcome. Clin. Neurophysiol. 1999, 110, 1778–1782.
4. Leocani, L.; Comi, G. Neurophysiological investigations in multiple sclerosis. Curr. Opin. Neurol. 2000, 13, 255–261.
5. da Costa, R.C.; Poma, R.; Parent, J.M.; Partlow, G.; Monteith, G. Correlation of motor evoked potentials with magnetic resonance imaging and neurologic findings in Doberman Pinschers with and without signs of cervical spondylomyelopathy. Am. J. Vet. Res. 2006, 67, 1613–1620.
6. Wuschek, A.; Bussas, M.; El Husseini, M.; Harabacz, L.; Pineker, V.; Pongratz, V.; Berthele, A.; Riederer, I.; Zimmer, C.; Hemmer, B.; et al. Somatosensory evoked potentials and magnetic resonance imaging of the central nervous system in early multiple sclerosis. J. Neurol. 2023, 270, 824–830.
7. Habibi, M.A.; Naseri Alavi, S.A.; Soltani Farsani, A.; Mousavi Nasab, M.M.; Tajabadi, Z.; Kobets, A.J. Predicting the Outcome and Survival of Patients with Spinal Cord Injury Using Machine Learning Algorithms: A Systematic Review. World Neurosurg. 2024, 188, 150–160.
8. De Brouwer, E.; Becker, T.; Werthen-Brabants, L.; Dewulf, P.; Iliadis, D.; Dekeyser, C.; Laureys, G.; Van Wijmeersch, B.; Popescu, V.; Dhaene, T.; et al. Machine-learning-based prediction of disability progression in multiple sclerosis: An observational, international, multi-center study. PLOS Digit. Health 2024, 3, e0000533.
9. Bedi, P.K.; Arumugam, N. Somatosensory and Motor Evoked Potentials as Prognostic Indicator of Walking after Spinal Cord Injury. Int. J. Physiother. 2015, 2, 472–482.
10. Hardmeier, M.; Fuhr, P. Multimodal Evoked Potentials as Candidate Prognostic and Response Biomarkers in Clinical Trials of Multiple Sclerosis. J. Clin. Neurophysiol. 2021, 38, 171–180.
11. Draganich, C.; Anderson, D.; Dornan, G.J.; Sevigny, M.; Berliner, J.; Charlifue, S.; Welch, A.; Smith, A. Predictive modeling of ambulatory outcomes after spinal cord injury using machine learning. Spinal Cord 2024, 62, 446–453.
12. Chrysanthakopoulou, D.C.; Koutsojannis, C. Machine Learning Algorithms Introduce Evoked Potentials as Biomarkers for the Expanded Disability Status Scale Prognosis of Multiple Sclerosis Patients. Cureus 2025, 17, e80335. Available online: https://www.cureus.com/articles/341963 (accessed on 10 April 2025).
13. Nhu, D.; Liu, J.; Chang, R.; Thom, D.; Chen, Z.; Nazem-Zadeh, M.; Anderson, A.; Barnard, S.; French, J.; Kwan, P.; et al. Predicting Antiseizure Medication Outcomes in Early Diagnosed Epilepsy: A Multimodal Framework Using EEG, MRI, and Clinical Data. medRxiv 2025.
14. Kallmann, B.A.; Toyka, K.V.; Fackelmann, S.; A Kallmann, B.; Reiners, K. Early abnormalities of evoked potentials and future disability in patients with multiple sclerosis. Mult. Scler. J. 2006, 12, 58–65.
15. Chrysanthakopoulou, D.; Matzaroglou, C.; Trachani, E.; Koutsojannis, C. Machine Learning Introduces Electrophysiology Assessment as the Best Predictor for the Recovery Prognosis of Spinal Cord Injury Patients. Appl. Sci. 2025, 15, 4578.
16. Polman, C.H.; Reingold, S.C.; Banwell, B.; Clanet, M.; Cohen, J.A.; Filippi, M.; Fujihara, K.; Havrdova, E.; Hutchinson, M.; Kappos, L.; et al. Diagnostic criteria for multiple sclerosis: 2010 revisions to the McDonald criteria. Ann. Neurol. 2011, 69, 292–302.
17. Rocca, M.A.; Margoni, M.; Battaglini, M.; Eshaghi, A.; Iliff, J.; Pagani, E.; Preziosa, P.; Storelli, L.; Taoka, T.; Valsasina, P.; et al. Emerging Perspectives on MRI Application in Multiple Sclerosis: Moving from Pathophysiology to Clinical Practice. Radiology 2023, 307, e221512.
18. Denissen, S.; Chén, O.Y.; De Mey, J.; De Vos, M.; Van Schependom, J.; Sima, D.M.; Nagels, G. Towards Multimodal Machine Learning Prediction of Individual Cognitive Evolution in Multiple Sclerosis. J. Pers. Med. 2021, 11, 1349.
19. Hakansson, S.; Tuci, M.; Bolliger, M.; Curt, A.; Jutzeler, C.; Bruning, S. Data-driven prediction of spinal cord injury recovery: An exploration of current status and future perspectives. Exp. Neurol. 2024, 380, 114913.
20. Maki, S.; Furuya, T.; Inoue, T.; Yunde, A.; Miura, M.; Shiratani, Y.; Nagashima, Y.; Maruyama, J.; Shiga, Y.; Inage, K.; et al. Machine Learning Web Application for Predicting Functional Outcomes in Patients With Traumatic Spinal Cord Injury Following Inpatient Rehabilitation. J. Neurotrauma 2024, 41, 1089–1100.
21. Small, S.L. Precision neurology. Ageing Res. Rev. 2025, 104, 102632.
22. Jutzeler, C.R.; Streijger, F.; Aguilar, J.; Shortt, K.; Manouchehri, N.; Okon, E.; Hupp, M.; Curt, A.; Kwon, B.K.; Kramer, J.L. Sensorimotor plasticity after spinal cord injury: A longitudinal and translational study. Ann. Clin. Transl. Neurol. 2018, 6, 68–82.
23. SfN Proceedings. Machine Learning Analysis of SSEPs for Intraoperative Monitoring in SCI Models. Neurosci. Abstr. 2025, 2025, PSTR123.45.
24. Cui, H.; Wang, Y.; Li, G.; Huang, Y.; Hu, Y. Exploration of Cervical Myelopathy Location from Somatosensory Evoked Potentials Using Random Forests Classification. IEEE Trans. Neural Syst. Rehabil. Eng. 2019, 27, 2254–2262.
25. Yperman, J.; Popescu, V.; Van Wijmeersch, B.; Becker, T.; Peeters, L. Motor evoked potentials for multiple sclerosis: A multiyear follow-up dataset. Sci. Data 2022, 16, 207.
26. Yperman, J.; Becker, T.; Valkenborg, D.; Popescu, V.; Hellings, N.; Wijmeersch, B.V.; Peeters, L.M. Machine learning analysis of motor evoked potential time series to predict disability progression in multiple sclerosis. BMC Neurol. 2020, 20, 105.
27. Yoo, H.-J.; Lee, K.-S.; Koo, B.; Yong, C.-W.; Kim, C.-W. Deep Learning-Based Prediction Model for Gait Recovery after a Spinal Cord Injury. Diagnostics 2024, 14, 579.

Figure 1. Forest Plot of Pooled Accuracy for Machine Learning Models Using Evoked Potentials in Spinal Cord Injury and Multiple Sclerosis Prognosis. The pooled accuracy across five studies (N = 583) is 77.7% (95% CI, 75.1–80.3%; I² = 57%). Squares represent study estimates, horizontal lines indicate 95% confidence intervals, and the diamond denotes the pooled estimate. Study weights are shown in tooltips [12,15,24,26,27].

Figure 2. Forest Plot of Pooled Area Under the Curve (AUC) for Machine Learning Models Using Evoked Potentials in Spinal Cord Injury and Multiple Sclerosis Prognosis. The pooled AUC across five studies (N = 583) is 0.82 (95% CI, 0.79–0.85; I² = 55%). Squares represent study estimates, horizontal lines indicate 95% confidence intervals, and the diamond denotes the pooled estimate. Study weights are shown in tooltips [12,15,24,26,27].

Table 1. Characteristics of included studies.

Study	Population	EP Type	ML Model	Outcome	Accuracy (%)	AUC
Chrysanthakopoulou et al., 2025 [12]	MS human cohort, n = 125	SSEP	Random Forest	EDSS progression	75.0 (70.2–79.8)	0.78 (0.73–0.83)
Chrysanthakopoulou et al., 2025 [15]	SCI human cohort, n = 123	SSEP	Random Forest	ASIA recovery	83.0 (78.4–87.6)	0.87 (0.83–0.91)
Cui et al., 2019 [24]	SCI rat model, n = 32	SSEP	Random Forest	Injury location	84.7 (79.9–89.5)	0.85 (0.80–0.90) *
Yperman et al., 2020 [26]	MS human cohort (RRMS), n = 223	MEP	Random Forest	EDSS progression	72.6 (66.6–78.6) *	0.76 (0.69–0.83)
Yoo et al., 2024 [27]	SCI human cohort, n = 80	SSEP	Deep Learning	Gait recovery (FAC-DC)	82.5 (77.3–87.7)	0.85 (0.81–0.89)

* Accuracy for Yperman et al. (2020) and AUC for Cui et al. (2019) are estimated, as reported in the manuscript [24,25,26].
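
The random-effects pooling behind the summary estimates can be illustrated with a minimal Python sketch applied to the Accuracy column of Table 1. Several points are assumptions rather than details taken from the manuscript: the DerSimonian-Laird estimator is used for τ², the values in parentheses are treated as symmetric 95% confidence intervals from which standard errors are back-calculated, and pooling is done on the raw percentage scale. The function name `dersimonian_laird` and the row labels are hypothetical. Because the estimator and scale actually used are not specified in this section, the output is illustrative and should not be expected to reproduce the pooled values reported above.

```python
import math

Z95 = 1.959964  # standard normal quantile for a two-sided 95% interval

# (label, estimate in %, lower bound, upper bound), copied from Table 1.
# Assumption: the bounds in parentheses are 95% confidence intervals.
ACCURACY = [
    ("Chrysanthakopoulou 2025, MS", 75.0, 70.2, 79.8),
    ("Chrysanthakopoulou 2025, SCI", 83.0, 78.4, 87.6),
    ("Cui 2019, rat model", 84.7, 79.9, 89.5),
    ("Yperman 2020, MS", 72.6, 66.6, 78.6),
    ("Yoo 2024, SCI", 82.5, 77.3, 87.7),
]


def dersimonian_laird(rows):
    """Pool (label, estimate, lower, upper) rows with a random-effects model."""
    k = len(rows)
    if k < 2:
        raise ValueError("at least two studies are needed to estimate tau^2")
    estimates = [est for _, est, _, _ in rows]
    # Back-calculate each within-study variance from the CI width.
    variances = [((hi - lo) / (2 * Z95)) ** 2 for _, _, lo, hi in rows]
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect mean and Cochran's Q (heterogeneity statistic).
    fixed = sum(w * y for w, y in zip(weights, estimates)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, estimates))
    # Method-of-moments estimate of the between-study variance tau^2.
    c = sum_w - sum(w * w for w in weights) / sum_w
    tau2 = max(0.0, (q - (k - 1)) / c)
    # I^2: share of total variability attributed to heterogeneity, in percent.
    i2 = 100.0 * max(0.0, (q - (k - 1)) / q) if q > 0 else 0.0
    # Random-effects weights add tau^2 to every within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    pooled = sum(w * y for w, y in zip(re_weights, estimates)) / sum(re_weights)
    se = math.sqrt(1.0 / sum(re_weights))
    return {
        "pooled": pooled,
        "ci": (pooled - Z95 * se, pooled + Z95 * se),
        "tau2": tau2,
        "i2": i2,
    }


# Pool all five studies, then repeat without the rat model as a sensitivity check.
all_studies = dersimonian_laird(ACCURACY)
human_only = dersimonian_laird([row for row in ACCURACY if "rat" not in row[0]])

for name, res in (("All studies", all_studies), ("Human only", human_only)):
    print(f"{name}: {res['pooled']:.1f}% "
          f"(95% CI {res['ci'][0]:.1f} to {res['ci'][1]:.1f}), "
          f"tau^2 = {res['tau2']:.2f}, I^2 = {res['i2']:.0f}%")
```

With only five studies, τ² rests on four degrees of freedom, so small changes in any single study's estimate or interval can shift both τ² and I² noticeably. The same function can be applied to the AUC column by replacing the input rows.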

Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

© 2025 by the authors. Published by MDPI on behalf of the Swiss Federation of Clinical Neuro-Societies. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Koutsojannis, C.; Chrysanthakopoulou, D. Predictive Performance of Machine Learning with Evoked Potentials for SCI and MS Prognosis: A Meta-Analysis. Clin. Transl. Neurosci. 2025, 9, 26. https://doi.org/10.3390/ctn9020026
