Article

Development of Plasma Protein Classification Models for Alzheimer’s Disease Using Multiple Machine Learning Approaches

1 Department of Surgery, Massachusetts General Hospital and Harvard Medical School, 55 Fruit St., Boston, MA 02114, USA
2 Neurochemistry Laboratory, Department of Psychiatry, Massachusetts General Hospital and Harvard Medical School, 149 13th St., Charlestown, MA 02129, USA
3 Department of Neurology, Duke University School of Medicine, 3116 N. Duke St., Durham, NC 27704, USA
4 Duke-UNC Alzheimer’s Disease Research Center, 2424 Erwin Rd, Durham, NC 27705, USA
5 Department of Bioengineering, University of Pennsylvania, 210 South 33rd Street, Philadelphia, PA 19104, USA
6 Department of Neurology, Massachusetts General Hospital and Harvard Medical School, 149 13th St., Charlestown, MA 02129, USA
7 Department of Neurology, Brigham and Women’s Hospital and Harvard Medical School, 60 Fenwood Rd., Boston, MA 02115, USA
* Author to whom correspondence should be addressed.
Int. J. Mol. Sci. 2025, 26(23), 11673; https://doi.org/10.3390/ijms262311673
Submission received: 20 September 2025 / Revised: 27 October 2025 / Accepted: 30 October 2025 / Published: 2 December 2025
(This article belongs to the Special Issue Advances in Molecular Mechanisms of Neurodegenerative Diseases)

Abstract

Alzheimer’s Disease (AD) management is challenging due to limitations in detection methods. Currently, cerebrospinal fluid (CSF) biomarker testing involves assessing β-amyloid (Aβ) and phosphorylated tau proteins. The lumbar puncture procedure to obtain CSF is invasive and sometimes causes significant anxiety in patients. In contrast, plasma biomarkers would allow rapid, accurate, and cost-effective diagnosis, while minimizing invasiveness and discomfort. Using a dataset involving 120 plasma proteins from clinically diagnosed AD patients versus cognitively normal subjects, we developed classification models by applying various machine learning algorithms (EBlasso, EBEN, XGBoost, LightGBM, TabNet, and TabPFN) to plasma proteomic measurements. Gene ontology and pathway enrichment analyses and a literature review were used to evaluate the potential relevance of the identified biomarkers to AD-related mechanisms. The identified biomarkers were also evaluated for enrichment of aging-related biomarkers. The models developed yielded high AUROC and accuracy, mostly >0.9. Proteins selected as predictors by all the models included Angiopoietin-2 (ANG-2), epidermal growth factor (EGF), Interleukin 1α (IL-1α), and platelet-derived growth factor subunit B (PDGF-BB). Ample previous literature supported their relevance in AD. The pool of all the biomarkers identified was significantly enriched with known aging-related biomarkers (p = 0.040). Applying cutting-edge algorithms is expected to be advantageous for developing AD prediction models with plasma proteomic data, and future large studies that externally validate the constructed models in other populations will be important for assessing their generalizability. The proteins uncovered may represent novel preventative or therapeutic targets.

1. Introduction

Alzheimer’s Disease (AD) cases in the US are currently estimated to exceed 6 million and are projected to increase to a staggering 13.8 million by 2060 [1]. AD is the most common basis for dementia and the sixth leading cause of death [1,2]. Additionally, it imposes an enormous economic burden on society, as AD-associated healthcare costs were estimated to be $321 billion in 2022, and are expected to surge to above $1 trillion by 2050 [1]. These figures highlight the need to devise new strategies for AD management as an urgent public health priority area.
One major challenge in AD management is the dearth of effective and rapid detection methods. Currently, positron emission tomography (PET) imaging and cerebrospinal fluid (CSF) biomarkers, which assess β-amyloid (Aβ) and phosphorylated tau peptides, are used to monitor and diagnose AD [3]. The lumbar puncture procedure for CSF collection is invasive, can cause significant anxiety, and can be dangerous for specific groups of patients, such as those with structural brain lesions, thrombocytopenia, or conditions necessitating anticoagulation treatment [4]. Moreover, PET imaging is costly and labor intensive, and PET centers are more common at larger hospitals, which limits access. These limitations hinder the widespread adoption of PET imaging and CSF biomarkers, underscoring the urgent need for novel biomarkers that can diagnose AD rapidly, accurately, and cost-effectively while reducing invasiveness and discomfort to patients.
Plasma, which can even be collected as part of routine laboratory work in various clinical settings, such as primary care or geriatrician offices, offers an attractive solution for minimally invasive and cost-effective biomarkers to facilitate the wide adoption of early screening and longitudinal monitoring. The measurement of various Aβ and phosphorylated tau peptides in plasma or serum, similar to those measured in CSF, or other markers of neurodegeneration, has been reported in a large number of studies [5,6,7,8,9,10,11,12,13,14,15,16,17,18,19]. Commonly tested peptides include Aβ1–42, phosphorylated-Tau (p-Tau) at various residues, and total-Tau (t-Tau). Recently, the Lumipulse G plasma p-Tau217/Aβ1–42 ratio blood test was shown to detect abnormal Aβ- and Tau-positron emission tomography (PET) with high accuracy [20] and became the first blood test for AD diagnosis to be approved by the U.S. Food and Drug Administration [21].
Other relatively recent studies have shown that brain-derived p-Tau correlates more specifically with Tau-PET and cognition than t-Tau does [18], and that the Tau microtubule-binding region (MTBR) containing residue 243 (MTBR-tau243) detects insoluble Tau aggregates in more advanced stages of AD better than common p-Tau measures [22]. These findings suggest that additional studies to determine which specific Aβ and Tau peptides best detect AD pathology may also be beneficial. Further challenges with approaches based on plasma Aβ and Tau include difficulty in establishing reference ranges and measures becoming elevated for reasons other than AD, including common comorbidities such as prior myocardial infarction, stroke, or chronic kidney disease [14,23,24]. Differences by race have also been reported [14]. Moreover, AD is considered to be a heterogeneous condition whose pathogenesis involves additional pathways beyond the peptides previously known to be related to neurofibrillary tangles (NFTs) [25,26]. Consequently, additional studies that investigate novel plasma AD biomarkers more broadly may still be needed to improve the accuracy of detection and elucidate previously unknown molecular mechanisms, or to further understand the contribution of previously implicated mechanisms.
Various recent machine learning (ML) algorithms provide effective means for analyzing large molecular datasets, yet many have not been applied to AD plasma proteomic datasets. AD proteomic ML studies have mostly evaluated CSF [27,28,29,30,31] or brain [29,32,33,34] rather than plasma, and these sample types do not allow for clinically useful, non-invasive biomarkers. Previous AD plasma proteomic ML studies used predictive analysis of microarrays (PAM) [35], random forest [36], or support vector machine (SVM) [37,38,39], which are older algorithms that lack interpretability. A more recent study used the least absolute shrinkage and selection operator (Lasso) to develop accurate prediction models for the prognosis of mild cognitive impairment (MCI) patients who progress to dementia [40], and another recent study applied Light Gradient Boosting Machine (LightGBM) to classify AD versus cognitively normal status [41]. These studies evaluated both proteins alone and proteins combined with clinical information associated with AD and its pathological outcome (demographics and cognition). A recent study profiled thousands of plasma proteins from a large number of samples with the SomaScan 7k platform and also used Lasso to develop prediction models for AD clinical status as well as AD biomarker status [42]. Another recent study using a large, harmonized dataset derived from the SomaScan platform developed prediction models for APOE ε4 status more widely across AD, PD, FT and ALS, rather than developing prediction models for AD clinical outcome, and evaluated correlations in aging across different organs [43]. Other studies that developed AD classification models from plasma or serum protein measures evaluated a select panel of proteins previously known to be related to NFTs and inflammation [34,35,44,45,46], or proteins chosen by functional network analysis [47], rather than employing an unbiased approach to identify previously unknown AD-associated proteins.
One study evaluating miRNAs [48] used random forest, and another study analyzed plasma metabolites with deep learning, random forest, and eXtreme Gradient Boosting (XGBoost) [49]. Other studies were designed to identify plasma proteins associated with AD rather than to develop classification models [50,51,52]. More recently, additional machine learning algorithms for tabular data have become available, and, therefore, testing cutting-edge algorithms is expected to further improve prediction modeling using plasma proteins.
In this study, we tested various ML algorithms, most of which have not yet been tested in AD plasma proteomic data analysis but may be more advantageous. We first applied the fast empirical Bayesian Lasso (EBlasso) [53] and empirical Bayesian Elastic Net (EBEN) [54], which have been described as improving analysis of data with abundant multicollinearity. We also tested the gradient-boosted decision tree (GBDT) methods XGBoost [55] and LightGBM [56], with SHapley Additive exPlanations (SHAP) [57,58] to allow for interpretable models. Furthermore, we applied TabNet [59] and Tabular Prior-Data Fitted Network (TabPFN) [60,61], representing more recent advancements in interpretable tabular deep learning. In particular, TabPFN is based on a pre-trained foundation model, which is a novel approach to model development. We analyzed the proteomic datasets provided as part of the Supplementary Materials of a previous study [35], which was published many years before these methods were developed. Assessing whether prediction models developed using these newer algorithms improve on the original study’s prediction results is expected to help clarify whether advancements in ML prediction analysis may advance AD detection using plasma proteomics.
Given the advantage of developing interpretable models, we also aimed to evaluate pathways and previous reports of how the proteins identified may be mechanistically related to AD. Moreover, we evaluated whether the biomarkers found to be important for classifying AD are enriched with those shown to be related to aging by being identified as part of aging clock models, differentially regulated by age [62,63,64,65,66,67,68,69,70,71], or among those identified in the Human Ageing Genomic Resources (HAGR) database (GenAge, CellAge and cell senescence signatures) [72,73].

2. Results

2.1. Applying Various ML Algorithms to Previously Generated Plasma Proteomic Data Yielded Highly Accurate Classification Models

The plasma proteomic dataset provided as Supplementary Materials by a previous study [35] included measures of 120 plasma proteins from confirmed Alzheimer’s Disease (AD) and cognitively normal (CN) subjects. There were 83 subjects included in the training set (AD, n = 43 and CN, n = 40) and 81 in the test set (AD, n = 42 and CN, n = 39). The original study performed predictive analysis of microarrays (PAM) to develop AD outcome classification models using the training set, then assessed their performance in the test set. This model included 18 proteins and provided 89% accuracy in both the training and test sets [35]. Our objective was to utilize the same dataset to test whether newer and interpretable algorithms may improve prediction in the original dataset, and to compare the resulting models with other published models (Table 1). These algorithms included empirical Bayesian Lasso (EBlasso) [53] and empirical Bayesian Elastic Net (EBEN) [54], based on penalized regression; eXtreme Gradient Boosting (XGBoost) [55] and Light Gradient Boosting Machine (LightGBM) [56] with Bayesian optimization, based on GBDT; and TabNet [59] and TabPFN [60], based on tabular deep learning.
The AD classification model developed with EBlasso included seven protein predictors and yielded a decent area under the ROC curve (AUROC) [95% confidence interval (CI)] in both the training and test sets (0.952 [0.905–1.000] and 0.971 [0.943–1.000], respectively) (Figure 1A,B; Table 2). The EBEN model included 9 proteins and also had a decent AUROC in both the training and test sets (0.966 [0.932–0.999] and 0.926 [0.874–0.978], respectively) (Figure 1C,D; Table 2). The XGBoost model yielded improved AUROC in the training set with comparable measures in the test set (0.999 [0.996–1.000] and 0.965 [0.922–1.000], respectively) (Figure 1E,F; Table 2). The LightGBM model yielded high AUROC in both training and test sets (1.000 [0.993–1.000] and 0.980 [0.955–1.000], respectively) (Figure 1G,H; Table 2), suggesting that GBDT may improve prediction. The TabNet model also had decent AUROC in both the training and test sets (0.935 [0.884–0.982] and 0.939 [0.876–1.000], respectively) (Figure 1I,J; Table 2), and the TabPFN model, which represents a cutting-edge approach to model development based on a pre-trained foundation model, also yielded high AUROC in both the training and test sets (1.000 [1.000–1.000] and 0.979 [0.956–1.000], respectively) (Figure 1K,L; Table 2).
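As an illustration of how an AUROC and its 95% CI can be obtained from predicted scores, the following minimal sketch computes the AUROC as the probability that a randomly chosen AD subject receives a higher score than a randomly chosen CN subject, and estimates a percentile bootstrap interval. The interval method behind the values reported above is not stated in this section, so the bootstrap is an assumption, and the function names, labels, and scores are hypothetical toy values rather than study data.

```python
import random


def auroc(labels, scores):
    """AUROC as P(score of a positive > score of a negative); ties count as 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUROC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(labels)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [labels[i] for i in idx]
        # Skip resamples that contain only one class; AUROC is undefined there.
        if 0 < sum(ys) < n:
            stats.append(auroc(ys, [scores[i] for i in idx]))
    stats.sort()
    lo_idx = int((alpha / 2) * n_boot)
    hi_idx = min(n_boot - 1, int((1 - alpha / 2) * n_boot))
    return stats[lo_idx], stats[hi_idx]


# Hypothetical toy data: 1 = AD, 0 = CN; scores are predicted probabilities of AD.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.95, 0.80, 0.70, 0.65, 0.40, 0.35, 0.20, 0.10]
point = auroc(labels, scores)
low, high = bootstrap_ci(labels, scores, n_boot=500)
print(f"AUROC = {point:.3f} [{low:.3f}-{high:.3f}]")
```

With training and test sets of about 80 subjects each, bootstrap intervals of this kind tend to be wide, which is one reason upper bounds of 1.000 are common in small datasets.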
Each of the models yielded relatively high accuracy, sensitivity/recall, specificity, positive predictive value (PPV)/precision, and negative predictive value (NPV), with the LightGBM model generally showing the best test set metrics (Table 1). Therefore, despite using the same dataset, newer approaches for developing AD versus CN classification models, especially EBlasso, XGBoost, LightGBM, and TabPFN, showed improvement over the originally reported model based on the older PAM algorithm [35] (Table 1). These models also showed similar or higher performance metrics compared to other previously published models, although direct comparisons are not possible given differences in the datasets across studies.
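For reference, the metrics named above follow directly from the four cells of a confusion matrix. The sketch below shows the standard definitions; the function name is hypothetical, and the example counts are invented to match only the test set sizes (AD, n = 42 and CN, n = 39), not the values in Table 1.

```python
def classification_metrics(tp, fp, tn, fn):
    """Standard binary classification metrics, with AD as the positive class.

    Assumes every denominator is non-zero (both classes present and
    at least one prediction of each kind).
    """
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),  # recall
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),          # precision
        "npv": tn / (tn + fn),
    }


# Hypothetical counts for a test set of 42 AD and 39 CN subjects.
metrics = classification_metrics(tp=40, fp=3, tn=36, fn=2)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```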
To assess how each of the proteins included in the various models contributes to AD outcome prediction, we examined the coefficient estimates for the EBlasso and EBEN models, the mean Shapley Additive Explanations (SHAP) values for the XGBoost, LightGBM, and TabPFN models, and the feature importance for the TabNet model. The proteins selected in the EBlasso model included Angiopoietin-2 (ANG-2/ANGPT-2), Interleukin 1 alpha (IL-1α), Epidermal growth factor (EGF), Interleukin 3 (IL-3), Interleukin 11 (IL-11), Platelet-derived growth factor subunit B (PDGF-BB), and Tumor necrosis factor-alpha (TNF-α) (Figure 1B; Supplementary Table S1). The EBEN model additionally selected B lymphocyte chemoattractant (BLC) and Macrophage colony-stimulating factor (M-CSF/CSF1) (Figure 1D; Supplementary Table S1). There were 36 proteins with a mean SHAP value above 0 in the XGBoost model (Figure 1F; Supplementary Table S2), and 20 such proteins in the LightGBM model (Figure 1H; Supplementary Table S2). The ten proteins with the largest mean SHAP values in the XGBoost and LightGBM models also included ANG-2/ANGPT-2, IL-1α, IL-11, and PDGF-BB, as well as other interleukins, granulocyte-colony stimulating factor (G-CSF), and Monocyte chemotactic protein-3 (MCP-3/CCL7). The TabNet model included 26 proteins with feature importance above 0.01 (Figure 1J; Supplementary Table S3), including ANG-2/ANGPT-2, EGF, IL-1α, and PDGF-BB, similar to the other models, in addition to other proteins. The TabPFN model included 26 proteins with mean SHAP values above 0.01 (Figure 1L; Supplementary Table S4), which also included ANG-2/ANGPT-2, EGF, IL-1α, and PDGF-BB, among others.
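The thresholding of mean SHAP values described above can be sketched as follows, assuming that "mean SHAP value" refers to the mean absolute SHAP value per feature across subjects, which is the usual summary but is not confirmed in this section. The per-subject SHAP matrix, feature names, and helper names are hypothetical; in practice the matrix would come from an explainer applied to a fitted model.

```python
def mean_abs_shap(shap_rows, feature_names):
    """Mean absolute SHAP value per feature; rows are subjects, columns are features."""
    n = len(shap_rows)
    return {
        name: sum(abs(row[j]) for row in shap_rows) / n
        for j, name in enumerate(feature_names)
    }


def features_above(importance, threshold):
    """Features whose importance exceeds the threshold, sorted from largest to smallest."""
    kept = [(name, value) for name, value in importance.items() if value > threshold]
    return sorted(kept, key=lambda item: item[1], reverse=True)


# Hypothetical SHAP values for three proteins in four subjects.
feature_names = ["protein_A", "protein_B", "protein_C"]
shap_rows = [
    [0.20, -0.10, 0.0],
    [-0.40, 0.10, 0.0],
    [0.30, -0.05, 0.0],
    [-0.30, 0.15, 0.0],
]
importance = mean_abs_shap(shap_rows, feature_names)
selected = features_above(importance, threshold=0.01)
print([name for name, _ in selected])
```

Using the absolute value matters because positive and negative contributions would otherwise cancel, so a protein that strongly pushes predictions in both directions could appear unimportant.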

2.2. Proteins Found to Be Important for Prediction Across Different Models Are Associated with Various Functions and Pathways

Overlaps in proteins identified as important predictors for AD across the six models were evaluated. There were four proteins, ANG-2/ANGPT-2, EGF, IL-1α, and PDGF-BB, included across all the models (Figure 2A). Furthermore, five of the models included IL-3, IL-11, M-CSF/CSF1, and TNF-α, four of the models included IL-1RA/IL1RN, and three of the models included CKβ-8-1/CCL23, G-CSF/CSF3, IL-1β, and I-TAC/CXCL11.
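The overlap analysis amounts to counting, for each protein, how many models selected it. The sketch below illustrates this with the EBlasso and EBEN selections listed in Section 2.1 only, since the full selections of the other four models are given in the Supplementary Tables rather than in the text; the function names are hypothetical.

```python
from collections import Counter


def selection_frequency(selected_by_model):
    """Count how many models selected each protein."""
    counts = Counter()
    for proteins in selected_by_model.values():
        counts.update(set(proteins))  # set() guards against duplicate entries
    return counts


def selected_by_at_least(selected_by_model, k):
    """Proteins selected by at least k models, in alphabetical order."""
    counts = selection_frequency(selected_by_model)
    return sorted(p for p, c in counts.items() if c >= k)


eblasso = {"ANG-2", "IL-1α", "EGF", "IL-3", "IL-11", "PDGF-BB", "TNF-α"}
models = {
    "EBlasso": eblasso,
    "EBEN": eblasso | {"BLC", "M-CSF"},  # EBEN additionally selected BLC and M-CSF
}
shared = selected_by_at_least(models, k=2)
print(len(shared), shared)
```

Extending the dictionary with the four remaining models and varying `k` from 3 to 6 would yield the groupings reported above.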
To better characterize the molecular pathways associated with these 13 commonly found proteins, we evaluated significantly enriched Gene Ontology (GO) Biological Processes (BP) (Figure 2B; Supplementary Table S5) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway terms (Figure 2C; Supplementary Table S6). We found GO BP terms related to key cellular processes, including cell proliferation, division, and migration; terms related to relevant signaling pathways, including peptidyl-tyrosine phosphorylation, mitogen-activated protein kinase (MAPK), and cytokine signaling; as well as terms related to immune cell functions, including chemotaxis. KEGG pathway terms related to various signaling cascades related to cytokine and inflammatory responses, and cell proliferation and survival, including MAPK, phosphatidylinositol 3-kinase (PI3K)/protein kinase B (AKT), Janus kinase/signal transducer and activator of transcription (JAK/STAT), Ras-related protein 1 (Rap1), Ras and TNF signaling were identified, as well as terms related to cytokines, rheumatoid arthritis, and hematopoietic cell lineage. We also constructed a STRING protein–protein interaction network (Figure 2D).
We also performed a literature review to describe previous studies that have found these proteins, or genes encoding them or their transcripts, to be biomarkers associated with AD, as well as mechanistic studies in model organisms and cell culture that demonstrated their potential roles in AD development (Supplementary Table S7). Many studies were found, providing ample evidence demonstrating their roles as biomarkers, or their mechanistic involvement, such as by regulating the blood-brain barrier (BBB), cerebrovascular functions, microglia and astrocyte functions, inflammation, neuronal survival, and Aβ and Tau regulation.

2.3. The AD Biomarkers Identified Show Overrepresentation of Aging-Related Biomarkers

Given the known importance of aging as a major risk factor for AD, we evaluated how each biomarker identified may be related to aging by assessing overlaps with biomarkers that were included in previously published blood aging clock models or found to be differentially regulated by aging, based on the DNA methylome [62,63,64,65], transcriptome [66], and proteome [67,68,69,70,71], or that were included in the Human Ageing Genomic Resources (HAGR) database (GenAge, CellAge, and cell senescence signatures) [72,73]. Consistent with the notion that aging is a major risk factor for AD, we found that the biomarkers were significantly enriched with known aging-related biomarkers and functions (p = 0.040) (Figure 3; Supplementary Table S8).
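One common way to test for this kind of overrepresentation is a one-sided hypergeometric test, sketched below. The statistical test actually used is not stated in this section, so the choice of test is an assumption. Apart from the background of 120 measured proteins, every count in the example is hypothetical and is not intended to reproduce the reported p-value.

```python
from math import comb


def hypergeom_upper_tail(k, K, n, N):
    """P(X >= k) for X ~ Hypergeometric.

    N: background size (all measured proteins)
    K: aging-related proteins in the background
    n: proteins selected by the models
    k: selected proteins that are aging-related
    """
    upper = min(K, n)
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


# N = 120 measured proteins comes from the dataset description;
# K, n, and k below are hypothetical illustration values.
p_value = hypergeom_upper_tail(k=23, K=40, n=50, N=120)
print(f"one-sided p = {p_value:.4f}")
```

The choice of background matters: using the 120 measured proteins rather than the whole proteome avoids inflating the enrichment, because the panel itself may already be biased toward cytokines and growth factors.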

2.4. The Different Models Showed Variable Performance in Predicting Other Dementia or MCI Progression to AD

In addition to AD and CN plasma protein measures, the original publication [35] provided plasma protein measures of 11 subjects with other dementia (OD). The models developed earlier for AD versus CN classification were tested in this dataset for their potential ability to distinguish between AD and OD. The EBlasso and EBEN models predicted 90.9% of OD subjects correctly as non-AD (Table 3), similarly to the original study. The XGBoost and LightGBM models showed improvement, with 100% of OD subjects correctly predicted as non-AD (Table 3). The TabNet and TabPFN models performed much worse than the other models (Table 3).
Proteomic measures were available for MCI subjects, among whom n = 22 later developed AD (after a mean follow-up of 29.6 months), n = 1 developed frontotemporal dementia (FTD), n = 3 developed Lewy body dementia (LBD), n = 4 developed vascular dementia (VaD), and n = 17 remained as MCI (after a mean follow-up of 27.8 months). The models developed for AD versus CN classification were also tested in this dataset to evaluate their potential for detecting MCI to AD progression versus non-progression, or MCI to OD progression. The EBlasso and EBEN AD versus CN classification models predicted future MCI to AD progression for 86.4% and 90.9% of the subjects who progressed (n = 22), respectively, while the proportions for the other models were lower (77.3% for XGBoost, 72.7% for LightGBM, 63.6% for TabNet, and 81.8% for TabPFN) (Table 4). All models correctly predicted MCI subjects who later developed FTD, LBD, or VaD (n = 8 total for all conditions) as non-AD. The models showed variability in predicting MCI subjects who remained as MCI after follow-up (n = 17): EBlasso, XGBoost, and TabPFN predicted n = 4 as non-AD and n = 13 as AD, LightGBM predicted n = 7 as non-AD and n = 10 as AD, and EBEN and TabNet predicted n = 8 as non-AD and n = 9 as AD. These results may indicate that heterogeneity in the expression levels of the model proteins during the MCI stage makes it challenging to yield accurate predictions. Alternatively, a longer follow-up may be needed to acquire more data on the patients’ long-term outcomes and thereby better assess whether the models indeed distinguish between those who will develop AD, those who will develop other outcomes, and those who will never progress. Furthermore, since the models tested were initially trained to classify AD versus CN, and given the small sample sizes of all the different groups, additional studies to develop models specifically for the prediction of the progression of MCI to AD versus other outcomes are expected to be highly beneficial.
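Because the MCI-to-AD group contains only 22 subjects, each reported percentage corresponds to a small whole number of subjects. The following sketch back-calculates those counts from the percentages quoted above; the counts are inferred here for illustration and are not taken from Table 4, and the helper name is hypothetical.

```python
def implied_count(percent, n):
    """Nearest whole number of subjects corresponding to a reported percentage of n."""
    return round(percent / 100 * n)


N_PROGRESSORS = 22  # MCI subjects who later developed AD
reported_percent = {
    "EBlasso": 86.4,
    "EBEN": 90.9,
    "XGBoost": 77.3,
    "LightGBM": 72.7,
    "TabNet": 63.6,
    "TabPFN": 81.8,
}

implied = {m: implied_count(p, N_PROGRESSORS) for m, p in reported_percent.items()}
for model, count in implied.items():
    # Round-trip check: the inferred count should reproduce the reported percentage.
    check = round(100 * count / N_PROGRESSORS, 1)
    print(f"{model}: {count}/{N_PROGRESSORS} = {check}%")
```

On this reading, the difference between the best and worst models is six subjects (20 versus 14 of 22), which illustrates how strongly a single subject shifts these percentages.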

3. Discussion

The objective of this study was to apply various cutting-edge ML algorithms (EBlasso, EBEN, XGBoost, LightGBM, TabNet, and TabPFN) to a previously published plasma proteomics dataset [35] to develop AD outcome classification models and compare their performance to the original study, as well as with other studies. Each of the models we developed showed high AUROC, sensitivity/recall, specificity, PPV/precision, and NPV. Most of our models achieved improved accuracy compared to the model described in the original study, and given that the same dataset was used, our results support the notion that novel ML algorithms can improve the development of AD plasma biomarkers. Our models also showed comparable or better performance metrics relative to various other previously published plasma proteomic AD outcome prediction models; however, because different datasets were used, direct comparisons are not possible, unlike the comparison with the original study’s results. The training and test sets used to develop and assess the models in this study were relatively small, and the MCI or OD subgroup datasets were particularly small. To better evaluate whether the models are generalizable and not overfit, as well as their potential clinical utility, additional external validation studies with more patients are needed.
Our models identified various proteins to be important predictors for AD versus CN classification (7 proteins for EBlasso, 9 for EBEN, 36 for XGBoost, 20 for LightGBM, 26 for TabNet, and 26 for TabPFN), and further functional enrichment and literature review suggested that they may indeed be relevant to AD outcome. The models developed with EBlasso and EBEN yielded simple, regression-based prediction methods with only a handful of proteins, whereas the XGBoost, LightGBM, TabNet, and TabPFN models yielded more complex models that included more protein predictors. The protein predictor panels can be measured with a multiplex enzyme-linked immunosorbent assay (ELISA) or Luminex assays to facilitate implementation in the clinical setting.
We aimed to understand the underlying molecular mechanisms associated with the proteins conferring prediction by functional enrichment, as well as by literature review. ANG-2/ANGPT-2, EGF, IL-1α, and PDGF-BB were identified by all six algorithms, and each had ample prior studies linking them mechanistically to AD, providing strong evidence for their relevance (see Supplementary Table S7). Other proteins identified by at least three of the six models tested include IL-3, IL-11, M-CSF/CSF1, TNF-α, IL-1RA/IL1RN, CKβ-8-1/CCL23, G-CSF/CSF3, IL-1β, and I-TAC/CXCL11, which also had previous studies demonstrating their potential role in AD outcome (see Supplementary Table S7). Collectively, these proteins were associated with the enrichment of GO BP and KEGG pathway terms related to cytokines and the MAPK, PI3K/AKT, and JAK/STAT pathways, suggesting that these pathways may represent targets for developing novel preventative or therapeutic interventions. These findings are supported by previous works that described the importance of addressing the JAK/STAT pathway [25] and aging-related neuroinflammation [26], and the use of inhibitors against components of the pathways related to EGF [74], PDGF-BB [75], MAPK [75], or relevant cytokine receptors [76,77,78,79], as potential ways to counter AD.
Many previous studies have suggested that ANG-2/ANGPT-2 plays a role in AD pathology, especially through mediating angiogenesis and impacting the BBB. It is well established that ANG-2 promotes angiogenesis through the TEK receptor tyrosine kinase (TIE-2) and inhibition of ANG-1 [80]. ANG-2 gain-of-function mice display increased BBB permeability, with downregulated tight and adherens junctions in endothelial cells, and upregulation of caveolin-1, which promotes permeability [81]. During murine development, Ang-2/Angpt-2, along with vascular endothelial growth factor (VEGF), promotes increased permeability and angiogenesis of the pupillary membrane micro-vessels [82]. In patients, ANG-2 was found to be upregulated in the CSF and correlated with the disease pathology, t-Tau, and p-Tau, as well as BBB permeability [83], showing its potential use as a biomarker. ANG-2 was also upregulated in the postmortem brains of AD patients compared to controls [84], along with TIE-2, notably in endothelial cells, and the expression of both ANG-2 and TIE-2 in BA7 tissue homogenates varied with Braak stage [85], further showing that cerebrovasculature and BBB dysregulation due to ANG-2:TIE-2 signaling is mechanistically relevant. In a murine Amyloid Precursor Protein (APP) transgenic model, Aβ was found to upregulate Ang-2 in the cortex and hippocampus, leading to pathological angiogenesis [86]. In another human AβPP transgenic murine model, Ang-2 and Tie-2 were both shown to be upregulated [87]. In patients, ANG-2 has also been associated with various inflammatory conditions, including autoimmune diseases, sepsis, and acute lung injury [88].
Another study showed that in the presence of Mycoplasma pulmonis infection, and thus under highly inflammatory conditions, Ang-2 binds Tie-2 in an antagonistic manner, suppressing its downstream phosphorylation and promoting forkhead box O1 (FoxO1) activation and increased Ang-2 expression through a positive feedback loop, leading to increased pathological vascularity and vessel permeability; whereas in the absence of the pathogen, Ang-2 activates Tie-2, leading to enlarged vessels without leakiness [89]. Collectively, these studies suggest a role for ANG-2 overexpression in pathological vasculature and BBB permeability that promotes AD pathology.
EGF, which was also among the proteins most relevant to AD classification, has been extensively shown to be relevant to AD pathogenesis. Lower plasma EGF at baseline was shown to predict worse long-term cognitive outcomes in AD cohorts, and plasma EGF was also found to be reduced among MCI and AD patients compared to CN controls [90]. Other studies, on the other hand, have reported elevated plasma EGF in AD patients compared to controls [91,92]. In a murine model overexpressing human APOE4, with increased Aβ1–42 burden and cognitive impairment, lower plasma EGF was also reported, and EGF treatment was shown to alleviate cognitive decline, reduce microbleeds, and increase cerebrovascular coverage [93,94]. Further supporting the notion that EGF may be protective against AD, a study of single brain endothelial cell cultures and triple cultures (endothelial cells, astrocytes, and pericytes) mimicking a microvascular unit showed that exposure to oligomeric Aβ1–42 decreased angiogenesis and increased vessel disruption, whereas EGF treatment prevented these effects, suggesting its use as a potential therapeutic for AD [95]. It is also well established that stimulation of its receptor, EGFR, by Aβ1–42 is one mechanism that promotes AD pathogenesis [96], and pharmacological inhibitors of EGFR have been shown to improve memory-related behavioral outcomes in Drosophila and murine models [74], as well as reduce Tau-induced neuroinflammatory responses, microglia and astrocyte activation, and Tau hyperphosphorylation [97]. It is thus speculated that promoting stimulation of EGFR by EGF, rather than by Aβ1–42, may aid in ameliorating AD outcomes.
Various cytokines were identified, including IL-1α by all six models; TNF-α, IL-3, and IL-11 by five of the models; and IL-1β by three models. IL-1α is a well-established pro-inflammatory cytokine that has been shown to be increased in the serum of AD patients, along with its family member IL-1β, their antagonist IL-1RA, and the soluble receptor sIL-1R1 [98]. IL-1β and IL-1RA were also both identified by at least three models. Plasma IL-1α was also found to be correlated with cognitive tests and Aβ40 [99]. IL-1 was also found to be elevated in postmortem brain samples from AD patients [100]. Polymorphisms of the IL-1α gene have also been associated with AD [101,102,103,104,105]. Moreover, in the acute phase of head injury, which is known to increase subsequent AD risk, the number of IL-1α-expressing activated microglia was found to be increased and correlated with the number of neurons with higher APP expression [106]. Mechanistically, in human astrocytes, IL-1α and IL-1β can lead to increased APP translation, and further assessment with a reporter system showed that they regulate the 5′-untranslated region (UTR) of APP [107]. Additionally, IL-1α was found to upregulate α-disintegrin and metalloproteinase (ADAM)-10 and -17, promoting soluble amyloid precursor protein-α (sAPPα) release [108]. It was further shown that this IL-1α stimulation of sAPPα secretion depended on initial p38 MAPK activation and subsequent MEK and PI3K activation, further elucidating the specific pathways involved [107]. In addition to its role in inflammation and APP processing, IL-1α induces the free radical nitric oxide in primary human astrocytes [109]. Other cytokines and chemokines identified include IL-3, IL-11, M-CSF/CSF1, TNF-α, CKβ-8-1/CCL23, G-CSF/CSF3, IL-1β, and I-TAC/CXCL11. Ample evidence links them to AD, including studies of plasma, serum, and CSF biomarkers and of polymorphisms, as well as mechanistic studies (see Supplementary Table S7).
Taken together, there is ample literature to support the involvement of these biomarkers mechanistically in AD pathogenesis.
PDGF-BB was also identified as an important predictor across all six models. In previous studies, it was found to be elevated in AD patients’ plasma [91], CSF [110], and postmortem frontal cortex samples [111], compared to controls, as well as in the CSF of MCI patients and in an AD murine model [112], compared to controls. It is well established that endothelial-derived, secreted PDGF-BB induces PDGF receptor-β (PDGFRβ) signaling in pericytes, which is crucial for maintaining the integrity of the blood-brain barrier [113,114]. Deficiency in pericytes and age-related vascular damage have been found to lead to neurodegeneration, neuroinflammation, and learning and memory loss [113]. Pericyte proliferation through PDGF-BB signaling has been described to be protective against AD-related neuronal damage in both murine models of Aβ pathology and human cells [74]. PDGF-BB has also been shown to confer protection against neuronal death from ischemic neuronal damage [115,116] and Parkinson’s disease [117,118]. Murine studies also support a mechanism linking aberrant PDGF-BB elevation to BBB permeability and neuroinflammation [119,120]. These studies support the notion that proper regulation of PDGF-BB is important for maintaining the cerebrovasculature and BBB to prevent AD and cognitive decline.
Furthermore, we explored whether the biomarkers identified by the six models constructed were significantly enriched with previously established aging-related biomarkers [62,63,64,65,66,67,68,69,70,71,72,73]. We found a significant overrepresentation of aging-related biomarkers among the AD biomarkers in the models, corroborating the notion that aging is a major risk factor for AD. Our study characterized specific aging-related biomarkers, and additional studies to explore overlaps with biomarkers of other aging-related conditions and the possibility of developing anti-aging interventions to broadly prevent or delay aging-related diseases, are expected to be highly advantageous.
As a method for predicting AD outcome among MCI patients years before diagnosis, all the models were able to detect the majority of those who developed AD after follow-up, although the TabNet model performed notably worse. The models classified MCI patients who later developed FTD, LBD, or VaD as non-AD, although further studies with additional samples, allowing prediction models to be developed specifically for MCI to AD, versus MCI to OD, versus MCI to CN, would be beneficial. Similar to the result of the model described in the original study [35], the models constructed in this study classified many of the MCI patients who remained as MCI after follow-up as AD. The limited accuracy in the prediction of MCI may indicate heterogeneity in the proteins during the MCI stage, or, since it is unknown whether the MCI patients in this study later developed AD or OD after the study’s follow-up, the follow-up time may not have been sufficient. Additional prospective studies are needed to evaluate how early and how well the models can detect future MCI to AD transition. Furthermore, since the models were initially developed for the classification of AD versus CN, additional prospective studies to perform prediction model development among MCI patients confirmed to develop AD, versus those who never develop MCI or develop OD in the long run, would be beneficial. A major challenge with such a study is having sufficient follow-up time to be certain about the patients’ long-term outcomes, with sufficient sample sizes. The sample sizes for these subgroup assessments were particularly small, making it difficult to evaluate the clinical utility of this approach, and thus, future studies are required.
Overall, our results support the notion that modern ML algorithms can be powerful tools for developing classification models for AD, based on plasma protein signatures. Moreover, these algorithms allow us to distinguish AD from OD. Applying these analysis methods to datasets with recent methods for large proteome profiling that allow measuring many more proteins, with recent AD clinical classification, is expected to further improve disease detection. A major limitation of our study is that the dataset was derived from a previously published study without linked demographic or other clinical information, and in a relatively small population. Thus, a future study to validate our models using a large study base is necessary. The utilized dataset had limited MCI subgroup data, impeding the capacity of ML algorithms to discern the MCI subgroup transitioning from MCI to AD and OD. The ability to predict the progression of MCI patients to AD or OD holds significant clinical utility for healthcare providers [19,40]. It would also be advantageous to validate the biomarkers in various settings, including in targeted at-risk patient groups, or widely in the community setting, to evaluate whether they can be beneficial for screening.
Further limitations of this study are that clinical data associated with the proteomic data were not available; AD and cognitively normal status were assigned solely according to the labels provided by the original study; key factors such as the subjects’ APOE ε4 genotype could not be accounted for; and measurements of Aβ and Tau peptides were not available for comparison. Future large external validations in different populations, with fully linked clinical and molecular information, are required to evaluate the generalizability of the models and to better assess potential overfitting. In this study, the standard threshold of 0.5 was used to predict AD outcome, rather than employing threshold optimization, such as Youden’s Index, as this was a small study with approximately equal numbers of cases and non-cases, and the sensitivity and specificity measures were comparable. With a future large study, it would be useful to evaluate how different thresholds, whether chosen to jointly maximize sensitivity and specificity or to prioritize one over the other, perform in the clinical setting. Although more patients may be detected by prioritizing sensitivity, this may also be disadvantageous, given the additional cost of diagnostic evaluation for those found to be at high risk and the worry caused to those who are false positives. Furthermore, future studies to evaluate whether employing simpler measures, such as the p-Tau217/Aβ1–42 ratio, is more cost-effective, or whether employing proteomic information may aid in predicting response to specific therapeutics, would be highly informative.
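The threshold trade-off discussed above can be illustrated with a minimal Python sketch that compares the fixed 0.5 cut-off with the cut-off maximizing Youden’s index (J = sensitivity + specificity − 1). The labels and predicted probabilities below are invented for illustration and are not study data, the function names are hypothetical, and treating a probability equal to the cut-off as a positive call is an assumption.

```python
def sens_spec(y_true, y_prob, threshold):
    """Sensitivity and specificity when probability >= threshold is called AD (1)."""
    tp = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p >= threshold)
    fn = sum(1 for y, p in zip(y_true, y_prob) if y == 1 and p < threshold)
    tn = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p < threshold)
    fp = sum(1 for y, p in zip(y_true, y_prob) if y == 0 and p >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def youden_threshold(y_true, y_prob):
    """Return (threshold, J) maximizing Youden's J over the observed probabilities.

    Ties are resolved in favor of the lowest threshold (higher sensitivity).
    """
    best_t, best_j = None, -1.0
    for t in sorted(set(y_prob)):
        sens, spec = sens_spec(y_true, y_prob, t)
        j = sens + spec - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j


# Illustrative values only (1 = AD, 0 = CN); not taken from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.92, 0.81, 0.64, 0.45, 0.55, 0.30, 0.22, 0.10]

print("at 0.5:", sens_spec(y_true, y_prob, 0.5))
print("Youden:", youden_threshold(y_true, y_prob))
```

Prioritizing sensitivity or specificity instead would amount to replacing J with a weighted objective; which weighting is appropriate is a clinical question that the sketch does not address.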
In the future, if these models can be validated, a rapid assay can be developed to facilitate measuring the biomarkers in the clinical setting. Moreover, molecular characterization to elucidate how the proteins identified may be related to AD pathogenesis, and exploring whether they are drivers of the disease status, may aid in the development of novel therapeutic strategies.

4. Materials and Methods

4.1. Dataset Used

This study used de-identified datasets originally provided as Supplementary Materials by Ray et al. [35] and was determined not to meet the criteria for human subject research by the Mass General Brigham Institutional Review Board. The datasets entailed standardized measures of 120 plasma proteins assessed with filter-based, arrayed sandwich ELISAs [121]. The samples were derived from patients who were diagnosed with Alzheimer’s disease (AD) at the time of plasma collection or matched cognitively normal (CN) controls. According to the original study, the samples from patients diagnosed with clinical symptoms by neurologists were archived at various academic centers specializing in neurological or neurodegenerative diseases (Sun Health Research Institute, Oregon Health Sciences University, UC San Diego, University of Genoa, Göteborg University, University of Wroclaw, UC San Francisco, Stanford University). Although the final clinical labeling information was provided, linked clinical datasets with additional information and APOE genotype information were not available. The original study provided data from 83 subjects as the training set (n = 43 AD and n = 40 CN) and 81 as the test set (n = 42 AD and n = 39 CN). Additional data provided in the original study, which were also analyzed in this study, included the following: (i) subjects diagnosed with other dementia (OD) (n = 11 OD total, with n = 8 Frontotemporal dementia (FTD) and n = 3 Corticobasal degeneration (CBD)); (ii) subjects diagnosed with mild cognitive impairment (MCI) at the time of plasma collection, who developed AD after 2–5 years of follow-up (n = 22 MCI-AD); (iii) subjects diagnosed with MCI at the time of plasma collection, who developed OD (n = 8 MCI-OD total, with n = 1 MCI-FTD, n = 3 MCI-Lewy Body dementia (LBD), and n = 3 MCI-Vascular dementia (VaD)); and (iv) subjects diagnosed with MCI at the time of plasma collection, who remained as MCI patients after 4–6 years of follow-up (n = 17 MCI-MCI).

4.2. Software

Python version 3.12.3 [122] and R version 4.4.2 [123] were used for the analyses described below. The code scripts and output are provided in a GitHub repository (https://github.com/additbio/AD-plasma-inflammatory-biomarkers (accessed on 18 October 2025)).

4.3. Development of the EBlasso, EBEN, XGBoost, LightGBM, TabNet, and TabPFN Models

EBlasso and EBEN were performed in the training set for feature selection using the “EBglmnet” R package version 6.0 [124], where 5-fold CV was performed to identify the hyperparameters that maximize the penalized likelihood function (for EBlasso, α = 1 and λ = 0.1703, and for EBEN, α = 0.300 and λ = 0.118). With the selected proteins (7 for EBlasso and 9 for EBEN), final models were constructed by performing multivariable logistic regression on the training set to estimate the maximum likelihood coefficient estimates with base functions. The coefficient estimates and 95% CI were plotted with the “ggplot2” R package version 3.4.4.
The XGBoost [55] model was constructed using the “xgboost” library version 2.1.1 [125], and the LightGBM [56] model was constructed using the “lightgbm” Python library version 4.5.0 [126]. Using the training set, Bayesian optimization was performed with 5-fold CV to determine the hyperparameter values that maximized the mean test AUROC, using the “scikit-learn” Python library version 1.5.2 [127] and “bayesian-optimization” Python library version 2.0.3 [128] (for XGBoost, learning rate = 0.1215, maximum depth of a tree = 4, gamma = 0.0526, minimum child weight = 3.2028, subsample ratio = 0.8090, column sample ratio = 0.9578, L1 regularization = 0.4681, L2 regularization = 2.7769, number of boosting rounds = 96; and for LightGBM, learning rate = 0.2653, maximum depth of a tree = 5, number of leaves in a tree = 16, minimum data in a leaf = 24, minimum sum of Hessian in a leaf = 0.001, subsample ratio = 0.8262, feature fraction = 0.9200, minimum gain to split = 1.0782, L1 regularization = 0.1931, L2 regularization = 1.3596, number of boosting rounds = 69). For the XGBoost and LightGBM models, SHapley Additive exPlanations (SHAP) scores [57,58] were computed for each protein with the “shap” Python library version 0.46.0 [129]. The XGBoost model included 36 proteins, and the LightGBM model included 20 proteins with a mean absolute SHAP score above 0.
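The protein counts reported for the boosted-tree models rest on the rule “mean absolute SHAP score above 0”. A minimal Python sketch of that selection rule follows, assuming the SHAP values are already available as a samples-by-proteins table. The numbers, the placeholder name “unused_protein”, and the function name are hypothetical, and the “shap” library itself is not called here.

```python
def select_by_mean_abs_shap(shap_rows, feature_names):
    """Return {feature: mean |SHAP|} for features with mean |SHAP| > 0, largest first.

    shap_rows: list of per-sample lists, one SHAP value per feature.
    """
    n_samples = len(shap_rows)
    mean_abs = {
        name: sum(abs(row[j]) for row in shap_rows) / n_samples
        for j, name in enumerate(feature_names)
    }
    kept = {name: value for name, value in mean_abs.items() if value > 0}
    return dict(sorted(kept.items(), key=lambda item: item[1], reverse=True))


# Illustrative SHAP values for two samples; not taken from the study.
features = ["ANG-2", "EGF", "IL-1a", "PDGF-BB", "unused_protein"]
shap_rows = [
    [0.50, -0.25, 0.125, 0.0, 0.0],
    [-0.50, 0.25, -0.125, 0.125, 0.0],
]

selected = select_by_mean_abs_shap(shap_rows, features)
for protein, score in selected.items():
    print(protein, score)
```

A protein that is never used in a split receives a SHAP value of zero for every sample, so this rule effectively lists the proteins that the fitted trees actually use.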
The TabNet model [59] was constructed using the “pytorch_tabnet” library version 4.1.0 [130]. Five-fold CV was performed to tune the hyperparameters that maximized the AUROC with the optimizer in the “pytorch” library version 2.3.1 [131] (mask type = “sparsemax”, width of decision prediction layer = 12, width of attention embedding = 12, number of steps = 4, gamma = 1.3, epsilon = 3.827 × 10⁻¹⁵, momentum = 0.028, sparsity loss coefficient = 8.866 × 10⁻⁵, AMSGrad optimizer with initial learning rate = 0.008 and weight decay = 1 × 10⁻⁵, Reduce on Plateau learning rate scheduler, with minimum learning rate = 0.001, factor = 0.1, patience = 10, batch size = 16, virtual batch size = 8, epochs = 109, seed = 396). The top feature importance was plotted with the “ggplot2” R package version 3.4.4. The TabPFN model [60,61], which does not require hyperparameter tuning, was constructed with the “tabpfn” library version 2.0.6 [132], and SHAP scores were obtained with the “tabpfn_extensions” library version 0.0.4 [133] in Python.
Each final model constructed in the training set was applied to the test set, and predicted probabilities were obtained. The AUROC [95% CI] was calculated and curves were plotted with the “pROC” R package version 1.18.5 [134]. The sensitivity, specificity, PPV, and NPV, and their 95% CIs were calculated using the “epiR” R package version 2.0.78 [135]. A predicted probability threshold cut-off of 0.5 was used when determining AD outcome prediction.
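The test-set evaluation described above can be sketched in a few lines of Python. This is only an illustration of the quantities involved: the study itself used the “pROC” and “epiR” R packages, confidence intervals are not computed here, the labels and probabilities are invented, the function names are hypothetical, and treating a probability of exactly 0.5 as a positive call is an assumption.

```python
def auroc(y_true, y_prob):
    """AUROC as the fraction of (case, non-case) pairs in which the case
    receives the higher predicted probability; ties count as one half."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def metrics_at_threshold(y_true, y_prob, threshold=0.5):
    """Sensitivity, specificity, PPV, and NPV for a fixed probability cut-off."""
    pred = [1 if p >= threshold else 0 for p in y_prob]
    tp = sum(1 for y, yhat in zip(y_true, pred) if y == 1 and yhat == 1)
    fp = sum(1 for y, yhat in zip(y_true, pred) if y == 0 and yhat == 1)
    tn = sum(1 for y, yhat in zip(y_true, pred) if y == 0 and yhat == 0)
    fn = sum(1 for y, yhat in zip(y_true, pred) if y == 1 and yhat == 0)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
    }


# Illustrative values only (1 = AD, 0 = CN); not taken from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.92, 0.81, 0.64, 0.45, 0.55, 0.30, 0.22, 0.10]

print(auroc(y_true, y_prob))
print(metrics_at_threshold(y_true, y_prob))
```

Note that the AUROC does not depend on the cut-off, whereas the other four measures do, which is why the threshold is reported separately.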

4.4. GO and KEGG Pathway Enrichment and Network Analysis of the Overlapping Proteins

A Venn diagram was plotted to display the proteins commonly found across all the models, using the “ggVennDiagram” R package version 1.5.4 [136]. Using the 13 proteins found in at least three models, functional annotation enrichment analysis for Gene Ontology (GO) Biological Process (BP) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway terms was performed with minimum signal and strength settings ≥1.00, and a Protein–Protein Interaction (PPI) Network plot was constructed with the high confidence (0.700) interaction score setting based on curated databases, experimental evidence, text mining, and co-expression (as indicated in the figure), using STRING version 12.0 [137,138].
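The “found in at least three models” filter can be expressed compactly. The following Python sketch counts, for each protein, how many models selected it and keeps those meeting a minimum count. The per-model sets are invented for illustration (the “protein_A” to “protein_D” entries are placeholders) and do not reproduce the actual selections, and the function name is hypothetical.

```python
from collections import Counter


def proteins_in_at_least(model_features, k):
    """Return the sorted proteins selected by at least k of the models."""
    counts = Counter(
        protein
        for selected in model_features.values()
        for protein in set(selected)  # count each protein once per model
    )
    return sorted(protein for protein, n in counts.items() if n >= k)


# Illustrative selections only; not the study's actual per-model results.
shared = {"ANG-2", "EGF", "IL-1a", "PDGF-BB"}
model_features = {
    "EBlasso": shared | {"protein_A"},
    "EBEN": shared | {"protein_A", "protein_B"},
    "XGBoost": shared | {"protein_A", "protein_C"},
    "LightGBM": shared | {"protein_B"},
    "TabNet": shared | {"protein_C"},
    "TabPFN": shared | {"protein_D"},
}

print(proteins_in_at_least(model_features, 3))
print(proteins_in_at_least(model_features, len(model_features)))
```

The second call, with k equal to the number of models, corresponds to the central region of the Venn diagram.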

4.5. Evaluating Overlaps with Aging Models Developed in Blood

Proteins identified to be important for AD prediction were compared with previously published DNA methylome-based [62,63,64,65], transcriptome-based [66], and protein-based [67,68,69,70,71] aging clock models, and previously known aging-related proteins described in the publications [68,70]. The Human Ageing Genomic Resources (HAGR) database (GenAge, CellAge, and cell senescence signatures) [72,73] was also used to identify aging-related genes encoding the relevant proteins. The ENTREZ identification numbers provided in the original AD dataset [35] were used to make comparisons. For studies where an ENTREZ identification number was not provided, the “AnnotationDbi” R package version 1.68.0 [139] with “org.Hs.eg.db” version 3.20.0 [140] was used to find the associated numbers. A hypergeometric test p-value was calculated with the Fisher’s exact test function in the R base “stats” package, with the alternative hypothesis set to “greater” to assess overrepresentation of aging-related biomarkers.
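For a 2 × 2 table, the one-sided (“greater”) Fisher’s exact test p-value equals the upper tail of a hypergeometric distribution, which the following Python sketch computes directly. Only the total of 120 measured proteins comes from the dataset description; the other three counts in the example are hypothetical placeholders, as is the function name, so the printed value is not the p-value reported in this study.

```python
from math import comb


def hypergeom_upper_tail(N, K, n, k):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: proteins measured (the background)
    K: background proteins annotated as aging-related
    n: proteins selected by the models
    k: selected proteins that are aging-related (observed overlap)
    """
    upper = min(K, n)
    numerator = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return numerator / comb(N, n)


# N = 120 follows the dataset description; K, n, and k are hypothetical.
p_value = hypergeom_upper_tail(N=120, K=30, n=40, k=15)
print(p_value)
```

Using the measured panel as the background, rather than the whole proteome, matters here: the test then asks whether aging-related proteins are more frequent among the selected predictors than among all proteins that could have been selected.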

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/ijms262311673/s1.

Author Contributions

Conceptualization, A.T.; Methodology, A.T.; Formal analysis, A.T.; Writing—original draft, A.T.; Writing—review & editing, C.M.C., A.J.L., P.C., S.D. and A.K.; Supervision, A.T., C.M.C., A.J.L., P.C., S.D. and A.K.; Funding acquisition, A.T. and A.K. All authors have read and agreed to the published version of the manuscript.

Funding

This work was supported by the Alzheimer’s Association grant AARG-NTF-1244080 to A.T.

Institutional Review Board Statement

Not applicable.

Informed Consent Statement

Not applicable.

Data Availability Statement

Restrictions apply to the availability of these data. Data were obtained from Ray et al. [35] Supplementary Materials, and the link for their access and code for preprocessing them for this study are available at https://github.com/additbio/AD-plasma-inflammatory-biomarkers with the permission of Ray et al. [35].

Acknowledgments

We acknowledge the Alzheimer’s Association for supporting A.T., and also gratefully acknowledge the support from the Patricia K. Donahoe Surgeon-Scientist Research Program and Huiying Memorial Foundation to A.T. and the Health Science Research Grant from Meiji Yasuda Life Foundation of Health and Welfare to A.K.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Skaria, A. The Economic and Societal Burden of Alzheimer Disease: Managed Care Considerations. 2022. Available online: https://www.ajmc.com/view/the-economic-and-societal-burden-of-alzheimer-disease-managed-care-considerations (accessed on 17 November 2024).
  2. 2024 Alzheimer’s disease facts and figures. Alzheimer’s Dement. 2024, 20, 3708–3821. [CrossRef]
  3. Bouwman, F.H.; Frisoni, G.B.; Johnson, S.C.; Chen, X.; Engelborghs, S.; Ikeuchi, T.; Paquet, C.; Ritchie, C.; Bozeat, S.; Quevenco, F.-C.; et al. Clinical application of CSF biomarkers for Alzheimer’s disease: From rationale to ratios. Alzheimer’s Dement. Diagn. Assess. Dis. Monit. 2022, 14, e12314. [Google Scholar] [CrossRef]
  4. Johnson, D.J.; Richie, M. Lumbar Puncture: Technique, Indications, Contraindications, and Complications in Adults; Wolters Kluwer: Cyber City, India, 2025; Available online: https://www.uptodate.com/contents/lumbar-puncture-technique-contraindications-and-complications-in-adults (accessed on 16 May 2025).
  5. Schindler, S.E.; Bollinger, J.G.; Ovod, V.; Mawuenyega, K.G.; Li, Y.; Gordon, B.A.; Holtzman, D.M.; Morris, J.C.; Benzinger, T.L.S.; Xiong, C.; et al. High-precision plasma β-amyloid 42/40 predicts current and future brain amyloidosis. Neurology 2019, 93, e1647–e1659. [Google Scholar] [CrossRef]
  6. Barthélemy, N.R.; Horie, K.; Sato, C.; Bateman, R.J. Blood plasma phosphorylated-tau isoforms track CNS change in Alzheimer’s disease. J. Exp. Med. 2020, 217, e20200861. [Google Scholar] [CrossRef]
  7. Benussi, A.; Karikari, T.K.; Ashton, N.; Gazzina, S.; Premi, E.; Benussi, L.; Ghidoni, R.; Rodriguez, J.L.; Emeršič, A.; Simrén, J.; et al. Diagnostic and prognostic value of serum NfL and p-Tau181 in frontotemporal lobar degeneration. J. Neurol. Neurosurg. Psychiatry 2020, 91, 960–967. [Google Scholar] [CrossRef]
  8. Janelidze, S.; Mattsson, N.; Palmqvist, S.; Smith, R.; Beach, T.G.; Serrano, G.E.; Chai, X.; Proctor, N.K.; Eichenlaub, U.; Zetterberg, H.; et al. Plasma P-tau181 in Alzheimer’s disease: Relationship to other biomarkers, differential diagnosis, neuropathology and longitudinal progression to Alzheimer’s dementia. Nat. Med. 2020, 26, 379–386. [Google Scholar] [CrossRef]
  9. Karikari, T.K.; Pascoal, T.A.; Ashton, N.J.; Janelidze, S.; Benedet, A.L.; Rodriguez, J.L.; Chamoun, M.; Savard, M.; Kang, M.S.; Therriault, J.; et al. Blood phosphorylated tau 181 as a biomarker for Alzheimer’s disease: A diagnostic performance and prediction modelling study using data from four prospective cohorts. Lancet Neurol. 2020, 19, 422–433. [Google Scholar] [CrossRef]
  10. Mattsson-Carlgren, N.; Janelidze, S.; Palmqvist, S.; Cullen, N.; Svenningsson, A.L.; Strandberg, O.; Mengel, D.; Walsh, D.M.; Stomrud, E.; Dage, J.L.; et al. Longitudinal plasma p-tau217 is increased in early stages of Alzheimer’s disease. Brain 2020, 143, 3234–3241. [Google Scholar] [CrossRef]
  11. Palmqvist, S.; Janelidze, S.; Quiroz, Y.T.; Zetterberg, H.; Lopera, F.; Stomrud, E.; Su, Y.; Chen, Y.; Serrano, G.E.; Leuzy, A.; et al. Discriminative Accuracy of Plasma Phospho-tau217 for Alzheimer Disease vs Other Neurodegenerative Disorders. JAMA 2020, 324, 772–781. [Google Scholar] [CrossRef]
  12. Cullen, N.C.; Leuzy, A.; Palmqvist, S.; Janelidze, S.; Stomrud, E.; Pesini, P.; Sarasa, L.; Allué, J.A.; Proctor, N.K.; Zetterberg, H.; et al. Individualized prognosis of cognitive decline and dementia in mild cognitive impairment based on plasma biomarker combinations. Nat. Aging 2021, 1, 114–123. [Google Scholar] [CrossRef]
  13. Mielke, M.M.; Frank, R.D.; Dage, J.L.; Jeromin, A.; Ashton, N.J.; Blennow, K.; Karikari, T.K.; Vanmechelen, E.; Zetterberg, H.; Algeciras-Schimnich, A.; et al. Comparison of Plasma Phosphorylated Tau Species With Amyloid and Tau Positron Emission Tomography, Neurodegeneration, Vascular Pathology, and Cognitive Outcomes. JAMA Neurol. 2021, 78, 1108–1117. [Google Scholar] [CrossRef] [PubMed]
  14. Mielke, M.M.; Dage, J.L.; Frank, R.D.; Algeciras-Schimnich, A.; Knopman, D.S.; Lowe, V.J.; Bu, G.; Vemuri, P.; Graff-Radford, J.; Jack, C.R.; et al. Performance of plasma phosphorylated tau 181 and 217 in the community. Nat. Med. 2022, 28, 1398–1405, Correction in Nat. Med. 2023, 29, 2954. https://doi.org/10.1038/s41591-022-02066-w. [Google Scholar] [CrossRef] [PubMed]
  15. Moscoso, A.; Grothe, M.J.; Ashton, N.J.; Karikari, T.K.; Rodriguez, J.L.; Snellman, A.; Suárez-Calvet, M.; Zetterberg, H.; Blennow, K.; Schöll, M.; et al. Time course of phosphorylated-tau181 in blood across the Alzheimer’s disease spectrum. Brain 2021, 144, 325–339, Erratum in Brain 2021, 144, e57. https://doi.org/10.1093/brain/awab075. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
  16. Simrén, J.; Leuzy, A.; Karikari, T.K.; Hye, A.; Benedet, A.L.; Lantero-Rodriguez, J.; Mattsson-Carlgren, N.; Schöll, M.; Mecocci, P.; Vellas, B.; et al. The diagnostic and prognostic capabilities of plasma biomarkers in Alzheimer’s disease. Alzheimer’s Dement. 2021, 17, 1145–1156. [Google Scholar] [CrossRef]
  17. Ashton, N.J.; Janelidze, S.; Mattsson-Carlgren, N.; Binette, A.P.; Strandberg, O.; Brum, W.S.; Karikari, T.K.; González-Ortiz, F.; Di Molfetta, G.; Meda, F.J.; et al. Differential roles of Aβ42/40, p-tau231 and p-tau217 for Alzheimer’s trial selection and disease monitoring. Nat. Med. 2022, 28, 2555–2562. [Google Scholar] [CrossRef]
  18. Gonzalez-Ortiz, F.; Turton, M.; Kac, P.R.; Smirnov, D.; Premi, E.; Ghidoni, R.; Benussi, L.; Cantoni, V.; Saraceno, C.; Rivolta, J.; et al. Brain-derived tau: A novel blood-based biomarker for Alzheimer’s disease-type neurodegeneration. Brain 2023, 146, 1152–1165, Correction in Brain 2023, 146, 1152–1165. https://doi.org/10.1093/brain/awad208. [Google Scholar] [CrossRef] [PubMed] [PubMed Central]
  19. Kivisäkk, P.; Carlyle, B.C.; Sweeney, T.; Trombetta, B.A.; LaCasse, K.; El-Mufti, L.; Tuncali, I.; Chibnik, L.B.; Das, S.; Scherzer, C.R.; et al. Plasma biomarkers for diagnosis of Alzheimer’s disease and prediction of cognitive decline in individuals with mild cognitive impairment. Front. Neurol. 2023, 14, 1069411. [Google Scholar] [CrossRef]
  20. Wang, J.; Huang, S.; Lan, G.; Lai, Y.; Wang, Q.; Chen, Y.; Xiao, Z.; Chen, X.; Bu, X.; Liu, Y.; et al. Diagnostic accuracy of plasma p-tau217/Aβ42 for Alzheimer’s disease in clinical and community cohorts. Alzheimer’s Dement. 2025, 21, e70038. [Google Scholar] [CrossRef]
  21. FDA. FDA Clears First Blood Test Used in Diagnosing Alzheimer’s Disease. Available online: https://www.fda.gov/news-events/press-announcements/fda-clears-first-blood-test-used-diagnosing-alzheimers-disease (accessed on 18 June 2025).
  22. Horie, K.; Salvadó, G.; Barthélemy, N.R.; Janelidze, S.; Li, Y.; He, Y.; Saef, B.; Chen, C.D.; Jiang, H.; Strandberg, O.; et al. CSF MTBR-tau243 is a specific biomarker of tau tangle pathology in Alzheimer’s disease. Nat. Med. 2023, 29, 1954–1963. [Google Scholar] [CrossRef]
  23. Balogun, W.G.; Zetterberg, H.; Blennow, K.; Karikari, T.K. Plasma biomarkers for neurodegenerative disorders: Ready for prime time? Curr. Opin. Psychiatry 2023, 36, 112–118. [Google Scholar] [CrossRef]
  24. Pais, M.V.; Forlenza, O.V.; Diniz, B.S. Plasma Biomarkers of Alzheimer’s Disease: A Review of Available Assays, Recent Developments, and Implications for Clinical Practice. J. Alzheimer’s Dis. Rep. 2023, 7, 355–380. [Google Scholar] [CrossRef] [PubMed]
  25. Rusek, M.; Smith, J.; El-Khatib, K.; Aikins, K.; Czuczwar, S.J.; Pluta, R. The Role of the JAK/STAT Signaling Pathway in the Pathogenesis of Alzheimer’s Disease: New Potential Treatment Target. Int. J. Mol. Sci. 2023, 24, 864. [Google Scholar] [CrossRef] [PubMed]
  26. Eiser, A.R.; Fulop, T. Alzheimer’s Disease Is a Multi-Organ Disorder: It May Already Be Preventable. J. Alzheimer’s Dis. 2023, 91, 1277–1281. [Google Scholar] [CrossRef] [PubMed]
  27. Bellomo, G.; Indaco, A.; Chiasserini, D.; Maderna, E.; Paolini Paoletti, F.; Gaetani, L.; Paciotti, S.; Petricciuolo, M.; Tagliavini, F.; Giaccone, G.; et al. Machine Learning Driven Profiling of Cerebrospinal Fluid Core Biomarkers in Alzheimer’s Disease and Other Neurological Disorders. Front. Neurosci. 2021, 15, 647783. [Google Scholar] [CrossRef]
  28. Gogishvili, D.; Vromen, E.M.; Koppes-den Hertog, S.; Lemstra, A.W.; Pijnenburg, Y.A.L.; Visser, P.J.; Tijms, B.M.; Del Campo, M.; Abeln, S.; Teunissen, C.E.; et al. Discovery of novel CSF biomarkers to predict progression in dementia using machine learning. Sci. Rep. 2023, 13, 6531. [Google Scholar] [CrossRef]
  29. Tandon, R.; Watson, C.M.; Seyfried, N.T.; Mitchell, C.S.; Zhao, L.; Lah, J.J. Machine Learning to Stratify Asymptomatic Alzheimer’s Disease Progression Risk with Cerebrospinal Fluid Biomarkers. Alzheimer’s Dement. 2022, 18, e069445. [Google Scholar] [CrossRef]
  30. Gaetani, L.; Bellomo, G.; Parnetti, L.; Blennow, K.; Zetterberg, H.; Di Filippo, M. Neuroinflammation and Alzheimer’s Disease: A Machine Learning Approach to CSF Proteomics. Cells 2021, 10, 1930. [Google Scholar] [CrossRef]
  31. Ficiarà, E.; Boschi, S.; Ansari, S.; D’Agata, F.; Abollino, O.; Caroppo, P.; Di Fede, G.; Indaco, A.; Rainero, I.; Guiot, C. Machine Learning Profiling of Alzheimer’s Disease Patients Based on Current Cerebrospinal Fluid Markers and Iron Content in Biofluids. Front. Aging Neurosci. 2021, 13, 607858. Available online: https://www.frontiersin.org/articles/10.3389/fnagi.2021.607858 (accessed on 1 September 2023). [CrossRef]
  32. Desaire, H.; Stepler, K.E.; Robinson, R.A.S. Exposing the Brain Proteomic Signatures of Alzheimer’s Disease in Diverse Racial Groups: Leveraging Multiple Data Sets and Machine Learning. J. Proteome Res. 2022, 21, 1095–1104. [Google Scholar] [CrossRef]
  33. Johnson, E.C.B.; Dammer, E.B.; Duong, D.M.; Ping, L.; Zhou, M.; Yin, L.; Higginbotham, L.A.; Guajardo, A.; White, B.; Troncoso, J.C.; et al. Large-scale proteomic analysis of Alzheimer’s disease brain and cerebrospinal fluid reveals early changes in energy metabolism associated with microglia and astrocyte activation. Nat. Med. 2020, 26, 769–780. [Google Scholar] [CrossRef]
  34. Sung, Y.J.; Yang, C.; Norton, J.; Johnson, M.; Fagan, A.; Bateman, R.J.; Perrin, R.J.; Morris, J.C.; Farlow, M.R.; Chhatwal, J.P.; et al. Proteomics of brain, CSF, and plasma identifies molecular signatures for distinguishing sporadic and genetic Alzheimer’s disease. Sci. Transl. Med. 2023, 15, eabq5923. [Google Scholar] [CrossRef]
  35. Ray, S.; Britschgi, M.; Herbert, C.; Takeda-Uchimura, Y.; Boxer, A.; Blennow, K.; Friedman, L.F.; Galasko, D.R.; Jutel, M.; Karydas, A.; et al. Classification and prediction of clinical Alzheimer’s diagnosis based on plasma signaling proteins. Nat. Med. 2007, 13, 1359–1362. [Google Scholar] [CrossRef]
  36. O’Bryant, S.E.; Xiao, G.; Barber, R.; Reisch, J.; Doody, R.; Fairchild, T.; Adams, P.; Waring, S.; Diaz-Arrastia, R.; Texas Alzheimer’s Research Consortium. A serum protein-based algorithm for the detection of Alzheimer disease. Arch. Neurol. 2010, 67, 1077–1081. [Google Scholar] [CrossRef]
  37. Eke, C.S.; Jammeh, E.; Li, X.; Carroll, C.; Pearson, S.; Ifeachor, E. Early Detection of Alzheimer’s Disease with Blood Plasma Proteins Using Support Vector Machines. IEEE J. Biomed. Health Inf. 2021, 25, 218–226. [Google Scholar] [CrossRef]
  38. Ashton, N.J.; Nevado-Holgado, A.J.; Barber, I.S.; Lynham, S.; Gupta, V.; Chatterjee, P.; Goozee, K.; Hone, E.; Pedrini, S.; Blennow, K.; et al. A plasma protein classifier for predicting amyloid burden for preclinical Alzheimer’s disease. Sci. Adv. 2019, 5, eaau7220. [Google Scholar] [CrossRef]
  39. Zhang, F.; Petersen, M.; Johnson, L.; Hall, J.; O’Bryant, S.E. Combination of Serum and Plasma Biomarkers Could Improve Prediction Performance for Alzheimer’s Disease. Genes 2022, 13, 1738. [Google Scholar] [CrossRef] [PubMed]
  40. Kivisäkk, P.; Magdamo, C.; Trombetta, B.A.; Noori, A.; Kuo, Y.K.E.; Chibnik, L.B.; Carlyle, B.C.; Serrano-Pozo, A.; Scherzer, C.R.; Hyman, B.T.; et al. Plasma biomarkers for prognosis of cognitive decline in patients with mild cognitive impairment. Brain Commun. 2022, 4, fcac155. [Google Scholar] [CrossRef] [PubMed]
  41. Guo, Y.; You, J.; Zhang, Y.; Liu, W.-S.; Huang, Y.-Y.; Zhang, Y.-R.; Zhang, W.; Dong, Q.; Feng, J.-F.; Cheng, W.; et al. Plasma proteomic profiles predict future dementia in healthy adults. Nat. Aging 2024, 4, 247–260. [Google Scholar] [CrossRef] [PubMed]
  42. Heo, G.; Xu, Y.; Wang, E.; Ali, M.; Oh, H.S.-H.; Moran-Losada, P.; Anastasi, F.; González Escalante, A.; Puerta, R.; Song, S.; et al. Large-scale plasma proteomic profiling unveils diagnostic biomarkers and pathways for Alzheimer’s disease. Nat. Aging 2025, 5, 1114–1131. [Google Scholar] [CrossRef]
  43. Imam, F.; Saloner, R.; Vogel, J.W.; Krish, V.; Abdel-Azim, G.; Ali, M.; An, L.; Anastasi, F.; Bennett, D.; Pichet Binette, A.; et al. The Global Neurodegeneration Proteomics Consortium: Biomarker and drug target discovery for common neurodegenerative diseases and aging. Nat. Med. 2025, 31, 2556–2566. [Google Scholar] [CrossRef]
  44. O’Bryant, S.E.; Edwards, M.; Johnson, L.; Hall, J.; Villarreal, A.E.; Britton, G.B.; Quiceno, M.; Cullum, C.M.; Graff-Radford, N.R. A blood screening test for Alzheimer’s disease. Alzheimer’s Dement. 2016, 3, 83–90. [Google Scholar] [CrossRef] [PubMed]
  45. Llano, D.A.; Devanarayan, V.; Simon, A.J.; Alzheimer’s Disease Neuroimaging Initiative (ADNI). Evaluation of plasma proteomic data for Alzheimer disease state classification and for the prediction of progression from mild cognitive impairment to Alzheimer disease. Alzheimer Dis. Assoc. Disord. 2013, 27, 233–243. [Google Scholar] [CrossRef] [PubMed]
  46. Morgan, A.R.; Touchard, S.; Leckey, C.; O’Hagan, C.; Nevado-Holgado, A.J.; Barkhof, F.; Bertram, L.; Blin, O.; Bos, I.; Dobricic, V.; et al. Inflammatory biomarkers in Alzheimer’s disease plasma. Alzheimer’s Dement. 2019, 15, 776–787. [Google Scholar] [CrossRef] [PubMed]
  47. Jiang, Y.; Zhou, X.; Ip, F.C.; Chan, P.; Chen, Y.; Lai, N.C.H.; Cheung, K.; Lo, R.M.N.; Tong, E.P.S.; Wong, B.W.Y.; et al. Large-scale plasma proteomic profiling identifies a high-performance biomarker panel for Alzheimer’s disease screening and staging. Alzheimer’s Dement. 2022, 18, 88–102. [Google Scholar] [CrossRef]
  48. Zhao, X.; Kang, J.; Svetnik, V.; Warden, D.; Wilcock, G.; David Smith, A.; Savage, M.J.; Laterza, O.F. A Machine Learning Approach to Identify a Circulating MicroRNA Signature for Alzheimer Disease. J. Appl. Lab. Med. 2020, 5, 15–28. [Google Scholar] [CrossRef]
  49. Stamate, D.; Kim, M.; Proitsi, P.; Westwood, S.; Baird, A.; Nevado-Holgado, A.; Hye, A.; Bos, I.; Vos, S.J.B.; Vandenberghe, R.; et al. A metabolite-based machine learning approach to diagnose Alzheimer-type dementia in blood: Results from the European Medical Information Framework for Alzheimer disease biomarker discovery cohort. Alzheimer’s Dement. 2019, 5, 933–938. [Google Scholar] [CrossRef]
  50. Shi, L.; Buckley, N.J.; Bos, I.; Engelborghs, S.; Sleegers, K.; Frisoni, G.B.; Wallin, A.; Lleo, A.; Popp, J.; Martinez-Lage, P.; et al. Plasma Proteomic Biomarkers Relating to Alzheimer’s Disease: A Meta-Analysis Based on Our Own Studies. Front. Aging Neurosci. 2021, 13, 712545. [Google Scholar] [CrossRef]
  51. Chen, M.; Xia, W. Proteomic Profiling of Plasma and Brain Tissue from Alzheimer’s Disease Patients Reveals Candidate Network of Plasma Biomarkers. J. Alzheimer’s Dis. 2020, 76, 349–368. [Google Scholar] [CrossRef]
  52. Elahi, F.M.; Casaletto, K.B.; La Joie, R.; Walters, S.M.; Harvey, D.; Wolf, A.; Edwards, L.; Rivera-Contreras, W.; Karydas, A.; Cobigo, Y.; et al. Plasma biomarkers of astrocytic and neuronal dysfunction in early- and late-onset Alzheimer’s disease. Alzheimer’s Dement. 2020, 16, 681–695. [Google Scholar] [CrossRef]
  53. Cai, X.; Huang, A.; Xu, S. Fast empirical Bayesian LASSO for multiple quantitative trait locus mapping. BMC Bioinform. 2011, 12, 211. [Google Scholar] [CrossRef]
  54. Huang, A.; Xu, S.; Cai, X. Empirical Bayesian elastic net for multiple quantitative trait locus mapping. Heredity 2015, 114, 107–115. [Google Scholar] [CrossRef] [PubMed]
  55. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Jeju Island, Republic of Korea, 9–13 August 2016; pp. 785–794. [Google Scholar] [CrossRef]
  56. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.-Y. LightGBM: A highly efficient gradient boosting decision tree. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017; pp. 3149–3157. [Google Scholar] [CrossRef]
  57. Lundberg, S.M.; Lee, S.-I. A unified approach to interpreting model predictions. In Proceedings of the 31st International Conference on Neural Information Processing Systems (NIPS’17), Long Beach, CA, USA, 4–9 December 2017; Curran Associates Inc.: Red Hook, NY, USA, 2017; pp. 4768–4777. [Google Scholar]
  58. Lundberg, S.M.; Nair, B.; Vavilala, M.S.; Horibe, M.; Eisses, M.J.; Adams, T.; Liston, D.E.; Low, D.K.; Newman, S.F.; Kim, J.; et al. Explainable machine-learning predictions for the prevention of hypoxaemia during surgery. Nat. Biomed. Eng. 2018, 2, 749–760. [Google Scholar] [CrossRef] [PubMed]
  59. Arik, S.O.; Pfister, T. TabNet: Attentive Interpretable Tabular Learning. arXiv 2020. [Google Scholar] [CrossRef]
  60. Hollmann, N.; Müller, S.; Eggensperger, K.; Hutter, F. TabPFN: A Transformer That Solves Small Tabular Classification Problems in a Second. In Proceedings of the Eleventh International Conference on Learning Representations, Kigali, Rwanda, 1–5 May 2023; Available online: https://openreview.net/forum?id=cp5PvcI6w8_ (accessed on 15 May 2024).
  61. Hollmann, N.; Müller, S.; Purucker, L.; Krishnakumar, A.; Körfer, M.; Hoo, S.B.; Schirrmeister, R.T.; Hutter, F. Accurate predictions on small data with a tabular foundation model. Nature 2025, 637, 319–326. [Google Scholar] [CrossRef]
  62. Horvath, S. DNA methylation age of human tissues and cell types. Genome Biol. 2013, 14, R115, Erratum in Genome Biol. 2015, 16, 96. https://doi.org/10.1186/s13059-015-0649-6. [Google Scholar] [CrossRef]
  63. Hannum, G.; Guinney, J.; Zhao, L.; Zhang, L.; Hughes, G.; Sadda, S.; Klotzle, B.; Bibikova, M.; Fan, J.B.; Gao, Y.; et al. Genome-wide methylation profiles reveal quantitative views of human aging rates. Mol. Cell 2013, 49, 359–367. [Google Scholar] [CrossRef]
  64. Levine, M.E.; Lu, A.T.; Quach, A.; Chen, B.H.; Assimes, T.L.; Bandinelli, S.; Hou, L.; Baccarelli, A.A.; Stewart, J.D.; Li, Y.; et al. An epigenetic biomarker of aging for lifespan and healthspan. Aging 2018, 10, 573–591. [Google Scholar] [CrossRef]
  65. Belsky, D.W.; Caspi, A.; Corcoran, D.L.; Sugden, K.; Poulton, R.; Arseneault, L.; Baccarelli, A.; Chamarti, K.; Gao, X.; Hannon, E.; et al. DunedinPACE, a DNA methylation biomarker of the pace of aging. eLife 2022, 11, e73420. [Google Scholar] [CrossRef]
  66. Duran, I.; Tsurumi, A. Evaluating transcriptional alterations associated with ageing and developing age prediction models based on the human blood transcriptome. Biogerontology 2025, 26, 86. [Google Scholar] [CrossRef]
  67. Tanaka, T.; Biancotto, A.; Moaddel, R.; Moore, A.; Gonzalez-Freire, M.; Aon, M.; Candia, J.; Zhang, P.; Cheung, F.; Fantoni, G.; et al. Plasma proteomic signature of age in healthy humans. Aging Cell 2018, 17, e12799. [Google Scholar] [CrossRef]
  68. Lehallier, B.; Gate, D.; Schaum, N.; Nanasi, T.; Lee, S.E.; Yousef, H.; Moran Losada, P.; Berdnik, D.; Keller, A.; Verghese, J.; et al. Undulating changes in human plasma proteome profiles across the lifespan. Nat. Med. 2019, 25, 1843–1850. [Google Scholar] [CrossRef]
  69. Johnson, A.A.; Shokhirev, M.N.; Wyss-Coray, T.; Lehallier, B. Systematic review and analysis of human proteomics aging studies unveils a novel proteomic aging clock and identifies key processes that change with age. Ageing Res. Rev. 2020, 60, 101070. [Google Scholar] [CrossRef] [PubMed]
  70. Coenen, L.; Lehallier, B.; de Vries, H.E.; Middeldorp, J. Markers of aging: Unsupervised integrated analyses of the human plasma proteome. Front. Aging 2023, 4, 1112109. [Google Scholar] [CrossRef] [PubMed]
  71. Argentieri, M.A.; Xiao, S.; Bennett, D.; Winchester, L.; Nevado-Holgado, A.J.; Ghose, U.; Albukhari, A.; Yao, P.; Mazidi, M.; Lv, J.; et al. Proteomic aging clock predicts mortality and risk of common age-related diseases in diverse populations. Nat. Med. 2024, 30, 2450–2460. [Google Scholar] [CrossRef] [PubMed]
  72. Tacutu, R.; Thornton, D.; Johnson, E.; Budovsky, A.; Barardo, D.; Craig, T.; Diana, E.; Lehmann, G.; Toren, D.; Wang, J.; et al. Human Ageing Genomic Resources: New and updated databases. Nucleic Acids Res. 2018, 46, D1083–D1090. [Google Scholar] [CrossRef]
  73. de Magalhães, J.P.; Abidi, Z.; Dos Santos, G.A.; Avelar, R.A.; Barardo, D.; Chatsirisupachai, K.; Clark, P.; De-Souza, E.A.; Johnson, E.J.; Lopes, I.; et al. Human Ageing Genomic Resources: Updates on key databases in ageing research. Nucleic Acids Res. 2024, 52, D900–D908. [Google Scholar] [CrossRef]
  74. Wang, L.; Chiang, H.-C.; Wu, W.; Liang, B.; Xie, Z.; Yao, X.; Ma, W.; Du, S.; Zhong, Y. Epidermal growth factor receptor is a preferred target for treating Amyloid-β–induced memory loss. Proc. Natl. Acad. Sci. USA 2012, 109, 16743–16748. [Google Scholar] [CrossRef]
  75. Smyth, L.C.D.; Highet, B.; Jansson, D.; Wu, J.; Rustenhoven, J.; Aalderink, M.; Tan, A.; Li, S.; Johnson, R.; Coppieters, N.; et al. Characterisation of PDGF-BB:PDGFRβ signalling pathways in human brain pericytes: Evidence of disruption in Alzheimer’s disease. Commun. Biol. 2022, 5, 235. [Google Scholar] [CrossRef]
  76. Spangenberg, E.; Severson, P.L.; Hohsfield, L.A.; Crapser, J.; Zhang, J.; Burton, E.A.; Zhang, Y.; Spevak, W.; Lin, J.; Phan, N.Y.; et al. Sustained microglial depletion with CSF1R inhibitor impairs parenchymal plaque development in an Alzheimer’s disease model. Nat. Commun. 2019, 10, 3758. [Google Scholar] [CrossRef]
  77. Liu, Y.; Given, K.S.; Dickson, E.L.; Owens, G.P.; Macklin, W.B.; Bennett, J.L. Concentration-dependent effects of CSF1R inhibitors on oligodendrocyte progenitor cells ex vivo and in vivo. Exp. Neurol. 2019, 318, 32–41. [Google Scholar] [CrossRef]
  78. Ou, W.; Yang, J.; Simanauskaite, J.; Choi, M.; Castellanos, D.M.; Chang, R.; Sun, J.; Jagadeesan, N.; Parfitt, K.D.; Cribbs, D.H.; et al. Biologic TNF-α inhibitors reduce microgliosis, neuronal loss, and tau phosphorylation in a transgenic mouse model of tauopathy. J. Neuroinflamm. 2021, 18, 312. [Google Scholar] [CrossRef] [PubMed]
  79. Chang, R.; Knox, J.; Chang, J.; Derbedrossian, A.; Vasilevko, V.; Cribbs, D.; Boado, R.J.; Pardridge, W.M.; Sumbria, R.K. Blood-Brain Barrier Penetrating Biologic TNF-α Inhibitor for Alzheimer’s Disease. Mol. Pharm. 2017, 14, 2340–2349. [Google Scholar] [CrossRef] [PubMed]
  80. Song, S.-H.; Kim, K.L.; Lee, K.-A.; Suh, W. Tie1 regulates the Tie2 agonistic role of angiopoietin-2 in human lymphatic endothelial cells. Biochem. Biophys. Res. Commun. 2012, 419, 281–286. [Google Scholar] [CrossRef] [PubMed]
  81. Gurnik, S.; Devraj, K.; Macas, J.; Yamaji, M.; Starke, J.; Scholz, A.; Sommer, K.; Di Tacchio, M.; Vutukuri, R.; Beck, H.; et al. Angiopoietin-2-induced blood–brain barrier compromise and increased stroke size are rescued by VE-PTP-dependent restoration of Tie2 signaling. Acta Neuropathol. 2016, 131, 753–773. [Google Scholar] [CrossRef]
  82. Lobov, I.B.; Brooks, P.C.; Lang, R.A. Angiopoietin-2 displays VEGF-dependent modulation of capillary structure and endothelial cell survival in vivo. Proc. Natl. Acad. Sci. USA 2002, 99, 11205–11210. [Google Scholar] [CrossRef]
  83. Van Hulle, C.; Ince, S.; Okonkwo, O.C.; Bendlin, B.B.; Johnson, S.C.; Carlsson, C.M.; Asthana, S.; Love, S.; Blennow, K.; Zetterberg, H.; et al. Elevated CSF angiopoietin-2 correlates with blood-brain barrier leakiness and markers of neuronal injury in early Alzheimer’s disease. Transl. Psychiatry 2024, 14, 3. [Google Scholar] [CrossRef]
  84. Duche, A.H.; Tan, O.; Baskys, A.; Sumbria, R.K.; Roosan, M.R. Predictive gene expression signatures for Alzheimer’s disease using post-mortem brain tissue. Front. Aging Neurosci. 2025, 17, 1591946. [Google Scholar] [CrossRef]
  85. Ince, S.; Love, S.; Miners, J.S. Dysregulated Angiopoietin-Tie signalling contributes to neurovascular dysfunction in Alzheimer’s disease. Alzheimer’s Dement. 2025, 20, e095139. [Google Scholar] [CrossRef]
  86. Sheikh, A.M.; Yano, S.; Tabassum, S.; Mitaki, S.; Michikawa, M.; Nagai, A. Alzheimer’s Amyloid β Peptide Induces Angiogenesis in an Alzheimer’s Disease Model Mouse through Placental Growth Factor and Angiopoietin 2 Expressions. Int. J. Mol. Sci. 2023, 24, 4510. [Google Scholar] [CrossRef]
  87. Skaaraas, G.H.E.S.; Melbye, C.; Puchades, M.A.; Leung, D.S.Y.; Jacobsen, Ø.; Rao, S.B.; Ottersen, O.P.; Leergaard, T.B.; Torp, R. Cerebral Amyloid Angiopathy in a Mouse Model of Alzheimer’s Disease Associates with Upregulated Angiopoietin and Downregulated Hypoxia-Inducible Factor. J. Alzheimer’s Dis. 2021, 83, 1651–1663. [Google Scholar] [CrossRef]
  88. Scholz, A.; Plate, K.H.; Reiss, Y. Angiopoietin-2: A multifaceted cytokine that functions in both angiogenesis and inflammation. Ann. N. Y. Acad. Sci. 2015, 1347, 45–51. [Google Scholar] [CrossRef]
  89. Kim, M.; Allen, B.; Korhonen, E.A.; Nitschké, M.; Yang, H.W.; Baluk, P.; Saharinen, P.; Alitalo, K.; Daly, C.; Thurston, G.; et al. Opposing actions of angiopoietin-2 on Tie2 signaling and FOXO1 activation. J. Clin. Investig. 2016, 126, 3511–3525. [Google Scholar] [CrossRef] [PubMed]
  90. Lim, N.S.; Swanson, C.R.; Cherng, H.; Unger, T.L.; Xie, S.X.; Weintraub, D.; Marek, K.; Stern, M.B.; Siderowf, A.; Trojanowski, J.Q.; et al. Plasma EGF and cognitive decline in Parkinson’s disease and Alzheimer’s disease. Ann. Clin. Transl. Neurol. 2016, 3, 346–355. [Google Scholar] [CrossRef] [PubMed]
  91. Björkqvist, M.; Ohlsson, M.; Minthon, L.; Hansson, O. Evaluation of a previously suggested plasma biomarker panel to identify Alzheimer’s disease. PLoS ONE 2012, 7, e29868. [Google Scholar] [CrossRef] [PubMed]
  92. Marksteiner, J.; Kemmler, G.; Weiss, E.M.; Knaus, G.; Ullrich, C.; Mechtcheriakov, S.; Oberbauer, H.; Auffinger, S.; Hinterholzl, J.; Hinterhuber, H.; et al. Five out of 16 plasma signaling proteins are enhanced in plasma of patients with mild cognitive impairment and Alzheimer’s disease. Neurobiol. Aging 2011, 32, 539–540. [Google Scholar] [CrossRef]
  93. Thomas, R.; Zuchowska, P.; Morris, A.W.J.; Marottoli, F.M.; Sunny, S.; Deaton, R.; Gann, P.H.; Tai, L.M. Epidermal growth factor prevents APOE4 and amyloid-beta-induced cognitive and cerebrovascular deficits in female mice. Acta Neuropathol. Commun. 2016, 4, 111. [Google Scholar] [CrossRef]
  94. Thomas, R.; Morris, A.W.J.; Tai, L.M. Epidermal growth factor prevents APOE4-induced cognitive and cerebrovascular deficits in female mice. Heliyon 2017, 3, e00319. [Google Scholar] [CrossRef]
  95. Koster, K.P.; Thomas, R.; Morris, A.W.; Tai, L.M. Epidermal growth factor prevents oligomeric amyloid-β induced angiogenesis deficits in vitro. J. Cereb. Blood Flow Metab. 2016, 36, 1865–1871. [Google Scholar] [CrossRef]
  96. Jayaswamy, P.K.; Vijaykrishnaraj, M.; Patil, P.; Alexander, L.M.; Kellarai, A.; Shetty, P. Implicative role of epidermal growth factor receptor and its associated signaling partners in the pathogenesis of Alzheimer’s disease. Ageing Res. Rev. 2023, 83, 101791. [Google Scholar] [CrossRef]
  97. Kim, J.; Kim, S.-J.; Jeong, H.-R.; Park, J.-H.; Moon, M.; Hoe, H.-S. Inhibiting EGFR/HER-2 ameliorates neuroinflammatory responses and the early stage of tau pathology through DYRK1A. Front. Immunol. 2022, 13, 903309. [Google Scholar] [CrossRef]
  98. Italiani, P.; Puxeddu, I.; Napoletano, S.; Scala, E.; Melillo, D.; Manocchio, S.; Angiolillo, A.; Migliorini, P.; Boraschi, D.; Vitale, E.; et al. Circulating levels of IL-1 family cytokines and receptors in Alzheimer’s disease: New markers of disease progression? J. Neuroinflamm. 2018, 15, 342. [Google Scholar] [CrossRef]
  99. Mahdavi, M.; Karima, S.; Rajaei, S.; Aghamolaii, V.; Ghahremani, H.; Ataei, R.; Tehrani, H.S.; Baram, S.M.; Tafakhori, A.; Safarpour Lima, B.; et al. Plasma Cytokines Profile in Subjects with Alzheimer’s Disease: Interleukin 1 Alpha as a Candidate for Target Therapy. Galen. Med. J. 2021, 10, e1974. [Google Scholar] [CrossRef]
  100. Griffin, W.S.; Stanley, L.C.; Ling, C.; White, L.; MacLeod, V.; Perrot, L.J.; White, C.L.; Araoz, C. Brain interleukin 1 and S-100 immunoreactivity are elevated in Down syndrome and Alzheimer disease. Proc. Natl. Acad. Sci. USA 1989, 86, 7611–7615. [Google Scholar] [CrossRef]
  101. Mun, M.-J.; Kim, J.-H.; Choi, J.-Y.; Jang, W.-C. Genetic polymorphisms of interleukin genes and the risk of Alzheimer’s disease: An update meta-analysis. Meta Gene 2016, 8, 1–10. [Google Scholar] [CrossRef]
  102. Grimaldi, L.M.; Casadei, V.M.; Ferri, C.; Veglia, F.; Licastro, F.; Annoni, G.; Biunno, I.; De Bellis, G.; Sorbi, S.; Mariani, C.; et al. Association of early-onset Alzheimer’s disease with an interleukin-1alpha gene polymorphism. Ann. Neurol. 2000, 47, 361–365. [Google Scholar] [CrossRef]
  103. Du, Y.; Dodel, R.C.; Eastwood, B.J.; Bales, K.R.; Gao, F.; Lohmüller, F.; Müller, U.; Kurz, A.; Zimmer, R.; Evans, R.M.; et al. Association of an interleukin 1 alpha polymorphism with Alzheimer’s disease. Neurology 2000, 55, 480–483. [Google Scholar] [CrossRef]
  104. Nicoll, J.A.; Mrak, R.E.; Graham, D.I.; Stewart, J.; Wilcock, G.; MacGowan, S.; Esiri, M.M.; Murray, L.S.; Dewar, D.; Love, S.; et al. Association of interleukin-1 gene polymorphisms with Alzheimer’s disease. Ann. Neurol. 2000, 47, 365–368. [Google Scholar] [CrossRef] [PubMed]
  105. Rainero, I.; Bo, M.; Ferrero, M.; Valfrè, W.; Vaula, G.; Pinessi, L. Association between the interleukin-1α gene and Alzheimer’s disease: A meta-analysis. Neurobiol. Aging 2004, 25, 1293–1298. [Google Scholar] [CrossRef] [PubMed]
  106. Griffin, W.S.T.; Sheng, J.G.; Gentleman, S.M.; Graham, D.I.; Mrak, R.E.; Roberts, G.W. Microglial interleukin-1α expression in human head injury: Correlations with neuronal and neuritic β-amyloid precursor protein expression. Neurosci. Lett. 1994, 176, 133–136. [Google Scholar] [CrossRef] [PubMed]
  107. Rogers, J.T.; Leiter, L.M.; McPhee, J.; Cahill, C.M.; Zhan, S.-S.; Potter, H.; Nilsson, L.N.G. Translation of the Alzheimer Amyloid Precursor Protein mRNA Is Up-regulated by Interleukin-1 through 5′-Untranslated Region Sequences. J. Biol. Chem. 1999, 274, 6421–6431. [Google Scholar] [CrossRef]
  108. Bandyopadhyay, S.; Hartley, D.M.; Cahill, C.M.; Lahiri, D.K.; Chattopadhyay, N.; Rogers, J.T. Interleukin-1α stimulates non-amyloidogenic pathway by α-secretase (ADAM-10 and ADAM-17) cleavage of APP in human astrocytic cells involving p38 MAP kinase. J. Neurosci. Res. 2006, 84, 106–118. [Google Scholar] [CrossRef]
  109. Chao, C.C.; Hu, S.; Sheng, W.S.; Bu, D.; Bukrinsky, M.I.; Peterson, P.K. Cytokine-stimulated astrocytes damage human neurons via a nitric oxide mechanism. Glia 1996, 16, 276–284. [Google Scholar] [CrossRef]
  110. De Kort, A.M.; Kuiperij, H.B.; Kersten, I.; Versleijen, A.A.M.; Schreuder, F.H.B.M.; Van Nostrand, W.E.; Greenberg, S.M.; Klijn, C.J.M.; Claassen, J.A.H.R.; Verbeek, M.M. Normal cerebrospinal fluid concentrations of PDGFRβ in patients with cerebral amyloid angiopathy and Alzheimer’s disease. Alzheimer’s Dement. 2022, 18, 1788–1796. [Google Scholar] [CrossRef]
  111. Masliah, E.; Mallory, M.; Alford, M.; Deteresa, R.; Saitoh, T. PDGF is associated with neuronal and glial alterations of Alzheimer’s disease. Neurobiol. Aging 1995, 16, 549–556. [Google Scholar] [CrossRef] [PubMed]
  112. Montagne, A.; Barnes, S.R.; Sweeney, M.D.; Halliday, M.R.; Sagare, A.P.; Zhao, Z.; Toga, A.W.; Jacobs, R.E.; Liu, C.Y.; Amezcua, L.; et al. Blood-brain barrier breakdown in the aging human hippocampus. Neuron 2015, 85, 296–302. [Google Scholar] [CrossRef] [PubMed]
  113. Bell, R.D.; Winkler, E.A.; Sagare, A.P.; Singh, I.; LaRue, B.; Deane, R.; Zlokovic, B.V. Pericytes Control Key Neurovascular Functions and Neuronal Phenotype in the Adult Brain and during Brain Aging. Neuron 2010, 68, 409–427. [Google Scholar] [CrossRef]
  114. Armulik, A.; Genové, G.; Mäe, M.; Nisancioglu, M.H.; Wallgard, E.; Niaudet, C.; He, L.; Norlin, J.; Lindblom, P.; Strittmatter, K.; et al. Pericytes regulate the blood–brain barrier. Nature 2010, 468, 557–561. [Google Scholar] [CrossRef]
  115. Iihara, K.; Hashimoto, N.; Tsukahara, T.; Sakata, M.; Yanamoto, H.; Taniguchi, T. Platelet-derived growth factor-BB, but not -AA, prevents delayed neuronal death after forebrain ischemia in rats. J. Cereb. Blood Flow Metab. 1997, 17, 1097–1106. [Google Scholar] [CrossRef]
  116. Kawabe, T.; Wen, T.-C.; Matsuda, S.; Ishihara, K.; Otsuka, H.; Sakanaka, M. Platelet-derived growth factor prevents ischemia-induced neuronal injuries in vivo. Neurosci. Res. 1997, 29, 335–343. [Google Scholar] [CrossRef]
  117. Padel, T.; Özen, I.; Boix, J.; Barbariga, M.; Gaceb, A.; Roth, M.; Paul, G. Platelet-derived growth factor-BB has neurorestorative effects and modulates the pericyte response in a partial 6-hydroxydopamine lesion mouse model of Parkinson’s disease. Neurobiol. Dis. 2016, 94, 95–105. [Google Scholar] [CrossRef]
  118. Chen, H.; Teng, Y.; Chen, X.; Liu, Z.; Geng, F.; Liu, Y.; Jiang, H.; Wang, Z.; Yang, L. Platelet-derived growth factor (PDGF)-BB protects dopaminergic neurons via activation of Akt/ERK/CREB pathways to upregulate tyrosine hydroxylase. CNS Neurosci. Ther. 2021, 27, 1300–1312. [Google Scholar] [CrossRef]
  119. Liu, G.; Wang, J.; Wei, Z.; Fang, C.-L.; Shen, K.; Qian, C.; Qi, C.; Li, T.; Gao, P.; Wong, P.C.; et al. Elevated PDGF-BB from Bone Impairs Hippocampal Vasculature by Inducing PDGFRβ Shedding from Pericytes. Adv. Sci. 2023, 10, 2206938. [Google Scholar] [CrossRef]
  120. Liu, G.; Shu, W.; Chen, Y.; Fu, Y.; Fang, S.; Zheng, H.; Cheng, W.; Lin, Q.; Hu, Y.; Jiang, N.; et al. Bone-derived PDGF-BB enhances hippocampal non-specific transcytosis through microglia-endothelial crosstalk in HFD-induced metabolic syndrome. J. Neuroinflamm. 2024, 21, 111. [Google Scholar] [CrossRef] [PubMed]
  121. Huang, R.-P. Cytokine Protein Arrays. In Protein Arrays: Methods and Protocols; Fung, E.T., Ed.; Methods in Molecular Biology; Humana Press: Totowa, NJ, USA, 2004; pp. 215–231. ISBN 978-1-59259-759-8. [Google Scholar]
  122. Van Rossum, G.; Drake, F., Jr. Python Reference Manual; Centrum voor Wiskunde en Informatica Amsterdam: Amsterdam, The Netherlands, 1995. [Google Scholar]
  123. R Core Team. R: A Language and Environment for Statistical Computing; R Foundation for Statistical Computing: Vienna, Austria, 2023; Available online: https://www.R-project.org/ (accessed on 1 September 2023).
  124. Huang, A.; Liu, D. EBglmnet: A comprehensive R package for sparse generalized linear regression models. Bioinformatics 2021, 37, 1627–1629. [Google Scholar] [CrossRef] [PubMed]
  125. Github. Github-xgboost. Available online: https://github.com/dmlc/xgboost/tree/master (accessed on 6 February 2025).
  126. Github. Github-lightgbm. Available online: https://github.com/microsoft/LightGBM/tree/master (accessed on 6 February 2025).
  127. Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830. [Google Scholar]
  128. Nogueira, F. Bayesian Optimization: Open Source Constrained Global Optimization Tool for Python. 2014. Available online: https://github.com/bayesian-optimization/BayesianOptimization (accessed on 10 February 2025).
  129. Github. Github-shap. Available online: https://github.com/shap/shap (accessed on 7 February 2025).
  130. Github. Github-pytorch_tabnet. Available online: https://github.com/dreamquark-ai/tabnet/tree/develop (accessed on 7 February 2025).
  131. Paszke, A.; Gross, S.; Massa, F.; Lerer, A.; Bradbury, J.; Chanan, G.; Killeen, T.; Lin, Z.; Gimelshein, N.; Antiga, L.; et al. PyTorch: An Imperative Style, High-Performance Deep Learning Library. Adv. Neural Inf. Process. Syst. 2019, 32, 8024–8035. [Google Scholar]
  132. Github. Github-PriorLabs/TabPFN: Foundation Model for Tabular Data. Available online: https://github.com/PriorLabs/TabPFN (accessed on 10 February 2025).
  133. Github. Github-PriorLabs/TabPFN-extensions. Available online: https://github.com/PriorLabs/tabpfn-extensions (accessed on 10 February 2025).
  134. Robin, X.; Turck, N.; Hainard, A.; Tiberti, N.; Lisacek, F.; Sanchez, J.C.; Muller, M. pROC: An open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinform. 2011, 12, 77. [Google Scholar] [CrossRef]
  135. Stevenson, M.; Sergeant, E. epiR: Tools for the Analysis of Epidemiological Data. R Package Version 2.0.78. 2024. Available online: https://CRAN.R-project.org/package=epiR (accessed on 1 September 2023).
  136. Gao, C.-H.; Dusa, A. ggVennDiagram: A ‘ggplot2’ Implement of Venn Diagram. R Package Version 1.5.2. 2024. Available online: https://github.com/gaospecial/ggVennDiagram (accessed on 7 April 2025).
  137. Snel, B.; Lehmann, G.; Bork, P.; Huynen, M. STRING: A web-server to retrieve and display the repeatedly occurring neighbourhood of a gene. Nucleic Acids Res. 2000, 28, 3442–3444. [Google Scholar] [CrossRef]
  138. Szklarczyk, D.; Kirsch, R.; Koutrouli, M.; Nastou, K.; Mehryary, F.; Hachilif, R.; Gable, A.L.; Fang, T.; Doncheva, N.T.; Pyysalo, S.; et al. The STRING database in 2023: Protein–protein association networks and functional enrichment analyses for any sequenced genome of interest. Nucleic Acids Res. 2023, 51, D638–D646. [Google Scholar] [CrossRef]
  139. Pagès, H.; Carlson, M.; Falcon, S.; Li, N. AnnotationDbi: Manipulation of SQLite-Based Annotations in Bioconductor. R Package Version 1.66.0. 2024. Available online: https://bioconductor.org/packages/AnnotationDbi (accessed on 22 December 2024).
  140. Carlson, M. org.Hs.eg.db: Genome Wide Annotation for Human. R Package Version 3.20.0. 2024. Available online: https://bioconductor.org/packages/org.Hs.eg.db (accessed on 22 December 2024).
Figure 1. ROC curves and AUROC [95%CI] of the models developed with (A) EBlasso showing (B) coefficient estimates, (C) EBEN with (D) coefficient estimates, (E) XGBoost with (F) mean SHAP scores, (G) LightGBM with (H) mean SHAP scores, (I) TabNet with (J) top feature importance scores above 0.01, and (K) TabPFN with (L) top mean SHAP scores above 0.01.
Figure 2. Overlaps across the predictive proteins included in the different models developed, and functional assessment. (A) Venn diagram showing overlaps in predictive proteins included in each of the classification models developed. The numbers show the total number of overlapping proteins, and the colors correspond to the number, with red being a higher count, as shown in the legend. (B) Gene ontology and (C) KEGG pathway enrichment analysis result of the 13 proteins identified by at least three of the six models developed, with (D) STRING PPI network analysis.
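The consensus set in panel (A) amounts to counting how many model-specific predictor sets contain each protein. The following is a minimal sketch under stated assumptions, not the authors' code: only ANG-2, EGF, IL-1α, and PDGF-BB are named in the article as shared by all six models, while the `protein_*` entries, the set memberships, and the function name `selected_by_at_least` are hypothetical placeholders.

```python
from collections import Counter

# Hypothetical predictor sets: the four named proteins are those the article
# reports as selected by all six models; the protein_* names are placeholders.
shared = {"ANG-2", "EGF", "IL-1α", "PDGF-BB"}
selected = {
    "EBlasso": shared | {"protein_a"},
    "EBEN": shared | {"protein_a", "protein_b"},
    "XGBoost": shared | {"protein_a", "protein_b", "protein_c"},
    "LightGBM": shared | {"protein_b", "protein_c"},
    "TabNet": shared | {"protein_c", "protein_d"},
    "TabPFN": shared | {"protein_d"},
}


def selected_by_at_least(model_sets, min_models):
    """Return proteins that appear in at least `min_models` predictor sets."""
    counts = Counter(p for proteins in model_sets.values() for p in proteins)
    return sorted(p for p, c in counts.items() if c >= min_models)


print(selected_by_at_least(selected, 3))  # consensus panel (>= 3 of 6 models)
print(selected_by_at_least(selected, 6))  # proteins chosen by every model
```

With the real predictor lists substituted for the placeholders, the threshold of three models would yield the panel passed to the enrichment analyses in panels (B) to (D).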
Figure 3. Assessment of the relevance of underlying molecular mechanisms of aging, where the AD biomarkers identified by the ML models were significantly enriched with previously described aging-related biomarkers.
Table 1. Previously reported plasma or serum molecular biomarker classification or prognostic models aside from those based only on Aβ and Tau peptides. The original study from which the dataset for this study was derived (Ray S. et al., 2007 [35]) is indicated in bold.
| Method and predictors | Performance | Reference |
|---|---|---|
| AD vs. CN Classification Models | | |
| **PAM ML 18 Plasma Proteins** | **Accuracy = 89% (Training and Test)** | **Ray S. et al., 2007 [35]** |
| SVM ML 7 to 10 plasma proteins | AUC = 0.86 to 0.89 | Eke CS. et al., 2021 [37] |
| Four models with 5 to 14 plasma proteins | AUC = 0.759 to 0.838 (CV), 0.737 to 0.842 (ext. valid.) | Llano DA. et al., 2013 [45] |
| 9 plasma proteins | AUC = 0.79 (training and ext. valid.) | Sung YJ. et al., 2023 [34] |
| 5 plasma proteins + age + APOE genotype | AUC = 0.79, 0.81 (ext. valid.) | Morgan AR. et al., 2019 [46] |
| Ridge + SVM ML 11 plasma proteins + age | AUC = 0.891 (test) | Ashton NJ. et al., 2019 [38] |
| 19 hub plasma proteins | AUC = 0.969 (ext. valid.) | Jiang Y. et al., 2022 [47] |
| SVM ML 4 plasma + 6 serum proteins | AUC = 99.98% (training), 93.96% (test) | Zhang F. et al., 2022 [39] |
| LGBM ML 4 plasma proteins + demographic + cognition | AUC = 0.913 | Guo Y. et al., 2024 [41] |
| Lasso ML 7 plasma proteins | AUC = 0.796 (test), 0.721 (replication), 0.715 and 0.757 (ext. valid.) | Heo G. et al., 2025 [42] |
| Random Forest ML 14 serum proteins | AUC = 0.91 (training), 0.88 (test) | O’Bryant SE. et al., 2010 [36] |
| 21 serum proteins + age + sex + education | AUC = 0.89 | O’Bryant SE. et al., 2016 [44] |
| XGBoost ML plasma metabolite | AUC = 0.88 (test) | Stamate D. et al., 2019 [49] |
| 12 serum miRNA | Accuracy = 76% | Zhao X. et al., 2020 [48] |
| AD vs. MCI Classification or Prognostic Models | | |
| 3 plasma proteins | AUC = 0.74 (training), 0.67 (ext. valid.) | Morgan AR. et al., 2019 [46] |
| SVM ML 7 to 10 plasma proteins | AUC = 0.80 to 0.83 | Eke CS. et al., 2021 [37] |
| Lasso-ML selected 12 plasma proteins + plasma Aβ + plasma pTau + baseline cognitive measures + age + sex + education + APOE genotype | AUC = 0.88, accuracy = 86.7% (test); prognostic model for MCI-progressors vs. MCI-stable | Kivisäkk P. et al., 2022 [40] |
Table 2. Sensitivity/recall, specificity, Positive Predictive Value (PPV)/precision, and Negative Predictive Value (NPV) with 95% CI of each of the prediction models for AD vs. CN.
| Metric | EBlasso (7 Proteins) | EBEN (9 Proteins) | XGBoost (36 Proteins) | LightGBM (20 Proteins) | TabNet (27 Proteins) | TabPFN (26 Proteins) |
|---|---|---|---|---|---|---|
| Training set | | | | | | |
| Accuracy | 0.916 [0.834–0.965] | 0.916 [0.834–0.965] | 0.988 [0.935–1.000] | 0.964 [0.898–0.992] | 0.831 [0.733–0.905] | 0.988 [0.935–1.000] |
| Sensitivity/recall | 0.930 [0.809–0.985] | 0.907 [0.779–0.974] | 0.977 [0.877–0.999] | 0.977 [0.877–0.999] | 0.884 [0.749–0.961] | 1.000 [0.918–1.000] |
| Specificity | 0.900 [0.763–0.972] | 0.925 [0.796–0.984] | 1.000 [0.912–1.000] | 0.950 [0.831–0.994] | 0.775 [0.615–0.892] | 0.975 [0.868–0.999] |
| PPV/precision | 0.909 [0.783–0.975] | 0.929 [0.805–0.985] | 1.000 [0.916–1.000] | 0.955 [0.845–0.994] | 0.809 [0.667–0.909] | 0.977 [0.880–0.999] |
| NPV | 0.923 [0.791–0.984] | 0.902 [0.769–0.973] | 0.976 [0.871–0.999] | 0.974 [0.865–0.999] | 0.861 [0.705–0.952] | 1.000 [0.910–1.000] |
| Test set | | | | | | |
| Accuracy | 0.914 [0.830–0.965] | 0.827 [0.727–0.902] | 0.926 [0.846–0.972] | 0.938 [0.862–0.980] | 0.901 [0.815–0.956] | 0.926 [0.846–0.972] |
| Sensitivity/recall | 0.929 [0.805–0.985] | 0.833 [0.686–0.930] | 0.929 [0.805–0.985] | 0.952 [0.838–0.994] | 0.929 [0.805–0.985] | 0.929 [0.805–0.985] |
| Specificity | 0.897 [0.758–0.971] | 0.821 [0.665–0.925] | 0.923 [0.791–0.984] | 0.923 [0.791–0.984] | 0.872 [0.726–0.957] | 0.923 [0.791–0.984] |
| PPV/precision | 0.907 [0.779–0.974] | 0.833 [0.686–0.930] | 0.929 [0.805–0.985] | 0.930 [0.809–0.985] | 0.886 [0.754–0.962] | 0.929 [0.805–0.985] |
| NPV | 0.921 [0.786–0.983] | 0.821 [0.665–0.925] | 0.923 [0.791–0.984] | 0.947 [0.823–0.994] | 0.919 [0.781–0.983] | 0.923 [0.791–0.984] |
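Each metric in Table 2 is a simple proportion over the cells of a confusion matrix, reported with a binomial confidence interval. The sketch below is illustrative only and is not the authors' code. It assumes exact (Clopper-Pearson) intervals, which appears consistent with the reported bounds, and it uses cell counts inferred from the reported test-set EBlasso proportions (39/42 for sensitivity, 35/39 for specificity); those counts are not stated in the article. The function names `binom_cdf`, `exact_ci`, and `diagnostic_metrics` are hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes out of n, found by bisection."""

    def solve(k_cut, target):
        # binom_cdf(k_cut, n, p) decreases as p grows, so bisect on p.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if binom_cdf(k_cut, n, mid) > target:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(k - 1, 1 - alpha / 2)
    upper = 1.0 if k == n else solve(k, alpha / 2)
    return lower, upper


def diagnostic_metrics(tp, fn, tn, fp):
    """Point estimate and exact 95% CI for each metric listed in Table 2."""
    cells = {
        "accuracy": (tp + tn, tp + fn + tn + fp),
        "sensitivity": (tp, tp + fn),
        "specificity": (tn, tn + fp),
        "ppv": (tp, tp + fp),
        "npv": (tn, tn + fn),
    }
    return {name: (k / n, *exact_ci(k, n)) for name, (k, n) in cells.items()}


# Counts inferred from the reported proportions (an assumption, not source data).
metrics = diagnostic_metrics(tp=39, fn=3, tn=35, fp=4)
for name, (estimate, low, high) in metrics.items():
    print(f"{name}: {estimate:.3f} [{low:.3f}-{high:.3f}]")
```

The printed values can be compared against the test-set EBlasso column; agreement would support, but not prove, the inferred counts and the assumed interval method.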
Table 3. Application of the different prediction models to samples labeled as other dementia (OD), and evaluating the proportion correctly classified as not AD.
| Correct Classification | EBlasso (7 Proteins) | EBEN (9 Proteins) | XGBoost (36 Proteins) | LightGBM (20 Proteins) | TabNet (26 Proteins) | TabPFN (26 Proteins) |
|---|---|---|---|---|---|---|
| Other dementia (n = 11) | 10 (90.9%) | 10 (90.9%) | 11 (100%) | 11 (100%) | 7 (63.6%) | 9 (81.8%) |
Table 4. Application of the various prediction models to samples from MCI patients (n = 46) who later developed AD (after a mean follow up of 29.6 months), or who developed OD (Frontotemporal Dementia (FTD) or Lewy Body Dementia (LBD) or Vascular Dementia (VaD)) or remained as MCI (after a mean follow up of 27.8 months). For MCI-AD or MCI-OD outcomes, the number of correct predictions (%) is shown, and for MCI-MCI, the number of not AD/AD predictions is shown.
| Correct Classification | EBlasso (7 Proteins) | EBEN (9 Proteins) | XGBoost (36 Proteins) | LightGBM (20 Proteins) | TabNet (26 Proteins) | TabPFN (26 Proteins) |
|---|---|---|---|---|---|---|
| MCI-AD (n = 22) | 19 (86.4%) | 20 (90.9%) | 17 (77.3%) | 16 (72.7%) | 14 (63.6%) | 18 (81.8%) |
| MCI-OD-FTD (n = 1) | 1 (100%) | 1 (100%) | 1 (100%) | 1 (100%) | 1 (100%) | 1 (100%) |
| MCI-OD-LBD (n = 3) | 3 (100%) | 3 (100%) | 3 (100%) | 3 (100%) | 3 (100%) | 3 (100%) |
| MCI-OD-VaD (n = 4) | 4 (100%) | 4 (100%) | 4 (100%) | 4 (100%) | 4 (100%) | 4 (100%) |
| MCI-MCI (n = 17) | 4 not AD/13 AD | 8 not AD/9 AD | 4 not AD/13 AD | 7 not AD/10 AD | 8 not AD/9 AD | 4 not AD/13 AD |
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Tsurumi, A.; Cahill, C.M.; Liu, A.J.; Chatterjee, P.; Das, S.; Kobayashi, A. Development of Plasma Protein Classification Models for Alzheimer’s Disease Using Multiple Machine Learning Approaches. Int. J. Mol. Sci. 2025, 26, 11673. https://doi.org/10.3390/ijms262311673

