1. Introduction
Subarachnoid hemorrhage (SAH), primarily caused by the rupture of intracranial aneurysms, is a critical condition characterized by high mortality rates and prolonged neurological disabilities [1]. A significant complication arising from SAH is cerebral vasospasm (CVS), which greatly contributes to delayed cerebral ischemia and adverse patient outcomes [2]. Therefore, early and accurate identification of patients at risk for vasospasm is crucial, especially during the initial phase post-hemorrhage, when traditional imaging or neurological evaluations often yield inconclusive results [3].
Recent research has shown that glycosylation patterns, particularly N-linked glycans on serum glycoproteins, are altered in various cerebrovascular disorders, including ischemic stroke and peripheral vascular diseases [4]. Profiling the serum N-glycome presents a promising avenue for biomarker discovery, capturing subtle pathophysiological changes linked to inflammatory and vascular processes [5, 6]. Building on our previously published findings regarding changes in serum N-glycosylation patterns in SAH and CVS, we have expanded our data analysis approach using interpretable machine learning techniques [7]. That earlier study identified significant differences in the glycosylation patterns between the patient groups and healthy controls. Notably, higher levels of sialylation and specific glycan structures, such as A2G2 (biantennary-bigalactosylated), FA2G2 (biantennary-bigalactosylated fucosylated), A2G2S1 (biantennary-bigalactosylated, monosialylated), and A3G3S2 (tri-antennary tri-galactosylated bi-sialylated), were associated with SAH and CVS. These altered glycan profiles suggest the potential utility of glycosylation analysis as a diagnostic tool for identifying patients at risk of developing CVS after SAH. The findings support the notion that glycan alterations reflect underlying pathophysiological changes in these conditions.
Machine learning (ML) methods have garnered increasing interest in the realm of clinical biomarker research due to their capability to uncover complex, non-linear relationships within high-dimensional datasets [8]. However, many ML models lack interpretability, posing challenges for clinical adoption [9, 10, 11]. In contrast, decision trees provide a balance between predictive accuracy and explainability, facilitating transparent decision-making and the integration of ML tools into clinical workflows [12, 13].
This study evaluated the utility of serum N-glycan profiles in distinguishing healthy individuals, SAH patients without vasospasm, and those who develop vasospasm. Beyond identifying disease-specific glycan patterns relative to controls, we also implemented a direct SAH vs. CVS classification to uncover glycomic markers predictive of vasospasm within the SAH population—addressing a key gap in early diagnostics.
We utilized a straightforward, interpretable decision tree classification model, prioritizing both transparency and clinical relevance. We also examined the impact of data preprocessing—such as outlier removal, scaling, and correlation-based feature reduction informed by decision-tree feature importance—and identified key glycan structures associated with various disease states. Furthermore, we employed decision boundary visualizations and robust cross-validation to enhance interpretability and reproducibility.
3. Discussion
This study demonstrated that interpretable decision tree classifiers can accurately differentiate between subarachnoid hemorrhage (SAH), cerebral vasospasm (CVS), and healthy controls based on serum N-glycan profiles. The models achieved high predictive performance, particularly in the CVS vs. Control comparison, with consistently robust F1-scores and minimal variability across cross-validation folds. This superior performance suggests that glycomic alterations are more pronounced in vasospasm than in SAH alone, likely reflecting distinct inflammatory or vascular remodeling mechanisms. Notably, the CVS vs. Control model achieved an average F1-score of 0.91 ± 0.08 (mean ± standard deviation), while the SAH vs. CVS task reached a lower average of 0.78 ± 0.16. The lower accuracy and F1-scores in the SAH vs. CVS task indicate greater overlap in glycan patterns between the two pathological conditions, despite their clinical divergence.
The use of decision tree models offered several key advantages. Most notably, their inherent interpretability enabled the derivation of simple, threshold-based decision rules. The majority of classification tasks could be resolved with two to three glycan-based splits, facilitating transparent logic that could be easily reviewed or implemented in clinical settings. Additionally, the application of correlation-informed feature filtering supported model parsimony without compromising performance, helping to reduce redundancy while preserving biologically meaningful signals.
Among the most informative glycans, A4G4S3(2) consistently ranked as a dominant feature across all tasks, confirming its central role in distinguishing disease states. FA2(6)G1 showed strong relevance in SAH-related models, whereas A2G2 and A3G3S3(5) contributed to CVS and SAH vs. CVS classification, respectively. Interestingly, the stability of feature importance scores across cross-validation folds varied by task: A4G4S3(2) was not only highly ranked but also stable, whereas FA2(6)G1 displayed more variability, which may reflect underlying heterogeneity in SAH pathology.
Preprocessing steps—including outlier filtering and scaling—were systematically evaluated. However, these techniques did not improve classification accuracy in most cases. The exception was observed in the SAH vs. CVS task, where mild improvement was noted after outlier removal. These results are consistent with the known scale-invariance of decision tree models and reinforce the idea that decision trees are particularly robust to unscaled or minimally processed data when glycan features are well curated.
The compact and interpretable nature of the models positions them as promising tools for clinical implementation. With minimal reliance on black-box assumptions and limited feature requirements, these classifiers are ideal candidates for early-stage biomarker-driven screening in neurovascular care. The results further emphasize the potential of serum glycosylation analysis in capturing vascular and inflammatory pathophysiology, supporting the continued exploration of glycomics as a diagnostic modality in cerebrovascular disease. One key limitation of this study is the relatively small sample size (n = 22 per group), which may limit the generalizability of the findings. Nevertheless, the models showed stable performance across cross-validation folds, supporting the robustness of the results and motivating future validation in larger, independent cohorts.
4. Materials and Methods
This study employed a structured pipeline to integrate high-resolution glycomics and decision tree-based classification for the differentiation of healthy controls, SAH patients, and patients with cerebral vasospasm. The methodological design emphasized transparency and reproducibility, aligning each analytical step with clinical interpretability. Although decision tree classifiers support multi-class problems, we intentionally applied binary classification models to answer specific clinical questions (e.g., early SAH detection, CVS risk stratification) while improving model robustness given the small group sizes. Binary classification also yielded simpler and more interpretable decision rules [14]. Key elements included enzymatic glycan release and fluorescent labeling, robust chromatographic analysis, and systematic data preprocessing strategies such as scaling, outlier filtering, and correlation-based feature pruning guided by model-derived feature importance. To ensure reliable evaluation, stratified train–test splitting and repeated cross-validation were implemented alongside comprehensive performance metrics.
An overview of the full analytical workflow is shown in Figure 9, summarizing the major steps from patient grouping and sample preparation through glycan analysis, data preprocessing, model training, and interpretability evaluation. This schematic provides context for the detailed methodological steps described in the subsequent sections.
Patient Cohorts and Sample Preparation
Serum samples were collected from three groups: healthy controls (HC, n = 22), patients with subarachnoid hemorrhage without vasospasm (SAH, n = 22), and SAH patients with confirmed vasospasm (CVS, n = 22). Diagnosis of CVS was based on transcranial Doppler ultrasound (TCD) criteria. The study was conducted according to the guidelines of the Declaration of Helsinki and approved by the Institutional Science and Research Ethics Committee of B-A-Z County Central Hospital and University Teaching Hospital (protocol code IG-102-102/2018 and date of approval: 4 April 2018) based on the 23/2002. (V. 9.) EüM Decree on medical research conducted on humans.
N-glycans were enzymatically released using PNGase F (New England Biolabs, Ipswich, MA, USA), followed by fluorescent labeling with procainamide (Sigma-Aldrich, St. Louis, MO, USA). Glycan purification was performed using solid-phase extraction cartridges (GL Sciences Inc., Tokyo, Japan). Chromatographic separation and detection were conducted via HILIC-UPLC-FLR-MS (Waters, Milford, MA, USA), a method widely used in glycomics and selected here for its robustness, high sensitivity, resolution, and ability to detect subtle changes in glycan structure, which are crucial for identifying disease-related alterations. Detailed procedures were described in our earlier publication [7]. Glycan nomenclature followed Harvey et al. [15].
Data Acquisition and Preprocessing
Raw peak area data for identified glycan structures were exported and processed in Python (v3.10) using pandas, NumPy, and scikit-learn. All models were built on glycan intensity features, excluding metadata to ensure purely glycomic-based predictions.
For each classification task (Control vs. SAH, Control vs. CVS, SAH vs. CVS), four pipeline variants were explored:
No preprocessing
Scaling only (StandardScaler)
Outlier removal only (IQR-based filter)
Scaling + outlier removal
These combinations were included to test the robustness of the model under different data preparation assumptions.
Outliers were removed using the interquartile range (IQR) method with a threshold of 1.5. This method was selected due to its simplicity, interpretability, and wide use in biomedical signal filtering, helping to mitigate the influence of extreme values without assuming data normality.
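The IQR filter can be sketched as follows. This is a minimal illustration, not the exact routine used in the study: the function name `remove_iqr_outliers`, the rule that a sample is dropped when any single glycan falls outside its fence, and the toy values are all assumptions made for the example.

```python
import pandas as pd

def remove_iqr_outliers(X: pd.DataFrame, y: pd.Series, k: float = 1.5):
    """Drop samples with any feature outside [Q1 - k*IQR, Q3 + k*IQR].

    Assumption: filtering is sample-wise (one out-of-range feature removes
    the whole sample); the text does not specify this detail.
    """
    q1 = X.quantile(0.25)
    q3 = X.quantile(0.75)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    # A sample is kept only if every feature lies inside its own fence.
    keep = ((X >= lower) & (X <= upper)).all(axis=1)
    return X[keep], y[keep]

# Toy data: glycan names come from the text, the intensities are invented.
X_demo = pd.DataFrame({
    "A2G2": [1.0, 1.1, 0.9, 1.2, 1.0, 9.0],
    "FA2G2": [2.0, 2.1, 1.9, 2.2, 2.0, 2.1],
})
y_demo = pd.Series([0, 0, 0, 1, 1, 1])

X_clean, y_clean = remove_iqr_outliers(X_demo, y_demo)
print(len(X_demo), "->", len(X_clean), "samples")
```

In the toy data only the last sample (A2G2 = 9.0) lies outside the fences, so five of six samples remain. With small cohorts, a sample-wise rule of this kind can remove a noticeable fraction of the data, which is worth checking per task.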
Correlated features (Spearman rho > 0.9) were filtered using a novel approach that retained the more informative variable based on feature importance scores estimated from a preliminary decision tree classifier. This strategy was chosen because it better aligns feature selection with the model’s decision logic, enhances interpretability, and preserves features that contribute meaningfully to classification rather than relying solely on statistical dispersion.
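One way to express this filter in code is sketched here. The function name `prune_correlated`, the use of the absolute correlation, the tie-breaking rule, and the synthetic feature names are assumptions for illustration; only the 0.9 Spearman threshold and the idea of keeping the feature with the higher preliminary tree importance come from the text.

```python
import numpy as np
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

def prune_correlated(X: pd.DataFrame, y, threshold: float = 0.9, random_state: int = 0):
    """For each highly correlated pair, keep the feature with higher tree importance."""
    # Preliminary tree supplies one importance score per feature.
    tree = DecisionTreeClassifier(random_state=random_state).fit(X, y)
    importance = pd.Series(tree.feature_importances_, index=X.columns)

    # Assumption: strong negative correlations are treated as redundant too.
    corr = X.corr(method="spearman").abs()

    dropped = set()
    cols = list(X.columns)
    for i, a in enumerate(cols):
        for b in cols[i + 1:]:
            if a in dropped or b in dropped:
                continue
            if corr.loc[a, b] > threshold:
                # Ties go to the feature that appears first (assumption).
                dropped.add(b if importance[a] >= importance[b] else a)
    kept = [c for c in cols if c not in dropped]
    return X[kept], sorted(dropped)

# Synthetic example with hypothetical feature names.
rng = np.random.default_rng(0)
base = rng.normal(size=40)
X_demo = pd.DataFrame({
    "glycan_a": base,
    "glycan_b": 2.0 * base + 1.0,     # monotone copy of glycan_a (Spearman rho = 1)
    "glycan_c": rng.normal(size=40),  # unrelated feature
})
y_demo = (base > 0).astype(int)

X_kept, dropped = prune_correlated(X_demo, y_demo)
print("kept:", list(X_kept.columns), "dropped:", dropped)
```

Exactly one member of the redundant pair is removed, and the unrelated feature is retained. Because the importance scores come from a tree fitted to the labels, applying this step inside each training fold rather than on the full dataset would avoid information leaking into the evaluation.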
Feature scaling was applied using scikit-learn’s StandardScaler, transforming variables to have zero mean and unit variance. Although decision trees are generally insensitive to monotonic transformations, scaling was included to explore indirect effects on feature correlation structure and to maintain compatibility with potential future comparisons using other model families. These preprocessing options were applied in all combinations (with/without outlier removal; with/without scaling) for each classification task to assess robustness and evaluate their influence on model structure, feature selection stability, and predictive performance.
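The insensitivity of decision trees to standardization can be checked directly. The sketch below uses synthetic data from `make_classification` as a stand-in for the glycan matrix; the dataset, the split ratio, and the tree depth are assumptions chosen only to make the comparison runnable.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for a glycan intensity matrix (values are invented).
X, y = make_classification(n_samples=60, n_features=6, n_informative=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Fit the scaler on the training portion only, then apply it to both parts.
scaler = StandardScaler().fit(X_train)

tree_raw = DecisionTreeClassifier(max_depth=3, random_state=0).fit(X_train, y_train)
tree_scaled = DecisionTreeClassifier(max_depth=3, random_state=0).fit(
    scaler.transform(X_train), y_train
)

pred_raw = tree_raw.predict(X_test)
pred_scaled = tree_scaled.predict(scaler.transform(X_test))
agreement = float(np.mean(pred_raw == pred_scaled))
print(f"prediction agreement between raw and scaled trees: {agreement:.2f}")
```

Standardization shifts and rescales each feature without changing the order of its values, so the same sample partitions are available to both trees and only the numeric thresholds differ.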
Feature Selection and Model Training
Model performance was evaluated using 5×10 repeated stratified cross-validation, which ensured robustness while maintaining class balance across folds. This approach reduced variance caused by random splits, ensured that all samples contributed to both training and testing phases across iterations, and provided a more stable estimate of model performance by averaging results over multiple fold repetitions, which is particularly beneficial in studies with limited sample sizes.
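A repeated stratified scheme of this kind could be set up in scikit-learn roughly as follows. Reading "5×10" as 5 folds repeated 10 times is an assumption, as are the synthetic data (22 + 22 samples would be the real analogue) and the fixed tree depth.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold, cross_validate
from sklearn.tree import DecisionTreeClassifier

# Metric names as used by scikit-learn's scoring registry.
SCORING = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]

# Synthetic two-class data standing in for one binary task.
X, y = make_classification(n_samples=44, n_features=8, n_informative=3, random_state=0)

# Assumption: 5 stratified folds, repeated 10 times with different shuffles.
cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)

scores = cross_validate(
    DecisionTreeClassifier(max_depth=3, random_state=0), X, y, cv=cv, scoring=SCORING
)

# Mean and standard deviation over all 50 train/test splits.
summary = {m: (scores[f"test_{m}"].mean(), scores[f"test_{m}"].std()) for m in SCORING}
for metric, (mean, std) in summary.items():
    print(f"{metric}: {mean:.2f} ± {std:.2f}")
```

Each of the 50 splits yields one score per metric, and the reported value is their mean with the standard deviation as a measure of fold-to-fold variability.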
Hyperparameter optimization was performed using grid search with the same 5 × 10 repeated stratified cross-validation scheme. Stratification maintained equal class representation in each fold, which is particularly important for biomedical datasets with limited but balanced sample sizes.
The parameter grid included:
Criterion: [“gini”, “entropy”]
Max depth: [3, 5, 10, None]
Min samples split: [2, 5, 10]
Min samples leaf: [1, 2, 5]
The best-performing model configurations were retrained on the entire dataset to support interpretability analysis and visualization. This retraining allowed the construction of final decision trees using all available data, thereby maximizing statistical power and enabling clear threshold-based rule sets.
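The grid search and the final refit could look roughly like this in scikit-learn. The parameter grid is taken from the list above; the helper name `tune_tree`, the choice of `f1_macro` as the selection metric, the reading of "5 × 10" as 5 folds with 10 repeats, and the synthetic data are assumptions. The demo call uses fewer repeats purely to keep the run short.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, RepeatedStratifiedKFold
from sklearn.tree import DecisionTreeClassifier

# Grid as listed in the text, using scikit-learn's parameter names.
PARAM_GRID = {
    "criterion": ["gini", "entropy"],
    "max_depth": [3, 5, 10, None],
    "min_samples_split": [2, 5, 10],
    "min_samples_leaf": [1, 2, 5],
}

def tune_tree(X, y, n_splits=5, n_repeats=10, seed=0):
    """Grid search over PARAM_GRID with repeated stratified cross-validation."""
    cv = RepeatedStratifiedKFold(n_splits=n_splits, n_repeats=n_repeats, random_state=seed)
    search = GridSearchCV(
        DecisionTreeClassifier(random_state=seed),
        PARAM_GRID,
        scoring="f1_macro",  # assumption: selection metric is not stated in the text
        cv=cv,
        refit=True,          # retrain the best configuration on all supplied data
    )
    return search.fit(X, y)

# Synthetic two-class data standing in for one binary task.
X, y = make_classification(n_samples=44, n_features=8, n_informative=3, random_state=0)

search = tune_tree(X, y, n_repeats=2)  # reduced repeats for a quick demonstration
print(search.best_params_)
final_tree = search.best_estimator_    # fitted on the full dataset via refit=True
```

With `refit=True`, `best_estimator_` is already the tree retrained on all samples, so it can be passed directly to the visualization and rule-extraction steps. Its cross-validated score, not its fit on the full data, is the figure to report as performance.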
Performance Evaluation
Model performance was assessed using accuracy, precision_macro, recall_macro, and f1_macro metrics. These metrics were chosen to provide a balanced evaluation of classification performance, particularly in a clinical context where both sensitivity (recall) and reliability (precision) are essential. Macro-averaging was used to ensure equal weight was given to each class, regardless of sample size, which is particularly relevant in datasets with modest but balanced group sizes, as in this study. F1-score was emphasized as the harmonic mean of precision and recall, capturing the trade-off between false positives and false negatives, which is critical in diagnostic decision-making.
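To make the macro-averaged F1-score concrete, the following sketch computes it by hand for a small invented prediction vector and compares the result with scikit-learn's `f1_score`. The helper name `macro_f1` and the label vectors are illustrative assumptions.

```python
import numpy as np
from sklearn.metrics import f1_score

def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (harmonic mean of precision and recall)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    per_class = []
    for c in np.unique(y_true):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        per_class.append(f1)
    return float(np.mean(per_class))

# Invented labels: 4 samples per class, 3 misclassified.
y_true = [0, 0, 0, 0, 1, 1, 1, 1]
y_pred = [0, 0, 0, 1, 1, 1, 0, 0]

manual = macro_f1(y_true, y_pred)
reference = f1_score(y_true, y_pred, average="macro")
print(f"manual: {manual:.4f}, scikit-learn: {reference:.4f}")
```

Here class 0 has precision 3/5 and recall 3/4 (F1 = 2/3), class 1 has precision 2/3 and recall 1/2 (F1 = 4/7), and the macro average is 13/21 ≈ 0.619. Each class contributes equally regardless of how many samples it contains.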
Interpretability and Feature Importance
Feature importance analysis was conducted to determine which glycan structures contributed most to classification. Scores were extracted from decision tree models trained on each cross-validation split using the feature_importances_ attribute provided by scikit-learn. For each binary classification task, importance values were averaged across all folds, and standard deviations were computed to assess the stability of feature selection. This strategy enabled the identification of glycan features that consistently influenced decision boundaries, independent of train/test split or preprocessing configuration [2, 4].
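The fold-wise aggregation of importance scores could be sketched as below. The helper name `importance_across_folds`, the hypothetical feature names, the synthetic data, and the fixed tree depth are assumptions; the use of `feature_importances_` with mean and standard deviation across splits follows the description above.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import RepeatedStratifiedKFold
from sklearn.tree import DecisionTreeClassifier

def importance_across_folds(X: pd.DataFrame, y: np.ndarray, cv, **tree_params) -> pd.DataFrame:
    """Mean and standard deviation of tree feature importances over CV training splits."""
    rows = []
    for train_idx, _ in cv.split(X, y):
        tree = DecisionTreeClassifier(random_state=0, **tree_params)
        tree.fit(X.iloc[train_idx], y[train_idx])
        rows.append(tree.feature_importances_)
    rows = np.vstack(rows)
    table = pd.DataFrame(
        {"mean": rows.mean(axis=0), "std": rows.std(axis=0)}, index=X.columns
    )
    return table.sort_values("mean", ascending=False)

# Synthetic data with hypothetical feature names.
X_arr, y = make_classification(n_samples=44, n_features=6, n_informative=3, random_state=0)
X = pd.DataFrame(X_arr, columns=[f"glycan_{i}" for i in range(6)])

cv = RepeatedStratifiedKFold(n_splits=5, n_repeats=10, random_state=0)
table = importance_across_folds(X, y, cv, max_depth=3)
print(table)
```

A feature with a high mean and a low standard deviation is selected consistently across splits, whereas a high standard deviation indicates that its role depends on which samples happen to be in the training fold.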
The resulting scores were visualized as bar plots, with error bars representing fold-wise standard deviations. Shared and task-specific glycans were examined to support biological interpretability, particularly in the light of previous evidence linking glycosylation to vascular and inflammatory processes [7, 8, 14]. Additionally, jointplots of top-ranked feature pairs were used to illustrate class separation and overlay decision boundaries. Final decision trees were retrained on the full dataset using the best-performing parameter settings and visualized to support transparent review of the model structure.
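As a text-based complement to graphical tree plots, the threshold rules of a fitted tree can be printed with scikit-learn's `export_text`. The data, feature names, and depth limit in this sketch are assumptions for illustration and do not reproduce the study's trees.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier, export_text

# Synthetic data with hypothetical feature names.
X, y = make_classification(n_samples=44, n_features=6, n_informative=3, random_state=0)
feature_names = [f"glycan_{i}" for i in range(6)]

# A shallow tree fitted on all samples, mirroring the final retraining step.
tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)

# Human-readable if/else rules with one threshold per split.
rules = export_text(tree, feature_names=feature_names)
print(rules)
```

Each printed line corresponds to one split of the form `feature <= threshold`, ending in a predicted class, which makes a two- or three-split tree straightforward to review alongside the plotted version.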