# Super Learner Algorithm for Carotid Artery Disease Diagnosis: A Machine Learning Approach Leveraging Craniocervical CT Angiography

## Abstract

## 1. Introduction

## 2. Materials and Methods

#### 2.1. Pearson Correlation Coefficient

#### 2.2. Chi-Squared Test

#### 2.3. Standard Deviation Normalization (Z-Score)

#### 2.4. Feature Importance via Lasso Regularization

#### 2.5. Recursive Feature Elimination (RFE)

1. Fit the model to the data.
2. Rank the features based on their importance.
3. Remove the least important feature(s).
4. Repeat the process until the desired number of features is obtained.
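The four steps above can be sketched with scikit-learn's `RFE` class. This is a minimal illustration rather than the study's pipeline: the synthetic data, the logistic regression base estimator, and the target of six features are all assumptions chosen for the example.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in data (an assumption): 120 samples, 11 features, 4 classes.
X, y = make_classification(
    n_samples=120, n_features=11, n_informative=5, n_redundant=2,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Fit, rank by coefficient magnitude, drop one feature, and repeat
# until the requested number of features remains.
selector = RFE(
    estimator=LogisticRegression(max_iter=1000),  # hypothetical base model
    n_features_to_select=6,                       # hypothetical target size
    step=1,                                       # remove one feature per iteration
)
selector.fit(X, y)

selected = [i for i, keep in enumerate(selector.support_) if keep]
print("selected feature indices:", selected)
print("ranking (1 = kept):", selector.ranking_.tolist())
```

Features that survive every round receive rank 1 in `ranking_`; larger ranks mark features that were eliminated earlier.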

#### 2.6. Machine Learning Models

- **Extreme Gradient Boosting (XGBoost):** This algorithm uses a gradient boosting framework which improves the performance of decision trees by combining multiple weak models.
- **Light Gradient Boosting Machine (LightGBM):** This is an efficient gradient boosting framework built on decision tree algorithms. It is designed for quick and accurate model training.
- **Random forests (RF):** This is an ensemble learning method which constructs multiple decision trees during training and outputs the mode of their predicted classes.
- **Bootstrap aggregation (bagging):** This method reduces variance by training multiple models on different bootstrap subsets of the data and averaging their predictions.
- **Adaptive Boosting (AdaBoost):** This is a boosting technique which combines multiple weak classifiers to create a strong classifier.
- **Extremely Randomized Trees (ExtraTrees):** This method randomizes the choice of split points and features to reduce variance in high-dimensional data.

#### 2.7. Data Collection

#### 2.8. Feature Extraction and Preprocessing

**Filter-based FS algorithms** are statistical methods which rank features using the intrinsic properties of the data, independently of the learning algorithm [32]. One such method is the correlation filter, which uses the Pearson correlation coefficient (PCC) to identify highly correlated features. The PCC is calculated with Equation (1):

$$\mathrm{PCC}=\frac{\sum_{i}\left({X}_{i}-\overline{X}\right)\left({Y}_{i}-\overline{Y}\right)}{\sqrt{\sum_{i}{\left({X}_{i}-\overline{X}\right)}^{2}}\sqrt{\sum_{i}{\left({Y}_{i}-\overline{Y}\right)}^{2}}}\qquad (1)$$

Here, ${X}_{i}$ and ${Y}_{i}$ are individual data points, while $\overline{X}$ and $\overline{Y}$ are their respective mean values. PCC values closer to 1 indicate a strong positive relationship, values closer to −1 indicate a strong negative relationship, and values near 0 suggest no linear correlation [33]. Additionally, the chi-squared test was used to measure the dependence among features, which is particularly useful for categorical data.

**Embedded methods** incorporate feature selection as part of the model training process. Unlike wrapper methods, they evaluate the importance of features without building a new model each time a different subset is selected. In our study, we used lasso regularization, random forests, and gradient-boosted trees to determine feature importance. Lasso regularization, for instance, minimizes the objective function given in Equation (4). This method penalizes the absolute size of the coefficients, shrinking some of them to exactly zero and thus performing variable selection and regularization simultaneously.

**Hybrid methods** combine the strengths of both filter-based and embedded methods. Recursive feature elimination (RFE) is a hybrid method we used which iteratively fits a model and removes the least important feature(s) until the specified number of features is reached. The process involves the following:

1. Fitting the model to the data.
2. Ranking the features based on their importance.
3. Removing the least important feature(s).
4. Repeating the process until the desired number of features is obtained.

RFE is a greedy optimization algorithm which repeatedly builds models and excludes the least important feature at each iteration to find the best-performing subset.
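To make the correlation filter concrete, the sketch below computes the PCC directly from its definition (centered cross-products divided by the product of the centered norms) and uses it to drop one of each pair of highly correlated columns. The function names, the 0.9 threshold, and the synthetic columns are assumptions made for illustration; the source does not state which threshold was used.

```python
import numpy as np

def pearson_cc(x, y):
    """Pearson correlation coefficient of two 1-D arrays."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    dx, dy = x - x.mean(), y - y.mean()
    return float((dx * dy).sum() / np.sqrt((dx ** 2).sum() * (dy ** 2).sum()))

def correlation_filter(X, threshold=0.9):
    """Keep a column only if its |PCC| with every kept column is <= threshold."""
    kept = []
    for j in range(X.shape[1]):
        if all(abs(pearson_cc(X[:, j], X[:, k])) <= threshold for k in kept):
            kept.append(j)
    return kept

# Hypothetical data: column 1 is almost a rescaled copy of column 0.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
X = np.column_stack([a, 2.0 * a + 0.01 * rng.normal(size=200), b])

print(correlation_filter(X))  # column 1 is redundant with column 0
```

The filter is order-dependent: of two correlated columns, the one that appears first is retained. A real pipeline might instead keep whichever column correlates more strongly with the target.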

**Normal cases**: The demographic structure of the study’s normal sample included a total of 31 participants, with 16 females and 15 males. The ages of the participants ranged from 30 to 79 years, with a mean age of 55.45 years and a standard deviation of 14.16, indicating a moderately diverse age range. The difference between the measured $\cos{\theta}^{\prime}$ and the predicted $\cos\theta$ values (based on Murray’s law) varied between −0.456 and 0.165, with a mean of −0.102 and a standard deviation of 0.153, suggesting a slight tendency for the measured values to be lower than predicted.

Regarding anatomical measurements, the right internal carotid artery diameter (ICAdiaR) varied from 4.4 mm to 8.0 mm, with a mean of 5.703 mm and a standard deviation of 0.812, reflecting relatively low variability. The left external carotid artery angle (Car.Angle.L$\alpha$) spanned from 12° to 58°, with an average of 27.2° and a standard deviation of 12.639, indicating notable variation. The total right carotid angles (Car.Angle.$\mathrm{\Sigma}$R) ranged from 20.8° to 76.0°, with a mean of 46.65° and a standard deviation of 14.934, while the total left carotid angles (Car.Angle.$\mathrm{\Sigma}$L) varied more significantly, ranging from 26.0° to 105.4°, with a mean of 50.97° and a standard deviation of 19.198. The right carotid angle (Car.Angle.$\theta$R) had values between 6.3° and 54.1°, with a mean of 23.05° and a standard deviation of 11.427, while the left carotid angle (Car.Angle.$\theta$L) ranged from 4.7° to 49.0°, with a mean of 23.77° and a standard deviation of 12.48. The combined right and left carotid angles (Car.Angle.(R+L)$\theta$) spanned from 6.1° to 47.65°, with an average of 23.41° and a standard deviation of 10.836. Lastly, the right common carotid artery diameter (CCAdiaR) ranged from 5.1 mm to 10.7 mm, with a mean of 7.161 mm and a standard deviation of 1.049, showing relatively consistent measurements across the participants.

**Stenosis cases**: The demographic structure of stenosis cases consists of 30 participants, with 10 females and 20 males. The ages of the participants ranged from 52 to 70 years, with a mean age of 63.73 years and a standard deviation of 4.948, indicating a somewhat narrow age range. The difference between the measured $\cos{\theta}^{\prime}$ and the predicted $\cos\theta$ values (based on Murray’s law) varied between −0.458 and 0.093, with a mean of −0.177 and a standard deviation of 0.140, suggesting that the measured values tended to be lower than predicted.

In terms of anatomical measurements, the right internal carotid artery diameter (ICAdiaR) ranged from 3.5 mm to 10.5 mm, with a mean of 5.707 mm and a standard deviation of 1.484, indicating notable variability. The left external carotid artery angle (Car.Angle.L$\alpha$) spanned from −67.3° to 45.0°, with an average of 16.77° and a standard deviation of 20.079, suggesting significant variation. The total right carotid angles (Car.Angle.$\mathrm{\Sigma}$R) ranged from 22.6° to 66.4°, with a mean of 41.51° and a standard deviation of 10.079, while the total left carotid angles (Car.Angle.$\mathrm{\Sigma}$L) ranged from 10.7° to 83.9°, with a mean of 42.24° and a standard deviation of 16.088, indicating more variability on the left side. The right carotid angle (Car.Angle.$\theta$R) ranged from 3.0° to 36.6°, with a mean of 17.44° and a standard deviation of 7.68, while the left carotid angle (Car.Angle.$\theta$L) ranged from 4.0° to 78.0°, with a mean of 25.48° and a standard deviation of 15.115, reflecting wider variability on the left side. The total right and left carotid angles (Car.Angle.(R+L)$\theta$) ranged from 9.35° to 53.7°, with a mean of 23.46° and a standard deviation of 9.511. Lastly, the right common carotid artery diameter (CCAdiaR) spanned from 5.9 mm to 11.1 mm, with a mean of 7.748 mm and a standard deviation of 1.898, showing relatively consistent measurements.

**Aneurysm cases**: The demographic structure of aneurysm cases consists of 30 participants, with 13 females and 17 males. The ages of the participants ranged from 33 to 74 years, with a mean age of 53.17 years and a standard deviation of 11.68, indicating moderate age variation. The difference between the measured $\cos{\theta}^{\prime}$ and the predicted $\cos\theta$ values (based on Murray’s law) varied between −0.391 and 0.764, with a mean of 0.101 and a standard deviation of 0.258, showing greater deviation compared with the other cases.

In terms of anatomical measurements, the right internal carotid artery diameter (ICAdiaR) ranged from 3.6 mm to 9.4 mm, with a mean of 6.693 mm and a standard deviation of 1.371, indicating a wider variation in the artery diameter. The left external carotid artery angle (Car.Angle.L$\alpha$) ranged from 0.0° to 46.9°, with an average of 21.08° and a standard deviation of 11.115, reflecting significant variability. The total right carotid angles (Car.Angle.$\mathrm{\Sigma}$R) spanned from 29.2° to 125.0°, with a mean of 53.997° and a standard deviation of 19.547, while the total left carotid angles (Car.Angle.$\mathrm{\Sigma}$L) ranged from 24.1° to 102.6°, with a mean of 52.317° and a standard deviation of 17.026, indicating similar variability on both sides. The right carotid angle (Car.Angle.$\theta$R) ranged from 4.3° to 67.1°, with a mean of 28.76° and a standard deviation of 13.267, while the left carotid angle (Car.Angle.$\theta$L) ranged from 8.1° to 63.8°, with a mean of 31.23° and a standard deviation of 14.587, reflecting higher variation. The total right and left carotid angles (Car.Angle.(R+L)$\theta$) spanned from 8.05° to 61.4°, with a mean of 29.998° and a standard deviation of 12.894. Lastly, the right common carotid artery diameter (CCAdiaR) ranged from 4.9 mm to 10.8 mm, with a mean of 7.393 mm and a standard deviation of 1.353, showing relatively consistent measurements across the participants.

**Dissection cases**: The demographic structure of the dissection cases included 31 participants, with 18 females and 13 males. The ages of the participants ranged from 27 to 76 years, with a mean age of 48.71 years and a standard deviation of 10.558, reflecting moderate age variation. The difference between the measured $\cos{\theta}^{\prime}$ and the predicted $\cos\theta$ values (based on Murray’s law) ranged from −0.429 to 0.256, with a mean of −0.817 and a standard deviation of 0.177, indicating a tendency for the measured values to be lower than predicted.

Regarding anatomical measurements, the right internal carotid artery diameter (ICAdiaR) spanned from 3.3 mm to 6.7 mm, with a mean of 4.877 mm and a standard deviation of 0.978, showing moderate variability. The left external carotid artery angle (Car.Angle.L$\alpha$) ranged from 1.0° to 61.2°, with a mean of 22.094° and a standard deviation of 12.735, reflecting significant variation. The total right carotid angles (Car.Angle.$\mathrm{\Sigma}$R) varied between 23.8° and 99.9°, with a mean of 49.629° and a standard deviation of 16.831, while the total left carotid angles (Car.Angle.$\mathrm{\Sigma}$L) spanned from 25.2° to 93.8°, with a mean of 51.545° and a standard deviation of 16.909, indicating similar variability on both sides. The right carotid angle (Car.Angle.$\theta$R) ranged from 7.2° to 40.8°, with a mean of 23.516° and a standard deviation of 8.299, while the left carotid angle (Car.Angle.$\theta$L) varied from 7.4° to 55.9°, with a mean of 29.452° and a standard deviation of 12.408, showing greater variation on the left side. The total right and left carotid angles (Car.Angle.(R+L)$\theta$) ranged from 12.7° to 46.85°, with a mean of 26.484° and a standard deviation of 9.458. Lastly, the right common carotid artery diameter (CCAdiaR) ranged from 4.9 mm to 8.9 mm, with a mean of 6.261 mm and a standard deviation of 0.888, indicating relatively consistent measurements across the participants.

## 3. Enhanced Model Robustness Techniques

#### 3.1. Cross-Validation

```python
from sklearn.model_selection import KFold

# X and y are NumPy arrays; model is any scikit-learn estimator.
kf = KFold(n_splits=5)
for train_index, test_index in kf.split(X):
    X_train, X_test = X[train_index], X[test_index]
    y_train, y_test = y[train_index], y[test_index]
    model.fit(X_train, y_train)          # refit on the training folds
    predictions = model.predict(X_test)  # predict the held-out fold
```

#### 3.2. Bootstrapping

```python
from sklearn.utils import resample

# Draw a bootstrap sample (with replacement) the same size as the training set.
X_bootstrap, y_bootstrap = resample(X_train, y_train,
                                    replace=True,
                                    n_samples=len(X_train))
model.fit(X_bootstrap, y_bootstrap)
```

#### 3.3. Data Augmentation

```python
import numpy as np

# Perturb the features with small zero-mean Gaussian noise (standard deviation 0.01).
noise = np.random.normal(0, 0.01, X_train.shape)
X_augmented = X_train + noise
model.fit(X_augmented, y_train)
```

#### 3.4. Synthetic Data Generation (SMOTE)

```python
from imblearn.over_sampling import SMOTE

# Oversample the minority classes with synthetic, interpolated samples.
smote = SMOTE()
X_resampled, y_resampled = smote.fit_resample(X_train, y_train)
model.fit(X_resampled, y_resampled)
```

#### 3.5. Ensemble Learning

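```python
from sklearn.ensemble import (AdaBoostClassifier, RandomForestClassifier,
                              StackingClassifier)
from sklearn.linear_model import LogisticRegression
from xgboost import XGBClassifier

# Base learners whose predictions feed the meta-learner.
estimators = [
    ('rf', RandomForestClassifier(n_estimators=100)),
    ('xgb', XGBClassifier()),
    ('ada', AdaBoostClassifier()),
]
# Logistic regression combines the base learners' predictions.
stacking_model = StackingClassifier(estimators=estimators,
                                    final_estimator=LogisticRegression())
stacking_model.fit(X_train, y_train)
```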

## 4. Construction of Carotid Artery Disease Detection Model

- **Extreme Gradient Boosting (XGBoost)**: XGBoost is a supervised ML algorithm based on the tree boosting method [35]. It is an ensemble learning algorithm which creates a final model from a collection of individual models, typically decision trees. XGBoost minimizes the loss function by gradient-based optimization, using second-order gradient information to improve model performance.
- **Light Gradient-Boosting Machine (LightGBM)**: LightGBM is a variant of gradient boosting which achieves superior performance, especially with high-dimensional data and large datasets [36]. It employs two novel techniques, gradient-based one-side sampling (GOSS) and exclusive feature bundling (EFB), which enhance the training speed and efficiency. Like XGBoost, LightGBM is based on decision tree algorithms.
- **Random forests (RF)**: RF is a robust ensemble learning method based on a collection of decision trees [37]. Each tree is constructed using a random vector sampled independently but with the same distribution across all trees. Nodes in the decision trees are split based on measures like entropy or the Gini index.
- **Bootstrap aggregation (bagging)**: Bagging is an ensemble learning technique which reduces variance in noisy datasets; random forests can be viewed as an extension of it [38]. It involves selecting random samples of data with replacement, training multiple models independently, and averaging their predictions for improved accuracy.
- **Adaptive Boosting (AdaBoost)**: AdaBoost is a boosting approach which generates a robust classifier from a set of weak classifiers. It maintains weights over the training data and adjusts them adaptively after each learning cycle, increasing the weights for incorrectly classified samples and decreasing the weights for correctly classified ones.
- **Extremely Randomized Trees (ExtraTrees)**: ExtraTrees is an ensemble learning technique based on decision trees. Unlike RF, which searches for the best split threshold among the candidate features, ExtraTrees draws split thresholds at random, providing a robust approach for high-dimensional data by balancing bias and variance.
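As a rough illustration of how such tree ensembles can be compared side by side, the sketch below cross-validates four of the models on synthetic data. It is not the study's experiment: the data are generated, the hyperparameters are arbitrary assumptions, and XGBoost and LightGBM are left out only so that the example depends on scikit-learn alone.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (AdaBoostClassifier, BaggingClassifier,
                              ExtraTreesClassifier, RandomForestClassifier)
from sklearn.model_selection import cross_val_score

# Synthetic four-class stand-in for the clinical feature table (an assumption).
X, y = make_classification(
    n_samples=200, n_features=11, n_informative=6,
    n_classes=4, n_clusters_per_class=1, random_state=0,
)

# Hypothetical hyperparameters; a tuned study would search over these.
models = {
    "Random Forests": RandomForestClassifier(n_estimators=100, random_state=0),
    "Bagging": BaggingClassifier(n_estimators=50, random_state=0),
    "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
    "ExtraTrees": ExtraTreesClassifier(n_estimators=100, random_state=0),
}

# Mean accuracy over 5 stratified folds for each model.
scores = {name: cross_val_score(model, X, y, cv=5).mean()
          for name, model in models.items()}
for name, acc in scores.items():
    print(f"{name}: {acc:.2f}")
```

Because every model sees the same folds, the differences in mean accuracy reflect the models rather than the split, although on a dataset this small they remain noisy.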

## 5. Dataset Expansion Using SMOTE

#### 5.1. SMOTE Process

#### 5.2. Outcome of SMOTE Application

- Expanding the dataset to 1000 samples.
- Addressing class scarcity by increasing the minority class to 920 instances.
- Enhancing model accuracy, particularly for the minority classes.
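SMOTE creates each synthetic sample by interpolating between a minority-class sample and one of its nearest minority-class neighbours. The sketch below implements only that interpolation step in NumPy, as an illustration of the idea; it is not the imbalanced-learn implementation, and the function name, `k=5`, and the random data are assumptions.

```python
import numpy as np

def smote_sample(X_minority, n_new, k=5, seed=0):
    """Generate n_new synthetic rows by interpolating towards k-nearest neighbours."""
    rng = np.random.default_rng(seed)
    X_minority = np.asarray(X_minority, dtype=float)
    n = len(X_minority)          # requires at least two minority samples
    k = min(k, n - 1)
    # Pairwise Euclidean distances; a point is never its own neighbour.
    dist = np.linalg.norm(X_minority[:, None, :] - X_minority[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbours = np.argsort(dist, axis=1)[:, :k]
    # Pick a base sample, one of its neighbours, and a random position between them.
    base = rng.integers(0, n, size=n_new)
    picked = neighbours[base, rng.integers(0, k, size=n_new)]
    gap = rng.random((n_new, 1))
    return X_minority[base] + gap * (X_minority[picked] - X_minority[base])

# Hypothetical minority class with 30 samples and 3 features.
rng = np.random.default_rng(1)
X_min = rng.normal(size=(30, 3))
X_new = smote_sample(X_min, n_new=100)
print(X_new.shape)  # (100, 3)
```

Every synthetic point lies on a line segment between two real samples, so the expanded data never leave the range spanned by the original minority class. This is consistent with the near-identical minimum and maximum values in the original and expanded summary tables.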

## 6. Model Accuracy Comparison: Original Dataset versus SMOTE-Expanded Dataset

#### 6.1. Model Training on Original Dataset

#### 6.2. Model Training on SMOTE-Expanded Dataset

#### 6.3. Comparison and Analysis

## 7. Performance Comparison: Original versus SMOTE with Optuna

#### 7.1. Optuna Optimization on Original Data and on SMOTE-Expanded Data

#### 7.2. Comparison of Performance

- **XGBoost**: Accuracy increased from 0.81 to 0.89 after SMOTE and Optuna optimization.
- **LightGBM**: Accuracy improved from 0.86 to 0.91, making it the top performer.
- **Bagging and random forests**: Bagging’s accuracy increased from 0.86 to 0.90, while that of random forests improved from 0.81 to 0.88.
- **AdaBoost and ExtraTrees**: AdaBoost’s accuracy improved from 0.81 to 0.87, while that of ExtraTrees increased from 0.71 to 0.79.

#### 7.3. Performance Metrics

- **True negative (TN)**: A TN is an outcome where the model correctly identifies a case without carotid artery disease as non-diseased.
- **True positive (TP)**: A TP is an outcome where the model correctly identifies a case with carotid artery disease as diseased.
- **False negative (FN)**: An FN is an outcome where the model incorrectly identifies a case with carotid artery disease as non-diseased.
- **False positive (FP)**: An FP is an outcome where the model incorrectly identifies a case without carotid artery disease as diseased.
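The standard accuracy, precision, recall, and F score follow directly from these four counts. The sketch below computes them and, as a consistency check, recomputes the F score from two precision and recall pairs reported in this paper's per-class results (0.83 and 1.0 for aneurysm, 1.0 and 0.80 for dissection). The function names and the example counts are assumptions for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f = f_score(precision, recall) if (precision + recall) else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_score": f}

def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts for one class in a one-vs-rest evaluation.
m = classification_metrics(tp=5, fp=1, fn=0, tn=15)
print({name: round(value, 2) for name, value in m.items()})

# Recompute F scores from reported precision/recall pairs.
print(round(f_score(0.83, 1.0), 2))   # 0.91
print(round(f_score(1.0, 0.80), 2))   # 0.89
```

In a multi-class setting such as this one, the counts are taken per class, treating that class as positive and all others as negative.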

## 8. Conclusions

#### Future Directions

## Author Contributions

## Funding

## Institutional Review Board Statement

## Informed Consent Statement

## Data Availability Statement

## Acknowledgments

## Conflicts of Interest

## References

1. Campbell, B.C.V.; Silva, D.A.D.; Macleod, M.R.; Coutts, S.B.; Schwamm, L.H.; Davis, S.M.; Donnan, G.A. Ischaemic stroke. Nat. Rev. Dis. Prim. **2019**, 5, 70.
2. Shiber, J.R.; Fontane, E.; Adewale, A. Stroke registry: Hemorrhagic vs ischemic strokes. Am. J. Emerg. Med. **2010**, 28, 331–333.
3. Sacco, R.L. Risk factors, outcomes, and stroke subtypes for ischemic stroke. Neurology **1997**, 49, S39–S44.
4. Romero, J.R. Prevention of Ischemic Stroke: Overview of Traditional Risk Factors. Curr. Drug Targets **2007**, 8, 794–801.
5. Pandian, J.D.; Gall, S.L.; Kate, M.P.; Silva, G.S.; Akinyemi, R.O.; Ovbiagele, B.I.; Lavados, P.M.; Gandhi, D.B.C.; Thrift, A.G. Prevention of stroke: A global perspective. Lancet **2018**, 392, 1269–1278.
6. Rittenhouse, E.A.; Radke, H.M.; Sumner, D.S. Carotid Artery Aneurysm: Review of the Literature and Report of a Case With Rupture Into the Oropharynx. Arch. Surg. **1972**, 105, 786–789.
7. Bahram, M. Spontaneous dissections of internal carotid arteries. Neurologist **1997**, 3, 104–119.
8. Murray, C.D. The physiological principle of minimum work applied to the angle of branching of arteries. J. Gen. Physiol. **1926**, 9, 835–841.
9. Murray, C.D. A relationship between circumference and weight in trees and its bearing on branching angles. J. Gen. Physiol. **1927**, 10, 725–729.
10. Prasad, K.M.; Radhakrishnamacharya, G. Flow of Herschel-Bulkley fluid through an inclined tube of nonuniform cross-section with multiple stenosis. Arch. Mech. **2008**, 60, 161–172.
11. Dhange, M.; Sankad, G.; Safdar, R.; Jamshed, W.; Eid, M.R.; Bhujakkanavar, U.; Gouadria, S.; Chouikh, R. A mathematical model of blood flow in a stenosed artery with post-stenotic dilatation and a forced field. PLoS ONE **2022**, 17, e0266727.
12. Sun, J.; Guo, L.; Jing, J.; Tang, C.; Lu, Y.; Fu, J.; Ullmann, A.; Brauner, N. Investigation on laminar pipe flow of a non-Newtonian Carreau-Extended fluid. J. Pet. Sci. Eng. **2021**, 205, 108915.
13. Reid, L. An Introduction to Biomedical Computational Fluid Dynamics. In Biomedical Visualisation: Volume 10; Rea, P.M., Ed.; Springer Nature: Cham, Switzerland, 2021; pp. 205–222.
14. Apaydin, M.; Cetinoglu, K. Carotid angle in young stroke. Clin. Imaging **2021**, 70, 10–17.
15. Noh, S.M.; Kang, H.G. Clinical significance of the internal carotid artery angle in ischemic stroke. Sci. Rep. **2019**, 9, 4618.
16. Ojaare, M.G.; Annougu, T.I.; Msuega, C.D.; Mohammad, H.O.; Farati, A.; Alexander, A.; Umer, B.P. Carotid artery diameter assessment in men and women and the relation to age, sex and body mass index using ultrasonography. Int. J. Adv. Med. **2021**, 8, 1274–1279.
17. Tan, Q.; Qin, C.; Yang, J.; Wang, T.; Lin, H.; Lin, C.; Chen, X. Inner diameters of the normal carotid arteries measured using three-dimensional digital subtraction catheter angiography: A retrospective analysis. BMC Neurol. **2021**, 21, 292.
18. İbrahim Özdemir, H. The structural properties of carotid arteries in carotid artery diseases a retrospective computed tomography angiography study. Pol. J. Radiol. **2020**, 85, 82–89.
19. Yoshida, K.; Yang, T.; Yamamoto, Y.; Kurosaki, Y.; Funaki, T.; Kikuchi, T.; Ishii, A.; Kataoka, H.; Miyamoto, S. Expansive carotid artery remodeling: Possible marker of vulnerable plaque. J. Neurosurg. **2019**, 133, 1435–1440.
20. Le, E.P.; Wong, M.Y.; Rundo, L.; Tarkin, J.M.; Evans, N.R.; Weir-McCall, J.R.; Chowdhury, M.M.; Coughlin, P.A.; Pavey, H.; Zaccagna, F.; et al. Using machine learning to predict carotid artery symptoms from CT angiography: A radiomics and deep learning approach. Eur. J. Radiol. Open **2024**, 13, 100594.
21. Porcu, M.; Cau, R.; Suri, J.S.; Saba, L. Artificial intelligence-and radiomics-based evaluation of carotid artery disease. In Artificial Intelligence in Cardiothoracic Imaging; Springer: Berlin/Heidelberg, Germany, 2022; pp. 513–523.
22. Saba, L.; Chen, H.; Cau, R.; Rubeis, G.; Zhu, G.; Pisu, F.; Jang, B.; Lanzino, G.; Suri, J.; Qi, Y.; et al. Impact analysis of different CT configurations of carotid artery plaque calcifications on cerebrovascular events. Am. J. Neuroradiol. **2022**, 43, 272–279.
23. Goodfellow, I.J.; Bengio, Y.; Courville, A. Deep Learning; MIT Press: Cambridge, MA, USA, 2016.
24. Pisu, F.; Chen, H.; Jiang, B.; Zhu, G.; Usai, M.V.; Austermann, M.; Shehada, Y.; Johansson, E.; Suri, J.; Lanzino, G.; et al. Machine learning detects symptomatic patients with carotid plaques based on 6-type calcium configuration classification on CT angiography. Eur. Radiol. **2024**, 34, 3612–3623.
25. Cohen, I.; Huang, Y.; Chen, J.; Benesty, J. Pearson correlation coefficient. In Noise Reduction in Speech Processing; Springer: Berlin/Heidelberg, Germany, 2009; pp. 1–4.
26. Tallarida, R.J.; Murray, R.B. Chi-square test. In Manual of Pharmacologic Calculations: With Computer Programs; Springer: New York, NY, USA, 1987; pp. 140–142.
27. Colan, S.D. The why and how of Z scores. J. Am. Soc. Echocardiogr. **2013**, 26, 38–40.
28. Fonti, V.; Belitser, E. Feature selection using lasso. VU Amsterdam Res. Pap. Bus. Anal. **2017**, 30, 1–25.
29. Chen, X.w.; Jeong, J.C. Enhanced recursive feature elimination. In Proceedings of the Sixth International Conference on Machine Learning and Applications (ICMLA 2007), Cincinnati, OH, USA, 13–15 December 2007; pp. 429–435.
30. İbrahim Özdemir, H.; Çınar, C.; İsmail, O. Determination of hemodynamic and rheological properties in carotid artery diseases. Imaging Med. **2021**, 13, 1–8.
31. John, G.H.; Kohavi, R.; Pfleger, K. Irrelevant Features and the Subset Selection Problem. In Machine Learning Proceedings 1994; Cohen, W.W., Hirsh, H., Eds.; Morgan Kaufmann: San Francisco, CA, USA, 1994; pp. 121–129.
32. Saeys, Y.; Inza, I.; Larrañaga, P. A review of feature selection techniques in bioinformatics. Bioinformatics **2007**, 23, 2507–2517.
33. Harrell, F. Regression Modeling Strategies: With Applications to Linear Models, Logistic and Ordinal Regression, and Survival Analysis; Springer Series in Statistics; Springer International Publishing: Berlin/Heidelberg, Germany, 2015.
34. Akiba, T.; Sano, S.; Yanase, T.; Ohta, T.; Koyama, M. Optuna: A Next-generation Hyperparameter Optimization Framework. In Proceedings of the 25th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining, Anchorage, AK, USA, 4–8 August 2019; pp. 2623–2631.
35. Chen, T.; Guestrin, C. XGBoost: A Scalable Tree Boosting System. In Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, CA, USA, 13–17 August 2016; pp. 785–794.
36. Ke, G.; Meng, Q.; Finley, T.; Wang, T.; Chen, W.; Ma, W.; Ye, Q.; Liu, T.Y. Lightgbm: A highly efficient gradient boosting decision tree. Adv. Neural Inf. Process. Syst. **2017**, 30, 3146–3154.
37. Ho, T.K. Random decision forests. In Proceedings of the 3rd International Conference on Document Analysis and Recognition, Montreal, QC, Canada, 14–16 August 1995; Volume 1, pp. 278–282.
38. Breiman, L. Bagging predictors. Mach. Learn. **2004**, 24, 123–140.
39. Breiman, L. Pasting Small Votes for Classification in Large Databases and On-Line. Mach. Learn. **1999**, 36, 85–103.
40. Louppe, G.; Geurts, P. Ensembles on Random Patches. In Proceedings of the Machine Learning and Knowledge Discovery in Databases, Bristol, UK, 24–28 September 2012; pp. 346–361.
41. Laan, M.; Polley, E.; Hubbard, A. Super Learner. Stat. Appl. Genet. Mol. Biol. **2007**, 6, 25.
42. Nasarian, E.; Abdar, M.; Fahami, M.A.; Alizadehsani, R.; Hussain, S.; Basiri, M.E.; Zomorodi-Moghadam, M.; Zhou, X.; Pławiak, P.; Acharya, U.R.; et al. Association between work-related features and coronary artery disease: A heterogeneous hybrid feature selection integrated with balancing approach. Pattern Recognit. Lett. **2020**, 133, 33–40.

**Figure 2.** Diameters and angles of the carotid artery measured with the help of Sectra software. (**a**) Diameter, (**b**) 3D imaging of carotid arteries, and (**c**) artery lumen centers.

**Figure 3.** Set of 3D images of the carotid arteries obtained with AW Server software for patients: (**a**) normal, (**b**) stenosis, (**c**) aneurysm, and (**d**) dissection.

| Attribute | Minimum | Maximum | Mean | STD |
|---|---|---|---|---|
| Sex | 16 Female, 15 Male | | | |
| Age | 30.000 | 79.000 | 55.452 | 14.156 |
| $\cos{\theta}^{\prime}-\cos\theta$ | −0.456 | 0.165 | −0.102 | 0.153 |
| $\cos\theta$ (measured) | 0.674 | 0.994 | 0.902 | 0.081 |
| ICAdiaR (mm) | 4.400 | 8.000 | 5.703 | 0.812 |
| Car.Angle.L$\alpha$ | 12.000 | 58.000 | 27.200 | 12.639 |
| Car.Angle.$\mathrm{\Sigma}$R | 20.800 | 76.000 | 46.652 | 14.934 |
| Car.Angle.$\mathrm{\Sigma}$L | 26.000 | 105.400 | 50.965 | 19.198 |
| Car.Angle.$\theta$R | 6.300 | 54.100 | 23.052 | 11.427 |
| Car.Angle.$\theta$L | 4.700 | 49.000 | 23.765 | 12.480 |
| Car.Angle.(R+L)$\theta$ | 6.100 | 47.650 | 23.408 | 10.836 |
| CCAdiaR (mm) | 5.100 | 10.700 | 7.161 | 1.049 |

| Attribute | Minimum | Maximum | Mean | STD |
|---|---|---|---|---|
| Sex | 10 Female, 20 Male | | | |
| Age | 52.000 | 70.000 | 63.733 | 4.948 |
| $\cos{\theta}^{\prime}-\cos\theta$ | −0.458 | 0.093 | −0.177 | 0.140 |
| $\cos\theta$ (measured) | 0.592 | 0.987 | 0.919 | 0.079 |
| ICAdiaR (mm) | 3.500 | 10.500 | 5.707 | 1.484 |
| Car.Angle.L$\alpha$ | −67.300 | 45.000 | 16.767 | 20.079 |
| Car.Angle.$\mathrm{\Sigma}$R | 22.600 | 66.400 | 41.510 | 10.079 |
| Car.Angle.$\mathrm{\Sigma}$L | 10.700 | 83.900 | 42.243 | 16.088 |
| Car.Angle.$\theta$R | 3.000 | 36.600 | 17.440 | 7.680 |
| Car.Angle.$\theta$L | 4.000 | 78.000 | 25.477 | 15.115 |
| Car.Angle.(R+L)$\theta$ | 9.350 | 53.700 | 23.458 | 9.511 |
| CCAdiaR (mm) | 5.900 | 11.100 | 7.748 | 1.898 |

| Attribute | Minimum | Maximum | Mean | STD |
|---|---|---|---|---|
| Sex | 13 Female, 17 Male | | | |
| Age | 33.000 | 74.000 | 53.167 | 11.680 |
| $\cos{\theta}^{\prime}-\cos\theta$ | −0.391 | 0.764 | 0.101 | 0.258 |
| $\cos\theta$ (measured) | 0.479 | 0.990 | 0.845 | 0.119 |
| ICAdiaR (mm) | 3.600 | 9.400 | 6.693 | 1.371 |
| Car.Angle.L$\alpha$ | 0.000 | 46.900 | 21.083 | 11.115 |
| Car.Angle.$\mathrm{\Sigma}$R | 29.200 | 125.000 | 53.997 | 19.547 |
| Car.Angle.$\mathrm{\Sigma}$L | 24.100 | 102.600 | 52.317 | 17.026 |
| Car.Angle.$\theta$R | 4.300 | 67.100 | 28.763 | 13.267 |
| Car.Angle.$\theta$L | 8.100 | 63.800 | 31.233 | 14.587 |
| Car.Angle.(R+L)$\theta$ | 8.050 | 61.400 | 29.998 | 12.894 |
| CCAdiaR (mm) | 4.900 | 10.800 | 7.393 | 1.353 |

| Attribute | Minimum | Maximum | Mean | STD |
|---|---|---|---|---|
| Sex | 18 Female, 13 Male | | | |
| Age | 27.000 | 76.000 | 48.710 | 10.558 |
| $\cos{\theta}^{\prime}-\cos\theta$ | −0.429 | 0.256 | −0.817 | 0.177 |
| $\cos\theta$ (measured) | 0.684 | 0.976 | 0.884 | 0.082 |
| ICAdiaR (mm) | 3.300 | 6.700 | 4.877 | 0.978 |
| Car.Angle.L$\alpha$ | 1.000 | 61.200 | 22.094 | 12.735 |
| Car.Angle.$\mathrm{\Sigma}$R | 23.800 | 99.900 | 49.629 | 16.831 |
| Car.Angle.$\mathrm{\Sigma}$L | 25.200 | 93.800 | 51.545 | 16.909 |
| Car.Angle.$\theta$R | 7.200 | 40.800 | 23.516 | 8.299 |
| Car.Angle.$\theta$L | 7.400 | 55.900 | 29.452 | 12.408 |
| Car.Angle.(R+L)$\theta$ | 12.700 | 46.850 | 26.484 | 9.458 |
| CCAdiaR (mm) | 4.900 | 8.900 | 6.261 | 0.888 |

| Attribute | Minimum | Maximum | Mean | STD |
|---|---|---|---|---|
| Sex | 57 Female, 65 Male | | | |
| Age | 27.000 | 79.000 | 55.213 | 12.074 |
| $\cos{\theta}^{\prime}-\cos\theta$ | −0.458 | 0.764 | −0.065 | 0.211 |
| $\cos\theta$ (measured) | 0.479 | 0.994 | 0.888 | 0.095 |
| ICAdiaR (mm) | 3.300 | 10.500 | 5.738 | 1.340 |
| Car.Angle.L$\alpha$ | −67.300 | 61.200 | 21.833 | 14.831 |
| Car.Angle.$\mathrm{\Sigma}$R | 20.800 | 125.000 | 47.950 | 16.185 |
| Car.Angle.$\mathrm{\Sigma}$L | 10.700 | 105.400 | 49.300 | 17.618 |
| Car.Angle.$\theta$R | 3.000 | 67.100 | 23.194 | 11.032 |
| Car.Angle.$\theta$L | 4.000 | 78.000 | 27.467 | 13.841 |
| Car.Angle.(R+L)$\theta$ | 6.100 | 61.400 | 25.331 | 11.101 |
| CCAdiaR (mm) | 4.900 | 11.100 | 7.129 | 1.418 |

| Model | Accuracy |
|---|---|
| XGBoost | 0.81 |
| LightGBM | 0.86 |
| Random Forests | 0.81 |
| Bagging | 0.86 |
| AdaBoost | 0.81 |
| ExtraTrees | 0.71 |

| Classes | Precision | Recall | F Score |
|---|---|---|---|
| Normal | 1.0 | 1.0 | 1.0 |
| Stenosis | 0.83 | 0.83 | 0.83 |
| Aneurysm | 0.83 | 1.0 | 0.91 |
| Dissection | 1.0 | 0.80 | 0.89 |

| Attribute | Minimum | Maximum | Mean | STD |
|---|---|---|---|---|
| Sex | 480 Female, 520 Male | | | |
| Age | 27.000 | 79.000 | 52.329 | 15.433 |
| $\cos{\theta}^{\prime}-\cos\theta$ | −0.457 | 0.763 | 0.163 | 0.353 |
| $\cos\theta$ (measured) | 0.479 | 0.993 | 0.735 | 0.146 |
| ICAdiaR (mm) | 3.302 | 10.499 | 6.883 | 2.085 |
| Car.Angle.L$\alpha$ | −66.166 | 61.186 | −2.431 | 36.704 |
| Car.Angle.$\mathrm{\Sigma}$R | 20.800 | 124.981 | 73.300 | 30.115 |
| Car.Angle.$\mathrm{\Sigma}$L | 10.758 | 105.352 | 58.443 | 28.110 |
| Car.Angle.$\theta$R | 3.187 | 67.070 | 34.791 | 18.662 |
| Car.Angle.$\theta$L | 4.100 | 77.959 | 42.268 | 21.762 |
| Car.Angle.(R+L)$\theta$ | 6.249 | 61.368 | 32.789 | 15.736 |
| CCAdiaR (mm) | 4.901 | 11.093 | 8.051 | 1.769 |

| Model | Accuracy (Original Data) |
|---|---|
| XGBoost | 0.66 |
| LightGBM | 0.76 |
| Random Forests | 0.71 |
| Bagging | 0.57 |
| AdaBoost | 0.52 |
| ExtraTrees | 0.71 |

| Model | Accuracy (SMOTE Data) |
|---|---|
| XGBoost | 0.81 |
| LightGBM | 0.86 |
| Random Forests | 0.81 |
| Bagging | 0.86 |
| AdaBoost | 0.81 |
| ExtraTrees | 0.71 |

| Model | Accuracy (Original) | Accuracy (SMOTE) |
|---|---|---|
| XGBoost | 0.66 | 0.81 |
| LightGBM | 0.76 | 0.86 |
| Random Forests | 0.71 | 0.81 |
| Bagging | 0.57 | 0.86 |
| AdaBoost | 0.52 | 0.81 |
| ExtraTrees | 0.71 | 0.71 |

| Model | Accuracy (SMOTE Data + Optuna) |
|---|---|
| XGBoost | 0.89 |
| LightGBM | 0.91 |
| Random Forests | 0.88 |
| Bagging | 0.90 |
| AdaBoost | 0.87 |
| ExtraTrees | 0.79 |

| Model | Accuracy (Original Data + Optuna) | Accuracy (SMOTE Data + Optuna) |
|---|---|---|
| XGBoost | 0.81 | 0.89 |
| LightGBM | 0.86 | 0.91 |
| Random Forests | 0.81 | 0.88 |
| Bagging | 0.86 | 0.90 |
| AdaBoost | 0.81 | 0.87 |
| ExtraTrees | 0.71 | 0.79 |

| Classes | Precision | Recall | F Score |
|---|---|---|---|
| Normal | 1.0 | 1.0 | 1.0 |
| Stenosis | 0.88 | 0.86 | 0.87 |
| Aneurysm | 0.90 | 1.0 | 0.95 |
| Dissection | 1.0 | 0.85 | 0.92 |

© 2024 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).

## Share and Cite

**MDPI and ACS Style**

Özdemir, H.İ.; Atman, K.G.; Şirin, H.; Çalık, A.E.; Senturk, I.; Bilge, M.; Oran, İ.; Bilge, D.; Çınar, C.
Super Learner Algorithm for Carotid Artery Disease Diagnosis: A Machine Learning Approach Leveraging Craniocervical CT Angiography. *Tomography* **2024**, *10*, 1622-1644.
https://doi.org/10.3390/tomography10100120
