Machine Learning Quantitation of Cardiovascular and Cerebrovascular Disease: A Systematic Review of Clinical Applications

Research into machine learning (ML) for clinical vascular analysis, such as applications relevant to stroke and coronary artery disease, varies greatly between imaging modalities and vascular regions. Limited access to large, diverse patient imaging datasets and a lack of transparency in specific methods are obstacles to further development. This paper reviews the current status of quantitative vascular ML, identifying advantages and disadvantages common to all imaging modalities. Literature from the past 8 years was systematically collected from MEDLINE® and Scopus database searches in January 2021. Papers satisfying all search criteria, including a minimum of 50 patients, were analysed further and relevant data were extracted, for a total of 47 publications. Current ML image segmentation, disease risk prediction, and pathology quantitation methods have shown sensitivities and specificities over 70%, compared to expert manual analysis or invasive quantitation. Despite this, inconsistencies in methodology and the reporting of results have prevented inter-model comparison, impeding the identification of approaches with the greatest potential. The clinical potential of this technology has been well demonstrated in Computed Tomography of coronary artery disease, but remains practically limited in other modalities and body regions, particularly due to a lack of routine invasive reference measurements and patient datasets.


Introduction
As the first and second leading causes of global mortality, ischemic heart disease and stroke demonstrate the need for improved tools in the management of occlusive vascular disease [1]. In spite of the global incidence of both being on the decline, regional trends vary, and the total number of persons affected continues to rise due to population growth [2,3]. Patients with cardiovascular disease leading to stroke and myocardial infarction often require significant medical imaging in the acute, sub-acute, and chronic settings, using a range of imaging modalities. Vascular imaging is then used as a key source of information in the determination of appropriate clinical management, from a range of potential pharmacological and surgical approaches [4][5][6]. The large datasets obtained from this imaging are traditionally interpreted qualitatively by clinicians and are highly heterogeneous, varying due to differences in patient, imaging technology, and site scanning protocols.
Artificial Intelligence (AI) is a broad field usually characterised by two key commonalities: the design of machines to mimic human cognition, and the design of machines to complete a task whilst optimising the outcome [7]. The application of machine learning (ML) to medical imaging is an emergent field, attracting growing research investment [8]. Patients suspected of cardiovascular and neurovascular diseases undergo multiple medical imaging procedures, which generate large amounts of data used in conjunction with conventional medical datasets such as patient records. Whilst these records are meticulously maintained for each individual patient, pooling this data into multi-site collaborative databases will bolster the development of ML tools and automated ML analysis. Within medicine, ML has emerged as a prominent approach for automated diagnosis and image segmentation; for detailed background of ML theory and associated medical applications, readers are directed to [9,10].
Selecting the appropriate algorithm from a great many possibilities, each having inherent strengths and weaknesses, is a core component in the development of any ML technology. Discussion of algorithm selection, technical explanation of algorithm function, and explanation of supporting mathematics are beyond the scope of this work. ML, as the currently preferred approach for analysing medical imaging datasets, refers to algorithms used to build a model capable of identifying correlations between data features. These correlations are learned from "seen" input data, before being applied to previously unseen data to perform predictions.
This seen input data can be provided in two forms, labelled (supervised) and unlabelled (unsupervised), with each using different classes of algorithms for model development. Supervised machine learning uses input data labelled by a relevant domain expert, such as a medical specialist. These labels may take the form of semantic segmentation, labelling at a voxel level by contouring structures from an image, or of classification at an image level, with a single label classifying the image as a whole. These labels could identify the presence of pathology in an image, or sort images based on disease stage or sub-type [10]. Common supervised ML algorithms include Support Vector Machines (SVM), k-nearest neighbour, deep neural networks, and random forest [11]. In unsupervised machine learning algorithms such as fuzzy C-means, experts do not provide data labels, instead allowing the algorithm to determine classifications independently [9,10]. Although medical imaging data can be acquired using a variety of modalities or protocols and is notoriously heterogeneous, datasets can be considered as numerical arrays with two or more dimensions. Each pixel or voxel then represents a level of grey between 0 and the maximum value permitted by the bit depth of the image. These matrices of grey level values are the input data which ML algorithms use for their model development.
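The representation described above can be illustrated with a minimal Python sketch. The 2 x 3 image, its 12-bit depth, the label values, and the function names are all hypothetical and chosen only for illustration; they do not come from any of the reviewed studies.

```python
def max_grey_level(bit_depth):
    """Largest grey level an unsigned image of the given bit depth can store."""
    return 2 ** bit_depth - 1


def normalise(image, bit_depth):
    """Scale a 2D list of integer grey levels to floats in [0, 1]."""
    top = max_grey_level(bit_depth)
    return [[value / top for value in row] for row in image]


# Hypothetical 2 x 3 image stored with 12 bits per pixel (values 0..4095).
image = [[0, 1024, 4095],
         [2048, 512, 4095]]

# Supervised learning, image-level label: one class for the whole image.
image_label = "pathology present"

# Supervised learning, semantic segmentation: one label per pixel
# (1 = vessel, 0 = background), same shape as the image.
pixel_labels = [[0, 1, 1],
                [0, 0, 1]]

# Unsupervised learning would receive only the grey levels, with no labels.
scaled = normalise(image, bit_depth=12)
```

Scaling by the bit depth is one common preprocessing choice; it is shown here only to make the link between bit depth and grey level range concrete.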
The appearance, volume, and variability of data made available to researchers, as well as the clinical condition being investigated, all determine which machine learning algorithm and imaging modality are employed. Invasive Coronary Angiography (ICA), Computed Tomography Angiography (CTA), two-dimensional ultrasound (US), Intravascular Ultrasound (IVUS), Magnetic Resonance Imaging (MRI), Optical Coherence Tomography (OCT), and Nuclear Medicine (NM) each play a role in diagnostic and therapeutic vascular imaging, in the process generating datasets suitable for ML analysis.
Imaging of vasculature following myocardial infarction or stroke is routinely performed using a range of imaging techniques, as each provides unique information. The currently published research in clinically useful vascular ML demonstrates different stages of development, varying with both vascular region and imaging modality. For example, US and IVUS ML research has a wealth of carotid publications, but a comparative dearth of coronary papers. Conversely, coronary CTA products have been commercialised for routine clinical use, whilst preliminary studies into non-coronary CTA products are lacking or absent. Both cardiovascular and neurovascular diseases have large disease burdens, especially in developed countries [26], but current ML research is not proportionate, and coronary applications are far more advanced. This is due to many factors, including coronary disease being the global leading cause of death and recent advances in the hardware and software used for coronary CTA, improving both image quality and acquisition speed. These techniques have the potential to assist in clinical management and reduce disease burdens, but the highly heterogeneous state of current literature makes the identification of future research areas difficult. Review, synthesis, and critical analysis of ML approaches from a wide range of modalities and vascular locations is needed to identify commonalities and gaps in research. With this, researchers are better placed to ensure future work focuses on relevant clinical needs and translates more effectively into clinical practice, improving patient care.
This work reviews the current status of combined knowledge for ML based quantitation of both coronary and neurological vascular disease. The current effectiveness of ML to provide quantitative descriptors of vessel disease will be examined, as well as the impact of this quantitation on clinical decision making and patient outcomes, for a wide range of medical imaging modalities.

Materials and Methods
A systematic review was conducted in accordance with guidance included in the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [27].

Data Sources
A computerised literature search of Ovid MEDLINE and Elsevier SCOPUS was performed to find full text, original articles published in the eight years prior to 29 January 2021 in medicine, engineering, or computing, which investigated ML analysis of human vascular imaging. The search terms were combined as follows: (((ALL ("Machine Learning" OR "Neural Network" OR "Support Vector" OR "Random Forest" OR "Bayesian Network" OR "Nearest Neighbor")) AND (ALL (plaque OR calcification OR ulceration OR stenosis))) AND (ALL (quantif* OR quantitati*))) AND NOT (ALL (spectroscop* OR alzheimer OR "tau" OR "amyloid" OR "lung" OR "multiple sclerosis")).

Data Extraction and Quality Assessment
Studies satisfying these criteria underwent assessment of title and abstract by two authors (C.B and E.B), with equivocal papers additionally reviewed by a third (G.B). Titles and abstracts indicating quantitative machine learning analysis of human vascular imaging were considered pertinent, and these studies continued to full text review. Using a standardised data extraction form, relevant information was collected, including (a) participant characteristics (patient numbers, patient age, age range, and sex), (b) data characteristics (number of images, imaging modality, and disease site), (c) AI characteristics (algorithm, algorithm parameters, quantitation results, and model validation/cross-validation), and (d) per patient outcomes (gold standard, sensitivity/specificity/accuracy, and Area Under the Curve (AUC)).
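For readers less familiar with the per patient outcome measures extracted above, the following Python sketch shows how sensitivity, specificity, accuracy, and AUC can be computed from reference labels and model outputs. The function names and the toy data are hypothetical; the AUC is computed here by the pairwise ranking (Mann-Whitney) formulation, which is one of several equivalent ways of obtaining it.

```python
def per_patient_metrics(truth, predicted):
    """Sensitivity, specificity and accuracy from binary labels (1 = disease)."""
    pairs = list(zip(truth, predicted))
    tp = sum(1 for t, p in pairs if t and p)          # true positives
    tn = sum(1 for t, p in pairs if not t and not p)  # true negatives
    fp = sum(1 for t, p in pairs if not t and p)      # false positives
    fn = sum(1 for t, p in pairs if t and not p)      # false negatives
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }


def auc(truth, scores):
    """Probability that a diseased patient scores higher than a healthy one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(truth, scores) if t]
    neg = [s for t, s in zip(truth, scores) if not t]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical gold standard and model decisions for ten patients.
truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
metrics = per_patient_metrics(truth, predicted)

# Hypothetical continuous model outputs for four patients.
example_auc = auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

Sensitivity and specificity depend on the decision threshold applied to the model output, whereas AUC summarises performance across all thresholds, which is one reason the two are often reported together.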

Literature Search
The detailed literature selection process is shown in Figure 1. The systematic search identified 1098 articles, with most records deemed outside the area of interest based on title and abstract review. Although several studies quoted large datasets, some were referencing total number of pixels [28], Region-Of-Interest (ROI) subsets extracted from small image numbers [29], or number of images extracted from a small number of patients [30][31][32][33]. To manage any potential bias and avoid inclusion of overfit or low variability datasets, articles where total unique patient numbers were less than or equal to 50 were also excluded. Seven articles failed to identify the number of patient datasets used and were also excluded. The reference lists of the remaining 42 papers were then pearled, and an additional 5 papers were included, for a total of 47.

As previously mentioned, the appropriate ML techniques depend on both the type and quantity of available data, as well as the ML researchers' experience, skill, and currency of knowledge. Experienced ML researchers consider these and other practical factors in determining the most suitable algorithm, although algorithm choice is not unique, and several algorithms may be suitable for one data type, or multiple data types may be analysed successfully with one algorithm. The literature identified was sorted by imaging modality and algorithm selection, before extracting and tabulating data relevant for the development of an integrative overview of automated ML vascular quantitation. Vascular pathology quantification was most often performed using CTA (27) and ultrasound (6) imaging, as well as several publications using other modalities (9). All but one IVUS study [34] was excluded due to insufficient patient numbers [28][35][36][37][38][39][40][41], highlighting the need for more large-scale research in this modality. A few studies attempting vascular quantitation from Nuclear Magnetic Resonance (NMR) [42] and Nuclear Medicine (NM) [43][44][45] imaging were also included, as well as one publication using ICA [46].

Computed Tomography Angiography
CT for vascular imaging has increased in utility due to consistent improvement in CT technology and is now the most prevalent imaging modality for quantitative ML analysis, spurred on by recent research into the low diagnostic yield of ICA [47]. Technological advancements, including increased tube rotation speeds that reduce motion blur, smaller detector elements that increase spatial resolution, and the clinical implementation of advanced filtration and reconstruction techniques, have all improved visualisation of vasculature. A summary of CTA quantitative analysis is shown in Table 1. ML vascular analysis from CTA imaging is a well-established field [48,49], with assessment of ML and CTA-based calculation of Fractional Flow Reserve (FFR) published by multiple sites [50][51][52][53][54][55][56][57] and now explored by several comprehensive reviews [58][59][60]. FFR is defined as the ratio of maximal blood flow distal to a coronary vascular lesion to the maximal flow expected in the absence of the lesion, measured in practice as the ratio of distal to proximal pressure under pharmacologically maximised coronary blood flow [61]. Han et al. [62] performed CT-based FFR quantification using resting perfusion CT, instead of the more widely used CTA, obtaining sensitivity, specificity, and accuracy values shown in Table 2. Siemens® Syngo cFFR (Erlangen, Germany) software was widely used across the literature, in various development stages, for the quantitation of FFR, and results are summarised in Table 3. This technology, based on coronary deep neural network research published previously [63], was used by both [53] and Yu et al. [57] to quantify FFR as part of a wider evaluation of lesion specific ischemia and functional significance. Both authors found a limited but measurable benefit of including CT-based FFR in plaque analysis. The resulting Area Under the Curve (AUC) values of the Receiver Operating Characteristic (ROC) measured by Coenen et al. [64], Hu et al. [51], and Yu et al. [54] for CT-based FFR were all similar to those quoted above, with all using invasive FFR measurements as a benchmark.
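For reference, the invasive FFR benchmark used by these studies is conventionally written as a pressure ratio. The expression below is the standard textbook form rather than a formula taken from any of the reviewed papers, and the symbols are assumptions introduced here: $Q$ denotes maximal (hyperaemic) flow, $P_d$ the mean pressure distal to the lesion, and $P_a$ the mean proximal (aortic) pressure.

```latex
\mathrm{FFR}
  = \frac{Q_{\text{max}}^{\text{stenosed}}}{Q_{\text{max}}^{\text{normal}}}
  \approx \frac{P_d}{P_a}
  \qquad \text{(measured during maximal hyperaemia)}
```

A vessel without a flow-limiting lesion therefore has an FFR close to 1, and lower values indicate a greater pressure loss across the lesion.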
FFR clinical significance was typically based on a threshold of 0.8, and equivocality around this value (termed the 'grey zone' [57]) was observed for all measurement techniques, although to varying degrees [54]. Various other aspects of vascular health were quantitatively analysed using machine learning, based on CT imaging datasets. In these instances, hold-out [65][66][67][68] and k-fold cross-validation [50,65,67], two ML validation techniques, were employed to evaluate and tune algorithm performance, with both used in conjunction where possible to minimise bias. Quantification of individual plaque risk and Major Adverse Cardiovascular Events (MACE) risk stratification was examined in several papers, both with [53,69] and without [65,67,70] CTA-derived FFR.
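The two validation techniques named above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the procedure of any reviewed study: the 20% hold-out fraction, the choice of five folds, the 50 hypothetical patient identifiers, and the function names are all invented for the example. Splitting is done per patient so that images from one patient never appear on both sides of a split.

```python
import random


def hold_out_split(patient_ids, test_fraction=0.2, seed=0):
    """Set aside a fixed test group that is never used during development."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)
    n_test = max(1, round(len(ids) * test_fraction))
    return ids[n_test:], ids[:n_test]  # (development, held_out)


def k_fold_splits(patient_ids, k=5):
    """Yield (training, validation) lists; each patient is validated once."""
    ids = list(patient_ids)
    folds = [ids[i::k] for i in range(k)]
    for i in range(k):
        training = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield training, folds[i]


# Hypothetical cohort of 50 patients.
patients = [f"patient_{i:03d}" for i in range(50)]

# Used in conjunction: cross-validate on the development set for model
# selection, then report final performance once on the held-out set.
development, held_out = hold_out_split(patients)
splits = list(k_fold_splits(development, k=5))
```

Using both together keeps the held-out patients entirely independent of any tuning decisions made during cross-validation.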
Specific data properties used in ML analysis, often termed 'features', contribute unequally to the overall model performance, and the inclusion of features should be done judiciously [71]. In vascular analysis, features such as remodelling index, plaque length, mean lumen diameter, mean luminal area, and napkin ring sign showed some utility as discriminating features, whilst others such as calcified plaque burden and spotty calcification had near zero information gain, meaning they were unlikely to provide any predictive power to model performance [50,53,69,70,72]. Several equivocal image features were identified as both strongly and weakly predictive by different publications, and these included total plaque volume, non-calcified plaque volume, and Agatston score (a score weighting the area of each coronary artery calcification by the maximal Hounsfield Unit (HU) value observed within it on CT imaging) [50,53,69,72]. This equivocality suggests feature selection should be performed systematically for each research work individually, due to variabilities in available data and project outcomes. Generally, ranking of features by information gain was consistent for CTA imaging, with quantitative image parameters contributing more to model function than qualitative imaging parameters or clinical/laboratory values [71]. The automated and semi-automated determination of Agatston score has also been investigated separately [68][73][74][75][76][77], although some publications utilise thresholding and individual/direct image analysis approaches, rather than ML.
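Information gain, the ranking criterion referred to above, is the reduction in outcome entropy obtained by splitting patients on a feature. The Python sketch below computes it for binary features. The eight-patient toy data are fabricated purely to show one informative feature and one feature with zero gain; the feature names echo the text, but the values are not results from the reviewed studies.

```python
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(feature_values, labels):
    """Entropy of the labels minus the weighted entropy after the split."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical toy cohort: 1 = adverse event, 0 = no event.
outcome = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical feature that mostly tracks the outcome.
napkin_ring_sign = [1, 1, 1, 0, 0, 0, 0, 0]
# Hypothetical feature distributed identically in both outcome groups.
spotty_calcification = [1, 0, 1, 0, 1, 0, 1, 0]

gain_informative = information_gain(napkin_ring_sign, outcome)
gain_uninformative = information_gain(spotty_calcification, outcome)
```

A feature whose values are distributed identically across outcome groups leaves the entropy unchanged, which is what a near zero information gain indicates.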
Although useful, plaque quantitation is not the only informative quantitative metric in cardiovascular risk management. Motwani et al. [78], with access to one of the largest CTA imaging datasets in current vascular ML research, compared existing coronary risk stratification quantifiers such as Agatston score and Framingham risk score with an ML LogitBoost model for quantification of five-year All-Cause Mortality (ACM). Analysis of 44 imaging features and 25 clinical features found a statistically significant increase in the AUC of all-cause mortality risk, compared to Framingham risk score alone (0.79 v 0.61), although no validation cohort was investigated. Similar ML performance was observed in the ACM risk quantification of an 86,155-strong cohort of Korean patients [79], with AUC values for ML and Agatston score mortality risk of 0.82 and 0.70, respectively. Despite the large dataset, performance in a validation cohort was not found to be statistically significant (AUC: 0.78 v 0.62), possibly suggestive of model overfitting in spite of large sample sizes.
Zreik et al. [80] analysed clinical plaque significance using a convolutional neural network developed in-house, investigating only left ventricular myocardium CT images, with clusters of pixels belonging to myocardium identified by fast K-means clustering. This method contrasts with the direct vessel imaging approach employed widely throughout the literature across all modalities. The model development utilised 50% random dropout and 10-fold cross-validation to minimise overfitting, and plaque significance was benchmarked against invasive FFR. Multiple FFR thresholds were investigated for their impact on the determination of individual lesion clinical significance, ranging from 0.72 to 0.8. Technical parameters were similarly varied, using multiple fast K-means cluster values (1-1000) for division of the myocardium and several convolutional auto-encoders. The mean AUC across the 50 cross-validations was 0.74 ± 0.02, with the majority of algorithm and clinical setting combinations returning AUC values between 0.65 and 0.75. In a continuation of this work, van Hamersvelt et al. [81] applied the analysis method developed previously to intermediate stenosis (defined as 25-69% stenosis as assessed by invasive coronary angiography), noting improved sensitivity and AUC with a slight decrease in specificity.
ML-based coronary vessel FFR quantitation using CTA datasets is rapidly becoming established as a clinical methodology, although as with all clinical practices, critical evaluation remains ongoing. Two elements of FFR quantitation subject to this critical evaluation were the significance of X-ray tube peak kilovoltage (kVp) dependencies, examined by De Geer et al. [82], and the role of partial volume effects, examined by Freiman et al. [83]. De Geer et al. [82] found no significant difference in FFR quantitation between 100 and 120 kVp, with slightly better agreement between ML and invasive FFR quantitation at 100 kVp. The large pixels of CT imaging, in comparison to ultrasound and fluoroscopy, can result in partial volume effects, in which the assigned pixel value represents a mean of the values from multiple structures contained within the pixel. Consideration of the role these partial volume effects play in FFR was found to improve both the specificity (0.51 to 0.73) and AUC (0.76 to 0.8) of FFR ML quantitation as compared with angiographically determined values.
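The partial volume effect described above can be made concrete with a short Python sketch, in which a pixel value is modelled as the volume-weighted mean of the structures it contains. The function name, the two-tissue pixel, and the HU values are hypothetical and chosen only for illustration; this simple linear mixing model is an assumption and is not the correction method of Freiman et al. [83].

```python
def partial_volume_value(components):
    """Volume-weighted mean of (fraction, value) pairs within one pixel."""
    total = sum(fraction for fraction, _ in components)
    if abs(total - 1.0) > 1e-9:
        raise ValueError("volume fractions must sum to 1")
    return sum(fraction * value for fraction, value in components)


# Hypothetical pixel straddling the vessel wall: 40% contrast-filled lumen
# at 400 HU and 60% surrounding soft tissue at 50 HU.
mixed_pixel = partial_volume_value([(0.4, 400.0), (0.6, 50.0)])
```

The resulting value matches neither structure, so a fixed intensity threshold applied to such a pixel can misplace the lumen boundary and bias any downstream lumen measurement.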
Work in quantitative machine learning cerebrovascular analysis was extremely limited. No clear practical or technical reason could be found to sufficiently explain why research in this area is so sparse, particularly given the comparable examination throughput and vessel lumen diameter of coronary vessels and many cerebral vessels of clinical significance. The absence of a routinely performed invasive quantitation method for comparison and the absence of collaborative multi-centre databases appear to be two possible reasons, with just two groups attempting to investigate this area [66,84]. The work of Park et al. [66] investigated aneurysm detection by expert operators from CT imaging, with and without the assistance of a segmentation deep neural network. An incremental but statistically significant increase in reader sensitivity, accuracy, and agreement was found. Deep learning approaches to neurological CTA were noted by the authors as absent from the literature prior to their investigation, suggesting future development may provide clinically useful results warranting further research. A preliminary pilot study by Acharya et al. [84] into carotid lumen segmentation and pathology quantification from CT data produced sensitivities, specificities, and accuracies of 0.88, 0.865, and 0.902, respectively, using an SVM classifier with radial basis functions. Statistically significant high-level features (p < 0.01) were identified for the differentiation of symptomatic and asymptomatic plaques, in particular higher energy and lower entropy in symptomatic images due to increased texture complexity. The generalisation of these results was limited, however, as only 20 consecutively sampled patients were used.

Ultrasound
The excellent soft tissue resolution and sub-millimetre spatial resolution of ultrasound are well suited to imaging of vascular detail, and readily applicable to the large, superficial, and accessible common carotid artery bifurcation. As shown in Table 4, and in contrast with CTA investigations, cardiac studies were comparatively few, with external ultrasound ill-suited to coronary vasculature investigation. Study sample sizes often included fewer than 50 patients and were too small for generalisation, particularly for highly invasive IVUS studies [28,36,39,40], although recently published works have attempted to address this [34]. Research examining segmentation of vascular anatomy and pathology using ultrasound is extensive, with publications reaching back as far as 2000 [85,86], although advancements enabling automated quantification are comparatively few. The development of ML tools capable of identifying at-risk asymptomatic carotid disease and providing decision support is the primary focus of current ultrasound research. Bae et al. [34] developed and compared the performance of multiple ML methods for identification of vulnerable coronary plaques from IVUS investigations, as defined by the presence of a fibrous thin cap atheroma. Results were compared with OCT, the highest resolution vascular imaging currently clinically available. The implementation of tools such as these would support the management of neurovascular and cardiovascular diseases, although to date ultrasound is not widely employed for initial diagnosis in acute stroke [87][88][89].
As expected from the acquisition technology employed, US examinations offered an alternative set of image parameters compared to CTA, with some having utility as machine learning features. The most consistent of these features was the measurement of carotid Intima Media Thickness (IMT).
IMT is below the spatial resolution of CTA and cannot be visualised with that modality, but has been correlated with increased vascular risk [90]. This makes US a uniquely practical tool for measurement of IMT. Segmentation of IMT has been performed using several US image analysis techniques such as the Hough transform [91], frequency domain analysis [92], and pixel intensity analysis, the latter using a deep learning single-layer feed-forward neural network. This method achieved a sensitivity and specificity greater than 97% (shown in Table 5) when compared to expert manual segmentation [33]. Another analysis utilising image features unique to ultrasound was the evaluation by Huang et al. [93] of the discrete Fréchet distances between greyscale cumulative distribution functions and idealised functions. This greyscale distribution analysis for individual plaques, in combination with a k-nearest neighbour classification system, sorted plaques into echo-rich, intermediate, and echo-lucent, with the association between echo type and plaque vulnerability established elsewhere [94,95].
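The discrete Fréchet distance between two greyscale cumulative distribution functions can be sketched as follows. This is a generic dynamic programming implementation of the discrete Fréchet distance, not the code of Huang et al. [93]; the function names, the normalisation of the grey level axis to [0, 1], and the test curves are assumptions made for illustration.

```python
def grey_level_cdf(pixels, levels=256):
    """Cumulative distribution of grey levels as a list of (x, y) points.

    x is the grey level scaled to [0, 1]; y is the fraction of pixels at or
    below that level. Scaling x is an assumption made for this sketch.
    """
    counts = [0] * levels
    for value in pixels:
        counts[value] += 1
    cdf, running = [], 0
    for level, count in enumerate(counts):
        running += count
        cdf.append((level / (levels - 1), running / len(pixels)))
    return cdf


def discrete_frechet(p, q):
    """Discrete Frechet distance between two polylines of (x, y) points."""
    def dist(a, b):
        return ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5

    n, m = len(p), len(q)
    # table[i][j] holds the distance for the prefixes p[:i+1] and q[:j+1].
    table = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(p[i], q[j])
            if i == 0 and j == 0:
                table[i][j] = d
            elif i == 0:
                table[i][j] = max(table[0][j - 1], d)
            elif j == 0:
                table[i][j] = max(table[i - 1][0], d)
            else:
                best_previous = min(table[i - 1][j],
                                    table[i - 1][j - 1],
                                    table[i][j - 1])
                table[i][j] = max(best_previous, d)
    return table[n - 1][m - 1]


# Two parallel horizontal curves one unit apart (hypothetical test curves).
lower = [(0, 0), (1, 0), (2, 0)]
upper = [(0, 1), (1, 1), (2, 1)]
separation = discrete_frechet(lower, upper)

# CDF of a hypothetical four-pixel plaque region with four grey levels.
plaque_cdf = grey_level_cdf([0, 0, 3, 3], levels=4)
```

In a classification setting, the distances from a plaque's curve to each idealised curve could then serve as the feature vector passed to a k-nearest neighbour classifier.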
The correlation of echogenicity and carotid plaque vulnerability was also utilised by Pedro et al. [96], Roy-Cardinal et al. [97] and Golemati et al. [98], although the first of these used a simplistic ROC cut-off analysis to produce a semi-quantitative plaque vulnerability indicator. Echogenicity was used in conjunction with both general greyscale image features such as Rayleigh parameters or grey level co-occurrence matrix (GLCM) decomposition and features more specific to ultrasound imaging such as homodyned-K parametric mapping, wavelet energy decomposition, or elastography. The two other papers, however, provided comprehensive examination of both symptomatic and asymptomatic carotid stenosis, detailing machine learning techniques, patient descriptors, and clinical outcomes. Using a range of extracted image parameters, US elastography, and plaque motion synchronisation, the composition and clinical significance of symptomatic and asymptomatic plaques were analysed with a random forest classifier. Roy-Cardinal et al. [97] also compared US plaque composition to that determined by MRI, as well as comparing patient symptomology to experimental predictions, keeping results focused on improvement of clinical patient outcomes, the ultimate target of any ML model.

Other Imaging Modalities
Whilst US and CTA are the dominant imaging modalities, several other methods exist which are useful in the diagnosis and management of vascular disease, including MRI, NM, OCT, and ICA. Application of these modalities showed the greatest variability with ML approaches used on both direct image analysis and routinely obtained clinical descriptors. Image analysis research from these modalities used markedly smaller sample sizes, especially when compared to US and CT patient databases described above. Some reasons may include increased invasiveness and fewer centres to facilitate multi-institutional research or cost, although the exact reason remains unclear. ICA showed extensive use as a gold standard for all other imaging modalities, but ML image analysis of ICA was limited. Table 6 summarises ML vascular quantitation results performed using modalities other than CT and ultrasound.

Magnetic Resonance Imaging
Only two identified MRI publications satisfied all selection criteria. Waddle et al. [99] applied an SVM model with a fitcsvm function and radial basis function kernel to MRI, Magnetic Resonance Angiography (MRA), and functional MRI (fMRI) data of moyamoya patients. The hemispheric blood flow of patients was compared with healthy controls, and ML classification of hemispheres was performed with a resulting sensitivity, specificity, and AUC of 0.7, 0.83, and 0.71, respectively. The authors commented that ML techniques are notably underutilised in vascular imaging, despite finding their results "collectively offer increased support that both anatomical and functional hemodynamic imaging can serve as important machine learning inputs" [99].
Wu et al. [100] developed a deep convolutional neural network to investigate MRA of carotid vessels, applied to blood signal suppressed, or "black blood" images. Patient imaging was supplied from two previously collected research datasets [101,102] with carotid segmentation performed and quantitatively compared to expert manual segmentation. In addition to analysis of individual patient slices, data from the images immediately before and after each slice were considered, producing a "2.5-dimension" dataset. Carotid lesion types, as defined by the American Heart Association [103], were determined from segmented vessels and identified as atherosclerotic or non-atherosclerotic, with maximum ML accuracy and AUC of 0.89 and 0.95, respectively, in relation to expert decisions.
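The "2.5-dimension" input described above can be sketched in Python as a stack of each slice with its immediate neighbours. The function name, the tiny 2 x 2 slices, and in particular the handling of the first and last slices (repeating the edge slice) are assumptions made for illustration; the source does not state how Wu et al. [100] treated the volume boundaries.

```python
def stack_2_5d(volume, index):
    """Return [previous, current, next] slices for one position in a volume.

    `volume` is a list of 2D slices. At the first and last positions the edge
    slice is repeated, which is an assumption made for this sketch.
    """
    last = len(volume) - 1
    neighbours = (max(index - 1, 0), index, min(index + 1, last))
    return [volume[i] for i in neighbours]


# Hypothetical volume of four 2 x 2 slices; each slice is filled with its
# own index so the stacking order is easy to read.
volume = [[[i, i], [i, i]] for i in range(4)]

middle_input = stack_2_5d(volume, 2)  # slices 1, 2, 3
first_input = stack_2_5d(volume, 0)   # slices 0, 0, 1
```

Each three-slice stack can then be passed to a two-dimensional network as a three-channel image, giving some through-plane context without the memory cost of a fully three-dimensional model.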
Four other publications utilising magnetic resonance techniques for quantitative vasculature analysis were identified, which although not satisfying inclusion criteria, warrant further discussion. These publications show future potential of MRI both for direct analysis of vessels, and the possible application of ML to information unique to MRI such as relaxation times or spectroscopic analysis.
The relationship between carotid vessel image parameters and stroke risk was investigated by Van Den Bouwhuijsen et al. [104] using logistic regression, from a large, pre-existing patient database [105]. Despite the simplicity of the approach, this method supports the importance of large datasets, associating stroke risk with intraplaque haemorrhage, carotid wall thickness, and calcification. Automated segmentation of carotid artery plaque directly from contrast enhanced MRI has also shown promise, with a recent study [106] using 35 patients to obtain automated segmentations with a Dice score and true-positive rate of 0.89 and 0.93, respectively, as compared to manual analysis.
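The two segmentation agreement measures quoted above can be computed as in the Python sketch below, assuming that "true-positive" refers to the true-positive rate (the fraction of manually segmented pixels also found by the automated method). The function names and the 2 x 3 masks are hypothetical and are not data from the cited study [106].

```python
def flatten(mask):
    """Flatten a 2D binary mask (list of rows) into a single list."""
    return [value for row in mask for value in row]


def dice_score(auto_mask, manual_mask):
    """Dice coefficient: 2 * |A and M| / (|A| + |M|)."""
    auto, manual = flatten(auto_mask), flatten(manual_mask)
    overlap = sum(1 for a, m in zip(auto, manual) if a and m)
    size = sum(1 for a in auto if a) + sum(1 for m in manual if m)
    return 2 * overlap / size if size else 1.0  # two empty masks agree


def true_positive_rate(auto_mask, manual_mask):
    """Fraction of manually labelled pixels recovered automatically.

    Assumes the manual mask contains at least one labelled pixel.
    """
    auto, manual = flatten(auto_mask), flatten(manual_mask)
    overlap = sum(1 for a, m in zip(auto, manual) if a and m)
    return overlap / sum(1 for m in manual if m)


# Hypothetical automated and manual plaque masks (1 = plaque).
auto_mask = [[0, 1, 1],
             [0, 1, 0]]
manual_mask = [[0, 1, 1],
               [0, 0, 1]]

dice = dice_score(auto_mask, manual_mask)
tpr = true_positive_rate(auto_mask, manual_mask)
```

Dice penalises both missed and spurious pixels, whereas the true-positive rate ignores over-segmentation, so the two are usefully reported together.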
Van Hespen et al. [107] used sub-millimetre isotropic voxels to image ex vivo circle of Willis specimens and train a convolutional neural network to measure wall thicknesses for intracranial aneurysm. Although the results were promising, with less than 0.1 mm error in intracranial vessel wall estimation, a three-patient validation sample and long acquisitions in a high field MRI (7T) place limits on any immediate clinical translation.
Forssen et al. [42] demonstrated the breadth of MRI to obtain clinically useful information when combined with machine learning, utilising supervised ML to quantify 256 metabolites associated with coronary artery disease, through Nuclear Magnetic Resonance spectroscopy.

Nuclear Medicine
The large difference in spatial resolution between CTA, US, and IVUS discussed previously is similar to that seen when comparing CTA with many common NM procedures. Instead of imaging vessels directly, Nakajima et al. [43][44][45] used approximately 2000 Technetium-99m Myocardial Perfusion Imaging (MPI) studies from Swedish and Japanese datasets to train an artificial neural network in the detection of perfusion defects and ischemia. Each model was tested on an in-house cohort of 106 patients and compared to both expert readers and a >50% stenosis gold standard, with version 1.1 of the model further validated on 364 patients. As shown in Table 7, improvement was observed in the upgraded 1.1 version of the ML tool, with sensitivities and specificities in excess of 87% for all patients, peaking at 88 and 100%, respectively, in patients with no history of prior infarction or coronary revascularization.
Quantitative and semi-quantitative descriptors routinely obtained during cardiac Positron Emission Tomography (PET) were also analysed using six ML algorithms and linear regression. Obstructive coronary disease status was determined from a sample of 88 patients, with performance statistics approaching 0.9 and SVM performing best.

Discussion
Across many imaging modalities and organ vasculatures, clinically useful information can be gained by quantitation of atherosclerotic disease and associated infarction. The task of segmenting and quantifying vascular pathology is currently repetitive and laborious, and requires specialist expertise. This combination makes the task well suited to automation by machine learning.
The ultimate focus of any research into ML vascular quantitation, across all modalities and vascular territories, must remain a meaningful and positive impact on patient outcomes. Although ML analysis of coronary CTA imaging has progressed furthest towards broad clinical use, accuracies in the range of 70-80% show that research remains to be done. Researchers should nonetheless be buoyed by these results, as they demonstrate that the clinical potential of quantitative vascular ML is steadily being realised.

Limitations and Future Work
The works reviewed in this paper achieved performance comparable to current methods and, in some cases, demonstrated commercialised ML vascular analysis in the clinic. Despite the success of these methods, several hurdles remain before ML vascular image quantitation is ready to be applied to patient care in some contexts.

Common Machine Learning Limitations
Limited data availability and a lack of code accessibility (the black box reputation of ML) are limitations seen in many machine learning applications, within medicine and beyond. Although not unique to quantitative vascular machine learning, both are nonetheless important considerations in any future research.
Data management is the foundation from which all ML research is undertaken, and the comparative infancy of most current vascular ML quantitation research provides the ideal opportunity to establish standardised data sharing and result reporting approaches, which will support the development of useful clinical technology into the future. Construction of vascular imaging benchmark datasets has begun to address the problems of transparency and data availability simultaneously. Unfortunately, current datasets are not truly publicly available (instead being limited to ethically approved research trials with data available to a select group of researchers), and participant numbers are in most cases insufficient for generalisation to the diversity of patients seen in many hospitals. These small datasets offer developers a starting point for useful evaluation and comparison of privately developed models, on the understanding that the same dataset is not also used for model training. Some identified datasets include the MACHINE consortium [64], CONFIRM registry [108], and PARADIGM [109] CTA datasets, as well as the annual challenge databases of the Medical Image Computing and Computer Assisted Intervention (MICCAI) society from all modalities, all of which have been used multiple times throughout the literature [67,73,79,83,110,111]. The handling of patient imaging for use in medical machine learning is a complex issue being contended with around the world [112]. Only once a broad framework of requisite protections is in place to allow ethical data handling can large databases be constructed to allow robust model development.
Many algorithms produced by current research have been developed and evaluated using patient data that is not independently accessible and which cannot be externally validated for reproducibility. Once these models are demonstrated by researchers to provide benefit, some are then packaged into commercial offerings with little or no public detail on further changes or advances [63]. Without an external or open access reference database, transparent and useful benchmarking of the performance of comparable models is impossible. This issue of data availability was discussed in both Coenen et al. [64] and Cho et al. [46], in which the authors stated that patient data would not be provided for the purpose of independent result validation, with no explicit justification given. Failure by ML developers to make data or model design specifics available perpetuates the black box stereotype of these tools and adversely impacts clinician confidence when considering whether to use potentially beneficial tools. Gao et al. [113] attempted to address this by providing detailed algorithm processes in a summarised step-by-step fashion that, although not equivalent to open access, is an interesting intermediary step towards improving reproducibility. Waddle et al. [99] was the only publication identified in this review which actively supported reproducibility, explicitly stating a provision for de-identified data to be made available on request.
For CTA studies, and in particular for the assessment of FFR, the commercial availability of deep learning platforms, such as cFFR v1.0-3.0 (Siemens Healthineers, Erlangen, Germany), CAAS vFFR (Pie-Medical, Maastricht, The Netherlands), and HeartFlow® FFR-CT (HeartFlow, Redwood City, CA, USA), limited the provision of algorithm details due to commercial and intellectual property interests.

Reference Standards
The standardisation of cardiac analysis from CT imaging by the American Heart Association [114] has provided consistent segmentation nomenclature in this region. Similarly, the definition of FFR methodology [61] allowed results of works investigating FFR using CTA imaging to be compared with this standard metric [70,115]. Routine quantification of FFR for coronary CTA provides the ideal reference standard for ML-based quantification from CTA imaging. Well-defined coronary vascular segments, a defined methodology for FFR determination, and the inclusion of invasive vascular quantitation for every patient allow method performances to be compared and approaches to be benchmarked. The routine quantitation of vascular disease in carotid vessels is currently limited to manually measured criteria such as those outlined by NASCET [116]. Although such measurements are clinically useful [117], the risk of inter- and intra-observer variability cannot be discounted, particularly when incorporated into large multi-national, multi-centre databases, the likes of which will be required for robust machine learning model development.
Although not a singular reference measurement, the development and adoption of MICCAI and AAPM publications providing detailed methodologies for standardised evaluation of algorithms examining stenosis and lumen segmentation or coronary vascular quantitation supports model evaluation and comparison in a similar way [76,118,119]. Research into model benchmarking has begun, but is still in the early stages, with small patient sample sizes (n = 10) and a limited number of algorithms investigated (n = 4) [120].

Image Standards
Beyond the absence of invasive quantitation as a reference standard, the variation in acquisition methodology between sites also presents a challenge for generalisation from vascular machine learning. CT image reconstruction methods (filtered back projection, iterative reconstruction, and AI enhanced reconstruction), MRI settings (bandwidth, TR and TE times, matrix size), training of sonographers, and combinations of administered dose and acquisition time in nuclear medicine, all vary with institution. Image appearance preferences also vary between institutions and even reporting clinicians within one institution, with acquisition parameters changed accordingly. Such wide variability necessitates greater robustness in any clinical models to account for application of developed technologies to image appearances not previously encountered.

Reporting Standards
Reporting of machine learning results throughout the literature was vague and inconsistent for all modalities. Statistical metrics varied widely, with combinations of sensitivity, specificity, accuracy, and AUC reported inconsistently and compared to different gold standards. The nature of ML development and analysis further complicated inter-publication comparison. The results section of each paper typically quoted sensitivity, specificity, and AUC values obtained using different image feature combinations and algorithm parameter settings (such as SVM kernel), and repeated this process for multiple algorithms, resulting in large tables of statistics encompassing a wide range of values. Finally, results in vascular quantitation studies were reported on a per-patient, per-vessel, per-vessel-segment, or per-lesion basis, with some papers reporting results for several of these.
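To illustrate why the unit of analysis matters for inter-publication comparison, the sketch below computes sensitivity and specificity from the same set of hypothetical predictions at both the per-vessel and per-patient level. The data, the vessel labels, and the aggregation rule (a patient is counted as positive if any vessel is positive) are assumptions made for illustration only and do not reproduce the method of any reviewed study.

```python
from collections import defaultdict
from typing import Iterable, Tuple


def sens_spec(pairs: Iterable[Tuple[bool, bool]]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) from (predicted, reference) pairs."""
    pairs = list(pairs)
    tp = sum(1 for pred, ref in pairs if pred and ref)
    tn = sum(1 for pred, ref in pairs if not pred and not ref)
    fp = sum(1 for pred, ref in pairs if pred and not ref)
    fn = sum(1 for pred, ref in pairs if not pred and ref)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical records: (patient_id, vessel, model_positive, reference_positive)
records = [
    ("P1", "LAD", True, True),
    ("P1", "RCA", False, True),
    ("P1", "LCx", False, False),
    ("P2", "LAD", False, False),
    ("P2", "RCA", True, False),
    ("P2", "LCx", False, False),
    ("P3", "LAD", True, True),
    ("P3", "RCA", False, False),
    ("P3", "LCx", False, False),
]

# Per-vessel analysis: every vessel is an independent observation.
per_vessel = sens_spec((pred, ref) for _, _, pred, ref in records)

# Per-patient analysis (assumed rule): positive if any vessel is positive.
by_patient = defaultdict(lambda: [False, False])
for patient_id, _, pred, ref in records:
    by_patient[patient_id][0] |= pred
    by_patient[patient_id][1] |= ref
per_patient = sens_spec((pred, ref) for pred, ref in by_patient.values())

print("per-vessel  sensitivity/specificity:", per_vessel)
print("per-patient sensitivity/specificity:", per_patient)
```

In this toy example the same predictions yield markedly different sensitivity and specificity depending on the level of aggregation, so values reported on different bases cannot be compared directly.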

Conclusions
This review highlights that open access data, or a systematic and independent validation solution, is essential to the continued growth of vascular ML quantitation research. This issue was identified and best explained by Zreik et al. [121], who, despite having access to a large imaging research facility, observed that "[with] a sufficiently large and diverse data set, a deeper CNN-only . . . analyzing a large single volume along the artery, could be employed to perform the presented analyses. However, obtaining such a large data set remains highly challenging . . .". Neural networks, and convolutional neural networks especially, have been widely recognised across multiple modalities as the most promising candidates for vascular imaging ML analysis. Despite this, other articles identified by this review [99] concluded, similarly to Zreik et al. [121], that current data is insufficient for clinical CNN implementation. Although many techniques exist for maximising model performance from limited data, large quantities of unique datasets are essential to clinical model performance [122,123]. The imaging data accessible to vascular ML researchers at present forces algorithm selection to be driven by the available datasets, rather than by which algorithms perform best or are most informative. Once standardised datasets are made readily available, transparent and standardised reporting of results will promote collaboration and improve both the development of ML techniques and clinical confidence in the use of the technology.

Conflicts of Interest:
The authors declare no conflict of interest.