Predicting CT-Based Coronary Artery Disease Using Vascular Biomarkers Derived from Fundus Photographs with a Graph Convolutional Neural Network

The study population comprised 145 patients who were prospectively recruited for coronary CT angiography (CCTA) and fundoscopy. This study first examined the association between retinal vascular changes and the Coronary Artery Disease Reporting and Data System (CAD-RADS) as assessed on CCTA. Then, we developed a graph neural network (GNN) model for predicting the CAD-RADS as a proxy for coronary artery disease. The CCTA scans were stratified by expert readers according to CAD-RADS scores, and the vascular biomarkers were extracted from the patients' fundus images. Association analyses of CAD-RADS scores were performed with patient characteristics, retinal diseases, and quantitative vascular biomarkers. Finally, a GNN model was constructed for the task of predicting the CAD-RADS score and compared against traditional machine learning (ML) models. The experimental results showed that a few retinal vascular biomarkers were significantly associated with adverse CAD-RADS scores, mainly pertaining to arterial width, arterial angle, venous angle, and fractal dimensions. Additionally, the GNN model achieved a sensitivity, specificity, accuracy and area under the curve of 0.711, 0.697, 0.704 and 0.739, respectively, outperforming the traditional ML models on the same evaluation metrics (p < 0.05). The data suggest that the retinal vasculature could be a potential biomarker for atherosclerosis in the coronary artery and that the GNN model could be utilized for accurate prediction.


Introduction
Atherosclerosis is a chronic inflammatory disease of the arteries caused by the buildup of plaques adhering to the inner vessel wall. Early detection of atherosclerosis is crucial for timely treatment and prevention. However, current clinical diagnostic techniques, such as coronary computed tomography angiography, tend to identify plaques only at advanced stages rather than early ones. More importantly, medical imaging is usually employed once clear symptoms of atherosclerosis, such as acute chest pain, are observed in high-risk patients. Early subclinical disease detection remains a challenge. Hence, additional variables for risk stratification, or even the early detection of atherosclerosis, are needed.
Changes in the micro-vasculature, such as vessel nicking/narrowing, have been recognized as early indicators of macro-vascular abnormalities. It is believed that the common pathophysiologic processes of atherosclerosis may underlie both macro-vascular and micro-vascular disease [1][2][3].
The retinal vasculature shares similar anatomical and physiological characteristics with the coronary circulation [4][5][6]. It can be non-invasively acquired by a fundus camera and quantitatively measured on the digital fundus images using machine learning techniques [7]. Several recent studies have focused on investigating the association between retinal microvascular changes and the risk factors for cardiovascular diseases. Klein et al. [8] examined the relationships of retinal arteriolar changes with clinical and subclinical manifestations of atherosclerosis. They found that the arteriolar-to-venular ratio (A/V ratio), after adjusting for present and previous blood pressure and medications, was associated with the presence of carotid plaque. Hubbard et al. [9] suggested that a lower A/V ratio was associated with increased blood pressure and incidence of cardiovascular diseases independently of other known risk factors (e.g., gender, age, blood pressure). Ikram et al. found that the venular diameters were linearly related to several markers of atherosclerosis (e.g., leukocyte count, erythrocyte sedimentation rate, HDL levels, etc.), and a lower AVR was significantly related to a higher carotid plaque score [10]. Lyu et al. showed that the occurrence of retinal vein occlusion was significantly associated with increased LDL cholesterol levels, increased brachial-ankle pulse wave velocity (baPWV) and the presence of carotid plaques [11]. Wong et al. studied the association of retinopathy (e.g., the presence of microaneurysms, hemorrhages, cotton wool spots, hard exudates, etc.) with coronary artery calcification scores as assessed on cardiac computed tomography [12].
Motivated by previous studies on retinal vascular changes and atherosclerosis, it is conceivable that quantitative retinal vascular biomarkers can be used to predict coronary artery disease, using CAD-RADS as a proxy for cardiovascular disease. In this study, we first examined the association between retinal vascular changes and CAD-RADS. Then, we utilized a graph convolutional neural network (GNN) model to predict the CAD-RADS (assessed on CCTA) using the quantitative vascular biomarkers (derived from fundus images) of the same subjects.

Materials and Methods
This prospective single-centre study included patients from a tertiary hospital in Hong Kong from May to October 2019. This study was approved by the local institutional review board. All subjects gave full written consent for the study. The patients received both a fundoscopy examination and coronary computed tomography angiography on the same day. The CAD-RADS scores for the patients were stratified based on their CCTA scans. Their fundus images were processed to extract, in total, 96 vascular biomarkers (detailed below). A diagram shown in Figure 1 summarizes the pipeline of this study. The fundus images were acquired and pre-processed by brightness normalization. (C,E): The blood vessel centerlines were manually annotated, and the bifurcation relationship at each junction was labeled. (D): The optic disc was contoured, and the optic disc diameter was measured. (F): The type of the vessels was categorized into arteries or veins. (G-J): Multiple biomarkers on the extracted vasculature, including the vessel curvature, bifurcation feature, width, and the fractal dimensions for arteries and veins, were extracted, respectively. We measured the maximum, mean, median and minimum value for each biomarker, which yielded, in total, 96 biomarkers for each fundus image. Finally, we built machine learning models to predict the corresponding CAD-RADS score.

Study Population
In this study, we enrolled 173 subjects who were ambulatory patients scheduled for outpatient radiological investigations (i.e., coronary computed tomography angiography). The subjects were recruited on the same day as their CCTA scanning for a fundoscopy examination. Patient demographics, clinical history, and comorbidities were recorded. Blood pressure and heart rate were measured at the time of enrollment.

Fundus Examination
The fundoscopy image acquisition was performed on-site in a dedicated room. A color, non-stereo, non-mydriatic fundus camera, FundusVue (Crystalvue, Taoyuan City, Taiwan), was used for the image acquisition. The acquired image size was 2592 × 1944 pixels, and all images were stored in JPEG format. Examinations were performed in ambient lighting with a dark cloth drape over the participants' heads during image acquisition. During the examination, macula-centered, 45° field-of-view retinal fundus photographs were taken of both the left and right eyes. A single screening took 3-5 min.

Grading of Fundoscopy Images and Subject Exclusion
To analyze the association between retinopathy and the CAD-RADS scores, all acquired fundus images were independently reviewed by two clinical ophthalmologists from our institution in Hong Kong with 10 and 12 years of experience in fundus image interpretation. They scored for the presence or absence of four common eye diseases: (1) tessellated retina (TR); (2) DM-related retinopathy (DM-R); (3) age-related macular degeneration (AMD); and (4) pathologic myopia (PM). The readers also graded the image quality (satisfactory/sub-optimal) during the reading. Figure 2 shows examples of a satisfactory and a sub-optimal image. If both the left and right eye fundus images were of poor quality, the patient was excluded from the study. Consequently, 28 subjects were excluded. The demographics of the remaining study population are summarized in Table 1.

Study Population
In this study, we enrolled 173 subjects who were ambulatory patients scheduled for outpatient radiological investigations (i.e., coronary computed tomography angiography). The subjects were recruited on the same day as their CCTA scanning for a fundoscopy examination. Patient demographics, clinical history, and comorbidities were recorded. Blood pressure and heart rate were measured at the time of enrollment.

Fundus Examination
The fundoscopy image acquisition was performed on-site in a dedicated room. A color, non-stereo and non-mydriatic fundus camera, FundusVue (Crystalvue, Taoyuan City, Taiwan), was used for the image acquisition. The acquired image size was 2592 × 1944 pixels, and all images were stored in JPEG compression format. Examinations were performed in ambient lighting with a dark cloth drape over the participants' heads during image acquisition. During the examination, macula-centered, 45° field-of-view retinal fundus photographs were taken for both left and right eyes. The duration of a single screening took 3-5 min.

Grading of Fundoscopy Images and Subject Exclusion
To analyze the association between retinopathy and the CAD-RADS scores, all acquired fundus images were independently reviewed by two clinical ophthalmologists from our institution in Hong Kong with 10 and 12 years of experience in fundus image interpretation. They scored for the presence or absence of four common eye diseases: (1) tessellated retina (TR), (2) DM-related retinopathy (DM-R); (3) age-related macular degeneration (AMD) and (4) pathologic myopia (PM). The readers also graded the image quality (satisfactory/sub-optimal) during the reading. Figure 2 shows an example of satisfactory and sub-optimal images, respectively. If both left and right eye fundus images were of poor quality, the patient was excluded from the study. Consequently, 28 subjects were excluded from the study. The demographics of the remained study population are summarized in Table 1.
Figure 2. The image quality of fundoscopy was graded by two ophthalmologists. The fundus images with sub-optimal quality were excluded in this study. (a) Satisfactory image; (b) sub-optimal image.

Coronary Computed Tomography Angiography (CCTA) Acquisition and Examination
Coronary CT scans were performed on a 320-row MDCT scanner (Aquilion ONE, Canon Medical Systems, Tokyo, Japan). All patients underwent a standardized premedication protocol with oral and intravenous beta-blockers to lower the heart rate to <65 bpm if needed. Sublingual glyceryl trinitrate (GTN) spray was given prior to scan acquisition. Prospective ECG gating was used. Intravenous iodinated contrast was injected during scan acquisition. Non-diagnostic scans were excluded and not considered further in the study.

CAD-RADS Score
The CAD-RADS score was developed to standardize CCTA reporting and was assessed on a per-patient basis. The CAD-RADS score represents the highest-grade coronary artery lesion documented by CCTA. It ranges from CAD-RADS 0 (zero) for the complete absence of stenosis and plaque, to CAD-RADS 5 for the presence of at least one totally occluded coronary artery [13]. Examples of CCTA for CAD-RADS 1 (minimal) to 5 (occluded) are shown in Supplementary Figure S1.
In this study, we examined the association between fundus vascular biomarkers and the CAD-RADS scores via binary classification. We binarized the original CAD-RADS scores (ranging from 0 to 5) in two ways, yielding two different predictive models. For model 1, the negative class 0 included CAD-RADS scores ≤1 (i.e., normal and minimal), and the positive class 1 included CAD-RADS scores ≥2 (i.e., mild, moderate and severe stenosis, and occlusion). This model is intended to discover early signs of atherosclerosis. For model 1, the control and abnormal groups were nearly balanced: the control group (class 0) had 70 subjects, and the abnormal group (class 1) had 75 subjects. For model 2, we divided the study population into patients of non-significant CAD-RADS (CAT = 0) and significant CAD-RADS (CAT = 1). The subjects of significant CAD-RADS had high-severity coronary artery disease: according to their clinical records, they were scheduled for surgical intervention, had prior intervention (a stent or coronary artery bypass graft (CABG) found on the CCTA), or had been stratified to CAD-RADS ≥ 4. The remainder were regarded as non-significant CAD-RADS.
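As a concrete sketch, the two labelling schemes above can be expressed as small helper functions; the argument names for intervention status are illustrative, not taken from the study's actual data dictionary.

```python
def model1_label(cad_rads: int) -> int:
    """Model 1: class 0 for CAD-RADS <= 1 (normal/minimal), class 1 for >= 2."""
    return 0 if cad_rads <= 1 else 1

def model2_label(cad_rads: int, prior_intervention: bool = False,
                 scheduled_intervention: bool = False) -> int:
    """Model 2: significant CAD-RADS (class 1) if CAD-RADS >= 4, prior
    stent/CABG, or a scheduled surgical intervention; otherwise class 0."""
    if cad_rads >= 4 or prior_intervention or scheduled_intervention:
        return 1
    return 0
```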

Fundus Biomarkers
In this study, we focused on quantitative biomarkers of the retinal microvasculature, which can be summarized into four main categories: vessel width, vessel tortuosity, vessel junction properties and vessel fractal dimensions. A diagram is shown in Figure 1. A total of 96 biomarkers were generated.

Vessel Manual Modelling
Retinal vessel skeletonization is an essential step for the quantitative analysis of retinal vessel structures. In this study, the vasculature was manually modelled on a vessel-enhanced map of the fundus images.
First of all, the green channel of the color images was extracted and brightness-normalized as it provided the best contrast for the vasculature [7]. Afterwards, the vessel enhancing method proposed by Zhang et al. was employed to obtain the vessel probability map (named soft segmentation) [14]. This technique is based on applying a set of multi-scale left-invariant derivative (LID) filters and locally adaptive derivative (LAD) filters on the orientation scores of an image. It robustly enhances elongated structures (i.e., the vasculature) and suppresses the background structure, yielding the enhancement of a complete retinal vessel network.
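A minimal sketch of the green-channel extraction and brightness normalization step; the exact normalization used in the study is not specified, so zero-mean/unit-variance scaling followed by rescaling to [0, 1] is assumed here:

```python
import numpy as np

def normalize_green_channel(rgb: np.ndarray) -> np.ndarray:
    """Extract the green channel of an RGB fundus image and normalize its
    brightness (assumed z-score, then min-max rescale to [0, 1])."""
    green = rgb[..., 1].astype(np.float64)
    green = (green - green.mean()) / (green.std() + 1e-8)   # z-score
    green = (green - green.min()) / (green.max() - green.min() + 1e-8)
    return green
```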
Afterwards, we used the open-source Java plugin "NeuronJ" for ImageJ (National Institutes of Health, Bethesda, MD, USA) to manually model the retinal vasculature on the vessel-enhanced map [15]. First, we traced the centerline for each vessel segment on the image, breaking it at branching points. Then, we determined the vessel type (i.e., artery/vein) using the default names for neurons, i.e., axon for arteries and dendrite for veins (as shown in Figure 3). Finally, we labelled the parent-child relationships of the blood vessels using the labelling system of "NeuronJ". If a vessel segment originated from the optic disc or its parent was not present in the image, we labelled its parent as "N0".

Figure 3. The retinal vasculature was manually annotated using the "NeuronJ" plugin in ImageJ. The vessel centerlines were traced on the vessel soft segmentation map. The parent-child relationships of the vessels were determined using the labelling system in "NeuronJ".

Optic Disc Labelling
The optic disc is an important landmark for retinal image analysis. In this study, we obtained the boundary of the optic disc by manually clicking 6 points and fitting a rotated ellipse to these points. Afterwards, the pixel size of the image was estimated by taking the ratio between the general optic disc diameter (1800 µm) and the longest diameter (in pixels) [16]. The pixel size was used to convert the vessel width measured in pixels to the actual width in µm.
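The pixel-size estimate can be sketched as follows; instead of a full rotated-ellipse fit, this simplified version approximates the longest disc diameter by the largest pairwise distance between the clicked boundary points:

```python
import numpy as np

OPTIC_DISC_DIAMETER_UM = 1800.0  # assumed average optic disc diameter [16]

def pixel_size_um(disc_points: np.ndarray) -> float:
    """Estimate physical pixel size (um/pixel) from ~6 manually clicked
    optic disc boundary points (N x 2 array of pixel coordinates)."""
    diffs = disc_points[:, None, :] - disc_points[None, :, :]
    longest_px = np.sqrt((diffs ** 2).sum(-1)).max()  # largest pairwise distance
    return OPTIC_DISC_DIAMETER_UM / longest_px
```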

Vessel Width
Many clinical studies have shown that changes in retinal vessel caliber are associated with the progression of a variety of systemic diseases [5,[17][18][19]. In this study, we measured the vessel width using the method proposed by Huang et al. [20,21]. In brief, this technique utilizes the geodesic active contour model proposed by Caselles et al. [22]. For every vessel segment, it initializes an enclosed contour by expanding the extracted vessel centerline. The contour is then iteratively deformed to fit a smooth boundary over the vessel segment on the normalized green-channel image. We computed the distance from one detected vessel edge to the other for each control point on the contour. Measured distances with extreme values were eliminated to avoid outliers. The final vessel width was estimated as the average of the remaining edge distances.
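The outlier-trimmed averaging of the edge distances can be sketched as below; the 10% trimming fraction is an assumed value, not reported in the paper:

```python
import numpy as np

def robust_vessel_width(edge_distances, trim_pct: float = 10.0) -> float:
    """Average edge-to-edge distances after discarding extreme values at
    both ends, mirroring the outlier elimination described above."""
    d = np.sort(np.asarray(edge_distances, dtype=float))
    k = int(len(d) * trim_pct / 100.0)  # number of samples trimmed per tail
    kept = d[k:len(d) - k] if len(d) > 2 * k else d
    return float(kept.mean())
```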

The processing pipeline for the width measurement of the vessel segments is summarized in Figure 4.

Vessel Tortuosity
Vessel tortuosity is another important biomarker of the vasculature, commonly defined as the integral of curvature along the vessel centerline. In this study, we computed a total of 14 curvature-based metrics for every vessel segment, as summarized by Kalitzeos et al. [23].
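As an illustration, two representative tortuosity measures (the arc-to-chord ratio and the total absolute turning angle) can be computed from a sampled centerline; these are examples in the spirit of the 14 metrics, not the exact set from [23]:

```python
import numpy as np

def tortuosity_metrics(points: np.ndarray) -> dict:
    """Tortuosity measures for a sampled centerline (N x 2 array of points):
    arc-to-chord ratio (1.0 for a straight vessel) and total turning angle
    (sum of absolute direction changes along the curve, in radians)."""
    seg = np.diff(points, axis=0)                  # segment vectors
    arc = np.linalg.norm(seg, axis=1).sum()        # total arc length
    chord = np.linalg.norm(points[-1] - points[0]) # end-to-end distance
    angles = np.unwrap(np.arctan2(seg[:, 1], seg[:, 0]))
    turning = np.abs(np.diff(angles)).sum()
    return {"arc_chord_ratio": arc / chord, "total_turning": turning}
```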

Bifurcation Junction Parameters
The branching pattern at vessel bifurcations might reflect pathological changes in the circulatory system. It is believed that the vasculature is not a totally random network but has developed under certain optimality principles, e.g., minimum friction between blood flow and the vessel wall, optimal blood supply, and shortest transport distances. In this study, we calculated the bifurcation properties proposed by Al-Diri et al. [24]. In brief, let d0, d1 and d2 be the widths of the parent vessel and its two daughter vessels, θ1 and θ2 be the angles between each daughter vessel and the parent vessel, and θ be the angle between the two daughter vessels. The bifurcation optimality can be quantitatively measured by various metrics, including the asymmetry ratio α = d2²/d1².
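A hedged sketch of the bifurcation features: the asymmetry ratio follows the standard definition α = d2²/d1², the area ratio β = (d1² + d2²)/d0² is another commonly used optimality metric, and the branching angle θ is computed from the daughter direction vectors; the remaining metrics of [24] are omitted here:

```python
import numpy as np

def bifurcation_features(d0: float, d1: float, d2: float,
                         v1: np.ndarray, v2: np.ndarray) -> dict:
    """Example bifurcation metrics from parent width d0, daughter widths
    d1 >= d2, and daughter direction vectors v1, v2."""
    alpha = (d2 ** 2) / (d1 ** 2)                  # asymmetry ratio
    beta = (d1 ** 2 + d2 ** 2) / (d0 ** 2)         # area ratio
    cos_t = np.dot(v1, v2) / (np.linalg.norm(v1) * np.linalg.norm(v2))
    theta = np.degrees(np.arccos(np.clip(cos_t, -1.0, 1.0)))  # branching angle
    return {"asymmetry_ratio": alpha, "area_ratio": beta,
            "branching_angle_deg": theta}
```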

Vessel Fractal Dimensions
The last vascular biomarker, which describes the overall vascular changes, is the fractal dimension. The fractal dimension measures the general complexity of a self-similar structure, i.e., the vascular tree. In this study, we computed three fractal dimensions that are widely used in the literature for the vasculature (the centerlines): (1) the box dimension DB; (2) the information dimension DI; and (3) the correlation dimension DC [25][26][27].
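The box dimension DB, for example, can be estimated by the standard box-counting procedure; a minimal sketch (the study's exact implementation may differ):

```python
import numpy as np

def box_counting_dimension(mask: np.ndarray, sizes=(1, 2, 4, 8, 16)) -> float:
    """Estimate the box dimension D_B of a binary centerline mask by counting
    occupied boxes at several scales and fitting log(count) vs log(1/size)."""
    counts = []
    for s in sizes:
        h = (mask.shape[0] // s) * s
        w = (mask.shape[1] // s) * s
        blocks = mask[:h, :w].reshape(h // s, s, w // s, s)
        counts.append(blocks.any(axis=(1, 3)).sum())  # occupied boxes of size s
    slope, _ = np.polyfit(np.log(1.0 / np.asarray(sizes)), np.log(counts), 1)
    return float(slope)
```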


The GraphSAGE Model
Applying graph analysis to population data for disease prediction is highly beneficial [28]. A population graph is constructed from two elements: nodes and edges, where the nodes are individuals from a population pool, and the edges are user-defined linkages/similarities between every two persons. In this study, we built our population graph at the image level, where the graph nodes represent the fundus image samples of the left/right eyes of the participants. The nodes are characterized by the quantitative vascular biomarkers introduced above. For the edges, we defined a similarity score based on the age and gender difference between every two participants. In brief, if two subjects had the same gender and their age difference was less than 5 years, their similarity score was 2. If only one of the criteria matched, the similarity score was 1. If neither criterion matched, the similarity score was 0, and the edge between their nodes was removed. The details of the similarity score are summarized in Appendix A.
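The similarity score and edge construction described above can be sketched directly:

```python
def similarity_score(age_a, sex_a, age_b, sex_b) -> int:
    """Pairwise similarity for the population graph: +1 for same gender,
    +1 for an age difference below 5 years; an edge is kept only if > 0."""
    score = 0
    if sex_a == sex_b:
        score += 1
    if abs(age_a - age_b) < 5:
        score += 1
    return score

def build_edges(subjects):
    """subjects: list of (age, sex) tuples; returns weighted edges (i, j, w)."""
    edges = []
    for i in range(len(subjects)):
        for j in range(i + 1, len(subjects)):
            w = similarity_score(*subjects[i], *subjects[j])
            if w > 0:
                edges.append((i, j, w))
    return edges
```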
After the graph was constructed, we trained a graph convolutional neural network for the CAD-RADS prediction. In this study, we utilized the graph sample and aggregate network (GraphSAGE) network for the CAD-RADS score prediction [29]. It is a graph neural network framework developed for inductive representation learning on large graphs, which have rich node attributes such as high-dimensional node features and complicated connectivity between the nodes. During the training, the network learns aggregation functions that generate low-dimensional vectors from the embedding of the high-dimensional node features and the complicated node connectivity. We used a two-layer architecture with a sum-readout layer as the prediction model. The graph construction and the GNN model were summarized in Figure 5. The network architecture is summarized in Appendix B.


Traditional Machine Learning Models
Besides the GraphSAGE model, we also examined the five most commonly used machine learning models on the same dataset for comparison. We tested the logistic regression classifier (LR), the linear discriminant analysis classifier (LDA), the k-nearest neighbour classifier (kNN), the naïve Bayes classifier (NB) and the support vector machine classifier (SVM). All these classifiers were built-in modules in the scikit-learn (version 1.0.1) python package [30].
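A minimal sketch of such a comparison with scikit-learn, using a synthetic stand-in for the 96-biomarker matrix and the binarized CAD-RADS labels (the actual study used its own data and repeated 10-fold cross-validation):

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC

# Synthetic stand-in for the biomarker matrix and binary labels.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

classifiers = {
    "LR": LogisticRegression(max_iter=1000),
    "LDA": LinearDiscriminantAnalysis(),
    "kNN": KNeighborsClassifier(),
    "NB": GaussianNB(),
    "SVM": SVC(),
}
for name, clf in classifiers.items():
    auc = cross_val_score(clf, X, y, cv=10, scoring="roc_auc").mean()
    print(f"{name}: mean AUC = {auc:.3f}")
```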

Feature Selection and Dimensionality Reduction
Feature reduction was performed prior to training the GNN model and the machine learning classifiers. In this study, we examined various feature selection techniques as summarized by Li et al. [31]. For each CAD-RADS model, we selected the feature selection technique giving the best area under the curve (AUC) in a repeated 10-fold cross-validation and reported the corresponding classification performance. The final selected feature selection techniques included: (1) correlation-based feature selection (CFS) [32]; (2) conditional mutual information maximization (CMIM) [33]; (3) double input symmetrical relevance (DISR) [34]; (4) interaction capping (ICAP) [35]; (5) Laplacian score-based feature selection (LAP) [36]; (6) support vector machine backward (SVMB); and (7) no feature selection (all).
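As a simple illustration of a univariate correlation-based filter (in the spirit of CFS, though the techniques compared in [31,32] are more elaborate):

```python
import numpy as np

def rank_features_by_correlation(X: np.ndarray, y: np.ndarray, k: int):
    """Rank features by the absolute Pearson correlation between each column
    of X and the binary label y; return the indices of the top k features."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    corr = (Xc * yc[:, None]).sum(0) / (
        np.sqrt((Xc ** 2).sum(0)) * np.sqrt((yc ** 2).sum()) + 1e-12)
    return np.argsort(-np.abs(corr))[:k]
```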

Statistical Study
We examined the odds ratio of the CAD-RADS scores to the presence of the four eye diseases, diagnosed by the ophthalmologists, by multinomial logistic regression analysis using SPSS 26 [37]. In the multinomial logistic regression analysis, we firstly adjusted for age and gender (denoted as OR-model 1) and additionally adjusted for cardiovascular risk factors, including systolic and diastolic blood pressure, heart rate, body mass index (BMI), diabetes stage and current cigarette smoking (denoted as OR-model 2).
To evaluate the performance of the machine learning classifiers, we applied a repeated 10-fold cross-validation procedure and computed the average sensitivity (Sens.), specificity (Spec.), accuracy (Accu.), the receiver operating characteristic (ROC) curve and its area under the curve (AUC), F1-scores and precision. We measured these metrics at both the image-wise and subject-wise levels. The subject-wise results were obtained by averaging the left and right eye predictions (if both eyes were available). McNemar's test was used to assess the statistical difference between the GraphSAGE model and the traditional machine learning models.
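The per-class metrics and the subject-wise averaging can be sketched as follows; the dictionary-based grouping of eye scores is illustrative:

```python
import numpy as np

def classification_metrics(y_true, y_pred) -> dict:
    """Sensitivity, specificity and accuracy from binary labels/predictions."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {"sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
            "accuracy": (tp + tn) / len(y_true)}

def subject_wise_scores(eye_scores: dict) -> dict:
    """Average left/right eye probabilities per subject when both are
    available, as done for the subject-wise evaluation. `eye_scores` maps
    a subject id to a list of one or two per-image scores."""
    return {sid: float(np.mean(s)) for sid, s in eye_scores.items()}
```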
We assessed the importance of each feature by examining its odds ratio and p-value (with a 95% confidence interval) across the CAD-RADS scores in the multinomial logistic regression analysis. p < 0.05 indicates that the feature is significantly associated with the CAD-RADS score.

Results
A total of 145 subjects were included in the study, of which 78 subjects had both eyes included and 67 subjects had only one eye included (either the left or the right). The characteristics of the study population across model 1 and model 2 are shown in Tables 1 and 2.
For the model 2 non-significant CAD-RADS group, 44/108 (40.74%) subjects had retinopathy. Of these, 25%, 3.7%, 12.04% and 3.7% of subjects were diagnosed with tessellated retina, DM-related retinopathy, AMD and pathologic myopia, respectively. For the model 2 significant CAD-RADS group, 13/37 (35.14%) subjects had retinopathy. Of these, 16.22%, 5.41% and 10.81% were diagnosed with tessellated retina, DM-related retinopathy and AMD, respectively. The differences in age and BMI between CAT = 0 and CAT = 1 were not significant (Mann-Whitney test, p = 0.32 and p = 0.87, respectively). Subjects of CAT = 1, compared to those of CAT = 0, generally had lower heart rates (Mann-Whitney test, p = 0.054) and were more likely to be male (Chi-square test, p < 0.05). No differences were found in blood pressure (systolic or diastolic), tobacco usage or education level via Chi-square testing.
In the multinomial logistic regression shown in Table 2, after adjusting for age and gender (OR-model 1), and additionally adjusting for cardiovascular risk factors (OR-model 2), we found no associations between the two CAD-RADS models and the ophthalmologist-diagnosed retinopathy (see Supplementary Table S1). In Supplementary Table S2, we summarize the results between the two CAD-RADS models and each of the extracted retinal vascular biomarkers. For CAD-RADS model 1, a few retinal vascular biomarkers were significantly associated with adverse CAD-RADS scores, mainly pertaining to arterial width, arterial angle, venous angle, and features relating to fractal dimensions. Table 3 summarizes the classification results of the CAD-RADS GNN models (model 1 and model 2). In this study, we report the performance of the models at both the image-wise and subject-wise levels.

Discussion
Common fundus examinations focus only on finding pathological patterns, such as micro-aneurysms, exudates and edema [7]. It is almost impossible for ophthalmologists to quantitatively measure the various properties of the blood vessels, as no routine clinical metrics exist for this purpose. Therefore, artificial intelligence and machine learning techniques are essential to access these parameters for disease prognostics. In this study, we investigated the association between changes in the retinal vasculature and changes in the coronary vasculature. The retinal vasculature was manually annotated and quantitatively characterized by 96 vascular biomarkers. The stenosis severity of the coronary artery was assessed with the CAD-RADS score, a standardized CAD reporting system. We attempted to use fundus vascular features to predict patients' CAD-RADS scores with a graph neural network. We divided our dataset in two ways: (1) CAD-RADS ≤ 1 versus CAD-RADS ≥ 2 (model 1); (2) CAT = 0 versus CAT = 1 (model 2).
In this study, we examined the association between each fundus feature and the CAD-RADS score using multinomial logistic regression analysis. As summarized in Supplementary Table S2, the widths of both arteries and veins were significantly associated with the CAD-RADS (p < 0.05). This finding is consistent with other studies carried out among patients with diabetes, cardiovascular diseases and atherosclerosis [3,12,38]. Moreover, we found that the curvatures for both arteries and veins were significantly associated with the CAD-RADS (p < 0.05). The bifurcation angle and the fractal dimension showed no significant difference.
The results in Table 3 imply that linking the individuals in a graph based on their similarity in age and gender was beneficial to the predictive task. In our study population, 78 subjects had both the left and the right eye images included, while 67 subjects (46%) had one eye image excluded due to suboptimal image quality as assessed by ophthalmologists. We therefore built our prediction models at the image-wise level and evaluated the performance at both the image-wise and subject-wise levels. Comparing the image-wise and subject-wise classification in Table 3, the AUC of the GNN on model 1 was 0.739 (95% CI: (0.675, 0.804)) for image-wise classification and 0.769 (95% CI: (0.708, 0.831)) for subject-wise classification. For model 2, the AUC of the GNN was 0.692 (95% CI: (0.608, 0.775)) for image-wise classification and 0.742 (95% CI: (0.662, 0.822)) for subject-wise classification.
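The subject-wise scores are derived from the image-wise predictions; the exact aggregation rule is not specified here, but a minimal sketch, assuming the mean of each subject's image-level probabilities is taken (the `subject_level_probs` helper is hypothetical), could look as follows:

```python
import numpy as np

def subject_level_probs(image_probs, subject_ids):
    """Aggregate image-wise predicted probabilities to the subject level.

    The aggregation rule (mean over a subject's available images) is an
    assumption for illustration; subjects with one usable eye image simply
    contribute a single probability.
    """
    per_subject = {}
    for p, sid in zip(image_probs, subject_ids):
        per_subject.setdefault(sid, []).append(p)
    return {sid: float(np.mean(ps)) for sid, ps in per_subject.items()}
```

With this rule, a subject imaged in both eyes gets the average of the two image-level scores, while a single-eye subject keeps its one score unchanged.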
Although the difference was not statistically significant, using images from both eyes tended to yield better predictions, as indicated by the higher AUC, particularly for model 2, which predicts more severe or established coronary artery disease.
We performed a comparison study of the performance achieved with one-, two-, three- and four-layer GraphSAGE architectures. The accuracies with 95% CIs were 88.10% (83.10-93.10), 82.50% (76.60-88.40), 75.00% (68.30-81.70) and 73.10% (66.30-80.00), respectively. The p-values of the Z-test (mean test) showed that the one-layer architecture was significantly better than the three- and four-layer architectures (p < 0.001), while the difference between the one- and two-layer architectures was not significant (p = 0.35). Since two- and three-layer architectures are commonly used, according to the work by Dwivedi et al. [39], we utilized a two-layer architecture in this study.
There are some limitations to our study. First, although the comparison between the GNN and traditional machine learning models showed that the graph-based model outperformed the traditional methods, the small sample size of the study population prevented us from building a prognostic model with optimal diagnostic performance. This study is therefore only a proof-of-concept, and prospective multi-centre studies with larger samples will be needed. Second, we obtained the retinal vasculature by manual annotation using the "ImageJ" software. Since we had a limited number of fundus images and wanted to include as many blood vessels as possible, we decided to annotate the vasculature manually and to focus on developing the pipeline for vascular feature extraction and building the classification models. In the future, when more fundus images are available, a fully automatic blood vessel segmentation and modeling technique will be required for extracting the vascular structure [40][41][42]. Third, we only included similarity scores based on two types of demographic information (i.e., age and gender). We also tried incorporating other patient demographics, such as smoking status, education level and BMI, but the classification results did not improve significantly. This might be due to the small sample size of our study population, such that introducing more complicated similarity scores did not help improve the GNN model's performance. Finally, to simplify the study, the ophthalmologists labeled only four common eye diseases, i.e., tessellated retina, DM-related retinopathy, AMD and pathologic myopia. Other eye diseases, such as hypertensive retinopathy, cataracts and glaucoma, were not included in the list of abnormal conditions, and their associations with atherosclerosis were not studied. In our future work, we plan to assess them in a larger population study.

Conclusions
In this study, we introduced graph analysis and a graph convolutional neural network for the prediction of the CAD-RADS using quantitative retinal vascular biomarkers. The classification performance outperformed that of traditional machine learning models. The results showed that, by linking the subjects via their age and gender, the retinal vasculature is potentially predictive of the stenosis severity in the coronary artery. These data suggest that common pathophysiologic processes may underlie changes in both the micro-vasculature and the macro-vasculature.

Informed Consent Statement:
Written informed consent has been obtained from the patient(s) to publish this paper.

Data Availability Statement:
The datasets used in this current study are available from the corresponding author on reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.

Appendix A
Let n_i and n_j be two nodes representing two images of patients, g_i and g_j their genders, and a_i and a_j their ages. The similarity score (denoted SimScore) is calculated as SimScore(n_i, n_j) = Diff(g_i, g_j) + Diff(a_i, a_j), where Diff(g_i, g_j) = 1 if g_i = g_j and 0 if g_i ≠ g_j, and Diff(a_i, a_j) = 1 if |a_i − a_j| ≤ 5 and 0 if |a_i − a_j| > 5.
The value of SimScore(n_i, n_j) can be 0, 1 or 2; SimScore(n_i, n_j) > 0 results in an edge between nodes n_i and n_j. A larger SimScore(n_i, n_j) indicates a stronger connection between the two nodes.
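The edge-construction rule above can be sketched directly in code; representing each node as a (gender, age) tuple and the `build_edges` helper are illustrative assumptions, not the paper's implementation:

```python
def diff_gender(g_i, g_j):
    # Diff(g_i, g_j): 1 if the two patients share the same gender, else 0
    return 1 if g_i == g_j else 0

def diff_age(a_i, a_j):
    # Diff(a_i, a_j): 1 if the two ages differ by at most 5 years, else 0
    return 1 if abs(a_i - a_j) <= 5 else 0

def sim_score(node_i, node_j):
    # SimScore(n_i, n_j) = Diff(g_i, g_j) + Diff(a_i, a_j); each node is
    # represented here as a (gender, age) tuple for illustration
    (g_i, a_i), (g_j, a_j) = node_i, node_j
    return diff_gender(g_i, g_j) + diff_age(a_i, a_j)

def build_edges(nodes):
    # An edge is created whenever SimScore > 0; the score (1 or 2) is kept
    # as an edge weight indicating connection strength
    edges = []
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            s = sim_score(nodes[i], nodes[j])
            if s > 0:
                edges.append((i, j, s))
    return edges
```

For example, two male patients aged 60 and 62 receive SimScore = 2 and are linked, while a male patient aged 60 and a female patient aged 70 receive SimScore = 0 and remain unconnected.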

Appendix B
In this study, we built a GraphSAGE network with two SAGEConv layers using the mean-pooling aggregator. The network architecture is summarized in Table A1 below.
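As a rough illustration of what a single SAGEConv layer with the mean aggregator computes, here is a minimal numpy sketch; the weight matrices, the ReLU placement and the dense adjacency representation are illustrative assumptions, and a real implementation would use a graph learning library such as PyTorch Geometric:

```python
import numpy as np

def sage_conv_mean(H, adj, W_self, W_neigh):
    """One GraphSAGE convolution with the mean aggregator (numpy sketch).

    H:       (n_nodes, in_dim) node feature matrix
    adj:     (n_nodes, n_nodes) binary adjacency matrix
    W_self:  (in_dim, out_dim) weights applied to each node's own features
    W_neigh: (in_dim, out_dim) weights applied to the neighbour mean
    """
    adj = np.asarray(adj, dtype=float)
    deg = adj.sum(axis=1, keepdims=True)
    deg[deg == 0] = 1.0                       # isolated nodes keep a zero mean
    neigh_mean = (adj @ H) / deg              # mean over each node's neighbours
    out = H @ W_self + neigh_mean @ W_neigh   # combine self and neighbour terms
    return np.maximum(out, 0.0)               # ReLU non-linearity
```

Stacking two such layers, as in the architecture described above, lets each node's prediction draw on features up to two hops away in the patient-similarity graph.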