Diagnostic Value of Fully Automated Artificial Intelligence Powered Coronary Artery Calcium Scoring from 18F-FDG PET/CT

Objectives: The objective of this study was to assess the feasibility and accuracy of a fully automated artificial intelligence (AI) powered coronary artery calcium scoring (CACS) method on ungated CT in oncologic patients undergoing 18F-FDG PET/CT. Methods: A total of 100 oncologic patients examined between 2007 and 2015 were retrospectively included. All patients underwent 18F-FDG PET/CT and cardiac SPECT myocardial perfusion imaging (MPI) by 99mTc-tetrofosmin within 6 months. CACS was manually performed on non-contrast ECG-gated CT scans obtained from SPECT-MPI (i.e., reference standard). Additionally, CACS was performed using a cloud-based, user-independent tool (AI-CACS) on ungated CT scans from 18F-FDG-PET/CT examinations. Agatston scores from the manual CACS and AI-CACS were compared. Results: On a per-patient basis, the AI-CACS tool achieved a sensitivity and specificity of 85% and 90% for the detection of CAC. Interscore agreement of CACS between manual CACS and AI-CACS was 0.88 (95% CI: 0.827, 0.918). Interclass agreement of risk categories was 0.8 in weighted Kappa analysis, with a reclassification rate of 44% and an underestimation of one risk category by AI-CACS in 39% of cases. On a per-vessel basis, interscore agreement of CAC scores ranged from 0.716 for the circumflex artery to 0.863 for the left anterior descending artery. Conclusions: Fully automated AI-CACS as performed on non-contrast free-breathing, ungated CT scans from 18F-FDG-PET/CT examinations is feasible and provides an acceptable to good estimation of CAC burden. CAC load on ungated CT is, however, generally underestimated by AI-CACS, which should be taken into account when interpreting imaging findings.


Introduction
Hybrid 18F-fluorodeoxyglucose positron emission tomography (18F-FDG PET) with computed tomography (CT) has evolved as an important imaging modality for staging and restaging of oncological patients [1]. Clinical 18F-FDG PET/CT examinations consist of a PET scan and a non-contrast, free-breathing, ungated CT. The CT is used for (a) PET attenuation correction but (b) also includes relevant morphological information regarding disease/tumor extent. Even though the appropriate oncological diagnosis and treatment planning is the primary concern in cancer patients, relevant comorbidities should not be underestimated and should ideally be described in the imaging report. Indeed, a recent population-based study including more than 3 million cancer patients indicated that the highest number of cardiovascular deaths occurred in the first year following initial cancer diagnosis [2].
Coronary artery calcium (CAC) is an important biomarker in patients with coronary heart disease (CHD) [3,4]. Increased CAC scores are strongly associated with cardiovascular mortality and all-cause mortality [3]. Hybrid 18F-FDG PET/CT examinations are generally not suited for the comprehensive evaluation of CAC or CHD, as the CT scan is neither acquired nor reconstructed with the appropriate scan parameters as recommended for dedicated cardiac CT calcium scans. Therein, the lack of ECG-gating, the use of iterative reconstruction algorithms, and the specific choice of field of view, slice thickness, kernel, and tube voltage can be challenging [5][6][7]. Nonetheless, an opportunistic screening resulting in the rough estimation of the coronary disease burden by means of CAC would be highly desirable; indeed, this was recommended in a recent consensus statement of the British Societies of Cardiovascular Imaging/Cardiac Computed Tomography and Thoracic Imaging [8]. Optimally, this assessment (i.e., CAC scoring, CACS) should be performed fully automatically so that the physician can continue to focus on the oncological workup of the scan. With recent advances in the field of artificial intelligence (AI) for medical imaging [9][10][11][12], deep-learning (DL) powered calcium scoring tools have been developed that allow for the quantitative assessment of CAC in a fully automated manner [11][12][13][14].
Given the considerations outlined above, these tools would be suited for the opportunistic assessment of CAC in patients undergoing oncologic 18F-FDG PET/CT examinations, as quantitative CAC scores are provided without having to perform CAC scoring manually.
In this study, we sought to test the feasibility of such an approach. Specifically, we assessed the quantitative accuracy of an AI-powered CACS tool in estimating CAC from CT scans acquired during oncologic 18F-FDG PET/CT examinations using manual CACS measurements from a dedicated cardiac imaging workup as the standard of reference.

Study Population and Study Design
This study was approved by the local ethics committee (BASEC No. 2017-01112; Kantonale Ethikkommission, Kanton Zürich, Switzerland; secondary approval on 07.04.2021), and the need for informed consent was waived due to the retrospective nature of the study. The study population was partly shared in previous studies [1,15]. Our study population was selected from a retrospective cohort study of consecutive patients undergoing (a) a whole-body 18F-FDG-PET/CT for malignant disorders at the University Hospital of Zurich between November 2007 and February 2015, and (b) 1-day stress/rest (regadenoson, adenosine, dobutamine, or exercise) myocardial perfusion imaging by 99mTc-tetrofosmin single-photon emission computed tomography (SPECT-MPI) including non-contrast, ECG-gated CT for attenuation correction within 6 months of 18F-FDG-PET/CT imaging to evaluate known or suspected CAD (Figure 1). CAC scoring was performed manually on the dedicated non-contrast ECG-gated CT scans (120 kV, reconstructed with weighted filtered back projection, a slice thickness of 3 mm, and an increment of 1.5 mm) as obtained during myocardial perfusion imaging by two experienced physicians in consensus (i.e., reference standard) using a dedicated software program (Smartscore, GE Healthcare, Milwaukee, WI, USA) [16]. Out of 100 selected patients, 20 patients were identified with a CAC score of 0, 16 patients with a score of 1-100, 23 patients with a score of 101-400, and 41 patients with a score of >400. An overview of the patient demographics is provided in Table 1. Next, scores from manual CACS as performed on dedicated non-contrast ECG-gated CT scans (i.e., reference standard) were compared to scores from AI-CACS as performed on CT scans from 18F-FDG-PET/CT imaging (see: Section 2.3).
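The four score ranges listed above (0, 1-100, 101-400, >400) can be read as ordinal risk classes. The following Python sketch illustrates that mapping under stated assumptions: the function name `risk_category`, the class labels 0 to 3, and the treatment of fractional scores between 0 and 1 as class 1 are illustrative choices, not taken from the study. The authors performed their analyses in R; Python is used here only for illustration.

```python
def risk_category(agatston: float) -> int:
    """Map an Agatston score to an ordinal risk class (0-3).

    Class boundaries follow the ranges reported in the text:
    0, 1-100, 101-400, >400. The numeric labels are an assumption.
    """
    if agatston < 0:
        raise ValueError("Agatston score cannot be negative")
    if agatston == 0:
        return 0
    if agatston <= 100:
        return 1  # assumption: any score above 0 and up to 100
    if agatston <= 400:
        return 2
    return 3


# Hypothetical scores placed at the class boundaries
example_scores = [0, 1, 100, 101, 400, 401]
categories = [risk_category(score) for score in example_scores]

# Reference-standard distribution as reported in the text
reported_counts = {0: 20, 1: 16, 2: 23, 3: 41}
total_patients = sum(reported_counts.values())
```

The reported per-class counts sum to the 100 included patients, which serves as a quick consistency check of the cohort description.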
As a preliminary proof-of-concept, we also tested whether manual CACS and AI-CACS can theoretically be performed on all datasets (i.e., non-contrast ECG-gated CT scans from SPECT-MPI and CT scans from 18F-FDG-PET/CT imaging) effortlessly. Thus, manual CACS and AI-CACS were performed on all datasets of 15 patients. These data are provided solely in the Supplementary Material.

Whole Body 18F-FDG PET/CT Including Ungated CT
Patients underwent PET/CT imaging from skull to pelvis one hour after injection of 18F-FDG (including a non-contrast, free-breathing, ungated CT scan). Images were acquired in 3D mode on a Discovery VCT or Discovery RX scanner (GE-Healthcare, Milwaukee, WI, USA). PET/CT and CT images were merged and analyzed using Advantage Window Volume Viewer software (GE-Healthcare, Milwaukee, WI, USA) [1].

Fully Automated AI-CAC Scoring
AI-CACS was performed with a fully automated deep-learning based CAC scoring tool (AVIEW CAC, Coreline Soft, accessed via https://cloud.corelinesoft.eu/login on 1 May 2022). In brief, the software was developed based on a 3-dimensional U-net architecture using non-enhanced cardiac CT scans acquired from multiple vendors and scanners. A more detailed description of the network architecture and the algorithm, including information on initial training datasets and validation procedures, can be found elsewhere [17,18]. No training data were included in this current study [14,17]. Thus, this study represents an external validation and test of the AI-CACS algorithm under clinically realistic conditions. Initially, the non-contrast, free-breathing, ungated CT scans from the 18F-FDG PET/CT examination were postprocessed in the hospital's PACS by cropping the image series. Specifically, a second dataset encompassing all images from the lung apex to the lung base was generated for each patient. This anonymized image series was then transferred to the AI tool. Fully automated CACS was then performed without any further user input. The results from CACS were then summarized in a report generated by the AI tool.

Statistical Analysis
The data were initially presented with descriptive statistics. Diagnostic accuracy parameters were computed to quantify the AI tool's ability to correctly identify coronary calcium relative to the reference standard. Quantitative CAC scores between the AI tool and the reference standard were compared by means of intraclass correlation coefficient (ICC) analysis, linear regression modelling, and Bland-Altman analysis. Interclass agreement of CAC risk category classes was quantified by means of weighted Kappa analysis. For ICC and weighted Kappa analysis, the following scale was considered for results interpretation: poor (ICC, k < 0.20), fair (ICC, k = 0.21-0.40), moderate (ICC, k = 0.41-0.60), good (ICC, k = 0.61-0.80), and excellent (ICC, k = 0.81-1.00) agreement [19]. All statistical analyses were performed in the R programming language (https://www.r-project.org accessed on 1 May 2022).
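As an illustration of the agreement measures named above, the following Python sketch implements a Bland-Altman summary, a weighted Kappa for ordinal classes, and the interpretation scale quoted in the text. It is a minimal sketch under assumptions, not the authors' code (their analysis was done in R): the function names are hypothetical, the 1.96 multiplier for the limits of agreement and the linear weighting scheme are conventional defaults that the text does not specify, and the input data are invented toy values.

```python
from statistics import mean, stdev


def bland_altman(method_a, method_b):
    """Return (bias, lower limit, upper limit) for paired measurements.

    Assumption: limits of agreement are bias +/- 1.96 * SD of the differences.
    """
    diffs = [a - b for a, b in zip(method_a, method_b)]
    bias = mean(diffs)
    sd = stdev(diffs)  # sample standard deviation
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


def weighted_kappa(rater_a, rater_b, n_classes, weights="linear"):
    """Weighted Kappa for ordinal classes coded 0..n_classes-1.

    Assumption: 'linear' or 'quadratic' disagreement weights; the text
    does not state which scheme was used.
    """
    n = len(rater_a)
    observed = [[0.0] * n_classes for _ in range(n_classes)]
    for a, b in zip(rater_a, rater_b):
        observed[a][b] += 1.0 / n
    row = [sum(observed[i]) for i in range(n_classes)]
    col = [sum(observed[i][j] for i in range(n_classes)) for j in range(n_classes)]
    power = 1 if weights == "linear" else 2
    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(n_classes):
        for j in range(n_classes):
            w = (abs(i - j) / (n_classes - 1)) ** power  # disagreement weight
            weighted_observed += w * observed[i][j]
            weighted_expected += w * row[i] * col[j]
    if weighted_expected == 0:
        raise ValueError("Kappa is undefined when all ratings fall in one class")
    return 1.0 - weighted_observed / weighted_expected


def interpret_agreement(value):
    """Apply the interpretation scale quoted in the text."""
    if value <= 0.20:
        return "poor"
    if value <= 0.40:
        return "fair"
    if value <= 0.60:
        return "moderate"
    if value <= 0.80:
        return "good"
    return "excellent"


# Toy data (invented for illustration only)
toy_bias, toy_lower, toy_upper = bland_altman([10, 20, 30], [8, 19, 27])
toy_kappa = weighted_kappa([0, 1, 2, 3], [0, 1, 2, 3], n_classes=4)
```

Applied to the values reported in this study, `interpret_agreement(0.88)` falls in the "excellent" band and `interpret_agreement(0.800)` in the "good" band of the quoted scale.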

Results
A total of 100 patients who underwent both whole-body 18F-FDG PET/CT and SPECT-MPI (including a dedicated ECG-gated CACS) within a 6-month period were enrolled. The AI-CACS tool successfully processed the dataset (i.e., ungated low-dose CT) in all patients.

Diagnostic Accuracy of AI-CACS for the Detection of Coronary Calcifications
The sensitivity of the AI-CACS tool for the detection of coronary calcifications was 85.0% on a per-patient basis. On a per-vessel basis, sensitivity was 74.5% for the left main (LM), 82.0% for the left anterior descending (LAD), 64.2% for the left circumflex (LCX), and 61.7% for the right coronary artery (RCA). Further results of the per-patient and per-coronary artery diagnostic performance of AI-CACS are presented in Table 2.
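The per-patient figures can be related to a 2x2 table with a short Python sketch. The counts below are not taken from Table 2; they are back-calculated under the assumption that the 80 patients with a reference CAC score above 0 and the 20 patients with a score of 0 form the two reference groups, combined with the reported 85% sensitivity, 90% specificity, and the two false positive patients mentioned in the text. The function name is hypothetical.

```python
def diagnostic_accuracy(tp, fn, tn, fp):
    """Standard diagnostic accuracy parameters from a 2x2 table."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
    }


# Back-calculated counts (assumption, see lead-in): 80 reference-positive
# and 20 reference-negative patients.
per_patient = diagnostic_accuracy(tp=68, fn=12, tn=18, fp=2)

sensitivity_percent = round(per_patient["sensitivity"] * 100, 1)
specificity_percent = round(per_patient["specificity"] * 100, 1)
```

Under these assumed counts the sketch reproduces the reported per-patient sensitivity and specificity; the remaining parameters follow from the same assumed counts and should be read against Table 2 rather than taken from this example.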

Quantitative Agreement of AI-CACS with Manual CACS
Interscore agreement (i.e., ICC) of CAC scores between the AI tool and manual measurements as the reference standard was 0.88 (95% CI: 0.827, 0.918). The linear regression model between CAC scores of the AI tool and the reference standard (Figure 2) revealed an R² of 0.84, an intercept of 180, and a slope of 1.2. Bland-Altman analysis showed a bias of 274.8, with lower and upper limits of agreement of −714.9 and 1264.5, respectively.
In terms of risk category classes (Table 3), a weighted Kappa score of 0.800 was found for the interclass agreement between the AI tool and the reference standard. Reclassification of risk category occurred in 44 cases (44%), of which there was shifting by one category in 39 cases (89%) and by two categories in 5 cases (11%). In 42 cases (42%), the AI tool underestimated the risk category, whereas in 2 cases (2%), the AI tool overestimated the CAC burden (risk class 1 instead of risk class 0). Representative cases of the AI-CACS tool correctly identifying the coronary calcium burden in a patient are presented in Figure 3. Further examples presenting false negative findings as well as the two patients of the study cohort that were falsely classified to have an Agatston Score > 0 (i.e., false positive findings) are presented in Figures 4 and 5, respectively.
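The reported reclassification counts and Bland-Altman limits can be cross-checked with simple arithmetic. The Python sketch below uses only numbers stated in the text; the variable names are illustrative, and the back-calculated standard deviation assumes the conventional definition of the limits of agreement as bias ± 1.96 × SD, which the text does not state explicitly.

```python
# Reclassification counts as reported in the text
n_patients = 100
reclassified = 44
shift_by_one = 39
shift_by_two = 5
underestimated = 42
overestimated = 2

# Both breakdowns should account for all reclassified patients
assert shift_by_one + shift_by_two == reclassified
assert underestimated + overestimated == reclassified

reclassification_rate = reclassified / n_patients  # share of all patients
share_one_category = shift_by_one / reclassified   # share of reclassified cases
share_two_categories = shift_by_two / reclassified

# Bland-Altman values as reported in the text
bias = 274.8
lower_limit = -714.9
upper_limit = 1264.5

# The bias should lie midway between the two limits of agreement
midpoint = (lower_limit + upper_limit) / 2

# Assumption: limits = bias +/- 1.96 * SD, so SD can be back-calculated
implied_sd = (upper_limit - lower_limit) / 2 / 1.96
```

The midpoint of the two limits coincides with the reported bias, and the one- and two-category shares round to the reported 89% and 11%.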
Figure 5. [...] however, small calcifications of the mitral valve were falsely marked as ramus circumflexus CAC. In another 66-year-old male patient (B), no CAC was found in the dedicated gated CAC scan; however, small areas of increased density due to image noise adjacent to a pacemaker electrode were marked as calcification. Note: Pink areas highlight dense areas in the CT image (i.e., bone or calcification), with the circles indicating coronary artery calcification.

Discussion
In this retrospective study, we aimed to assess the value of a fully automated AI tool to accurately quantify CAC in patients undergoing non-contrast free-breathing ungated CT as part of an oncologic 18F-FDG PET/CT examination.
Our data indicate that the AI tool manages to detect and quantify CAC with acceptable to good accuracy without requiring any user input. However, AI-CACS from ungated CT generally underestimates CAC burden, which should be kept in mind. Nonetheless, this study further provides evidence that CAC scores can be extracted effortlessly from various types of CT scans, thus potentially expanding the diagnostic value and impact of a given examination.
Following a recent consensus statement from the British Societies of Cardiovascular Imaging/Cardiac Computed Tomography and Thoracic Imaging, physicians are urged to report incidental coronary calcifications on all CT scans covering the chest, as CAC is an important marker of CAD in both symptomatic and asymptomatic patients [8]. Specifically, CAC is associated with a poorer prognosis in various patient groups, including cancer patients. Notably, a sub-analysis of the National Lung Screening Trial showed that CAC scores of >100 were associated with a four to sevenfold increase in mortality risk as compared to patients without CAC [20]. For CAC grading, the authors recommend using a semi-quantitative ordinal scoring system instead of the conventional quantitative Agatston scoring system. While the authors acknowledge that the Agatston scoring system represents the gold standard assessment for CAC, they point out that the additional time effort and use of dedicated software may prevent physicians from implementing and performing CACS on non-dedicated CT scans, as in the case of PET/CT imaging [8].
In the current study, we present a viable approach that enables physicians to extract quantitative Agatston scores from ungated CT scans as acquired for attenuation correction of oncologic PET scans. This AI tool runs fully automatically without any further user input and generates a detailed CACS report that can directly be sent to the user or to the institution's PACS. Notably, the tool has previously been validated by Vonder et al., who tested the tool's performance relative to manual CACS measurements in a cohort of 997 patients who had undergone a dedicated cardiac CT protocol, including calcium scans, as part of a cardiovascular screening program. The authors found an interscore agreement of 0.958 and an interclass agreement of 0.96 for risk categories, thus confirming the AI tool's ability to perform CACS accurately on dedicated cardiac calcium CT scans [14].
In contrast, we found an interscore agreement of 0.88 and an interclass agreement of 0.800 for risk categories. In this regard, it should be noted that dedicated calcium scans are performed in breath-hold and with ECG-gating. This is not the case for the CT acquired during PET/CT examinations and may therefore significantly impact CAC quantification accuracy. Specifically, it has been shown that the calcium load can be underestimated on ungated CT scans [21][22][23][24]. Furthermore, it should be noted that the acquisition and reconstruction parameters of the CT scan from PET/CT imaging may differ from those recommended for a dedicated cardiac calcium scan. For example, the latter should be performed at 120 kV and should be reconstructed with weighted filtered back projection [5,7].
Despite these differences and challenges, our AI tool achieved an acceptable to good performance in detecting and quantifying CAC. Specifically, although reclassification of risk categories frequently occurred (44% of cases), risk categories nearly always (i.e., 89%) shifted by only one category. Furthermore, in nearly all cases, the risk category was underestimated by the AI tool, which may partially be due to the inherent limitations of an ungated CT scan for CAC detection. When interpreting CAC scores as obtained from AI-CACS on CT scans from PET/CT, this should be kept in mind nonetheless.
Finally, we would like to emphasize that we did not further optimize the AI tool prior to study onset by performing any specific or further training on our dataset. Thus, the data used in the current study represent a true validation set. In this regard, it should be noted that the performance of the AI tool may be further improved in the future by training the algorithm with further study/institution specific data. Importantly, we suspect that additional training with ungated CT scans may prove valuable in improving the tool's accuracy.
Our study has the following limitations: First, this was a retrospective single-center study with a limited number of subjects. Nonetheless, despite the specific and selective inclusion criteria, we achieved a sample size comparable to that of similar studies [18,25].
In terms of study subject selection, it should be acknowledged that the results inherently depend on the examined patient cohort. Here, we used a unique and heterogeneous patient cohort of oncologic patients with scans ranging back to 2007. The AI tool may provide even better results when using more recent scans (performed on more modern scanners) and scans from a more homogeneous patient cohort. Second, we did not perform manual CACS or semi-quantitative visual grading of CAC on the CT scans from PET/CT imaging. This would have allowed us to better quantify the measurement inaccuracy of the AI tool itself and should be investigated in future studies. Third, as a reference standard, we used manual CACS scores from a dedicated cardiac SPECT-MPI examination performed within 6 months of the oncologic PET/CT. CAC scores are not expected to change within this time frame; nevertheless, it should be acknowledged that minor changes may have occurred, thus potentially introducing a bias.
In conclusion, our study indicates that an AI tool enables fully automatic and effortless calcium scoring on non-contrast free-breathing, ungated CT scans from 18F-FDG-PET/CT examinations, thereby providing an acceptable to good estimation of the CAC burden. CAC load on ungated CT is, however, generally underestimated by AI-CACS, which should be taken into account when interpreting imaging findings. Nonetheless, our findings provide evidence that physicians can effortlessly achieve an acceptable to good estimation of the CAC burden from oncologic 18F-FDG PET/CT examinations, thus potentially enabling an opportunistic screening of CAD and allowing for the further expansion of the diagnostic spectrum and value of the imaging modality.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/diagnostics12081876/s1, Table S1: Manual CACS and AI-CACS on ungated CT from PET/CT and dedicated gated CAC CT from myocardial SPECT/CT.

Informed Consent Statement:
The need for informed consent was waived by the local Ethics Committee due to the retrospective nature of the study.

Data Availability Statement:
Data can be made available upon reasonable request to the corresponding author.

Conflicts of Interest:
The University Hospital of Zurich holds a research agreement with GE Healthcare (unrelated to current study). Other than that, the authors of the study declare no conflict of interest.