2. Materials and Methods
2.1. Study Participants
The study was approved by the local institutional review board. Written consent was waived based on the retrospective study design.
Consecutive patients presenting to our institution between October 2020 and February 2021 who underwent multimodal brain CT for a suspected acute ischemic stroke and met the inclusion criteria were retrospectively identified using our Radiological Information System and Picture Archiving and Communication System. The inclusion criteria were: (1) patient ≥18 years old, and (2) multimodal stroke CT protocol performed within 24 hours of symptom onset or last seen well. Exclusion criteria were: (1) technically inadequate CTA (poor contrast bolus or substantial motion or metal artifact that precluded accurate assessment of the intracranial arteries to the level of the distal M2 segments of the middle cerebral arteries by an experienced neuroradiologist), and (2) localization of LVO in posterior circulation as the tested AI software does not provide an assessment of posterior circulation vessels.
2.2. CT Image Acquisition and Reconstruction Technique
All patients were scanned on a 128-slice multi-detector CT SOMATOM Definition AS+ (Siemens Healthcare GmbH, Erlangen, Germany). Our institution’s routine multimodal stroke CT protocol consisted of unenhanced CT followed by computed tomography perfusion (CTP) and CTA.
Unenhanced CT scans were acquired in the helical mode with the following parameters: 0.6 mm slice collimation, spiral pitch factor of 0.55, tube voltage of 120 kV, and image matrix 512 × 512. Images were reconstructed as 1 mm overlapping sections with an iterative reconstruction factor of 5 and convolution kernel J30s. Axial, coronal, and sagittal multiplanar reconstructions were performed at 3 mm slice thickness.
For CTP, 50 mL of nonionic contrast agent was injected intravenously at a rate of 6 mL/s followed by 40 mL saline flush administered at the same rate. The scanning parameters were 80 kVp and 200 mA. Scans were performed every 3 s during the first 10 s, every 1.5 s during the following 25 s, and again every 3 s during the remaining 25 s. They were started with a delay of 2 s after contrast material injection, providing a total of 28 volume datasets. The total coverage in the z-axis was 96 mm, with a slice width of 10 mm obtained in 5 mm increments using a shuttle mode (adaptive 4-D spiral).
Perfusion parameters were calculated using the commercial perfusion software package syngo.via CT Neuro Perfusion (Siemens Healthcare GmbH, Erlangen, Germany) based on a deconvolution algorithm with the least mean square fitting.
For the subsequent CTA, 80 mL of the same nonionic contrast agent was injected intravenously at a rate of 5 mL/s, followed by a 40 mL saline flush administered at the same rate. Contrast bolus triggering was performed in the aortic arch. Parameters for the helical acquisition were as follows: craniocaudal coverage from the aortic arch to vertex, 100 kV tube voltage with dose modulation, slice collimation width 0.6 mm, image matrix 512 × 512, and spiral pitch factor 0.5. The following reconstruction parameters were used: iterative reconstruction factor of 5 and convolution kernel H10f. Axial images were reconstructed at 0.6 mm overlapping sections. Axial, coronal, and sagittal MIP images were reconstructed at 3 mm thickness.
2.3. LVO Definition
For this study, intracranial LVOs were classified into two types: (1) proximal anterior circulation occlusions, involving the supraclinoid internal carotid artery (ICA) segment and the M1 segment of the middle cerebral artery (MCA-M1), and (2) distal anterior circulation occlusions, involving the MCA-M2. The MCA-M1 segment was defined as extending from its origin to its genu along the inferior aspect of the Sylvian fissure. The MCA beyond the origin of the anterior temporal artery or beyond early bifurcations was considered an M1 continuation, provided the postbifurcation course preceded the genu. The M2 segments were defined as those immediately distal to the MCA bifurcation/trifurcation that ascend vertically within the Sylvian fissure.
Proximal occlusions were further classified as limited to the supraclinoid ICA, involving the ICA and extending to the proximal part of the MCA-M1, or limited to the MCA-M1.
2.4. Image Analysis
The reference standard was set by a board-certified interventional neuroradiologist with 15 years of experience blinded to all clinical and imaging data, including data on interventional therapy and follow-up. CTAs were assessed in axial, coronal, and sagittal MIP images reconstructed at 3 mm thickness.
The CTAs were also assessed by three final-year medical students from the Radiological Scientific Circle of our University after 2 h of training in CTA assessment. They were blinded to any supporting information. The three students assessed the CTAs together, and their findings were recorded by consensus. They analyzed the images in the same manner as the interventional neuroradiologist.
The technical adequacy of the CTA was first assessed.
The following features were then recorded: the presence, side, and site of an intracranial LVO.
The time gap from CTA to the initial digital subtraction angiography (DSA) series was recorded as well.
All CTP exams were assessed regularly by a radiologist on duty for clinical use, but their results were not recorded for the purpose of this study.
2.5. LVO Detection Using Automated Software
An automated tool—e-CTA (a part of e-STROKE, version 10.1p3, Brainomix Ltd., Oxford, UK)—was used to analyze each patient’s CTA raw data for the presence, side, and site of an intracranial LVO.
The e-Stroke Suite image processing algorithms follow an AI approach, with a combination of traditional 3D graphics and statistical methods and machine learning classification techniques. The input DICOM data is first resampled to correct any gantry tilt and standardize the input resolution. Then, a fast proprietary registration approach is applied to re-align the data, removing any tilt and rotation. This ensures that the image is presented in a standard reference frame.
The e-CTA uses a combination of machine learning and deep learning algorithms to identify LVOs. The e-CTA received CE mark certification in 2018.
In all cases, processing time, defined as the time from data transmission to the reception of results, was recorded.
2.6. Statistical Analysis
The analysis of sensitivity, specificity, positive predictive value (PPV), negative predictive value (NPV), and accuracy to detect LVOs on the correct side was performed for any occlusion (overall: proximal or distal). For proximal occlusions (excluding distal occlusions) and distal occlusions (excluding proximal occlusions), the analysis of sensitivity was performed, and 95% confidence intervals were calculated with the exact method. Interobserver agreement between the neuroradiologist, medical students and e-CTA was assessed with Cohen’s kappa coefficient. Statistical calculations were performed by a medical statistician (K.S.) using Statistica 13 data analysis software (TIBCO Software Inc., Palo Alto, CA, USA) with Plus Bundle 5.0.96.
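The interval calculation can be illustrated with a minimal sketch, assuming that the "exact method" refers to the Clopper-Pearson binomial interval (the text does not name it explicitly). The function names `binom_cdf` and `exact_ci` and the example counts are hypothetical and are not taken from the statistical software used in the study.

```python
from math import comb


def binom_cdf(k, n, p):
    """Return P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_ci(k, n, alpha=0.05):
    """Clopper-Pearson interval for k successes in n trials (assumed 'exact method')."""

    def bisect(p_is_too_small):
        # Plain bisection on [0, 1]; 60 halvings are ample for double precision.
        lo, hi = 0.0, 1.0
        for _ in range(60):
            mid = (lo + hi) / 2
            if p_is_too_small(mid):
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= k | p) equals alpha/2 (this tail grows with p).
    lower = 0.0 if k == 0 else bisect(lambda p: 1 - binom_cdf(k - 1, n, p) < alpha / 2)
    # Upper bound: the p at which P(X <= k | p) equals alpha/2 (this tail shrinks as p grows).
    upper = 1.0 if k == n else bisect(lambda p: binom_cdf(k, n, p) > alpha / 2)
    return lower, upper


# Hypothetical example: 8 correct detections among 10 true occlusions.
low, high = exact_ci(8, 10)
print(f"sensitivity = {8 / 10:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

The bisection search avoids any dependency beyond the standard library; a statistical package would normally obtain the same bounds from beta-distribution quantiles.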
3. Results
The inclusion criteria were met by 113 patients. Five patients with posterior circulation occlusion were excluded. No patients were excluded due to technically inadequate CTA. Thus, 108 participants were included in the study. Their mean age was 70 years (±12.6 years), and 55 (50.9%) were female.
The neuroradiologist found LVOs in 70 (64.8%) cases, of which 45 (41.7%) were proximal and 25 (23.1%) were distal occlusions. In all 70 cases with LVOs detected by the neuroradiologist, MT was subsequently performed. The initial DSA series preceding MT confirmed the presence of every LVO detected by the neuroradiologist on CTA, with no false positives. This gave a specificity of 100% for the interventional neuroradiologist.
The mean time gap from CTA to the initial DSA series was 24 min (±8 min). In patients treated with intravenous thrombolysis, the initial DSA series showed migration of thrombus from the proximal to the distal part of M1 in two cases and from the proximal to the distal part of M2 in one case.
Medical students detected 63 (58.3%) LVOs, while e-CTA revealed 49 (45.4%) LVOs. The findings are presented in detail in Table 1.
Out of the 70 LVOs, 23 (32.9%) were missed by e-CTA, comprising 7 of the 45 (15.6%) proximal and 16 of the 25 (64.0%) distal occlusions. Examples of proximal and distal LVOs detected by e-CTA are illustrated in Figure 1.
This gave an overall sensitivity for e-CTA of 0.67 (95% CI 0.55–0.78), with 0.84 (95% CI 0.71–0.94) for proximal LVOs and 0.36 (95% CI 0.18–0.57) for distal LVOs. Overall specificity, PPV, NPV, and accuracy for e-CTA were 0.95 (95% CI 0.82–0.99), 0.96 (95% CI 0.86–0.99), 0.61 (95% CI 0.47–0.73), and 0.77 (95% CI 0.68–0.84), respectively. The mean processing time was 96 s (±23 s).
In total, 17 out of 70 (24.3%) LVOs were missed by the medical students, comprising 3 of the 45 (6.7%) proximal and 14 of the 25 (56.0%) distal occlusions. The medical students achieved an overall sensitivity of 0.76 (95% CI 0.64–0.85), with 0.93 (95% CI 0.82–0.99) for proximal LVOs and 0.44 (95% CI 0.24–0.65) for distal LVOs. The students reached an overall specificity, PPV, NPV, and accuracy of 0.74 (95% CI 0.57–0.87), 0.84 (95% CI 0.73–0.92), 0.62 (95% CI 0.47–0.76), and 0.75 (95% CI 0.66–0.83), respectively.
Table 2 summarizes the performance of e-CTA and medical students.
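As a consistency check, the overall point estimates can be re-derived from 2 × 2 counts inferred from the totals reported in this section. The sketch below assumes that every reported detection that is not a true positive is a false positive, and that detections count as true positives only when they agree with the reference standard; the counts themselves are reconstructed and were not stated in this form. The names `Counts`, `counts_from_totals`, and `metrics` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    tp: int  # LVO present in the reference standard and reported by the reader
    fn: int  # LVO present but missed
    fp: int  # LVO reported but absent in the reference standard
    tn: int  # no LVO, none reported


def counts_from_totals(n_patients, n_lvo, n_missed, n_reported):
    """Rebuild a 2x2 table from reported totals (assumption, see lead-in)."""
    tp = n_lvo - n_missed
    fp = n_reported - tp
    tn = n_patients - n_lvo - fp
    return Counts(tp=tp, fn=n_missed, fp=fp, tn=tn)


def metrics(c):
    """Standard diagnostic accuracy measures from a 2x2 table."""
    total = c.tp + c.fn + c.fp + c.tn
    return {
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
        "accuracy": (c.tp + c.tn) / total,
    }


# Totals from the text: 108 patients, 70 LVOs; e-CTA missed 23 and reported 49;
# the students missed 17 and reported 63.
E_CTA = counts_from_totals(108, 70, 23, 49)
STUDENTS = counts_from_totals(108, 70, 17, 63)

for name, counts in (("e-CTA", E_CTA), ("students", STUDENTS)):
    print(name, {k: round(v, 2) for k, v in metrics(counts).items()})
```

Under these assumptions, the rounded values agree with the overall sensitivity, specificity, PPV, NPV, and accuracy reported above for both e-CTA and the medical students.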
Interobserver agreement between the neuroradiologist and medical students was 0.47 (Cohen’s kappa) overall, 0.68 for proximal LVOs and 0.18 for distal LVOs. Levels of interobserver agreement between neuroradiologist and e-CTA were 0.55, 0.78, and 0.34, respectively. Finally, respective levels of interobserver agreement between medical students and e-CTA were 0.45, 0.57, and 0.03.
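For illustration, Cohen's kappa for two raters and a binary finding (LVO present or absent) can be sketched as follows. The 2 × 2 counts are not given in the text; they are inferred here from the reported totals (108 patients, 70 reference LVOs, 23 missed and 49 reported by e-CTA, 17 missed and 63 reported by the students) under the assumption that every reported detection outside the reference positives is a disagreement. The function name `cohen_kappa` is hypothetical.

```python
def cohen_kappa(both_pos, only_a, only_b, both_neg):
    """Cohen's kappa for two raters on a binary outcome.

    both_pos / both_neg: cases where raters A and B agree (positive / negative).
    only_a / only_b: cases rated positive by only one of the two raters.
    """
    n = both_pos + only_a + only_b + both_neg
    observed = (both_pos + both_neg) / n
    a_pos = (both_pos + only_a) / n
    b_pos = (both_pos + only_b) / n
    # Agreement expected by chance from the raters' marginal positive rates.
    expected = a_pos * b_pos + (1 - a_pos) * (1 - b_pos)
    return (observed - expected) / (1 - expected)


# Inferred counts (assumption): rater A is the neuroradiologist in both cases.
kappa_ecta = cohen_kappa(both_pos=47, only_a=23, only_b=2, both_neg=36)
kappa_students = cohen_kappa(both_pos=53, only_a=17, only_b=10, both_neg=28)
print(round(kappa_ecta, 2), round(kappa_students, 2))
```

With these inferred counts, the sketch gives values that round to the overall kappa reported above for the neuroradiologist versus e-CTA and versus the medical students. The proximal and distal values cannot be checked this way because the corresponding tables are not given in the text.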
4. Discussion
The introduction of MT revolutionized the treatment of ischemic stroke, significantly improving outcomes. This therapy became the standard of care in patients with occlusion of major arteries supplying the brain, termed large vessel occlusion (LVO). As the baseline inclusion criterion for MT is the presence of an LVO, its accurate and rapid detection is necessary.
Routinely, CTA is used for this purpose as a part of diagnostic stroke protocols. However, quick and accurate assessment of CTA for LVO detection requires expertise and is subject to a learning curve. The most competent assessment is provided by experienced neuroradiologists. However, as they are usually available only in comprehensive stroke centers performing MT, the majority of stroke patients are initially transported to the nearest stroke unit, where no neuroradiologist is present. Therefore, deciding whether a patient is eligible for MT is frequently challenging in primary stroke units.
In the last few years, several vendors developed AI-based software packages assisting radiologists in the selection of stroke patients for MT, particularly in LVO detection: e-STROKE (Brainomix Ltd., Oxford, UK), RAPID CTA (iSchemaView, Menlo Park, CA, USA), and Viz LVO (Viz.ai, San Francisco, CA, USA).
We tested the performance of e-CTA, a part of the e-STROKE package, for LVO detection using an experienced neuroradiologist’s reading as the reference. Our results show a low overall sensitivity of 0.67 and accuracy of 0.77 for e-CTA in LVO detection, with even lower values for distal occlusions (0.36 and 0.71, respectively). The sensitivity of 0.84 for proximal LVOs could be considered satisfactory. However, the sensitivity of 0.36 for distal LVOs should be regarded as insufficient. The e-CTA missed 32.9% of LVOs, and the majority of the missed occlusions were located distally. This means that about one third of stroke patients with LVO would be falsely disqualified from MT by e-CTA, which is unacceptable.
Possibly, the performance of e-CTA could be improved by more comprehensive and more precise training on large datasets containing a wide range of real-world CT scans from stroke patients (focused on cases with distal LVOs) and negative controls, with ground-truth data from additional imaging such as MRI and CT, along with other modalities and clinical information. These datasets should contain scans acquired with scanners from all major manufacturers in a wide range of countries worldwide. Notably, in our study, the final-year medical students, after 2 h of training, reached similar sensitivity and accuracy but much lower specificity compared with the AI-based software. This pattern may be caused by e-CTA settings that favor high specificity at the expense of sensitivity.
It is worth noting the overdetection of MCA-M1 occlusions by the medical students, with a total of 36 vs. 24 identified by the neuroradiologist. However, the students found considerably fewer ICA (8 vs. 14) and MCA-M2 (10 vs. 25) occlusions than the neuroradiologist. This could be explained by the students mistakenly classifying some ICA occlusions as proximal MCA-M1 and some proximal MCA-M2 occlusions as distal MCA-M1. These mistakes were presumably caused by their limited experience in assessing the complicated anatomy of the cerebral arteries on CTA images.
We found only a single study testing the same software published by Seker et al., who found a higher overall sensitivity of 0.84 in a group of 144 stroke patients [7]. The performance of e-CTA in both studies is similar for proximal LVO. In this group, the sensitivity reported by Seker et al. was 0.93 vs. 0.84 in our study, and their accuracy was 0.97 vs. 0.89 in our study [7]. This discrepancy could be explained by a different definition of proximal and distal LVOs used in both papers. Seker et al. limited proximal LVO to the initial part of MCA-M1 up to its anterior temporal branch, while the continuation of MCA-M1 beyond the anterior temporal branch was classified as distal LVO [7]. We used a different approach based on anatomy, as described in the methods section.
The added value of our study is a more comprehensive analysis that includes distal LVOs, which were omitted in the paper published by Seker et al. [7]. It should be emphasized that, according to guidelines, MT is recommended in distal LVOs as well.
The only study showing results similar to ours was presented as a conference abstract by Dornbos et al., who reported a sensitivity of 0.66 overall and 0.39 for distal LVO using Viz LVO [10].
However, other studies evaluating Viz LVO revealed much higher sensitivities of 0.81–0.90 overall, 0.92 for proximal, and 0.54 for distal occlusions [8,11,12,13].
Similarly, two papers evaluating RAPID CTA showed much higher sensitivity values compared to ours. Amukotuwa et al. report the sensitivity of 0.92 overall, 0.94 for proximal, and 0.86 for distal LVO detection [4]. Dehkharghani et al. show an overall sensitivity of 0.96 [5].
According to the only systematic review published in 2020 by Murray et al., LVO detection studies variably report AI software performance with broad sensitivities of 0.67–0.98 [9]. However, this systematic review is based only on several conference abstracts, as all of the above-mentioned studies were published later.
Although all comprehensive platforms by iSchemaView, Viz.ai, and Brainomix are based on a convolutional neural network (CNN) algorithm to detect LVOs, each vendor uses different modifications of this method. These modifications include different software settings that prefer high sensitivity at the expense of specificity or vice versa. Additionally, in RAPID CTA, a CTA vessel density detection feature is used to identify relative distal MCA vessel asymmetries suggestive of an LVO. These differences may impact the results of the evaluation studies causing the above-presented discrepancies.
Relatively poor performance of the tested AI-based software in our study, especially for distal LVOs, could be partially explained by the inconsistent definition of proximal and distal LVOs in the literature. Some authors, like Seker et al. and Amukotuwa et al. [4,7], define the vessel segment beyond the anterior temporal branch as a so-called M2 trunk which is classified as distal, while others, like the authors in the present study, regard it as the distal portion of the M1 segment which is classified as proximal. This difference could reduce the sensitivity of e-CTA for distal LVO detection in our study.
Lower sensitivity for distal than for proximal LVO detection could also be caused by the more complex anatomy of the MCA-M2 branches, which makes automatic segmentation more challenging, by the smaller size of these vessels with poorer contrast filling, and by their greater individual anatomical variability compared with the ICA and MCA-M1.
The other source of discrepant results could be that AI systems were evaluated against different reference standards of different qualities. In the study by Seker et al., it was a neuroradiologist with more than ten years of experience; in the study by Amukotuwa et al., two diagnostic neuroradiologists with eight and nine years of post-fellowship experience; in the study by Dehkharghani et al., three board-certified neuroradiologists with 11, 7, and 7 years of experience [4,5,7]. In our study, the reference standard was set by a board-certified interventional neuroradiologist with 15 years of experience in assessing CTA. Therefore, a unified definition of “ground truth” against which algorithms will be evaluated is warranted.
Despite limited and inconclusive evidence, there is no doubt that AI has the potential to improve fast and accurate stroke diagnosis and LVO triage. In our study, the mean processing time of e-CTA was about 1.5 min, which is significantly shorter than human reading.
AI-based software packages are not limited to LVO detection but also automatically calculate the ASPECTS and the perfusion results required for decision-making in an extended therapeutic window of >6 h for MT. The application of AI software in the interpretation of large stroke imaging datasets may reduce false-negative human errors in image interpretation, increase the efficiency of stroke triage, and ultimately improve long-term outcomes.
Considering the ongoing improvement of AI-based algorithms, we share the opinion expressed by Murray et al. that the value of the AI software as a tool for clinicians in the management of stroke patients will presumably increase in the future [9].
Study Limitations
The present study is limited by the testing of only one AI software package in a relatively small population. Another limitation is the use of a single reader’s assessment as the reference standard.
There remains a paucity of clinical trials evaluating AI software. Systematic and standardized methods for validating and comparing these tools with established “ground truth” are also warranted.