Prediction of Tissue Damage Using a User-Independent Machine Learning Algorithm vs. Tmax Threshold Maps

Abstract: (1) Background: To test the accuracy of a fully automated stroke tissue estimation algorithm (FASTER) in predicting final lesion volumes in an independent dataset of patients with acute stroke; (2) Methods: Tissue-at-risk prediction was performed in 31 stroke patients presenting with a proximal middle cerebral artery occlusion. FDA-cleared perfusion software using the AHA recommendation for the Tmax threshold delay was tested against a prediction algorithm trained on an independent perfusion software using artificial intelligence (FASTER). Following our endovascular strategy to consistently achieve a TICI 3 outcome, we compared patients with complete reperfusion (TICI 3) vs. no reperfusion (TICI 0) after mechanical thrombectomy. Final infarct volume was determined on a routine follow-up MRI or CT at 90 days after the stroke; (3) Results: Compared to the reference standard (infarct volume after 90 days), the decision forest algorithm overestimated the final infarct volume in patients without reperfusion. Underestimation was observed if patients were completely reperfused. In cases where the FDA-cleared segmentation was not interpretable due to improper definitions of the arterial input function, the decision forest provided reliable results; (4) Conclusions: The prediction accuracy of automated tissue estimation depends on (i) success of reperfusion, (ii) infarct size, and (iii) software-related factors introduced by the training sample. A principal advantage of machine learning algorithms is their improved robustness to artifacts in comparison to solely threshold-based model-dependent software. Validation on independent datasets remains a crucial condition for clinical implementations of decision support systems in stroke imaging.


Introduction
Defining the infarct core and penumbra is relevant for determining the management plan of patients with acute ischemic stroke. Multiple factors influence the evolution and final extent of ischemic lesions, such as the location of the thrombus [1,2], preexisting vascular pathology [3], and the state of collateral circulation [4,5], as well as other patient-related factors [6,7], which necessitate moving from a universal time window towards an individualized approach in management. This is reflected in the current guidelines for stroke management by the American Stroke Association, as selected patients may be eligible for invasive stroke therapy in an extended time window (6-24 h from last seen normal). The guidelines also incorporate collateral flow status in clinical decision-making [8].
Perfusion imaging identifies hypoperfused brain tissue that is potentially salvageable if revascularization can be achieved within 24 h [9]. An ADC threshold of ≤620 × 10⁻⁶ mm²/s is currently recommended as a marker for the infarct core [10][11][12], and a Tmax of >6 s is used as a predictor of severely hypoperfused tissue [13,14]. However, defining the infarct core based on DWI remains a matter of controversy [15], as multiple studies have shown the presence of reversible diffusion restriction [12,[16][17][18][19][20]. Because the deconvolution method depends on the arterial input function (AIF), Tmax is susceptible to even minor changes in the shape of the AIF, which is one of the limitations of this perfusion parameter [21].
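The two fixed thresholds named above can be illustrated with a minimal sketch. It assumes each voxel is described by a hypothetical (ADC, Tmax) pair, and it labels hypoperfused tissue outside the core as "penumbra", which is a simplifying assumption for illustration and not the processing used by Olea or FASTER. The function names and voxel values are invented.

```python
# Thresholds taken from the text; everything else is illustrative.
ADC_CORE_MAX = 620e-6   # mm^2/s: ADC <= 620 x 10^-6 mm^2/s marks the infarct core
TMAX_HYPO_MIN = 6.0     # s: Tmax > 6 s marks severely hypoperfused tissue


def classify_voxel(adc_mm2_s, tmax_s):
    """Label one voxel as 'core', 'penumbra', or 'normal' (assumed scheme)."""
    if adc_mm2_s <= ADC_CORE_MAX:
        return "core"
    if tmax_s > TMAX_HYPO_MIN:
        return "penumbra"
    return "normal"


def summarize_volumes(voxels, voxel_volume_mm3):
    """Sum the volume (mm^3) per label for an iterable of (ADC, Tmax) pairs."""
    counts = {"core": 0, "penumbra": 0, "normal": 0}
    for adc, tmax in voxels:
        counts[classify_voxel(adc, tmax)] += 1
    return {label: n * voxel_volume_mm3 for label, n in counts.items()}


if __name__ == "__main__":
    # Four made-up voxels, each assumed to occupy 10 mm^3.
    demo = [(500e-6, 8.0), (800e-6, 7.0), (800e-6, 3.0), (ADC_CORE_MAX, 2.0)]
    print(summarize_volumes(demo, 10.0))
```

A single-parameter rule like this is easy to apply, but, as noted above, its Tmax input inherits any error in the AIF selection.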
Quantitative analysis using a fixed threshold of a single parameter is widely used clinically. However, due to the complexity of stroke pathophysiology, with multiple factors affecting the outcome, and the increasing availability of advanced stroke therapy, a multiparametric approach could be more useful. Currently, a reliable method able to precisely determine the core and penumbra is lacking. In recent years, many attempts have been made to find a mathematical solution that could predict the fate of ischemic tissue and achieve better results using the perfusion data acquired in the acute stage in patients with vascular occlusion. Deep learning algorithms have been introduced to predict the salvageable tissue in patients who receive mechanical thrombectomy and i.v. thrombolysis [22]. Supervised decision forests (a subform of random forest classifiers) [23] have been suggested to predict tissue fate based on multiparametric MRI sequences including perfusion data [24]. Fully automated stroke tissue estimation using random forest classifiers (FASTER) [24,25] is a fully automated algorithm that predicts tissue fate from images in the acute stage, based on supervised learning. When applied to 19 test cases with thrombolysis in cerebral infarction (TICI) grading of 1-2a, the predicted tissue-at-risk volume was positively correlated with the final lesion volume. Prediction of tissue damage in cases with persistent occlusion or full recanalization has not yet been analyzed. From a clinical point of view, this dichotomization is of relevance, since growing evidence suggests that complete reperfusion (TICI 3) rather than incomplete revascularization (TICI 0) may provide remarkable benefit in outcomes [26,27], while from the view of medical image analysis, major challenges arise in predicting the outcome from multiple factors that influence tissue survival and guide eligibility for reperfusion regimens.
In this study, we report the precision of FASTER against a threshold-based perfusion analysis in patients without recanalization vs. patients with immediate full recanalization.

Ethical Statement
This is a retrospective study of patients from the Bernese Stroke Registry, a prospectively collected database approved by the review board of the University Hospital of Bern and the local ethical committee (Kantonale Ethikkommission Bern, Switzerland).

Patients and Inclusion Criteria
Patients who underwent treatment for acute ischemic stroke were retrospectively identified from the stroke unit registry at the Inselspital, University of Bern. Inclusion criteria were (i) patients over 18 years with acute ischemic stroke who underwent a brain MRI exam in the acute phase, (ii) ischemic lesion(s) on DWI and PWI, (iii) proximal occlusion of the middle cerebral artery (M1 or M2 segments), (iv) endovascular therapy with a resulting TICI score of 0 or 3, and (v) follow-up imaging after 3 months. Patients were included if imaging data were complete, including the raw perfusion data saved in the picture archiving and communication system (PACS), and were not used to train the FASTER algorithm.

Imaging Protocol
The institutional clinical acute MRI stroke protocol was performed either on a 1.5T system (Siemens Magnetom Avanto and Siemens Magnetom Aera, Siemens Medical Solutions) or a 3T system (Siemens Magnetom Verio, Siemens Healthcare, Erlangen, Germany). The protocol includes the following sequences: a whole-brain DWI (slice thickness 5 mm), an axial FLAIR sequence (slice thickness 5 mm), TOF-MRA (slice thickness 0.5 mm), and an SWI (slice thickness 1.6 mm). After the application of i.v. Gadobutrol (Gadovist; Bayer Healthcare, Berlin, Germany) in an antecubital vein at an injection rate of 5 mL/s, a standard dynamic susceptibility contrast (DSC) MRI perfusion sequence (slice thickness 5 mm) was acquired, as well as a contrast-enhanced T1-weighted sequence (slice thickness 5 mm). Finally, a contrast-enhanced MR angiography of the head and neck vessels was acquired after injection of a second bolus of Gadobutrol at an injection rate of 3 mL/s.
A similar MRI protocol was performed after 3 months. In 6 patients, a CT exam was performed at follow-up instead of the MRI, using a 128-slice multidetector CT with a slice thickness of 3 mm (Siemens Definition Edge). For this work, the prediction was based on the diffusion-weighted images, the apparent diffusion coefficient images, and the dynamic susceptibility perfusion imaging, whereas the lesion volume was determined on the follow-up FLAIR or non-enhanced CT images.

Data Processing
Perfusion postprocessing was performed using Olea Sphere v2.3, an FDA-cleared automated multivendor processing software for medical imaging developed by Olea Medical. The following postprocessing steps were performed automatically: (1) motion correction, (2) segmentation of non-cerebral tissue, (3) automatic identification of the arterial input function (AIF) and venous reference, and (4) generation of the different perfusion parameters. For each case, oscillation index singular value decomposition (oSVD) was performed to generate TTP, MTT, rBV, rBF, Tmax, corrected rBV, and K2 maps.

Manual Segmentation of Hypoperfusion
After processing perfusion parameters, manual segmentation of the lesions with delayed perfusion was performed in the analysis module of Olea, which allows drawing constraints of the volume of interest. The segmentation was performed on a slice-by-slice basis in regions with a Tmax delay of >6 s to calculate the penumbra volume. These constraints were defined as ground truth for penumbra. This step was performed by a medical master's student under the supervision of a board-certified neuroradiologist with more than 10 years of experience.

Manual Segmentation of Final Infarct Volume
The final infarct volume was defined based on follow-up MRI (FLAIR) or follow-up CT. 3D Slicer v4.5.1-1, an open-source software for 3D visualization and image postprocessing, was used for this task. Final infarct volumes were delineated manually by a board-certified neuroradiologist on a slice-by-slice basis in the editor module using the draw and paint tools, and the calculated volume was defined as the ground truth for the infarct core (Figure 1). The delineated volumes were reviewed by a second observer experienced in stroke imaging.
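As a rough illustration of how a slice-by-slice delineation turns into a volume, the sketch below counts the labeled voxels of a binary mask and multiplies by the voxel size. The nested-list mask layout, the function name, and the 1 mm in-plane spacing are assumptions for the example; only the 5 mm slice thickness comes from the protocol described above, and this is not the computation performed internally by 3D Slicer.

```python
def mask_volume_mm3(mask, spacing_mm):
    """Volume of a binary mask.

    mask: nested list indexed [slice][row][col] with values 0 or 1.
    spacing_mm: (slice_thickness, row_spacing, col_spacing) in mm.
    """
    dz, dy, dx = spacing_mm
    n_voxels = sum(v for sl in mask for row in sl for v in row)
    return n_voxels * dz * dy * dx


if __name__ == "__main__":
    # Two tiny 2 x 2 slices with three labeled voxels in total.
    toy_mask = [
        [[1, 0], [0, 1]],
        [[0, 0], [1, 0]],
    ]
    volume = mask_volume_mm3(toy_mask, (5.0, 1.0, 1.0))
    print(volume, "mm^3 =", volume / 1000.0, "mL")
```

Dividing by 1000 converts mm³ to mL, which is convenient when comparing the mm³ values reported in the Results with the mL value quoted in the Discussion.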

Postprocessing with FASTER
The perfusion maps, calculated using oSVD deconvolution and generated by Olea in the acute phase, were analyzed using FASTER, which provided an estimation of the final infarct volume in the case of either complete (TICI 3) or no reperfusion (TICI 0), as well as estimation of the penumbra in the acute situation.

Data Comparison
The results generated by FASTER were compared to the manually segmented areas as follows:
1. Infarct core volume: The infarct volume predicted at baseline by FASTER was compared to the final infarct volume on follow-up imaging (MRI or CT) calculated with 3D Slicer.
2. Penumbral volume: The penumbral volume estimated by FASTER was compared to the volume manually delineated in Olea using the Tmax threshold (>6 s).
We deliberately omitted metrics that are relevant for image analysis, such as the Dice similarity coefficient and the Hausdorff distance, and focused on the clinically relevant metrics only, i.e., the volume of tissue damage and the mismatch between core and penumbra, since they constitute outcomes of clinical trials.
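The text does not spell out how the mismatch between core and penumbra is quantified. A minimal sketch, assuming the commonly used definitions (mismatch volume as hypoperfused volume minus core volume, and mismatch ratio as their quotient), could look as follows; the function name and example volumes are hypothetical.

```python
def mismatch(core_mm3, hypoperfused_mm3):
    """Return (mismatch_volume_mm3, mismatch_ratio) under the assumed definitions.

    The mismatch volume is clipped at zero, and the ratio is reported as
    infinity when no core is present.
    """
    volume = max(hypoperfused_mm3 - core_mm3, 0.0)
    ratio = hypoperfused_mm3 / core_mm3 if core_mm3 > 0 else float("inf")
    return volume, ratio


if __name__ == "__main__":
    # Made-up example: 20 mL core inside a 60 mL hypoperfused region.
    print(mismatch(20000.0, 60000.0))
```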

Statistics
For statistical analysis, R statistical software v3.32 was used to obtain the following results: (1) the statistical distribution of volumes, split according to TICI scores; (2) the volume difference between manually delineated volumes and volumes automatically estimated by FASTER, using a Bland-Altman plot; and (3) the linear correlation between the two datasets, using the Pearson correlation coefficient.
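The study itself used R; purely as an illustration of the two agreement measures named above, the following Python sketch computes the Bland-Altman bias with 95% limits of agreement (bias ± 1.96 SD of the differences) and the Pearson correlation coefficient. The function names and sample volumes are invented, and the sign convention (reference minus prediction) is chosen to match the Results, where negative differences denote larger volumes from FASTER.

```python
import math


def bland_altman(reference, predicted):
    """Bias and 95% limits of agreement for paired measurements."""
    diffs = [r - p for r, p in zip(reference, predicted)]
    means = [(r + p) / 2 for r, p in zip(reference, predicted)]
    n = len(diffs)
    bias = sum(diffs) / n
    # Sample standard deviation of the differences (n - 1 in the denominator).
    sd = math.sqrt(sum((d - bias) ** 2 for d in diffs) / (n - 1))
    return {
        "means": means,
        "diffs": diffs,
        "bias": bias,
        "lower": bias - 1.96 * sd,
        "upper": bias + 1.96 * sd,
    }


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    manual = [10.0, 20.0, 30.0, 40.0]      # made-up reference volumes
    automated = [12.0, 18.0, 33.0, 37.0]   # made-up predicted volumes
    result = bland_altman(manual, automated)
    print(result["bias"], result["lower"], result["upper"])
    print(pearson_r(manual, automated))
```

In this framing the bias reflects accuracy and the width of the limits of agreement reflects precision, so a small bias can coexist with wide limits.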

Results
Thirty-one patients (13 female) with an acute ischemic stroke who underwent mechanical thrombectomy and had available baseline multimodal MRI and follow-up MRI or CT at 3 months were anonymized and included in the study. The age range was 32-86 years (median 70, IQR 60-75).

Infarct Core
The final infarct volume (ground truth) calculated on the follow-up imaging ranged from 0 to 270,659 mm³ (median 17,394 mm³, IQR 5950-59,600 mm³). The volume difference between FASTER and the manual segmentation varied between −30,749 mm³ (larger volumes detected by FASTER) and +81,764 mm³ (smaller volumes detected by FASTER), with a mean difference of +5398 mm³. Except for two outliers (patient 18 with a difference of +124,819 mm³ and patient 2 with a difference of −227,682 mm³) (Figure 2), the differences tended to increase with increasing mean infarct volume (true volume). The Pearson correlation coefficient was R = 0.8045 with a p-value < 0.001, so the linear correlation is statistically significant (Figure 3).
Subgroup analysis shows that in cases with TICI 0, FASTER, as well as Tmax, estimates larger final infarction volumes compared to the ground truth; however, the mean overestimation of the final infarction volume by Tmax was greater than that by FASTER (average difference for Tmax = 39,227 mm³ and for FASTER = 31,284 mm³). The volume difference tends to be higher in cases with high NIHSS for both FASTER and Tmax (assuming that the penumbra equals the infarction core in cases of no reperfusion). The case with the highest volume difference in both Tmax and FASTER was a patient with an NIHSS of 19. However, in relation to the final infarction volume, the volume difference (between ground truth and estimation) appears to be higher in cases with lower NIHSS for both the FASTER and Olea estimation (Figure 4). No relation was found between volume difference and time to recanalization. In cases with TICI 3, FASTER provides smaller infarct volumes (Figure 5).

Figure 5. Boxplot diagram of the comparison between the final infarction volumes predicted by FASTER and the manually segmented final infarction volumes on follow-up imaging, divided according to TICI score: TICI 0 (green) and TICI 3 (blue).
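The subgroup comparison by TICI score can be sketched as below, assuming each case is a hypothetical record holding the TICI score, the final volume, and the predicted volume. The difference is taken as final minus predicted, so a negative mean indicates overestimation by the prediction. The relative difference is included because the text also relates the difference to the final infarction volume; all values here are invented and do not reproduce the study data.

```python
from statistics import mean


def mean_difference_by_tici(cases):
    """Mean signed difference (final - predicted, mm^3) per TICI group."""
    by_group = {}
    for case in cases:
        diff = case["final_mm3"] - case["predicted_mm3"]
        by_group.setdefault(case["tici"], []).append(diff)
    return {tici: mean(diffs) for tici, diffs in by_group.items()}


def relative_difference(final_mm3, predicted_mm3):
    """Absolute difference as a fraction of the final volume.

    Returns None when the final volume is zero, since the ratio is undefined.
    """
    if final_mm3 == 0:
        return None
    return abs(final_mm3 - predicted_mm3) / final_mm3


if __name__ == "__main__":
    toy_cases = [
        {"tici": 0, "final_mm3": 100.0, "predicted_mm3": 150.0},
        {"tici": 0, "final_mm3": 200.0, "predicted_mm3": 230.0},
        {"tici": 3, "final_mm3": 80.0, "predicted_mm3": 60.0},
        {"tici": 3, "final_mm3": 50.0, "predicted_mm3": 40.0},
    ]
    print(mean_difference_by_tici(toy_cases))
    print(relative_difference(100.0, 150.0))
```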

Penumbra
The calculated tissue-at-risk volume based on Tmax > 6 s (critical hypoperfusion) ranged from 480 to 312,800 mm³ (median 149,040 mm³, IQR 95,340-206,970 mm³). Two patients (patients 6 and 9) were excluded due to artifacts in the Tmax perfusion map, which made it impossible to delineate the lesion. In these two patients, FASTER calculated the volume of at-risk tissue to be 48,327 mm³ and 4695 mm³, respectively. The difference between the volume estimated by FASTER and the volume from the threshold method varied between −49,820 mm³ (overestimation by FASTER) and 46,832 mm³ (underestimation by FASTER). All results lie inside the limits of agreement (Figure 6). There is no clear trend between the size of the mean volume and the volume differences. The volume differences are distributed uniformly between cases. The correlation between the two volumes (i.e., estimated by FASTER and calculated by the threshold method) is statistically significant, with an overall correlation value of R = 0.9317 and p < 0.001 (Figure 7). For both groups (TICI 0 and TICI 3), FASTER overestimated the true penumbra volume compared to the conventional threshold method (Figure 8). The volume difference tended to be higher in cases with high NIHSS. The NIHSS scores of the two patients with the highest volume differences were 19 and 22 (Figure 9). Notably, the differences were higher in patients with an NIHSS below 11 and above 15.
No relation was found between volume difference and time to recanalization.

Figure 9. Graph showing the relation between NIHSS (black line) and the absolute volume difference of the penumbra (the difference between the penumbra volume estimated by FASTER and the volume of Tmax > 6 s).

Discussion
The prediction accuracy of a decision forest was previously reported: the final lesion volume predicted by FASTER for the case of a poor response to therapy was significantly correlated with the final lesion volume. Since that testing was restricted to cases with TICI scores of 1-2a, we have now extended the analysis to cases with TICI scores of 0 and 3. The results of our study indicate a trend towards overestimation of the final infarct volume by FASTER in TICI 0 versus a trend towards underestimation in cases with complete reperfusion after therapy.
Both the final infarction and the tissue-at-risk may differ according to the response to revascularization therapy [24] and the state of collaterals [29,30], which influenced the prediction accuracy of FASTER. In cases with no reperfusion after mechanical thrombectomy (TICI 0), FASTER, as expected, tended to overestimate the final infarction volume. In this study, we followed a strategy that defined a "final" infarction at a time point where periprocedural effects, inflammation, and tissue repair can be neglected as confounders, rather than comparing the lesion size earlier during lesion evolution, since the apparent lesion size in the early acute phase is known to overestimate the final lesion volume even in cases with complete revascularization. In cases with complete reperfusion after mechanical thrombectomy (TICI 3), FASTER tended to underestimate the true infarction volume, and this underestimation tended to be larger with higher lesion volumes. Since the perfusion maps were processed with oSVD (a delay-sensitive approach), any delay in contrast bolus arrival (e.g., due to carotid stenosis or impaired cardiac function) may lead to overestimation of the results. The mean penumbra volume difference was found to be 7.8 mL; this difference did not appear to be related to the size of the effective penumbra. In summary, the results achieved good accuracy, whereas precision was low. Improving the precision of detecting the infarction core and penumbra is mandatory for improved decision support in the selection of patients for reperfusion therapy beyond the defined treatment window. A principal advantage of the automated method is its robustness to artifacts. Due to the susceptibility of the singular value decomposition to noise and artifacts, the perfusion parameters may be inaccurate or the maps may be distorted to the point that no decision can be made, even by experienced readers.
In our series, FASTER allowed for predicting the tissue at risk in two additional cases where the FDA-cleared software failed [31]. The superiority of automated analysis could be subsequently demonstrated in further studies with arterial input function detection failures [32].
Technical factors such as image resolution, supervised implementation of feature extraction, and the initial training dataset may to some extent account for the differences between the FASTER algorithm and manual segmentation. FASTER perfusion maps were resampled to 2 mm isotropic resolution with linear interpolation; however, for the newer version, the original scanner resolution was used without any modifications. FASTER was trained on maps processed with block circulant decomposition using the Perfusion Mismatch Analyzer (PMA) from the Acute Stroke Imaging Standardization Group (ASIST) v3.4.0.6, but the testing cases in this study were processed by Olea v2.3. An additional factor for the difference was the inclusion criteria, which admitted only patients with TICI 0 and TICI 3 and not the full range of responses to reperfusion therapy (TICI 0 to TICI 3). Clinical severity of stroke appears to be one of the factors that may influence the results, as a high core and penumbra volume difference was often found in cases with high NIHSS, and a high percentage volume difference for both FASTER and Olea was found in cases with lower NIHSS.
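To make the resampling step concrete, the sketch below applies linear interpolation along a single axis, moving samples from a 5 mm spacing to a 2 mm spacing. It is a one-dimensional simplification with a hypothetical function name and invented values; the actual FASTER preprocessing operates on 3D maps and its implementation is not described in the text.

```python
def resample_linear_1d(values, spacing_mm, new_spacing_mm=2.0):
    """Resample equally spaced samples to a new spacing by linear interpolation."""
    extent_mm = (len(values) - 1) * spacing_mm
    n_new = int(extent_mm // new_spacing_mm) + 1
    resampled = []
    for i in range(n_new):
        pos = i * new_spacing_mm / spacing_mm   # position in source index units
        lo = int(pos)
        hi = min(lo + 1, len(values) - 1)       # clamp at the last sample
        frac = pos - lo
        resampled.append(values[lo] * (1 - frac) + values[hi] * frac)
    return resampled


if __name__ == "__main__":
    # Three samples 5 mm apart, resampled onto a 2 mm grid.
    print(resample_linear_1d([0.0, 5.0, 10.0], spacing_mm=5.0))
```

Interpolation of this kind smooths the maps slightly, which is one way a change of resolution between training and testing data could shift the estimated volumes.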
Methods based on thresholds of different perfusion parameters still form the basis for decision support in different commercially available software [33] and are already integrated into clinical practice. However, such methods have inaccuracies and limitations, and significant differences are known to exist between commercially available software packages and sequences, even if identical source data are used, with insufficient correlations of true values and a lack of standardization [34][35][36][37][38]. Therefore, different attempts were made to move beyond the classical threshold-based methods towards semiautomated or fully automated methods [39]; lately, different works have focused on using deep learning methods to better differentiate the tissue at risk from the infarcted tissue, and challenges were organized to motivate researchers to develop advanced algorithms for the evaluation of perfusion images in stroke patients [40,41], in which FASTER was one of the top-performing algorithms [40]. Various other machine learning models have been used to predict tissue fate from processed perfusion parameters, but lately, different works have reported the value of using the raw perfusion data instead of the already post-processed parameters, with good results [31,32,[42][43][44][45][46]. Pinto et al. designed a fully automated deep learning method using both supervised and unsupervised learning to predict final stroke lesions after 90 days from multiparametric MRI [47]. Combining clinical information with multimodal MRI images also improves performance [48]. Results from the ISLES 2018 challenge have recently demonstrated that machine learning methods may predict infarcted tissue from perfusion CT with improved accuracy compared to the threshold-based methods used in clinical routine [46].
More recently, quantitative evaluation of deep learning derived MR perfusion maps yielded a strong agreement across different threshold-based segmentation of Tmax perfusion maps for patient selections for thrombectomy, rendering automated image analysis methods suitable for point of care triage and decision support if a medical expert performs the second look [32].

Limitations
This is a retrospective study with a limited number of patients from a single center and a single perfusion processing software. Another limitation is that we did not calculate a formal interrater reliability: the segmentation of the final infarction volume was performed primarily by a single rater and reviewed for correctness by a second rater, and this review showed a high interobserver agreement. The study focused on the volumetric estimation of tissue at risk and final infarction, which does not take into account the topography of the lesion; therefore, metrics such as the Dice similarity coefficient were not calculated. We decided to focus the analysis on the volumetric estimates since those estimates are most relevant for software solutions in decision support in stroke.

Conclusions
The volume estimation of FASTER for the infarction core and penumbra in predicting tissue fate varies according to the response to reperfusion therapy, generally with good accuracy, but with low precision. Both Tmax and FASTER overestimated the final infarction. FASTER performed better in the determination of tissue fate in cases with artifacts in which such determination is not possible using conventional methods. Using different processing techniques may impact the results of prediction.

Conflicts of Interest:
The authors declare no conflict of interest.