Texture-Based Analysis of 18F-Labeled Amyloid PET Brain Images

Abstract: Amyloid positron emission tomography (PET) brain imaging with radiotracers like [18F]florbetapir (FBP) or [18F]flutemetamol (FMM) is frequently used for the diagnosis of Alzheimer’s disease. Quantitative analysis is usually performed with standardized uptake value ratios (SUVR), which are calculated by normalizing to a reference region. However, the reference region could present high variability in longitudinal studies. Texture features based on the grey-level co-occurrence matrix, also called Haralick features (HF), are evaluated in this study to discriminate between amyloid-positive and amyloid-negative cases. A retrospective study cohort of 66 patients with amyloid PET images (30 [18F]FBP and 36 [18F]FMM) was selected, and SUVRs and 6 HFs were extracted from 13 cortical volumes of interest. Mann–Whitney U-tests were performed to analyze differences in the features between amyloid-positive and amyloid-negative cases. Receiver operating characteristic (ROC) curves were computed and their area under the curve (AUC) was calculated to study the discriminatory capability of the features. SUVR proved to be the most significant feature among all tests, with AUCs between 0.692 and 0.989. All HFs except correlation also showed good performance. AUCs of up to 0.949 were obtained with the HFs. These results suggest the potential use of texture features for the classification of amyloid PET images.

The interpretation of amyloid PET images is performed visually in clinical practice and is defined in the Society of Nuclear Medicine and Molecular Imaging (SNMMI) procedure standard/European Association of Nuclear Medicine (EANM) practice guideline [8]. The main feature for the visual classification is the grey-to-white matter contrast in amyloid PET images. Cases of abnormally elevated cortical Aβ deposition are considered amyloid-positive (Aβ+). In these cases, grey and white matter are difficult to distinguish and the image intensities are more homogeneous. Amyloid-negative (Aβ−) cases are characterized by slight white matter uptake without grey matter uptake. White matter can be clearly differentiated and the image intensity distribution between both tissues is heterogeneous.
The visual analysis and classification of amyloid PET images are recommended by the manufacturers and are performed in clinical practice. It must be noted that amyloid PET imaging is currently only considered a diagnostic biomarker and the corresponding radiotracers have been approved for clinical use for visual interpretation and to classify images as Aβ+ or Aβ− [1,9,10]. The sensitivity of the visual analysis of amyloid PET images when differentiating healthy control subjects from those with mild cognitive impairment (MCI) or AD is over 90% [11]. However, borderline cases that are difficult to classify visually could require quantitative analysis. Thresholds for the classification between amyloid-positive and amyloid-negative based on the standardized uptake value (SUV) have been calculated for all Aβ-binding tracers, namely [11C]PiB [12], [18F]FBP [13], [18F]FBB [5] and [18F]FMM [14]. The progression of amyloid deposition as seen in PET images and its relationship to cognitive function have also been studied [15].
Even though the quantitative analysis of Aβ-binding tracer uptake in PET images is focused on the SUV, visual analysis is still the gold standard in clinical settings [8]. Moreover, SUV-based quantification presents limitations and other approaches have previously been proposed. These are mainly motivated by finding an alternative to the reference region normalization, which has shown high variability in longitudinal studies [16,17] and whose optimization has been studied for different radiotracers [18–20]. Whittington and Gunn [21] presented an imaging biomarker called amyloid load, which quantifies the Aβ burden from 0% to 100%. Another quantification approach evaluated previously is the textural analysis of amyloid PET images. Campbell et al. [22] studied group differences of textural features and SUV ratios (SUVR) of cortical brain regions in patients grouped by different clinical factors. The authors demonstrated statistically significant differences in textural features between the study groups. Ben Bouallègue et al. [23] evaluated the diagnostic and prognostic value of textural and shape features in amyloid PET images and showed that textural features can be used to predict the conversion from mild cognitive impairment to AD. However, differences between textural features of amyloid-positive and amyloid-negative PET images have not been studied.
The goal of this study is to evaluate the usefulness of textural image features for the classification of amyloid PET images as positive or negative, especially compared to the more common SUVR. Textural image analysis is based on the Haralick features (HF) [24], which are obtained from the grey-level co-occurrence matrix (GLCM) and quantify the spatial distribution of image intensities. We study whether these features can correctly describe the grey-to-white matter contrast, and we evaluate their discriminatory performance against that of the more conventional SUVR.

Subjects
The study cohort was retrospectively selected from patients with clinical diagnoses of neurodegenerative disease who were transferred to the department of nuclear medicine for PET/CT brain imaging. Patients who had undergone an amyloid PET scan were included. Amyloid PET images were acquired either with [18F]FBP or [18F]FMM. The subjects are divided into two groups according to the employed amyloid tracer. Amyloid positivity or negativity was evaluated visually by specialists in Nuclear Medicine. The demographic characteristics are shown in Table 1.
21 subsets, allpass filter) and the images were corrected for attenuation with a low-dose CT scan. Scatter and random correction were also performed. The reconstructed images had a matrix size of 168 × 168 × 33 and voxel size of 4.0728 × 4.0728 × 5 mm³.

Image Analysis
All images are preprocessed using Statistical Parametric Mapping 12 (SPM12) (Wellcome Centre for Human Neuroimaging, University College London, London, UK) [25]. Preprocessing includes a manual orientation of the PET/CT images, coregistration between PET and CT images, spatial normalization to the Montreal Neurological Institute (MNI) space based on the unified segmentation algorithm of SPM12 using the CT image as an anatomical reference [26], and segmentation of cortical brain regions. Spatially normalized PET images have a matrix size of 91 × 109 × 91 and voxel size of 2 × 2 × 2 mm³.
The segmentation of cortical brain regions for the textural analysis is based on the Automated Anatomical Labelling (AAL2) atlas [27,28]. Because the amyloid burden can be asymmetric [29,30], 6 volumes of interest (VOI) are defined for each brain hemisphere, corresponding to the frontal, occipital, parietal and temporal lobes, the combined posterior cingulate/precuneus (PCC) and the anterior cingulate. These 6 VOIs are selected based on the specifications of the visual interpretation [8–10]. Additionally, a global VOI is defined as the sum of the 12 VOIs. Spatially normalized images are segmented by these VOIs, as the goal is to study the difference between grey matter and white matter image intensities; only for the SUV calculations is a grey matter segmentation performed beforehand. The AAL2 segmentation provides regions with higher cortical thickness and is, therefore, best suited for the proposed analysis. For each region, the 3-dimensional (3D) GLCM is constructed. The process is based on the Radiomics toolbox (https://github.com/mvallieres/radiomics, accessed on 23 February 2021) by M. Vallières for MATLAB (The MathWorks Inc., Natick, MA, USA) [31–33]. Before GLCM computation, image intensities are quantized into 128 bins using equal probability quantization. A total of 6 HFs are calculated, namely energy, contrast, entropy, homogeneity, correlation, and dissimilarity [24,33], which have previously been used in the analysis of amyloid PET images and have shown good performance [22,23]. The SUVRs, in turn, are obtained by normalizing the average SUV of each region to the average SUV of the grey matter part of the cerebellum [8,14,34].
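The quantization, GLCM construction and feature extraction described above can be illustrated with a minimal Python sketch. It is not the MATLAB Radiomics toolbox used in this study: the function names, the dictionary-based VOI representation, the use of the 13 unique directions of the 26-voxel neighbourhood, and the exact feature definitions (for example, homogeneity as the sum of p/(1 + |i − j|)) are assumptions chosen for illustration, and the toy example uses 2 grey levels instead of 128.

```python
import math
from bisect import bisect_left
from collections import Counter
from itertools import product

# 13 unique offsets of the 26-voxel neighbourhood; the opposite 13 are
# covered by counting every pair symmetrically (assumed convention).
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d > (0, 0, 0)]

def quantize_equal_probability(values, n_bins):
    """Map intensities to levels 0..n_bins-1 with roughly equal voxel counts per level."""
    ordered = sorted(values)
    n = len(ordered)
    # Ties share the rank of their first occurrence, so equal intensities share a level.
    return [bisect_left(ordered, v) * n_bins // n for v in values]

def glcm_3d(voxels):
    """Symmetric, normalized co-occurrence probabilities for a {(x, y, z): level} VOI."""
    counts = Counter()
    for (x, y, z), i in voxels.items():
        for dx, dy, dz in OFFSETS:
            j = voxels.get((x + dx, y + dy, z + dz))
            if j is not None:  # neighbours outside the VOI are ignored
                counts[(i, j)] += 1
                counts[(j, i)] += 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}

def haralick_features(p):
    """Six texture features from a normalized symmetric GLCM given as {(i, j): prob}."""
    mu = sum(i * v for (i, _), v in p.items())  # marginal mean (same for i and j)
    var = sum((i - mu) ** 2 * v for (i, _), v in p.items())
    cov = sum((i - mu) * (j - mu) * v for (i, j), v in p.items())
    return {
        "energy": sum(v * v for v in p.values()),
        "contrast": sum((i - j) ** 2 * v for (i, j), v in p.items()),
        "entropy": -sum(v * math.log2(v) for v in p.values()),
        "homogeneity": sum(v / (1 + abs(i - j)) for (i, j), v in p.items()),
        "dissimilarity": sum(abs(i - j) * v for (i, j), v in p.items()),
        "correlation": cov / var if var > 0 else float("nan"),  # undefined for one level
    }

def cube(n, intensity, n_bins=2):
    """Hypothetical n x n x n test VOI whose intensities come from intensity(coord)."""
    coords = list(product(range(n), repeat=3))
    levels = quantize_equal_probability([intensity(c) for c in coords], n_bins)
    return dict(zip(coords, levels))

# Toy comparison: a perfectly uniform VOI versus an alternating (checkerboard) one.
uniform = haralick_features(glcm_3d(cube(4, lambda c: 1.0)))
checker = haralick_features(glcm_3d(cube(4, lambda c: 1.0 + 4.0 * (sum(c) % 2))))
print(uniform["homogeneity"], checker["homogeneity"])
print(uniform["contrast"], checker["contrast"])
```

In this toy setting the uniform volume yields maximal energy and homogeneity and zero contrast, while the alternating volume yields lower homogeneity and higher contrast, which is the direction of change the HFs are expected to capture between homogeneous and heterogeneous uptake patterns.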

Statistical Analysis
Quantitative variables are represented as mean ± standard deviation. Statistical testing is performed to evaluate differences between quantitative features of Aβ+ and Aβ− PET images. Differences between groups are evaluated using Mann-Whitney U-tests. Receiver operating characteristic (ROC) curves are computed and the area under the curve (AUC) is used to quantify the discriminatory performance of each image feature. AUC values > 0.7 or < 0.3 (inverse relationship between feature value and the positive class) are considered to describe acceptable discrimination [35]. A total of 273 statistical tests are performed (3 subgroups, 13 VOIs, 7 image features). For a group of VOIs and image features, the False Discovery Rate (FDR) is controlled at level 0.05 based on the Benjamini-Hochberg procedure to adjust for multiple comparisons. All data are statistically analyzed with SPSS software version 19.00 (IBM Corp., Armonk, NY, USA).
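As an illustration of the quantities used here, the following Python sketch computes the AUC through its relationship to the Mann-Whitney U statistic (AUC = U / (n₊ · n₋)) and applies the Benjamini-Hochberg step-up rule. The analysis in this study was run in SPSS; the function names and all numeric values below are made up for the example and are not study data.

```python
def auc_mann_whitney(pos, neg):
    """AUC as U / (n_pos * n_neg); ties between groups count as 1/2."""
    u = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return u / (len(pos) * len(neg))

def acceptable_discrimination(auc, upper=0.7, lower=0.3):
    """AUC > 0.7, or < 0.3 for features inversely related to the positive class."""
    return auc > upper or auc < lower

def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per test: True if rejected while controlling the FDR at q."""
    m = len(p_values)
    order = sorted(range(m), key=lambda k: p_values[k])
    k_max = 0  # largest 1-based rank k with p_(k) <= k / m * q
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected

# Made-up feature values for a positive and a negative group (not study data).
pos = [1.4, 1.6, 1.2, 1.8]
neg = [1.0, 1.1, 1.3, 0.9]
auc = auc_mann_whitney(pos, neg)
# Negating a feature mirrors its AUC around 0.5, hence the symmetric 0.3 threshold.
auc_inverse = auc_mann_whitney([-x for x in pos], [-x for x in neg])
print(auc, auc_inverse, acceptable_discrimination(auc_inverse))

# Made-up p-values: some raw values below 0.05 do not survive the correction.
p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
print(benjamini_hochberg(p_values))
```

The example shows two points relevant to reading the results: an AUC below 0.3 carries the same discriminatory information as one above 0.7 with the sign of the feature reversed, and a raw p-value below 0.05 is not necessarily significant once the FDR is controlled across many tests.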

Results
Results from the Mann-Whitney U-test using global features are shown in Table 2. When comparing image features between Aβ+ and Aβ− patients independently of the used radiotracer and in the [18F]FBP group, all features show statistically significant differences except one. Correlation does not show significant differences between Aβ+ and Aβ− in any of the study groups (p = 0.037, p = 0.022 and p = 0.295, respectively). Additionally, energy and contrast are not significant features when analyzing the images acquired with [18F]FMM (p = 0.057 and p = 0.029, respectively). The results for the analyses of the individual VOIs for the three study groups are shown in Tables 3-5, respectively. Across all VOIs, the SUVR shows significant differences between Aβ+ and Aβ− for all three study groups, except in the occipital and left parietal VOIs in [18F]FBP images. Regarding HFs, energy and homogeneity show significant differences in 9 VOIs (all except the parietal and left anterior cingulate VOIs) for all patients, more than any other HF in that study group. In the case of [18F]FBP PET images, homogeneity is significantly different in 7 VOIs (left and right frontal and occipital VOIs, as well as the left PCC and right temporal and anterior cingulate VOIs). When analyzing only [18F]FMM PET images, energy also shows significant differences in 7 VOIs (occipital, temporal and PCC, as well as right anterior cingulate VOIs).
ROC curves of the image features using the global VOI for each study group are shown in Figure 1 and their corresponding AUC values in Table 6. The SUVR shows the best discriminatory capabilities with the highest AUC in the [18F]FMM study group (0.96) while homogeneity performs better in the other two groups (0.813 and 0.862, respectively). Except in the case of SUVR, AUC values are higher in the study group of patients whose amyloid PET images were acquired with [18F]FBP. Lastly, only correlation results in an AUC value < 0.7. In the study group of all patients, the AUC is 0.654. When analyzing only the [18F]FMM study group, homogeneity results in an AUC value of 0.611.

Discussion
In this study, image features from 18F-labeled amyloid PET images are evaluated to differentiate between Aβ+ and Aβ− cases. The SUVR, commonly used in clinical practice, as well as 6 HFs, which do not require a reference region normalization, are extracted. The quantitative features are compared in three study groups. These correspond to all the patients of the retrospective cohort, those whose amyloid PET images were acquired with [18F]FBP, and those whose images were acquired with [18F]FMM. Each HF is evaluated individually to determine whether it can quantitatively describe the grey-to-white matter contrast present in Aβ− images and its decrease in Aβ+ cases.
Statistical testing revealed that the SUVR is still the best feature when discriminating between Aβ+ and Aβ− PET images, showing significant differences in all regions and study groups except in 3 VOIs of the [18F]FBP group. Cut-offs have been previously defined for different amyloid-binding radiotracers [5,12–14] and the SUV is also used in clinical practice for quantifying PET images. However, all HFs showed significant differences in at least 3 VOIs in every study group. An exception is correlation, which was not found to be a useful image feature in our study. Even though grey and white matter present similar voxel intensities in positive amyloid PET images, neighboring voxels do not necessarily need to be linearly correlated in either case, especially given that Aβ plaques are present as diffuse deposits [2]. In contrast, features that show good performance, such as homogeneity, describe the general similarity of neighboring voxel intensities and show greater differences between Aβ+ and Aβ− cases, as demonstrated by ROC analyses.
Campbell et al. [22] reported that energy and entropy best discriminate between their study groups. While these features generally show statistically significant differences in our study, energy in the [18F]FMM group is also one of the three HFs that are not significant when extracted from the global VOI. However, the authors grouped the patients by clinicopathological features and not by Aβ positivity. Moreover, in their study, SUVR showed no statistically significant differences between the groups and HFs performed better, unlike the results presented in this study.
AUC values of ROC curves from features using the global VOI were generally larger than 0.7. Again, correlation is not sufficiently capable of discriminating between Aβ+ and Aβ−. All other features showed overall good performance in ROC analyses of regional VOIs indicating their discriminatory capability and describing correctly the visually identified grey-to-white matter contrast patterns. Overall, an association between the statistically significant group differences in Mann-Whitney U-tests and higher AUC values can be observed. Compared to Campbell et al. [22], the AUC values are generally higher in this study. The authors reported their highest AUC value as 0.695 (energy of the PCC), while in our study it is 0.989 (SUVR of the left temporal VOI). On the other hand, Ben Bouallègue et al. [23] obtained AUC values < 0.75 with HFs in a cross-sectional study group. However, the authors employed a different segmentation method and did not analyze cortical brain regions.
Moreover, ROC analyses show the relationship between the features and Aβ positivity. As mentioned previously, Aβ+ PET images are characterized by a more homogeneous uptake pattern between grey and white matter. Contrast, entropy and dissimilarity result in AUC values < 0.3, indicating an inverse relationship to Aβ+. Features whose higher values describe heterogeneous spatial intensity distributions are therefore negatively correlated with Aβ+. On the other hand, energy, homogeneity and correlation show AUC values > 0.7. These features are therefore positively correlated with Aβ+ and their values increase with higher homogeneity of the image intensity values.
Regional group differences have also been detected in statistical testing and ROC analyses. The manuals for visual interpretation of [18F]FBP [9] and [18F]FMM [10] PET images indicate that for a positive diagnosis, at least two regions or one region, respectively, need to present reduced grey-white matter contrast. Regions of interest defined in those guidelines are the frontal, temporal, and parietal lobes, as well as the posterior cingulate [8–10]. Across all analyses, the PCC showed the best results. Good results in the other regions were also obtained in different tests and study groups. For example, the frontal lobe ([18F]FBP) and the temporal lobe ([18F]FMM) can be seen as defining regions between Aβ+ and Aβ− PET images. However, the parietal VOI is the least significant region in our analysis. While it is defined in the guidelines of the visual interpretation, the parietal lobe and higher image slices are more frequently affected by atrophy and therefore might appear to present lower activity in the amyloid PET images. Experienced readers usually focus their analysis on more caudal slices, which often include the regions with good discriminatory performance in our study. Regarding the quantitative analysis, it has also been shown that the parietal region is particularly prone to artifacts, which could affect the results [36]. Interestingly, the occipital lobe is not included in the regions to review defined by the guidelines for visual interpretation of amyloid PET images but showed good results in our analyses to discriminate between Aβ+ and Aβ− images.
Limitations of this study include the lack of anatomical images for the normalization process and the absence of study groups imaged with [11C]PiB and [18F]FBB, which would be needed to evaluate HFs for these radiotracers. Additionally, longitudinal studies should be performed to evaluate the robustness of textural features of PET images over time. In this study, however, only a retrospective cross-sectional study cohort was available. Moreover, a large prospective study cohort would also allow for the calculation of radiotracer-specific cut-off values of the textural parameters for the classification into Aβ+ or Aβ−. Regarding the extracted image features, it should be noted that no standardized protocol for texture analysis in PET images is defined and the intensity quantization method or bin number may vary between studies. Other texture features, like those based on the image histogram, the grey-level run length matrix or the grey-level size zone matrix, could also be evaluated, as could other regions like the striatum, which is defined as a region of interest in the visual interpretation of [18F]FMM images [10]. Lastly, while in this study the individual performance of HFs is analyzed, future studies should include the generation of multivariate classifiers to evaluate the combined performance of image features.

Conclusions
PET brain images of amyloid-binding radiotracers are visually interpreted by analyzing grey-to-white matter contrast. Quantitative image features from PET brain images acquired with [18F]FBP and [18F]FMM are evaluated in this study. SUVRs and HFs from the GLCM are extracted and compared between Aβ+ and Aβ− images. While SUVR still performed better than HFs, these also showed significant differences between the groups. Therefore, it can be concluded that texture features could be an alternative to SUVR that does not need a reference region for its calculation.
Institutional Review Board Statement: Ethical review and approval were waived for this study because it involved a retrospective image database.
Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.

Data Availability Statement: The data presented in this study are available on request from the corresponding authors. The data are not publicly available due to clinical patient information.