A Method for the Prediction of Clinical Outcome Using Diffusion Magnetic Resonance Imaging: Application on Parkinson’s Disease

Robust early prediction of clinical outcomes in Parkinson’s disease (PD) is paramount for implementing appropriate management interventions. We propose a method that uses the baseline MRI, measuring diffusion parameters from multiple parcellated brain regions, to predict the 2-year clinical outcome in Parkinson’s disease. Diffusion tensor imaging was obtained from 82 patients (males/females = 45/37, mean age: 60.9 ± 7.3 years, baseline and after 23.7 ± 0.7 months) using a 3T MR scanner, which was normalized and parcellated according to the Automated Anatomical Labelling template. All patients were diagnosed with probable Parkinson’s disease by the National Institute of Neurological Disorders and Stroke criteria. Clinical outcome was graded using disease severity (Unified Parkinson’s Disease Rating Scale and Modified Hoehn and Yahr staging), drug administration (levodopa equivalent daily dose), and quality of life (39-item PD Questionnaire). Selection and regularization of diffusion parameters, the mean diffusivity and fractional anisotropy, were performed using least absolute shrinkage and selection operator (LASSO) between baseline diffusion index and clinical outcome over 2 years. Identified features were entered into a stepwise multivariate regression model, followed by a leave-one-out/5-fold cross validation and additional blind validation using an independent dataset. The predicted Unified Parkinson’s Disease Rating Scale for each individual was consistent with the observed values at blind validation (adjusted R2 0.76) by using 13 features, such as mean diffusivity in lingual, nodule lobule of cerebellum vermis and fractional anisotropy in rolandic operculum, and quadrangular lobule of cerebellum. We conclude that baseline diffusion MRI is potentially capable of predicting 2-year clinical outcomes in patients with Parkinson’s disease on an individual basis.

Abstract: Robust early prediction of clinical outcomes in Parkinson's disease (PD) is paramount for implementing appropriate management interventions. We propose a method that uses the baseline MRI, measuring diffusion parameters from multiple parcellated brain regions, to predict the 2-year clinical outcome in Parkinson's disease. Diffusion tensor imaging was obtained from 82 patients (males/females = 45/37, mean age: 60.9 ± 7.3 years, baseline and after 23.7 ± 0.7 months) using a 3T MR scanner, which was normalized and parcellated according to the Automated Anatomical Labelling template. All patients were diagnosed with probable Parkinson's disease by the National Institute of Neurological Disorders and Stroke criteria. Clinical outcome was graded using disease severity (Unified Parkinson's Disease Rating Scale and Modified Hoehn and Yahr staging), drug administration (levodopa equivalent daily dose), and quality of life (39-item PD Questionnaire). Selection and regularization of diffusion parameters, the mean diffusivity and fractional anisotropy, were performed using least absolute shrinkage and selection operator (LASSO) between baseline diffusion index and clinical outcome over 2 years. Identified features were entered into a stepwise multivariate regression model, followed by a leave-one-out/5-fold cross validation and additional blind validation using an independent dataset. The predicted Unified Parkinson's Disease Rating Scale for each individual was consistent with the observed values at blind validation (adjusted R 2 0.76) by using 13 features, such as mean diffusivity in lingual, nodule lobule of cerebellum vermis and fractional anisotropy in rolandic operculum, and quadrangular lobule of cerebellum. We conclude that baseline diffusion

Imaging
MRI was performed with a 3T scanner using a 12-channel head matrix coil (Trio, A TIM system, Magnetom, Siemens, Erlangen, Germany). The patient's head was kept fixed with a pad to avoid bulk motion during scanning. The imaging parameters for T1-weighted images were repetition time/echo time/inversion time/flip angle = 2000 ms/2.63 ms/900 ms/9 • , slice thickness = 1 mm, number of slices = 160, matrix size = 224 × 256, field of view = 224 × 256 mm 2 , and acquisition time = 4 min 8 s.
Diffusion-weighted images were acquired using a spin-echo echo-planar-imaging sequence with repetition time/echo time/slice thickness = 5700 ms/108 ms/3 mm, matrix size = 96 × 96, and field of view = 192 × 192 mm 2 , 40 slices which covered the whole brain down to the cerebellum, and an acceleration factor of 2 using GRAPPA (generalized autocalibrating partially parallel acquisition) reconstruction. The b-value was 1000 s/mm 2 with the diffusion-weighting gradients applied along 30 non-collinear directions.

Image Post-Processing
Diffusion data were processed using Diffusion Kurtosis Estimator software [16]. The mean diffusivity (MD) and fractional anisotropy (FA) were extracted from the diffusion tensor in accordance with the methods described by Lo et al. [17]. Briefly, a transformed matrix was identified by normalizing the B 0 image from each individual to the standard Montreal Neurological Institute template using linear registration (12 parameter affine transformation) and was parcellated into 116 different regions according to the Automated Anatomical Labelling template [18]. A parenchyma mask was used to remove the presence of cerebrospinal fluid in all diffusion metrics. The 10th, 50th, and 90th percentiles for each parcellated brain region were recorded, which led to 696 features in each subject.

Statistical Analysis
Statistical analyses were performed in SAS version 9.4 (SAS Institute, Inc. Cary, NC, USA). Student's paired t-test was used to examine the differences in (a) clinical measures (p < 0.05), and (b) diffusion parameters between baseline and follow-up. The normal distribution of the diffusion index was tested using the Kolmogorov-Smirnov test. Statistical significance was reached at p < 0.05 with correction for multiple comparisons by using Bonferroni's method (0.05/696) in tests for diffusion parameters.
For blind validation, the original data were randomly split into two sub-datasets by using systematic random sampling. Following the 80/20 rule [19,20], the training dataset consisted of 66 patients (80%) and independent blind dataset of 16 (20%). All qualified patients were enrolled in a chronological order and selected at a fixed interval. Regression analysis used LASSO, in which L1-norm regularization was performed to minimize the sum of the absolute value of the regression coefficients [21] in order to reduce the number of features to a statistically reasonable level for further analyses [22]. Age, sex, disease duration and imaging protocols were examined by LASSO, together with the imaging features, in order to clarify the potential confounding effect on the predictability of the models. During the training process, approximately 43 to 55 features, less than the sample size, were selected by LASSO from the original 696 imaging parameters for each clinical change. To devise a final predictive model, a stepwise linear regression was performed from LASSO selected features using 1000 times bootstrap calculation. To avoid overfitting the model, the number of features was further reduced to 1/5 of the training samples [23,24]. As a result, 13 features were entered into the regression model.
Adjusted R 2 , F values, and regression coefficients were used to express goodness-of-fit. Predictive power was expressed as adjusted R 2 . To confirm our final model, we employed a leave-one-out and five-fold cross validation in the training dataset, followed by blind validation in the independent blind dataset, where the mean values of the adjusted R 2 and the absolute error (MAE, the absolute difference between the observed and the predicted scores) were calculated from each validation model. Figure 1 shows the experimental procedure from enrollment to statistical analysis.
J. Clin. Med. 2020, 9, 647 5 of 21 patients (80%) and independent blind dataset of 16 (20%). All qualified patients were enrolled in a chronological order and selected at a fixed interval. Regression analysis used LASSO, in which L1norm regularization was performed to minimize the sum of the absolute value of the regression coefficients [21] in order to reduce the number of features to a statistically reasonable level for further analyses [22]. Age, sex, disease duration and imaging protocols were examined by LASSO, together with the imaging features, in order to clarify the potential confounding effect on the predictability of the models. During the training process, approximately 43 to 55 features, less than the sample size, were selected by LASSO from the original 696 imaging parameters for each clinical change. To devise a final predictive model, a stepwise linear regression was performed from LASSO selected features using 1000 times bootstrap calculation. To avoid overfitting the model, the number of features was further reduced to 1/5 of the training samples [23,24]. As a result, 13 features were entered into the regression model. Adjusted R 2 , F values, and regression coefficients were used to express goodness-of-fit. Predictive power was expressed as adjusted R 2 . To confirm our final model, we employed a leaveone-out and five-fold cross validation in the training dataset, followed by blind validation in the independent blind dataset, where the mean values of the adjusted R 2 and the absolute error (MAE, the absolute difference between the observed and the predicted scores) were calculated from each validation model. Figure 1 shows the experimental procedure from enrollment to statistical analysis.

Changes over the Study Period
We first looked for any changes from baseline to the end of the study in clinical outcome, dose administration, and quality of life ( Table 1). None of the patients had an intracranial structural abnormality. Over the 2-year study period, clinical staging (MHY) increased by 0.3 (p < 0.001), as did the total and the motor subscale of the UPDRS by 6.1 and 2.5, respectively (p = 0.003). In the assessments of quality of life (including in ADL, UPDRS category II, PDQ39SI, and SF36SI), there were significant declinations (p < 0.005). Measures of disease severity were not correlated with age, systolic or diastolic blood pressure, education, and body mass index. A significant positive correlation was found between disease duration and clinical outcomes (p < 0.05).
The values of FA and MD at baseline and follow-up are summarized in Table 2. Figure 2 plots the changes in diffusion parameters over the follow-up period when significant. Extensive regions can be involved, which include superior and middle parts of the frontal lobe; superior, middle and inferior parts of the occipital lobe; biventral, tonsil, and flocculus of the cerebellum. Notably, in basal ganglia, the regions with a significant change in diffusion were mainly located in the caudate.

Summary of Regression Analysis
Imaging parameters that survived the LASSO regression with the change of each clinical outcome in the training dataset were entered into the linear regression analysis. All of the variance inflation factors were within the normal range and <2.8, which suggests that the assumption of multicollinearity was not violated. The values of statistical power and effect size (Cohen f 2 ) of the prediction model ranged from 0.94-1 and 0.48-7.59, respectively.
Regression analyses revealed strong-to-excellent (adjusted R 2 between 0.57 and 0.94) associations between the predicted clinical parameters by baseline diffusion parameters and the observed at follow-up. We summarized the results of the regression, cross-and blind validations, including the adjusted R 2 and the mean average error in Table 3. The predicted scores at follow-up are plotted against the observed values in Figure 3, including the disease severity (total score of UPDRS, MHY), LEDD, and PDQ39SI, using both the training (solid circle) and the blind (circle) datasets. The observed UPDRS at follow-up were significantly in line with those predicted by the regression model with the high adjusted R 2 (leave-one out/5-folds cross validation = 0.939 ± 0.002/0.944 ± 0.011, Figure 3A). Figure 4 shows the predicted and observed values for each category in UPDRS in the training dataset (solid circle) and the blind dataset (circle). Category II and III in UPDRS showed highly adjusted R 2 , which ranged between 0.923 and 0.933 in cross validations.
For the independent blind dataset, the adjusted R 2 was 0.76 in UPDRS with an inherent error of 4.38%. The linear relationships between the predicted and observed LEDD, PDQ39SI, and MHY were also good-to-excellent (adjusted R 2 between 0.86 and 0.89, Figure 3B-D). Noticeably, LEDD could be predicted at a mean adjusted R 2 of 0.891 and 0.898 at both cross and blind validations with a mean average error of approximately 135 and 138 mg, respectively ( Table 3). The adjusted R 2 was still high for Category II (0.82) and III (0.62) in UPDRS at the independent blind dataset. The error between the predicted and the observed scores were 5.88% and 6.90%, respectively.  Changes of diffusion parameters from the parcellated brain regions over the follow-up period. The figure shows the change of diffusion parameter from parcellated brain regions in patients with PD if with statistically significant differences between baseline and the end of the study. MD, mean diffusivity, in unit of 10 −3 mm 2 /sec; FA, fractional anisotropy, unitless. Table 2. Diffusion parameters in parcellated cortical regions over time. Table 2 shows average values of fractional anisotropy and mean diffusivity in the different parcellated cortical regions at baseline and at follow-up. Data are reported as means ± standard deviations.

Prediction of the Clinical Outcome
The clinical outcome can be calculated from the equations of linear regression, as listed in Table  4 (UPDRS and each sub-category) and Table 5 (MHY, LEDD, and PDQ39SI). Dependent variables and the corresponding unstandardized coefficients for each clinical outcome are also reported. The clinical changes in individual patients could be calculated by the combination of diffusion parameters calculated from multiple brain regions, which showed that all regression models involved the

Prediction of the Clinical Outcome
The clinical outcome can be calculated from the equations of linear regression, as listed in Table 4 (UPDRS and each sub-category) and Table 5 (MHY, LEDD, and PDQ39SI). Dependent variables and the corresponding unstandardized coefficients for each clinical outcome are also reported. The clinical changes in individual patients could be calculated by the combination of diffusion parameters calculated from multiple brain regions, which showed that all regression models involved the cerebellum. Notably, both the total score and the motor subscale of UPDRS could be predicted by diffusion parameters measured from several overlapping regions, including the caudate, rectus, and, most pronounced, the cerebellum. In Category II, the predictive regions were most notably the visual cortex, thalamus, insula, cingulum, hippocampus, and cerebellum.
In the assessment of life quality (PDQ39SI), the adjusted R 2 was 0.88 at both validations, mostly predicted by diffusion indices measured in the occipital lobe, cingulum gyrus, quadrangular and biventral lobules in the cerebellum, rolandic, insula, caudate, paracentral gyrus, temporal, frontal lobe, and lingula of cerebellum vermis. Regions that entered into the regression model of LEDD were located in the paracentral gyrus, frontal lobe, Heschl gyrus, postcentral, occipital lobe, cuneus, thalamus, temporal lobe, precuneus, and parahippocampal gyrus, alae, biventral, and tonsil lobules of cerebellum. Table 4. Equation with dependent variables and the corresponding unstandardized coefficients in the regression model of UPDRS and its categories. Table 4 shows the dependent variables and corresponding unstandardized coefficients for the total score and Category I-IV of UPDRS from which clinical changes in individual patients can be calculated by the combination of diffusion parameters. Table 5. Equation with dependent variables and the corresponding unstandardized coefficients in the regression model of LEDD, MHY, and PDQ39SI. Table 5 shows the dependent variables and corresponding unstandardized coefficients for LEDD, MHY, and PDQ39SI from which clinical changes in individual patients can be calculated by the combination of diffusion parameters.

Major Findings and Clinical Impacts
The study proposes a multivariate approach to examine the potential prognostic performance of diffusion MRI on later disease severity, LEDD, and PDQ-39 of patients with PD. The results of our study indicate that the clinical outcome over a 2-year follow-up period can be confidently predicted by baseline diffusion parameters measured in parcellated regions from the whole brain. Importantly, diffusion parameters were found to predict the clinical trajectory of PD over time on an individual patient basis. The results were blind-validated by using an independent sub-dataset, establishing the validity of prognostic prediction by features from diffusion MRI.
Diffusion imaging can be easily incorporated into routine MRI examination, which has often been prescribed in patients suspected of having PD to rule out concomitant brain disorders. Here, we show the prognostic value of our approach by using different diffusion parameters in patients with PD. The proposed machine learning algorithm might help facilitate the development of an artificial intelligence-based computer-aided prognosis in patients with PD using diffusion tensor imaging. Diffusion tensor imaging can be objective and does not require extensive cooperation from the patient [25]. Our study demonstrates the value of adding this step to routine imaging examinations. This might significantly contribute to the confidence of neurologists on the diagnosis and prognosis of PD.

Prediction of Motor Function
Owing to their paramount pathogenetic role in PD, the substantia nigra, in particular the basal ganglia, has been the subject of intense MRI investigation [26][27][28][29]. However, the detection of structural changes in the middle temporal gyrus more than in the basal ganglia may improve diagnostic accuracy [10,30]. Our study shows that during the follow-up period, changes in diffusivity can occur in extensive cortical regions, for example, middle frontal gyrus, superior and middle occipital gyri, as well as nodule and tonsil in the cerebellum. Because microenvironmental alterations generally precede structural brain changes, diffusion parameters are expected to have higher predictive power than conventional morphometric measures.
The cerebellum appeared in many of our regression models, notably the quadrangular lobule of the cerebellum and lingula of the vermis. The basal ganglia are characterized by a relevant disynaptic projection to the cerebellar cortex [31]. Patients of PD with either tremor [32] or pain [33] showed damage in these regions. Reciprocal connections between the basal ganglia and cerebellum have been identified and may account for some of the clinical symptoms observed in PD [34]. Our study might suggest that the cerebellum could play an essential role in the disease progression of the motor symptoms in PD.
Regions found to be predictive of prognosis could form the keynodes within an interconnected network, for example, frontal-hippocampus and cerebellum-basal ganglia-thalamus networks. Brain connectivity within a whole neural network is altered in PD patients [35]. Therefore, affected regions might extend beyond observed pathogenesis, leading to changes in related motor and higher cognitive functions [36]. Our approach, which comprehensively assessed the damage throughout the whole brain, might potentially provide a more accurate prediction of the functional decline.

Prediction of Non-Motor Functions
Interestingly, the functions of these brain regions reflect well-known alterations in memory, language, and executive functions that occur in PD [37]. Our findings also suggest that both the thalamus and frontal lobe may be involved in the prediction of LEDD and the hippocampus in the prediction of UPDRS I. The functions of these memory-and cognitive-related areas involve regulation of sensory-motor, emotional and memory functions [38,39], and socio-emotional processing [40].
Although PD is predominantly considered to be a motor disorder, the extensive brain regions identified in our regression analysis supports the hypothesis that the disease process is multifocal in nature [41].
Because non-dopaminergic and non-motor symptoms tend to emerge later during the course of PD, their potential role as targets of cognitive training is attracting increasing attention [42]. Therefore, in order to effectively predict clinical outcome, it would be reasonable to assess the damage from a comprehensively neural network, rather than a few selected regions of interest.
Because non-motor symptoms in PD often precede motor abnormalities such as cognitive impairment, and are associated with worse prognosis [43], patients should be carefully monitored for potential early interventions [44,45]. We show that the clinical outcome in motor and non-motor symptoms can be confidently predicted by the diffusion characteristics in a combination of multiple brain regions. Our study might provide novel evidence of diffusion MRI as an effective and non-invasive prognostic tool.

Study Limitation
The main limitation in our study is the nature of our data-driven approach that enters the diffusion parameters into regression models, which makes it difficult to establish a direct causal relationship between the underlying function of an affected brain region and longitudinal changes in clinical severity. Further research aimed at disentangling the complex neural networks associated with clinical deterioration over time is therefore required.
Because only 82 patients were included in the final analysis, the generalizability of the study might be restricted. Our study presents a method, which selects features from the whole brain rather than a few pathophysiology related areas, to effectively predict the clinical outcome of patients with PD. Our observations of the involvement from additional brain regions might prompt further research into the role of progressive diffuse microstructural brain damage in the natural history of PD and to investigate the potential effect from chronic pharmacotherapy with an increased number of participants in the future study.

Conclusions
In conclusion, baseline parameters from diffusion tensor measurements from the whole brain of PD patients can accurately predict the decline in clinical outcomes over a 2-year period in individual patients, notably in the total and Category II of UPDRS.