Factors Affecting the Baseline and Post-Treatment Scores on the Hopkins Verbal Learning Test-Revised Japanese Version before and after Whole-Brain Radiation Therapy

Our objectives were to (1) investigate the feasibility of the use of the Japanese version of the Hopkins Verbal Learning Test-Revised (HVLT-R); (2) identify the clinical factors influencing the HVLT-R scores of patients undergoing whole-brain radiation therapy (WBRT); and (3) compare the neurocognitive function (NCF) after WBRT in different dose fractionation schedules. We administered the HVLT-R (Japanese version) before (baseline) and at four and eight months after WBRT in 45 patients who received either therapeutic (35Gy-in-14, n = 16; 30Gy-in-10, n = 18) or prophylactic (25Gy-in-10, n = 11) WBRT. Sixteen patients dropped out before the eight-month examination, due mostly to death from cancer. The Karnofsky Performance Status (KPS) 80–100 group had significantly higher baseline total recall (TR) scores (p = 0.0053), delayed recall (DR) scores (p = 0.012), and delayed recognition (DRecog) scores (p = 0.0078). The patients aged ≤65 years also had significantly higher TR scores (p = 0.030) and DRecog scores (p = 0.031). The patients who underwent two examinations (worse-prognosis group) had significantly decreased DR scores four months after WBRT compared to the baseline (p = 0.0073), and they were significantly more likely to have declined individual TR scores (p = 0.0017) and DR scores (p = 0.035) at four months. The eight-month HVLT-R scores did not significantly decline regardless of the WBRT dose fractionation. The baseline NCF was determined by age and KPS, and the early decline in NCF is characteristic of the worse-prognosis group.


Introduction
In the last decade, the importance of both whole-brain radiation therapy (WBRT) and stereotactic radiosurgery (SRS) in the management of brain metastasis (BM) has been highlighted again with the latest improvement in prognostic stratification. Sperduto et al. published the results of a secondary analysis of the Radiation Therapy Oncology Group (RTOG) 9508 trial comparing WBRT + SRS with WBRT alone in the management of 1-3 BMs [1]. In that analysis, non-small cell lung cancer (NSCLC) was the dominant primary tumor. The patients with favorable prognoses showed improved survival when treated with WBRT + SRS compared to those treated with WBRT alone (p = 0.05). In their secondary analysis of the JROSG 99-1 trial, Aoyama et al. post-stratified the NSCLC patients. They reported that the patients with BM from NSCLC in the favorable-prognosis group had longer survival when treated with WBRT + SRS compared to those treated with SRS alone (p = 0.04) [2]. The difference in overall survival was not observed in the worse-prognosis group (p = 0.86). For the patients with a good systemic condition and a small number of systemic metastases, improved intracranial tumor control could lead to prolonged survival.
Although both WBRT and SRS are indispensable in the management of BM, two trials demonstrated a negative impact of WBRT on the patients' neurocognitive function (NCF) at 3-4 months after the treatment, and these results set the trend toward the SRS-alone strategy [3,4]. It is of note that the Hopkins Verbal Learning Test-Revised (HVLT-R) was used in both of those studies, and this test is considered to have adequate properties to assess the NCF after brain irradiation; the English version of the HVLT-R has been increasingly used in trials dealing with brain metastasis and other brain tumors [5].
The objectives of the present study were to (1) investigate the feasibility of the use of the Japanese version of the HVLT-R; (2) identify the clinical factors influencing the HVLT-R (Japanese version) scores of patients who have undergone WBRT; and (3) compare the patients' NCF after WBRT in different dose fractionation schedules.

Patient Characteristics
The characteristics of the 45 patients are listed in Table 1. The 25-Gy group comprised only the LD-SCLC patients who were assigned to prophylactic WBRT (n = 11), and they had no intracranial metastases at the beginning of the WBRT. Sixteen patients (36%) underwent the HVLT-R at four months after WBRT but did not undergo the eight-month examination. The remaining 29 patients (64%) underwent the eight-month examination. The most common reason for the discontinuation of the HVLT-R was death from cancer (n = 8). The proportion of patients who did not undergo the eight-month examination was significantly higher in the Karnofsky Performance Status (KPS) ≤ 70 group compared to the KPS ≥ 80 group (Table 2, p = 0.014 by Fisher's exact test).
Table 3 shows the association between the baseline scores of the HVLT-R subdomains (TR, DR, and DRecog) and the clinical factors. The patients with KPS ≥ 80 had significantly higher TR scores (p = 0.0053 by Mann-Whitney U-test), higher DR scores (p = 0.012), and higher DRecog scores (p = 0.0078) compared to the patients with KPS ≤ 70. The patients aged ≤65 years also had significantly higher TR scores (p = 0.030) and higher DRecog scores (p = 0.031) compared to the patients aged ≥66 years, and they tended to have higher DR scores (p = 0.080). There were no significant differences in any HVLT-R subdomain scores among the WBRT dose groups or among the different types of cranial lesion. The p-values were Bonferroni-adjusted for the comparisons among the WBRT dose groups and among the types of cranial lesion.
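As a minimal sketch of what a Bonferroni adjustment does, the snippet below multiplies each raw p-value by the number of comparisons and caps the result at 1. The function name `bonferroni_adjust` and the raw p-values are hypothetical and chosen only for illustration; they are not values from this study, and the number of comparisons actually used in the analysis is not restated here.

```python
def bonferroni_adjust(p_values):
    """Return Bonferroni-adjusted p-values: each p times the number of tests, capped at 1."""
    m = len(p_values)  # number of comparisons in the family
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw p-values from three pairwise comparisons (illustration only).
raw = [0.01, 0.04, 0.50]
adjusted = bonferroni_adjust(raw)
print([round(p, 3) for p in adjusted])  # [0.03, 0.12, 1.0]
```

With three comparisons, a raw p-value of 0.04 is no longer below 0.05 after adjustment, which is the conservative behavior the correction is meant to provide.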

HVLT-R Raw Scores of the Patients Who Underwent Two Examinations (Baseline and Four Months) (Worse-Prognosis Group, n = 16)
Figure 1 summarizes the HVLT-R subdomain scores in the patients who did not undergo the eight-month examination. The DR scores were significantly decreased four months after WBRT compared to the baseline (p = 0.0073 by Wilcoxon signed rank test). The TR scores tended to decline four months after WBRT (p = 0.057), but this change was not observed in the DRecog scores (p = 0.15).

HVLT-R Raw Scores of the Patients Who Underwent Three Examinations (Baseline, Four, and Eight Months) (Better-Prognosis Group, n = 29)
Figure 2 summarizes the HVLT-R subdomain scores in the patients who also underwent the eight-month examination. There were no significant changes over time in the TR scores (p = 0.24 by Friedman test), DR scores (p = 0.14), or DRecog scores (p = 0.13).
Table 4 shows the HVLT-R scores, by WBRT dose fractionation, of the patients who underwent three examinations (baseline, four, and eight months). In the 25-Gy group, the four-month TR scores were significantly higher than the baseline scores (Bonferroni-adjusted p = 0.045). To test the hypothesis that the patients who underwent 25-Gy WBRT and three examinations had worse baseline TR scores compared to the other WBRT dose groups, we compared the baseline TR scores of each WBRT dose group by Kruskal-Wallis test. There were no significant differences in the baseline TR scores among the WBRT dose groups (p = 0.40). The eight-month HVLT-R scores did not decline significantly in any subdomain regardless of the WBRT dose fractionation.
Table 5 shows the proportion of patients with a significant decline in individual HVLT-R scores at four months according to the clinical factors. The patients with KPS ≤ 70 were significantly more likely to have declined TR scores at four months compared to the patients with KPS ≥ 80 (Fisher's exact test, p = 0.013). Similarly, the patients who underwent two examinations (worse-prognosis group; baseline and four months) were significantly more likely to have declined TR scores (p = 0.0017) and DR scores (p = 0.035) compared to the patients who underwent three examinations (better-prognosis group; baseline, four, and eight months).
Table 6 shows the proportion of patients with a significant decline in individual HVLT-R scores at eight months according to the clinical factors. The patients aged ≥66 years tended to have declined DR scores at eight months compared to the patients aged ≤65 years (p = 0.060).

Discussion
This is the first clinical study in which the Japanese version of the HVLT-R was used to investigate the serial changes in NCF after brain irradiation. By applying this test, we were able to assess subtle differences or changes before and after brain irradiation, which might not be detected by the Mini-Mental State Examination (MMSE).
At the baseline, the patients with either older age or lower KPS exhibited lower HVLT-R scores. A similar association was observed with the MMSE. Aoyama et al. reported that BM patients aged ≥66 years or those with KPS ≤ 80 had lower baseline MMSE scores than their counterparts [6]. Kurita et al. used the MMSE to analyze the NCF of 1915 cancer patients who underwent treatment with opioids [7], and they reported that older age and lower KPS were independently associated with MMSE scores <27.
Although it has been used widely, a weak point of the MMSE is that it is designed to evaluate global cognitive function rather than learning and memory. It also has a ceiling effect [8]. Sun et al. published the results of their secondary analysis of the RTOG 0214 trial evaluating the effect of prophylactic WBRT for advanced NSCLC [9]: the rate of decline in individual TR scores of the HVLT was significantly higher in the prophylactic WBRT arm at 3, 6, and 12 months, whereas the rate of decline in individual MMSE scores was higher in the intervention arm only at three months. The ability of the HVLT-R to detect subtle changes in learning and memory supports its routine use in trials evaluating cranial radiation therapy for BM.
One of the interesting findings of the present study is that a significant decline in the score at four months was observed in the patients who underwent the examination twice (worse-prognosis group; baseline and four months) but not in the patients who underwent the examination three times (better-prognosis group; baseline, four, and eight months). Sixteen (36%) of our patients underwent the four-month HVLT-R but dropped out before the eight-month examination. Their DR scores declined significantly compared to the baseline, and they were also significantly more likely to have declined individual TR and DR scores at four months compared to the patients who completed the eight-month examination. The reason for dropping out by eight months was mostly death from cancer or worsened general condition. A KPS ≤ 70 at the baseline was a risk factor for discontinuation of the HVLT-R and for declined individual TR scores at four months. The KPS is regarded as a prognostic factor for cancers of several primary sites, and it is reasonable that the patients with a low KPS have worse survival and cannot undergo repeated HVLT-R examinations.
Cognitive deterioration determined by the HVLT-R at three or four months was employed as a primary endpoint in two randomized clinical trials (RCTs) comparing SRS alone and SRS + WBRT for patients with one to three brain metastases [3,4]. In a recent publication from North America, Brown et al. reported that there was less cognitive deterioration at three months after SRS alone (40/63, 63.5%) than when SRS was combined with WBRT (44/48, 91.7%). In a smaller RCT from the MD Anderson Cancer Center, Chang et al. reported that patients who were randomly assigned to receive SRS + WBRT were significantly more likely to show a decline in learning and memory function (mean posterior probability of decline 52%) at four months than patients assigned to receive SRS alone (mean posterior probability of decline 24%). It should be noted that the former (Brown et al.) study recruited 213 patients (SRS alone, n = 111, SRS + WBRT, n = 102); however, only 57% (63/111) of the patients in the SRS-alone group and 47% (48/102) of the patients in the SRS + WBRT group underwent HVLT-R at three months. Similarly, in the Chang et al. study, only 67% (20/30) of the patients in the SRS-alone group and 39% (11/28) of the patients in the SRS + WBRT group underwent the HVLT-R at four months. The results of the present study suggest that the four-month decline in NCF is characteristic of the lower-KPS group, and this early decline must be distinguished from the true long-term toxicities that affect the favorable-prognosis group. This finding is supported by the RTOG 0933 trial evaluating hippocampus-avoiding WBRT; Gondi et al. reported that the HVLT-R DR scores of the patients who had died by six months declined significantly over time [10]. This decline was not observed in the patients who were alive at six months.
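The completion rates quoted above follow directly from the tested/enrolled counts given in the text. The short sketch below only re-derives those percentages; the helper name `completion_percent` and the dictionary labels are introduced here for illustration.

```python
def completion_percent(tested: int, enrolled: int) -> int:
    """Percentage of enrolled patients who actually underwent the HVLT-R, rounded."""
    return round(100 * tested / enrolled)


# Counts as quoted in the text: (patients tested, patients enrolled in the arm).
arms = {
    "Brown et al., SRS alone, 3 months": (63, 111),
    "Brown et al., SRS + WBRT, 3 months": (48, 102),
    "Chang et al., SRS alone, 4 months": (20, 30),
    "Chang et al., SRS + WBRT, 4 months": (11, 28),
}

for label, (tested, enrolled) in arms.items():
    print(f"{label}: {tested}/{enrolled} = {completion_percent(tested, enrolled)}%")
```

In both trials, fewer than half of the patients in the SRS + WBRT arm were tested at the primary time point, which is the attrition the argument above relies on.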
Regarding the association between the schedule of WBRT and the HVLT-R score among the 29 patients who underwent both the four- and eight-month tests in the present study, neither the total dose nor the fractionation of WBRT affected the four- and eight-month HVLT-R scores. Wolfson et al. conducted the RTOG 0212 trial comparing three different prophylactic WBRT dose fractionations in LD-SCLC [11]. They used the HVLT, the Controlled Oral Word Association Test (COWAT), and the Trail Making Test (TMT) parts A and B. Chronic neurotoxicity was defined as deterioration in at least one test without the development of brain metastases at 12 months. In a logistic regression analysis, chronic neurotoxicity was associated with the WBRT dose fractionation of 36 Gy in 18 fractions and older age. It seems difficult to assess the relationship between the WBRT dose fractionation and the incidence of 12-month decline in the NCF of BM patients, because the median survival time for all BM patients is approximately eight months [12]. The late adverse effects of therapeutic WBRT for BM will be of clinical importance for long-term survivors.
Our study has several limitations. First, the study population was heterogeneous because we compared the NCF of patients who had undergone different WBRT fractionation schedules. Second, the small sample size made it difficult to elucidate possible confounders with a multivariate analysis. Finally, the degree of radiation-induced NCF deterioration might be overestimated, because it was difficult to eliminate the effect of the regrowth of intracranial tumors.

Patient Population
We enrolled patients who were assigned to therapeutic WBRT for the treatment of BM, skull metastasis, dural metastasis, leptomeningeal dissemination of cancer, or central nervous system (CNS) involvement of non-Hodgkin lymphoma (NHL) from April 2012 to May 2014 at Niigata University Hospital and Niigata Cancer Center Hospital. We also included patients with limited-disease small cell lung cancer (LD-SCLC) who were assigned to prophylactic WBRT. Patients with a neurological deficit that could disturb the neuropsychological examinations, such as a consciousness disorder, hemiparesis, visual defect, or aphasia, were excluded. The dose fractionation of therapeutic WBRT was 35 Gy in 14 fractions over 3 weeks at Niigata University Hospital, and 30 Gy in 10 fractions over 2 weeks at Niigata Cancer Center Hospital. The fractionation of prophylactic WBRT was 25 Gy in 10 fractions over 2 weeks at both institutions. After WBRT, a follow-up neuroradiological examination was performed with contrast-enhanced X-ray computed tomography (CT) or magnetic resonance imaging (MRI) every 2-6 months. Written consent was obtained from all patients or their relatives before WBRT. This study was approved by the Institutional Review Board of Niigata University Hospital (Study #1449).

Neurocognitive Function Assessment
Each patient's NCF was assessed with the HVLT-R (Japanese version), consisting of the total recall (TR), delayed recall (DR), and delayed recognition (DRecog) subdomains. The HVLT-R was performed before (baseline) and at 4 months and 8 months after the completion of WBRT. The HVLT-R test battery contains six different question forms so that the participant cannot memorize the answers. The interval between the day exactly 4 or 8 months after WBRT and the actual date of the follow-up HVLT-R was <1 month. The HVLT-R scores were analyzed in two different ways. One way was an inter-group or intra-group (the baseline, 4, and 8 months) comparison of raw scores with nonparametric tests, and the other way was to judge whether the 4- or 8-month score of a participant had declined from the baseline by at least a specific cutoff value. Significant deterioration in an individual's NCF was defined as a drop from the baseline score by ≥5 points for TR, ≥3 points for DR, and ≥2 points for DRecog [10]. For example, if the baseline TR score was 15 and the 8-month TR score was 9, the 8-month TR score was judged as declined compared to the baseline.
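The cutoff rule for individual decline can be written as a short sketch. The cutoffs (5, 3, and 2 points) and the worked TR example are taken from the text above; the names `DECLINE_CUTOFFS` and `has_declined` are hypothetical and used only for illustration.

```python
# Cutoffs from the text: a drop from baseline of >=5 (TR), >=3 (DR), or >=2 (DRecog) points.
DECLINE_CUTOFFS = {"TR": 5, "DR": 3, "DRecog": 2}


def has_declined(subdomain: str, baseline: int, follow_up: int) -> bool:
    """Return True if the follow-up score dropped from baseline by at least the cutoff."""
    return (baseline - follow_up) >= DECLINE_CUTOFFS[subdomain]


# Worked example from the text: baseline TR 15, 8-month TR 9 (a 6-point drop).
print(has_declined("TR", 15, 9))   # True
# A 4-point TR drop is below the 5-point cutoff, so it is not counted as a decline.
print(has_declined("TR", 15, 11))  # False
```

Dichotomizing each patient in this way is what allows the proportions of decliners to be compared between groups with Fisher's exact test.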

Statistical Analyses
The association between the patient characteristics and the baseline HVLT-R scores in each subdomain was evaluated with the Mann-Whitney U-test and the Kruskal-Wallis test. Among the patients who underwent the 4-month examination and did not undergo the 8-month examination, the HVLT-R scores at the baseline and 4 months after WBRT were compared using the Wilcoxon signed rank test. Among the patients who completed the 8-month examination, we used the Friedman test to compare the baseline, 4-month, and 8-month scores. A univariate Fisher's exact test was performed to determine the factors associated with a significant decline from the baseline score in each HVLT-R subdomain. A p-value <0.05 was considered significant.
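To illustrate how a two-sided Fisher's exact test on a 2 × 2 table (for example, declined vs. not declined by KPS group) is computed, the following is a minimal standard-library sketch. It is not the EZR/R code used in the study; it assumes the common convention of summing the probabilities of all tables that are no more likely than the observed one. The function name and the counts are hypothetical and are not data from this study.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables at most as probable as the observed one;
    # the small relative tolerance guards against floating-point ties.
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical counts (illustration only): rows = two patient groups,
# columns = declined / not declined.
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```

Because the test enumerates exact hypergeometric probabilities, it remains valid for small cell counts such as those in a 45-patient cohort.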
All statistical analyses were performed with EZR (Saitama Medical Center, Jichi Medical University, Saitama, Japan), which is a graphical user interface for R (The R Foundation for Statistical Computing, Vienna, Austria) [13].

Conclusions
The results of this study demonstrated the feasibility of the HVLT-R Japanese version for evaluating the patients' NCF after brain irradiation. We found that the patients with either older age or lower KPS had impaired NCF as assessed with the HVLT-R at the baseline. The four-month NCF deterioration was associated with both the KPS and poor outcome. The WBRT dose fractionation did not have an impact on the incidence of the decline in the NCF at four or eight months. The early decline in the NCF is characteristic of the worse-prognosis group, and thus the NCF of favorable-prognosis patients might better be evaluated later (i.e., at six months or later) after irradiation.