Evaluation of Intensity- and Contour-Based Deformable Image Registration Accuracy in Pancreatic Cancer Patients

We aimed to clarify the accuracy of rigid image registration and deformable image registration (DIR) in carbon-ion radiotherapy (CIRT) for pancreatic cancer. Six patients with pancreatic cancer who were treated with passive irradiation CIRT were enrolled. Three registration patterns were evaluated: treatment planning computed tomography images (TPCT) to CT images acquired in the treatment room (IRCT) in the supine position, TPCT to IRCT in the prone position, and TPCT in the supine position to TPCT in the prone position. After the contours of the original CT images were warped to the destination CT images using deformation matrices from the registration, the warped contours were compared with the contours delineated on the destination CT images using the mean distance to agreement (MDA). Four contours (clinical target volume [CTV], gross tumor volume [GTV], stomach, and duodenum) and four registration algorithms (rigid image registration [RIR], intensity-based DIR [iDIR], contour-based DIR [cDIR], and hybrid intensity- and contour-based DIR [hDIR]) were evaluated. The means ± standard deviations of the MDAs of all contours for RIR, iDIR, cDIR, and hDIR were 3.40 ± 3.30, 2.21 ± 2.48, 1.46 ± 1.49, and 1.46 ± 1.37 mm, respectively. There were significant differences between RIR and iDIR, and between RIR/iDIR and cDIR/hDIR. For the pancreatic cancer patient images, cDIR and hDIR had better accuracy than RIR and iDIR.


Introduction
Because carbon-ion beams have a Bragg peak and a sharper penumbra [1], they can generate a more conformal dose distribution than X-ray beams [2]. However, there is a risk that the dose distribution may change substantially if the target position moves or the contours of the internal organs around the target change. In particular, such changes have been reported for mobile organs that move with respiration [3][4][5][6][7][8]. To treat such organs safely, it is necessary to confirm the reproducibility of the dose distributions. Additionally, confirming the dose distribution on every treatment day yields a more accurate estimate of the delivered dose, which is useful for predicting treatment outcomes and toxicities [9]. To confirm the dose distribution, it is necessary to acquire computed tomography (CT) images on every treatment day, to calculate the dose distributions on those CT images, and to accumulate the overall dose distribution [7,9,10].

Discussion
The accuracy of DIR was better than that of RIR in cases with the same patient position, while the accuracies of cDIR and hDIR for the CTV, stomach, and duodenum were better than those of iDIR in all patient positions, as shown in Figure 3. Meanwhile, for the GTV, the accuracy of iDIR was better than that of the other methods. We assume that iDIR is effective when deformations are small and the boundary of the contour is clear. However, we assume that iDIR is not effective when the boundary of the contour is unclear (as with the CTV), because obtaining pixel-by-pixel correspondence is difficult; the same applies when CT values differ greatly because of changes in gas content (as in the stomach and duodenum). Because cDIR and hDIR are less affected by these effects, they were better than iDIR for the CTV, stomach, and duodenum.
In previous studies, Motegi et al. reported a DSC of 0.96 ± 0.01 for the prostate with hDIR [15], and Sarudis et al. reported a DSC of more than 0.83 for the GTV with iDIR when the GTV was shifted more than 1.6 cm [16]. As the DSCs of the GTV with hDIR and iDIR were 0.88 ± 0.07 and 0.89 ± 0.08, respectively, in this study, our accuracy with hDIR was lower than that reported by Motegi et al. We assume that registration is difficult in pancreatic cancer patients who are imaged in different positions. The American Association of Physicists in Medicine (AAPM) Task Group (TG)-132 guidelines state that the tolerance of the MDA should be 2 mm to 3 mm [17]. The MDA obtained with cDIR was the best in this study, being less than 2 mm in 100% of SP-SP cases and 95.8% of PR-PR cases. However, the proportion in SP-PR cases was much lower (54.2%). The average MDA of the target was approximately 1 mm, in contrast with the errors for the OARs (stomach and duodenum), which were approximately 2 mm. Large morphological changes occur in the bowels because of changes in their content and gases (stomach volume changed by a maximum of 158.97 mL and duodenal volume by 46.24 mL), and CT values can change greatly with changes in gas content. This seems to be one cause of DIR failure. In particular, cDIR had MDA errors of 1.89 ± 2.27 mm in the stomach, and hDIR had errors of 1.51 ± 0.65 mm in the duodenum. In this study, we applied cDIR and hDIR to several contours (CTV, GTV, stomach, and duodenum) simultaneously; however, it is possible that accuracy would increase if each contour were used individually. Manual user configurations, such as Reg refine [18,19], or other new methods are necessary to increase the accuracy of DIR, because the accuracy appears to be limited even when cDIR and hDIR are used.
As shown in Table 1, RIR and cDIR had maximum differences of approximately 10% for the CTV V95, and all registrations had maximum differences of approximately 20% for the stomach V50. The dose distributions in this study were deformed simply according to the deformation matrix, without preserving the total dose with respect to the contour volumes and CT values. Thus, it is unknown whether the differences are reasonable. However, it should be noted that such differences were observed when dose distributions transferred to other CT images were evaluated. It should also be noted that a difference was observed when dose distributions were accumulated across several CT image sets.
The correlation coefficients R between MDA and DSC were high (0.84-0.93), as shown in Figure 4. Although the DSC generally increases as the structure volume increases, the DSC can be used for evaluating DIR accuracy in pancreatic cancer patient images because it has a high correlation with MDA. In contrast, the correlation coefficients between MDA and V95 for the target were only mid-range (0.40 to 0.65), suggesting that differences in the target V95 may decrease as registration accuracy increases. The correlation coefficients between MDA and V50 for the OARs were low (0.01, 0.29, and 0.23 with RIR, iDIR, and cDIR, respectively). Thus, differences in the OAR V50 do not necessarily decrease when registration accuracy increases.
In the contour-based evaluation, cDIR and hDIR were better than RIR and iDIR. However, because they require contours on both CT image sets, they can be used only for obtaining pixel-by-pixel correspondence and accumulating the dose distribution; they cannot be used to transfer contours from one CT image set to another. In such a case, iDIR might be more useful than RIR.
Our study has some limitations. We used only 18 CT image sets (three sets for each of six patients), and further analysis is necessary to evaluate DIR accuracy accurately. Additionally, the registration was evaluated relatively, based on contours. Even if the MDA is 0 (or the DSC is 1), a point in the original CT image set does not necessarily match the corresponding point in the other CT image set. Therefore, it is possible that the evaluation underestimates the DIR error. Moreover, delineation errors are also included in this evaluation, even though the same oncologist delineated all contours.

Patients
This prospective study included eight consecutive patients with pancreatic cancer who were each treated with 12 fractions of passive-irradiation CIRT at our facility between March 2018 and February 2019. The study was performed to evaluate inter-fractional anatomical changes and to calculate the accumulated inter-fraction dose using daily CT images acquired in the treatment room, including determination and evaluation of the most appropriate method. Data from six patients were used in this analysis; data from two patients were excluded because they did not meet the following inclusion criteria: (i) CT image sets were acquired at treatment planning and at other timepoints; (ii) CT image sets were acquired in both supine and prone positions; (iii) the patients had a stomach and duodenum. The patients' characteristics are shown in Table 2. This study was conducted in accordance with the Declaration of Helsinki and was approved by our facility's institutional review board (1564). The study was registered at the University Hospital Medical Information Network Clinical Trials Registry (UMIN-CTR trial number: 000029495). All patients gave written informed consent for inclusion before they participated in the study, and their data were anonymized.

CT Image Acquisition
Twelve CT data sets were acquired for each patient, one on each day of treatment, to investigate the effects of tumor movement and inter-fractional changes on the planned dose. The CT images for treatment planning (PlanCT) were acquired on a scanner in the simulation room (Aquilion LB®, Self-Propelled, Canon Medical Systems, Japan).
On each of the twelve separate radiotherapy days, the patient was irradiated after patient positioning was performed using orthogonal X-ray images [20], with supine positioning being used on days 1 to 9, and prone positioning for days 10 to 12. After the irradiation, a CT data set was acquired in the treatment room with the patient in the same position as that used for the irradiation, and with the same tube voltage, tube current, field of view, and slice thickness settings used for the PlanCT.
Four CT image sets were used in this study: PlanCT images with the patient in the supine position (SP-PlanCT), CT images acquired in the supine position in the treatment room on the first irradiation day (1st-IRCT), PlanCT images in the prone position (PR-PlanCT), and CT images acquired in the prone position in the treatment room on the 10th irradiation day (10th-IRCT). The median period (range) from the SP-PlanCT to the 1st-IRCT was 12 days (8-14 days), that to the PR-PlanCT was 15.5 days (9-20 days), and that to the 10th-IRCT was 27 days (22-29 days).

Treatment Planning and Dose Calculation
The Gunma University Heavy Ion Medical Center (GHMC) provides carbon-ion therapy [21] using a heavy ion irradiation device (Mitsubishi Electric, Japan) with a passive irradiation method [22]. The passive irradiation field was generated using a scatterer and wobbling, and the field was collimated to the outside of the planning target volume (PTV) using a multi-leaf collimator (MLC). A treatment planning system with a pencil-beam algorithm (XiO-N, Elekta Sweden, Mitsubishi Electric, Japan) was used. The relative biological effectiveness (RBE) was included in the absorbed dose using a spread-out Bragg peak concept [23], and the clinical dose, including this, was defined as Gy (RBE).
A radiation oncologist delineated the stomach, duodenum, and intestine (bowels) on each PlanCT while referring to contrast-enhanced CT images. The gross tumor volume (GTV), clinical target volume 1 (CTV1; created by adding 5-mm margins to the GTV, including the prophylactic lymph node, and excluding the bowels + 2 mm [excluding the GTV]), and CTV2 (created by adding 5-mm margins to the GTV and excluding the bowels + 2 mm) were also delineated.
The dose distribution of the anterior-posterior (AP) beam field was calculated on the SP-PlanCT, and the dose distribution was calculated on the 1st-IRCT using the same parameters as used on the SP-PlanCT. The dose distribution of the posterior-anterior (PA) beam field was calculated on the PR-PlanCT, and also on the 10th-IRCT using the same parameters. The prescribed dose was set to 4.6 Gy (RBE). The priority for the AP beam field was to ensure target coverage, while the priority for the PA beam field was to spare organs at risk (OARs). Dose distributions on the 1st-IRCT and 10th-IRCT were analyzed.

Deformable Image Registration Algorithm
Four types of registration algorithm were used: rigid image registration (RIR), intensity-based deformable image registration (iDIR), contour-based deformable image registration (cDIR), and hybrid intensity- and contour-based deformable image registration (hDIR). The DIR algorithms were implemented by the VoxAlign Deformation Engine in MIM Maestro (MIM Software Inc., USA). For cDIR and hDIR, all four contours delineated on each CT image were used: the CTV1 (or CTV2), GTV, stomach, and duodenum.

Data Analysis
Registration from the 1st-IRCT to the SP-PlanCT was defined as SP-SP, registration from the 10th-IRCT to the PR-PlanCT as PR-PR, and registration from the PR-PlanCT to the SP-PlanCT as SP-PR. In each case, the contours on one CT set were warped and transferred to the other CT set using the deformation matrix from each registration. The mean distance to agreement (MDA) and Dice similarity coefficient (DSC) were calculated between the contours delineated on the CT images and the transferred contours. The CTV1 (or CTV2), GTV, stomach, and duodenum were used for the evaluations.
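The two contour metrics above can be illustrated with a minimal sketch. It is not the implementation used by the registration software: the function names are hypothetical, structures are assumed to be given as sets of voxel indices (for the DSC) and lists of surface points in mm (for the MDA), and the MDA is assumed to be the symmetric mean of nearest-neighbor surface distances. The toy coordinates are invented for illustration only.

```python
import math

def dice_coefficient(voxels_a, voxels_b):
    """DSC = 2|A n B| / (|A| + |B|) for two sets of voxel indices."""
    a, b = set(voxels_a), set(voxels_b)
    if not a and not b:
        return 1.0  # two empty structures agree trivially
    return 2.0 * len(a & b) / (len(a) + len(b))

def _mean_nearest_distance(source, target):
    """Mean distance from each source point to its nearest target point."""
    return sum(min(math.dist(p, q) for q in target) for p in source) / len(source)

def mean_distance_to_agreement(surface_a, surface_b):
    """Symmetric mean surface distance (assumed MDA definition), in input units."""
    return 0.5 * (_mean_nearest_distance(surface_a, surface_b)
                  + _mean_nearest_distance(surface_b, surface_a))

# Invented toy data: a 2 x 2 voxel square and the same square shifted by one voxel.
delineated = {(0, 0, 0), (1, 0, 0), (0, 1, 0), (1, 1, 0)}
transferred = {(1, 0, 0), (2, 0, 0), (1, 1, 0), (2, 1, 0)}
dsc = dice_coefficient(delineated, transferred)

# Invented surface points (mm): two parallel edges that are 2 mm apart.
surface_delineated = [(0.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
surface_transferred = [(2.0, 0.0, 0.0), (2.0, 1.0, 0.0)]
mda = mean_distance_to_agreement(surface_delineated, surface_transferred)

print(f"DSC = {dsc:.2f}, MDA = {mda:.2f} mm")
```

The brute-force nearest-neighbor search is quadratic in the number of surface points; it is adequate for a sketch but a spatial index or distance transform would normally be preferred for clinical contours.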
Furthermore, in each case, the dose distribution on one CT set was warped and transferred to the other CT set using the deformation matrix from each registration. The dose-volume parameters were compared between the dose distribution on the original CT set and the transferred dose distribution on the other CT set using the contours delineated on each CT set, and the difference in each parameter was calculated. The volumes of the CTV1 (or CTV2) and GTV receiving greater than 95% of the prescription dose (V95) and the stomach and duodenum V50 and V10 were used for the calculations. Additionally, for each case, the correlations between MDA and DSC, and between MDA and the dose-volume parameters, were calculated.
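As a rough illustration of the dose-volume comparison and the correlation analysis described above, the following sketch computes a Vx value from per-voxel doses and a Pearson correlation coefficient. Several points are assumptions rather than statements from this study: voxels are taken to have equal volume, the threshold is applied as "at least x% of the prescribed dose", the correlation is taken to be Pearson's R, and the function names and dose values are hypothetical.

```python
import math

def volume_percent_at_level(doses, prescribed_dose, level_percent):
    """Percentage of voxels receiving at least level_percent of the prescription.

    Assumes equal voxel volumes, so a voxel count stands in for volume.
    """
    if not doses:
        raise ValueError("structure contains no voxels")
    threshold = prescribed_dose * level_percent / 100.0
    covered = sum(1 for d in doses if d >= threshold)
    return 100.0 * covered / len(doses)

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

PRESCRIBED_GY_RBE = 4.6  # prescribed dose per field, from the text

# Invented per-voxel doses inside the CTV, Gy (RBE).
dose_on_original_ct = [4.6, 4.5, 4.5, 4.0]
dose_transferred = [4.6, 4.5, 4.1, 3.9]

v95_original = volume_percent_at_level(dose_on_original_ct, PRESCRIBED_GY_RBE, 95)
v95_transferred = volume_percent_at_level(dose_transferred, PRESCRIBED_GY_RBE, 95)
v95_difference = v95_transferred - v95_original

print(f"V95: {v95_original:.1f}% -> {v95_transferred:.1f}% "
      f"(difference {v95_difference:+.1f} points)")
```

In practice, `pearson_r` would be applied to paired lists with one entry per registration case, for example the MDA values and the corresponding V95 differences.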
The Bonferroni method was used to correct for multiple comparisons in MDA or DSC measurements between RIR, iDIR, cDIR, and hDIR. A level of p < 0.05 was considered statistically significant. Statistical analyses were performed using SPSS software (IBM SPSS Statistics for Windows, version 25.0, IBM, Inc., Armonk, NY, USA).
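The Bonferroni correction for the pairwise comparisons among the four algorithms can be sketched as follows. This is an illustration only: the analysis in this study was performed in SPSS, the underlying pairwise test is not specified here, the function name is hypothetical, and the raw p-values below are invented placeholders rather than results.

```python
from itertools import combinations

ALGORITHMS = ("RIR", "iDIR", "cDIR", "hDIR")

def bonferroni_adjust(raw_p_values, alpha=0.05):
    """Multiply each raw p-value by the number of comparisons (capped at 1).

    Returns {pair: (adjusted_p, is_significant)}.
    """
    m = len(raw_p_values)
    adjusted = {}
    for pair, p in raw_p_values.items():
        p_adj = min(1.0, p * m)
        adjusted[pair] = (p_adj, p_adj < alpha)
    return adjusted

# Four algorithms give six pairwise comparisons.
pairs = list(combinations(ALGORITHMS, 2))

# Invented raw p-values, one per pair, for illustration only.
raw = dict(zip(pairs, [0.004, 0.0005, 0.0002, 0.003, 0.002, 0.6]))

results = bonferroni_adjust(raw)
for (first, second), (p_adj, significant) in results.items():
    flag = "significant" if significant else "not significant"
    print(f"{first} vs {second}: adjusted p = {p_adj:.4f} ({flag})")
```

Multiplying the p-values by the number of comparisons is equivalent to testing each raw p-value against 0.05 / 6; the adjusted form is shown because it matches how statistical packages usually report the result.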

Conclusions
In this study, we evaluated the accuracy of RIR and of intensity- and contour-based DIR in CIRT for pancreatic cancer patients. We found that DIR accuracy was significantly better than RIR accuracy for inter-fractional CT image sets with the same patient position, and that contour-based DIR and hybrid DIR were significantly better than RIR and intensity-based DIR for inter-fractional CT image sets with different patient positions.