The Effect of Contrast Agents on Dose Calculations of Volumetric Modulated Arc Radiotherapy Plans for Critical Structures

Featured Application: This article investigates contrast-enhanced computed tomography imaging in radiotherapy, highlighting the impact of contrast materials on CT numbers and dose calculations in comparison to non-contrast CT imaging, justifying the use of contrast-enhanced CT imaging to improve utilization and efficiency in radiotherapy simulation.
Abstract: Radiotherapy dose calculation requires accurate Computed Tomography (CT) imaging, while tissue delineation may necessitate the use of contrast agents (CA). Acquiring these two sets is a common practice in radiotherapy. This study aims to evaluate the effect of CA on the dose calculations. Two hundred and twenty-six volumetric modulated arc therapy (VMAT) patients who had planning CT with contrast (CCT) and non-contrast CT (NCCT) of different cancer sites (e.g., brain, head and neck (H&N), chest, abdomen, and pelvis) were evaluated. Treatment plans were recalculated using CCT, then compared to NCCT. The variations in Hounsfield units (HU) and dose distributions for critical structures and target volumes were analyzed using mean HU, mean and maximum relative dose values, D2%, D98%, and 3D gamma analysis. HU variations were statistically significant for most structures. However, this was not clinically significant, as the difference in mean HU values was within 30 HU for soft tissue and 50 HU for lungs. Variations in target volumes' D2% and D98% were insignificant for all sites except brain and nasopharynx. Dose maximum differences were within 2% for the majority of critical structures and target volumes. 3D gamma analysis results revealed that the majority of plans satisfied the 2% and 2 mm criteria. CCT may be acquired for VMAT radiotherapy planning purposes instead of NCCT, since there is no clinically significant difference in dose calculations based on either image set.


Introduction
In clinical radiation therapy, the accuracy of radiotherapy treatment planning and dose calculations requires high-quality medical images to delineate and define planning target volumes (PTV) and critical structures of interests, or organs at risk (OAR). CT is the most suitable modality for radiotherapy treatment planning and dose calculation algorithms due to its geometrical accuracy and tissue density information. In addition, CT data is essential for the treatment planning dose calculation algorithms, as these are dependent on the electron density information embedded in CT images. CT, however, may suffer imaging artifacts due to the presence of high-density materials such as dental fillings, virtual CCT study, Li et al. [24] replaced blood vessels and heart HU values in NCCT images of 22 esophageal cancer patients with higher values up to 445 HU. IMRT and 3DCRT plans were recalculated for comparison. Results suggest a linear increase in calculated dose with increase in HU values. Although the authors recommended replacing contrast material in the heart with a constant value of 45 HU, no significant dose differences were reported. This was also the finding in a more recent artificial intelligence-based study [25] where 40 patients' CCT images were used to generate virtual NCCT images. VMAT plans based on three image sets, NCCT, virtual NCCT, and CCT, were used to evaluate the effect on dose calculations. Results showed no significant dose difference between the image sets for targets and critical structures. Similarly, Shi et al. [26] used CCT images of nine non-small-cell lung cancer patients to calculate 3DCRT and IMRT treatment plans. Virtual non-contrast images were generated by altering HU values in blood vessels. Dose distributions based on the virtual NCCT were compared to those based on CCT images in target volumes, and results suggested no significant differences. Weber et al. 
[27] also reported no significant change in dose calculations for the PTV and organs at risk. The 2001 results indicated that dose differences for targets and critical structures were within 2.5%. In a similar study by Aras et al. [28] of five stomach cancer patients, visible contrast material with an HU value of 500 was assigned a value of 0 HU. To mimic non-contrast images, the stomach density was virtually altered to water-equivalent density. Treatment plans were calculated based on both CCT and virtual NCCT, and average dose differences for both were within 2%. A similar approach was also used in a prostate cancer study [29], which included CCT images for 10 patients. Densities in the bladder and rectum were altered to water density. Box-technique treatment plans were created using CCT and virtual NCCT image sets. No significant differences were reported for either target volumes or critical structures. However, statistically significant dose differences were reported in some virtual studies. Ercan et al. [30] evaluated the dosimetric differences between two treatment plans based on CCT and virtual NCCT images for ten patients and reported statistically significant dose differences. An artefact masking study evaluated the effect of a contrast agent on dose calculation for 17 thoracic cancer patients and reported dose differences within 3% except in two patients [31]. The authors recommended masking CCT artefacts prior to dose calculation.
A DECT study [32] utilized reconstructed NCCT and CCT images at three monochromatic energy levels, 40, 60, and 77 kv, for 15 head and neck cancer patients. Results showed that there were no significant changes in the dose distributions for target volumes and critical structures.
A Monte Carlo simulation study [33] examined the magnitude of tumor dose enhancement achieved by contrast media with various beam qualities and indicated that for flattened photon beams, the dose enhancement was less than 5%. However, it was significantly higher for flattening-filter-free beams, up to 23.1%.
A small-field stereotactic radiosurgery (SRS) study [15] reported maximum dose differences of up to 20% between NCCT- and CCT-based plans and recommended that care should be taken when considering CCT images for dose calculations. Another study [34] suggested a correction strategy with CCT images to overcome dose differences of up to 5%.
In a comparative study [35], the original treatment plans of nine lung cancer patients were recalculated using CCT images. Results showed no significant effect of the contrast agent on dose calculations.
The reviewed literature, summarized in Table 1, generally reported a limited effect of contrast agents on radiotherapy dose calculations, except in a few publications [13,16–18,20,30]. Nevertheless, there was no consensus on the magnitude of these effects, nor a clear recommendation to utilize contrast-enhanced CT simulation images for radiotherapy dose calculations.
Table 1. Literature search summary for the effect of contrast agents on radiotherapy dose calculations. Superscripts denote the correction method used to overcome the effect of contrast material. a: resetting structure density to water equivalent density; b: changing structure density using structure density from diagnostic images for the same structure; c: overriding the HU by other values; d: using virtual monochromatic images in DECT; e: using artificial intelligence to correct HU values; f: altering densities based on simulated CT images.
The longitudinal study presented in this work aims to answer the question: Is CCT imaging an acceptable alternative to NCCT for TPS dose calculations? This will be carried out by investigating and then evaluating the effect of CCT on the treatment planning dose calculations for all concerned critical structures and target volumes using the VMAT technique for different anatomical sites: brain, H&N, chest, abdomen, and pelvis.

Materials and Methods
All patients that were treated using VMAT and who underwent both NCCT and CCT sequentially during the same radiotherapy CT simulation session as part of their normal clinical pathway were selected. This included all radiotherapy patients treated from January of 2011 to December of 2018 at the Comprehensive Cancer Center, King Fahad Medical City (KFMC). Two hundred and twenty-six patients of different cancer sites met the selection criteria and hence were included in this study; they are summarized in Table 2. The selected patients were grouped according to the treatment site, 70 brain, 106 H&N (90 nasopharynx, 9 thyroid, and 7 tongue), 19 chest (7 lung, 7 mediastinum, and 5 esophagus), 19 abdomen (11 non-Hodgkin's lymphoma (NHL), 6 pancreas, and 2 liver), and 12 pelvis (4 cervix, 3 uterus, 3 rectum, and 2 prostate). CT images with motion artifacts or registration accuracy exceeding 1 mm were excluded. This study was approved by the KFMC institutional review board with number 18-350.

CT Image Acquisition
CT images were acquired using a large-bore radiotherapy CT simulator (Somatom, Siemens, Germany) according to the departmental protocol for each treatment site. NCCT and CCT images were acquired sequentially for each patient within 20 min in the same session. The NCCT scan was performed first; then, with the same setup and position, the patient was injected intravenously with contrast agent using an automatic injection system, the MEDRAD® Stellant CT Injection System (Bayer AG, Berlin, Germany), and the scan was repeated with the same scan conditions. The contrast material was XENTIX 300. This material contains an iodinated contrast agent (300 mg I/mL). The enhanced scans were commenced about 100 s after contrast material injection. The time interval between both scans was <4 min. The NCCT and CCT image sets were rigidly registered, and structure sets from the original NCCT were copied to the CCT image sets for each patient.

Treatment Planning and Evaluation
The original patient treatments were planned and calculated in the Eclipse TPS version 13.6 (Varian Medical Systems, Palo Alto, CA, USA) based on NCCT in accordance with standard departmental policies. VMAT treatment planning was carried out for all patients using 6 MV X-ray beams with the anisotropic analytical algorithm (AAA). The same plan for each patient was copied and recalculated using the corresponding CCT image set instead of the NCCT image set. These treatment plans were neither re-optimized nor modified in any way. The dose distribution and monitor unit (MU) settings were then calculated. The resulting CCT-based treatment plans were evaluated and compared to the corresponding original NCCT-based treatment plans in terms of dose distribution and HU, with emphasis on OARs and PTVs.
In addition to the investigation of differences in absolute mean HU values for critical structures and PTV between CCT and NCCT image sets, statistical evaluation included the mean dose, maximum dose, median dose, the dose received by 98% of the PTV volume (D98%), and the dose received by 2% of the PTV volume (D2%). For critical structures, the percentage difference (D%) between relative doses in CCT-based (D_CCT) and NCCT-based (D_NCCT) plans was calculated using Equation (1). A difference of 2% or less is considered acceptable, and relative doses less than 10% of the prescribed dose were omitted from evaluation to avoid overestimating errors.
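The dose metrics described above can be illustrated with a short Python sketch. It is only a sketch under stated assumptions: Equation (1) is not reproduced in this text, so the form used here (difference relative to the NCCT value) is an assumption; D2% and D98% are taken in their conventional sense (the dose received by at least 2% and 98% of the structure volume); voxels are assumed to have equal volume; and the function names are hypothetical.

```python
def dose_at_volume(voxel_doses, volume_percent):
    """Return D_x%: the dose received by at least x% of the structure volume.

    Assumes every voxel has the same volume, so x% of the volume is x% of
    the voxel count (a simplification of a real DVH lookup).
    """
    if not voxel_doses:
        raise ValueError("voxel_doses must not be empty")
    ranked = sorted(voxel_doses, reverse=True)  # hottest voxel first
    count = max(1, round(len(ranked) * volume_percent / 100.0))
    return ranked[count - 1]


def percent_difference(d_cct, d_ncct):
    """ASSUMED form of Equation (1): difference relative to NCCT, in percent."""
    return (d_cct - d_ncct) / d_ncct * 100.0


def evaluate_structure(d_cct, d_ncct, low_dose_cutoff=10.0, tolerance=2.0):
    """Compare relative doses (in % of the prescribed dose) for one structure.

    Returns None when the NCCT relative dose is below the 10% cutoff
    (omitted from evaluation), otherwise (difference, acceptable).
    Applying the cutoff to the NCCT value is an assumption.
    """
    if d_ncct < low_dose_cutoff:
        return None
    diff = percent_difference(d_cct, d_ncct)
    return diff, abs(diff) <= tolerance


if __name__ == "__main__":
    doses = [float(d) for d in range(1, 101)]  # 100 synthetic voxel doses
    print("D2%  =", dose_at_volume(doses, 2))
    print("D98% =", dose_at_volume(doses, 98))
    print(evaluate_structure(50.5, 50.0))  # small difference, acceptable
    print(evaluate_structure(48.0, 50.0))  # exceeds the 2% tolerance
    print(evaluate_structure(5.0, 4.0))    # below the 10% cutoff, omitted
```

A clinical DVH would weight each voxel by its partial volume inside the structure; the equal-volume shortcut is used here only to keep the example small.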
Furthermore, the 3D gamma evaluation method was used to evaluate the volumetric dose difference between CCT- and NCCT-based plans. The criteria were 2.0 mm distance to agreement (DTA) and 2.0% dose difference (DD), with suppression of dose values less than 10% of the prescribed dose. The 3D gamma evaluation was carried out using the Verisoft package version 5.1 (PTW, Freiburg, Germany).
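For readers unfamiliar with gamma evaluation, the following is a minimal brute-force sketch of a 3D gamma pass rate with the 2%/2 mm criteria and 10% low-dose suppression. It is not the Verisoft implementation: it assumes both dose grids share the same geometry, searches grid points only (no interpolation), and normalizes the dose criterion globally to the prescribed dose, which is an assumption because the text does not state whether global or local normalization was used. The function name and arguments are hypothetical.

```python
import math
from itertools import product


def gamma_pass_rate(ref, ev, spacing_mm, dd_percent=2.0, dta_mm=2.0,
                    prescription=100.0, cutoff_percent=10.0):
    """Percentage of reference voxels with gamma <= 1.

    ref, ev    : nested lists indexed [z][y][x], doses on the same grid
    spacing_mm : voxel spacing as (z, y, x) in millimetres
    """
    nz, ny, nx = len(ref), len(ref[0]), len(ref[0][0])
    dd = dd_percent / 100.0 * prescription          # global dose criterion
    threshold = cutoff_percent / 100.0 * prescription
    # Voxels farther away than the DTA always give gamma > 1, so the search
    # window can be limited to ceil(DTA / spacing) voxels per axis.
    reach = [math.ceil(dta_mm / s) for s in spacing_mm]
    offsets = list(product(*(range(-r, r + 1) for r in reach)))
    passed = total = 0
    for z, y, x in product(range(nz), range(ny), range(nx)):
        d_ref = ref[z][y][x]
        if d_ref < threshold:                       # low-dose suppression
            continue
        total += 1
        best = math.inf                             # smallest gamma squared
        for dz, dy, dx in offsets:
            zz, yy, xx = z + dz, y + dy, x + dx
            if not (0 <= zz < nz and 0 <= yy < ny and 0 <= xx < nx):
                continue
            dist2 = ((dz * spacing_mm[0]) ** 2 + (dy * spacing_mm[1]) ** 2
                     + (dx * spacing_mm[2]) ** 2)
            dose2 = (ev[zz][yy][xx] - d_ref) ** 2
            best = min(best, dist2 / dta_mm ** 2 + dose2 / dd ** 2)
        if best <= 1.0:
            passed += 1
    return 100.0 * passed / total if total else float("nan")


if __name__ == "__main__":
    def uniform(value, n=3):
        return [[[value] * n for _ in range(n)] for _ in range(n)]

    ncct = uniform(100.0)
    print(gamma_pass_rate(ncct, uniform(101.0), (1.0, 1.0, 1.0)))  # 1% offset
    print(gamma_pass_rate(ncct, uniform(105.0), (1.0, 1.0, 1.0)))  # 5% offset
```

Because no sub-voxel interpolation is performed, this sketch can report a lower pass rate than a clinical tool on coarse grids with steep dose gradients.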

Statistical Analysis
Mean HU values, dose values, and volumes were measured using built-in Eclipse tools and dose volume histograms (DVH); all 226 patients were included in the analysis. Data analysis was performed using the Statistical Package for Social Sciences (SPSS) version 25 (SPSS Inc., Chicago, IL, USA). A p-value of less than 0.05 was considered statistically significant. The statistical analysis workflow is shown in Figure 1, and is divided into three steps:

1. Normality test (Shapiro-Wilk test), which shows whether a dataset is normally distributed or exhibits a non-normal distribution [40].
2. Mann-Whitney-Wilcoxon test to evaluate the differences between two independent non-normal distributions [41].
3. Student's t-test to evaluate the difference between two paired normal distributions [42].
In the case of a group of fewer than three patients, where the normality test is not applicable, the nonparametric related-samples Wilcoxon signed rank test was used.
Appl. Sci. 2021, 11, x FOR PEER REVIEW 7 of 22
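The workflow above can be sketched in Python with SciPy standing in for SPSS. This is an illustrative sketch rather than the analysis actually run: it assumes the paired (related-samples) Wilcoxon signed rank test is used whenever either sample fails the normality test, as reported in the Results, and the function name `compare_paired` is hypothetical.

```python
from scipy import stats

ALPHA = 0.05  # significance level used throughout the study


def compare_paired(ncct, cct, alpha=ALPHA):
    """Choose and run a paired test for one structure's NCCT/CCT values.

    Returns (test_name, p_value). Shapiro-Wilk needs at least three
    samples; smaller groups go straight to the Wilcoxon signed rank test.
    """
    if len(ncct) != len(cct):
        raise ValueError("NCCT and CCT samples must be paired")
    normal = False
    if len(ncct) >= 3:
        normal = (stats.shapiro(ncct).pvalue > alpha
                  and stats.shapiro(cct).pvalue > alpha)
    if normal:
        return "paired t-test", stats.ttest_rel(ncct, cct).pvalue
    return "wilcoxon signed-rank", stats.wilcoxon(ncct, cct).pvalue


if __name__ == "__main__":
    # Synthetic mean-HU values with one extreme outlier (clearly non-normal).
    ncct = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 500.0]
    cct = [v + 1.5 + i for i, v in enumerate(ncct)]
    name, p = compare_paired(ncct, cct)
    print(name, "significant" if p < ALPHA else "not significant")
```

Testing each image set for normality separately mirrors how the Results report normality per image set; testing the paired differences instead would be an equally defensible choice.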

Results
The statistical analysis outcomes for all 226 patients, in terms of the differences in HU, maximum dose, and average dose for all critical structures, are summarized in Tables 3-5.
Table 3. Summary of statistical analysis results for critical structures in brain and head and neck. "YES" indicates a statistically significant difference in CT number (HU), maximum dose (Dmax), and mean dose (Davg) values between NCCT and CCT images, and "NO" indicates otherwise.
Table 5. Summary of statistical analysis results for critical structures in brain and abdomen and pelvis. "YES" indicates a statistically significant difference in CT number (HU), maximum dose (Dmax), and mean dose (Davg) values between NCCT and CCT images, and "NO" indicates otherwise.
The study included 70 brain cancer patients; the effect of CA was investigated for 11 critical structures. In terms of HU, the difference in mean HU values between NCCT and CCT for most critical structures did not exhibit a normal distribution. Critical structures that presented a normal HU distribution were the brain stem (p = 0.25) and left cochlea (p = 0.31) in both CCT and NCCT images, and the right cochlea (p = 0.61) in NCCT images. The Wilcoxon test indicated that the HU value differences between CCT and NCCT images were statistically significant for all structures (p < 0.05) except the left cochlea, right cochlea, left lens, and right lens (p > 0.05).

In terms of dose distribution, normality test results for maximum and mean dose values relative to the prescribed dose for the 11 critical structures were found not to follow a normal distribution, except for the mean relative dose values in the brain stem (both NCCT and CCT sets, with p-values of 0.11 and 0.10, respectively) and the maximum relative dose values in the brain (both NCCT and CCT sets, with p-values of 0.66 and 0.31, respectively). The paired samples t-test was used to evaluate structures with a normal distribution, while a nonparametric test (related-samples Wilcoxon signed rank test) was used for the other structures. The contrast material effect was significant for all critical structures except for the left and right lenses. The relative maximum dose percentage differences between critical structures in NCCT plans and the corresponding ones in CCT plans are shown in Figure 2. This figure also shows the number of cases where the relative percentage dose difference exceeds 2%.
Figure 2. Maximum relative dose difference and number of patients with relative dose difference greater than 2% between NCCT and CCT for brain critical structures. The largest differences in small structures are attributed to registration accuracy.

PTV Evaluation
Normality test for target volumes revealed that the PTVs and mean HU values do not exhibit normal distribution. Mann-Whitney-Wilcoxon test indicated that there was a statistically significant difference in HU between NCCT and CCT for all patients in this site. Similarly, both D2% and D98% values were normally distributed (see Table 6) and t-test indicated a significant difference between these two parameters for both NCCT and CCT images.
A total of 90 nasopharynx cancer patients were included in the study; the contrast agent's effect was evaluated for 20 critical structures. In terms of mean HU, the normality test indicated that for most critical structures, HU was not normally distributed. The structures that demonstrated normal distributions were the parotid glands in both NCCT and CCT images; the brain stem, mandible, and right brachial plexus in NCCT images; and the larynx in CCT images. The differences in mean HU between CCT and NCCT were statistically significant for 16 critical structures. The other four, which showed insignificant mean HU differences, were the left cochlea, right cochlea, left lens, and esophagus.
In terms of relative dose distribution differences, normality test revealed that most of the 20 critical structures did not exhibit a normal distribution, except brain stem, brain, oral cavity, esophagus, larynx, and spinal cord. Wilcoxon signed rank test and t-test indicated that dose differences were found to be significant only for brain stem, brain, optic nerves, esophagus, optic chiasm, right brachial plexus, and spinal cord. The number of patients with a maximum relative dose difference of more than 2% for each critical structure is summarized in Figure 3.

PTV Evaluation
Mean HU values in PTV showed normal distribution for both NCCT and CCT image sets based on normality test, and there was a statistically significant HU difference between NCCT and CCT based on t-test.
Mean relative dose difference values for D2% and D98% were within 1% of each other for both NCCT- and CCT-based plans (see Table 7), even though there were statistically significant differences in D2% and D98% values between NCCT- and CCT-based plans.
Table 7. Statistical analysis results for all nasopharyngeal cancer patients' plans; statistically significant differences between NCCT- and CCT-based plans were found even though these differences are within ±1%.

Tongue Critical Structures Evaluation
Mean HU statistical analysis for CCT and NCCT image sets included 10 critical structures in 7 tongue cancer patients. The majority of critical structures presented HU normal distribution based on the p-value for Shapiro-Wilk test except the brain and mandible. Based on normality test results, Wilcoxon and Student's tests revealed statistically significant differences in mean HU values between NCCT and CCT for all the critical structures except the mandible (p-value = 0.1763).
In terms of relative dose percentage differences, Shapiro-Wilk test revealed that the differences in dose values between NCCT and CCT images for all critical structures were not significant except the mandible (p-value = 0.043). In addition, differences between CCT and NCCT dose distributions for all critical structures were within 2%, except for the larynx and mandible, where they were 3.7% and −2.3%, respectively.

PTV Evaluation
PTV mean HU variations in both NCCT and CCT image sets demonstrated a normal distribution, and the differences between the two image sets were statistically significant.
In terms of dose distribution, the normality test results for both D2% and D98% presented a normal distribution. The t-test p-values (>0.05) indicated that the contrast agent had no significant effect on PTV dose distributions.

Thyroid Critical Structures Evaluation
The study included nine critical structures in nine thyroid cancer patients. Mean HU value statistical analysis revealed a normal distribution according to Shapiro-Wilk normality test for all critical structures. HU evaluation using Student's test revealed statistically significant differences between NCCT and CCT image sets for all critical structures except larynx and mandible, with p-values of 0.2989 and 0.0597, respectively.
In terms of relative dose distribution, the percentage differences in dose distributions between NCCT and CCT image sets-based plans presented a normal distribution in all critical structures except the brain, brain stem, and left brachial plexus. Wilcoxon signed rank test and Student's t-test indicated that there were no statistically significant differences between dose distributions calculated based on NCCT and CCT image sets, except the difference of the mean relative dose value for the mandible (p-value = 0.043). The percentage difference between NCCT and CCT scans for all critical structures was less than 2% except for the larynx and mandible, which were 3.7% and −2.3%, respectively.

PTV Evaluation
Differences in mean HU values between NCCT and CCT image sets in PTV presented a normal distribution, and the subsequent t-test indicated a statistically significant difference between NCCT and CCT image sets (p-value = 0.0068).
In terms of dose distribution, normality test indicated that the differences in both D2% and D98% between NCCT and CCT image sets-based plans followed a normal distribution. However, t-test indicated that these differences were statistically insignificant.

Chest
The study included 5 critical structures in 19 cancer patients with tumors in the chest area: these included 7 mediastinum, 7 lung, and 5 esophagus cancer patients.

Mediastinum Critical Structures Evaluation
Shapiro-Wilk test revealed that the mean HU values differences distributed normally for the right lung and esophagus, and otherwise for the heart, spinal cord, and left lung. Based on normality test, there were insignificant differences in HU values between NCCT and CCT images for all structures except the spinal cord and esophagus, with p-values of 0.028 and 0.0002, respectively.
In terms of dose distribution, the maximum difference between NCCT and CCT sets-based plans was −2.9% (see Figure 4).

PTV Evaluation
The mean HU values in NCCT and CCT image sets presented normal distributions, and the differences between the two image sets were statistically significant according to t-test.
D2% and D98% values also presented normal distributions; the t-test indicated insignificant D2% differences between NCCT- and CCT-based plans but significant differences for D98%.

Lung Critical Structures Evaluation
Shapiro-Wilk test indicated that mean HU values for critical structure in both NCCT and CCT image sets were following a normal distribution except for the heart in the CCT image set. Based on normality test, there were significant differences in HU values between NCCT and CCT image sets in all critical structures except the spinal cord and esophagus, with p-values of 0.99 and 0.88, respectively.
In terms of dose distribution, normality test revealed that all critical structures followed a normal distribution in both NCCT and CCT image sets, except the lungs maximum relative dose values. Based on the normality test, the dose distribution differences between the two image sets were statistically significant in the heart and spinal cord only. The maximum difference between NCCT and CCT image sets-based plans was within ±2%, except for one patient where it was −3.3%.

PTV Evaluation
Mean HU values presented normal distributions for both NCCT and CCT image sets based on normality test. The differences in HU values between the two image sets were statistically significant.
Normal distribution was indicated for D2% values but not for D98% in the PTV. T-test suggests that the differences in D2% were statistically insignificant, while those in D98% were statistically significant.

Esophagus Critical Structures
Shapiro-Wilk test indicates that the mean HU values presented a normal distribution in the right lung, esophagus, and heart in both NCCT and CCT image sets, in addition to the left lung in CCT image sets. Mean HU values in the spinal cord for both image sets and in the left lung for NCCT image sets did not follow a normal distribution. Wilcoxon test and t-test indicated that the HU differences between NCCT and CCT image sets were statistically significant for all critical structures.
In terms of dose distribution, eight plans exceeded the ±2% dose difference between NCCT and CCT image sets-based plans, with a maximum difference of −5.05% for the esophagus (see Figure 5). Normality test indicated that the dose distribution presents a normal distribution in all critical organs for both NCCT and CCT image sets except the maximum relative dose values for the left lung and heart. The percentage differences in dose distribution between these two were statistically insignificant for all critical structures except the heart mean dose value, according to Wilcoxon test and t-test.

PTV Evaluation
Normality test indicated that the mean HU values in PTV present a normal distribution in both NCCT and CCT image sets. Consequent t-test indicated statistically significant differences in mean HU values between NCCT and CCT image sets.
In terms of dose distribution, normality test revealed normal distributions for both D2% and D98%. T-test indicated a statistically significant difference in D2% values between NCCT and CCT and a statistically insignificant difference in D98% values.

Non-Hodgkin's Lymphoma (NHL) Critical Structures Evaluation
Based on normality, mean HU values in both NCCT and CCT image sets presented normal distributions in all critical structures except in the spinal cord. Evaluating the differences in HU values between NCCT and CCT image sets indicated statistically significant differences in all critical structures.
In terms of relative dose distribution, Shapiro-Wilk test for the dose values indicated that these values were normally distributed in all critical structures except the liver maximum dose values. Wilcoxon test and Student's t-test revealed statistically significant differences between dose values in all critical structures except the small intestine maximum dose values. The maximum dose percentage difference between NCCT and CCT sets was −2.55%. This appeared in one patient, while the maximum dose differences in the other patients' plans were within 2%.

PTV Evaluation
In PTV, normality test indicated that mean HU values present normal distributions in both NCCT and CCT image sets. Wilcoxon test for the differences in HU values between the two image sets proved these were statistically significant (p-value < 0.0001).
In terms of dose distribution, normality test for dose values in PTV volumes indicated normal distributions for both D2% and D98%. T-test, however, determined that the differences were statistically insignificant for D2% but statistically significant for D98%.

Pancreas Critical Structures
Shapiro-Wilk test indicated that mean HU values present normal distribution in most critical structures in both NCCT and CCT image sets; the critical structures that did not exhibit normal distributions were left kidney and liver in NCCT images, and the spinal cord in both image sets. Consequent Wilcoxon test and Student's t-test revealed that differences in HU values between NCCT and CCT image sets were statistically significant in all critical organs except the spinal cord.
In terms of dose distribution, the maximum difference between NCCT and CCT image sets-based plans was −2.29% in one patient while all the others were within 2%. Dose values in all critical structures presented normal distributions according to Shapiro-Wilk test, and the differences in dose values were statistically significant for all critical structures except the left kidney and small intestines.

PTV Evaluation
PTV mean HU values in both NCCT and CCT image sets, evaluated with Shapiro-Wilk test, presented normal distributions, and parametric t-test revealed statistically significant differences between these two image sets (p-value = 0.0007).
In terms of dose distribution, Shapiro-Wilk test showed that PTV dose values were normally distributed for D2% but not for D98% in both NCCT and CCT image sets.
Consequently, t-test indicated statistically insignificant differences in D2% and D98% in the PTV between the two image sets, with p-values of 0.09 and 0.11, respectively.

Liver Critical Structures
Normality was not applicable in this case because the sample included two patients only. Nonparametric related-samples Wilcoxon signed rank test was used to evaluate the difference in mean HU values between NCCT and CCT image sets. Results indicated that the differences were statistically insignificant for all four critical structures.
In terms of dose distribution, the max dose difference between NCCT and CCT sets was within ±2% for all critical structures. Percentage dose differences between NCCT-and CCT-based plans were insignificant for all critical structures according to nonparametric related-samples Wilcoxon signed rank test.

PTV Evaluation
Nonparametric related-samples Wilcoxon signed rank test was used to evaluate mean HU values variation in PTV. Results indicated statistically insignificant HU value differences in PTV volumes for both NCCT and CCT image sets (p-value = 0.18).
In terms of dose distribution, the differences in D2% and D98% were statistically insignificant between NCCT and CCT image sets (p-value = 0.66 for both).
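With only two patients, a signed-rank test has very few possible outcomes. The sketch below enumerates them, assuming the normal approximation to the Wilcoxon signed-rank statistic without continuity or tie correction; the source does not state which variant was used, and the function names are illustrative only.

```python
import math
from itertools import product


def wilcoxon_asymptotic_p(differences):
    """Two-sided p-value from the normal approximation to the Wilcoxon
    signed-rank statistic (assumes no zero and no tied differences)."""
    n = len(differences)
    # Rank absolute differences: 1 for the smallest, n for the largest.
    order = sorted(range(n), key=lambda i: abs(differences[i]))
    ranks = [0] * n
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    # Sum of ranks belonging to positive differences.
    w_plus = sum(r for r, d in zip(ranks, differences) if d > 0)
    mean = n * (n + 1) / 4
    sd = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w_plus - mean) / sd
    # Two-sided tail area of the standard normal distribution.
    return math.erfc(abs(z) / math.sqrt(2))


def attainable_p_values(n):
    """All p-values (rounded to 2 decimals) reachable with n untied pairs."""
    magnitudes = [float(k) for k in range(1, n + 1)]  # placeholder sizes
    return sorted({
        round(wilcoxon_asymptotic_p([s * m for s, m in zip(signs, magnitudes)]), 2)
        for signs in product([1, -1], repeat=n)
    })


print(attainable_p_values(2))
```

Under these assumptions, only two p-values are attainable for a sample of two, roughly 0.18 and 0.65, which is close to the values reported for the two-patient groups. A sample of this size therefore cannot reach the 0.05 level regardless of the size of the difference.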

Pelvis
The study included 10 critical structures in 12 patients with pelvic malignancies, 4 cervix, 3 uterus, 3 rectum, and 2 prostate cancer patients.

Cervix Critical Structures Evaluation
Shapiro-Wilk test indicated that mean HU values for critical structures in both NCCT and CCT image sets were following a normal distribution except the bowels in the CCT image set and the bladder in the NCCT image set. The differences in HU values were statistically insignificant between NCCT and CCT image sets in all critical structures except the rectum and kidneys (p-values < 0.04).
In terms of dose distribution, normality test revealed that all critical structures followed a normal distribution in both NCCT and CCT image sets, except rectum (CCT), left femur (NCCT), and the vagina (NCCT). Based on the normality test, the dose distribution differences between NCCT and CCT image sets were statistically insignificant in all the critical structures except the bowels (p-value = 0.37). The maximum difference between NCCT and CCT image sets-based plans was within ±2%.

PTV Evaluation
PTV mean HU values presented normal distribution in both NCCT and CCT image sets according to normality test. T-test indicated a statistically significant difference in HU values between NCCT and CCT image sets (p-value = 0.0125).
In terms of dose distribution, normality test indicated normal distributions for both D2% and D98% values in the PTV. T-test showed insignificant differences in D2% and D98% values between the two image sets, with p-values of 0.37 and 0.19, respectively.

Uterus Critical Structures Evaluation
Mean HU values presented normal distribution in all critical organs in both NCCT and CCT image sets except in the rectum, according to Shapiro-Wilk test. The differences in HU values between the two image sets were statistically insignificant except in the bowels and kidneys, with p-value ≤ 0.03.
Statistical evaluation indicated that dose values were distributed normally except for the left femur (NCCT), right kidney, and left kidney (CCT), and the differences in dose values were statistically insignificant in all critical structures for both image sets. In terms of relative dose distribution in the critical structures, the percentage maximum difference between NCCT and CCT image sets was less than 2%.

PTV Evaluation
Mean HU values in the PTV presented normal distribution in NCCT and CCT image sets, as indicated by Shapiro-Wilk test. Differences in these HU values between the two image sets were statistically significant (p-value = 0.03).
In terms of dose distribution, D2% and D98% had normal distributions in PTV volumes for both image sets. The differences, however, were statistically insignificant for both dose indicators, with p-values of 0.56 and 0.78, respectively.

Rectum Critical Structures Evaluation
Mean HU values in all nine critical structures demonstrated a normal distribution except the spinal cord in the CCT image set, according to Shapiro-Wilk test. The differences in HU values between NCCT and CCT image sets were statistically insignificant for all critical structures except the prostate and penile bulb, with p-values of 0.01 and 0.04, respectively.
The dose values in all critical structures presented normal distributions except spinal cord (mean, CCT), and right femur (max, NCCT), and the differences in dose values between the two image sets-based plans were statistically insignificant. Percentage dose differences between NCCT and CCT image sets-based plans in all critical organs were within ±2%.

PTV Evaluation
Mean HU values in both CCT and NCCT image sets presented normal distribution, and the differences in HU values were statistically significant (p-value = 0.02).
In terms of dose distribution, Shapiro-Wilk test indicated that dose values in the PTV followed normal distribution for both D2% and D98%. The differences in dose distribution were, however, statistically insignificant, with p-values of 0.86 and 0.39, respectively.

Prostate Critical Structures
Since there were only two prostate patients, Shapiro-Wilk test was not applicable. Nonparametric related-samples test was applied directly to assess the difference in mean HU values between NCCT and CCT image sets; this revealed no statistically significant differences in HU for any of the critical structures.
Similar to mean HU, the differences in dose values were statistically insignificant for all critical structures. In terms of dose distribution, the max dose difference between NCCT and CCT sets was within ±2% for all critical structures.

PTV Evaluation
Differences in HU values in PTV volumes between the two image sets were statistically insignificant (p-value = 0.18). Similarly, there were no statistically significant differences in D2% and D98% between the two image sets: both dose levels had a p-value of 0.65.

Gamma Evaluation Analysis
The gamma evaluation results satisfied the 2% and 2 mm criteria for 212 patients; the other 14 patients failed these criteria, with a chest plan recording the lowest pass rate of 88.6% (see Figure 6). All 226 patients, however, satisfied the 3% and 3 mm criteria, with an average pass rate of 98.2 ± 1.2%.

Discussion
The presence of contrast agents in the body and the resulting distortions are clearly visible in planning CT images, which raises a legitimate question about the accuracy of radiotherapy treatment planning. This concern is reflected in the common radiation therapy practice of acquiring two CT image sets, NCCT and CCT, which inconveniences the patient and adds cost. In addition, published work presents inconsistent findings, making it even more challenging to quantify the practical significance of the resulting dosimetric effect in a clinical sense. In this study, we have demonstrated the extent of the effect of contrast materials on CT images in terms of HU and the magnitude of the consequent dosimetric effect in clinically relevant setups. This study also involved a larger number of patients than previous studies and included all major cancer treatment sites, in contrast to site-specific studies. In addition, other factors affecting the quality of the statistical analysis, such as registration effects, which have rarely been mentioned in previously published studies, were addressed in this study.
Published studies that evaluated the effect of contrast agents on radiotherapy dose calculations have discussed conventional radiotherapy, 3DCRT, IMRT, and, more recently, VMAT (see Table 1). These studies can be grouped into two main categories: the first is based on phantoms and mathematical models, and the second is based on clinical CT images. Studies in the first category lean toward simulating clinical observations based on the amount of contrast material and beam energy; their results generally tend to emphasize the apparent significant effect of contrast material rather than the clinical effect [13,16,17]. Studies in the second category generally place emphasis on clinical outcome, and their reported dose differences were mostly within clinically acceptable limits.
The results of the reviewed literature, summarized in Table 1, indicate that the majority of studies suggest a clinically insignificant effect of contrast agents on radiotherapy dose calculations for both critical structures and target volumes. Their conclusions, however, are usually conservative and recommend using NCCT images or applying density corrections. Several authors also reported considerable discrepancies between dose calculations based on NCCT and CCT image sets. Shibamoto et al. [16] reported a significant effect on the planning of upper-abdominal irradiation, as MU increased by more than 2%. Nasrollah et al. [17] reported a statistically significant dose difference in the lower esophageal region target volume. Ercan et al. [30] also reported statistically significant dose differences between NCCT- and CCT-based plans for both target volumes and critical structures. Jing et al. [20] concluded that oral contrast agents caused clinically significant changes in the dose calculations for the targets and critical structures. Rankine et al. [13] reported significant dose discrepancy for critical structures, and Burridge et al. [34] found a considerable increase in MU settings due to the presence of contrast agents in CCT images.
The results in this study, generally consistent with published results in the literature, show that HU variations due to contrast materials were statistically significant (p-value < 0.05) for the majority of critical structures and target volumes. However, this did not imply clinical significance, since the variations in HU values were within 30 HU for the vast majority of soft tissue and less than 50 HU for lungs and bony structures. Variations in the ICRU 83 recommended PTV dose metrics D2% and D98% were statistically insignificant for all sites except brain and nasopharynx. Even for these two sites, the statistical differences did not result in clinically significant dose metric differences between NCCT- and CCT-based plans, because the differences in D2% and D98% values were less than 2% (see Tables 6 and 7). Relative dose maximum differences were within 2% for the majority of critical structures. Similarly, in target volumes, minimum, maximum, mean, and median values for D2% and D98% were within 1.5%.
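The near-maximum and near-minimum metrics D2% and D98% can be read off the ranked voxel doses of a structure. The following is a minimal sketch, assuming equal-volume voxels and no interpolation of the cumulative dose-volume histogram (clinical planning systems typically interpolate); the function names, the sample doses, and the sign convention of the percentage difference are illustrative assumptions, not taken from the study.

```python
import math


def dose_at_volume(voxel_doses, volume_pct):
    """Minimum dose covering the hottest `volume_pct` percent of voxels,
    treating every voxel as an equal share of the structure volume."""
    if not 0 < volume_pct <= 100:
        raise ValueError("volume_pct must be in (0, 100]")
    ranked = sorted(voxel_doses, reverse=True)  # hottest voxel first
    k = math.ceil(volume_pct * len(ranked) / 100)  # voxels needed for coverage
    return ranked[k - 1]


def percent_difference(ncct_value, cct_value):
    """Relative change of the CCT value with respect to NCCT, in percent."""
    return 100.0 * (cct_value - ncct_value) / ncct_value


# Hypothetical PTV voxel doses in Gy, not taken from the study.
ncct_ptv = [58.0 + 0.04 * i for i in range(100)]
cct_ptv = [d * 1.005 for d in ncct_ptv]  # assumed uniform 0.5% shift

for pct in (2, 98):
    d_ncct = dose_at_volume(ncct_ptv, pct)
    d_cct = dose_at_volume(cct_ptv, pct)
    print(f"D{pct}%: {percent_difference(d_ncct, d_cct):+.2f}%")
```

Comparing the resulting percentage against a tolerance such as the 2% level discussed above separates statistical from clinical significance for each plan.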
In this study, 3D gamma analysis using the local dose comparison method was applied to compare all dose distributions point by point. Compared with the global method, local comparison is more sensitive to differences in dose calculations [43]. The 3D gamma analysis results revealed that 94% of plans satisfied the recommended 2% and 2 mm criteria, and only 6% of all plans failed the criteria. All 226 patients' plans, however, satisfied the 3% and 3 mm criteria. Gamma evaluation failures in the 14 patients are attributed to uncontrollable changes in patients' geometry due to involuntary movements in the lag time between NCCT and CCT scans. Respiratory-induced differences were observed in eight nasopharyngeal, one thyroid, and one esophageal cancer patient. Bowel movement-induced differences were observed in two rectal, one uterine, and one cervical cancer patient. Two examples of geometry changes are shown in Figure 7; gamma passing rates were reduced because of small movements in body parts and penile displacement in subset A and respiratory movement in subset B. It should be noted that in all treatment plans that did not satisfy the 2% and 2 mm criteria, the dose differences were confined to low-dose regions, which are not clinically significant; increasing the suppression threshold to 20% was therefore enough to achieve a satisfactory gamma pass rate with the 2% and 2 mm criteria.
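The interplay between local normalization and the low-dose suppression threshold can be illustrated on a one-dimensional profile. The following is a simplified discrete sketch, not the 3D implementation used in the study: it searches grid points only (no interpolation), the function name and parameters are hypothetical, and the 10% default cutoff is a placeholder, since the text states only that raising the threshold to 20% was sufficient.

```python
import math


def gamma_pass_rate(ref, ev, spacing_mm, dose_pct=2.0, dta_mm=2.0,
                    local=True, low_dose_cutoff=0.10):
    """Percentage of reference points with gamma <= 1 on a 1D dose profile."""
    d_max = max(ref)
    passed = total = 0
    for i, d_ref in enumerate(ref):
        # Suppress points below the low-dose cutoff (fraction of max dose).
        if d_ref <= 0 or d_ref < low_dose_cutoff * d_max:
            continue
        # Local mode scales the dose tolerance to the point dose,
        # global mode scales it to the maximum reference dose.
        dose_tol = dose_pct / 100.0 * (d_ref if local else d_max)
        gamma_sq = min(
            ((j - i) * spacing_mm / dta_mm) ** 2
            + ((d_ev - d_ref) / dose_tol) ** 2
            for j, d_ev in enumerate(ev)
        )
        total += 1
        if math.sqrt(gamma_sq) <= 1.0:
            passed += 1
    if total == 0:
        raise ValueError("no points above the low-dose cutoff")
    return 100.0 * passed / total


# Hypothetical profiles: identical at high dose, 5% apart in the low-dose tail.
ncct_profile = [100.0, 100.0, 100.0, 50.0, 12.0, 12.0]
cct_profile = [100.0, 100.0, 100.0, 50.0, 12.6, 12.6]

for local, cutoff in ((True, 0.10), (False, 0.10), (True, 0.20)):
    rate = gamma_pass_rate(ncct_profile, cct_profile, spacing_mm=5.0,
                           local=local, low_dose_cutoff=cutoff)
    print(f"local={local}, cutoff={cutoff:.0%}: {rate:.1f}%")
```

In this toy case the low-dose tail fails only under local normalization, and it drops out of the evaluation once the cutoff is raised to 20%, which mirrors the behavior described for the plans that failed the 2% and 2 mm criteria.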
We have evaluated the effect of contrast materials in clinically relevant radiation oncology setups to provide a clear, credible answer on the use of contrast-enhanced CT-simulation images for dose calculation. Due to their chemical composition, contrast material densities are often overestimated in CT images, which might lead to inaccurate dose calculations. Despite this, the results demonstrated a limited effect of distorted CCT images on dose calculations, mainly because radiation doses in modern radiotherapy are usually delivered through several fields and arcs, and because changes in CT number on the order of several HU have a limited effect on dose calculation. In addition, contrast materials are confined to blood vessels in the case of intravenous administration, or to the digestive tract if administered orally. Apparently, it is the artefacts induced by contrast agents rather than the minute concentrations that interfere with CT numbers.
Figure 7. Reduced gamma pass rates in some patients' plans are the result of patients' geometrical differences between NCCT and CCT scans. Subset A shows the difference in geometry due to penile movement, (A1) NCCT, and (A2) CCT. Subset B shows respiratory-induced oral cavity movement between NCCT (B1) and CCT (B2).

Conclusions
The use of contrast agents aims to produce higher quality CT images and improve the delineation of target volumes and OARs. We have evaluated the effect of contrast agents on the CT images used for radiotherapy treatment planning. Although this effect is noticeable and the associated changes in HU values may be statistically significant in many cases, these changes are clinically insignificant in terms of dose calculation. About 6% of all plans failed the 3D gamma evaluation under the tight 2% and 2 mm criteria, mainly due to registration inaccuracies. Thus, we conclude that CCT could be acquired for VMAT radiotherapy planning purposes instead of NCCT to improve target and OAR delineation without compromising dose calculation accuracy.