Predicting Three-Dimensional Dose Distribution of Prostate Volumetric Modulated Arc Therapy Using Deep Learning

Background: Volumetric modulated arc therapy (VMAT) planning is a time-consuming process in radiation therapy. With a deep learning approach, the 3D dose distribution can be predicted without the need for an actual dose calculation. This approach can accelerate the process by guiding and confirming the achievable dose distribution, reducing the number of replanning iterations while maintaining plan quality. Methods: In this study, three dose distribution predictive models of VMAT for prostate cancer were developed, evaluated, and compared. Each model was designed with a different input data structure for training and testing: (1) patient CT alone (PCT alone), (2) patient CT and generalized organ structure (PCTGOS), and (3) patient CT and specific organ structure (PCTSOS). A generative adversarial network (GAN) was used as the core learning algorithm. The models were trained slice-by-slice using 46 VMAT plans for prostate cancer, and then used to predict and evaluate the dose distribution from 8 independent plans. Results: VMAT dose distributions were generated with a mean prediction time of approximately 3.5 s per patient for the PCT alone and PCTGOS models, whereas the PCTSOS model required approximately 17.5 s per patient. The highest average 3D gamma passing rate was 80.51 ± 5.94%, and the lowest overall percentage difference of dose-volume histogram (DVH) parameters was 6.01 ± 5.44% of the prescription dose, both from the PCTGOS model. However, the PCTSOS model gave the lowest difference for the largest number of individual parameters. Conclusions: This dose prediction model could accelerate the iterative optimization process for the planning of VMAT treatment by guiding the planner with the desired dose distribution.


Introduction
Prostate cancer was the second most diagnosed cancer and the fifth leading cause of cancer death among the male population globally in 2020 [1]. External-beam radiation therapy (EBRT) is one of the most widely used treatment modalities for prostate cancer. Among EBRT techniques, volumetric modulated arc therapy (VMAT) and intensity-modulated radiation therapy (IMRT) are widely used, and have become the standard for prostate cancer treatments in many institutes [2,3]. In VMAT and IMRT, the complex dose distribution can increase the dose conformity to the target and significantly decrease the dose given to the organs at risk (OARs), which can reduce the risk of complications to normal tissues after treatment [4]; however, achieving a more complex dose distribution also increases the complexity of the treatment planning process [4,5]. VMAT treatment planning is considered a time-consuming process due to its manual nature (trial-and-error inverse planning), the long dose calculation time, and the replanning iterations. In order to achieve the desired dose distribution, the dosimetrist or the planner has to manually input the optimization parameters to the treatment planning system (TPS). After that, the system calculates the dose distribution, and the outcome is evaluated against the dose prescription for the planning target volume (PTV) and the dose criteria for the OARs. If the evaluation outcomes do not satisfy the criteria, the planning parameters are adjusted and the process is repeated, with each cycle taking additional time. The process may also be repeated when a plan is not approved, and the radiation oncologist may impose further requirements.
In recent years, researchers have started to use deep learning and neural networks (NNs) in various medical and biomedical applications [6]. In biomedical tasks, deep learning has been used as an identification and detection algorithm. For instance, deep learning has been applied in the detection of protein S-sulfenylation sites from protein sequences [7]. Similarly, protein function identification provides a better understanding of cancer and helps to create more effective drugs for cancer treatment [8]. Additionally, deep learning can be used to identify protein-protein interactions by applying text mining methods to biomedical literature [9]. Deep learning is also employed to accelerate and assist the treatment planning process in various applications, one of which is predicting dose distribution. Deep learning can predict dose distributions from anatomical information (such as CT images or contours of organ structures) provided as either 2D or 3D data. The predicted dose distribution can be used as an objective to automatically generate a treatment plan later in the process [10]. Nguyen et al. studied the use of the U-net, a deep learning model, in dose distribution prediction of IMRT prostate cancer plans; they reported that, when comparing the model and the ground truth, the average absolute difference was within 5% of the prescription dose, with a Dice similarity of 0.91 [5]. Murakami et al. developed a fully automated dose distribution prediction of IMRT for prostate cancer using a GAN, which requires only CT images to generate the predicted dose. Their results show that the dose differences evaluated by the DVH parameters are approximately 2% of the prescription dose for the PTV and 3% for the OARs, except for the parameters D98% and D95% [4]. Fan et al. developed an automated treatment planning strategy using a residual neural network deep learning model (ResNet). In their approach, ResNet predicts the 3D dose distributions, and MATLAB is then used to generate a treatment plan based on the predicted dose distributions. The authors reported that the model could predict acceptable dose distributions with no statistical differences between the actual and predicted plans [11]. Mahmood et al. proposed a GAN model to predict the 3D dose distribution of oropharyngeal cancer, and compared the predicted results with several baseline approaches. They reported that the plan generated from the GAN-predicted dose distribution outperformed the actual plans by satisfying additional OAR criteria, achieving better results than the other baseline methods [12].
However, dose distribution prediction for VMAT plans based on deep learning has not been widely investigated. In order to address the time-consuming nature of the VMAT treatment planning process, this study aims to develop a method of predicting the 3D dose distribution for prostate VMAT based on deep learning. This would accelerate the treatment planning process by guiding the planner with the desired dose distribution without a dose calculation from the TPS. This study used a deep learning algorithm to investigate the accuracy of learning techniques with customized input sources: (1) patient CT alone (PCT alone), (2) patient CT and generalized organ structure (PCTGOS), and (3) patient CT and specific organ structure (PCTSOS). The core learning algorithm uses the generative adversarial network model. The dose distribution was predicted on 2D images, and the predicted 2D slices were stacked to create the 3D dose distribution. The baseline of this study was the dose distribution calculated from the actual treatment plan. The prediction accuracy of each customized input source model was evaluated against the ground truth dose distribution using 3D gamma analysis and the difference in DVH parameters. Gamma analysis is a technique for evaluating the agreement between two dose distributions by simultaneously considering the dose difference (DD) and the distance to agreement (DTA) [13]. The comparison is evaluated using acceptance criteria: the dose difference and the distance between the two compared distributions are normalized by the dose difference criterion (%) and the distance criterion (mm), respectively, based on the clinical standard. If the gamma index at an evaluated point is higher than 1, the point does not satisfy the criteria. The DVH parameters of the PTV and OARs are the clinical plan evaluation parameters; the physician and medical physicist use these to evaluate the plan before the treatment process. The parameters of the PTV are the clinical criteria of the prescription dose for each specific target. The parameters of the OARs are the dose constraint recommendations from Quantitative Analysis of Normal Tissue Effects in the Clinic (QUANTEC) and Radiation Therapy Oncology Group (RTOG) clinical trials for the bladder, the rectum, and the left and right femoral heads [14,15]. The dose constraints recommended by QUANTEC for the bladder and rectum are shown in Table 1 [16,17]. For the left and right femoral heads, the dose constraints recommended by RTOG are also shown in Table 1 [15].
Table 1. Dose constraints for organs at risk.

Materials and Methods
The overall framework of this study is shown in Figure 1. The framework is separated into three main processes: a model construction, a dose distribution prediction, and a model evaluation and comparison. The model construction process is the process of constructing three prostate VMAT dose distribution prediction models involving different input structures using the GAN deep learning model, wherein, each model is trained with a specific input data structure from a training dataset. The following process constitutes the dose distribution prediction: Dose distributions were predicted from each trained model using a specific testing data structure. The model evaluation and comparison are the processes of evaluating each trained model's prediction accuracy with the ground truth dose distribution using 3D gamma analysis. The epoch that gives the maximum gamma passing rate of each model was chosen, so as to be the representative model to be evaluated using various DVH parameters. Then, accuracy was compared between the three models.

Data Acquisition
Data from 54 prostate cancer patients treated at Chulabhorn Oncology Medical Center, Chulabhorn Hospital, Thailand, between 2015 and 2020, were used in this study. The data consisted of patient CT images, the contours of structures, and dose distribution. All patients were treated using the VMAT technique with a prescription dose of 78 Gy/39 fractions to a PTV. A total of 48 patients' data (85.71%) were used as the training dataset, with a total of 5980 image slices, and the remaining 8 patients (14.29%) were used as the test dataset, with a total of 1098 image slices.
The PTV is separated into three volumes that receive different prescription doses. The first volume receives 46 Gy in 23 fractions (PTV46), delivered to the prostate, the seminal vesicle, and the lymph nodes around the pelvic area. The next 14 Gy in 7 fractions is delivered to the whole prostate and seminal vesicle, which therefore receive a total of 60 Gy (PTV60). The following 18 Gy in 9 fractions is delivered to the prostate only. Thus, the prostate receives a total of 78 Gy (PTV78) over the whole course of treatment.
Figure 1. The overall framework of this study. The first process is model construction, the second process is model testing, and the last process is model comparison.

Data Pre-Processing
The acquired dose distributions (those of PTV46, PTV60, and PTV78) were combined into a single dose distribution, which was paired with the corresponding CT image and contours of organ structures. For the organ structure of the PTV, the largest PTV (PTV46), which covers all treatment targets, was chosen. Each slice of the CT image and organ structure was resized from 512 × 512 pixels to 256 × 256 pixels in order to reduce memory usage. The dose distribution, whose size is specific to each patient, was cropped and resized to match the CT images (256 × 256 pixels). All CT images, organ structures, and dose distribution slices were normalized to the range of −1 to 1.
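The resizing and normalization steps can be illustrated with a minimal sketch. The text does not state the interpolation method or the intensity window used, so the 2 × 2 block averaging, the function names, and the example window of −1000 to 2000 below are assumptions chosen for illustration only.

```python
import numpy as np

def downsample_2x(image: np.ndarray) -> np.ndarray:
    """Halve each dimension by averaging non-overlapping 2x2 blocks (assumed method)."""
    h, w = image.shape
    if h % 2 or w % 2:
        raise ValueError("height and width must be even")
    return image.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def normalize_to_unit_range(image: np.ndarray, lo: float, hi: float) -> np.ndarray:
    """Linearly map the window [lo, hi] to [-1, 1], clipping values outside it."""
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    clipped = np.clip(image, lo, hi)
    return 2.0 * (clipped - lo) / (hi - lo) - 1.0

# Hypothetical 512 x 512 slice with values in an assumed intensity window.
rng = np.random.default_rng(0)
ct_slice = rng.integers(-1000, 2000, size=(512, 512)).astype(np.float32)
small = downsample_2x(ct_slice)                           # shape (256, 256)
scaled = normalize_to_unit_range(small, -1000.0, 2000.0)  # values in [-1, 1]
```

The same normalization function could be applied to dose slices by passing the dose range (for example, 0 to the maximum dose) as the window, which is again an assumption rather than a detail given in the text.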

Dataset Generation
Each model was designed with a different purpose, and required a different input data structure for training and testing. The PCT alone model was designed to generate the prediction from CT images only, without using any contour structure data. Each resized CT image was paired with its corresponding resized dose distribution to create the dataset, as shown in Figure 2a. PCTGOS requires the contours of organ structures and CT images as the input data, along with the dose distribution as the target images. The organ structures consist of 5 volumes: the PTV, the body, and 3 organs at risk (bladder, femur, and rectum), as shown in Figure 2b. In addition, PCTSOS consists of 5 models with the same architecture, but different input data. The dataset includes 5 input sources consisting of the organ structure and CT images as the input data, along with the dose distribution as the target images, similar to the previous model. However, the organ structure and dose distribution were split into 5 volumes: the PTV, the body, and 3 OARs (bladder, femur, and rectum), as shown in Figure 2c. After that, all CT images, organ structures, and dose distributions were normalized to the range of −1 to 1, and each CT image was paired with its corresponding organ structure and dose distribution to create organ-specific datasets.

Generative Adversarial Model and Dose Distribution Prediction
In this study, a modified GAN model called pix2pix was used to construct the predictive model. The pix2pix model is a GAN model for image-to-image translation proposed by Isola et al. [18]. It learns to generate images from a dataset of paired images. Here, the CT images and contour structures are the input (source) images, and the corresponding dose distributions are the labels (target images). As proposed by Goodfellow et al., a GAN is constructed from two neural network models: the generator and the discriminator [19]. In this case, the generator was trained to generate dose distributions that the discriminator could not distinguish from the ground truth, while the discriminator was trained to maximize the probability of correctly discriminating between the dose distribution generated by the generator and the ground truth. A U-net-based architecture was used as the generator, consisting of 8 downsampling convolutions and 8 upsampling convolutions with an input size of 256 × 256 pixels and the same output size. The PatchGAN classifier was used as the discriminator to determine whether each image patch was real or not. The discriminator consisted of 5 downsampling convolutions with an output size of 30 × 30 patches.
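To see how a 256 × 256 input can lead to a 30 × 30 patch output after 5 convolution stages, the arithmetic can be sketched as follows. The kernel size, strides, and padding are not given in the text; the values below (4 × 4 kernels, three stride-2 stages, two stride-1 stages, one pixel of zero padding) are assumptions that match a common pix2pix configuration and happen to reproduce the stated output size.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a convolution with symmetric zero padding."""
    return (size + 2 * padding - kernel) // stride + 1

def patchgan_output_size(input_size: int = 256) -> int:
    """Output size of an assumed 5-stage PatchGAN discriminator."""
    size = input_size
    # Assumed: three stride-2 stages, each halving the resolution (256 -> 128 -> 64 -> 32).
    for _ in range(3):
        size = conv_output_size(size, kernel=4, stride=2, padding=1)
    # Assumed: two stride-1 stages, each shrinking the map by one pixel (32 -> 31 -> 30).
    for _ in range(2):
        size = conv_output_size(size, kernel=4, stride=1, padding=1)
    return size

patch_grid = patchgan_output_size(256)
```

Under these assumptions, each of the 30 × 30 output values scores one overlapping patch of the input image as real or generated.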
The prediction models were trained using an HP Z8 G4 workstation with an Intel Xeon 4112 and an NVIDIA Quadro P4000 GPU. Both the generator and the discriminator used binary cross-entropy as the cost function and the rectified linear unit (ReLU) as the activation function. We used the Adam optimizer [20] with a learning rate of 0.0001 and momentum parameters of β1 = 0.5 and β2 = 0.999. These momentum parameters follow those described by Isola et al. [18], as they have been shown to work well for image-to-image translation problems. The training batch size was set to 4, based on the available GPU memory. Based on the preliminary experiments, the models were trained for 100, 200, 300, 400, and 500 epochs.
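As an illustration of how binary cross-entropy serves as the cost function for both networks, the following is a minimal NumPy sketch. The function names are hypothetical, the patch probabilities are made-up inputs, and any additional loss terms or weightings that a pix2pix implementation may use are not described in this section and are therefore omitted.

```python
import numpy as np

def binary_cross_entropy(probabilities: np.ndarray, targets: np.ndarray, eps: float = 1e-7) -> float:
    """Mean binary cross-entropy between predicted probabilities and 0/1 targets."""
    p = np.clip(probabilities, eps, 1.0 - eps)  # avoid log(0)
    return float(np.mean(-(targets * np.log(p) + (1.0 - targets) * np.log(1.0 - p))))

def discriminator_loss(real_patch_probs: np.ndarray, fake_patch_probs: np.ndarray) -> float:
    """Discriminator wants real patches scored 1 and generated patches scored 0."""
    real_loss = binary_cross_entropy(real_patch_probs, np.ones_like(real_patch_probs))
    fake_loss = binary_cross_entropy(fake_patch_probs, np.zeros_like(fake_patch_probs))
    return real_loss + fake_loss

def generator_adversarial_loss(fake_patch_probs: np.ndarray) -> float:
    """Generator wants its generated patches to be scored as real (1)."""
    return binary_cross_entropy(fake_patch_probs, np.ones_like(fake_patch_probs))

# A discriminator that cannot tell real from generated outputs 0.5 for every patch.
undecided = np.full((30, 30), 0.5)
d_loss = discriminator_loss(undecided, undecided)  # 2 * ln(2)
g_loss = generator_adversarial_loss(undecided)     # ln(2)
```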
The dose distribution results were predicted by inputting a testing dataset created specifically for each model. In order to generate the dose distribution from the PCTSOS model, each organ-specific testing dataset was inputted to each model. After generating all organ-specific dose distributions, the full-body dose distribution was derived by summing all predicted dose distributions.
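The summation of organ-specific predictions and the stacking of 2D slices into a 3D volume can be sketched as below. This is an illustration under stated assumptions: the predictions are assumed to have already been converted back to dose units, each organ-specific prediction is assumed to be zero outside its own structure, and the function names and the dictionary layout are hypothetical.

```python
import numpy as np

def combine_organ_doses(organ_doses: dict) -> np.ndarray:
    """Sum per-organ predicted dose slices (same shape) into one full-body slice."""
    arrays = list(organ_doses.values())
    if not arrays:
        raise ValueError("at least one organ-specific prediction is required")
    total = np.zeros(arrays[0].shape, dtype=np.float64)
    for dose in arrays:
        if dose.shape != total.shape:
            raise ValueError("all organ-specific predictions must share one shape")
        total += dose
    return total

def stack_slices(slices: list) -> np.ndarray:
    """Stack 2D slices along a new first axis to form a (slice, row, column) volume."""
    return np.stack(slices, axis=0)

# Hypothetical example: two slices, each assembled from five organ-specific predictions.
organs = ["ptv", "body", "bladder", "femur", "rectum"]
slice_predictions = [
    {name: np.full((256, 256), float(i + 1)) for i, name in enumerate(organs)}
    for _ in range(2)
]
volume = stack_slices([combine_organ_doses(p) for p in slice_predictions])
```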

Model Evaluation
The accuracy of the predicted dose distribution was assessed using 3D gamma analysis for each epoch of each model with the 3%/3 mm criteria. The epoch that gave the maximum gamma passing rate of each model was selected to obtain the representative results to compare the prediction performance between the three models.
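A simplified 3D gamma passing rate calculation is sketched below to make the 3%/3 mm criteria concrete. Only the criteria themselves come from the text; the global normalization to the maximum reference dose, the 10% low-dose cutoff, the search restricted to voxel centers without interpolation, and all names are assumptions of this sketch, and clinical gamma tools typically differ in these details.

```python
import numpy as np

def gamma_passing_rate(reference, evaluated, spacing_mm,
                       dose_pct=3.0, dist_mm=3.0, cutoff_fraction=0.1):
    """Percentage of reference voxels (above a low-dose cutoff) with gamma <= 1."""
    reference = np.asarray(reference, dtype=float)
    evaluated = np.asarray(evaluated, dtype=float)
    if reference.ndim != 3 or reference.shape != evaluated.shape:
        raise ValueError("both inputs must be 3D arrays of the same shape")
    dose_tol = dose_pct / 100.0 * reference.max()  # assumed global normalization
    if dose_tol <= 0:
        raise ValueError("reference dose must contain positive values")
    # Only offsets within dist_mm can give gamma <= 1, so the search stops there.
    reach = [int(np.floor(dist_mm / s)) for s in spacing_mm]
    padded = np.pad(evaluated, [(r, r) for r in reach], constant_values=np.inf)
    nz, ny, nx = reference.shape
    gamma_sq = np.full(reference.shape, np.inf)
    for dz in range(-reach[0], reach[0] + 1):
        for dy in range(-reach[1], reach[1] + 1):
            for dx in range(-reach[2], reach[2] + 1):
                dist_sq = ((dz * spacing_mm[0]) ** 2 + (dy * spacing_mm[1]) ** 2
                           + (dx * spacing_mm[2]) ** 2)
                if dist_sq > dist_mm ** 2:
                    continue
                # Evaluated dose at the voxel displaced by (dz, dy, dx).
                shifted = padded[reach[0] + dz:reach[0] + dz + nz,
                                 reach[1] + dy:reach[1] + dy + ny,
                                 reach[2] + dx:reach[2] + dx + nx]
                candidate = dist_sq / dist_mm ** 2 + ((shifted - reference) / dose_tol) ** 2
                gamma_sq = np.minimum(gamma_sq, candidate)
    mask = reference >= cutoff_fraction * reference.max()
    return 100.0 * float(np.mean(gamma_sq[mask] <= 1.0))

# Hypothetical example: a single hot voxel displaced by one 2 mm step still passes.
ref = np.zeros((1, 1, 8)); ref[0, 0, 3] = 10.0
ev = np.zeros((1, 1, 8)); ev[0, 0, 4] = 10.0
rate_shifted = gamma_passing_rate(ref, ev, spacing_mm=(1.0, 1.0, 2.0))
```

With a 4 mm voxel spacing along the same axis, the displaced voxel would lie outside the 3 mm distance criterion and the same point would fail, which shows how the dose and distance tolerances interact.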
The dose differences were calculated for the evaluation parameters for the PTV (Table 2) and OARs (Table 3), which are the dose constraint recommendations from the QUANTEC and RTOG clinical trials. The differences in the DVH evaluation parameters were calculated using the following equation, based on the previous studies of Murakami et al. and Nguyen et al. [4,5]:

$$\text{Dose difference (\%)} = \frac{\left| D_{\text{prediction}} - D_{\text{ground truth}} \right|}{D_{\text{prescription}}} \times 100$$

where $D_{\text{prediction}}$ represents the DVH parameter from the predicted dose distribution, $D_{\text{ground truth}}$ represents the DVH parameter from the actual dose distribution, and $D_{\text{prescription}}$ represents the prescription dose to the PTV. The difference in the DVH parameters that are measured in terms of volume was calculated using the following equation:

$$\text{Volume difference (\%)} = \left| V_{\text{prediction}} - V_{\text{ground truth}} \right|$$

where $V_{\text{prediction}}$ is the percentage of the organ's volume from the predicted dose distribution, and $V_{\text{ground truth}}$ is the percentage of the organ's volume from the ground truth dose distribution.
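The following sketch shows one way DVH parameters and their differences could be computed from a dose array and a binary organ mask. It assumes absolute differences (the Discussion refers to absolute differences), uses a simple percentile definition of Dx%, and all function names and example numbers are hypothetical rather than taken from the study.

```python
import numpy as np

def dose_at_volume(dose: np.ndarray, mask: np.ndarray, volume_pct: float) -> float:
    """Dx%: the dose that at least x% of the structure volume receives."""
    return float(np.percentile(dose[mask], 100.0 - volume_pct))

def volume_at_dose(dose: np.ndarray, mask: np.ndarray, dose_gy: float) -> float:
    """Vx: percentage of the structure volume receiving at least dose_gy."""
    return 100.0 * float(np.mean(dose[mask] >= dose_gy))

def dose_difference_pct(d_prediction: float, d_ground_truth: float, d_prescription: float) -> float:
    """Absolute dose difference as a percentage of the prescription dose (assumed form)."""
    return abs(d_prediction - d_ground_truth) / d_prescription * 100.0

def volume_difference_pct(v_prediction: float, v_ground_truth: float) -> float:
    """Absolute difference between two volume percentages (assumed form)."""
    return abs(v_prediction - v_ground_truth)

# Hypothetical four-voxel structure.
dose = np.array([10.0, 20.0, 30.0, 40.0])
mask = np.ones(4, dtype=bool)
d50 = dose_at_volume(dose, mask, 50.0)   # median dose of the structure
v25 = volume_at_dose(dose, mask, 25.0)   # share of voxels receiving >= 25 Gy
diff = dose_difference_pct(76.44, 78.0, 78.0)
```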
The uncertainty of this study was evaluated by the standard deviation (±SD) of the results for each model. For instance, the prediction time of the PCT alone model was 24.87 s/8 data, with the uncertainty reported as (±SD) with 3.61 ± 0.19 s/data.

Training and Prediction Time
The training and prediction times are shown in Table 4. For the PCT alone model and the PCTGOS model, the mean prediction time was approximately 3.5 s per patient. In contrast, the PCTSOS model required approximately 17.5 s per patient.

3D Gamma Analysis
Table 5 shows the average 3D gamma passing rate with the 3%/3 mm criteria. The maximum average gamma passing rate was 80.51 ± 5.94%, from the PCTGOS model trained for 400 epochs, whereas the maximum gamma passing rate of the PCT alone model was 77.21 ± 9.02% at 200 epochs, and that of the PCTSOS model was 76.90 ± 3.91% at 300 epochs.
In the results that follow, each model is represented by the epoch with the maximum gamma passing rate: the PCT alone model at 200 epochs, the PCTGOS model at 400 epochs, and the PCTSOS model at 300 epochs. Example results of the three dose distribution prediction models for patient number 5 are shown in Figure 3. Dose profile comparisons between the ground truth dose distribution and the predicted dose distribution for all three models are shown in Figure 4. The example results were from slice number 60 (middle slice). Dose profiles in all three models were highly consistent with the ground truth. However, in the PCTSOS model, the dose profile lacked smoothness due to inconsistencies between the organ-specific prediction models.

DVH Parameters Evaluation
Example results of comparisons of DVH parameters between the ground truth dose distribution and the predicted dose distribution are shown in Figure 5. The example results were from patient number 5, slice number 60 (middle slice). The summary of percentage differences between the average DVH parameters of the ground truth dose distribution and the predicted dose distribution for all three models is shown in Table 6. In PTV78 and PTV60, the PCTSOS model showed the best results in terms of the average percentage dose difference when compared with the other two models (except for the parameter D2% of PTV78, where the PCT alone model was best). However, in PTV46, the PCTGOS model had the lowest percentage difference of the three models. In PTV46, the PCT alone model had the highest percentage difference, at approximately 20%, while the PCTGOS model and the PCTSOS model had lower percentage differences from the ground truth, at approximately 3.5% and 10%, respectively. The PCTSOS model showed the highest performance for the bladder, with the lowest percentage difference (approximately 4.5%) between the ground truth and predicted dose distributions. On the other hand, the PCTGOS model had the best agreement in the rectum, with approximately 10% dose difference. In addition, when comparing all three models, the PCTGOS model showed the lowest percentage dose difference in both the left and right femoral heads. The percentage volume differences of the left and right femoral heads were similar.

Model Comparison
The summary of the model comparison is shown in Table 7. The prediction time of each single model was approximately 3.5 s. However, for the PCTSOS model, which comprises five organ models, the cumulative prediction time was approximately 17 s. The PCT alone model had a 3D gamma passing rate of 77.21 ± 9.02% with the 3%/3 mm criteria, only slightly lower than the best result, from the PCTGOS model (80.51 ± 5.94%). The average overall percentage difference in the parameters of the PCTGOS model was 6.01 ± 5.44%, which is slightly lower than those of the other two models. As the comparison criterion, we counted, for each model, the number of parameters for which it gave the lowest percentage difference between the ground truth and predicted dose distributions. By this criterion, the PCTSOS model had the best prediction accuracy, with the lowest difference in 11 parameters (6 from the PTV and 5 from the OARs). The PCTGOS model had the lowest difference in 10 parameters (2 from the PTV and 8 from the OARs). The PCT alone model showed the lowest prediction accuracy, with the lowest difference in 5 parameters (1 from the PTV and 4 from the OARs).
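The counting criterion used in this comparison can be sketched as a tally of the model with the lowest percentage difference for each parameter. The parameter names and difference values below are made up for illustration and are not the values in Table 6; ties are resolved in favor of the first model listed, which is an arbitrary choice of this sketch.

```python
def count_best_parameters(differences: dict) -> dict:
    """Count, per model, the parameters for which it has the lowest difference.

    differences maps parameter name -> {model name: percentage difference}.
    """
    wins = {}
    for per_model in differences.values():
        for model in per_model:
            wins.setdefault(model, 0)  # make sure every model appears in the tally
        best = min(per_model, key=per_model.get)
        wins[best] += 1
    return wins

# Hypothetical percentage differences for three parameters.
example = {
    "PTV D98%": {"PCT alone": 12.0, "PCTGOS": 4.0, "PCTSOS": 2.5},
    "Bladder Dmean": {"PCT alone": 9.0, "PCTGOS": 7.5, "PCTSOS": 4.5},
    "Rectum Dmean": {"PCT alone": 14.0, "PCTGOS": 10.0, "PCTSOS": 11.0},
}
tally = count_best_parameters(example)
```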

Discussion
In this study, several dose distribution predictive models of VMAT for prostate cancer were developed, evaluated, and compared with the core concept of backward planning. The traditional treatment process starts from patient simulation, organ structure contouring, and treatment planning stages to obtain the dose distribution and evaluate the planning. In contrast, our method can skip the planning and evaluation stages to achieve the desired dose distribution without dose calculation via the system. This can reduce the planning iterations and planning time by guiding the planner with the acceptable dose distribution, and can be the objective of the automated treatment planning based on the predicted dose distribution. In addition, the treatment planning time is dramatically reduced for the PCT alone model, which can skip the organ-structure-contouring process.
In this study, the planning process included two full arcs of VMAT and three PTVs (PTV78, PTV60, PTV46) that were used in all patients to deliver radiation to the prostate. This gave the dataset a uniform and consistent pattern, making it suitable for training the model. However, applying the model to predict the dose distribution of patients who are treated with other techniques may not give the best results, even in prostate cancer patients.
The data used in this study were obtained from a single hospital center, which may bias the model toward this hospital's data. Applying this model at different hospital centers may not give the best performance, because the model depends on the planning practice reflected in its training data. However, the dataset used in this model was reviewed by several experienced physicians and medical physicists who used the same criteria for the treatment planning process. This should limit bias introduced by individual treatment planners, and may also increase the generalizability of the model. Note that the number of training datasets could influence the prediction accuracy in deep learning algorithms. In the future, our study could be improved by increasing the number of training datasets.
Although the prediction results of the PCTSOS model had the lowest percentage differences in the DVH parameters (11 parameters), the 3D gamma analysis and average percentage difference over all parameters still showed worse results than those of the PCTGOS model. Additionally, the PCTSOS model required more prediction time than the PCTGOS model for the prediction of five organ-specific models. Compared to the PCT alone model, the prediction time of the PCTGOS model is comparable, but the PCT alone model does not require the contours of organ structure, meaning that the PCT alone models required less time than the PCTGOS model. The PCTGOS model gave the best predictive results of the three models, while the PCT alone model gave the worst. However, the PCT alone model seems to be interesting for further investigation in order to develop fully automated dose prediction without using the patient organ structure information.
In the PCTSOS model, dose distribution lacked smoothness around the edge of the organs. This may be because the dose distribution of each organ comes from a different predictive model that trains from the specific organ data. The dose distribution of each organ was predicted separately and then combined with other organs into a full-body dose distribution, which caused a discontinuous dose between the organs' boundaries. To solve this problem, we suggest applying filters to a full-body dose distribution, such as a Gaussian blur, in order to reduce the discontinuity between each organ dose.
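The suggested Gaussian blur could be prototyped as follows. This is only a sketch of the proposed remedy, which the study did not evaluate: the kernel width (sigma of one voxel), the edge padding, and the function names are assumptions, and the smoothing is implemented with NumPy alone as a separable convolution along each axis.

```python
import numpy as np

def gaussian_kernel(sigma: float, truncate: float = 3.0) -> np.ndarray:
    """Normalized 1D Gaussian kernel truncated at about truncate * sigma."""
    radius = max(1, int(truncate * sigma + 0.5))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()

def gaussian_smooth(dose: np.ndarray, sigma_voxels: float = 1.0) -> np.ndarray:
    """Blur a 2D or 3D dose array by convolving each axis with a 1D Gaussian."""
    kernel = gaussian_kernel(sigma_voxels)
    radius = len(kernel) // 2
    smoothed = np.asarray(dose, dtype=float)
    for axis in range(smoothed.ndim):
        pad = [(0, 0)] * smoothed.ndim
        pad[axis] = (radius, radius)
        # Edge padding keeps the array size and leaves uniform regions unchanged.
        padded = np.pad(smoothed, pad, mode="edge")
        smoothed = np.apply_along_axis(
            lambda v: np.convolve(v, kernel, mode="valid"), axis, padded
        )
    return smoothed

# Hypothetical slice with a sharp 10 Gy step, such as at an organ boundary.
step = np.tile(np.array([0.0, 0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 10.0]), (8, 1))
blurred = gaussian_smooth(step, sigma_voxels=1.0)
```

A wider kernel removes more of the discontinuity but also flattens genuine steep dose gradients near the PTV, so the choice of sigma would need to be checked against the gamma passing rate and DVH parameters.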
The patient data used in this study consisted of three PTVs: one larger volume, and two smaller volumes inside it. We chose only the largest PTV as the representative high-dose volume. The results show that, in PTV46, the PCT alone model had the highest percentage difference (approximately 20%) compared to the models in which organ structures were included. The PCTGOS and PCTSOS models were lower, at approximately 3.5% and 10%, respectively. This suggests that organ structure information strongly influences the prediction in areas of steep dose gradient. Thus, in future work, all PTVs in the sequential plan should be included as inputs in order to improve the accuracy of the predictive model.
The discriminator used in this study was a 30 × 30 PatchGAN. In the previous study of Isola et al. [18], the authors compared four sizes of PatchGAN, consisting of 1 × 1, 16 × 16, 70 × 70, and 286 × 286 patches, and reported that the 70 × 70 patches gave the most accurate results. Various sizes of PatchGAN should be compared in future studies. The 3D dose distribution was assembled from a 2D slice-by-slice dose prediction method in which information along the vertical axis of the patient's body was not used. As a result, the predicted dose distribution lacks smoothness along the vertical axis, and accuracy is reduced at the upper and lower borders of the PTV and OARs. In contrast, Nguyen et al. [21], who studied 3D dose distribution prediction with a hierarchically densely connected U-net, reported that the predicted dose distribution from the 3D HD U-net gave better performance than a standard 2D U-net in terms of Dmax, homogeneity, dose conformity, and dose coverage.
Willems et al. studied the use of a 3D U-Net-based deep learning model to predict the dose distribution for VMAT for prostate cancer from CT images alone. Accordingly, they compared their results to CT images that were combined with additional data (plan isocenter and contours of organ structures). They reported on the CT only model that the mean percentage error in D max and D 98% was 8.6% and 16.8%, respectively, whereas the CT combined with the contour structure model resulted in a decrease in mean percentage error in D max and D 98% to 1.3% and 1.0%, respectively [22]. Comparing these previous studies to our results, as shown in Table 8, the absolute difference in D max from our PCT alone model was 1.7%, constituting a better result than the CT only model that Willems et al. reported. Nevertheless, after comparing the average absolute differences in D max and D 98% , the results obtained from the PCTGOS model were higher than the CT combined with the contour structure model that Willems et al. reported. Under the circumstances, the 3D U-Net-based deep learning model and 3D input dataset used in their study might perform better when predicting the 3D dose distribution-as opposed to our model, which used 2D images and converted them to 3D images later in the process. Lempart et al. studied the use of the densely connected U-Net deep learning model to predict the dose distribution for VMAT for prostate cancer by using CT images combined with the separated organs' contours (1PTV + 4OARs + 1Body). After obtaining the predicted dose distribution, they converted it to deliverable treatment plans. They modified the U-Net model to train on triplets data (a combination of three consecutive image slices and corresponding segmentations), resulting in a total of 160 patients whose data were. In this case, they reported that the mean of the absolute differences in D 98% of PTV was 1.90%, while for the D mean of the bladder it was 2.1% [23]. 
Compared to our results, the absolute differences in D 98% and D mean from our three models were higher in both parameters. This could be because our model received less input data than the model reported by Lempart et al.; the latter took a combination of three consecutive slices as input, which could allow the model to learn the relationships between slices and thus predict more accurately. Nguyen et al. studied the use of a U-Net-based deep learning model to predict the dose distribution for the IMRT plan for prostate cancer from the contours of organ structure. They reported that the means of the absolute differences in D max and D mean were still less than 5% of the prescription dose in the PTV and OARs [5]. Murakami et al. developed a fully automated dose distribution prediction for the IMRT plan for prostate cancer using a GAN from CT images alone, and reported that the means of the absolute differences in D max and D mean were both less than 2% in the PTV [4]. Compared to our results, the differences in D max and D mean from our model were slightly higher than the results of Nguyen et al. and Murakami et al. in both the PTV and OARs. This could be because our model predicted the dose distribution of a VMAT plan, not an IMRT plan. Unlike IMRT, a VMAT dose distribution does not have a clear radiation beam path, which could make it harder for the deep learning model to capture the pattern in the VMAT dose distribution, leading to worse prediction results and higher uncertainty.
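The DVH parameters compared above (D max , D mean , and D 98% ) can be computed from a dose array and a structure mask. The following is a minimal sketch, assuming that differences are normalized to the prescription dose; the function names, the synthetic dose arrays, and the 70 Gy prescription are hypothetical and are not taken from the study data.

```python
import numpy as np


def dvh_metrics(dose, mask):
    """Return D_max, D_mean, and D_98% for the voxels selected by `mask`."""
    voxels = dose[mask]
    return {
        "D_max": float(voxels.max()),
        "D_mean": float(voxels.mean()),
        # D_98% is the dose received by at least 98% of the volume,
        # i.e. the 2nd percentile of the voxel doses.
        "D_98%": float(np.percentile(voxels, 2)),
    }


def abs_diff_percent(predicted, reference, prescription):
    """Absolute difference per metric, as a percentage of the prescription dose."""
    return {
        name: abs(predicted[name] - reference[name]) / prescription * 100.0
        for name in reference
    }


# Synthetic example only (hypothetical values, not study data).
rng = np.random.default_rng(0)
reference_dose = rng.uniform(68.0, 72.0, size=(8, 16, 16))  # Gy
predicted_dose = reference_dose * 1.02                      # uniform 2% overestimate
ptv_mask = np.zeros(reference_dose.shape, dtype=bool)
ptv_mask[2:6, 4:12, 4:12] = True

diff = abs_diff_percent(
    dvh_metrics(predicted_dose, ptv_mask),
    dvh_metrics(reference_dose, ptv_mask),
    prescription=70.0,
)
for name, value in diff.items():
    print(f"{name}: {value:.2f}%")
```

Because the synthetic prediction overestimates every voxel by 2%, each reported difference is close to 2% of the assumed prescription dose.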
Compared to the traditional process of radiation therapy, our method shortens the treatment planning process, as shown in Figure 6. The traditional treatment planning process may take several days to obtain the desired dose distribution. In contrast, our deep-learning-based dose prediction model could shorten the time from days to 3.5 s for the PCT alone and PCTGOS models, and 18 s for the PCTSOS model. In particular, as the PCT alone model does not require physicians to provide contouring structure information, this model could shorten the time even more than the other models.
Future work will increase the number of patients' data used, so as to achieve comparability with previous studies. All three contours of PTV volumes will be implemented in future studies. Additionally, a 3D convolutional neural network model will be implemented in order to prevent errors arising from the lack of dose continuity between slices. In future investigations, a fully automated treatment planning system will be developed from accurate deep-learning-based dose prediction for clinical use in medical oncology and radiation treatment.

Conclusions
Three dose distribution predictive models for the prostate VMAT plan were developed using a generative adversarial network with different input data. Using our trained models, accurate VMAT dose distributions were generated rapidly and directly from either CT alone or CT and patient organ structure. The mean prediction time was approximately 3.5 s per patient, except with the PCTSOS model, which required approximately 17.5 s per patient. The highest 3D gamma passing rate was 80.51 ± 5.94, and the lowest overall percentage difference in DVH parameters was 6.01 ± 5.44%, both from the PCTGOS model. Of the 26 evaluated DVH parameters, the PCTSOS model had the most parameters with the lowest percentage differences (11 parameters; 6 PTV and 5 OAR). This dose prediction model could reduce the time needed for structural contouring and the iterative optimization process in VMAT treatment planning, by guiding the planner with the desired dose distribution.

Author Contributions: Conceptualization, P.K., W.C. and T.F.; methodology, P.K. and T.F.; data acquisition, P.K.; writing-original draft preparation, P.K.; writing-review and editing, P.K., W.C., K.T. and T.F. All authors have read and agreed to the published version of the manuscript.
Funding: This research received no external funding.

Institutional Review Board Statement: The study was conducted in accordance with the guidelines of the Ethics Committee of the Chulabhorn Royal Academy, Thailand (012/2564).

Informed Consent Statement: Informed consent was obtained from all subjects involved in the study.
Data Availability Statement: Not applicable.