A Deep Learning Approach for the Fast Generation of Synthetic Computed Tomography from Low-Dose Cone Beam Computed Tomography Images on a Linear Accelerator Equipped with Artificial Intelligence



Introduction
The recent technological advancements in radiotherapy (RT) and the introduction of Artificial Intelligence (AI) in this context have made it possible to offer even more personalised treatments to patients, effectively compensating for the anatomical variations observed during each RT treatment fraction [1][2][3].
In recent years, different studies have proved that quickly adapting the RT treatment plan to the patient's daily anatomy can lead to remarkable clinical benefits in terms of quality of life and disease control [4,5].
AI plays a key role in the online adaptive procedure by supporting clinicians and medical physicists in automating different manual procedures, such as contouring and planning, making the whole pipeline faster and less user-dependent. In particular, the AI implementation can significantly reduce treatment slot times, moving from the current 30 to 50 min to about 15 min, making the procedure available to a larger community of patients [6].
Another important step forward enabled by AI is the removal of the Computed Tomography (CT) simulation procedure. The core idea is to generate a synthetic Computed Tomography (sCT) image from the on-board imaging acquired daily for patient positioning and to use this image as an electron density map for dose calculation [7,8].
The adoption of synthetic CT in clinical practice would reduce uncertainties due to the co-registration between simulation CT and daily positioning images, avoid simulation dose exposure to the patient, and speed up RT treatment workflow, opening up single-step procedures that can have a huge impact in the case of palliative treatments [9].
Various non-AI-based strategies have been proposed so far to remove CT simulation; these strategies include the use of pre-generated atlases of known cases, the identification of pre-defined tissue density levels, and the consequent assignment of bulk electron density (ED) values [10][11][12]. Another approach is the use of deformable image registration, which today represents the most widespread strategy in clinical practice. However, several comparison studies have shown that all these approaches yield suboptimal results when compared to AI-based strategies in terms of dose accuracy and generation time [7,13].
Today, the most widespread AI approaches are based on Deep Learning (DL) architectures, such as Generative Adversarial Networks (GANs) and U-shaped convolutional neural networks (U-Net), which have proved the feasibility of generating high-quality sCT images from different positioning images, such as Cone Beam Computed Tomography (CBCT) and Magnetic Resonance Imaging (MRI) [14][15][16][17].
Providing ED information from CBCT images has been a topic of research for several years, but only with the use of DL approaches has it been possible to obtain levels of dose accuracy reliable for clinical use [18,19].
Recently, the first RT system able to integrate AI technology to perform online adaptive RT treatments has been released for clinical use (Varian Ethos, Varian Medical System, US), equipped with a low-dose CBCT on-board system, able to ensure high image quality using low-dose protocols through a dedicated iterative reconstruction algorithm [4].
Thanks to the AI integration, this system is able to speed up the online adaptive workflow, offering clinicians auto-contouring and auto-planning solutions that allow for treatment standardisation and reduce workload during on-table adaptive procedures.
The aim of this study is to propose a DL algorithm able to generate high-quality sCT from CBCT images in a reliable and fast way, making it possible to remove the CT simulation step from the RT workflow and collapsing the whole RT back-off workflow online, offering reliable solutions to patients requiring fast treatments, such as those needing palliative care.

Patients
This retrospective study was focused on 53 patients treated at Mater Olbia Hospital (Olbia, Sassari, Italy) from August 2021 to June 2023. All patients were treated using an AI-based linear accelerator system (Ethos, Varian Medical System, Mountain View, CA, USA) for lesions located in the pelvis.
The whole research was conducted in compliance with the ethical standards set in the Declaration of Helsinki, and it was approved by the Institutional Review Board.
A CT image was acquired for each patient during the simulation procedure, using a helical CT scanner (RT Discovery, General Electric, Chicago, IL, USA), following an imaging protocol consisting of a slice thickness of 1.25 mm and a pixel dimension of 1.17 × 1.17 mm. The voltage of the tube was kept at 120 kV, while the tube current was automatically selected by the CT Automatic Tube Current Modulation (ATCM) system in order to adapt it to the patient's body mass index. Patients with artificial implants or prostheses and patients under the age of 18 were excluded from the study.
All patients were simulated and treated in the supine position. During RT treatment, each patient was subjected to a daily CBCT acquisition obtained using an X-ray beam of 125 kV, adopting a low-dose protocol that minimised the mA based on body mass index and an iterative reconstruction algorithm to increase the image quality. The CBCT images were acquired before the delivery of each single treatment fraction and were characterised by a slice thickness of 2 mm and a pixel dimension of 0.96 × 0.96 mm.

Synthetic CT Generation
The synthetic CT generation process was performed using 30 patients as training, 9 as validation, and 14 as a test set. Patients belonging to the training and validation groups were treated from August 2021 to March 2023, while test cases were collected consecutively, considering the patients treated from April 2023 to June 2023. The network training was performed in a 2D modality, pairing CT axial images with the corresponding CBCT images acquired on the first day of therapy.
The first day of therapy was chosen because it was the one where the patient's anatomy was closest to the one acquired during treatment simulation.
Image pairing was obtained by applying a rigid roto-translation focused on the spatial correspondence of the bony anatomy using external software (Varian Velocity, version 3, Varian Medical System, Palo Alto, CA, USA). The fused images were spatially resampled to make the CBCT spatial resolution equal to the CT one.
For each training case, the quality of the alignment between CBCT and CT images was assessed by a radiation oncologist with more than five years of experience in the field. Registered images were then exported in Digital Imaging and Communications in Medicine (DICOM) format, normalised in terms of pixel intensities, and converted to Portable Network Graphics (PNG) format to make them compatible with the network architecture.
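The intensity normalisation that precedes the PNG conversion can be sketched as a linear mapping from Hounsfield Units to integer pixel values. This is only an illustration: the HU window (−1000 to 2000) and the 8-bit depth are assumptions, since the study does not report the values actually used, and the function names are hypothetical.

```python
def hu_to_uint8(hu_values, hu_min=-1000.0, hu_max=2000.0):
    """Linearly map HU values to integers in [0, 255].

    hu_min and hu_max define an assumed HU window; values outside it are clipped.
    """
    scale = 255.0 / (hu_max - hu_min)
    pixels = []
    for hu in hu_values:
        clipped = min(max(hu, hu_min), hu_max)
        pixels.append(int(round((clipped - hu_min) * scale)))
    return pixels


def uint8_to_hu(pixel_values, hu_min=-1000.0, hu_max=2000.0):
    """Inverse mapping, used to bring a generated image back to the HU scale."""
    scale = (hu_max - hu_min) / 255.0
    return [hu_min + p * scale for p in pixel_values]
```

With these assumed settings, one grey level corresponds to roughly 11.8 HU (3000/255), so the choice of window and bit depth bounds the HU resolution that a network trained on such images can reproduce.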
An image selection process was manually carried out on the 2D axial CBCT-CT paired images, with the aim of including in the training set only images meeting a set of criteria on anatomical correspondence and image quality.
The trained neural network was a conditional Generative Adversarial Network (cGAN) based on the pix2pix architecture, as in similar studies on this topic [16,17].
The GAN architecture consists of two neural networks, the discriminator and the generator, which are trained in mutual competition. The aim of the discriminator is to differentiate between the fake images created by the generator and the real ones provided by the user and belonging to the validation set. The generator continuously improves the image quality with the goal of fooling the discriminator, reducing the difference between the created images and the real ones. The training can be considered complete when the discriminator is no longer able to differentiate between real and fake samples.
The generator architecture was a 2D U-Net with skip connections among different network layers, while the discriminator followed a PatchGAN architecture with a 30 × 30 output matrix, dividing the images into sub-regions of 70 × 70 pixels and calculating the probability of each patch being a real image [20]. Each element of the discriminator matrix contains the probability of the corresponding sub-region being fake (probability equal to zero) or real (probability equal to one).
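The relation between the 30 × 30 discriminator matrix and the 70 × 70 pixel sub-regions can be checked with simple convolution arithmetic. The sketch below assumes the standard pix2pix PatchGAN layer configuration (five 4 × 4 convolutions with strides 2, 2, 2, 1, 1 and padding 1) and a 256 × 256 input image; neither the layer list nor the input size is stated in this study, so both are assumptions.

```python
# (kernel, stride, padding) for each convolution layer.
# Assumed: the standard pix2pix 70x70 PatchGAN configuration.
PATCHGAN_LAYERS = [(4, 2, 1), (4, 2, 1), (4, 2, 1), (4, 1, 1), (4, 1, 1)]


def output_size(input_size, layers):
    """Spatial size of the discriminator output for a square input."""
    size = input_size
    for kernel, stride, padding in layers:
        size = (size + 2 * padding - kernel) // stride + 1
    return size


def receptive_field(layers):
    """Side length (in input pixels) seen by a single output element."""
    rf = 1
    for kernel, stride, _ in reversed(layers):
        rf = (rf - 1) * stride + kernel
    return rf


patch_grid = output_size(256, PATCHGAN_LAYERS)   # 30
patch_side = receptive_field(PATCHGAN_LAYERS)    # 70
```

Under these assumptions, each of the 30 × 30 output elements judges an overlapping 70 × 70 pixel region of the input, which is why the discriminator scores local texture rather than the image as a whole.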
The cost function used during the GAN optimisation was labelled as $L_{TOT}$ and equal to the following:
$$L_{TOT} = L_{GAN} + \lambda L_{1}$$
where $L_{GAN}$ is the adversarial loss function; $L_{1}$ is the L1 norm loss; and $\lambda$ is the weighting factor of the $L_{1}$ loss. A hyperparameter tuning phase was performed to identify the optimal parameters for network training: three learning rate values (0.02, 0.002, and 0.0002) were combined with two lambda weighting factors (100 and 2000) in a grid search strategy. All the training iterations were run for 200 epochs, starting from a random noise vector with a Gaussian distribution with a mean of 0 and a standard deviation of 0.02.
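The objective described above, an adversarial term plus an L1 term weighted by λ, and the grid of hyperparameters can be sketched in a few lines. This is a minimal illustration assuming the usual pix2pix generator objective with a binary cross-entropy adversarial term; the function names are hypothetical and this is not the training code used in the study.

```python
import math
from itertools import product


def bce(prob, target, eps=1e-12):
    """Binary cross-entropy for one probability, clamped for numerical safety."""
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def generator_loss(disc_probs_on_fake, fake_pixels, real_pixels, lam):
    """L_TOT = L_GAN + lam * L_1 for a single image pair.

    disc_probs_on_fake: discriminator outputs (one per patch) for the generated image.
    fake_pixels, real_pixels: flattened generated and reference images.
    """
    # Adversarial term: the generator wants every patch to be judged real (1).
    l_gan = sum(bce(p, 1.0) for p in disc_probs_on_fake) / len(disc_probs_on_fake)
    # L1 term: mean absolute pixel difference from the reference image.
    l_1 = sum(abs(f - r) for f, r in zip(fake_pixels, real_pixels)) / len(real_pixels)
    return l_gan + lam * l_1


# Grid reported in the text: 3 learning rates x 2 lambda values = 6 training runs.
LEARNING_RATES = [0.02, 0.002, 0.0002]
LAMBDAS = [100, 2000]
GRID = list(product(LEARNING_RATES, LAMBDAS))
```

A larger λ shifts the balance towards pixel-wise fidelity to the reference CT and away from the adversarial term.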
Two hundred was considered a sufficient number of epochs in consideration of similar studies recently reported in the literature on the same topic [16,17]. The early-stop technique was used to prevent the risk of overfitting. Data augmentation was not implemented. Error backpropagation was adopted for network weight adjustment, and the dropout technique was used to avoid overfitting. Training optimisation was carried out using gradient descent with an Adam optimiser, with a = 0.999 and b = 0.999 as parameters and a batch size of 1. As for the inference phase, the time spent generating sCT images from a complete CBCT study was measured for each test case.
Image selection and pre-processing were performed using RStudio and dedicated packages, while neural network training and testing were run using Python (version 3.9) [21]. The whole process was carried out using a Dell Precision 5820 Tower X series workstation (Dell Inc., Round Rock, TX, USA) equipped with an Intel Core i9-10900X CPU with a clock frequency of 3.7 GHz, 64 GB of RAM, and a 24 GB NVIDIA RTX A5000 Graphics Processing Unit (GPU).

Synthetic CT Evaluation
The performance of the best neural network obtained was evaluated on the test set, composed of patients clinically treated using Volumetric Modulated Arc Therapy (VMAT) and Intensity Modulated Radiation Therapy (IMRT) planning. The clinical and dosimetric characteristics of the patients included in the test set are reported in Section 3.
For each of the test cases, a sCT was generated from the CBCT image of the first treatment fraction, and the clinical treatment plan originally calculated on the simulation CT was recalculated on the sCT. Dose calculation was performed using the Acuros XB algorithm (version 16.1.0), adopting a grid resolution of 1.25 mm³.
For each test case, the sCT was evaluated in terms of image and dose accuracy.
To avoid any possible bias in algorithm performance evaluation, a preliminary visual consistency check was carried out by a radiation oncologist in the test set, with the aim of removing patients showing large differences in positioning and anatomical conditions between the CBCT acquired at first fraction and CT simulation.
Image accuracy was quantified by calculating the mean absolute error (MAE) and mean error (ME) in terms of Hounsfield Units (HU) between the synthetic and the original CT within a field of view of 28 cm. Dose accuracy was estimated by calculating the difference in the estimation of various Dose Volume Histogram (DVH) indicators and performing gamma analysis between the dose distribution calculated on the sCT and that calculated on the original CT [22].
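The two image metrics can be written down explicitly. The sketch below assumes that both images are already registered and flattened to lists of HU values, that the evaluation region (for example the 28 cm field of view) is given as a boolean mask, and that ME is defined as sCT minus CT; the sign convention is an assumption, as it is not stated in the text.

```python
def mae_me(sct_hu, ct_hu, mask):
    """Return (MAE, ME) in HU over the voxels where mask is True.

    ME is computed as sCT - CT (assumed sign convention), so a negative ME
    means the synthetic CT underestimates HU on average.
    """
    diffs = [s - c for s, c, m in zip(sct_hu, ct_hu, mask) if m]
    if not diffs:
        raise ValueError("mask selects no voxels")
    mae = sum(abs(d) for d in diffs) / len(diffs)
    me = sum(diffs) / len(diffs)
    return mae, me
```

MAE measures the typical size of the HU error, while ME reveals a systematic bias: errors of opposite sign cancel in ME but not in MAE, so the two values are reported together.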
As for the organs at risk (OARs), three parameters (D2% [Gy], D98% [Gy], and D50% [Gy]) were considered as DVH indicators for the rectum and bladder. The difference among the DVH indicators was evaluated using a box plot analysis [16,23].
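For readers less familiar with DVH indicators, the sketch below shows how Dx% and Vx% values can be derived from the dose values of the voxels inside a structure. It assumes voxels of equal volume and uses the common definitions (Dx% as the minimum dose received by the hottest x% of the volume, Vx% as the percentage of the volume receiving at least x% of the prescribed dose); the function names are hypothetical, and treatment planning systems may interpolate differently.

```python
import math


def dose_at_volume(doses, volume_percent):
    """Dx%: minimum dose received by the hottest volume_percent of the voxels."""
    ordered = sorted(doses, reverse=True)
    n_hot = max(1, math.ceil(volume_percent * len(ordered) / 100.0))
    return ordered[n_hot - 1]


def volume_at_dose(doses, prescription, dose_percent):
    """Vx%: percentage of voxels receiving at least dose_percent of the prescription."""
    threshold = prescription * dose_percent / 100.0
    return 100.0 * sum(1 for d in doses if d >= threshold) / len(doses)
```

The difference for each indicator is then obtained by evaluating the same function on the dose computed on the sCT and on the original CT for the same structure and subtracting the two values.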
As for gamma analysis, the mean 3D gamma passing rate, indicating the percentage of points reporting a gamma value lower than one, was reported as the global gamma index. A low dose threshold level equal to 10% of the maximum dose and three different tolerance criteria (1%/1 mm, 2%/2 mm, and 3%/3 mm) were used, as also reported in similar studies in this field [16,17]. The Verisoft software (version 3.5, PTW, Freiburg, Germany) was used to perform the gamma analysis.
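The idea behind the gamma passing rate can be illustrated with a simplified one-dimensional, brute-force version of the global gamma index. This is only a sketch under stated assumptions: the study used a 3D analysis in Verisoft, whose implementation is not reproduced here, and the code assumes both dose profiles are sampled on the same regular grid and counts γ ≤ 1 as passing.

```python
import math


def gamma_passing_rate(ref_dose, eval_dose, spacing_mm,
                       dose_tol_percent, dist_tol_mm,
                       low_dose_cutoff_percent=10.0):
    """Percentage of reference points with gamma <= 1 (global normalisation).

    ref_dose, eval_dose: dose profiles on the same regular 1D grid.
    dose_tol_percent is taken relative to the maximum reference dose.
    Reference points below the low-dose cutoff are excluded.
    """
    max_ref = max(ref_dose)
    dose_tol = dose_tol_percent / 100.0 * max_ref
    cutoff = low_dose_cutoff_percent / 100.0 * max_ref
    n_pass = 0
    n_total = 0
    for i, d_ref in enumerate(ref_dose):
        if d_ref < cutoff:
            continue
        # Exhaustive search over all evaluated points for the minimum gamma.
        gamma_sq = min(
            ((j - i) * spacing_mm / dist_tol_mm) ** 2
            + ((d_eval - d_ref) / dose_tol) ** 2
            for j, d_eval in enumerate(eval_dose)
        )
        n_total += 1
        if math.sqrt(gamma_sq) <= 1.0:
            n_pass += 1
    if n_total == 0:
        raise ValueError("no reference points above the low-dose cutoff")
    return 100.0 * n_pass / n_total
```

For example, `gamma_passing_rate(ct_profile, sct_profile, 1.0, 1.0, 1.0)` would correspond to the 1%/1 mm criterion on a 1 mm grid, where `ct_profile` and `sct_profile` are hypothetical dose profiles computed on the original and synthetic CT.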

Results
The clinical and dosimetric characteristics of the patients included in the training and in the test set are summarised in Table 1. Each network training process took a mean of 24 h, while the mean generation time for a sCT study was 74 ± 7 s. The image selection process initially performed resulted in a discard of 3128 and a selection of 1379 paired images that met the image quality criteria. A learning rate of 2 × 10⁻⁴ and a lambda value of 2000 were identified as the optimal hyperparameter set during network optimisation. Figure 1 reports a comparison between the sCT obtained using the optimal hyperparameter set and the corresponding CBCT and CT for a test patient chosen as an example. Palliative treatment plans were planned for patients 1-7, while curative plans were planned for patients 8-14.
The mean differences of the dose DVH parameters calculated using the sCT approach with respect to those calculated on the original CT were reported as a boxplot analysis in Figure 3.
As for PTV percentage metrics, a mean percentage difference of 0.4 ± 0.3% was observed in the estimation of V95% [%] and 0.2 ± 0.1% in the estimation of V105% [%]. All the DVH indicators related to dose values were in agreement within 1 Gy between the calculations on CT and sCT.
Figure 4 reports the axial dose distribution obtained on a palliative case as calculated on the real CT (left) and on the synthetic CT (right). Dose accuracy was also evaluated in terms of gamma passing rates obtained by comparing the dose distribution calculated on the sCT with the one calculated on the real CT. The results for the different gamma criteria are reported in Table 2, separately for palliative and curative test cases. No significant difference was observed between curative and palliative planning, as indicated by the p-values listed in the last column, calculated using the t-test for unpaired samples. Figure 5 reports the gamma passing rates obtained by merging the whole test set.
Interestingly, a median gamma passing rate higher than 90% is visible even considering the tightest gamma criterion (1%/1 mm).

Discussion
The advent of AI is revolutionising the RT clinical workflow, opening new opportunities in terms of research and clinical innovation to enhance the clinical impact of this cutting-edge discipline. The development of a single-session workflow is a new frontier in radiotherapy, not imaginable a few years ago and now made possible by the advent of AI, that will reduce treatment waiting times and offers an important opportunity to shorten the time to treatment for patients receiving palliative care. The application of Deep Learning techniques to generate synthetic CT from Cone Beam Computed Tomography images, bypassing the traditional CT simulation, has been explored by various research groups across different treatment sites and DL architectures.
Chen et al. were among the first groups to propose a DL solution in this framework, implementing in 2020 a U-Net architecture able to guarantee a MAE of 18.98 HU in the case of patients affected by head and neck tumours [24]. In the last few years, some studies have also investigated the differences among DL architectures. Wang et al. compared U-Net, CycleGAN, and cGAN (referred to in the paper as pix2pix) for sCT generation from breast CBCT images, obtaining better results using the U-Net architecture in terms of image similarity and dose accuracy [25].
While various studies related to the pelvic site have been reported in the literature, only a limited number have thoroughly investigated the aspects related to dose accuracy. The CycleGAN architecture was one of the most widespread solutions; it was adopted by different research groups on paired [26] and unpaired [27] data, yielding promising results in terms of image quality (the best reported values were 16.1 HU for MAE and 14.6 HU for ME), but without presenting results in terms of dose accuracy. The superiority of GAN over U-Net and CycleGAN in the pelvis was recently demonstrated by Zhang et al., who limited the dose evaluation to a single example case. This paper focused on a comprehensive dose accuracy analysis to assess the possibility of integrating this approach into clinical radiotherapy practice. The results of this study indicated that GAN-based solutions can rapidly and reliably generate synthetic CT images, ensuring high dose accuracy in both palliative and curative cases. This opens the door to the effective implementation of clinical workflows that remove CT simulation from clinical practice. Notably, our proposed approach maintains high dose accuracy even in the challenging scenario of treatment plans for bone metastasis, where Hounsfield Unit attribution in bone regions is inherently uncertain.
Unlike sCT generation on MR-Linac, where MR and CT images are acquired on the same day, in CBCT-focused studies it is common for CBCT and CT images to be acquired on different days. To reduce potential bias due to errors in HU estimation not attributable to the neural network's performance but simply to differences in patient positioning or anatomical conditions, a preliminary visual consistency analysis was conducted on the test set to select only cases with a reliable match between first-day CBCT and simulation CT.
Appl. Sci. 2024, 14, 4844
The present study suffers from several limitations that should be addressed before moving towards full implementation of the algorithm in clinical practice. Firstly, the network's performance should be evaluated on a larger cohort of patients, including patients imaged on other Varian Ethos systems, to assess the generalisability of the network to similar systems. Additional validation should be conducted using CBCT images acquired with other RT machines to evaluate the applicability of this network to other conventional linear accelerators. A larger, multicentric validation is one of the future directions of this study. In the next few years, it is reasonable to expect that the quality of these sCT images could be further improved. Increasing the number of training cases or exploring alternative AI architectures, such as diffusion models, are some strategies currently under investigation [28,29]. High-quality sCT images generated from CBCT are not only useful for improving the accuracy of dose calculation but also for allowing more precise delineations for clinicians, as reported in a recent publication [30].
Furthermore, implementing a robust quality assurance protocol is crucial for promptly identifying local inaccuracies in synthetic CT images and issuing warnings in the event of errors in HU attribution. Some studies have recently been reported in the literature, demonstrating the need for new QA solutions [31,32]. Once these points are properly addressed, the implementation of a clinical workflow without simulation CT acquisition could become a reality, significantly reducing the time needed to offer a radiation treatment to patients receiving palliative care, maintaining high standards in terms of clinical quality, and offering the possibility of fully online adaptive treatments.

Conclusions
In conclusion, the integration of AI into the radiotherapy clinical workflow, particularly in the generation of synthetic CT images from CBCT data, represents a groundbreaking advancement with relevant advantages for patients and professionals alike. Using GAN architectures, it is possible to create sCT images that effectively substitute the CT acquisition, allowing high-quality treatments in the pelvis, not only for palliative purposes but also for curative cases. With the integration of additional external test set evaluation and the adoption of ad hoc quality assurance protocols, it is reasonable to assume that this approach could become the state of the art in the next few years.
Informed Consent Statement: Informed consent was not obtained from the patients because the retrospective study exclusively used anonymous and aggregated data, ensuring complete confidentiality and protection of personal information, and fell within the exceptions provided by ethical guidelines for secondary data research.

Data Availability Statement:
The raw data supporting the conclusions of this article will be made available by the authors on request.

Image selection criteria for the training set:
- High anatomical correspondence in terms of bony anatomy.
- Correspondence of the patient's body shape between the two images.
- High agreement in terms of location and volume of air pockets.
- No image artifacts in the CBCT due to the presence of large air bubbles.
- No cut images due to a reduced field of view (FOV), a condition typical of apical CBCT slices.

Figure 1. Example of CT (left), CBCT (centre), and synthetic CT (right) generated in the axial, sagittal, and coronal planes in a pelvic case.

As for the image analysis, Figure 2 reports the values obtained in terms of MAE and ME for each single patient in the test set: average values of −7 ± 6 HU and 36 ± 6 HU were obtained for ME and MAE, respectively.

Figure 2. MAE and ME values for test cases. Patients 1-7 are palliative cases; patients 8-14 are curative cases.


Figure 3. Boxplot analysis of DVH difference observed in the test set between synthetic and real CT in terms of estimation of D98%, D50%, and D2% for target (PTV), rectum (OAR1), and bladder (OAR2).

Figure 4. Visual representation of the axial dose distribution obtained on a palliative case as calculated on the real CT (left) and on the synthetic CT (right).
As regards the DVH comparison, five parameters were considered for the Planning Target Volume (PTV): V95% [%], V105% [%], D2% [Gy], D98% [Gy], and D50% [Gy].

Table 1. Clinical and dosimetric characteristics of the patients included in the study.

Table 2. Mean values of gamma passing rates obtained on the test set comparing the dose distribution calculated on real and synthetic CT and considering 1%/1 mm, 2%/2 mm, and 3%/3 mm gamma as tolerance criteria.