Deep-Learning Based Positron Range Correction of PET Images

Featured Application: This work demonstrates that a trained deep neural network is able to accurately correct the blurring present in PET images when using radionuclides with large positron ranges, such as 68 Ga. This approach can be used in preclinical and clinical PET studies.
Abstract: Positron emission tomography (PET) is a molecular imaging technique that provides a 3D image of functional processes in the body in vivo. Some of the radionuclides proposed for PET imaging emit high-energy positrons, which travel some distance before they annihilate (positron range), creating significant blurring in the reconstructed images. Their large positron range compromises the achievable spatial resolution of the system, which is more significant when using high-resolution scanners designed for the imaging of small animals. In this work, we trained a deep neural network named Deep-PRC to correct PET images for positron range effects. Deep-PRC was trained with modeled cases using a realistic Monte Carlo simulation tool that considers the positron energy distribution and the materials and tissues it propagates into. Quantification of the reconstructed PET images corrected with Deep-PRC showed that it was able to restore the images by up to 95% without any significant noise increase. The proposed method, which is accessible via GitHub, can provide an accurate positron range correction in a few seconds for a typical PET acquisition.


Introduction
Positron emission tomography (PET) is a molecular imaging technique that provides information about biochemical or physiological processes in the body in vivo [1]. PET images are obtained by detecting the radiation emitted by radionuclides bound to molecules (radiotracers) of interest administered to patients or animals under study [2]. PET currently provides very valuable information in clinical fields such as oncology, cardiology, and neurology, as well as in many preclinical studies [3].
PET is based on the detection of the two collinear 511 keV γ rays resulting from the annihilation of the positrons emitted by the unstable radionuclides. After being emitted, a positron will lose its initial kinetic energy through multiple interactions (mostly inelastic collisions [4]) with the electrons of the surrounding tissues. Positron-electron annihilation is most likely to occur when the positron has lost all its kinetic energy [5]. The distance between the emission and annihilation points is known as positron range (PR), and it is one of the main limiting factors of the spatial resolution of PET scanners [6][7][8][9][10][11]. PR makes the spatial distribution of the annihilation points a somewhat blurred version of that of the emission points (see Figure 1b). PR modeling is not straightforward, as it depends on both the kinetic energy of the emitted positrons (Figure 1a) and the electron density (i.e., the density and composition) of the surrounding tissues. 18 F is by far the most widely used PET radionuclide, accounting for about 90% of clinical studies [12]. Nevertheless, many other radionuclides have been proposed for use in PET imaging, and hundreds of PET radiotracers based on them have been developed. A challenge associated with the use of some of these radionuclides, such as 68 Ga, 82 Rb, 124 I, 76 Br, or 86 Y, is that they emit positrons with a large initial kinetic energy (see Figure 1), which results in large PR in tissues, yielding images with reduced spatial resolution and lower contrast in small volumes due to partial volume effects [7,11,13]. This effect is even more important when using high-resolution scanners designed to image small animals, which are able to achieve millimeter resolution [14].
The quantitative accuracy of PET depends on an accurate positron range correction (PRC). As 18 F has a relatively small PR in soft tissue, accurate PRC for different radionuclides has been largely neglected, and to the best of our knowledge is not yet performed in standard PET image reconstruction. Nevertheless, the improved resolution of current scanners and the use of radionuclides with large PR such as 68 Ga requires this correction to be improved.

Positron Range Models
PR can be modeled in many ways, which can be grouped into the following categories in order of increasing complexity:
Isotropic and tissue-independent model: A uniform model is used for all voxels, assuming water (or soft tissue) as the medium through which positrons travel and annihilate. This is a fast and easy procedure to implement, and only requires the activity distribution as input. PR distributions for most common radionuclides used in PET in water- or plastic-equivalent materials can be measured [15] or computed using Monte Carlo (MC) simulations based on well-established libraries such as GEANT4, PENELOPE, and FLUKA [10,16,17]. The simulated spatial distributions can be fitted to an equation [9,10]. These models can consider differences in the energy distribution of the positrons emitted by each radionuclide (see Figure 1a), but in general they are not quite realistic and can yield artefacts in heterogeneous media.
Isotropic but tissue-dependent model: The positron range effect is modeled as a blurring kernel that depends on the material of the voxel from which the positron is emitted, irrespective of the surrounding media. In this case, each blurring kernel is homogeneous and isotropic. This correction requires the coregistration of the CT image with the activity distribution and it is expected to work well everywhere except near tissue boundaries. However, a non-negligible number of positrons close to the skin-air boundary can escape, and using space-invariant filters can result in severe artefacts [19]. For instance, these artefacts may be observed in clinical 124 I PET imaging of thyroid glands close to the trachea [20].
Anisotropic and tissue-dependent (approximated) model: To address the presence of inhomogeneous media, space-variant models have been proposed [21,22]. The anisotropic PR has been modeled by anisotropically truncating an isotropic point probability density function based on tissue type, performing successive convolution operations of tissue-dependent PR kernels, or using the average of the fitting parameters of annihilation densities for the originating voxel and the target voxel [23]. These filters provide a fast and robust method for implementing a PR model, but due to the complexity of positron migration at irregular interfaces, they may not always be accurate.
Anisotropic and tissue-dependent (full) model: In this final case, the blurring kernel takes into account the material from which the positron is emitted and all the different materials that the positron travels through until it annihilates. These models are accurate even when the activity is distributed at extreme tissue boundaries. Monte Carlo simulations, despite their large computational cost, can be used to generate these models [24], including when the PR is in the presence of magnetic fields [17,25,26].
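As an illustration of the simplest category above (isotropic and tissue-independent), the following minimal sketch blurs an emission image with a single radially symmetric kernel. The exponential fall-off and the `scale_mm` value are assumptions chosen only for illustration and are not a fitted PR distribution for any radionuclide; the function names are hypothetical.

```python
import numpy as np

def isotropic_kernel(shape, voxel_mm, scale_mm):
    """Normalized 3D kernel that depends only on the distance to the centre.

    The exponential profile is illustrative, not a fitted range distribution.
    """
    axes = [(np.arange(n) - n // 2) * voxel_mm for n in shape]
    z, y, x = np.meshgrid(*axes, indexing="ij")
    r = np.sqrt(x**2 + y**2 + z**2)
    kernel = np.exp(-r / scale_mm)
    return kernel / kernel.sum()

def blur_emissions(emission, kernel):
    """Circular FFT convolution of the emission image with the kernel."""
    # ifftshift moves the kernel centre to index 0 so the image is not shifted.
    k_hat = np.fft.fftn(np.fft.ifftshift(kernel))
    return np.real(np.fft.ifftn(np.fft.fftn(emission) * k_hat))

# Point source in a uniform medium: emissions in, annihilations out.
emission = np.zeros((32, 32, 32))
emission[16, 16, 16] = 1000.0
kernel = isotropic_kernel(emission.shape, voxel_mm=0.28, scale_mm=0.6)
annihilation = blur_emissions(emission, kernel)
```

Because the same kernel is applied to every voxel, this sketch ignores tissue boundaries entirely, which is the limitation the more complex categories try to address.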

Positron Range Correction Methods
Many correction approaches have been proposed to remove the blurring caused by PR on reconstructed PET images. These methods differ not only in the PR model they are based on, but also in how they are applied. Two main approaches for PRC in PET can be identified.
PRC applied as a postprocessing step to the reconstructed images: In this approach, the reconstructed images before PRC are considered to represent the distribution of the positron annihilations, instead of the distribution of the positron emissions. Therefore, the goal of the postprocessing PRC procedure is to convert the annihilation distribution into the positron emission distribution (the one expected to be seen with PET). Postprocessing PRC has the advantage of being fast, simple, and to some extent independent of the procedures, algorithms, and codes used in the image reconstruction. On the other hand, it has the risk of increasing the noise in the final corrected images [25]. This approach has been applied using Fourier deconvolution techniques [27] with isotropic and tissue-independent kernels, as well as iterative deconvolution methods such as Richardson-Lucy [28], which enables the use of more realistic PR models.
PRC applied within the tomographic iterative image reconstruction process: PR models can be included in a point-spread-function (PSF) or resolution kernel [13,14,25], or within the system response matrix (SRM) used in the iterative reconstruction. This can be argued to be the most common approach used for PRC with 18 F as, in fact, many image-reconstruction methods use an SRM derived from point source measurements [29] obtained from 18 F in water, 22 Na in plastic, or realistic simulations [30] that incorporate PR effects. This approach has some important limitations. First, adapting an existing SRM to other radionuclides could be difficult, unless an SRM is used wherein PR is factored out [31]. Second, fully realistic PR models would require the SRM to be evaluated in each acquisition.
In some works, a mismatched projector/backprojector pair has been proposed, with PR blurring only applied right before the forward projection operation. It has been shown that using a PR blurring kernel in the backprojector has only the effect of reducing the convergence speed of the iterative algorithm [8]. In this case, the SRM used should not incorporate any PR effects. In many cases, it is difficult to separate the PR from other blurring effects considered in the SRM, and including an isotope-dependent PRC in a reconstruction procedure introduces the risk of overcorrecting for PR, yielding overshooting and Gibbs artefacts [32].
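The postprocessing approach with iterative deconvolution mentioned above can be sketched with a few Richardson-Lucy iterations. This is a minimal 2D example under stated assumptions: the kernel is space-invariant, a Gaussian stands in for a real PR kernel, the data are noise-free, and the convolution is circular. None of these choices are taken from the cited works.

```python
import numpy as np

def gaussian_kernel(shape, sigma):
    """Normalized 2D Gaussian centred in the array (stand-in for a PR kernel)."""
    axes = [np.arange(n) - n // 2 for n in shape]
    yy, xx = np.meshgrid(*axes, indexing="ij")
    k = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return k / k.sum()

def fft_convolve(image, kernel_hat):
    """Circular convolution given the kernel's Fourier transform."""
    return np.real(np.fft.ifftn(np.fft.fftn(image) * kernel_hat))

def richardson_lucy(blurred, kernel, n_iter=50, eps=1e-12):
    """Richardson-Lucy iterations for a space-invariant, normalized kernel."""
    k_hat = np.fft.fftn(np.fft.ifftshift(kernel))
    k_hat_adj = np.conj(k_hat)  # adjoint: correlation with the kernel
    estimate = blurred.copy()   # start from the blurred image
    for _ in range(n_iter):
        predicted = fft_convolve(estimate, k_hat)
        ratio = blurred / np.maximum(predicted, eps)
        estimate = estimate * fft_convolve(ratio, k_hat_adj)
    return estimate

# Synthetic test: a hot square on a warm background.
truth = np.ones((64, 64))
truth[24:40, 24:40] = 5.0
kernel = gaussian_kernel(truth.shape, sigma=2.0)
blurred = fft_convolve(truth, np.fft.fftn(np.fft.ifftshift(kernel)))
restored = richardson_lucy(blurred, kernel)

err_blurred = np.abs(blurred - truth).mean()
err_restored = np.abs(restored - truth).mean()
```

With noisy data the same iterations tend to amplify noise as they sharpen the image, which is the drawback of postprocessing deconvolution noted in the text.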

Neural Networks in Medical Imaging
Machine learning (ML), and more specifically the area of ML known as deep learning, are having a huge impact on many areas, including medical imaging and PET [33]. Deep learning methods are able to create accurate mappings between inputs and outputs by means of artificial neural networks (NNs) with a large (deep) number of layers. The fact that the same framework can connect many different inputs, such as measurements, raw images, outputs such as labels, and reference images, allows its application in a large variety of problems and disciplines. A recent overview on ML and deep learning for PET imaging can be found in Reference [34].
The components of the NN are learned from example training datasets. In the case of supervised learning, the example inputs are paired with their corresponding desired outputs. After the training, the NN can then be used on new input data to predict their outputs.
Various studies based on convolutional neural networks (CNNs) have been proposed for medical image generation, especially for segmentation, many of them using the U-Net structure [35]. However, to the best of our knowledge, such networks have not yet been applied for PRC.

Proposed Method
In this work, we propose a deep-learning based PRC method (Deep-PRC) applied as a postprocessing step to the reconstructed PET images. Our goal was to develop a fast PRC method for 3D PET imaging that is able to provide PET images for medium- and large-range radionuclides, rivaling in spatial resolution the ones reconstructed with the standard short-range 18 F radionuclide. The NN was trained with realistic simulated cases of preclinical studies of reconstructed images of 68 Ga and 18 F corresponding to the same activity distribution, and it was able to produce accurate and precise 68 Ga PR-corrected images similar to the 18 F ones.
We assumed that the image reconstruction method used to obtain the images had already incorporated the PRC for 18 F, and, therefore, the purpose of this work was to obtain a PRC for 68 Ga that made it look similar to the corresponding 18 F counterpart, without the risk of double correcting this effect. The source code is available on GitHub [36].

Materials and Methods
Supervised ML requires a set of reference cases to train the NN. In this work, we generated simulated cases with realistic activity, material, and density distribution. The complete workflow is depicted in Figure 2, and each component is described in detail in the sections below.


Positron Range Simulator
The MC tool PenEasy (v2020) [37] based on PENELOPE [37] was used to simulate the PR for different radionuclides in several heterogeneous biological tissues. PENELOPE is a code for the MC simulation of coupled transport of electrons, positrons, and photons. It is suitable for the range of energies between 100 eV and 1 GeV and allows for definition of complex materials and geometries. In this work, PenEasy was adapted to generate 3D images of the spatial distribution of the positron annihilation points, using as an input the 3D images of the positron emissions and CT (see Figure 2). PenEasy considers the path traveled by each positron until its annihilation, taking into account its energy distribution and all the materials in the field of view.
The energy distributions of the positrons emitted by 68 Ga and 18 F were obtained with the code PenNuc [18], which considers all possible decay branches and nuclear properties for a large set of tabulated radionuclides (see Figure 1a).

Simulated Cases
We used numerical models of mice from a repository [38] to simulate the different cases needed for training, testing, and validation of the NN. The material composition and density of each segmented tissue in the models were directly obtained from the repository, while different activities were assigned to each tissue type, such as heart, liver, kidneys, and tumors, using a range of typical values found in 18 F-Fluorodeoxyglucose ( 18 F-FDG) acquisitions. The numerical models consisted of 154 × 154 × 242 cubic voxels of 0.28 × 0.28 × 0.28 mm, which covered a major part of the bodies of the mice (except for the brain, which was not included in the segmentation of the repository).
A total of eight different whole-body mouse models were used with PenEasy to generate the positron annihilation distributions from the initial positron emission, material, and density distributions (see Figure 3). Each model was simulated twice, once for 18 F and once for 68 Ga. Each simulation consisted of around 3 × 10⁸ positron emissions, simulated at a rate of 3.4 × 10⁴ histories per second for the 18 F simulations and 2.2 × 10⁴ histories per second for the 68 Ga simulations on an Intel(R) Xeon(R) CPU @ 2.30GHz computer.
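The figures quoted above imply a physical extent for the numerical models and an approximate run time per simulation. The short calculation below makes this explicit; it assumes the quoted rates are constant over a run and apply to a single simulation, which the text does not state.

```python
# Physical extent of the numerical mouse models (values from the text).
voxels = (154, 154, 242)
voxel_mm = 0.28
extent_mm = tuple(round(n * voxel_mm, 2) for n in voxels)

# Approximate wall-clock time per simulation, assuming a constant rate.
histories = 3e8
rate_per_s = {"18F": 3.4e4, "68Ga": 2.2e4}
hours = {iso: histories / rate / 3600.0 for iso, rate in rate_per_s.items()}

print(extent_mm)
print({iso: round(h, 2) for iso, h in hours.items()})
```

Under these assumptions each model covers about 43 × 43 × 68 mm, and a 68 Ga run takes roughly one and a half times as long as an 18 F run.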


PET Acquisition Simulation and Reconstruction
In order to generate images more similar to actual PET reconstructed images, the positron annihilation distributions obtained from PenEasy were used to simulate a realistic PET acquisition in a generic preclinical scanner similar to an Inveon PET/CT scanner [39] using the MC simulator MCGPU-PET [40]. MCGPU-PET is a PET adaptation of MC-GPU, a code originally developed for X-ray imaging [41]. MCGPU-PET allows very fast and realistic simulation of PET acquisitions from voxelized activity, material, and density distributions. The data generated by MCGPU-PET can be histogrammed into 3D sinograms, in our case with 147 × 168 × 1293 bins considering a maximum ring difference of 79, an axial compression factor of 11, and a radial bin size of 0.795 mm. On a computer with a GeForce GTX 1080 GPU (8 GB), MCGPU-PET simulates around 1.2 × 10⁹ coincidences per minute (2 × 10⁷ coincidences/second), including scattered and non-scattered true coincidences.

For reconstruction of the sinograms, we used GFIRST [42], the GPU-accelerated version of FIRST [30], a 3D-OSEM algorithm that allows a physical model to be incorporated into the SRM. In this case, the SRM used was the standard one created based on 18 F in water. We used one subset and 40 iterations. The final images consisted of 154 × 154 × 80 voxels with a size of 0.28 × 0.28 × 0.795 mm, which is the typical size of the images reconstructed in the Inveon scanner [39]. The total reconstruction time was 50 s on a GTX 1080 GPU (8 GB). The values of the reconstructed images were converted into standardized uptake values (SUV) to make it easier to evaluate the performance of the method. In SUV, a radiotracer uniformly distributed in the body would have a value of 1.
At the end of the whole simulation workflow (see Figure 2), eight mice were simulated with 18 F and 68 Ga, for a total of 640 slices. In order to make the size of the slices more tractable to the neural network, each image of 154 × 154 pixels was padded with zeros to reach a size of 160 × 160 pixels.
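The zero-padding step can be sketched as follows. The text does not say where the zeros are placed, so symmetric padding (three pixels on each side) is an assumption, and `pad_slices` is a hypothetical helper name. One practical property of the padded size is that 160 is divisible by 16, so it can be halved four times without remainder, whereas 154 cannot.

```python
import numpy as np

def pad_slices(volume, target=160):
    """Zero-pad the in-plane size of a (slices, rows, cols) volume symmetrically."""
    _, rows, cols = volume.shape
    pr, pc = target - rows, target - cols
    if pr < 0 or pc < 0:
        raise ValueError("slices are larger than the target size")
    pad = ((0, 0), (pr // 2, pr - pr // 2), (pc // 2, pc - pc // 2))
    return np.pad(volume, pad, mode="constant", constant_values=0.0)

# One reconstructed volume: 80 slices of 154 x 154 pixels (random stand-in data).
volume = np.random.default_rng(0).random((80, 154, 154)).astype(np.float32)
padded = pad_slices(volume)

# Eight volumes of 80 slices each give the 640 slices quoted in the text.
total_slices = 8 * volume.shape[0]
```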

Neural Network
The CNN was implemented in Python within the Tensorflow framework [43] (v 2.3.0) with Keras [44]. It was based on the U-Net network [35], which has been demonstrated to be useful in many medical imaging applications [45,46]. We directly used the U-Net model available in Keras with four levels, 64 filters, and a dropout factor of 0.2 (see Figure 4). The source code can be found in Reference [36].
The inputs to the model were slices of the 68 Ga (PET) and µ-map (CT) volumes. In this work, we wanted to evaluate the amount of input information needed to perform an accurate PRC with a neural network. Therefore, we considered six different cases (see Table 1). In three of them, only the 68 Ga PET images were considered for the input, and in the other three both the 68 Ga images and the µ-maps (PET/CT) were considered. In each group, we evaluated the performance of the method when using one, three, and five input slices. The different slices were used as channels in the input layer (see Figure 4). In all cases, the output was the corresponding central slice from the 18 F-PET image, with a size of 160 × 160 × 1 (one channel). The sizes of the different input layers are shown in Table 1. In all cases, the total number of parameters was 31.7 million and the size of the model (hdf5 file) was 485 MB.
In this work, the Swish activation function [47] was used instead of ReLU (except for the final output layer). Swish is a self-gated activation function that has been reported to perform better than ReLU with a similar level of computational efficiency [47].
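For reference, Swish multiplies its input by a sigmoid gate of the same input. The NumPy sketch below compares it with ReLU on a few values; it is a standalone illustration and not the layer implementation used in the network.

```python
import numpy as np

def swish(x):
    """Swish activation: x * sigmoid(x)."""
    x = np.asarray(x, dtype=float)
    return x / (1.0 + np.exp(-x))

def relu(x):
    """ReLU activation: max(x, 0)."""
    return np.maximum(np.asarray(x, dtype=float), 0.0)

x = np.array([-5.0, -1.0, 0.0, 1.0, 5.0])
swish_values = swish(x)
relu_values = relu(x)
```

Unlike ReLU, Swish is smooth and lets small negative values pass through, while approaching the identity for large positive inputs.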

Model Training
The simulated cases were separated as follows: one volume was set aside and not used in the training/validation process, while the other seven volumes were processed with data augmentation consisting of flips, shifts in the horizontal and vertical directions, and in-plane rotations. This enabled the network to be trained with a much larger variety of cases than the initially simulated ones. Note that zoom could not be used in this case for data augmentation, as it would have changed the spatial scale of the PR effect.
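A minimal sketch of this kind of augmentation is given below. The key point is that the same random transform is applied to the input slice and to its target, and that no scaling is involved. Rotations are restricted here to multiples of 90 degrees and shifts to whole pixels for simplicity; these restrictions, the `max_shift` value, and the function names are assumptions, not details from the text.

```python
import numpy as np

def shift_zero_fill(img, dy, dx):
    """Shift a 2D image by whole pixels, filling vacated pixels with zeros."""
    out = np.zeros_like(img)
    h, w = img.shape
    src = img[max(0, -dy):min(h, h - dy), max(0, -dx):min(w, w - dx)]
    y0, x0 = max(0, dy), max(0, dx)
    out[y0:y0 + src.shape[0], x0:x0 + src.shape[1]] = src
    return out

def augment_pair(inp, target, rng, max_shift=8):
    """Apply one random flip/rotation/shift identically to input and target."""
    flip_v = rng.random() < 0.5
    flip_h = rng.random() < 0.5
    k = int(rng.integers(0, 4))  # number of 90-degree in-plane rotations
    dy, dx = (int(v) for v in rng.integers(-max_shift, max_shift + 1, size=2))

    def apply(img):
        if flip_v:
            img = img[::-1, :]
        if flip_h:
            img = img[:, ::-1]
        img = np.rot90(img, k)
        return shift_zero_fill(img, dy, dx)

    return apply(inp), apply(target)

rng = np.random.default_rng(1)
inp = rng.random((160, 160))
target = 2.0 * inp  # stand-in for the paired 18 F slice
aug_inp, aug_target = augment_pair(inp, target, rng)
```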
Model training was performed with the recently proposed Rectified Adam optimizer [48] (learning rate of 1 × 10⁻³) and with the Lookahead technique [49]. The combination of these techniques provided a much faster convergence of the training process compared to the more commonly used Adam optimizer. The loss function used was the L1-norm between each slice of the ground truth of the 18 F image, and the output slice of the Deep-PRC network.
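To make the two ingredients above concrete, the sketch below defines an L1 loss and a Lookahead loop on a toy one-parameter problem. Plain gradient descent is used as the inner optimizer in place of Rectified Adam, and the values of `k`, `alpha`, and `lr` are illustrative assumptions; this is not the training code of the paper.

```python
import numpy as np

def l1_loss(pred, target):
    """Mean absolute difference between prediction and target."""
    return float(np.mean(np.abs(pred - target)))

def lookahead(grad_fn, w0, lr=0.1, k=5, alpha=0.5, n_outer=50):
    """Lookahead: k fast inner steps, then move the slow weights towards them."""
    slow = np.array(w0, dtype=float)
    for _ in range(n_outer):
        fast = slow.copy()
        for _ in range(k):
            fast -= lr * grad_fn(fast)  # inner optimizer: plain gradient descent
        slow += alpha * (fast - slow)   # slow-weight interpolation
    return slow

# Toy problem: find the scale w such that w * x matches target = 3 * x.
x = np.linspace(0.1, 1.0, 50)
target = 3.0 * x

def grad(w):
    # Subgradient of the L1 loss with respect to w.
    return np.mean(np.sign(w * x - target) * x)

w0 = np.array(0.0)
w = lookahead(grad, w0)
loss_start = l1_loss(w0 * x, target)
loss_end = l1_loss(w * x, target)
```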
For training, we used an NVIDIA T4 (NVIDIA Corporation, Santa Clara, CA, USA) graphics processing unit (GPU) with 16 GB of memory from Google Cloud (AI notebook running Jupyter Lab, with CUDA 10.1). The models were trained for 50 epochs with 100 iterations each. It took around 1 h to train each considered case.

Application and Quantitative Analysis
The trained models were saved as Keras models in hdf5 format. The models were then loaded and applied to a simulated study not included in either the training or the validation datasets.
The input was adapted to the specific characteristics of each trained network:
• Selecting the corresponding input slices. In the case of the models with three and five input slices, the slices closer to the edges of the axial FOV were extended to avoid truncation.
• Zero-padding the images to obtain 160 × 160 pixels in each slice.
• Normalizing the values to be between −1 and 1. This normalization was restored in the output, so that the PRC preserved the appropriate units.
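The three preparation steps listed above can be sketched as follows. Several details are assumptions because the text does not specify them: edge slices are extended by repeating the first or last slice, padding is symmetric, and the normalization is a min-max rescaling whose parameters are kept to restore the units afterwards. All function names are hypothetical.

```python
import numpy as np

def select_slices(volume, index, n_slices):
    """Take n_slices centred on `index`, repeating edge slices at the FOV ends."""
    half = n_slices // 2
    idx = np.clip(np.arange(index - half, index + half + 1), 0, volume.shape[0] - 1)
    return volume[idx]

def pad_to(stack, target=160):
    """Symmetric in-plane zero-padding of a (slices, rows, cols) stack."""
    pr, pc = target - stack.shape[1], target - stack.shape[2]
    pad = ((0, 0), (pr // 2, pr - pr // 2), (pc // 2, pc - pc // 2))
    return np.pad(stack, pad, mode="constant")

def normalize(x):
    """Rescale to [-1, 1]; return the parameters needed to undo it."""
    lo, hi = float(x.min()), float(x.max())
    scale = (hi - lo) if hi > lo else 1.0
    return 2.0 * (x - lo) / scale - 1.0, (lo, scale)

def denormalize(y, params):
    """Inverse of normalize, restoring the original units."""
    lo, scale = params
    return (y + 1.0) * 0.5 * scale + lo

# Stand-in 68 Ga volume in SUV: 80 slices of 154 x 154 pixels.
volume = np.random.default_rng(2).random((80, 154, 154)) * 4.0
stack = pad_to(select_slices(volume, index=0, n_slices=5))
channels_last = np.transpose(stack, (1, 2, 0))  # slices become channels
net_input, params = normalize(channels_last)
restored = denormalize(net_input, params)
```

In an actual application the network output, rather than the input itself, would be passed to `denormalize`.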
A quantitative analysis of the resulting image was performed to obtain the mean (µ) and standard deviation (σ) in different organs. The noise was defined as the σ:µ ratio in uniform regions away from any boundaries and edges. The recovery coefficients were obtained by defining regions over the whole organs, and their values were then normalized with respect to the reference reconstruction with 18 F. The differences between the 68 Ga images before and after the proposed PRC could be easily evaluated from the obtained coefficients.
Figure 5 shows the evolution of the loss function during the training step in the different cases (depending on the amount and type of input information provided). It can be seen that the L1 loss was significantly reduced in all cases, although the convergence was more monotonic and reached a lower L1 loss in the case of using five PET slices (Figure 5c). Some significant spikes were noticed in the validation cases. These reflected directions explored during training that may have yielded neural networks producing images with artefacts. Fortunately, these unwanted solutions did not last long, and the convergence process continued without problems.
Figure 6 shows a coronal view of a mouse with the µ-map obtained from the CT, and the reconstructed images of 18 F, 68 Ga, and 68 Ga after the PRC (using the model with five slices and PET-only input). It can be easily seen that the proposed method was able to recover the resolution loss in 68 Ga images with respect to 18 F, and that this increased the values in some areas with higher uptake.

Model Deployment and Quantitative Analysis
In order to test the model, the trained Deep-PRC network (PET, five slices) was applied to a simulated case considered in neither the training nor the validation process. The time required to obtain the PRC on the 80 slices of the whole volume was 5.14 s on a T4 GPU and 2.85 s on a V100 GPU. Although these times could be reduced by using multiple GPUs, they are already fast enough for preclinical and clinical applications.
Profiles along some organs of interest, such as the bladder, heart, and tumor, in the 18 F, 68 Ga, and 68 Ga with Deep-PRC images are shown in Figure 7. The significant impact of the PR in these cases is quite clear, as is the capacity of the proposed method to correct for this effect. The differences in SUV units between the estimated images ( 68 Ga with Deep-PRC) and the reference one ( 18 F) are shown in Figure 8. Areas with high uptake, such as the bladder, still had some residual error (as can also be seen in Figure 7a), but it was a small deviation in relative terms.
The quantitative analysis of the results is shown in Table 2. From the table, it is clear that the 68 Ga images corrected for PR were very similar to the 18 F images (with recoveries greater than 95% of the reference values). Additionally, the noise level of the estimated images was comparable to that of the reference ones, which indicates that the proposed method did not trade noise for resolution, as is the case in many deconvolution-based approaches for PRC.
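The two figures of merit used in this analysis (noise as the σ:µ ratio in a uniform region, and recovery as the organ mean normalized to the 18 F reference) can be illustrated with a minimal NumPy sketch. The function names, the boolean-mask representation of the regions, and the synthetic volume are assumptions made for illustration; they are not taken from the published code.

```python
import numpy as np

def roi_stats(image, mask):
    """Mean and standard deviation of the voxels selected by a boolean mask."""
    values = image[mask]
    return values.mean(), values.std()

def noise_level(image, uniform_mask):
    """Noise defined as the sigma:mu ratio inside a uniform region."""
    mu, sigma = roi_stats(image, uniform_mask)
    return sigma / mu

def recovery(image, reference, organ_mask):
    """Organ mean of `image` normalized to the same organ in `reference`."""
    mu_img, _ = roi_stats(image, organ_mask)
    mu_ref, _ = roi_stats(reference, organ_mask)
    return mu_img / mu_ref

# Synthetic example (hypothetical values, not data from the study):
# a cubic "organ" with uniform uptake in the reference image.
reference = np.zeros((8, 32, 32))
reference[:, 12:20, 12:20] = 10.0
organ_mask = reference > 0

# Stand-in for an uncorrected image that has lost 40% of the organ signal.
uncorrected = 0.6 * reference

print("recovery:", recovery(uncorrected, reference, organ_mask))
print("noise:", noise_level(reference, organ_mask))
```

In practice the uniform region used for the noise estimate would be drawn well inside an organ, whereas the recovery region covers the whole organ, as described in the text.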

Discussion
This paper presents the use of a deep convolutional neural network to provide an accurate PRC in PET. The method was evaluated on simulated preclinical studies, and its performance was characterized. To the best of our knowledge, this is the first work to successfully combine deep learning and PRC in a coherent framework.
Our results indicate that overall, the image quality produced by the learned model is comparable to that of the reference images, with recoveries going from around 60% to more than 95% while maintaining low noise levels.
One interesting question that we wanted to address in this work is the amount of input data required for these types of algorithms. With PR being a three-dimensional effect (i.e., it affects not only a particular single slice) and being dependent on both the PET activity and the material distribution (CT), it may be reasonable to expect that many PET and CT input slices will be needed to generate an accurate output slice. On the other hand, many works have shown that the information in PET and CT images is not independent, and that a deep neural network may be able to estimate to some extent images of one modality from the other. This fact may indicate that using only the PET image as input may be enough.
The results obtained with all configurations considered were good, as shown by the L1 loss function in the validation cases. Nevertheless, training with only the PET images as input and with a large enough number of axial slices appeared to be the best option. It is important to note that in our case, the slice thickness was twice the pixel size in the transverse direction. Therefore, considering five slices corresponded to using 4 mm in the axial direction, which seemed enough to account for the effects of surrounding slices on a specific slice. In this work, the number of slices was limited by the amount of memory available in the GPU, a constraint that newer GPU models should relax.
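As an illustration of the multi-slice input discussed above, the following sketch groups each axial slice of a PET volume with its neighbors, so that a network could receive five adjacent slices as channels for every output slice. The function name `make_slice_stacks`, the (Z, H, W) array layout, and the replication padding at the two axial ends are assumptions; the handling of edge slices in Deep-PRC is not described in this section.

```python
import numpy as np

def make_slice_stacks(volume, n_slices=5):
    """Build one (n_slices, H, W) input stack per axial slice of a (Z, H, W) volume.

    The target slice sits in the central channel. Slices beyond the axial ends
    are filled by replicating the first/last slice (an assumed padding choice).
    """
    if n_slices % 2 == 0:
        raise ValueError("n_slices must be odd so the target slice is centered")
    half = n_slices // 2
    # Pad only along the axial (first) dimension.
    padded = np.pad(volume, ((half, half), (0, 0), (0, 0)), mode="edge")
    n_z = volume.shape[0]
    # Channel i holds the slice at axial offset (i - half) from the target.
    return np.stack([padded[i:i + n_z] for i in range(n_slices)], axis=1)

# Hypothetical 80-slice volume with a tiny transverse size for readability.
volume = np.arange(80 * 4 * 4, dtype=np.float32).reshape(80, 4, 4)
stacks = make_slice_stacks(volume, n_slices=5)
print(stacks.shape)  # (80, 5, 4, 4)
```

With five slices spanning 4 mm, as stated above, each stack covers an axial slab of 0.8 mm per slice; a larger `n_slices` widens that slab at the cost of proportionally more input memory per sample.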
The training was based on minimization of the L1 norm between the reference images and the estimated ones. The L1 norm is known to be more stable and robust to the presence of noise than the L2 norm [50]. In any case, other loss functions could be explored in this context, including a loss term from an adversarial network (GAN).
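The different sensitivity of the two norms to isolated large errors can be seen in a small numerical sketch. Here the L1 and L2 losses are taken as the mean absolute error and the mean squared error, respectively; this normalization is an assumption, since the exact form used during training is not given in this section, and the arrays are synthetic.

```python
import numpy as np

def l1_loss(estimate, reference):
    """Mean absolute error between two images."""
    return np.mean(np.abs(estimate - reference))

def l2_loss(estimate, reference):
    """Mean squared error between two images."""
    return np.mean((estimate - reference) ** 2)

# Synthetic 100-voxel "image" with a single outlier voxel in the estimate.
reference = np.ones(100)
estimate = reference.copy()
estimate[0] += 10.0

# Doubling the outlier doubles the L1 loss but quadruples the L2 loss.
estimate_2x = reference.copy()
estimate_2x[0] += 20.0

print("L1:", l1_loss(estimate, reference), "->", l1_loss(estimate_2x, reference))
print("L2:", l2_loss(estimate, reference), "->", l2_loss(estimate_2x, reference))
```

Because the L1 loss grows only linearly with the size of an error, a few noisy voxels have less influence on the training signal than they would under the L2 loss.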
A detailed comparison of the performance of the proposed method with previously proposed ones (described in Section 1.2) is outside the scope of this work and is planned for future work. In any case, the fact that the proposed Deep-PRC had no significant impact on the noise level of the images is a clear advantage compared to previous approaches [25].
It is important to note that although we have proposed the method as a postprocessing step, the same neural network architecture could be used to generate a PR model that could be applied in the forward projection within an image reconstruction (simply by swapping the inputs and outputs of the neural network during training). This line of research will be explored in future work.
In this work, the axial FOV of the scanner was large enough for mice, so no significant axial truncation of the PET activity was present. If there is significant truncation (as usually happens in whole-body PET acquisitions, in which only a section of the body is examined at each axial bed position), it may be advisable to perform this PRC on the final 3D whole-body volume to avoid possible truncation artefacts (from activity outside the axial FOV at each particular bed position). This can be argued to be an advantage of the postprocessing PRC approach, as it can be easily applied to multibed studies in which the activity from other bed positions might have a non-negligible effect.
In this work, the effect of positronium formation in the positron range was not considered [9,51]. This can occur when, after losing its kinetic energy, the positron reaches thermal velocities (a few eV) and instead of annihilating directly with an electron into two gammas, forms an intermediate state called positronium, which may extend its lifetime. Its effect on PR is not well established, but it may have a non-negligible effect in porous materials or low-density ones such as the lungs.
The cases developed in this work are not specific to any particular scanner or radionuclide. In this work, we used the preclinical Inveon scanner and 68 Ga as a reference, but the proposed approach is flexible and suitable for any preclinical or clinical PET system and any radionuclide.

Conclusions
We developed and evaluated a deep convolutional neural network (Deep-PRC) that provides a fast and accurate PRC method to recover the resolution loss present in PET studies using radionuclides that emit positrons with large PR. We demonstrated its quantitative accuracy in realistic simulations of preclinical PET/CT studies with 68 Ga.
Our results suggest that it is sufficient to use PET images as input for the neural network (i.e., without the corresponding CT or the µ-map extracted from it), but it is important to include not only the reference slice (i.e., 2D case), but also some additional neighboring slices.
The correction of PR effects in PET image reconstruction is becoming mandatory in light of the increasing use of high-energy positron emitters in preclinical and clinical PET imaging and the improving spatial resolution of PET scanners. Convolutional neural networks seem to be very well suited for this type of correction.

Funding: We acknowledge support from the Spanish Government (RTI2018-095800-A-I00), from Comunidad de Madrid (B2017/BMD-3888 PRONTO-CM), and the NIH R01 CA215700-2 grant. JLH also acknowledges support from a Google Cloud Academic Grant.

Institutional Review Board Statement: Not applicable.
Informed Consent Statement: Not applicable.

Data Availability Statement: The source code and some training data are available on GitHub (https://github.com/jlherraiz/deepPRC).