Bayesian Maximum-A-Posteriori Approach with Global and Local Regularization to Image Reconstruction Problem in Medical Emission Tomography

The Bayesian Maximum a Posteriori (MAP) approach provides a common basis for developing statistical methods for solving ill-posed image reconstruction problems. MAP solutions depend on the a priori model. The approaches developed in the literature are based on prior models that describe the properties of the expected image rather than the properties of the studied object. In this paper, such models are analyzed and it is shown that they lead to global regularization of the solution. Prior models that are based on the properties of the object under study are developed, and conditions for local and global regularization are obtained. A new reconstruction algorithm has been developed based on the method of local statistical regularization. Algorithms with global and local regularization were compared in numerical simulations. The simulations were performed under conditions close to a real oncologic single photon emission computed tomography (SPECT) study. It is shown that the approach with local regularization produces more accurate images of 'hot spots', which is especially important for tumor diagnostics in nuclear oncology.


Introduction
Image reconstruction belongs to the class of ill-posed inverse problems of mathematical physics [1]. The method of solution for this class of problems was developed in 1963 by A. Tikhonov and is called regularization [2]. Regularization introduces a priori information about the problem in order to obtain a well-behaved inverse. The introduction of a priori information entails the necessity of choosing an unknown parameter, which is called the regularization parameter. In Tikhonov's approach, this parameter plays the role of a weight factor that controls the competition between the a priori information and the measured data. Initially, the method was developed in the form of 'global regularization', in which a single parameter controls the solution. However, it was found, even in the early practical applications, that global regularization often yields oversmoothed solutions. To improve the method, the idea of 'local regularization' was suggested, in which the 'level of smoothness' is regulated locally [3]. A reconstruction algorithm based on local regularization was developed and applied in the x-ray tomographic systems CPT-1000 and CPT-5000 [4]. In both numerical simulations and clinical practice, it was shown that local regularization produces images that are deemed of higher quality in comparison with global regularization. In [2,3], a priori information was introduced in deterministic form.
In 1967, the physicist V. Turchin suggested using the Bayesian Maximum a Posteriori (MAP) method for solving inverse ill-posed problems with stochastic data, naming this approach 'statistical regularization' [5,6]. The statistical regularization method introduces a priori information in the form of an a priori probability distribution.

In emission tomography, the unknown source function f = {f_j, j = 1 . . . J} is proportional to the radiopharmaceutical concentration n_j in voxel j:

f_j ∝ n_j. (1)

The photons that are emitted by the radiopharmaceutical are collected by a system of detectors located around the patient. The measurements are denoted by the random vector g = {g_i, i = 1 . . . I}. The data acquisition process is modeled by the system of linear equations:

∑_j a_ij f_j = g_i, (2)

where g_i are the measured data in the i-th detector cell and a_ij is the random operator that describes which fraction of the photons emitted from voxel j is detected in detector pixel i. In Figure 1, an example of measured clinical data is demonstrated: a left anterior oblique projection obtained in myocardial perfusion SPECT imaging. One can see that these data have stochastic properties. In PET and SPECT, the Poisson distribution gives the likelihood function:

P(g|f) = ∏_i (ḡ_i^{g_i} / g_i!) exp(−ḡ_i), (3)

with means:

ḡ_i = ∑_j a_ij f_j. (4)
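Taking the logarithm of the Poisson likelihood with means ḡ_i = ∑_j a_ij f_j makes the connection to the log-likelihood used throughout the paper explicit; a short derivation:

```latex
\ln P(\mathbf{g}\mid\mathbf{f})
  = \sum_i \Bigl[\, g_i \ln \bar{g}_i - \bar{g}_i - \ln g_i! \,\Bigr],
\qquad \bar{g}_i = \sum_j a_{ij} f_j .
```

The term ln g_i! does not depend on f, so, up to a constant, ln P(g|f) = ∑_i [g_i ln(∑_j a_ij f_j) − ∑_j a_ij f_j], which is the expression maximized by the ML and MAP methods below.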

The Bayesian Approach Maximum a Posteriori (MAP)
The matrix element a_ij stands for the probability that a photon emitted from voxel j will be detected in detector pixel i. Given the stochastic data g, the reconstruction problem is formulated as a classical problem of mathematical statistics: what is the probability density P(f|g) of the solution f given g? The probability P(f|g) is determined according to the Bayesian method Maximum a Posteriori (MAP):

P(f|g) = P(f) P(g|f) / P(g), (5)

where P(f) is the a priori probability density function and P(g|f) is the likelihood distribution of the observed data. MAP estimation (in logarithmic form) provides the solution of the reconstruction problem:

f̂ = arg max_{f>0} [ln P(f) + ln P(g|f)]. (6)

The main difficulty of the Bayesian approach is the determination of the a priori probability P(f). There are two different classes of imaging problems: restoration of distorted images and tomographic reconstruction of objects. When solving image restoration problems, it is usually impossible to include a priori information about real objects; the a priori information refers to the expected image to be restored. In contrast, when reconstructing tomographic images, one can often propose a physical model of the object to be reconstructed and determine a priori information by using this model. Currently, in tomographic problems, one uses a priori information that was initially developed to solve image restoration problems. In this paper, we discuss these approaches to specifying the a priori probability in both the image restoration problem and the image reconstruction problem.
We consider the three most widely used forms of a priori information: no prior, Gibbs prior, and entropy-based prior.

Maximum-Likelihood-Based Image Reconstruction Method (ML)
When a researcher does not have any prior information, the simple way is to assume that all possibilities are equally probable; Bayesian estimation (6) then reduces to the Maximum Likelihood (ML) solution:

f̂ = arg max_f ln P(g|f). (7)

The log-likelihood function is written:

ln P(g|f) = ∑_i [g_i ln(∑_j a_ij f_j) − ∑_j a_ij f_j] + const. (8)

In (8), and further in the formulas, a_ij denotes the matrix element normalized over the detector bins. The ML method gives the following resulting solution (ML algorithm) [22]:

f_j^{n+1} = (f_j^n / ∑_i a_ij) ∑_i a_ij g_i / (∑_k a_ik f_k^n), (9)

where f_j^n is the estimate of f_j at the n-th iteration step. As stated in the Introduction, a fast version of the ML algorithm (9), known as OSEM, is applied in most existing PET and SPECT clinical systems.
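As a concrete illustration, the ML-EM iteration (9) can be sketched in a few lines of NumPy. This is a minimal toy, not the clinical implementation: the system matrix A, data g, and iteration count below are hypothetical placeholders.

```python
import numpy as np

def mlem(A, g, n_iter=50):
    """ML-EM iteration (9): f_j <- (f_j / s_j) * sum_i a_ij * g_i / (A f)_i,
    with sensitivity s_j = sum_i a_ij and a uniform positive start."""
    f = np.ones(A.shape[1])
    s = A.sum(axis=0)                     # s_j = sum_i a_ij
    for _ in range(n_iter):
        proj = A @ f                      # forward projection (A f)_i
        ratio = g / np.maximum(proj, 1e-12)
        f = f / np.maximum(s, 1e-12) * (A.T @ ratio)
    return f

# Toy 2-voxel / 3-measurement example (all values hypothetical):
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 0.5]])
f_true = np.array([4.0, 2.0])
g = A @ f_true                            # noiseless data
f_rec = mlem(A, g, n_iter=200)
```

For noiseless, consistent data and a full-rank A, the iterates converge to the true activity; with Poisson noise the iterates eventually deteriorate, which is the instability discussed in the Results section.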

Gibbs Prior Based on Image Properties
Initially, the prior probability in the form of the Gibbs distribution was developed for the restoration of degraded images [18]. The image was considered as pixel/voxel gray levels in a lattice-like physical system. It was assumed that the Markov Random Field (MRF) model reflects the expectations of smoothness and captures the distributions of 'units of grey'. According to the Hammersley-Clifford theorem, the MRF has a distribution described by the Gibbs probability [23]:

P(f) = (1/Z) exp(−βU(f)), (10)

where U(f) is called the energy function, Z is a normalizing constant, and β is a positive constant. The energy function U(f) is designed so that the expected image configurations are those for which neighboring pixels have similar intensities; still, sharp gradients can occur. This local constraint is written as:

U(f) = ∑_j ∑_{k∈c_j} w_jk V(f_j − f_k), (11)

where c_j denotes the set of neighbors of pixel j, V is a potential function defined on pairwise cliques of neighboring pixels, and w_jk is a weight factor specifying the closeness of pixels j and k. The prior probability (10) in logarithmic form is written as:

ln P(f) = −βU(f) − ln Z. (12)

In [8], this form of prior probability was adopted to solve the problem of image reconstruction in SPECT with the following argumentation: (1) the isotope concentration is fairly constant within small regions of common tissue type, so neighboring pixels are more likely than not to have similar isotope concentrations; (2) there may be a sharp gradient in the intensity values of neighboring pixels: sharp boundaries occur between two tissue types or between two organs. Taking into account the log-likelihood function (8) and the prior probability (12), the Bayesian estimation (6) is written as:

f̂ = arg max_{f>0} [−βU(f) + ln P(g|f)]. (13)

The MAP method with the Gibbs a priori distribution (MAP-GIBBS) gives the resulting solution [24]:

f_j^{n+1} = f_j^n / (∑_i a_ij + β ∂U/∂f_j |_{f=f^n}) ∑_i a_ij g_i / (∑_k a_ik f_k^n). (14)

This algorithm is widely studied in the literature for solving PET and SPECT reconstruction problems. A variety of energy functions U(f) have been proposed in the literature. They differ in the choice of the potential function V that assigns a cost to differences between neighboring pixels/voxels. As noted above, on the one hand, piecewise smoothness is preferred; on the other hand, the a priori information should be tolerant of large differences between neighboring voxels. For example, edge-preserving functions V were developed in [24,25]. The parameter δ in (15) and (16) controls the difference between f_j and f_k in neighboring voxels. The following comments can be added when analyzing the MRF-based Gibbs a priori probability (12). The MRF-based prior model was developed for image restoration, segmentation, and texture modeling and, formally, it reflects image properties rather than the properties of a real physical object. Expressions (15) and (16) look like mathematical constructs, but the physical nature of V is not very clear. The parameter β is a constant, and it leads to global regularization in the MAP-GIBBS algorithm (14).
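A sketch of the MAP-GIBBS iteration (14) in its one-step-late form follows. The quadratic potential V(t) = t² on a 1D neighbor chain with unit weights w_jk is an illustrative stand-in for the edge-preserving potentials (15) and (16), which are not reproduced above; A, g, and β are hypothetical.

```python
import numpy as np

def map_gibbs_osl(A, g, beta, n_iter=100):
    """One-step-late MAP-GIBBS iteration (14):
    f_j <- f_j / (s_j + beta * dU/df_j) * sum_i a_ij g_i / (A f)_i,
    with U(f) = sum over 1D neighbor pairs of (f_j - f_k)^2 (illustrative V)."""
    f = np.ones(A.shape[1])
    s = A.sum(axis=0)
    for _ in range(n_iter):
        dU = np.zeros_like(f)             # dU/df_j = 2 * sum_{k in c_j} (f_j - f_k)
        dU[:-1] += 2.0 * (f[:-1] - f[1:])
        dU[1:]  += 2.0 * (f[1:] - f[:-1])
        proj = A @ f
        ratio = g / np.maximum(proj, 1e-12)
        f = f / np.maximum(s + beta * dU, 1e-12) * (A.T @ ratio)
    return f
```

With β = 0 this reduces to the ML algorithm (9); increasing β trades data fidelity for smoothness, which is exactly the border blurring of 'hot spots' observed in the simulations. Note that the one-step-late scheme can become unstable if β times the prior gradient is large.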

Gibbs a Priori Probability Based on a Closed System Model
In Section 2.3.1, we discussed the Gibbs prior probability that reflects the properties of an expected image. In the present section, the Gibbs prior probability is related to the object to be reconstructed. In nuclear medicine, the real object to be reconstructed is a steady-state (during the data acquisition time) spatial distribution of radiopharmaceutical particles in the patient's body. The particles do not interact with each other due to the low radionuclide concentration. We consider this distribution as a closed physical system placed in a thermostat (the patient's body) with a certain temperature T. It is assumed that some force field causes the non-uniform spatial particle distribution. We do not consider the specific fields and forces that hold the particles. Under the influence of the field, the different parts of the system are in different conditions, so the system can be spatially non-uniform and, at the same time, in equilibrium [26]. The canonical Gibbs distribution describes such a system:

P(n) = (1/Z) exp(−H(n)/kT), (17)

where H(n) is the Hamiltonian of the system of N particles. For simplicity, we assume that the volume of each voxel is unity and the concentration n_j determines the number of particles in voxel j.
The Hamiltonian is written for non-interacting particles:

H = ∑_α [p_α²/2m + U(q_α)], (18)

where q and p are the coordinates and momenta of the particles. According to the Gibbs theorem, the average energy is a constant value. For neighboring voxels k and j, this can be written as:

µ_j + U_j = µ_k + U_k, (19)

where µ is the chemical potential per particle (without the field) and U is the potential energy of the particles in the field. Taking into account that in the thermostat T = const, one obtains from (19):

dµ = −dU. (20)

On the other side, in a thermodynamically equilibrium system with T = const,

dµ = v dp, (21)

where dp is the pressure difference between neighboring voxels and v = 1/n_j is the specific volume of one particle [26]. The potential U is defined from (20) and (21) as a function of the pressure gradient.
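The step from (20) to (21) rests on the per-particle Gibbs-Duhem relation at constant temperature; a one-line sketch (assuming the standard form with entropy per particle s̄):

```latex
d\mu = -\bar{s}\,dT + v\,dp,
\qquad dT = 0 \;\Longrightarrow\; d\mu = v\,dp .
```

Combining this with (20) gives dU = −v dp, i.e., the field potential is determined by the pressure gradient between neighboring voxels.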
In the case of non-interacting particles, U can be represented through the particle concentration gradient in a discrete form, where w_jk is a weight factor specifying the closeness of pixels j and k and c_j denotes the set of neighbors of pixel j. Substituting (18) into (17) and integrating the Gibbs distribution (17) over momenta, one obtains (23), where K is a positive constant resulting from multiplying 1/Z by the constant of integration over momenta. The probability density (23) can be written in logarithmic form as (24). Taking (1) into account, the prior probability (24) can be rewritten (up to a constant) as (25). When comparing expressions (12) and (25), one can see that they are the same. This approach, which is based on the statistical physics model, allows us to address the comments at the end of the previous Section 2.3.1. Different forms of the potential function (15), (16) correspond to different state equations of the system. The parameter β must be constant throughout the solution area. Therefore, the Gibbs a priori probability necessarily leads to global regularization in the MAP-GIBBS algorithm.

Bayesian Image Reconstruction Method MAP-ENT
Another a priori model, based on the entropy principle, has been successfully applied in the fields of X-ray, radio-, and gamma-astronomy and plasma tomography [11-15], but less so in medical tomography. Applied to the problem of astronomical image restoration, the Maximum Entropy approach has demonstrated that resolved and unresolved sources can be restored reliably [27,28]. These results are of interest for nuclear medicine in the context of the similarity between astronomical images and 'hot spot' tumor images. In the absence of correlations, the entropy-based method is superior to the Gibbs approach in identifying 'hot spots'. Therefore, in the present paper, this method is considered from the point of view of its potential in nuclear medicine.

Entropy a Priori Probability Based on Image Properties
The ME approach based on image properties can be formulated as follows. An image is considered as a set of boxes in which a large number of 'particles' are distributed. Different authors give different senses to the term 'particles': 'radiance units', luminance quanta, units of gray, silver particles. According to the combinatorial theorem, the number of different ways of filling the boxes is given by:

W = N_0! / ∏_j N_j!, (26)

where j is the box index, N_j is the number of 'radiance units' in the j-th box, and N_0 is the total number of 'radiance units'. Using the Stirling approximation, one can write W in logarithmic form as follows:

ln W ≈ −N_0 ∑_j (N_j/N_0) ln(N_j/N_0). (27)

Boltzmann found the connection between the entropy S and the multiplicity W:

S = k ln W, (28)

where k is a coefficient depending on the chosen dimension. The distribution with higher entropy can be realized in the greatest number of ways and, hence, yields the most probable solution consistent with the given data. Therefore, the entropy can be used as the prior probability density in the Bayesian Maximum a Posteriori approach:

ln P(N_j | N_0) = ln W. (29)

One usually uses the form (30) for the entropy-based prior, where f̄_j is the expected value of the unknown source function and β is a constant associated with the relation between N_j/N_0 and f_j, which controls the competition between the entropy-based prior information and the measured data. In [12], the entropy-based prior was developed in the form (31), where m_j is the default model. In our next work, we are going to study and compare the forms (30) and (31) in application to nuclear medicine. In the present work, the entropy form (30) was used. In nuclear medicine, the Bayesian approach with an entropy prior was developed for image reconstruction in SPECT [16]. Substituting the entropy prior (30) and the likelihood (8) into (6) gives the Maximum a Posteriori estimation with entropy-based prior (MAP-ENT). The iteration relation (33) for determining f̂ was obtained in [16], where n is the iteration number and the sum over i is taken over the set of observations. The iterative solution (33) depends on the regularization parameter γ = 1/β.
In our calculations, the multiplicative form (34) of the MAP-ENT algorithm with f_j^0 = 1/e was used. Analyzing the entropy prior (30), we can add the following comments: (a) the derivation of Formulas (26)-(30) was based on the properties of an image; (b) the role of the parameter γ = 1/β in (33) is not very clear, and it is also not clear whether it can be changed locally.
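The exact iterations (33)/(34) from [16] are not reproduced above; as an illustrative sketch, a one-step-late variant with the simple entropy penalty β ∑_j (f_j ln f_j − f_j), whose gradient is β ln f_j, shows the qualitative behavior. All inputs below are hypothetical, and this assumed penalty stands in for the paper's form (30).

```python
import numpy as np

def map_ent_osl(A, g, beta, n_iter=100):
    """One-step-late MAP iteration with an entropy penalty.
    Assumed prior energy: beta * sum_j (f_j ln f_j - f_j)  =>  gradient beta*ln f_j,
    an illustrative stand-in for the exact MAP-ENT iteration (33)/(34)."""
    f = np.ones(A.shape[1])
    s = A.sum(axis=0)
    for _ in range(n_iter):
        proj = A @ f
        ratio = g / np.maximum(proj, 1e-12)
        grad = beta * np.log(np.maximum(f, 1e-12))   # entropy-penalty gradient
        f = f / np.maximum(s + grad, 1e-12) * (A.T @ ratio)
    return f
```

Unlike the Gibbs prior, the penalty acts on each voxel separately (no neighbor coupling), which is why entropy-based reconstruction does not blur 'hot spot' borders; it only shrinks high activities toward the background, controlled by β.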

Entropy Prior Based on the Boltzmann Isolated System Model
The spatially non-uniform distribution of non-interacting radiopharmaceutical particles in the patient's body can be described using the model of the non-equilibrium Boltzmann ideal gas. In contrast to the prior model (30), here we consider the distribution of particles in phase space. Up to a constant factor, the Boltzmann entropy is written as:

S = −∫ n(p, q) ln n(p, q) dp dq, (35)

where n(p, q) is the particle density in phase space. Replacing the variables,

dp dq = dq 2π(2m)^{3/2} ε^{1/2} dε, (36)

where ε is the particle energy, the entropy functional (35) can be rewritten as (37), in which the integration over the spatial coordinate dq is replaced by summation over discrete voxels q_j. We assume that the voxel size is so small that an equilibrium energy distribution is established in each voxel q_j and the particle energy distribution function ρ(ε) has a very sharp maximum at ε = ε̄(q_j), where ε̄ is the mean energy. According to the Boltzmann theory for isolated systems, the mean energy value should be the same in the entire system: ε̄(q_j) = ε_0. The entropy functional can then be rewritten as (38), where the 'width' ∆ε is defined from the normalization ∫ρ(ε)dε = ρ(ε_0)∆ε = 1. As a result, the entropy functional (38) is presented in its standard form (39), where n(q_j) is the radionuclide particle concentration in the voxel with coordinate q_j. Taking (1) into account, (39) is written as (40). When comparing expressions (31) and (40), one can see that they are the same. The regularization parameter β depends on the particle mean energy; β is constant throughout the solution area.

Entropy Prior Based on Open System Model
Open systems are systems that exchange energy, matter, and information with the environment. Therefore, emitting objects are open systems. In SPECT and PET diagnostic procedures, a radiopharmaceutical is injected into the patient's body. Different organs have different metabolism rates for the injected radiopharmaceutical. In the same organ, healthy and diseased tissues can also have different metabolism rates. The accumulative model is considered, which describes a final steady-state non-equilibrium spatial distribution of the radiopharmaceutical in the patient's body. Due to radioactive decay, the radiopharmaceutical emits gamma photons, so the patient's body can be considered as an emitting open system. It is assumed that the processes of nuclear excitation and de-excitation occur at the same rate during the whole time of the patient examination procedure. In an open system, there can be regions in which the mean energy values are different. Since these regions are independent, the total entropy is equal to the sum of the entropies of the subsystems. For the case of two selected regions C1 and C2 with mean energies ε_0 and ε_1, expression (38) changes accordingly (41). In the general case, the entropy-based prior can be written in the form (42), where β_k is a local regularization parameter. Substituting the entropy prior (42) and the likelihood (8) into (6) gives the Maximum a Posteriori estimation with entropy-based prior and local regularization (MAP-ENT-LOC).
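A minimal sketch of how a per-voxel parameter β_j in the sense of (42) can enter an entropy-prior iteration. The simple penalty β_j (f_j ln f_j − f_j), the threshold rule, and all parameter values are illustrative assumptions, not the paper's exact MAP-ENT-LOC formula; the threshold mimics the scheme in which voxels exceeding the healthy-tissue activity level receive their own, weaker parameter.

```python
import numpy as np

def map_ent_loc(A, g, beta_bg, beta_hot, thresh, n_iter=100):
    """Entropy-penalty OSL iteration with a local parameter map (cf. (42)):
    voxels whose current estimate exceeds `thresh` (assumed healthy-tissue
    activity level) switch to the weaker beta_hot, so 'hot spots' are
    regularized less than the background."""
    f = np.ones(A.shape[1])
    s = A.sum(axis=0)
    for _ in range(n_iter):
        beta = np.where(f > thresh, beta_hot, beta_bg)   # local parameter map
        proj = A @ f
        ratio = g / np.maximum(proj, 1e-12)
        grad = beta * np.log(np.maximum(f, 1e-12))
        f = f / np.maximum(s + grad, 1e-12) * (A.T @ ratio)
    return f
```

With beta_hot < beta_bg, the hot-spot activity is shrunk less than under a single global β, which is the mechanism behind the improved 'hot spot' recovery reported in the simulations.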

Numerical Simulations
Numerical experiments were performed to study the local regularization method and to compare it with the global regularization method. Figure 2a shows a two-dimensional (2D) clinical image of a torso axial cross section. One can see the liver and stomach in the image. A liver lesion is shown as the bright area ('hot spot') in the upper left part of the image. The image was obtained in SPECT with 111In-octreotide at the Blokhin Russian Cancer Research Center. The appropriate 2D mathematical model was developed, as shown in Figure 2b. The tumor size in the model is smaller than the true tumor. The tumor activity (concentration of the radiopharmaceutical) was modeled as 1.6 times higher than in the surrounding healthy liver tissue. In the numerical simulations, the 2D model was sampled on a 128 × 128 grid. Projection data were generated for 64 angular views. The data acquisition model included the effects of non-uniform attenuation, collimator-detector response, and Poisson statistics [21]. Reconstruction of images from the simulated data was performed using the following four algorithms: • OSEM, the standard non-regularized algorithm (9), applied in most SPECT and PET systems; • MAP-GIBBS, the MAP algorithm with the Gibbs prior and global regularization (14); • MAP-ENT, the MAP algorithm with the entropy prior and global regularization; • MAP-ENT-LOC, the new MAP algorithm with the entropy prior and local regularization.
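The Poisson part of the data model can be illustrated with a tiny NumPy sketch. This toy uses only two orthogonal parallel-beam views of a hypothetical 4 × 4 phantom and omits the attenuation and collimator-detector response included in the actual simulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate_two_view_data(phantom):
    """Minimal sketch of the data model (2): noiseless line integrals
    (sum_j a_ij f_j) for two orthogonal views, then Poisson sampling
    g_i ~ Poisson(mean). Attenuation and collimator blur are omitted."""
    g_bar = np.concatenate([phantom.sum(axis=0),    # vertical line integrals
                            phantom.sum(axis=1)])   # horizontal line integrals
    return rng.poisson(g_bar).astype(float)

# Hypothetical phantom with one 'hot' pixel at 1.6x the background activity:
phantom = np.full((4, 4), 10.0)
phantom[1, 1] = 16.0
g = simulate_two_view_data(phantom)
```

Even this toy shows the key property of emission data: the noise variance equals the mean, so low-count projection bins are relatively noisier, which drives the instability of unregularized reconstruction discussed below.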

In the numerical experiments, the behavior of these algorithms was studied. It should be noted that local regularization is a time-consuming problem. In our simulation, the general reconstruction was performed using MAP-ENT with global regularization, and only for the selected organ (the liver) was MAP-ENT with local regularization applied. The prior model was used in the form (42) with two regularization parameters: one for the healthy liver tissue and a second for the tumor. As it is assumed that we 'do not know the location of the tumor in advance', the second parameter was included in the reconstruction algorithm at some iteration, provided that the activity in the voxel had exceeded the value of normal activity in healthy tissue. Here, we assume that the algorithm 'knows or can estimate' the level of activity in the healthy tissue.

Results and Discussions
The OSEM algorithm with eight subsets was applied in the first simulation study. The progress of the images reconstructed after the 1st, 2nd, and 5th iterations is shown in Figure 3a-c. The profiles of activity along the line shown by the arrow in Figure 2b were calculated and are demonstrated in Figure 3d. OSEM is not a regularized algorithm; therefore, its behavior in the iterative process is unstable and noisy artifacts dominate the images. Noise progresses more rapidly in the zones with lower statistics. In Figure 3a, one can see that after the first iteration, the image shows the presence of a tumor, but its activity, as seen in Figure 3d, is underestimated and the borders are blurred. Noise peaks have already appeared in the image at the second iteration. Therefore, the iterative process should have been stopped after the first iteration; however, image deterioration progresses more rapidly in the zones with lower count statistics. At a stopping point, a part of the image with low count statistics may already have become noisy while another part of the image with high statistics has not yet reached its optimal resolution. This problem is especially important in oncology, because the noise in the low count zones can imitate false 'hot spots'. The MAP-GIBBS algorithm was used in the second simulation study. The appropriate value of the global regularization parameter β was determined empirically. Figure 4a-c show the images reconstructed after the 10th, 15th, and 20th iterations. Figure 4d shows the corresponding profiles of activity. Simulations were performed with different potential functions; in the presented pictures, the potential (15) was used. Rather minor differences were obtained between the images reconstructed by MAP-GIBBS with the potential functions (15) and (16). Analyzing the results presented in Figure 4, one can see that the MAP-GIBBS algorithm blurs the borders of the tumor.
The underlying reason is that the MAP-GIBBS algorithm produces images that support local spatial correlations in the source function to be reconstructed.
The MAP-ENT algorithm was applied in the third simulation study. Figure 5a-c show the images reconstructed after the 10th, 15th, and 20th iterations. Figure 5d shows the corresponding profiles of activity. As expected, in the absence of correlations, the MAP-ENT algorithm is superior to MAP-GIBBS in identifying 'hot spots'. The MAP-ENT-LOC algorithm was used in the fourth simulation study. Figure 6a-c show the images reconstructed after the 10th, 15th, and 20th iterations. Figure 6d shows the corresponding profiles of activity. It is seen that local regularization leads to improved sensitivity in the tumor image, in both the activity value and border identification.

Conclusions
In the Introduction, we suggested the hypothesis that the disadvantages of images reconstructed by MAP algorithms are associated with global regularization. A new algorithm, MAP-ENT-LOC, with local regularization was developed. The suggested hypothesis was checked in computer simulations close to a real clinical case. Four algorithms were compared: OSEM, MAP-GIBBS, MAP-ENT, and MAP-ENT-LOC. In these simulations, the hot spot is the region of interest; image quality was evaluated using the profiles, with the reconstructed tumor boundary and tumor activity visually evaluated against the exact profile. The comparative analysis has shown that local regularization leads to improved sensitivity in the tumor image in comparison with global regularization, in both the activity value and border identification. The behavior of the MAP algorithms in the iterative process depends on the value of the regularization parameter, which was chosen empirically. Further studies will be performed to derive a statistical criterion for choosing the optimal regularization parameter and stopping the iterative process. Joint clinical and simulation studies are necessary for the further development of this new approach.
