# Algorithms for Non-Negatively Constrained Maximum Penalized Likelihood Reconstruction in Tomographic Imaging

## Abstract


## 1. Introduction

**x** and **y** can be different for different imaging modalities. Vectors **y** and **x** are related through a system matrix A; see Equation (4) below for some examples. For tomographic reconstruction problems, matrix A is usually assumed known, so its estimation is not covered in this paper. Rather, we focus on how to estimate **x** from the observed **y** and the known system matrix A. We denote the estimate of **x** by $\widehat{\mathit{x}}$.

**x** is a vector formed by the ${\mathit{x}}_{m}$’s, and ${b}_{im}$ is the blank scan count from energy spectrum m.

**x**), it may no longer be concave for transmission scans but is still concave for emission scans. Concavity is an important property exploited by the optimization transfer algorithms.

**y** and $\mathbf{\mu}$. Different probability distributions have been used to model ${y}_{i}$, even under the same imaging modality. For example, for emission tomography, if the Poisson model is assumed for ${y}_{i}$ (i.e., ${y}_{i}\sim \text{Poisson}\left({\mu}_{i}\right)$), then ${l}_{i}$ is given by Equation (6); if instead weighted least squares is considered, then

${\mathit{x}}^{\left(k\right)}$ denotes the estimate of **x** obtained at iteration k of an algorithm. The notation $\nabla b(\cdot)$ indicates the derivative of function b with respect to the variable in the brackets. For example, $\nabla b\left({A}_{i}\mathit{x}\right)$ represents the derivative of b with respect to ${A}_{i}\mathit{x}$, and $\nabla b(\mathit{x};{\mathit{x}}^{\left(k\right)})$ the derivative of b with respect to **x**. We use ${\nabla}_{j}b\left(\mathit{x}\right)$ to denote the derivative of b with respect to ${\mathit{x}}_{j}$, the j-th element of vector **x**. We also let $\nabla b\left({\mathit{x}}^{\left(k\right)}\right)$ and ${\nabla}_{j}b\left({\mathit{x}}^{\left(k\right)}\right)$ represent, respectively, $\nabla b\left(\mathit{x}\right)$ and ${\nabla}_{j}b\left(\mathit{x}\right)$ evaluated at $\mathit{x}={\mathit{x}}^{\left(k\right)}$.

For simultaneous algorithms, all elements of **y** are used to update **x** in each iteration, while for block-iterative algorithms, distinct portions of **y** are used in turn to update **x**. In this paper we discuss some simultaneous algorithms for non-negatively constrained MPL reconstruction; block-iterative algorithms are not included in our discussions. The rest of this paper is arranged as follows. The expectation-maximization (EM) algorithm for emission tomography is discussed in Section 2. Section 3 explains the alternating minimization algorithm designed specifically for transmission tomography. Section 4 explains the optimization transfer algorithms and their applications to tomographic reconstruction. The multiplicative iterative (MI) algorithms for tomographic imaging are provided in Section 5, and the Fisher-scoring-based Jacobi or Gauss–Seidel over-relaxation algorithms are presented in Section 6. Section 7 describes another Gauss–Seidel method, the iterative coordinate ascent algorithm. Finally, Section 8 contains concluding discussions and remarks.

## 2. EM Algorithm for Maximum Likelihood Reconstruction in Emission Tomography

**E-Step**: Compute the conditional expectation of the complete-data log-likelihood given the incomplete data and ${\mathit{x}}^{\left(k\right)}$, and denote this function by
$$Q(\mathit{x};{\mathit{x}}^{\left(k\right)})=E({l}_{\mathcal{C}}\left(\mathit{x}\right)\mid \mathcal{Y},{\mathit{x}}^{\left(k\right)})$$

**M-Step**: Update the **x** estimate by maximizing the Q function, namely
$${\mathit{x}}^{(k+1)}=\arg\underset{\mathit{x}}{\max}\,Q(\mathit{x};{\mathit{x}}^{\left(k\right)})$$

The EM algorithm has the following appealing properties for estimating **x**:

- If the initial ${\mathit{x}}^{\left(0\right)}\ge 0$ then ${\mathit{x}}^{\left(k\right)}\ge 0$ for all $k\ge 1$; i.e., the iteration automatically satisfies the non-negativity constraint on **x**.
- The algorithm is easy to implement as it only involves forward- and back-projections.
- The updating formula in Equation (15) increases the incomplete-data log-likelihood: $l\left({\mathit{x}}^{(k+1)}\right)\ge l\left({\mathit{x}}^{\left(k\right)}\right)$, where equality holds only when the iteration has converged.
- ${\mathit{x}}^{\left(k\right)}$ satisfies ${\sum}_{i}{\mu}_{i}^{\left(k\right)}={\sum}_{i}{y}_{i}$, where ${\mu}_{i}^{\left(k\right)}$ is ${\mu}_{i}$ with $\mathit{x}={\mathit{x}}^{\left(k\right)}$. Thus the **x** estimate at any iteration equates the total expected and the total observed counts.
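These properties are easy to verify numerically. Below is a minimal NumPy sketch of the classical EM (MLEM) update for Poisson emission data; the function and variable names (`em_update`, `A`, `y`) are our own, and the dense matrix stands in for a real forward projector.

```python
import numpy as np

def em_update(x, A, y, eps=1e-12):
    """One EM (MLEM) iteration for Poisson emission data.

    x : current non-negative image estimate, shape (p,)
    A : system matrix with non-negative entries, shape (n, p)
    y : observed counts, shape (n,)
    """
    mu = A @ x                               # forward-projection: expected counts
    ratio = A.T @ (y / np.maximum(mu, eps))  # back-projection of y_i / mu_i
    sens = A.sum(axis=0)                     # sensitivity image: sum_i a_ij
    return x * ratio / np.maximum(sens, eps)
```

One forward- and one back-projection per iteration; a non-negative start stays non-negative, and after one step the total expected counts $\sum_i \mu_i$ equal the total observed counts $\sum_i y_i$.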

The one-step-late approach replaces **x** in the derivative of the penalty function by its current estimate ${\mathit{x}}^{\left(k\right)}$, so that an “exact” closed-form solution can still be obtained. But this method suffers from two deficiencies: (i) the algorithm may be non-convergent; and (ii) some estimates may be negative.
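This idea can be sketched as follows; the generic penalty-gradient callback and all names are our own choices, not the paper's.

```python
import numpy as np

def osl_em_update(x, A, y, beta, grad_penalty, eps=1e-12):
    """One-step-late penalized EM step: the penalty derivative is evaluated
    at the current iterate x^(k) instead of the unknown x, keeping the
    update in closed form."""
    mu = A @ x
    num = A.T @ (y / np.maximum(mu, eps))
    denom = A.sum(axis=0) + beta * grad_penalty(x)
    # For large beta the denominator can become zero or negative, which is
    # the source of the negativity/non-convergence deficiencies noted above.
    return x * num / denom
```

With, say, `grad_penalty = lambda z: z` (a quadratic penalty $\tfrac{\beta}{2}\|x\|^2$, used here purely for illustration) and small $\beta$, the iterates remain positive, but no monotonicity is guaranteed in general.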

## 3. Alternating Minimization Algorithms for Transmission Tomography

**z**, which will be defined below) is given by Equation (5). Moreover, elements of the attenuation map associated with spectrum m, namely elements of ${\mathit{x}}_{m}$ in Equation (5), are further modeled by

**z** is a vector of size $pa\times 1$ formed by column-wise stacking the vectors ${\mathit{z}}_{j}={({z}_{1j},\dots ,{z}_{aj})}^{T}$.

Let **p** and **q** be the vectors created from ${p}_{im}$ and ${q}_{im}$, respectively. It can be shown that the problem of maximizing the log-likelihood in Equation (16) can be re-written as

- (i) compute ${\mathit{p}}^{(k+1)}$ by minimizing $I(\mathit{p}\parallel {\mathit{q}}^{\left(k\right)})$ subject to $\mathit{p}\in \mathcal{L}$;
- (ii) compute ${\mathit{q}}^{(k+1)}$ by minimizing $I({\mathit{p}}^{(k+1)}\parallel \mathit{q})$ subject to $\mathit{q}\in \mathcal{E}$.
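The two-step scheme above can be written as a generic double-minimization loop in the style of Csiszár and Tusnády [33]. The projections used in the illustration below (a fixed-total family for $\mathcal{L}$ and a single-ray cone for $\mathcal{E}$, both with closed-form minimizers) are toy stand-ins, not the tomography-specific constraint sets.

```python
import numpy as np

def i_divergence(p, q):
    """Csiszar I-divergence I(p || q) = sum_i [p_i log(p_i/q_i) - p_i + q_i]."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])) - p.sum() + q.sum())

def alternating_minimization(q0, min_over_L, min_over_E, iters=10):
    """Generic Csiszar-Tusnady loop: alternate exact minimizations of
    I(p || q) over p in L (step i) and over q in E (step ii)."""
    q, history = q0, []
    for _ in range(iters):
        p = min_over_L(q)   # step (i)
        q = min_over_E(p)   # step (ii)
        history.append(i_divergence(p, q))
    return p, q, history
```

For example, with $\mathcal{L}=\{p:\sum_i p_i=c\}$ the minimizer is $p = q\,c/\sum_i q_i$, and with $\mathcal{E}=\{t\,v : t\ge 0\}$ it is $q = v\sum_i p_i/\sum_i v_i$; each full cycle can never increase $I(p\parallel q)$.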

**Remarks**

- (1) This algorithm is designed for maximum likelihood estimation. However, it can easily be extended to MPL, provided the penalty function is convex so that it, too, can be decoupled.
- (2) This algorithm is developed for the likelihood function derived from the simple Poisson measurement noise. Note that the alternating minimization algorithm was also developed for a compound Poisson noise model in [36], and its comparison with the simple Poisson alternating minimization was provided in [37]. For other measurement distributions, however, the corresponding algorithms have to be completely re-derived.
- (3) The convergence properties of the alternating minimization algorithm have been studied in [34]. In particular, it is monotonically convergent under certain conditions.
- (4) It will become clear in Section 5 (Example 5.3) that the multiplicative iterative algorithm can be derived more easily for this transmission reconstruction problem.
- (5) The trick of decoupling the objective function using its convex (or concave) property is also the key technique of the optimization transfer algorithms discussed in Section 4.

## 4. Optimization Transfer Algorithms

- (i) $\Psi \left({\mathit{x}}^{\left(k\right)}\right)=\Phi ({\mathit{x}}^{\left(k\right)};{\mathit{x}}^{\left(k\right)})$, and
- (ii) $\Psi \left(\mathit{x}\right)\ge \Phi (\mathit{x};{\mathit{x}}^{\left(k\right)})$ for all **x**.
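A minimal one-dimensional illustration of conditions (i)–(ii) uses the quadratic minorizer of Böhning and Lindsay [40]: if $|\Psi''|\le c$ everywhere, then $\Phi(x;x^{(k)})=\Psi(x^{(k)})+\Psi'(x^{(k)})(x-x^{(k)})-\tfrac{c}{2}(x-x^{(k)})^2$ satisfies both conditions, and maximizing it gives $x^{(k+1)}=x^{(k)}+\Psi'(x^{(k)})/c$. The sketch below (our own naming) implements exactly that update.

```python
import numpy as np

def ot_maximize(grad, curv_bound, x0, iters=300):
    """Optimization transfer with a quadratic minorizer: each update
    x <- x + grad(x)/c maximizes the surrogate
    Phi(x; x_k) = Psi(x_k) + Psi'(x_k)(x - x_k) - (c/2)(x - x_k)^2,
    which touches Psi at x_k (condition i) and never exceeds it
    (condition ii) whenever c bounds |Psi''|."""
    x = x0
    for _ in range(iters):
        x = x + grad(x) / curv_bound
    return x
```

For example, $\Psi(x)=-(x-3)^2+\cos x$ has $|\Psi''|\le 3$, so $c=3$ gives a monotone iteration converging to a stationary point of $\Psi$.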

At iteration $k+1$, **x** is estimated by maximizing $\Phi (\mathit{x};{\mathit{x}}^{\left(k\right)})$, i.e., ${\mathit{x}}^{(k+1)}=\arg\underset{\mathit{x}}{\max}\,\Phi (\mathit{x};{\mathit{x}}^{\left(k\right)})$.

**x** is a p-vector. For matrix A, we assume its elements ${a}_{ij}$ are non-negative and ${\sum}_{j}{a}_{ij}\ne 0$. We also assume that all ${g}_{i}(\cdot)$ are concave functions. Let ${\pi}_{ij}\ge 0$ be weights satisfying ${\sum}_{j=1}^{p}{\pi}_{ij}=1$. Then, according to the concave inequality, we have

**x**, the surrogate function corresponding to Equation (28) is

Maximization with respect to **x** of $\Phi (\mathit{x};{\mathit{x}}^{\left(k\right)})$ can then be achieved by a sequence of 1-D optimizations. Another trick, due to De Pierro [32], uses the following concave inequality:

This parabola surrogate is not separable in **x**, and therefore its maximization with respect to **x** cannot be reduced to a series of 1-D problems. To overcome this, we can find another function that surrogates the above parabola surrogate but is separable in **x**. Towards this, we denote the right-hand-side quadratic function of Equation (32) by ${q}_{i}^{\left(k\right)}\left({A}_{i}\mathit{x}\right)$. Since ${q}_{i}^{\left(k\right)}$ is concave in ${A}_{i}\mathit{x}$, we can use either Equation (29) or (31) to find a surrogate to ${q}_{i}^{\left(k\right)}$; the resulting algorithm is called the separable paraboloidal surrogate (SPS) algorithm [39]. For example, corresponding to Equation (31), a separable parabola surrogate of ${q}^{\left(k\right)}$ is
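One commonly used form of the resulting SPS update, with the weights $\pi_{ij}=a_{ij}/\sum_l a_{il}$ (cf. Erdoğan and Fessler [39]), can be sketched as follows; `ray_grad` and `ray_curv`, which supply the derivative and a curvature bound of each 1-D ray function, are our own naming.

```python
import numpy as np

def sps_update(x, A, ray_grad, ray_curv):
    """One separable-paraboloidal-surrogate step.  With weights
    pi_ij = a_ij / gamma_i, gamma_i = sum_l a_il, the surrogate is separable
    in x and each coordinate maximizes a 1-D parabola:
        x_j <- max(0, x_j + grad_j / d_j),
    where d_j = sum_i a_ij * gamma_i * c_i uses curvature bounds c_i."""
    t = A @ x                        # current line integrals A_i x
    grad = A.T @ ray_grad(t)         # gradient of the objective at x
    gamma = A.sum(axis=1)            # gamma_i = sum_l a_il
    d = A.T @ (ray_curv(t) * gamma)  # separable curvatures d_j
    return np.maximum(x + grad / d, 0.0)
```

For a least-squares data fit $g_i(t)=-\tfrac{1}{2}(y_i-t)^2$, `ray_grad(t) = y - t` and `ray_curv(t) = 1`; the objective then never decreases from step to step, and the max-with-zero enforces the non-negativity constraint coordinate-wise.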

**Example 4.1 (OT for emission scans with Poisson noise).**

**Example 4.2 (OT for transmission scans with Poisson noise).**

## 5. Multiplicative Iterative Algorithms

**x** from

**x**:

**Example 5.1 (MI for emission scans with Poisson noise).**

The estimate of **x** at iteration $k+1$ is given by Equation (57). In this algorithm, there is only one back-projection (for the numerator of Equation (62)) and one forward-projection in each iteration; its computational burden is the same as that of EM.
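The basic multiplicative-iterative step has a simple generic form: split the objective gradient into two non-negative parts, $\nabla_j\Psi=[\nabla_j\Psi]^{+}-[\nabla_j\Psi]^{-}$, and multiply $x_j$ by their ratio. The sketch below shows only this basic step (our naming); the relaxation/line-search safeguard used for guaranteed monotonicity in the full MI algorithm is omitted.

```python
import numpy as np

def mi_update(x, grad_pos, grad_neg, eps=1e-12):
    """One basic multiplicative-iterative step: x_j <- x_j * g+_j / g-_j,
    where grad = grad_pos - grad_neg and grad_pos, grad_neg >= 0.
    A non-negative x stays non-negative, and interior fixed points
    (grad_pos = grad_neg) have zero gradient."""
    return x * grad_pos / np.maximum(grad_neg, eps)
```

For Poisson emission ML, $\nabla_j l=\sum_i a_{ij}(y_i/\mu_i-1)$, so `grad_pos = A.T @ (y/mu)` and `grad_neg = A.sum(axis=0)`, and the step coincides with the EM update: one forward- and one back-projection per iteration.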

**Example 5.2 (MI for randoms-precorrected PET emission scans).**

**x** first according to

**Example 5.3 (MI for polyenergetic transmission scans with Poisson noise).**

## 6. Modified Fisher’s Method of Scoring Using Jacobi or Gauss–Seidel Over-Relaxations

The **x** estimate is updated by constrained maximization of ${\Psi}^{\left(k\right)}\left(\mathit{x}\right)$, namely

The Jacobi scheme solves this problem by fixing all the **x** elements, except ${\mathit{x}}_{j}$, at their estimates from the last iteration (i.e., iteration k), whereas SOR solves it by fixing all the **x** elements, except ${\mathit{x}}_{j}$, at their most current estimates.
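The difference between the two sub-iteration schemes is easiest to see on a linear system $M x = b$, as arises within each Fisher-scoring step; a sketch with our own naming:

```python
import numpy as np

def jacobi_sweep(M, b, x):
    """Jacobi: every coordinate is solved using only the estimates
    from the previous sweep."""
    x_new = np.empty_like(x)
    for j in range(len(x)):
        x_new[j] = (b[j] - M[j] @ x + M[j, j] * x[j]) / M[j, j]
    return x_new

def sor_sweep(M, b, x, omega=1.0):
    """Gauss-Seidel / SOR: each coordinate immediately uses the most
    current estimates of the others (omega = 1 gives plain Gauss-Seidel)."""
    x = x.copy()
    for j in range(len(x)):
        gs = (b[j] - M[j] @ x + M[j, j] * x[j]) / M[j, j]
        x[j] = (1.0 - omega) * x[j] + omega * gs
    return x
```

For a diagonally dominant M both sweeps converge to the solution of $Mx=b$; Gauss–Seidel typically needs fewer sweeps because each coordinate immediately benefits from the others' updates.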

**Example 6.1 (Emission scans with Poisson noise).**

**Example 6.2 (Transmission scans with Poisson noise).**

## 7. Iterative Coordinate Ascent Algorithms

**Example 7.1 (Emission scans with Poisson noise).**

**Example 7.2 (Transmission scans with Poisson noise).**

## 8. Conclusions

## Acknowledgements

## References

1. Phelps, M.E.; Hoffman, E.J.; Mullani, N.A.; Ter-Pogossian, M.M. Application of annihilation coincidence detection to transaxial reconstruction tomography. J. Nucl. Med. **1975**, 16, 210–224.
2. Bailey, D.L.; Townsend, D.W.; Valk, P.E.; Maisey, M.N. Positron Emission Tomography: Basic Sciences; Springer-Verlag: Secaucus, NJ, USA, 2005.
3. Parra, L.; Barrett, H.H. List mode likelihood: EM algorithm and image quality estimation demonstrated on 2-D PET. IEEE Trans. Med. Imaging **1998**, 17, 228–235.
4. Barrett, J.F.; Keat, N. Artifacts in CT: Recognition and avoidance. RadioGraphics **2004**, 24, 1679–1691.
5. De Man, B.; Nuyts, J.; Dupont, P.; Marchal, G. Reduction of metal streak artifacts in X-ray computed tomography using a transmission maximum a posteriori algorithm. IEEE Trans. Nucl. Sci. **2000**, 47, 977–981.
6. Fessler, J.A. Penalized weighted least squares image reconstruction for PET. IEEE Trans. Med. Imaging **1994**, 13, 290–300.
7. Titterington, D.M. On the iterative image space reconstruction algorithm for ECT. IEEE Trans. Med. Imaging **1987**, 6, 52–56.
8. Shepp, L.A.; Vardi, Y. Maximum likelihood estimation for emission tomography. IEEE Trans. Med. Imaging **1982**, MI-1, 113–121.
9. Yavuz, M.; Fessler, J.A. Statistical image reconstruction methods for randoms-precorrected PET scans. Med. Image Anal. **1998**, 2, 369–378.
10. Whiting, B.R. Signal statistics in X-ray computed tomography. Proc. SPIE 4682, Medical Imaging 2002: Physics of Medical Imaging **2002**, 53–60.
11. Anderson, J.M.M.; Mair, B.A.; Rao, M.; Wu, C.H. Weighted least-squares reconstruction methods for positron emission tomography. IEEE Trans. Med. Imaging **1997**, 16, 159–165.
12. Veklerov, E.; Llacer, J. Stopping rule for the MLE algorithm based on statistical hypothesis testing. IEEE Trans. Med. Imaging **1987**, 6, 313–319.
13. Lange, K. Convergence of EM image reconstruction algorithms with Gibbs smoothing. IEEE Trans. Med. Imaging **1990**, MI-9, 439–446.
14. Lewitt, R.M. Multidimensional digital image representations using generalized Kaiser-Bessel window functions. J. Opt. Soc. Am. **1990**, 7, 1834–1846.
15. Silverman, B.W.; Jones, M.C.; Wilson, J.D.; Nychka, D.W. A smoothed EM approach to indirect estimation problems, with particular reference to stereology and emission tomography (with discussion). J. R. Stat. Soc. B **1990**, 52, 271–324.
16. Snyder, D.L.; Miller, M.I.; Thomas, L.J.; Politte, D.G. Noise and edge artifacts in maximum-likelihood reconstructions for emission tomography. IEEE Trans. Med. Imaging **1987**, 6, 228–238.
17. Fessler, J.A. Tomographic Reconstruction Using Information Weighted Smoothing Splines. In Information Processing in Medical Imaging; Barrett, H.H., Gmitro, A.F., Eds.; Springer-Verlag: Berlin, Germany, 1993; pp. 372–386.
18. La Rivière, P.J.; Pan, X. Nonparametric regression sinogram smoothing using a roughness-penalized Poisson likelihood objective function. IEEE Trans. Med. Imaging **2000**, 19, 773–786.
19. Rudin, L.; Osher, S.; Fatemi, E. Nonlinear total variation based noise removal algorithms. Physica D **1992**, 60, 259–268.
20. Huber, P.J. Robust regression: Asymptotics, conjectures, and Monte Carlo. Ann. Stat. **1973**, 1, 799–821.
21. Yu, D.F.; Fessler, J.A. Edge-preserving tomographic reconstruction with nonlocal regularization. IEEE Trans. Med. Imaging **2002**, 21, 159–173.
22. Evans, J.D.; Politte, D.A.; Whiting, B.R.; O’Sullivan, J.A.; Williamson, J.F. Noise-resolution tradeoffs in X-ray CT imaging: A comparison of penalized alternating minimization and filtered backprojection algorithms. Med. Phys. **2011**, 38, 1444–1458.
23. Ma, J. Total Variation Smoothed Maximum Penalized Likelihood Tomographic Reconstruction with Positivity Constraints. In Proceedings of the 8th IEEE International Symposium on Biomedical Imaging, Chicago, IL, USA, April 2011; pp. 1774–1777.
24. Sidky, E.Y.; Duchin, Y.; Pan, X.; Ullberg, C. A constrained, total-variation minimization algorithm for low-intensity X-ray CT. Med. Phys. **2011**, 38, S117–S125.
25. Lauzier, P.T.; Tang, J.; Chen, G.H. Quantitative evaluation method of noise texture for iteratively reconstructed X-ray CT images. Proc. SPIE, Medical Imaging 2011: Physics of Medical Imaging **2011**, 7961, Article 796135.
26. Ma, J. Positively constrained multiplicative iterative algorithm for maximum penalized likelihood tomographic reconstruction. IEEE Trans. Nucl. Sci. **2010**, 57, 181–192.
27. Dempster, A.; Laird, N.; Rubin, D. Maximum likelihood from incomplete data via the EM algorithm (with discussion). J. R. Stat. Soc. B **1977**, 39, 1–38.
28. Wei, G.; Tanner, M. A Monte Carlo implementation of the EM algorithm and the Poor Man’s data augmentation algorithm. J. Am. Stat. Assoc. **1990**, 85, 699–704.
29. Lange, K.; Carson, R. EM reconstruction algorithms for emission and transmission tomography. J. Comput. Assist. Tomogr. **1984**, 8, 306–316.
30. Ma, J. On iterative Bayes algorithms for emission tomography. IEEE Trans. Nucl. Sci. **2008**, 55, 953–966.
31. Green, P. Bayesian reconstruction from emission tomography data using a modified EM algorithm. IEEE Trans. Med. Imaging **1990**, 9, 84–93.
32. De Pierro, A.R. A modified expectation maximization algorithm for penalized likelihood estimation in emission tomography. IEEE Trans. Med. Imaging **1995**, 14, 132–137.
33. Csiszár, I.; Tusnády, G. Information geometry and alternating minimization procedures. Stat. Decis. **1984**, Supplement Issue No. 1, 205–237.
34. O’Sullivan, J.; Benac, J. Alternating minimization algorithms for transmission tomography. IEEE Trans. Med. Imaging **2007**, 26, 283–297.
35. Csiszár, I. Why least squares and maximum entropy? An axiomatic approach to inference for linear inverse problems. Ann. Stat. **1991**, 19, 2032–2066.
36. O’Sullivan, J.A.; Whiting, B.R.; Snyder, D.L. Alternating Minimization Algorithms for Transmission Tomography Using Energy Detectors. In Proceedings of the 36th Asilomar Conference on Signals, Systems and Computers, St. Louis, USA, 2002; Volume 1, pp. 144–147.
37. Lasio, G.M.; Whiting, B.R.; Williamson, J.F. Statistical reconstruction for X-ray computed tomography using energy-integrating detectors. Phys. Med. Biol. **2007**, 52, 2247–2266.
38. Lange, K.; Hunter, D.R.; Yang, I. Optimization transfer using surrogate objective functions. J. Comput. Graph. Stat. **2000**, 9, 1–20.
39. Erdoğan, H.; Fessler, J.A. Monotonic algorithms for transmission tomography. IEEE Trans. Med. Imaging **1999**, 18, 801–814.
40. Böhning, D.; Lindsay, B.G. Monotonicity of quadratic approximation algorithms. Ann. Inst. Stat. Math. **1988**, 40, 641–663.
41. Chan, R.H.; Ma, J. A multiplicative iterative algorithm for box-constrained penalized likelihood image restoration. IEEE Trans. Image Process. **2012**, 21, 3168–3181.
42. Gasso, G.; Rakotomamonjy, A.; Canu, S. Recovering sparse signals with a certain family of non-convex penalties and DC programming. IEEE Trans. Signal Process. **2009**, 57, 4686–4698.
43. Luenberger, D. Linear and Nonlinear Programming, 2nd ed.; J. Wiley: New York, NY, USA, 1984.
44. Ahn, S.; Fessler, J.A. Emission image reconstruction for randoms-precorrected PET allowing negative sinogram values. IEEE Trans. Med. Imaging **2004**, 23, 591–601.
45. Lange, K.; Bahn, M.; Little, R. A theoretical study of some maximum likelihood algorithms for emission and transmission tomography. IEEE Trans. Med. Imaging **1987**, 6, 106–114.
46. Ober, R.J.; Zou, Q.; Zhiping, L. Calculation of the Fisher information matrix for multidimensional data sets. IEEE Trans. Signal Process. **2003**, 51, 2679–2691.
47. Ma, J.; Hudson, H.M. Modified Fisher scoring algorithms using Jacobi or Gauss-Seidel subiterations. Comput. Stat. **1997**, 12, 467–479.
48. Hudson, H.; Ma, J.; Green, P. Fisher’s method of scoring in statistical image reconstruction: Comparison of Jacobi and Gauss-Seidel iterative schemes. Stat. Methods Med. Res. **1994**, 3, 41–61.
49. Ortega, J.M.; Rheinboldt, W.C. Iterative Solutions of Nonlinear Equations in Several Variables; Academic Press: New York, NY, USA, 1970.
50. Osborne, M.R. Fisher’s method of scoring. Int. Stat. Rev. **1992**, 60, 99–117.
51. Sauer, K.; Bouman, C. A local update strategy for iterative reconstruction from projections. IEEE Trans. Signal Process. **1993**, 41, 533–548.
52. Bouman, C.A.; Sauer, K. A unified approach to statistical tomography using coordinate descent optimization. IEEE Trans. Image Process. **1996**, 5, 480–492.

© 2013 by the author; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

Ma, J. Algorithms for Non-Negatively Constrained Maximum Penalized Likelihood Reconstruction in Tomographic Imaging. *Algorithms* **2013**, *6*, 136–160. https://doi.org/10.3390/a6010136