# Bayesian Maximum Entropy Based Algorithm for Digital X-ray Mammogram Processing

## Abstract


## 1. Introduction

## 2. Methods

#### 2.1. Inverse problems

**f**.

**R** is likely to be either singular or very ill-conditioned. To turn the inverse problem into a well-posed one, the goal is relaxed to seeking only a reliable estimate, say $\tilde{\mathit{f}}$, which, in a practical sense, may be considered “very close” to, or the “best approximation” of, the exact image **f**, given the measured sample data $g_m,\, m=1,2,\ldots,M$, the known response function $r_m(t),\, m=1,2,\ldots,M$, of the measuring instrument, and some information about the noise components $e_m,\, m=1,2,\ldots,M$, such as their covariance matrix $\mathbf{C}=\{C_{ij}\}_{i,j=1,2,\ldots,M}$. In this respect, Gull and Daniell [18] suggested the ME principle to assign a probability distribution to any image, along with a $\chi^2$-constraint for handling the errors $e_m,\, m=1,2,\ldots,M$. The $\chi^2$ statistic measures how well a model $\tilde{\mathit{f}}$ fits the measured data **g**:
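As a rough illustration of this goodness-of-fit idea, the sketch below computes a $\chi^2$ misfit in Python with NumPy, assuming uncorrelated errors so that only the standard deviations $\sigma_m$ enter. The function name `chi_squared`, the discrete response matrix `R`, and all numerical values are hypothetical and are not taken from the paper.

```python
import numpy as np

def chi_squared(f, g, R, sigma):
    """Chi-squared misfit between the data g and the prediction R @ f.

    Assumes uncorrelated errors with standard deviation sigma[m] per datum.
    """
    residual = g - R @ f
    return float(np.sum((residual / sigma) ** 2))

# Toy example: M = 3 measurements of a 2-pixel image (made-up numbers).
R = np.array([[1.0, 0.0],
              [0.5, 0.5],
              [0.0, 1.0]])
f_trial = np.array([2.0, 4.0])
g = np.array([2.1, 2.9, 4.2])
sigma = np.array([0.1, 0.1, 0.2])

chi2 = chi_squared(f_trial, g, R, sigma)
print(f"chi^2 = {chi2:.3f} for M = {len(g)} data points")
```

In this toy case each residual equals one standard deviation, so $\chi^2$ comes out close to $M=3$, which is the order of magnitude expected of a model that fits the data to within the noise.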

**σ**, has the components:

#### 2.2. Bayesian image modeling

**Figure 2.** N mutually exclusive and exhaustive hypotheses $\{H_i\}_i$ (left); an event D in the hypothesis space (right).

**e** is data independent, additive, and Gaussian with zero mean and known standard deviation $\sigma_m$ in each measured pixel $m,\, m=1,2,\ldots,M$.

**f**.

**f** given **w**, $P(\mathit{w}|H)$ is the prior distribution of **w**, and $P(\mathit{f}|H)$ is the evidence for the image **f**. A prior $P(\mathit{w}|H)$ has to be assigned quite subjectively, based on our beliefs about images.

**f**) to be estimated is comparable to or larger than the number of measured data (components of **g**). A reasonable choice of prior knowledge is that the image **f** is a positive-valued vector. ME methods have historically been designed to make use of this prior knowledge in solving the inverse problem of image restoration.

**f** after observing the data **g**, that is, $P(\mathit{w}|\mathit{f},H)$ is normalized to unity, then the denominator in eq. (12) must satisfy $P(\mathit{f}|H)=\int_{\mathit{w}}P(\mathit{f}|\mathit{w},H)\,P(\mathit{w}|H)\,\mathrm{d}\mathit{w}$. The integral is often dominated by the likelihood at $\mathit{w}_{MP}$, so that the evidence is approximated by [20]:

#### 2.3. Image entropy

**f**. Suppose that the luminance in each pixel is quantized (in some units) to an integer value. Denote by U the total number of quanta in the whole image. The conservation of the number of quanta amounts to the following equality:
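To make the bookkeeping of quanta concrete, the following sketch assumes that the total is $U=\sum_n f_n$ and that the pixel proportions are $p_n=f_n/U$, which matches the way $U\cdot p_n$ appears in the constraints of Section 2.6. The Shannon form of the entropy used here is an assumption, since the paper's own expression is not reproduced at this point, and the function name `image_entropy` is hypothetical.

```python
import numpy as np

def image_entropy(f):
    """Shannon entropy of the pixel proportions p_n = f_n / U (assumed form)."""
    f = np.asarray(f, dtype=float)
    U = f.sum()                 # total number of quanta in the image
    p = f / U                   # proportions, which sum to 1
    nz = p > 0                  # 0 * log(0) is taken as 0
    return float(-np.sum(p[nz] * np.log(p[nz])))

# Two images with the same total U = 160 quanta over 16 pixels.
flat = np.full(16, 10)          # quanta spread evenly
spike = np.zeros(16)
spike[0] = 160                  # all quanta in a single pixel

print(image_entropy(flat), image_entropy(spike))
```

Under these assumptions the evenly spread image attains the largest possible entropy for 16 pixels, $\ln 16$, whereas the image that concentrates all quanta in one pixel has zero entropy.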

#### 2.4. Bayesian image restoration

**f**, assuming that both the image **f** and the data **g** are outcomes of random vectors described by probability distributions. The incomplete knowledge about the errors in measurement and modeling is captured by a noise random vector **e**, likewise expressed as a probability distribution. A prior probability also has to be defined, reflecting the uncertainty or partial knowledge about the image **f**.

**f** consists of the pixel values of a latent image, the M-dimensional data vector **g** consists of the pixel values of an observed image, assumed to be a degraded version of **f**, the matrix $\mathbf{R}$ is the PSF of the imaging system, and the errors **e** are assumed additive and zero-mean Gaussian. As such, $\mathit{e}=k\mathit{\sigma}$, which entails:

**σ** is the noise standard deviation and k is drawn from the standard normal distribution:
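A small simulation may help fix ideas about this degradation model. The sketch below assumes the linear form $\mathit{g}=\mathbf{R}\mathit{f}+\mathit{e}$ with $\mathit{e}=k\mathit{\sigma}$ applied pixel by pixel. The one-dimensional image, the tridiagonal blur matrix, and the noise level are all made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

N = M = 8
# Hypothetical latent image: a bright two-pixel feature on a dim background.
f = np.array([5, 5, 5, 50, 50, 5, 5, 5], dtype=float)

# Hypothetical 1-D PSF matrix: each pixel leaks 10% into each neighbour.
R = 0.8 * np.eye(M, N) + 0.1 * np.eye(M, N, k=1) + 0.1 * np.eye(M, N, k=-1)

sigma = np.full(M, 0.5)        # noise standard deviation in each pixel
k = rng.standard_normal(M)     # k drawn from the standard normal distribution
e = k * sigma                  # additive, zero-mean Gaussian errors
g = R @ f + e                  # observed (degraded) image

print(np.round(g, 2))
```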

#### Likelihood

**g** results:

**e**. If the pixels are uncorrelated and each pixel has the standard deviation $\sigma_m,\, m=1,2,\ldots,M$, then the symmetric full-rank covariance matrix $\mathbf{C}$ becomes diagonal with elements $C_{mm}=\sigma_m^2,\, m=1,2,\ldots,M$, that is, $\mathbf{C}=\mathrm{diag}(\sigma_1^2,\ldots,\sigma_M^2)$. Hence the probability of the data **g** given the image **f** can be written as:
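For a diagonal covariance matrix, the Gaussian likelihood factorizes over pixels and its logarithm is, up to a constant, $-\chi^2/2$. The sketch below evaluates that log-likelihood under exactly these assumptions. The function name `log_likelihood` and the test values are hypothetical.

```python
import numpy as np

def log_likelihood(g, f, R, sigma):
    """log P(g | f) for independent zero-mean Gaussian errors.

    sigma[m] is the standard deviation of the m-th datum, so the covariance
    matrix is diag(sigma**2).
    """
    residual = g - R @ f
    chi2 = np.sum((residual / sigma) ** 2)
    log_norm = -0.5 * np.sum(np.log(2.0 * np.pi * sigma ** 2))
    return float(log_norm - 0.5 * chi2)

# Two data points, identity response, unit noise (made-up values).
R = np.eye(2)
sigma = np.ones(2)
g = np.array([1.0, 2.0])

ll_exact = log_likelihood(g, np.array([1.0, 2.0]), R, sigma)  # perfect fit
ll_off = log_likelihood(g, np.array([1.0, 3.0]), R, sigma)    # one-sigma miss
print(ll_exact, ll_off)
```

A one-sigma miss in a single pixel lowers the log-likelihood by exactly 0.5, which is the $\chi^2/2$ term at work.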

#### Prior probability

**f** affects the amount by which the restored image is offset from reality.

**f** with entropy $S(\mathit{f})$ is postulated [27] as given by a general potential function $\Phi(\mathit{f})$, such as:

**f**, is assumed to belong. Apart from the entropic one [28], several other choices of prior probability distribution for an image may be considered and evaluated.

#### Posterior probability

**f** and the unknown PSF parameters, denoted generically by θ, need to be evaluated. Then the required inference about the posterior probability $P(\mathit{f}|\mathit{g},H)$ is obtained as a marginal integral of this joint posterior over the PSF parameters:

**f** drawn from some measured data **g** is given by Bayes’ theorem:

**f**. Its numerical minimization falls in the field of constrained nonlinear optimization problems.

#### 2.5. Regularization of the inverse problem

**f**.

**f**, and the data, **g**. Minimizing $\chi^2(\mathit{f})$ alone makes the agreement unrealistically good, while the solution becomes unstable and oscillating. Thus, $\chi^2(\mathit{f})$, by itself, defines a highly degenerate minimization problem. Conversely, $S(\mathit{f})$ measures the smoothness of the solution with respect to variations in the data. Maximizing $S(\mathit{f})$ alone yields a very smooth, stable solution that is virtually unrelated to the measured data. For this reason, $S(\mathit{f})$ is termed the stabilizing functional or regularizing operator. While $\chi^2(\mathit{f})$ relates to the a posteriori knowledge of a solution, $S(\mathit{f})$ relates to the a priori expectation. The adjustable parameter λ sets the trade-off between the constraints imposed on the two functionals (Figure 5). In addition, λ is needed to keep the minimization principle invariant to the units in which **f** is quantized (e.g., a change from 16-bit to 32-bit sampling) [26].

**f**. That is, when a quadratic minimization principle is combined with a quadratic constraint, and both are positive, only one of the two needs to be nondegenerate for the overall problem to be well-posed [26].
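The trade-off governed by λ can be illustrated numerically by comparing two extreme candidates: the naive inverse, which minimizes $\chi^2$ alone, and a flat image, which maximizes $S$ alone. The sketch assumes the combined objective $\chi^2(\mathit{f})-\lambda S(\mathit{f})$ (the sign convention is an assumption), a Shannon entropy on the normalized image, and a made-up one-dimensional problem with a deterministic alternating pattern standing in for the noise. All names and values are hypothetical.

```python
import numpy as np

def chi_squared(f, g, R, sigma):
    return float(np.sum(((g - R @ f) / sigma) ** 2))

def entropy(f):
    p = f / f.sum()                        # requires f > 0 everywhere
    return float(-np.sum(p * np.log(p)))

def objective(f, g, R, sigma, lam):
    # Assumed convention: minimise chi^2(f) - lambda * S(f).
    return chi_squared(f, g, R, sigma) - lam * entropy(f)

N = 8
f_true = np.array([5, 5, 5, 50, 50, 5, 5, 5], dtype=float)
R = 0.8 * np.eye(N) + 0.1 * np.eye(N, k=1) + 0.1 * np.eye(N, k=-1)
sigma = np.full(N, 0.5)
e = 0.5 * np.array([1, -1, 1, -1, 1, -1, 1, -1], dtype=float)
g = R @ f_true + e

f_inverse = np.linalg.solve(R, g)    # chi^2 ~ 0, but the noise is fitted too
f_flat = np.full(N, g.sum() / N)     # S = ln N, but the data are ignored

for lam in (0.0, 1e6):
    q_inv = objective(f_inverse, g, R, sigma, lam)
    q_flat = objective(f_flat, g, R, sigma, lam)
    best = "inverse" if q_inv < q_flat else "flat"
    print(f"lambda = {lam:g}: preferred candidate is the {best} image")
```

With λ = 0 the objective prefers the naive inverse, which reproduces the data exactly and carries the alternating error pattern into the estimate. With a very large λ it prefers the flat image. A useful restoration lies between the two, which is what tuning λ is meant to achieve.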

#### 2.6. Derivation of the potential function

- Conservation of the total number of photons in the measured image, **g**, and the model image, **f**: $$\sum_{n=1}^{N} p_n = 1$$
- Linear transform between the model space and the data space: $$g_m=\sum_{n=1}^{N} U\cdot R_{mn}\cdot p_n + e_m,\quad m=1,2,\ldots,M$$
- The errors $e_m,\, m=1,2,\ldots,M$, are normally distributed with zero mean, $\overline{e}=0$, and variances $\sigma_m^2,\, m=1,2,\ldots,M$: $$\sum_{m=1}^{M}\frac{e_m^2}{\sigma_m^2}=\Omega$$ where Ω denotes the expected value of the statistical goodness-of-fit $\chi^2$.

**f**, among the admissible set of images. The least committal Lagrangian associated with the objective function, $S\left(\mathit{f}\right)$, including the specific constraints, has the following explicit form:
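As a hedged sketch of what such a Lagrangian can look like, assume the Shannon form $S=-\sum_n p_n \ln p_n$ for the entropy and introduce two multipliers, here called λ and μ (names and sign conventions are assumptions and may differ from the original derivation). Substituting $e_m = g_m-\sum_n U\cdot R_{mn}\cdot p_n$ from the linear transform into the $\chi^2$ constraint gives:

$$\mathcal{L}(\mathit{p},\lambda,\mu)=-\sum_{n=1}^{N}p_n\ln p_n-\lambda\left[\sum_{m=1}^{M}\frac{1}{\sigma_m^2}\left(g_m-\sum_{n=1}^{N}U\cdot R_{mn}\cdot p_n\right)^2-\Omega\right]-\mu\left(\sum_{n=1}^{N}p_n-1\right)$$

Setting $\partial\mathcal{L}/\partial p_n=0$ under these assumptions yields

$$-\ln p_n-1+2\lambda U\sum_{m=1}^{M}\frac{R_{mn}\,e_m}{\sigma_m^2}-\mu=0\quad\Longrightarrow\quad p_n=\exp\left(-1-\mu+2\lambda U\sum_{m=1}^{M}\frac{R_{mn}\,e_m}{\sigma_m^2}\right),$$

which shows that every $p_n$ is automatically positive, in line with the positivity prior on the image.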

#### 2.7. Multidimensional optimization

## 3. Results and Discussion

#### 3.1. Physics of X-ray imaging

**Table 1.** Average recorded quanta around the incident point of the X-ray beam, as obtained from GEANT simulations for the most common X-ray energies.

| X-ray beam energy (keV) | Central recorded photons | Total recorded photons |
|---|---|---|
| 18 | 15,660 ± 125 | 18,390 ± 135 |
| 20 | 59,130 ± 240 | 70,900 ± 270 |
| 22 | 184,070 ± 430 | 226,290 ± 475 |
| 24 | 337,400 ± 580 | 424,340 ± 650 |
| 26 | 531,130 ± 730 | 683,330 ± 825 |
| 28 | 738,230 ± 860 | 976,540 ± 990 |
| 30 | 890,710 ± 940 | 1,204,780 ± 1,100 |

| Rows \ Columns | (0) | (1) | ..... | (xsize-2) | (xsize-1) |
|---|---|---|---|---|---|
| (0) | `0` | `1` | ..... | `xsize-2` | `xsize-1` |
| (1) | `xsize` | `xsize+1` | ..... | `2*xsize-2` | `2*xsize-1` |
| (2) | `2*xsize` | `2*xsize+1` | ..... | `3*xsize-2` | `3*xsize-1` |
| : | ..... | ..... | ..... | ..... | ..... |
| (ysize-2) | `(ysize-2)*xsize` | `(ysize-2)*xsize+1` | ..... | `(ysize-1)*xsize-2` | `(ysize-1)*xsize-1` |
| (ysize-1) | `(ysize-1)*xsize` | `(ysize-1)*xsize+1` | ..... | `ysize*xsize-2` | `ysize*xsize-1` |
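The table above describes a row-major linear indexing of the pixels. A minimal sketch of the mapping in both directions follows. The function names `flat_index` and `row_col` and the image size are hypothetical.

```python
def flat_index(row, col, xsize):
    """Linear index of the pixel at (row, col) in a row-major image."""
    return row * xsize + col

def row_col(k, xsize):
    """Inverse mapping: (row, col) of the pixel with linear index k."""
    return divmod(k, xsize)

xsize, ysize = 5, 4                               # made-up image size
first_of_row_1 = flat_index(1, 0, xsize)          # table entry "xsize"
last_pixel = flat_index(ysize - 1, xsize - 1, xsize)  # "ysize*xsize-1"
print(first_of_row_1, last_pixel, row_col(7, xsize))
```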

**Table 3.** PSF of the imaging system, defined as a symmetric, space-invariant $5\times 5$ matrix around the k-th pixel in the blurred image, resulting from simulations of average soft breast-like tissue and typical hard calcification opacities for an X-ray beam of 24 keV. Each cell gives the PSF weight followed by the linear index of the corresponding pixel.

| Row offset \ Column offset | −2 | −1 | 0 | +1 | +2 |
|---|---|---|---|---|---|
| −2 | 0.0030 `(k-2*xsize-2)` | 0.0050 `(k-2*xsize-1)` | 0.0070 `(k-2*xsize)` | 0.0050 `(k-2*xsize+1)` | 0.0030 `(k-2*xsize+2)` |
| −1 | 0.0050 `(k-xsize-2)` | 0.0120 `(k-xsize-1)` | 0.0160 `(k-xsize)` | 0.0120 `(k-xsize+1)` | 0.0050 `(k-xsize+2)` |
| 0 | 0.0070 `(k-2)` | 0.0160 `(k-1)` | 0.7950 `(k)` | 0.0160 `(k+1)` | 0.0070 `(k+2)` |
| +1 | 0.0050 `(k+xsize-2)` | 0.0120 `(k+xsize-1)` | 0.0160 `(k+xsize)` | 0.0120 `(k+xsize+1)` | 0.0050 `(k+xsize+2)` |
| +2 | 0.0030 `(k+2*xsize-2)` | 0.0050 `(k+2*xsize-1)` | 0.0070 `(k+2*xsize)` | 0.0050 `(k+2*xsize+1)` | 0.0030 `(k+2*xsize+2)` |
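The PSF weights and the neighbour indices of Table 3 can be combined into a direct blurring routine on a linearly indexed image. The sketch below uses the tabulated weights. The function name `blur_pixel`, the test image, and the choice to skip neighbours that fall outside the image are assumptions, since the border handling is not specified here.

```python
import numpy as np

# PSF weights as tabulated (rows and columns run over offsets -2..+2).
PSF = np.array([
    [0.0030, 0.0050, 0.0070, 0.0050, 0.0030],
    [0.0050, 0.0120, 0.0160, 0.0120, 0.0050],
    [0.0070, 0.0160, 0.7950, 0.0160, 0.0070],
    [0.0050, 0.0120, 0.0160, 0.0120, 0.0050],
    [0.0030, 0.0050, 0.0070, 0.0050, 0.0030],
])

def blur_pixel(image_flat, k, xsize, ysize):
    """PSF-weighted sum around linear index k.

    Neighbours outside the image are skipped (an assumed border rule).
    """
    row, col = divmod(k, xsize)
    total = 0.0
    for dy in range(-2, 3):
        for dx in range(-2, 3):
            r, c = row + dy, col + dx
            if 0 <= r < ysize and 0 <= c < xsize:
                total += PSF[dy + 2, dx + 2] * image_flat[k + dy * xsize + dx]
    return total

xsize = ysize = 7
ones = np.ones(xsize * ysize)              # uniform test image
centre = blur_pixel(ones, 3 * xsize + 3, xsize, ysize)
corner = blur_pixel(ones, 0, xsize, ysize)
print(f"PSF sum = {PSF.sum():.4f}, centre = {centre:.4f}, corner = {corner:.4f}")
```

Note that the tabulated weights sum to 0.987 rather than 1, so a uniform image of ones is mapped to 0.987 at interior pixels, and to smaller values near the border under the skipping rule assumed here.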

**Figure 6.** Artificially generated opacity of $2\times 2$ pixel size (left), embedded in a simulated background of breast-like tissue of $31\times 31$ pixel area (middle), and subsequently restored by the entropic restoration algorithm.

**Figure 7.** Arrangement of marble grains (left), raw radiography (middle), and radiography processed by the entropic algorithm (right). Average dimensions of individual grains and grain clusters are specified in millimeters.

#### 3.2. X-ray image quality assessment

**f**, can be used as a figure of merit of the reconstruction algorithm. Yet too high a value of D may place the restored image too far from the original one and raise concerns about spurious features for which the acquired data offer no clear evidence.

**g**, and the noise (error), **e**.

**f**, of the measured data **g** is available, the quality of image restoration algorithms may be assessed by the improvement in signal-to-noise metric:
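One common way to define such a metric in the image restoration literature is the improvement in signal-to-noise ratio, $\mathrm{ISNR}=10\log_{10}\left(\lVert\mathit{f}-\mathit{g}\rVert^2/\lVert\mathit{f}-\tilde{\mathit{f}}\rVert^2\right)$, in decibels. Whether this is exactly the expression intended here is an assumption. The sketch below implements that definition on made-up vectors, and the function name `isnr_db` is hypothetical.

```python
import numpy as np

def isnr_db(f, g, f_restored):
    """Improvement in SNR (dB): 10*log10(||f - g||^2 / ||f - f_restored||^2)."""
    num = np.sum((f - g) ** 2)           # error energy of the degraded image
    den = np.sum((f - f_restored) ** 2)  # error energy of the restored image
    return float(10.0 * np.log10(num / den))

f = np.array([10.0, 20.0, 30.0, 40.0])            # reference image (made up)
g = f + np.array([2.0, -2.0, 2.0, -2.0])          # degraded image
f_restored = f + np.array([1.0, -1.0, 1.0, -1.0]) # restoration halves the error

gain = isnr_db(f, g, f_restored)
print(f"ISNR = {gain:.2f} dB")
```

Positive values indicate that the restored image is closer to the reference than the degraded one. Halving the error amplitude, as in this toy case, corresponds to a gain of about 6 dB.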

#### 3.3. Improvement of digital and digitized mammograms

**Figure 8.** Full breast mammograms: Fundeni Hospital - normal breast (left), case mdb032, MIAS - benign ill-defined masses, fatty-glandular tissue (middle), case mdb005, MIAS - benign circumscribed masses, fatty tissue (right). Digitized X-ray images by optical scanning (top) and digitally restored images (bottom).

**Figure 9.** Close-up of malignant abnormalities in mammograms: case mdb245, MIAS - microcalcification cluster, fatty tissue (left), case #9, UNC Radiology - breast carcinoma (middle), Salus Hospital - stellate patterns (right). Digitized X-ray images by optical scanning (top) and digitally restored images (bottom).

## 4. Conclusion

## Acknowledgements

## References and Notes

1. Claridge, E.; Richter, J.H. Characterisation of mammographic lesions. In Digital Mammography; Gale, A.G., Astley, S.M., Dance, D.R., Alistair, A.Y., Eds.; Elsevier: Amsterdam, Netherlands, 1994.
2. Meyer, Y. An introduction to wavelets and ten lectures on wavelets. Bull. Amer. Math. Soc. **1993**, 28, 350–359.
3. Mallat, S. Une Exploration des Signaux en Ondelettes; Les Editions de l’Ecole Polytechnique, 2000.
4. Donoho, D.L. Unconditional bases and bit-level compression. Applied and Computational Harmonic Analysis **1996**, 1, 100–105.
5. Candes, E.J. Ridgelets: Theory and applications. PhD Thesis, Department of Statistics, Stanford University, CA, USA, 1998.
6. Mutihac, R.; Cicuttin; Jansen, K.; Mutihac, R.C. An essay on Bayesian inference and maximum entropy. Roumanian Biotechnological Letters **2000**, 5, 83–114.
7. Shore, J.E.; Johnson, R.W. Axiomatic derivation of the principle of maximum entropy and the principle of minimum cross-entropy. IEEE Trans. Inform. Theory **1980**, IT-26, 26–39 and IT-29, 942–943.
8. Skilling, J. The axioms of maximum entropy. In Maximum Entropy and Bayesian Methods in Science and Engineering; Erickson, G.J., Smith, C.R., Eds.; Kluwer Academic Publishers: Dordrecht, Netherlands, 1988; Vol. I, pp. 173–187.
9. Bayes, T. An essay towards solving a problem in the doctrine of chances. Phil. Trans. R. Soc. London **1763**, 53, 330–418.
10. Jaynes, E.T. Papers on Probability, Statistics and Statistical Physics; Rosenkrantz, R.D., Ed.; Kluwer Academic Press: Dordrecht, Netherlands, 1983.
11. Skilling, J. Fundamentals of MaxEnt in data analysis. In Maximum Entropy in Action; Buck, B., Macaulay, V.A., Eds.; Clarendon Press: Oxford, UK, 1994; pp. 19–39.
12. MacKay, D.J.C. Information Theory, Inference, and Learning Algorithms; Cambridge University Press: Cambridge, UK, 2003.
13. Berger, J. Statistical Decision Theory and Bayesian Analysis; Springer-Verlag: New York, NY, USA, 1985.
14. Hadamard, J. Lectures on the Cauchy Problem in Linear Partial Differential Equations; Yale University Press: Yale, CT, USA, 1923.
15. Engl, H.; Hanke, M.; Neubauer, A. Regularization of inverse problems. In Series - Mathematics And Its Applications; 375; Springer-Verlag: New York, NY, USA, 1996.
16. Cox, R. Probability, frequency, and reasonable expectation. Am. J. Phys. **1946**, 14, 1–13.
17. Djafari, A.M. Maximum entropy and linear inverse problems. In Maximum Entropy and Bayesian Methods; Djafari, A.M., Demoment, G., Eds.; Kluwer Academic Publishers: Dordrecht, Netherlands, 1993; pp. 253–264.
18. Gull, S.F.; Daniell, G.J. Image reconstruction from incomplete and noisy data. Nature **1978**, 272, 686–690.
19. Weiss, N.A. Introductory Statistics; Addison-Wesley: Boston, MA, USA, 2002.
20. MacKay, D.J.C. A practical Bayesian framework for backpropagation networks. Neural Comput. **1992**, 4, 448–472.
21. Balasubramanian, V. Occam’s razor for parametric families and priors on the space of distributions. In Maximum Entropy and Bayesian Methods, Proceedings of the 15th International Workshop, Santa Fe, 1995; Hanson, K.M., Silver, R.N., Eds.; Kluwer Academic Publishers: Dordrecht, Netherlands, 1996; pp. 277–284.
22. Balasubramanian, V. Statistical inference, Occam’s razor, and statistical mechanics on the space of probability distributions. Neural Comput. **1997**, 9(2), 349–368.
23. Shannon, C.E. A mathematical theory of communication. Bell Syst. Tech. J. **1948**, 27, 379–423.
24. Pal, N.R.; Pal, S.K. Entropy: A new definition and its applications. IEEE Trans. Syst., Man, Cybern. **1991**, 21, 1260–1270.
25. Skilling, J. Classic maximum entropy. In Maximum Entropy and Bayesian Methods; Skilling, J., Ed.; Kluwer Academic: Norwell, MA, USA, 1989; pp. 45–52.
26. Press, W.H.; Flannery, B.P.; Teukolsky, S.A.; Vetterling, W.T. Numerical Recipes in C: The Art of Scientific Computing, 3rd ed.; Cambridge University Press: Cambridge, UK, 2007.
27. Djafari, A.M. A full Bayesian approach for inverse problems. In Maximum Entropy and Bayesian Methods; Hanson, K.M., Silver, R.N., Eds.; Kluwer Academic Publishers: Dordrecht, Netherlands, 1995; pp. 135–144.
28. Hanson, K.M. Making binary decisions based on the posterior probability distribution associated with tomographic reconstructions. In Maximum Entropy and Bayesian Methods; Smith, C.R., Erickson, G.J., Neudorfer, P.O., Eds.; Kluwer Academic Publishers: Dordrecht, Netherlands, 1992; pp. 313–326.
29. MacKay, D.J.C. Bayesian interpolation. In Maximum Entropy and Bayesian Methods; Smith, C.R., Erickson, G.J., Neudorfer, P.O., Eds.; Kluwer Academic Publishers: Dordrecht, Netherlands, 1992.
30. Myrheim, J.; Rue, H. New algorithms for maximum entropy image restoration. Graphical Models and Image Processing Archive **1992**, 54, 223–238.
31. Agmon, N.; Alhassid, Y.; Levine, R.D. An algorithm for finding the distribution of maximal entropy. J. Comput. Phys. **1979**, 30, 250–258.
32. Wilczek, R.; Drapatz, S. A high accuracy algorithm for maximum entropy image restoration in the case of small data sets. Astron. Astrophys. **1985**, 142, 9–12.
33. Ortega, J.M.; Rheinboldt, W.B. Iterative Solution of Nonlinear Equations in Several Variables; Academic Press: New York, NY, USA, 1970.
34. Brodlie, K.W. Unconstrained minimization. In The State of the Art in Numerical Analysis; Jacobs, D.A.H., Ed.; Academic Press: London, UK, 1977; pp. 229–268.
35. Cornwell, T.J.; Evans, K.J. A simple maximum entropy deconvolution algorithm. Astron. Astrophys. **1985**, 143, 77–83.
36. Evans, A.L. The evaluation of medical images. In Medical Physics Handbooks; Adam Hilger: Bristol, UK, 1981; Volume 10, pp. 45–46.
37. Benini, L.; et al. Synchrotron radiation application to digital mammography. A proposal for the Trieste Project “Elettra”. Phys. Med. **1990**, VI, 293.
38. Mutihac, R.; Colavita, A.A.; Cicuttin, A.; Cerdeira, A.E. Maximum entropy improvement of X-ray digital mammograms. In Digital Mammography; Karssemeijer, N., Thijssen, M., Hendriks, J., van Erning, L., Eds.; Kluwer Academic Publishers: Dordrecht, Netherlands, 1998; pp. 329–337.
39. Allison, J.; Amako, K.; Apostolakis, J.; Araujo, H. Geant4 developments and applications. IEEE T. Nucl. Sci. **2006**, 53, 270–278.
40. Mutihac, R.; Colavita, A.A.; Cicuttin, A.; Cerdeira, A.E. X-Ray image improvement by maximum entropy. In Proceedings of the 13th IEEE & EURASIP International Conference on Digital Signal Processing, Santorini, Greece, 1997; Vol. II, pp. 1149–1152.
41. Jannetta, A.; Jackson, J.C.; Kotre, C.J.; Birch, I.P.; Robson, K.J.; Padgett, R. Mammographic image restoration using maximum entropy deconvolution. Phys. Med. Biol. **2004**, 49, 4997–5010.
42. Arfelli, F. Silicon detectors for synchrotron radiation digital mammography. Nucl. Instrum. Meth. **1995**, A 360, 283–286.
43. Di Michiel, M. Un rivelatore di silicio a pixel per immagini in radiologia diagnostica. PhD Thesis (unpublished), Universita di Trieste, June 1994.
44. Thijssen, M.A.O.; Bijkerk, K.R.; van der Burght, R.J.M. Manual CDRAD-phantom type 2.0; Department of Radiology, University Hospital Nijmegen: The Netherlands, 1988–1992.
45. Banham, M.R.; Katsaggelos, A.K. Digital image restoration. IEEE Signal Proc. Mag. **1997**, 3, 24–41.
46. Sprawls, P., Jr. Physical Principles of Medical Imaging, 2nd ed.; Medical Physics Publishing: Madison, Wisconsin, USA, 1995; Ch. 12, pp. 171–172.
47. Zadeh, H.S-.; Windham, J.P.; Yagle, A.E. A multidimensional nonlinear edge-preserving filter for magnetic resonance image restoration. IEEE T. Image Process. **1995**, 4, 141–161.
48. Goyette, J.A.; Lapin, G.D.; Kang, M.G.; Katsaggelos, A.K. Improving autoradiograph resolution using image restoration techniques. IEEE Eng. Med. Biol. **1994**, 8-9, 571–574.

© 2009 by the authors; licensee Molecular Diversity Preservation International, Basel, Switzerland. This article is an open-access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).

## Share and Cite

**MDPI and ACS Style**

Mutihac, R.
Bayesian Maximum Entropy Based Algorithm for Digital X-ray Mammogram Processing. *Algorithms* **2009**, *2*, 850-878.
https://doi.org/10.3390/a2020850
