A Soft-Threshold Filtering Approach for Tomography Reconstruction from a Limited Number of Projections with Bilateral Edge Preservation

In X-ray tomography image reconstruction, one of the most successful approaches involves a statistical approach with l2 norm for fidelity function and some regularization function with lp norm, 1<p<2. Among them stands out, both for its results and the computational performance, a technique that involves the alternating minimization of an objective function with l2 norm for fidelity and a regularization term that uses discrete gradient transform (DGT) sparse transformation minimized by total variation (TV). This work proposes an improvement to the reconstruction process by adding a bilateral edge-preserving (BEP) regularization term to the objective function. BEP is a noise reduction method and has the purpose of adaptively eliminating noise in the initial phase of reconstruction. The addition of BEP improves optimization of the fidelity term and, as a consequence, improves the result of DGT minimization by total variation. For reconstructions with a limited number of projections (low-dose reconstruction), the proposed method can achieve higher peak signal-to-noise ratio (PSNR) and structural similarity index measurement (SSIM) results because it can better control the noise in the initial processing phase.


Introduction
X-ray computer tomography (CT) measures the attenuation of X-ray beams passing through an object, generating projections. Such projections are processed, resulting in an image (slice) of the examined object. This is known as a CT image reconstruction. The CT scan, formed by concatenating a large number of adjacent reconstructed images, has been proven to have great value in delivering rapid and accurate diagnoses for many cases in modern medicine. Although CT scanning has evolved considerably since its creation in 1972 by Godfrey Hounsfield, only recently has the concern with radiation levels in radiological examinations become important, leading to the "as-low-as-reasonably-achievable principle", known as the ALARA principle. The ALARA principle states that only the minimum amount of radiation must be applied to the patient. For this reason, ALARA is widely accepted in the medical CT community [1]. To reduce the X-ray dose of the patient during the CT scan, there are two possibilities: (1) reduce the amount of projection (the quantity of X-rays emitted) during the CT scan or (2) reduce the power of the X-ray source during image acquisition. Both cases generally lead to low-quality reconstructed CT images. Then, a state-of-the-art problem is to propose a method that allows good-quality CT image reconstruction with a low-dose X-ray source. The central theme of this work is the reconstruction of images from the signal of the CT process, where The Bayesian statistical approach is widely applied to the reconstruction of X-ray tomographies, with some variations, [14][15][16]18,27,28], and makes it possible to insert prior knowledge into the CT system model. This approach promises two advantages. First, it provides the search with more satisfactory solutions (noiseless ones) through the limitation of the searchable set of solutions using an a priori function (known as restriction). This restriction may mean that, for example, very high variations (high frequencies) between neighboring pixels will be considered as noise (and therefore must be discarded), and moderate frequencies will be considered as edges (and therefore must be preserved). In the context of computed tomography, this means that large differences in intensity between neighboring pixels tend to be interpreted as noise, and therefore, such a solution should be disregarded [5]. Moreover, solving the CT reconstruction system, y = Ax + e, is an inverse and ill-posed problem, and prior knowledge often ensures the stability of the solution. Second, we can adopt a simplified mathematical model for tomographic image reconstruction and compensate its inefficiency (instability, noisy reconstruction, etc.) by adding the statistical component (prior knowledge) to the objective function. However, the model simplification has its limitations and should be used restrictively [5]. As a consequence of prior knowledge introduction, more satisfactory solutions (low noise level) can be found. Maximum a posteriori (MAP) is a useful statistical framework for CT reconstruction [12,15,16,27,28] and favors the incorporation of the regularization term with prior knowledge into the model. The MAP strategy provides an objective function composed by the sum of the probability (also known as fidelity) and the regularization function that establishes the optimization restriction criteria, also known as a prior.

Signal Modeling and Error Considerations in CT Image Reconstruction
As previously discussed, the statistical approaches can reduce deficiencies caused by classical mathematical modeling without having to literally incorporate the complexity of a real-world model. However, before proposing a statistical (non-deterministic) model that results in a lower noise reconstruction, it is necessary to establish the process as a whole. As shown in Figure 1, the process begins with a synthetic image, µ. In this work, we use different synthetic images, namely Shepp-Logan head phantom and FORBILD head and abdomen phantom definitions [2,29]. The synthetic image is submitted to the Radon transform, R (.), generating the ideal (free of noise) signal of the CT scan. The acquisition of data in the CT equipment depends on the amount of photons that reaches each detector. Once this problem is a particle countable process, well described by Poisson statistics [8], it is used in the model in Figure 1, represented by the N p random variable. However, due to the high number of photons, the acquisition process can be modeled as Gaussian due to the central limit theorem. In addition, the Gaussian model leads to additive algorithms, whereas the Poisson model leads to multiplicative, and therefore less efficient, algorithms [24]. In this work, we assume the signal arriving at the CT equipment detectors is influenced by Gaussian additive noise and, from among the dosage reduction methods presented in Section 1, we chose to emulate the low radiation dosage by reducing the number of projection angles processed. This means we assume that each detector absorbs an amount of photons that allows modeling the noise as a Gaussian additive, and the low dosage occurs by reducing the number of projections captured by the detectors (by reducing scanning angles). Accordingly, even with the process having a Poissonian nature, Gaussian additive noise can be added to the process, as where y N is the resulting signal that approximates a tomography signal, R (µ) is the result of applying the Radon transform on the synthetic image, µ, and N g is the Gaussian additive noise. The remainder of this work is dedicated to the reconstruction of the CT image, µ N , from the signal, y N , and the reduction of the global error, i.e., the reduction of the difference between synthetic and reconstructed images. As a criterion for measuring the quality of image reconstruction, we use peak signal-to-noise ratio (PSNR) and structural similarity, known as SSIM [30].

The Contribution of This Work
This work proposes a MAP solution with adaptive regularization term modeled by a bilateral edge-preserving (BEP) function [31,32] to regularize the fidelity term (l 2 norm). The result of BEP regularization is then subject to regulation by the total variation (TV) of the discrete gradient transform (DGT) function. The results of reconstruction with an adaptive norm, l a , using BEP are compared via structural similarity (SSIM) and peak signal-to-noise ratio (PSNR) with a simultaneous algebraic reconstruction technique (SART) reconstruction regularized via the TV minimization of the discrete gradient transform (DGT) function. The image reconstruction is an iterative process, and SSIM is calculated for each step, making it possible to objectively compare the methods step by step.
The assumption is that the better the reconstruction method, the higher the SSIM value associated with the reconstructed image. We also use the well-known PSNR metric to compare the process of image reconstruction step by step. The proposal is to determine if both SSIM and PSNR present consistent results in comparison to each other. Both approaches use synthetic images (the same used to generate the input signal, y N ) as a reference (for comparison). The rest of this work is organized as follows. In Section 2, the MAP model is developed, resulting in the objective function. In Section 3, the optimization technique of the objective function is developed. In Section 4, experiments are performed, and the results are presented and analyzed. Finally, in Section 5, we present the conclusions and final considerations.

Modeling the Objective Function
Most modern CT scanners use energy integration detectors whose photon counts are proportional to the total energy incident on them. Energy, in turn, is proportional to the number of X-ray photons that affect the detectors (sensors) of the tomograph. The denser the region traversed by X-ray photons, the lower the count I i of detected photons over integral line L i , i = 1, ..., N I , where N I is the maximum number of projections acquired by the CT scanner. This is known as the Beer-Lambert law, defined as where I 0 is the number of detected photons when the beam finds no obstacle, and the exponential term is the integral of all linear attenuation coefficients µ (x, y) on the line L i (with (x, y) being 2-D coordinates following L i ), which is the path of the beam. Equation (2) assumes that every X-ray emission has the same energy level, meaning that the process is monoenergetic. This approach is adopted in many works such as [9][10][11]15,16,27,28,[33][34][35], with the advantage of avoiding the beam hardening problem. Moreover, the monoenergetic approach leads to a more tractable mathematical model. However, the emission of X-rays is, by nature, polyenergetic. As a consequence, the same object reacts differently when subjected to X-rays of different energy levels, generating unwanted artifacts in the reconstructed image. These defects can be avoided, but with the adoption of complex models as in [12,13,36] and at a high computational cost. This topic is complex and still subject to change because CT scanners using monoenergetic X-ray sources are beginning to emerge [37]. In favor of a better understanding of the purpose of this work, we first present a base solution that uses SART reconstruction regularized via TV minimization of the DGT function (SART+DGT), highlighting the relevant parts, and then we present our approach. This strategy is trustworthy because makes it clear the value of the contribution in this work. The proposed method is abbreviated as SART+BEP+DGT.

Objective Function Modeling Using Soft-threshold Filtering for CT Image Reconstruction
This method consists of modeling an objective function with a l 2 norm fidelity function added and a DGT prior function regularized by a l 1 norm with TV minimization. Optimization of the objective function (Section 3) is performed using alternating minimization. The fidelity term is minimized by SART. The regularization (prior minimization) is performed by constructing a pseudo-inverse of the DGT and adapting a soft-threshold filtering algorithm whose convergence and efficiency have been theoretically proven by [38].
The key aspect of the modeling process is that reconstruction estimates the discrete attenuation, µ (x, y) for each j pixel of the image, with j = 1, ..., N J . Thus, the integral over the line, can be discretized as where A = a ij N I ×N J is the matrix representing the system geometry, µ = µ 1 , ..., µ N J T is the linear attenuation coefficient vector with µ j representing the j-th pixel, and the symbol T is the transpose of the matrix. In this model, every a ij is defined as the normalized length of the intersection between the i-th projection beam and j-th rectangular pixel centered in (x, y). The emission of X-ray photons is a rare event, so a Poisson distribution is usually adopted to describe the probabilistic model, expressed as where y i is the projection (measurement) along the i-th X-ray beam, andȳ i is the expected value.
Because the X-ray beams are independent from each other, taking into account Equation (4), the joint probability of y = y 1 , y 2 , ..., y N I given µ, P (y|µ) and observing y i countable events may be expressed as Using the MAP approach, as in [9,15,16,27], we have the objective function as follows: where 2 is the fidelity term, withp i being an estimate of p i , and R DGT (µ) (multiplied by a factor β = 1 ignored in Equation (6)) is the regularization term with l 1 norm based on DGT, defined as where j = (m − 1) × W + n, m = 1, 2, ..., H, n = 1, 2, ..., W, with W and H being, respectively, the width and height of the matrix representing the image with N J = W × H pixels. By definition, TV is the sum of DGT for all pixels of the image: with Dµ = D 1 µ, ..., D N J µ T . Thus, introducing the auxiliary variable ν = Dµ and applying the transformation with Λ = diag( y i /2) ∈ R N I × R N I being a diagonal matrix, the objective function in Equation (6) can be rewritten as where β is a positive adjustment parameter to balance the terms of fidelity and TV and is usually set to 1 [12,16]. The ultimate goal is to minimize the objective function Φ (µ), obtainingμ, as shown below: where the fidelity term, F (µ), represented both in the expanded version, as in Equation (6), and in the compact version, as in Equation (10), is shown below as and R (µ) is the restriction that drives the solution according to certain criteria (l 1 norm in this case). The optimization of F (µ), although simple, is an important concept and can be defined as follows: Inspired by the model in Equation (10), we propose in what follows a method for CT image reconstruction using adaptive soft-threshold filtering, which means, in brief, that the proposed method is intended to balance edge preservation and noise mitigation.

Objective Function Modeling by Using Bilateral Edge Preservation for CT Image Reconstruction
Regularization using the DGT (l 1 norm) works well for the CT reconstruction problem because it searches among the solutions of fidelity (l 2 norm) optimization looking for the one with a lower TV. However, it is common sense that the regularization based on the l 1 norm often introduces artificial edges in smooth transition areas. Moreover, a good regularization strategy must simultaneously perform noise suppression and edge preservation. With this motivation, Charbonnier et al. [24] proposed the bilateral edge preserving (BEP) regularization function, inspired by the bilateral total variation (BTV) regularization technique [39]. BTV regularization is defined by where q is a positive number, S l x and S m y are displacements by l and m pixels in the horizontal and vertical directions, respectively, X is the image in reconstruction/regularization, and α, 0 < α < 1, is applied to create a spatial decay effect for the sum of terms in BTV regularization. The BEP regulation uses the same principle as BTV but with an adaptive norm (instead of the l 1 norm) defined by where a is a positive value and s is the difference that one wants to minimize. This function was initially proposed by [24] to preserve edges in the image regularization process. The parameter a is used to specify the error value for which the regularization becomes linear (growing with the error) to constant (saturated, regardless of the error). The same adaptive norm definition is also used in super-resolution problems [32]. The ρ (s, a) function is an M-estimator since it corresponds to the maximum likelihood (LM) type estimation [40], and has its influence function given by The influence function indicates how much a particular measure contributes to the solution [32]. We illustrate in the graphs of Figure 2a the behavior of ρ (s, a) (the error norm function), and in Figure 2b, its influence function. It can be observed that as parameter a evolves from 0 to 1, the function changes its behavior from l 1 to l 2 norm. Thus, as mentioned by [31] and [32], Equation (15) behaves adaptively with respect to the norm that it implements. Therefore, combining Equations (14) and (15) for the particular case of tomography reconstruction, we propose an adaptive operator defined as where q, α, S l x , and S m y are the same as in Equation (14);μ, defined in Equation (13), is the estimated image obtained in the i-th iteration by l 2 minimization of the objective function in Equation (12); andμ j is the j-th pixel of imageμ, with j = 1, ..., N j . It is noteworthy that the term R BEP (μ) imposes an l a regularization norm, 1 < a < 2, on the imageμ. Thus, we can rewrite the objective function of Equation (10), Φ (µ), so that a new regularization term, R BEP (μ), is introduced between the l 2 norm minimization and TV minimization. As a consequence, the objective function proposed in this paper incorporates adaptive regularization to the objective function, and defining an auxiliary variable where γ is a positive adjustment parameter to balance the terms of fidelity and adaptive regularization. The other parameters are the same as in Equation (10), and η, 1 ≤ η ≤ 2, is the norm BEP method imposed on the regularization process.

Objective Function Optimization
The alternating minimization technique makes the simultaneous optimization of two or more terms of an objective function possible. Thus, for the proposed method, three steps are necessary: (1) minimizing F (µ) with SART, (2) applying the gradient descent (GD) method to the result of the first stage, using R BEP (µ) = γ σ η as a regularization term, and (3) applying DGT regularization to the previous result, minimizing β ν 1 with soft-threshold filtration. The three stages are repeated iteratively until a satisfactory result is obtained or a certain number of steps is reached. For the purpose of better understanding, each of the three stages is presented in sequence.

First Stage: Minimization of the Fidelity Term with SART
The first step is to solve the optimization problem described by Equation (13). A popular solution was proposed by Ge and Ming [33], and it can be computationally expressed by the iterative equatioñ where k is the iteration index, and 0 < λ k < 2 is an arbitrary relaxation parameter [15,41]. To simplify the notation, one can establish . Then, Equation (19) can be rewritten as where the term λ k is usually constant and equal to 1. The method described in Equation (19) is commonly known as SART. This method produces a relatively noisy reconstruction, as can be observed in Section 4. We reinforce here thatμ is the input of the second stage in the reconstruction process.

Second Stage: Bilateral Edge-Preserving with a Gradient Descent Method
In the second stage, the goal is to solve the optimization problem defined bŷ where γ is a parameter that weights the contribution of the constraint R BEP (Equation (17)). The gradient descent method can be applied to solve this problem as resulting in an optimization problem written as follows (see Appendix A for details): whereμ, as defined in Equation (13), is the result of first-stage minimization; ρ a = ρ (s, a) is as in Equation (15) but with s =μ; and ρ c = ρ (s, c) is the same as in Equation (15) but with a constant c instead of a constant a and s =μ − S l x S m yμ , where S l x and S m y are the same as in Equation (14). A computable matrix form was derived from Equation (23), and the result is shown below aŝ where γ k is an adjustment parameter to balance the k-th value ofμ k with the gradient descent contribution, R BEP μ k−1 ; ϕ is also an adjustment parameter, but it balances terms inside gradient descent; is the element-by-element product of two matrices of compatible dimensions; and I is the identity matrix. The matrix M =μ k − S l x S m yμ k is the difference betweenμ k and its version shifted by S l x S m y , and the operators H a (.) and H c (.) are defined, respectively, as H a μ k μ k and H c (M) M are influence functions (as defined in Equation (16)) resulting from the application of the gradient descent method.
It is important to clarify that in Equation (22), µ appears with the upper index k − 1 instead of k because the previous result of the gradient descent, µ k−1 , feeds the calculation of the current value, µ k , and this is the manner in which gradient descent works. In contrast, Equation (24) showsμ with upper index k (as inμ) rather than k − 1 becauseμ is obtained in the same interaction step, k, asμ, but in a previous stage denoted by the upper mark "tilde" (.), while the current stage is denoted by the upper mark "hat" (.).

Third Stage: TV Minimization by Soft-threshold Filtering
The third stage (TV optimization, R DGT ) is to solve the problem ν = Dµ, where D is not invertible, as proposed by Yu and Wang [15] and shown below: with where ω is a pre-established threshold;μ k = μ k mn , with m = 1, 2, ..., H and n = 1, 2, ..., W, with W and H being the width and height of the reconstructed image, respectively. D m,nμ k is the DGT matrix as defined in Equation (7). As explained in detail by [15] and observing Equation (27), when D m,nμ k < ω, the values of µ k m,n ,μ k m+1,n , andμ k m,n+1 must be adjusted so that D m,nμ k = 0. This means that if neighboring pixels in the reconstructed image are very close in value, it is likely that they have equal (or very close) values in the real image. Then, the method smooths the region around the pixel so that they look alike.
Alternately, when D m,nμ k ≥ ω, the goal is to reduce μ k m,n −μ k m+1,n 2 and μ k m,n −μ k m,n+1 2 but not cancel them. In this case, the method recognizes the differences between values of neighboring pixels as too significant to be totally eliminated. Instead, the differences are just softened.

Convergence and Convexity Considerations
The model proposed in this work, represented by Equation 18, gives an important initial gain in terms of PSNR to the reconstruction of CT images, as reported in Section 4. However, it is important to know how this model behaves in long-term processing. It is therefore necessary to investigate its convergence. We do this empirically by comparing the PSNR values of the cost function between the iterations k and k − 1, with 1 < k ≤ 5000. We present in Figure 3 a chart for each image used in the simulations of Section 4, and more specifically for the graphs and images of Figures 5 and 6 (Shepp-Logan head phantom), 7 and 8 (FORBILD head phantom), 9 and 10 (FORBILD abdomen phantom). It is worth noting that in all cases shown in Figure 3, the SART+BEP+DGT method is consistent with respect to convergence and presents better error reduction in terms of the PSNR metric. This simulation uses the same initial values defined in Section 4. The same situation applies when we graphically analyze the convergence of the proposed method using the SSIM metric, as can be seen in Figure 4. It is noteworthy that, according to Charbonnier et al. [24], the function described in Equation (15) is convex, and therefore, it would be possible to apply an iTV-style minimization procedure [42] to assure data consistency and regularization term improvement in each iteration step.

Experiments and Results
In the experiments, we used the synthetic images presented in Section 1.2, that is, Shepp-Logan head phantom, FORBILD head phantom, and FORBILD abdomen phantom. The signal from the CT equipment is simulated according to the model in Equation (1) addressing the low dosage scenario, considering a limited number of projections (meaning a limited number of scanning angles). On the image reconstruction side, we use the model y = Ax + e, which, as discussed in Section 1.1, denotes an inverse and ill-posed problem, where A (N I × N J ) is the matrix that describes the capture system, x (N J × 1) is the phantom represented lexicographically, and e (N J × 1) is the error, whose features were presented in Section 1.2. It is worth remembering that x is the image we intend to reconstruct from the noise signal y and, in the modeling process presented in Sections 2 and 3, we use the variable µ to represent it. By improving the system description, N I = n l n θ is the number of projections, where n l is the number of projection lines (i.e., the number of detectors) for each scan angle, and n θ is the total number of scan angles. n θ is the parameter whose value should be changed when the intention is to set a new dosage value, i.e., when we want to define a different (lower) number of projections, N I . The image has dimensions d × d, where d = N J = 2 R , R ∈ N + (positive natural). In this work, we use R = 9 (d = 512), and therefore, N J = d 2 = 262, 144, and N I (= n l n θ ) = 300n θ , with n l = 300 detectors. Thus, A has dimensions 300n θ × 262, 144, which are compatible with the dimensions of y and µ, respectively, i.e., 300n θ × 1 and 262, 144 × 1. It is important to note that the N I dimension of A (and y) is related to the number of scan angles, n θ , and this number of angles is what determines if the signal is of low (or regular) dosage, as discussed in Section 1, according to the ALARA principle. For a low dosage, we consider subsets of Θ, i.e., equally spaced sets of integer values between 0 and 179 degrees named Θ g . For example, Θ g=5 = {0, 44, 88, 132, 176} would be a possible subset, in which the g = 5 angles are equally spaced at 44 degrees. Using this notation, Θ is equivalent to Θ 180 , meaning that there are g = 180 scan angles equally spaced by 1 degree (which represents a regular dosage).
In the experiments with low dosage, the sets Θ g , with g in {15, 30}, will be used. A was obtained for a parallel architecture scanner.
By observing the objective function optimization process detailed in Section 3, and in alignment with the proposal in Section 1.3, we design a test framework that involves (1) the execution of the first stage (Section 3.1) alternating with the third stage (Section 3.3), which we will call here SART+DGT, and (2) execution of the first, second (Section 3.2), and third stages alternately and in this sequence, named SART+BEP+DGT. Simulations are shown in Table 1.  For both arrangements (SART+BEP and SART+BEP+DGT), the first stage is mandatory because it is the core of the reconstruction process. That is, it represents the optimization of the fidelity term in Equation (13). Because of its omnipresence, we also show simulations with SART only, which serve as a basis for comparing how much constraint terms actually contribute to the reconstruction process. The subsequent stages represent the application of constraint terms as described in Sections 3.2 and 3.3, respectively, Equations (24) and (26). For each of these test arrangements, it was arbitrarily established that the iterator, k, ranges from 1 to L, with 350 ≤ L < 1500, approximately. Because Gaussian noise and the Poisson process are random, each experiment is performed a considerable number of times, defined arbitrarily as 101 executions by experiment, and each result in Table 1 is the mean of the 101 SSIM and PSNR values. It is important to note that the result presented for each experiment (with a particular additive Gaussian noise or a certain number of projections) is the mean of 101 executions performed. Each execution produces a particular SSIM and PSNR result. We do not average pixels in any reconstructed image, but rather the SSIM and PSNR of the 101 executions performed for each testing case. The idea of using the average of a considerable number of iterations is based on the central limit theorem, which states that the arithmetic mean of a sufficiently large number of iterates of independent random variables will be approximately normally distributed, regardless of the underlying distribution, provided that each iteration has a finite expected value.

Low Dosage Tests and Results
As recommended by the ALARA principle, an alternative to reduce the total amount of radiation applied to a patient is decreasing the number of projections in the acquisition of the CT signal. According to the signal model proposed in Equation (1), we will consider the projections as individually influenced by Gaussian additive noise, and the low dosage signal is provided by reducing the number of scanning angles. In the batch of tests with low dosage projections, we consider using the sets of angles Θ g , with g in {15, 30}, where g is the amount of angles in Θ g (remember that in a regular dosage, we have 180 angles, from 0 o , ..., 179 o ). In our model, these values represent a reduced amount of photon emission, which can be understood as a low radiation dosage. All low dosage presented in this section is performed with signal-to-noise ratio (SNR) = 32, 46, and 60 dB. The SART stage used λ = 1, according [15] and [41]. The DGT stage maintained β = 1, as discussed in Section 2.1, and used as a threshold, ω, the average of the DGT for each k iteration. The BEP stage used γ = 0.001, ϕ = 0.150 (Section 3.2), a = 0.5, q = 3, α = 0.6, and c = 0.1 (Section 2.2). All parameters were empirically set.
A batch of experiments using the set Θ g , with g in {15, 30}, of projections for SART+BEP+DGT (named here as method A), SART+DGT (named here as B), and pure SART (named here as method C) methods are shown in Table 1 for PSNR and SSIM metrics, using the FORBILD abdomen phantom (FA), FORBILD head phantom (FH), and Shepp-Logan head phantom (SL) synthetic images. Analyzing the results for the PSNR metric with 15 and 30 projections, it is observed that for k = 350 steps, the results of the proposed method present a higher PSNR value in general. The exceptions are the FA and FH images, with an SNR of 32 dB. However, for k = 700 and 1000 steps and 30 projections, the results favor the SART+DGT method according to the PSNR metric. For 15 projections, results for k = 700 steps favor the proposed method in most tests performed with PSNR metrics. For the SSIM metric, the proposed method presents interesting results when compared to the SART+DGT method.
For the reconstruction of Shepp-Logan head phantom with 15 angles of projection, Figure 5 shows the evolution of the SSIM and PSNR values for SART+BEP+DGT (proposed), SART+DGT, and pure SART methods for SNR = 60 dB. In this particular experiment, marker 1 in Figure 5a indicates the highest SSIM value, 0.9240, reached by the proposed method and corresponding to the highest PSNR value, 72.7103, indicated by marker 1 in Figure 5b. Marker 2 shows in Figure 5a,b, respectively, the SSIM (0.8819 ) and PSNR ( 71.7521 ) values obtained in step k = 551. Marker 3 in Figure 5b highlights the point at which the SART+DGT method reaches the same PSNR value as the proposed method, in step k = 1015, approximately, and the graphs in Figure 5 agree with Table 1. Looking closely at Figure 6b,c, it is possible to note the presence of random noise (indicated by the white arrows) that manifests as small white dots in Figure 6c, while in Figure 6b this phenomenon is not easily perceived. This is because the BEP regularization used in the proposed method (Figure 6b) tends to eliminate noise faster. However, the reconstruction performed by the SART+DGT method produces a more homogeneous image, as shown in Figure 6c. This is also related to the elimination of random noise by the introduction of BEP regularization in the reconstruction process. Figure 6d presents the result of the image reconstruction using the pure SART method.    Figure 9 shows the evolution of the SSIM and PSNR values for the reconstruction with 15 angles of projection with SNR = 60 dB for the FORBILD abdomen phantom using the SART+BEP+DGT, SART+DGT, and pure SART methods. In this particular experiment, marker 1 in Figure 9a indicates the highest SSIM value, 0.9419, reached by the proposed method and corresponding to the highest PSNR value, 77.7180, indicated by marker 1 in Figure 9b, both obtained in step k = 685. Marker 2 shows, in Figure 9a,b, respectively, the SSIM (0.9327) and PSNR (77.2279) values obtained in step k = 685. Marker 3 in Figure 9b highlights the point at which the SART+DGT method reaches the same PSNR value of the proposed method, in step k = 920, approximately. The image of Figure 10b shows a tendency to eliminate the characteristic lines and bands of the reconstruction process with few scanning angles.
Each of the examples of reconstructed images shown in Figures 6, 8 and 10, and their SSIM and PSNR graphs in Figures 5, 7, and 9, respectively, result from a single execution test pinched from a set of 101 executions. Figure 11 shows box plot graphs for various situations, combining image to be reconstructed, SNR dB value, number of projections, method used, and step k of the execution. As an example, for the reconstruction of Shepp-Logan with 15 projections, method A and SNR = 32 dB using the column (step) k = 600 of the processing matrix 101 × 1000 is highlighted in the graph of Figure 11a. Table 2 shows the details of each element of the box plot graphs in Figure 11, and it is straightforward to note that the standard deviation (Table 2) increases with noise ( Figure 11).

Conclusions and Final Comments
The proposed method, composed by the steps (1) SART reconstruction, (2) BEP adaptive minimization, and (3) TV minimization via DGT, synthesized in Equation (18), presents, in the first steps of the processing, a more pronounced reduction in the noise level of the reconstructed image both for SSIM and PSNR metrics, as can be seen in Section 4. It is important to emphasize that the tests were done with 15 and 30 projections, as shown in Table 1. At some point, the proposed method reaches its maximum PSNR value. It can be observed that at this point (maximum PSNR), the images are reasonably intelligible. From this point forward, the SART+DGT method gives higher values of PSNR and, consequently, a less noisy reconstruction. Even after the apex of the proposed method with regard to the value of PSNR, the value of SSIM remains above in many of the cases studied, when compared to the result of the SART+DGT method. The best values for SSIM generally result in images with better contrast, and this is very important for artifact viewing and contour distinction in the reconstructed image. Structural similarity works considering morphological features in the evaluation of reconstruction results and for this reason presents results more suitable to human standards, when compared with the PSNR metric. However, the main disadvantage of this method is that in a practical application, we cannot know the maximum PSNR since we do not have an original image for comparison. On the other hand, the advantage of the proposed method is that it delivers results earlier in the reconstruction process.
The use of BEP minimization soon after the SART reconstruction, as explained in Section 3, is intended to promote image noise reduction in the reconstruction process, delivering a less noisy image to the later stage (of minimization by TV using DGT). In fact, the noise reduction occurs up to a certain point, and although it is not possible in a practical application to define the ideal stopping point (maximum PSNR), it may be possible to estimate this point based on the type of image and the number of projections used. Acknowledgments: Thanks to PPGEE-Programa de Pós-graduação em Engenharia Elétrica-UFES for the support provided during the preparation and submission of this work.

Conflicts of Interest:
The authors declare no conflict of interest.

Abbreviations
The following abbreviations are used in this manuscript: ALARA As-low-as-reasonably- and Equation (23) can be written in compact form aŝ and in its expanded form aŝ