Bayesian Activity Estimation and Uncertainty Quantification of Spent Nuclear Fuel Using Passive Gamma Emission Tomography

In this paper, we address the problem of activity estimation in passive gamma emission tomography (PGET) of spent nuclear fuel. Two different noise models are considered and compared, namely, the isotropic Gaussian and the Poisson noise models. The problem is formulated within a Bayesian framework as a linear inverse problem and prior distributions are assigned to the unknown model parameters. In particular, a Bernoulli-truncated Gaussian prior model is considered to promote sparse pin configurations. A Markov chain Monte Carlo (MCMC) method, based on a split and augmented Gibbs sampler, is then used to sample the posterior distribution of the unknown parameters. The proposed algorithm is first validated by simulations conducted using synthetic data, generated using the nominal models. We then consider more realistic data simulated using a bespoke simulator, whose forward model is non-linear and not available analytically. In that case, the linear models used are mis-specified and we analyse their robustness for activity estimation. The results demonstrate superior performance of the proposed approach in estimating the pin activities in different assembly patterns, in addition to being able to quantify their uncertainty measures, in comparison with existing methods.


Introduction
In order to deter the proliferation of nuclear weapons, safeguards provide various technical measures that are used for the verification and the declarations made by the signatories to the Treaty on the Non-Proliferation of Nuclear Weapons, regarding their nuclear material and activities [1]. An important task within these safeguards is monitoring of spent fuel assemblies (SFAs) from nuclear power plants (NPPs), for detecting any eventual diversion of spent nuclear fuel for non-declared purposes. The detection of a single fuel pin missing from SFA should be reported. For any safeguards investigation of SFAs, it is important to use a minimum amount of a priori information on the SFA under study, in order to avoid biasing and potentially misleading the investigation. Eventually, IAEA approved the use of the passive gamma emission tomography (PGET) instrument [2][3][4][5][6][7] in inspections.
Passive gamma emission tomography is an imaging modality that can be used for the verification of spent nuclear fuel stored in water pools [4]. Spent nuclear fuel is highly radioactive; hence, the tomography can be conducted in passive mode, without the need of an external X-ray source, as in traditional tomography. The detection unit considered in this work consists of two collimated CdTe detector arrays, on opposite sides of the fuel assembly, each encompassing 91 detectors. The detector arrays rotate and scan the fuel bundle in steps of one degree, which generates a sinogram by detecting gamma rays emitted by the fuel pins. A cross-sectional image of the spent fuel bundle, using image reconstruction algorithms, can then be reconstructed. Based on these images, the fuel pins can be identified and their activity estimated as the sum of pixel values inside each pin region. Image reconstruction techniques with the capability of estimating the fuel pins' activity accurately are crucial in nuclear safeguards, as they allow inspectors to monitor nuclear material and promptly identify its diversion.
Due to the nature of acquisition process (e.g., the discrete nature of the measured photon counts), the observation noise of PGET sinograms is expected to be well-approximated by Poisson noise. However, if the photon counts in a sinogram pixel is high, the Poisson distribution becomes more symmetric, and it can be well-approximated using a Gaussian distribution, whose mean and variance change across pixels. While the sinogram formation process can be approximated by a linear forward model (which simplifies the inversion), this first-order approximation does not take into account the attenuation of the measured radiation of a pin, due to the presence of other pins (or shielding materials) within the assembly. Consequently, the activity levels, estimated by inversion of a linear problem, can present biases and artefacts, depending on the noise model considered. Hence, in this work, we provide a comparison of both the Poisson and isotropic Gaussian noise models and investigate their robustness. Moreover, we also investigate different linear operators corresponding to different levels of prior knowledge about the assembly configuration.
Most of state-of-the-art algorithms considered to solve linear inverse image restoration problems (irrespective of the noise model) are either optimization or simulation-based methods. Optimization-based approaches primarily rely on log-concave Bayesian models, such as [8][9][10][11][12][13][14][15], and have been proposed to perform maximum-a-posteriori (MAP) estimation. For example, PIDAL, which stands for Poisson image deblurring using augmented Lagrangian [8], and SALSA, which stands for split augmented Lagrangian shrinkage algorithm [10], are Poisson and Gaussian image restoration algorithms based on a totalvariation loss ot sparsity-promoting prior, which solves the restoration problem using an alternating direction method of multipliers (ADMM). Moreover SARA, which stands for sparsity averaging reweighted anlysis [11,12], assumes that the ill-posed problem is regularized by the assumption of average signal sparsity over representations in multiple wavelet bases. Although such algorithms are efficient in computing maximum a posteriori (MAP) estimates relatively quickly, they cannot provide uncertainty maps for the estimates, which can be very valuable in applications involving subsequent decision making, such as pin identification and activity estimation in the PGET context. Alternatively, many studies have considered hierarchical Bayesian models to solve the deconvolution and restoration problems akin to those in [16][17][18]. This class of methods assumes that the unknown model parameters are unknown stochastic quantities by assigning them suitable prior distributions, based on prior beliefs. The joint posterior distribution can then be computed using Bayes theorem and exploited using Markov chain Monte Carlo methods. These models offer a flexible and consistent methodology to deal with uncertainty in ill-posed inverse problems. Moreover, additional unknown parameters can be jointly estimated within the algorithm, such as regularization parameters. As such, they represent an attractive way to tackle ill-posed problems, such as the one considered in this work, where an automated hyperparameter setting is challenging. As an example, the authors in [17] proposed a Bayesian approach that samples from a posterior distribution built on a Poissonian likelihood and standard convex and possibly non-smooth regularizers. That study proposed an approach that relied on the split-and-augmented Gibbs sampler (SPA) [18] to tackle a Poisson/Gaussian reconstruction problem by sampling from an approximate joint posterior distribution. In particular, the log-concave property of the prior distributions allows the derivation of log-concave conditional distributions that are decoupled from the likelihood and can be sampled by proximal MCMC methods [19,20]. In this context, the SPA method provides both image estimates, such as MAP and MMSE, and their corresponding posterior uncertainty measures.
The PGET instrument is essentially a single photon emission computed tomography (SPECT) system that allows the reconstruction of axial cross-sections of the emission map of the SFA. Industrial or medical tomographic imaging systems work in a similar manner as PGET systems, in terms of data acquisition. Furthermore, the PGET concept can be applied to perform fast neutron imaging by replacing gamma ray detectors with neutron detectors [21]. However, the PGET of spent fuel poses some unique challenges, due to the high activity of the sources (a single-pin activity is of the order of 10 13 Bq) and the high selfattenuation of the fuel pins. Hence, robust image reconstruction algorithms are crucially required. For instance, in [7], the authors proposed a method to simultaneously reconstruct the activity and attenuation maps by formulating the reconstruction as a constrained minimization problem with a least squares data fidelity term and quadratic regularization terms with different smoothness matrices. However, the forward model proposed in [7] considered only monoenergetic gamma rays and did not account for the scattering effects of gamma rays in the PGET system that make non-neglible contributions to the total counts, which added additional uncertainty to the activity estimation results. Similarly, in [22], the authors formulated the problem of pin activity estimation as an optimization problem where a sparse regularization function is used. Although the advantages of these methods in providing fast estimates, they can provide only point estimates; hence, they can neither quantify the uncertainty of the estimates nor provide probability of presence of the pins. Moreover, associated regularisation parameters need to be manually tuned by the user, which can affect the estimated activity levels. In this work, to improve the quality of the estimated activity profiles and reduce the computational cost of the inversion procedure, we propose to only estimate the activity in a reduced number of pixels, leveraging the prior knowledge that the activity profile is expected to be null outside pin locations. For instance, for a cross-sectional image of the field of view discretized into 182 × 182 pixels, the pins, whose maximum number is known, form groups of 10 to 16 connected pixels, which form a small subset (≈15%) of the 182 2 pixels. The method proposed in this paper assumes that a mask, identifying where pins are expected to be present, is available. This mask is used (i) to account for pin self-attenuation when building the linear approximate forward operator and (ii) as an upper bound for the support of the pins present in the assembly. While the geometry of the assembly is generally known, the exact presence map is often unknown (this is one feature we want to estimate), but several initial guess can be considered, as will be discussed in Section 6. The main contributions of this work can be summarized as follows.

•
We formulate the pin activity estimation problem within a Bayesian framework and assign a Bernoulli truncated-Gaussian (BtG) prior model to the intensity field to be estimated. To the best of our knowledge, no work attempted to sample from such a highly-multimodel joint posterior distribution using an SPA sampler. This allows for estimating the activity of spent-fuel, including the assessment of fuel rod presence/absence. • We compare the performance of two different noise models for pin activity estimation from PGET sinograms simulated while accounting for pin self-attenuation (i.e., not simulated using a linear model).

•
In addition to estimating the activity profile, the proposed algorithms allow the automated estimation of the crucial model hyperparameters, like regularization parameters, which might affect the resulting estimated activity.
The remaining sections of the paper are organized as follows. Section 2 formulates the problem of PGET image restoration, followed by Section 3, which summarizes the likelihoods and the prior distributions assigned to the unknown parameters of the models. The resulting joint posterior distribution and the partially collapsed Gibbs sampler, used to sample that distribution, are discussed in Section 4. Simulations conducted using synthetic data (following a linear models used for inversion) are presented in Section 5 and the analysis of more realistic data is discussed in Section 6. Conclusions and future work are, finally, reported in Section 7. Figure 1 shows examples of different pin configurations (i.e., presence maps) akin to those expected to be studied using a PGET system. We can observe that each crosssectional image consists of pins formed by groups of connected pixels (white), which will emit radiations. Black pixels do not emit measurable radiations. In this work, we assume that we know, a priori, a mask such as those in Figure 1 such that the activity of the black pixels is assumed to be null and thus not estimated. However, the mask can be designed conservatively and select pixels where no pins are present (e.g., the mask in Figure 1a can be selected although the actual assembly configuration is that depicted in Figure 1b). Using this used-defined mask, we defined as x = (x 1 , · · · , x N ) T ∈ R N , the concatenation of the pixel values to be estimated. Figure 1. Examples of binary masks of the simulated fuel assemblies. (a) Case "1", (b) Case "2", and (c) Case "3". In Case "1", the mock-up fuel assembly hosted 331 60 Co pins of 8.879 g/cm 3 density, which emitted 1.17 and 1.33 MeV gamma rays. In Cases "2" and "3", 10% 60 Co pins pins were removed at the center or uniformly.

Problem Formulation
The pin activity estimation problem is formulated as follows. Given a set of measurements (sinogram) y = (y 1 , · · · , y M ) T ∈ R M , we aim at recovering the underlying pin intensities x = (x 1 , · · · , x N ) T ∈ R N , which are related to the observation y by y = G(x) + w, where G is a function representing the system's response, and w represents random observation noise. Due to photon scattering and attenuation effects between pins during the data formation process, the function G is non-linear and difficult to model analytically. However, it can be approximated by a linear model based on the user-defined mask discussed above. Indeed, the presence map can be used to build a linear mapping from x and y whereby attenuation involving neighbouring pins is accounted for (the attenuation is mostly due to the presence of pins, not their activity). Hence, in this work, the function G is approximated by a linear operator A ∈ R M×N , which is built using the simulator recently considered in [22]. Consequently, the forward model considered in this work can be expressed as: where F (·) denotes a probability distribution and ∼ reads "is distributed according to". While the observation noise corrupting PGET sinograms is expected to be Poisson distributed, we will consider two different noise models as the linear approximation can introduce mis-modeling errors. The problem investigated in this paper is the estimation the pin intensities x from the observation vector y. This inverse problem is severely illconditioned because of A and prior regularization is necessary to promote the solution to be in a set of feasible intensities x. To solve this problem, we propose a hierarchical Bayesian model and a sampling method to estimate the unknown model parameter.

Hierarchical Bayesian Model
This section introduces the hierarchical Bayesian model proposed to estimate the unknown parameter x. This model is based on the likelihood function of the observations and on prior distributions assigned to the unknown parameters, i.e., x and potential hyperparameters.

Likelihood
Under the Poisson noise model assumption, each individual observation y m , m ∈ [1, M] corresponds to an independent realization of a Poisson random variable, that is, for where P (·) denotes the Poisson distribution and {Ax} m denotes the mth element of Ax.
The full likelihood reduces to f (y|x) = ∏ M m=1 f (y m |x). The second model considered assumes independent and identically distributed (idd) additive Gaussian noise. In that case, the full likelihood reduces to: where σ 2 is the (unknown) noise variance.

Prior Distributions
As mentioned earlier, x ∈ R N is a concatenation of pixel intensities. If the user-defined mask is perfect, then the entries of x are expected to be positive, as long as a pin present has a measurable activity level. However, if the mask is designed conservatively, a pin which is assumed to be present can have a null activity level in practice, i.e., a fraction of the elements of x are equal to 0. In order to model such belief while ensuring the positivity of the activity levels, a classical choice is the following exponential prior distribution: where β is a regularization parameter controlling the mean value of x and 1 R + (x) is the indicator function defined on the positive orthant of R N . The resulting joint posterior distribution f (x|y, β) (for a fixed β) is relatively easy to sample from using an SPA sampling strategy, as in [17,18], resulting in an algorithm that is referred to as SPA-P-1 (Split and Augmented Gibbs sampler for Poisson noise) and SPA-G-1 (Split and Augmented Gibbs sampler for Gaussian noise), assuming σ 2 is known. Moreover, one can maximize the logarithm of the resulting joint posterior distribution and compute the maximum a posteriori (MAP) estimates as in [8], resulting in the PIDAL-1 algorithm using a Poisson image model and FISTA-1 for Gaussian image model. However, in this work, to model the intensity field vector sparsity, we introduce: where z g ∈ {0, 1} is a binary label for pin g and t g = [t K g ] ∈ R K g is the corresponding amplitude vector for the K g pixels forming pin g, x g ∈ R K g and x = [x 1 , . . . , x G ], where G is the maximum number of pins in the assembly, as defined by the user-defined mask. If z g = 0, the activity of any pixel of the pin g is 0. If z g = 1, then x g = t g . Note that K g and G are known a priori, however, the number of active pins forming a specific assembly (i.e., the value of {z g } g ∈ {0, 1} G ) is not known and must be estimated. The decomposition described above allows one to decouple the location of the sparse components from their values. In a similar fashion to [23], to model the sparsity of x, we utilize a structured spike-and-slab prior, which stems from classical spike-and-slab priors that have been used in Bayesian regression and factor models [24][25][26][27]. Following this, we assume that the intensities in {t g } g are drawn independently from the same truncated zero-mean Gaussian distribution, i.e., where s 2 controls the prior variance of the intensity vector. The pin presence labels are assumed a priori mutually independent and assigned a shared Bernoulli prior distribution, leading to: and The prior models of t and z depend on the hyperparameters s 2 and ω which can be difficult to fix automatically. Thus, we include them within the inference process using hyperpriors. To reflect the lack of prior knowledge about the variance s 2 in Equation (6), a weakly informative inverse-gamma prior is assigned to s 2 : where (η, ν) are fixed to (η, ν) = (10 −3 , 10 −3 ). Similarly, we assign ω a conjugate beta prior distribution: where (α, β) is fixed to (α, β) = (1, 1). Note that we did not observe significant changes in the results when changing the hyperparameters (η, ν) to (10 −3 , 10 −3 ), (10 −2 , 10 −2 ) and (1, 1). Similarly for (α, β) to values (1, 1), (1/9, 1) and (0.1, 0.9), for means to the Be distribution that are equal to 0.5, 0.1 and 0.9 respectively.

Joint Posterior Distribution
Assuming the parameters z and t are a priori mutually independent, the joint posterior of the parameter vector ∆ = {z, t} and hyperparameters Φ = {s 2 , ω} can be expressed as: where The joint posterior distribution reduces to: where Z = diag([z 1 1 K 1 , z 2 1 K 2 , · · · , z g 1 K g ]). The next paragraph describes the sampling strategy adopted to estimate the unknown parameter vector ∆ and the hyperparameters Φ.

Bayesian Inference
To overcome the challenging derivation of Bayesian estimators associated with f (∆, Φ|y), we use an efficient Markov chain Monte Carlo (MCMC) method to generate samples asymptotically distributed according to Equation (12). More precisely, we consider a variablesplitting inspired MCMC algorithm, namely the split-and-augmented Gibbs sampler (SPA, presented in the next paragraph) that was recently proposed in [18]. For SPA, we assume that Υ = {b, c}, where b ≈ Ac and c ≈ Zt. This splitting decouples locally each vector of variables from the rest of the variables, which will in turn make the inference easier. The joint posterior distribution in Equation (11) can be extended as: where 0, s 2 and the new terms are divergence functions such that the joint extended posterior distribution defines a proper probability distribution. The auxiliary variables Λ = {o 1 , o 2 } associated with the splitting variables Υ = {b, c}, which allow to decrease the correlation between the variables by giving an additional degree of freedom to each of the former variables (see [18] for more details). Moreover, ρ, α > 0 are user defined parameters. In Equation (13), it is possible to use a block Gibbs sampler( with proximal MCMC step in the Poisson noise case) to sample according to the conditional distributions of each of the unknown model parameters. In practice, strong correlations appear between z and t. Moreover, as z can be sparse, sampling f (t, s 2 |y, ∆ \t , Φ \(s 2 ) ), where H \u denotes the parameter vector H whose parameter u is omitted, via Gibbs sampling results in very slow convergence. Hence, we use a partially collapsed Gibbs sampler (PCGS) which provides better mixing and convergence properties [28][29][30][31][32][33]. The PCGS used here samples groups of variables (e.g., (z, t)) from their joint posterior distribution rather than from their conditional distributions. Sampling from the joint distribution is achieved by first marginalising some variables which are then sampled from their full conditional distribution [34,35]. Precisely, we sample sequentially the elements of ∆, Υ, Λ and Φ using moves that are summarised in Algorithm 1.
Algorithm 1 Split and augmented-partially collapsed Gibbs sampling algorithm for activity estimation in PGET-version I. 1: Fixed input parameters: Number of burn-in iterations N bi , total number of iterations In Algorithm 1, t z g =0 denotes the elements of t whose corresponding labels in z are null. Similarly, t z g =1 denotes the elements of t whose labels are equal to 1. We introduce the details of sampling each step of the algorithm above in the Appendix A.
The algorithm is stopped after N MC iterations, including the N bi burn-in iterations, which correspond to the transient period of the sampler (determined visually from preliminary runs). The first N bi samples are discarded and the remaining samples are used to approximate the following coupled Bayesian estimators that are particularly suitable for pin activity estimation problems. For the support of the activity vector or "presence maps", which provides the probability of presence of pins, we use the marginal maximum a posteriori (MMAP) estimator [36,37] As we are interested in estimating the pin intensities x g = z g t g = z g [t 1,g · · · , t i,g , · · · , t K g ,g ], the empirical averages of the generated samples (conditional minimum mean squared error "CMMSE" estimates) of each pin pixel x i,g = z g t i,g , conditionally on the estimated supports, is given by: where if z MMAP is given by: The variance of active pin pixels can be computed by: Moreover, the average of pin pixel intensities can be computed as: which can then assigned for the pixels forming each pin. The variance can be computed in same manner as in Equation (17).

Simulations Using Synthetic Datasets
In this section, we assess the performance of the proposed algorithm for cases where the data, given x, are generated according to the model used to perform Bayesian inference. More precisely, we define the linear operator A, compute Ax, and generate sinograms corrupted by additive idd Gaussian noise and by Poisson noise. We then use the corresponding algorithm to infer x. Cases where y is generated using the more realistic simulator will be discussed later in Section 6.

Data Creation
To assess the performance of the proposed approaches, we created a simulated activity profile akin to those expected in actual assemblies. This image is of size (182 × 182) and contains 331 pins spanning a total number of 5329 pixels. The image, depicted in Figure 2c presents three different pin activity levels, and 10% of pins are missing at the center of the assembly. The response matrix of the system A ∈ R 65,520×5329 was constructed by simulating the detector array response to a source of unit activity inside each pixel, using the ground truth presence maps depicted in Figure 2b. Here, the admissible support in Figure 2a contains all the possible pin locations in the assembly. This choice was made to confirm whether the algorithms are prone to underestimating of overestimating the number of pins actually present. The forward model in Equation (1), with the different noise models in Equations (2) and (3), is then used to generate the sinograms with the system response matrix A. Different activity peak values in {220, 160, 100, 40, 10} (accounting for different signal to noise ratios), are tested for the Poisson noise case. For the Gaussian noise case, the activity peak value was set to 220, and the noise variance in the Gaussian case is assumed independent and identically distributed idd different noise variances, belonging to {1, 2, 4, 6, 8} × 10 −3 , were considered. Note that the linear operator in the Poisson noise case is scaled to A = A × 10 3 to keep the same activity profile as of the Gaussian noise case.

Quantitative Analysis
The proposed approach is run on the simulated datasets described above. In all cases, a Markov chain of length N bi = 1.2 × 10 4 and a burn-in period of length N MC = 4 × 10 3 are used. The hyperparameters (ρ, α) of the split-and-augmented Gibbs sampler are set to (ρ, α) = (10, 1) and those of the P-MYULA are set to (λ, γ) = (ρ 2 , ρ 2 /4), as in [17]. The quantitative measure used to assess the quality of estimated activity profiles is the normalized mean square error (NMSE), defined as wherex is an estimate of x. The proposed approach is compared against four of the existing approaches described earlier, namely, SPA-P-1 and PIDAL-1 , for the Poisson noise model, and SPA-G-1 and FISTA-1 for the Gaussian noise model. For both SPA-P-1 and SPA-G-1 , the Markov chain length N MC , burn-in period N bi , the hyperparameters (ρ, α) of the split-and-augmented Gibbs sampler, and those of the P-MYULA (λ, γ) are set as in the proposed approach. For the four methods, several values for the 1 regularization parameter β are tested, from which we pick the one providing the lowest NMSE. Figure 3 shows plots of NMSE versus different activity profile peak values for the Poisson noise case, and different noise variances for a fixed activity profile peak value for the Gaussian noise case. We can observe that SPA-1 provides the highest NMSE, whereas SPA-BtG and FISTA/PIDAL-1 provide close results. This figure shows that, when comparing MMSE estimates, the proposed Bernoulli-based prior enforcing group sparsity is more efficient than the Exponential-based prior, for both noise models. Although MAP estimation based on the Exponential-based prior is appealing, the corresponding algorithms do not allow directly uncertainty quantification.

The Proposed Method
In this section, we present visual results for activity estimation using the proposed approaches. For each case, we plot both the individual pin pixel activities and the average activity of pixels spanning each pin. Figures 4 and 5 show examples of results for pin activity estimation and uncertainty quantification for the Gaussian and Poisson noise models, respectively. The peak value was set to 220 in both, and the noise variance of the Gaussian noise case was set to σ 2 = 2 × 10 −3 . We can observe that the algorithm could successfully reconstruct the activity profile with both noise models. Moreover, it can be seen that the differences in pin activity levels are clearer when considering the average of pin pixel intensities as an activity estimate, and the uncertainty is much lower, in comparison with considering all pixel intensities. This is because averaging naturally contributes to reducing the noise in the activity estimates, equivalently enhancing the local signal to noise ratio. Consequently, the ratio of marginal standard deviation divided by the posterior mean shows clearer activity level differences. Hence, for the rest of the experiments in this paper, we will consider pin pixel averages as activity estimates. Figure 4. Results of the proposed approach using the Gaussian noise model, when the peak value is set to 220 and noise variance is set to σ 2 = 2 × 10 −3 . Column 1: posterior mean, Column 2: marginal standard deviation and Column 3: ratio of marginal standard deviation divided by posterior mean. Row 1: All pixels, and Row 2: pin pixels averaging.  Figure 6 depicts examples of estimated probabilities of pin presence in both the Poisson noise case, using different activity peak values, and the Gaussian noise case for two levels of signal quality. We can observe that for both the Gaussian and Poisson noise cases, when the peak value is reasonably high, the algorithm could identify with high confidence levels the 10% missing pins near the center of the assembly. However, when the peak value in the Poisson noise case decreases, and the noise variance in the Gaussian noise case increases, it becomes more difficult to identify the assembly configuration. In Figure 6a for instance, the algorithm tends to overestimate the number of pins present while pins in the outer ring of the assembly seem more difficult to identify with high confidence. While the comparison between Poisson and iid Gaussian noise is difficult as in the Poisson case the signal-to-noise ratio is pixel-dependent, we observed that for similar average SNR, the activity estimation is more challenging in the Poisson case, probably because of the non-stationary SNR. Figure 6. Probability of pin presence for (a) the Poisson noise case when peak value is set to 10, (b) the Gaussian noise case for peak value of 220 and noise variance σ 2 = 1 × 10 −2 , and (c) the Poisson noise case when peak value is set to 220, and (d) the Gaussian noise case when the peak value is set to 220 and the noise variance is set to σ 2 = 2 × 10 −3 .

Comparison with Existing Methods
For completeness, Figure 7 shows examples of results with the SPA-P-1 approach for pin activity estimation using the Poisson noise model. The results for the Gaussian noise model (using SPA-G-1 ) present similar trends and hence are not included here. For all pin pixels and pin pixels averaging results, we can observe that the posterior means look similar to those of the proposed approach. However, from the ratio of marginal standard deviation divided by posterior mean, the SPA-P-1 approach is not confident about the 10% missing pins in the center. On the other hand, Figure 8 shows the results of the PIDAL-1 approach for pin activity estimation using Poisson noise model. Those of the Gaussian noise model using FISTA-1 are similar and hence are not presented here. We can observe that the maximum a posteriori estimate is similar to the MMSE using both the proposed approach and SPA-P-1 . However, these MAP-based methods cannot provide uncertainty measures about the estimates. Figure 7. Pin activity estimation using SPA-P-1 , using the Poisson noise model when peak value is set to 220. The results using the Gaussian noise model (SPA-G-1 ) using peak value = 220 and noise variance σ 2 = 2 × 10 −3 present similar trends and hence are not presented here.

Simulations Using Realistic Datasets
In this section, we simulated the measurement of five fuel pin configurations in a mockup water-water energetic reactor (WWER) fuel assembly using the software MCNP 6.2 [38], as shown in Figure 9. In this figure, we show the ground truth assembly configuration of each case. As in Case "1", the mock-up fuel assembly hosted 331 60 Co pins, which emitted 1.17 and 1.33 MeV gamma rays. We simulated 60 Co pins because they are used for the testing of the PGET system. In Cases "2" and "3", we mimicked the scenario where fuel pins were missing by removing 10% fuel pins in the assembly. Each pin in the three cases is formed by a number of pixels belonging to {10-16}. In Cases "4" and "5", 10% fuel pins were replaced by depleted uranium pins of the same size as the 60 Co ones. These pins are not emitting measurable gamma rays. In the five cases, the activity in each white pixel is constant. In contrast to Section 5 where we used a linear operator A constructed using physical model of collimator-detector system response and simulated data using the linear model in Equation (1), here the data are generated using the MCNP 6.2 software that simulates all photon interactions in the PGET system [38]. More details on the simulation set up can be found in [22]. The simulated detection unit is a high-fidelity model of the PGET and consists of two collimated CdTe detector arrays on the top and bottom sides of the fuel assembly. The detector arrays rotate and scan the fuel bundle generating a 182 × 360 sinogram, as shown in Figure 10. Due to the mismatch between the observations and Ax, the robustness of the linear model is assessed using the two noise models mentioned earlier; the idd Gaussian and the Poisson noise models. Figure 9. Actual binary masks of the simulated fuel assemblies. (a) Case "1", (b) Case "2", (c) Case "3", (d) Case "4", (e) Case "5". In Case "1", the mock-up fuel assembly hosted 331 60 Co pins of 8.879 g/cm 3 density, which emitted 1.17 and 1.33 MeV gamma rays. In Cases "2" and "3", 10% 60 Co pins pins were removed at the center or uniformly. In Cases "4" and "5", 10% 60 Co pins pins were replaced by depleted uranium pins of the same size but higher density (10.4 g/cm 3 ) and no activity. Different response matrices, namely the IDEAL, the FULL and the EMPIRICAL, were tested to estimate the pin activity. To calculate a response matrix A, pin locations are needed as input parameters to identify the pixels inside the pins and account for scattering and attenuation effects. To construct the IDEAL response matrices, the exact pin positions in each case (ground truth) are used. With such response matrices, we run the proposed algorithms to estimate the pin activity levels and to confirm whether the methods tend to underestimate the number of pins. For Case "1", where no pins are missing, A ∈ R 65,520×5329 . However, for the rest of the cases (Cases "2" to "5") where 10% of the pins are missing, the response matrix of the system is of size A ∈ R 65,520×4732 . In practice, these IDEAL response matrices are not available as we aim precisely at identify which pins are present. A more realistic scenario consists of assuming that all the pins are a priori present (although they may not). This assumption is accurate in Case 1, but it expected to introduce errors for the other cases, in particular if a large number of pins is missing. The resulting matrix is of size 65, 520 × 5329 and it coincides with the IDEAL matrix of Case 1. The last strategy considered here to build A is referred to as EMPIRICAL and consists of identifying, as a pre-processing step, the likely pin support using a fast inversion algorithm (filtered backprojection (FBP), as in [22]). FBP provides a coarse activity map which is then thresholded to identify potential pin locations. This approach is more evolved than using the FULL A and can discard correctly potential pin locations, but it can also remove legit pin locations if the threshold is set incorrectly. Using the empirical matrices A, a matrix is created for each of the five cases. In this section, we test the performance of the proposed approach using the IDEAL response matrices. In this context, the linear model in Equation (1) is expected to be a good approximation of the actual forward model. In all cases, the hyperparameters of the model are set like those described in Section 5. Figures 11-13 show the results of activity estimation and the associated uncertainty measures, for Case 1 (full assembly), Cases 2 and 3 (missing pins) and Cases 4 and 5 (replaced pins), respectively. In these figures, we can observe that the Gaussian noise model generally provides more uniform pin activities compared to when using the Poisson noise model. Moreover, the activity profiles using the Gaussian noise model are lower, compared to the Poisson noise model, as it fitted the data better and hence improved the convergence and mixing properties of the MCMC chains. Figure 14 shows the posterior probabilities of presence of pins in the five investigated assembly patterns, using the Poisson noise model. We can observe that existing pins have a probability of presence of almost 1, thus the proposed approach is very confident in assessing the presence of pins. Similar results have been obtained using the Gaussian model and are thus not presented here. Both noise models allow correct pin identification for the five assembly patterns tested, as no additional pins are identified as missing.

Comparison with Existing Methods
In this part, we compare the proposed approach against the four existing methods described earlier that are SPA-G-1 and FISTA-1 for the Gaussian noise case, and SPA-P-1 and PIDAL-1 for the Poisson noise case. Figure 15 shows the activity estimation results of the five assembly patterns using these methods. We can observe that the results are similar to those obtained by the proposed approach. Moreover, all of the methods correctly identify the assembly pattern of all datasets, as in addition to the pins already identified as missing, no extra pins are. It is also worth mentioning here that the convergence of PIDAL-1 , which considers a Poisson likelihood, was much slower than FISTA-1 which considers a Gaussian likelihood. On the other hand, in Table 1, we show the number of particles (NPS) for the proposed approaches and the four existing methods. NPS is the mean of active pin pixels in each case. For each case, we estimated the activity of each pin by summing up pin pixel intensities and created a histogram of pin activities. The mean of this histogram is then computed, which represents the average fuel activity of each case. We can observe that, in general, the methods considering a Gaussian likelihood (FISTA-1 , SPA-G-BtG and SPA-G-1 ) provide fairly close results to the ground truth NPS.

Comparison with Existing Methods
In this part, we compare the proposed approach against the four existing methods described earlier that are SPA-G-1 and FISTA-1 for the Gaussian noise case, and SPA-P-1 and PIDAL-1 for the Poisson noise case. Figure 15 shows the activity estimation results of the five assembly patterns using these methods. We can observe that the results are similar to those obtained by the proposed approach. Moreover, all of the methods correctly identify the assembly pattern of all datasets, as in addition to the pins already identified as missing, no extra pins are. It is also worth mentioning here that the convergence of PIDAL-1 , which considers a Poisson likelihood, was much slower than FISTA-1 which considers a Gaussian likelihood. On the other hand, in Table 1, we show the number of particles (NPS) for the proposed approaches and the four existing methods. NPS is the mean of active pin pixels in each case. For each case, we estimated the activity of each pin by summing up pin pixel intensities and created a histogram of pin activities. The mean of this histogram is then computed, which represents the average fuel activity of each case. We can observe that, in general, the methods considering a Gaussian likelihood (FISTA-1 , SPA-G-BtG and SPA-G-1 ) provide fairly close results to the ground truth NPS. Figure 15. The IDEAL response matrix: Activity estimation using pin pixels averaging using existing methods. The regularization parameter β was set to 1 × 10 −5 in the four methods. Table 1. The IDEAL response matrix: Comparison of activity estimations (×10 7 ), represented by number of particle (NPS), for the proposed approaches and the four existing methods. NPS is the mean of active pin pixels in each case. Estimated activities close to the ground truth are highlighted in bold. Second close values are underlined.

SPA-P-
SPA-G-SPA-P-SPA-G-PIDAL-FISTA-Ground BtG BtG  Table 2 Table 2. The IDEAL response matrices deployed in the previous subsection are not available, as we aim precisely at identify which pins are present. In this subsection, we investigate a more realistic scenario, consisting of the assumption that all the pins are a priori present (although, they may not be). This assumption is accurate in Case 1, but it expected to introduce errors for the other cases, as we will see below. The resulting matrix is of size A ∈ R 65,520×5329 . The hyperparameters of the proposed methods are set to same values as in the experiments in the Section 6.1. Figures 16-18 show the results activity estimation results obtained by our methods for the five cases. We can see that the algorithms could fairly accurately identify the assembly pattern of cases "1", "2" and "3" but not for cases "4" and "5". This can also be observed in the uncertainty maps of these cases. Moreover, the Gaussian noise model seems more robust in missing pins identification than using the Poisson noise model. On the other hand, Figure 19 shows the probability of presence of pins in each assembly. We can observe that the assembly patterns of cases "1", "2", and "3" are correctly identified, but not for cases "4" and "5". In cases "4" and "5", the cobalt pins are replaced by depleted UO2 and the Compton scattering cross-section of UO2 is larger than cobalt's. UO2 pins are simulated as non-radioactive, as their radioactivity is negligible and UO2 could be used as a high-density fuel replacement material in diversion scenarios. However, UO2 pins are scattering centers of the gamma rays emitted by the surrounding fuel pins and the use of the FULL response matrix induces the algorithm to mis-classify UO2 pins as low-activity cobalt pins. One strategy to mitigate this effect could rely on narrowing the energy detection window to exclude the low-energy contribution from scattering reactions.

Comparison with Existing Methods
As in the previous section, the proposed approach is compared against the four approaches described earlier. Figure 20 shows the activity estimation results of the five assembly configurations. We can observe that for the missing center case, there are still some low amplitude identified pins at the center. Moreover, the proposed approach provides better results, in addition to being able to quantify the uncertainty measures of the estimates and also localize the pins. Figure 20. The FULL response matrix: activity estimation using pin pixels averaging using existing methods for the five cases. The regularization parameter β was set to 1 × 10 −5 in the four methods.

The EMPIRICAL Response Matrix
As outlined in Section 6.2, the proposed approach, and existing methods, find it difficult to provide reliable estimates for cases "4" and "5", which is probably because of the mismatch between the linear operator and the observations. Hence, the last strategy considered here is to build A by identifying as a pre-processing step, the likely pin support using a fast inversion algorithm (filtered backprojection (FBP), as in [22]). FBP provides a coarse activity map which is then thresholded to identify potential pin locations. This approach is more evolved than using the FULL A and can discard correctly potential pin locations. Using the empirical matrices A, a matrix is created for each of the last two cases (Case 4 and 5), which is of size A ∈ R 65,520×4899 . Figures 21 and 22 show pin activity estimation and presence maps using both the Poisson and Gaussian noise models. We can observe that the assembly patterns of both cases are correctly identified, although in the Case "4", there are still some missing pixels identified as existing but with low probability of presence, as in Figure 22. On the other hand, Figure 23 shows a comparison with the existing approaches, which are similar to those obtained using the proposed approach but without associated uncertainty measures and probability of presence.   The EMPIRICAL response matrix: activity estimation using pin pixels averaging using existing methods, for Cases "4" and "5". The regularization parameter β was set to 1 × 10 −5 in the four methods.

Conclusions and Future Work
This paper introduced a hierarchical Bayesian model for pin activity estimation in PGET images, using a linear forward model. Due to the linear approximation, the classical Poisson noise assumption is likely to not hold and we assessed the robustness of the model tested using two different noise models. Prior distributions were assigned to the unknown model parameters and Bayesian inference was performed using a split and augmented-partially collapsed Gibbs sampler. The proposed approaches can provide uncertainty measures to the estimates and are fully automatic in the sense that they can estimate the model associated hyperparameters. Different response matrices in the linear approximation were also investigated. We observed that the Gaussian noise model provided better results, as it seemed more robust to model mismatch. The proposed approaches can provide uncertainty measures and posterior probabilities of presence, which cannot be obtained using classical optimisation methods. However, this comes at a higher computation cost. Future work should investigate alternative approaches to model the forward model.

Data Availability Statement:
The data that support the findings of this study are available from the corresponding authors, A.K.E. and Y.A., upon reasonable request.

Conflicts of Interest:
The authors declare no conflict of interest.
On the other hand, sampling f t z g =0 |y, Φ, ∆ \t zg =0 , Υ, Λ reduces to sampling its elements independently from their Gaussian prior distribution defined in Equation (6).
Once z is sampled from (A8), sampling t| y, ∆ \t , Φ, Υ, Λ can be achieved by sampling from the following multivariate Gaussian distribution: 2 ) 2ρ 2 t| y, ∆ \t , Φ, Υ, Λ ∼ N R + (t; µ t , Σ t ), where: Note that Z T Z = Z as Z is a diagonal matrix. Hence, Σ t is a diagonal covariance matrix, which simplifies the sampling process.
Sampling directly from (A17) can be computationally intensive since, e.g., it requires inversion of the precision matrix, sampling c| ∆, Φ, Υ \c , Λ can be efficiently achieved by the exact perturbation-optimization (E-PO) algorithm proposed in [42].
Sampling o 1 | ∆, Φ, Υ, Λ \o 1 : It can be seen that the full conditional distribution of o 1 can be written as: which reduces to sampling from the following multivariate Gaussian distribution: where: In a similar fashion to sampling o 1 , sampling o 2 reduces to sampling from a multivariate Gaussian distribution with the following mean and covariance: Algorithm A1 is a detailed version of Algorithm 1 following the sampling steps explained above. As t z g =0 generated in step (a) is not used during the following steps, it can thus be omitted.