Next Article in Journal
Poisson Twister Generator by Cumulative Frequency Technology
Previous Article in Journal
Convolution Accelerator Designs Using Fast Algorithms
 
 
Font Type:
Arial Georgia Verdana
Font Size:
Aa Aa Aa
Line Spacing:
Column Width:
Background:
Article

Surrogate-Based Robust Design for a Non-Smooth Radiation Source Detection Problem

1
Global Validation Model Department, Spire Global, Inc., Boulder, CO 80301, USA
2
Department of Computer Science, Virginia Tech, Blacksburg, VA 24060-0902, USA
3
Department of Nuclear Engineering, North Carolina State University, Raleigh, NC 27695, USA
4
Department of Mathematics, North Carolina State University, Raleigh, NC 27695, USA
*
Author to whom correspondence should be addressed.
Algorithms 2019, 12(6), 113; https://doi.org/10.3390/a12060113
Submission received: 17 April 2019 / Revised: 20 May 2019 / Accepted: 21 May 2019 / Published: 28 May 2019

Abstract

:
In this paper, we develop and numerically illustrate a robust sensor network design to optimally detect a radiation source in an urban environment. This problem exhibits several challenges: penalty functionals are non-smooth due to the presence of buildings, radiation transport models are often computationally expensive, sensor locations are not limited to a discrete number of points, and source intensity and location responses, based on a fixed number of sensors, are not unique. We consider a radiation source located in a prototypical 250 m × 180 m urban setting. To address the non-smooth properties of the model and computationally expensive simulation codes, we employ a verified surrogate model based on radial basis functions. Using this surrogate, we formulate and solve a robust design problem that is optimal in an average sense for detecting source location and intensity with minimized uncertainty.

1. Introduction

The problem of determining the location and intensity of a radiation source arises in several settings including emergency response to mitigate nuclear threats, structural and nuclear health monitoring in nuclear reactors, and environmental cleanup of biomedical and industrial nuclear waste. In this paper, we consider the development of a robust sensor network design for determining the location and intensity of a radiation source in a simulated urban environment. Specifically, we consider source localization in a simulated 250 m × 180 m block in downtown Washington, DC.
There are several difficulties that are intrinsic to this source localization problem. The first is that inverse problems of this nature are inherently ill-posed and require some form of regularization to obtain reasonable approximate solutions [1]. This difficulty is exacerbated by the fact that sensor observations are often coarsely spaced, which dictates that one cannot estimate source attributes that are more oscillatory than the grid spacing. As detailed in [2], this can yield erroneous results if ignored.
The computational complexity of deterministic [3] and stochastic, Monte Carlo [4] radiation transport models poses a second challenge since it limits the number of model realizations that can be obtained for optimization, or Bayesian or frequentist inference. This has led to the development of alternative parameterizations or surrogate models. For example, in [5] the authors modeled the radiation source as a point gamma source and employed a physics-based parameterization of gamma particle transport. A fast radiation transport model is also available as a component of Synth, a gamma-ray simulation code written by Pacific Northwest National Laboratory [6,7]. In [8], the authors employ a Gaussian mixture to model the radiation field.
In [9], we addressed challenges associated with optimization and Bayesian inference as a prelude to the robust sensor design problem considered in this paper. Specifically, we implemented a fast piecewise-continuously differentiable radiation transport model and solved the associated inverse problem using combined global [10,11] and local [12] optimization algorithms, and Bayesian inference techniques [13,14]. As in [9], we assume here that the threat is a point source and that the model accounts only for photons that travel directly from source to detector, with no intervening collisions. The radiation source is parameterized with three components: its 2-D location coordinates and intensity.
To improve computational efficiency and permit gradient-based optimization and the implementation of a robust design algorithm [15], we implement and verify a continuously differentiable surrogate model based on radial basis functions to approximate the response for all possible detector locations. We employ this surrogate for subsequent optimization and robust design.
There are three different main strategies for taking measurements for general applications. The first searches for a given number of stationary sensors, the second one relies on moving sensors, whereas the third method, entitled scanning, activates only a subset of a given number of stationary sensors at a given moment in time. The existing methods for identification of sensor locations usually employ random fields analysis [16], information theory [17] and optimum experimental design theory [15]. Moreover, sensor placement algorithms can be classified into discrete and continuous depending on the nature of the search space. In the context of nuclear source identification, Michaud [18] used the Gaussian process optimization [19] to solve a continuous detector placement problem and Schmidt [20] applied Shannon entropy [21,22,23] to guide mobile sensors over a discrete grid of possible measurement sites.
To form a basis for comparing different networks, a quantitative measure of efficiency is required. In this study, we explore criteria applied in optimum experimental design [15] to solve a discrete stationary detector placement problem. These criteria are defined in terms of the Fisher information matrix associated with the unknown characteristics of the source. One of the main difficulties associated with optimization of sensor locations is the dependence of the optimal solutions on unknown true values of the source characteristics or prior approximations. To remove this dependency, we employ a robust design strategy based on maximizing the expectation of the corresponding local optimality criterion over the source characteristics domain. We then transform the resulting stochastic optimization problem into a combinatorial optimization problem by generating a finite set of possible detectors locations. We solve the combinatorial optimization problem and test the obtained optimal network of sensors against randomly selected networks. The surrogate model implementation allows solving the combinatorial optimization problem otherwise being computationally infeasible.
The remainder of the paper is organized as follows. In Section 2, we discuss the radiation transport model and radial basis function surrogate along with associated statistical models. We also describe the domain geometry. The inverse problem based on the surrogate models is formulated in Section 3. In Section 4, we present the theoretical framework of the robust design in the average sense employed in this paper. In Section 5, we present the numerical solution of the robust design problem and compare the optimal network performance to randomly selected networks. We draw conclusions in Section 6. We summarize in the Appendix A the Particle Swarm algorithm used to solve the inverse problem.

2. Radiation Transport Model and Surrogate Formulations

Gamma transport phenomena, as derived from Boltzmann transport theory, can be modeled by the partial differential equation (PDE)
Ω ^ · I ( r , E , Ω ^ ) + Σ t ( r , E ) I ( r , E , Ω ^ ) = S ( r , E ) + 0 d E 4 π d Ω ^ Σ s ( r , E E , Ω ^ Ω ^ ) I ( r , E , Ω ^ ) .
Here I and S respectively denote the gamma intensity per unit area and external gamma source in the medium characterized by the position vector r , energy E, and unit vector in the direction of the gamma Ω ^ . The parameters include the total macroscopic cross-section for gamma interactions Σ t , and the double-differential macroscopic scattering cross-section Σ s , which depends on the change in gamma energy from incident energy E to emergent energy E (i.e., E E ) and the change in gamma direction from incident direction Ω to Ω (i.e., Ω Ω ). We refer readers to Shultis and Faw [3] for a more detailed treatment of transport theory.
The problem of inferring the radiation source location and intensity from sensor measurements requires the evaluation of the Boltzmann radiation transport model (1) at various points in the admissible parameter space. Numerically solving the PDE (1) is computationally expensive even on HPC systems. Solving an inverse problem constrained by Equation (1) or forward propagating uncertainties using Monte Carlo simulations are not computationally feasible since require solving Equation (1) for many times.
Instead, we employ a model that only considers gamma rays that travel directly from source to detectors, without taking into account photons that incur collisions. This approach relies on the assumption that photons undergoing interactions in the medium have a very small probability of ever arriving at a detector. We also assume that the physical scale of our problem is sufficiently large so that both the source and detectors can be localized to points inside the domain. We will denote the location of the source as r s and associated intensity by S 0 . S 0 can be treated as time-independent. Most radionuclides of interest for source search have half-lives on the order of several years to tens-of-thousands of years. Consequently, radioactive decay of the source is insignificant during the measurement. Under these assumptions, Equation (1) can be simplified to
Ω ^ · I ( r , E , Ω ) + Σ t ( r , E , Ω ^ ) I ( r , E , Ω ^ ) = S 0 4 π δ ( E E 0 ) δ ( r r s 2 ) ,
where Ω ^ is a unit vector pointing in the traveling direction of the gamma rays and E 0 is the source emission energy and delta denotes the Dirac delta function; see [3] for more details. Equation (2) can be solved to determine the intensity of photons arriving at any point r inside domain. This enables the computation of the count rate measured by the i-th detector D i assuming that detectors are point detectors with face area A i and dwell time Δ t i . The detector intrinsic efficiency ϵ i [ 0 , 1 ] is usually known in practice.
If the i t h detector is located at point r d i , the solution
F ^ i : X R , F ^ i ( θ ) = S 0 Δ t i · ϵ i · A i 4 π r d i r s 2 2 · exp r d i r s Σ T d s
of Equation (2) predicts the number of counts observed by the sensor given the location and intensity θ = ( r s , S 0 ) of the source. Here we denoted by X the space of all possible sources and R is the one-dimensional real coordinate space. The derivation of model response (3) follows in a manner similar to that shown in Shultis and Faw [3], (Chapter 10.1.3), where the resulting solution is evaluated at the detector location r d i .

2.1. Model Geometry

To provide an example of an urban area, we selected a 250 m × 180 m block in downtown Washington, D.C., located at approximately 38 54 ′48″ N by 77 1 ′60″ W (Johnson Avenue NW) to serve as our domain. Buildings in this area are primarily brick and concrete residential housing and are generally 1–5 stories in height. Using data from the OpenStreetMaps database (https://www.openstreetmap.org/), we constructed a 2-D representation of the area to serve as the test geometry. Our implementation treats the buildings as a set of disjoint polygons P j , j = 1 , 2 , , N g , each of which is assigned a corresponding macroscopic cross-section Σ t . A satellite photo of the area with an overlay of the constructed representation is provided in Figure 1.
Approximate calculations indicate that wood and concrete buildings correspond to an optical thickness of around 3 mean free paths (MFPs), where the mean free path denotes the mean distance traveled by the photons between collisions with atoms of the building. Consequently, we randomly selected cross-sections for each building so that their optical thickness is between 1 and 5 MFPs. The random sampling was also weighted according to the volume of each building, so that smaller buildings were biased towards smaller optical thicknesses and vice versa. The regions between buildings were treated as dry air at standard temperature and pressure, with cross-sections taken from the NIST XCOM database (http://www.nist.gov/pml/data/xcom/).
For this geometry, the admissible parameter space is
X = [ 0 , 250 ] × [ 0 , 180 ] × [ 5 · 10 8 , 5 · 10 10 ] .
The first two dimensions define the spatial location representing the simulated 250 × 180 m urban block. The third dimension restricts the source intensity to vary between 5 × 10 8 and 5 × 10 10 Bq.

2.2. Numerical Model for Detector Response

To determine the intensity of photons arriving at a given detector location r d i , the algorithm employs a simple ray-tracing scheme. Starting at the location of the source r s , we draw a ray from r s to r d i . We then compute the intersection of this ray with the disjoint polygons P j , j = 1 , 2 , , N g , representing the set of buildings in our domain. This yields a series of line segments expressing the path traversed in each region. We assume that a given ray intersects N polygons, N < N g , and let L = { ( j , Σ T ( j ) ) } j = 1 N be the set of all intersecting segments, where j is the Euclidean length of the j-th segment and Σ T ( j ) is the corresponding value for the macroscopic total cross-section. With this assumption, Equation (3) takes the form
F ^ i ( θ ) = S 0 Δ t i · ϵ i · A i 4 π r d i r s 2 2 exp j = 1 N j · Σ T ( j ) .
Equation (5) provides an analytic expression estimating the expected detector response, and its computation primarily requires the intersection of lines with the model geometry. Equation (5) represents a significant simplification to the solution of (1), a nonlinear PDE with seven independent variables whose solution in complex geometries can require many hours even on a supercomputer. We implemented the numerical model (5) in a short Python code. It employs the Shapely library (https://pypi.python.org/pypi/Shapely) for performing the computational geometry calculations. The model takes as input a specification of polygons representing the different regions of the domain, cross-section data, detector locations, source intensity, and source location.

2.3. Statistical Model

To construct statistical models associated with N detectors, we consider a background with constant expected intensity B. We denote by θ 0 , the true source location and intensity of a radiation source. It is well known that radioactive decay and detection are Poisson random processes. By including Poisson random effects and assuming that N detectors are available, we obtain the statistical model
Y i P F ^ i ( θ 0 ) + B ,
associated with the i t h detector response, i = 1 , , N . The Poisson distribution with mean
F i : X R , F i ( θ 0 ) = F ^ ( θ 0 ) + B
is denoted by P . For large numbers (>30) of observed photons, the Poisson distribution is adequately approximated by a normal distribution, yielding the approximate statistical model
Y i N ( F i ( θ 0 ) , ( σ i o ) 2 ) ,
where ( σ i o ) 2 = F i ( θ 0 ) ; i.e., with variance equal to the mean. This is equivalent to
Y i = F i ( θ 0 ) + ε i o , ε i 0 N ( 0 , ( σ i o ) 2 ) .
In this manner, we model the observations associated with each detector as random variables Y i , i = 1 , , N .

2.4. Radial Basis Function Surrogate Model

Due to the presence of the buildings, the model response (7) is non-differentiable with respect to both position and intensity. To apply sensitivity analysis to determine an optimal sensor configuration, smoothness of the model responses must be assured. To address these issues and reduce computational times, we used radial basis functions to provide continuously differentiable approximations of the model responses (7).
Radial basis function methods provide interpolants to sampled values associated with irregularly positioned points inside the input domain. A radial basis function approximation of the model response F i ( θ ) has the formulation
F ˜ i : X R , F ˜ i ( θ ) = k = 1 L λ k ψ ( ε θ θ k 2 ) ,
where θ denotes a source in the domain X , ψ : R R is a radial basis function and ε is a shape parameter. Possible choices of radial basis functions ψ include multiquadrics and their inverse formulations, Gaussian functions, and thin plate splines. A more comprehensive list can be found in [14,24]. We employ Gaussian radial basis functions. The coefficients λ k are computed by requiring that F ˜ i ( θ k ) = F i ( θ k ) , k = 1 , , L , where θ k are selected to cover the entire domain X and L is the number of interpolation points. We employed the MATLAB radial basis function toolbox based on Cholesky factorization and Tikhonov regularization. We also tried other methods to approximate the model response based on Legendre and Lagrange polynomials and Gaussian process. Our results (not shown here) revealed that the radial basis functions approximation had the best accuracy for our application.

2.5. Surrogate Statistical Model

The analysis of interpolation error relies on smoothness properties of the map being approximated. In our case, such properties are not directly available nor are error bounds. Instead, we assume that the response surrogate models errors associated with the true source can be modeled as normal random variables ε i m N ( 0 , ( σ i m ) 2 ) , yielding the statistical model
F i ( θ 0 ) = F ˜ i ( θ 0 ) + ε i m , i = 1 , , N ,
with σ i m being the standard deviation of the response surrogate models errors.
A statistical model incorporating both model errors ε i m and observation errors ε i o introduced in (9), is
Y i = F ˜ i ( θ 0 ) + ε i o + ε i m , i = 1 , , N .
Assuming the independence of model errors and observation errors, and exploiting the fact that the sum of independent normal random variables is also a normal random variable, (12) can be expressed as
Y i = F ˜ i ( θ 0 ) + ε ˜ i , Y i N ( F ˜ i ( θ 0 ) , σ i 2 ) i = 1 , , N ,
where ε ˜ i N ( 0 , σ i 2 ) , and σ i 2 = ( σ i o ) 2 + ( σ i m ) 2 .

3. Detection of Nuclear Radiation Sources Using Surrogate Model

The problem of estimating the location and intensity of a radiation source when several detectors with associated measurements are available represents a classic inverse problem. The source θ 0 is unknown and has to be inferred from realizations υ i of the random variables Y i , i = 1 , N whose statistical model is described in (13).
The Gaussian likelihood function π : X [ 0 , ) is given by
π ( V | θ ) = 1 ( 2 π ) N i = 1 N σ i exp i = 1 N 1 σ i 2 [ υ i F ˜ i ( θ ) ] 2 ,
where V = [ υ 1 , , υ N ] is the vector of all the available observations.
A standard technique to estimate the location and intensity of a radiation source, based on measured data, is to apply maximum likelihood estimators. Due to the monotonicity of the logarithm function, maximizing (14) is equivalent to minimizing the negative logarithm of the likelihood
min θ X J ( θ ) , J ( θ ) = 1 2 i = 1 N 1 σ i 2 [ υ i F ˜ i ( θ ) ] 2 .
We omit the constant in front of the exponential term in Equation (14) since it does not affect the solution. The objective of this investigation is to solve the optimization problem (15) with minimal uncertainty. The solution to this problem (15) is a maximum likelihood estimator which is a random variable. In this context, uncertainty represents the covariance of this estimator which is shown to be the inverse of the Fisher information matrix (23). To achieve this, we apply the robust design strategy described in next section.

4. Robust Design in the Average Sense

Here we present a robust design strategy in the average sense. The solution of the robust design problem is a network of sensors that minimizes the uncertainty in the solution of the inverse problem (15).

4.1. Solution of the Inverse Problem

The solution of a nonlinear inverse problem is typically more difficult than in the linear case since one regularly does not have analytic solutions. Under the assumption that the model (10) behaves linearly with respect to variables θ, one can derive an analytic solution for the weighted least-square estimate as described in [15,25]. For nonlinear model responses, it is customary to linearize the system response
F ˜ i ( θ ) = F ˜ i ( θ 0 ) + F ˜ i θ ( θ 0 ) T ( θ θ 0 ) + O ( θ θ 0 2 2 ) , i = 1 , , N
about a prior estimate θ 0 , where · 2 is the Euclidean norm.
By neglecting the higher order terms, the statistical model (13) can be rewritten as
Y ˜ i = F ˜ i θ ( θ 0 ) T θ 0 + ε ˜ i , i = 1 , , N .
Here a realization υ ˜ i of Y ˜ i can be expressed as υ ˜ i = υ i F ˜ i ( θ 0 ) + F ˜ i θ ( θ 0 ) T θ 0 , where υ i is sampled from the distribution defined in (13). Consequently, the problem (15) can be reformulated as
θ ^ = arg min θ J ( θ ) , J ( θ ) = 1 2 i = 1 N 1 σ i 2 υ ˜ i F ˜ i θ ( θ 0 ) T θ 2 .
By differentiating J with respect to θ, we obtain
J ( θ ^ ) = i = 1 N 1 σ i 2 F ˜ i θ ( θ 0 ) υ ˜ i F ˜ i θ ( θ ^ 0 ) T θ .
By imposing J ( θ ^ ) = 0 , the estimate
θ ^ = M 1 i = 1 N 1 σ i 2 F ˜ i θ ( θ 0 ) υ ˜ i ,
is unique only if the Fisher information matrix
M = i = 1 N 1 σ i 2 F ˜ i θ ( θ 0 ) F ˜ i θ ( θ 0 ) T
is nonsingular.
A formula for the estimator θ of the problem (15) is obtained based on the linearization of the surrogate model response (10) and by assuming that the high-order terms in the expansion (16) are negligible. In the next subsection, we highlight the role of Fisher information matrix in defining the D-optimality criterion widely used in the optimum experimental design problems as a quantitative measure of the ‘goodness’ of different networks of sensors.

4.2. Optimal Sensor Locations

We focus on the statistical properties of the maximum likelihood estimator whose estimate was derived in Equation (20). Based on these properties, optimal design theory provides solutions for the optimal placement of sensors problem.
We first note that the estimator
θ ˜ = M 1 i = 1 N 1 σ i 2 F ˜ i θ ( θ 0 ) Y ˜ i ,
is unbiased since
E [ θ ˜ ] = M 1 i = 1 N 1 σ i 2 F ˜ i θ ( θ 0 ) E [ Y ˜ i ] = M 1 i = 1 N 1 σ i 2 F ˜ i θ ( θ 0 ) E F ˜ i θ ( θ 0 ) T θ 0 + ε ˜ i , = M 1 i = 1 N 1 σ i 2 F ˜ i θ ( θ 0 ) F ˜ i θ ( θ 0 ) θ 0 = θ 0 ,
and θ 0 is the true source location and intensity of the radiation source. Please note that we employed the relation E [ ε ˜ i ] = 0 , i = 1 , N , and Formula (21).
The covariance of the estimator is given by
cov [ θ ˜ ] = E [ ( θ ˜ θ 0 ) ( θ ˜ θ 0 ) T ] = E ( M 1 i = 1 N 1 σ i 2 F ˜ i θ ( θ 0 ) Y ˜ i θ 0 ) ( M 1 j = 1 N 1 σ i 2 F ˜ j θ ( θ 0 ) Y ˜ j θ 0 ) T = M 1 i = 1 N j = 1 N 1 σ i 2 F ˜ i θ ( θ 0 ) E ε ˜ i ε ˜ j 1 σ j 2 F j θ ( θ 0 ) T M T = M 1 i = 1 N 1 σ i 2 F ˜ i θ ( θ 0 ) E ε ˜ i 2 1 σ i 2 F ˜ i θ ( θ 0 ) T M T = M 1 .
The result is based on the independence of the errors and symmetry of M .
The Fisher information matrix does not depend on the pseudo-measurements υ ¯ i but instead on the sensor locations D i , i = 1 , , N , and the prior estimate θ 0 . As such, one can adjusts the sensor locations D i to minimize uncertainty in the estimator; i.e., minimize cov [ θ ˜ ] . As detailed in [26], an optimal configuration exists only for specific cases. This is the reason for the introduction of scalar metrics depending on all possible Fisher information matrices [26]. The most popular metrics are the D-optimality, E-optimality, A-optimality, and sensitivity criteria based on the determinant, smallest eigenvalue, and trace of the Fisher information matrix and its inverse. D-optimality is invariant under scale changes in the parameters and linear transformations of the output in contrast to A-optimality and E-optimality criteria.
In this investigation, we employ the D-optimality criterion, where the optimal network ξ N * = { D i * , i = 1 , , N } is searched as the solution of the optimization problem
max ξ N Ψ M ( ξ N , θ 0 ) , Ψ M ( ξ N , θ 0 ) = det M ( ξ N , θ 0 ) .
Here ξ N ranging in the space of all possible combinations of sensor locations.

4.3. Robust Design in the Average Sense

The optimal design ξ N * obtained as the solution of (24) is dependent on the prior estimate θ 0 of the true parameters θ 0 . When the prior estimate θ 0 is not a reasonable approximation of θ 0 , the network ξ N * may be inaccurate since prior uncertainty on θ 0 is not taken into account. For the admissible parameter space X in (4), the set of all possible characteristics θ, is compact. By incorporating the probabilistic description of the prior uncertainty, we obtain the optimal design in the average sense. The quantity of interest to be maximized is the expectation of the corresponding local optimality criterion,
Γ ( ξ N ) = E θ Ψ [ M ( ξ N , θ ) ] = X Ψ [ M ( ξ N , θ ) ] p ( θ ) d θ ,
where p ( θ ) is the uniform distribution on X . We employ the criterion introduced in (24) leading to the ED-optimal design problem
max ξ N Γ E D , Γ E D ( ξ N ) = X det M ( ξ N , θ ) p ( θ ) d θ .

5. Numerical Examples

As detailed in Section 3, the problem under investigation consists of identifying the location and intensity of a radiation source in a simulated downtown Washington, DC block with minimum error with respect to the true source location and intensity. The solution to this problem can be obtained by applying optimal sensor location strategies [15]. Instead of applying a local optimal design method, whose solution depends on some a priori estimate of the true source, we propose a robust design strategy to remove this dependency. Specifically, we propose a ‘compromise’ design where the obtained network is good enough (in a least-error sense) to identify any possible source from the admissible domain X .
As detailed in (4), we take the admissible parameter space to be X = [ 0 , 250 ] m × [ 0 , 180 ] m × [ 5 · 10 8 , 5 · 10 10 ] Bq . Next we specify the set of all possible detector locations to be a discrete set of 30 spatial positions. By sampling from a uniform distribution, we generate the possible locations of detectors in the domain denoted by diamond marks in Figure 2. In this way, we avoid the problem of overlapping sensors encountered for a continuous formulation. The specific dispersal pattern was selected to spread the detectors evenly throughout the area. We assume that detectors have facial areas A i , with 3-inch diameters and 3-inch lengths, for incident gamma energy of 662 KeV. This is standard packaging for sodium iodide (NaI) scintillators that possess intrinsic efficiency of ε i = 62 % for 662 keV gammas. The dwell time Δ t i for all detectors was chosen to be 1 s.
Finally, we set the size of the network to 10 detectors and formulate the robust design problem in the average sense:
Find the network ξ 10 * = { D i , i = 1 , , 10 } consisting of 10 detectors out of the 30 possible detectors locations depicted in Figure 2 that solves
ξ 10 * = a r g m a x ξ 10 Γ E D , Γ E D ( ξ 10 ) = X det M ( ξ 10 , θ ) p ( θ ) d θ .
By using a sufficiently large number of sources, the integral in (27) can be accurately approximated. To evaluate the efficiency of the robust design network, we employ the metric
= 1 M θ ˜ ^ ξ 10 θ 0 2 2 M ,
where θ 0 , = 1 , M , are M distinct true radiation sources. For each true source θ 0 , we can compute the associated estimate θ ˜ ^ ξ 10 as the solution of the problem (15) using the network ξ 10 . The score (28) corresponding to the robust design ξ 10 * will be tested against scores obtained by randomly selected networks ξ 10 of 10 detectors. The smallest score should be obtained by the robust design ξ 10 * thus validating the approach.
The discrete nature of the space of all possible detectors locations transforms (27) into a combinatorial optimization problem. The number of possible networks is 30,045,015 as given by (30 choose 10) which is equivalent to combination of 30 possible sensor locations taken 10. By imposing that each possible network ξ 10 contains only one detector out of the three possible choices from each of the ten rectangular areas shown in Figure 2, we decrease the number of possible networks to 59,049. This makes the combinatorial problem computationally feasible.
In Figure 3, we plot the model response F 8 (7) corresponding to the sensor location D 8 in Figure 2 and a source intensity of 3.5 × 10 9 Bq. The source location is varied inside the domain and the non-smooth nature of the model response is observed. The Fisher information matrix requires that the model response be differentiable with respect to the source location and intensity. This motivates replacing the model responses for all 30 possible locations with the differentiable radial basis function surrogate model described in Section 2.4.
We employ radial basis interpolation (10) to generate 30 surrogate model responses for all the possible detectors locations. The number of interpolation points is selected at 29,791 distributed inside the domain X . Specifically, for each dimension, we selected 31 points evenly distributed inside the interval. For each possible detector location and source θ k , the response model F i (7) was used to calculate the corresponding interpolation points ( θ k , F i ( θ k ) ) , k = 1 , ,29,791, i = 1 , , 30 . We tested several values of the shape parameter ε and the most accurate surrogate models were obtained for ε p = 8.06 .
In Figure 4a, we compare the surrogate model predictions against the outputs of model response F 8 for D 8 and 100 different sources uniformly randomly sampled from X . These sources were not included in the training set. The red curve corresponds to the model response outputs whereas the blue curve denotes the surrogate model predictions. Figure 4b illustrates the relative errors of the predicted intensities.
To compute accurate variances for each of the 30 surrogate model responses, we generated a data set of 10 5 different sources uniform randomly spread inside domain X . The root mean square errors (RMSE) are shown in Figure 5 for all 30 surrogate models. The largest error is observed for the surrogate model response associated with D 3 . We note that whereas the source intensity ranges between 5 × 10 8 and 5 × 10 10 Bq, the largest RMSE is on the order of 4.2 × 10 6 counts per second (cps). The discrepancies between the outputs of the models F ˜ i and F i are then used to compute variances ( σ i m ) 2 , i = 1 , , 30 .
Next we generated observations for all possible networks ξ 10 , = 1 , , 59,049, using statistical model (9) based on the model response discussed in Section 2.2. The observation errors associated with each detector and source θ are normally distributed with mean 0 and variance ( σ i o ) 2 = F i ( θ ) , i = 1 , , 30 .
The robust design problem solution is given by the network ξ 10 associated with the largest score Γ E D (27). To determine its maximum value, we calculate the associated Fisher information matrix for all possible networks ξ 10 , = 1 , , 59,049 and a collection of possible sources θ , = 1 , , 9880 spread throughout the domain X . This allows us to estimate the integral in (27). The dependencies associated with the Fisher information matrix are the derivatives of the surrogate models with respect to the source characteristics and variances ( σ i ) 2 = ( σ i o ) 2 + ( σ i m ) 2 , i = 1 , , 30 . The gradients F ˜ i θ l are computed from (10) knowing that ψ is the Gaussian radial basis function and · 2 is the Euclidean norm. These sources θ , = 1 , , 9880 , differ from those used for constructing the surrogate models and are uniformly distributed over the entire domain. This spatial distribution was employed since we selected p ( θ ) to be the uniform distribution over X in score Γ E D (27).
The values Γ E D ( ξ 10 ) are computed for all possible networks and the results are shown in Figure 6a. Allowing each network to include only one detector out of the three possible choices over each of the ten rectangular areas—see Figure 2—likely explains the periodic behavior. The optimization problem does not have a unique solution as seen from the expected values. Three different networks produce the largest score. Figure 6b shows the detectors locations of one of these three networks corresponding to the index 35,714.
To test the obtained robust design network, we use the Formula (28) with the 11 randomly selected networks of sensors plotted in Figure 7. Next we set M = 50 , and uniform randomly select 50 true sources θ 0 from X . Observations were then generated using the statistical model (9) for each source θ 0 , = 1 , , 50 and network of sensors including the optimal one. We then solve the inverse problem (15) using the Particle Swarm algorithm [10] detailed in the Appendix A.
The Particle Swarm approach is a global, meta-heuristic optimization algorithm motivated by social-psychological principles [27]. It was originally introduced in [10] and it was designed to imitate a social behavior such as the movements of birds in a flock or fishes in a shoal. Later the algorithm was simplified and its performance for solving optimization problems were reported in [28].
For our example, we set the inertia parameter to be 1.1 and the neighborhood of each particle is set to 4. The self and social adjustment coefficients y 1 and y 2 are set to 1.49 . We select the swarm size to 70, and for each given source θ 0 , = 1 , , 50 , and network out of the 11 randomly selected networks plus the optimal one, we compute the inverse problem solution.
The errors of the inverse problem solutions (i.e., the estimated sources characteristics) are shown in Figure 8 for all possible networks and sources. We note that the solution obtained using the optimal network does not have the smallest error for all the sources. For example, for source number 28, the source characteristics errors obtained using the optimal network are larger than all the estimates errors associated with the random networks except Network 4. This is not unexpected, since the optimal design was obtained following an average sense formulation.
Next, the errors of the inverse problem solutions are averaged over the entire set of sources and the results of Formula (28) are illustrated in Figure 9 for all 12 considered networks. The index associated with the optimal network is 12 and corresponds to the smallest RMSE. This result suggests that we were able to identify the robust design in the average sense for the nuclear transport inverse problem.

6. Conclusions and Future Work

In this investigation, we constructed a network of sensors to reduce uncertainty in the solution of a radiation detection inverse problem. We employed a robust design in the average sense method that eliminated the dependence on the true solution or a priori estimates. We focused on the ED-optimal design [15], whose solution maximizes the expected value of the determinant of the Fisher information matrix over the entire domain.
Since we generated a discrete number of possible sensors locations, the stochastic optimization problem is transformed into a combinatorial one. The number of possible networks decreased by imposing certain combination restrictions for the possible detectors’ locations. We employed a radial basis function surrogate model to alleviate the non-smoothness attributes of the radiation transport model due to the domain geometry.
By solving the combinatorial problem, we found that the solution of the optimization problem is not unique. However, we did identify multiple sensor networks that were optimal in a least- error sense. An optimal network associated with the largest ED-scores was compared against 11 randomly selected networks using a data set of 50 different sources. The overall RMSE revealed that the optimal network has more precision than any other network in the average sense.
In future work, we will reformulate the problem in a continuous framework; i.e., the number of detectors is unlimited inside the city environment. This formulation enables the use of very effective numerical algorithms. We will apply methods such as the stochastic Robbins–-Monro algorithm [29,30].

Author Contributions

R.Ş. worked on conceptualization, methodology, software, validation and writing the manuscript. J.H. was responsible with some conceptualization details, software and reviewing the manuscript. J.C. helped with the results validation. R.C.S. and J.M. contributed to conceptualization, methodology and reviewing the manuscript.

Funding

This research was supported by the Department of Energy National Nuclear Security Administration NNSA Consortium for Nonproliferation Enabling Capabilities (CNEC) under the Award Number DE-NA0002576.

Acknowledgments

Razvan Ştefănescu thanks Michael Navon and Alen Alexanderian for their valuable suggestions regarding the current research topic.

Conflicts of Interest

The authors declare no conflict of interest. The funders had no role in the design of the study, the collection, analysis, and interpretation of data, the writing of the manuscript, and the decision to publish the results.

Appendix A. Particle Swarm

The initialization stage of Particle Swarm is described in Algorithm A1. The algorithm starts by selecting the population size of the swarm denoted by P. Initially, the state positions [ θ o l d ] j and velocities [ v o l d ] j , j = 1 , 2 , , P are randomly selected from uniform distributions; i.e., [ θ i o l d ] j , [ v i o l d ] j U [ l i , u i ] , i = 1 , 2 , 3 . Each state point has an associated neighborhood of size N = N s influencing its future trajectory. Other parameters of the algorithm must be selected too, such as the inertia parameters W j R and stall counter c j for j = 1 , 2 , , P . These parameters influence the space search.
Algorithm A1 Particle Swarm—Initialization
1:Select swarm size P N and generate initial state points [ θ o l d ] j and velocities [ v o l d ] j , j = 1 , 2 , . , P such that [ θ i o l d ] j , [ v i o l d ] j [ l i , u i ] , i = 1 , 2 , 3 .
2:Select the minimum neighborhood size m i n N s and the inertia parameters W j R , j = 1 , 2 , , P , W j [ 0.1 , 1.1 ] .
3:Set the stall counter c j = 0 for all state points j = 1 , 2 , , P .
4:Set the self and social adjustment real variables y 1 and y 2 .
5:Set N = N s .
The evolution of the space point j t h from the current state to the next one is described in Algorithm A2. The index notation is dropped. The proposal function depends on a two steps formula. First, the velocity v n e w is adjusted via Equation (A1) while in the second phase, the new state is obtained by adding the newly generated velocity to its previous position (A2). The weights y 1 and y 2 denote the self and social adjustment coefficients steering the search towards either the state point p or its neighbors g best position. We denote the Hadamard product by . * .
A successful replacement of the best state point position b among the entire population ensures a change in the inertia parameter W while a failure leads to a larger neighborhood selection and maintains W constant. Finally, the new proposals are set to replace the current ones for the next iteration. The algorithm stops when the relative change in the lowest objective function value J b * over a range of predefined number of iterations is smaller than a specified tolerance, the maximum number of iterations is reached, or a preset objective function percentage decrease has been achieved.
Algorithm A2 Particle Swarm— j t h trajectory
1:Select N state points other than j to generate the associated neighborhood.
2:Set f l a g = f a l s e . Define set S containing all the N state points. Find the lowest objective function
g = min θ o l d S J ( θ o l d ) a n d s e t J n * = J ( g ) .
3:Select random vectors u 1 and u 2 of size 3 from the uniform distribution U ( 0 , 1 ) . Update the velocity:
v n e w = W · v o l d + y 1 · u 1 . ( p θ o l d ) + y 2 · u 2 . ( g θ o l d ) .
4:Update the position
θ n e w = θ o l d + v n e w .
5:Enforce the bounds. If any component of θ n e w is outside a bound, set it equal to that bound.
6:if J ( θ n e w ) < J * then p = θ n e w , J * = J ( θ n e w )
7:end if.
8:if J ( θ n e w ) < J b * then f l a g = t r u e , J b * = J ( θ n e w )   a n d   b = θ n e w , where J b * corresponds to the smallest objective function in the swarm.
9:else f l a g = f a l s e
10:end if.
11:if flag = true then set c = max ( 0 , c 1 ) and N = N s .
12:  if c < 2 then W = 2 · W
13:  end if.
14:  if c > 5 then W = W / 2 and ensure that W is inside the bounds.
15:  end if.
16:else set c = c + 1 , N = min ( N + N s , P )
17:end if.
18:Set θ o l d = θ n e w and v o l d = v n e w and GO TO step 1.

References

  1. Alpay, M.E.; Shor, M.H. Model-Based Solution Techniques for the Source Localization Problem. IEEE Trans. Control Syst. Technol. 2000, 8, 895–904. [Google Scholar] [CrossRef]
  2. Sivergina, I.F.; Polis, M.P. Comments on “Model-based solution techniques for the source localization problem”. IEEE Trans. Control Syst. Technol. 2002, 10, 633. [Google Scholar] [CrossRef]
  3. Shultis, J.K.; Faw, R.E. Radiation Shielding; American Nuclear Society: La Grange Park, IL, USA, 2000. [Google Scholar]
  4. Briesmeister, J.; Carlo, M.T.G.M.; Code, N.P.T. Report LA-13709-M; Los Alamos National Laboratory: Los Alamos, NM, USA, 2000.
  5. King, M.J.; Harris, B.; Toolin, M.; DuBord, R.M.; Skowronski, V.J.; LuSoto, M.A.; Estep, R.J.; Brennan, S.M.; Cosofret, B.R.; Shokhirev, K.N. An Urban Environment Simulation Framework for Evaluating Novel Distributed Radiation Detection Architectures. In Proceedings of the 2010 IEEE International Conference on Technologies for Homeland Security (HST), Waltham, MA, USA, 8–10 November 2010; pp. 446–452. [Google Scholar]
  6. Brennan, S.M.; Mielke, A.M.; Torney, D.C.; MacCabe, A.B. Radiation Detection with Distributed Sensor Networks. Computer 2004, 37, 57–59. [Google Scholar] [CrossRef]
  7. Hensley, W.; Lepel, E. Synth: A Gamma-Ray Spectrum Synthesizer; Pacific Northwest Laboratory: Richland, WA, USA, 1996.
  8. Morelande, M.R.; Skvortsov, A. Radiation field estimation using a Gaussian mixture. In Proceedings of the 12th International Conference on Information Fusion, Seattle, WA, USA, 6–9 July 2009; pp. 2247–2254. [Google Scholar]
  9. Ştefănescu, R.; Schmidt, K.; Hite, J.; Smith, R.; Mattingly, J. Hybrid optimization and Bayesian inference techniques for a non-smooth radiation detection problem. Int. J. Numer. Methods Eng. 2017, 111, 955–982. [Google Scholar] [CrossRef] [Green Version]
  10. Kennedy, J.; Eberhart, R. Particle Swarm Optimization. In Proceedings of the IEEE International Conference on Neural Networks, Perth, Australia, 27 November–1 December 1995; IEEE Service Center: Piscataway, NJ, USA, 1995. [Google Scholar]
  11. Kirkpatrick, S.; Gelatt, C.D.; Vecchi, M.P. Optimization by Simulated Annealing. Science 1983, 220, 671–680. [Google Scholar] [CrossRef] [PubMed]
  12. Kelley, C.T. Implicit Filtering; SIAM: Philadelphia, PA, USA, 2011. [Google Scholar]
  13. Haario, H.; Laine, M.; Mira, A.; Saksman, E. DRAM: Efficient adaptive MCMC. Stat. Comput. 2006, 16, 339–354. [Google Scholar] [CrossRef]
  14. Vrugt, J.A.; Ter Braak, C.; Diks, C.; Robinson, B.A.; Hyman, J.M.; Higdon, D. Accelerating Markov Chain Monte Carlo Simulation by Differential Evolution with Self-Adaptive Randomized Subspace Sampling. Int. J. Nonlinear Sci. Numer. Simul. 2009, 10, 273–290. [Google Scholar] [CrossRef]
  15. Ucinski, D. Optimal Measurement Methods for Distributed Parameter System Identification; CRC Press: Boca Raton, FL, USA, 2004. [Google Scholar]
  16. Sun, N.Z. Inverse Problems in Groundwater Modeling; Springer Science & Business Media: Berlin/Heidelberg, Germany, 1999; Volume 6. [Google Scholar]
  17. Krause, A.; Singh, A.; Guestrin, C. Near-optimal sensor placements in Gaussian processes: Theory, efficient algorithms and empirical studies. J. Mach. Learn. Res. 2008, 9, 235–284. [Google Scholar]
  18. Michaud, I. Simulation-Based Bayesian Experimental Design Using Mutual Information. Ph.D. Thesis, North Carolina State University, Raleigh, NC, USA, 2019. [Google Scholar]
  19. Weaver, B.P.; Williams, B.J.; Anderson-Cook, C.M.; Higdon, D.M. Computational enhancements to Bayesian design of experiments using Gaussian processes. Bayesian Anal. 2016, 11, 191–213. [Google Scholar] [CrossRef]
  20. Schmidt, K. Uncertainty Quantification for Mixed-Effects Models with Applications in Nuclear Engineering. Ph.D. Thesis, North Carolina State University, Raleigh, NC, USA, 2016. [Google Scholar]
  21. Kraskov, A.; Stogbauer, H.; Grassberger, P. Estimating mutual information. Phys. Rev. E 2004, 69, 066138. [Google Scholar] [CrossRef] [PubMed]
  22. Lewis, A.; Smith, R.; Williams, B.; Figueroa, V. An information theoretic approach to use high-fidelity codes to calibrate low-fidelity codes. J. Comput. Phys. 2016, 324, 24–43. [Google Scholar] [CrossRef] [Green Version]
  23. Terejanu, G.; Upadhyay, R.R.; Miki, K. Bayesian experimental design for the active nitridation of graphite by atomic nitrogen. Exp. Therm. Fluid Sci. 2012, 36, 178–193. [Google Scholar] [CrossRef] [Green Version]
  24. Buhmann, M.D. Radial Basis Functions: Theory and Implementations; Cambridge University Press: Cambridge, UK, 2003; Volume 12. [Google Scholar]
  25. Fedorov, V.V.; Leonov, S.L. Optimal Design for Nonlinear Response Models; CRC Press: Boca Raton, FL, USA, 2013. [Google Scholar]
  26. Spall, J.C. Introduction to Stochastic Search and Optimization: Estimation, Simulation, and Control; John Wiley & Sons: Hoboken, NJ, USA, 2005; Volume 65. [Google Scholar]
  27. Kennedy, J. Particle swarm optimization. In Encyclopedia of Machine Learning; Springer: Boston, MA, USA, 2010; pp. 760–766. [Google Scholar]
  28. Eberhart, R.C.; Kennedy, J. A New Optimizer Using Particle Swarm Theory. In Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan, 4–6 October 1995; Volume 1, pp. 39–43. [Google Scholar]
  29. Marti, K. Stochastic Optimization Methods; Springer: Berlin/Heidelberg, Germany, 2005; Volume 3. [Google Scholar]
  30. Nemirovski, A.; Juditsky, A.; Lan, G.; Shapiro, A. Robust stochastic approximation approach to stochastic programming. SIAM J. Optim. 2009, 19, 1574–1609. [Google Scholar] [CrossRef]
Figure 1. Satellite image of problem domain with model geometry overlaid (Imagery ©2016 Commonwealth of Virginia, DigitalGlobe, District of Columbia (DC GIS), Sanborn, U.S. Geological Survey, Map data ©2016 Google). The figure marks indicate possible detector and source locations.
Figure 1. Satellite image of problem domain with model geometry overlaid (Imagery ©2016 Commonwealth of Virginia, DigitalGlobe, District of Columbia (DC GIS), Sanborn, U.S. Geological Survey, Map data ©2016 Google). The figure marks indicate possible detector and source locations.
Algorithms 12 00113 g001
Figure 2. Network formed by 10 sensors with 30 possible locations.
Figure 2. Network formed by 10 sensors with 30 possible locations.
Algorithms 12 00113 g002
Figure 3. Model response associated with sensor location number 8 with the source intensity fixed at 3.5 × 10 9 Bq. Intensity is plotted with a logarithmic scale.
Figure 3. Model response associated with sensor location number 8 with the source intensity fixed at 3.5 × 10 9 Bq. Intensity is plotted with a logarithmic scale.
Algorithms 12 00113 g003
Figure 4. (a) Predictions of the surrogate and model responses associated with sensor location number 8. (b) Relative errors of the predicted intensity. The test set consists of 100 different sources not included in the training set.
Figure 4. (a) Predictions of the surrogate and model responses associated with sensor location number 8. (b) Relative errors of the predicted intensity. The test set consists of 100 different sources not included in the training set.
Algorithms 12 00113 g004
Figure 5. Root mean square error of all 30 surrogate models. The test set consists of 10 5 different sources not included in the training set.
Figure 5. Root mean square error of all 30 surrogate models. The test set consists of 10 5 different sources not included in the training set.
Algorithms 12 00113 g005
Figure 6. (a) Scores Γ E D for all possible networks. The solution of the robust design problem is not unique; (b) The optimal network obtained as the solution of the robust design in the average sense strategy.
Figure 6. (a) Scores Γ E D for all possible networks. The solution of the robust design problem is not unique; (b) The optimal network obtained as the solution of the robust design in the average sense strategy.
Algorithms 12 00113 g006
Figure 7. Random networks of sensors to be compared against the optimal network shown in the Figure 6.
Figure 7. Random networks of sensors to be compared against the optimal network shown in the Figure 6.
Algorithms 12 00113 g007
Figure 8. Root mean square errors for all the tested networks. The results are obtained for 50 sources whose components were selected from uniform distributions.
Figure 8. Root mean square errors for all the tested networks. The results are obtained for 50 sources whose components were selected from uniform distributions.
Algorithms 12 00113 g008
Figure 9. Root mean square errors averaged for all sources. The network corresponding to index 12 is the optimal one and has the smallest errors.
Figure 9. Root mean square errors averaged for all sources. The network corresponding to index 12 is the optimal one and has the smallest errors.
Algorithms 12 00113 g009

Share and Cite

MDPI and ACS Style

Ştefănescu, R.; Hite, J.; Cook, J.; Smith, R.C.; Mattingly, J. Surrogate-Based Robust Design for a Non-Smooth Radiation Source Detection Problem. Algorithms 2019, 12, 113. https://doi.org/10.3390/a12060113

AMA Style

Ştefănescu R, Hite J, Cook J, Smith RC, Mattingly J. Surrogate-Based Robust Design for a Non-Smooth Radiation Source Detection Problem. Algorithms. 2019; 12(6):113. https://doi.org/10.3390/a12060113

Chicago/Turabian Style

Ştefănescu, Răzvan, Jason Hite, Jared Cook, Ralph C. Smith, and John Mattingly. 2019. "Surrogate-Based Robust Design for a Non-Smooth Radiation Source Detection Problem" Algorithms 12, no. 6: 113. https://doi.org/10.3390/a12060113

Note that from the first issue of 2016, this journal uses article numbers instead of page numbers. See further details here.

Article Metrics

Back to TopTop