Kernel Density Estimators for Axisymmetric Particle Beams

: Bright beams are commonly represented by sampled data in the numerical algorithms used to simulate their properties. However, in these calculations and the analyses of their outputs, the beam’s density is sometimes required and must be calculated from the samples. Axisymmetric beams, which possess a rotational symmetry and are naturally expressed in polar coordinates, pose a particular challenge to density estimators. The area element in polar coordinates shrinks as the radius becomes small, and weighting the samples to account for their reduced frequency may cause unwelcome artifacts. In this work, we derive analytical expressions for two kernel density estimators, which solve these problems in the spatial coordinates and in the transverse phase space. We show how the kernels can be found by averaging the Gaussian kernel in Cartesian coordinates over the polar angle and demonstrate their use on test problems. These results show that particle beam symmetries can be taken advantage of in density estimation while avoiding artifacts.


Introduction
Sampled representations of charged particle beams are abundant in the numerical calculations used in accelerator physics (e.g., particle tracking codes).However, some algorithms and analyses require the beam's density, which must somehow be estimated from the samples.For example, the density along a set of grid points is required when calculating space charge forces using the common particle-mesh algorithm [1].Related algorithms are used in free electron laser codes and need the density for calculating the discretized radiation field [2,3].Some machine learning architectures such as convolutional neural networks [4] or physics-informed neural networks [5][6][7] (PINNs) require data that look like images or continuous functions (i.e., like the solution to a partial differential equation).When talking about a particle beam distribution, this means a density, and for PINNs in particular, one concept of how to introduce physics-based priors to models of a particle accelerator involves the Vlasov equation [8], which applies to the density and not the particles themselves.With the importance of finding densities from sampled data in accelerator physics, it is necessary for us to have good algorithms that solve this problem quickly and accurately.
Kernel density estimation (KDE) is one popular method for the conversion from sampled data to a density.It has already found some use in accelerator physics when the beam does not possess any special symmetries [9][10][11] and is being used for training PINNs on sampled data [12,13].In this method, the density is given by the convolution between a kernel function f (x, x) and the (possibly multidimensional) samples xi .Precisely, this is, where n is the number of samples, x is the point where the density is being evaluated, d is the number of dimensions, and h is a parameter (generalizable to a matrix) called the bandwidth that controls how much "smoothing" the kernel provides [14].In this article, we will use the Gaussian kernel, f (x, x) = (2π) −d/2 exp(−|x − x| 2 /2), as a starting point.Bandwidth selection plays an important role in the accuracy of the density estimator, but is out of the scope of this article.We point readers to review papers on the subject for more information [15][16][17].
Symmetries play an important role in numerical algorithms and can enable better performance and accuracy of the calculations by reducing the number of dimensions involved.One important class of symmetries in accelerator physics is a rotational symmetry around the longitudinal axis.These axisymmetric beams (such as those in most photoinjectors) are better represented in cylindrical coordinates, where the invariance of quantities like density with rotation is more naturally expressed.To reference later, the transformation of transverse phase space into cylindrical coordinates is, where the primed coordinates represent velocities with respect to the time-like dimension (time or position along the accelerator).Since the density is by definition invariant along the θ coordinate, it can be dropped to improve all of the previously mentioned applications of densities in accelerator physics.The only challenge now is to develop a density estimator that is appropriate for axisymmetric particle beams.The remainder of the paper is dedicated to deriving kernels useful for density estimation of axisymmetric particle beams from sampled data.We begin with an example of the problems that come up in naive implementations of a density estimator for this application.We then continue on to derive kernels for use with the spatial coordinate and then the full phase space of axisymmetric particle beams.The article then concludes with a brief discussion of the results and their potential applications in accelerator physics.

A Motivating Problem
Before deriving kernels for axisymmetric density estimation, we provide a motivating example to demonstrate the problems with naive density estimators and why deriving a more serious method is needed.
Let us start with the problem of estimating the density of the spatial part of the particle beam samples, that is, finding an estimate of ρ(r, θ) = ρ(r).Note that in this notation, ρ(r) is the radial part of the 2D density and not the density of the radial coordinates, which differs by a factor of r.For the purpose of this example, we consider samples of x and y from the 2D normal distribution with a unit covariance matrix that is axisymmetric and has the analytical radial density ρ(r) = 1/(2π) • exp(−r 2 /2).Since the distribution is independent of θ, we are motivated to first convert the samples to polar coordinates and then perform a density estimate of the radial part.Here, we immediately run into a problem.The area element in polar coordinates is rdrdθ, and the samples disappear near the origin.This is the blue histogram in Figure 1.To counteract that, we can weight each of the samples as 1/r.This estimate (orange in Figure 1), however, ends up with artifacts near the origin, as fewer and fewer samples are given more weight to contribute to the corrected density estimate.Note that we reflect the densities over r = 0 to make visualization easier, and for the full phase space we will similarly amend the figures with the data transformed by (r, r ) → (−r, −r ).
The same artifacts occur when one tries to estimate the density of samples in the transverse phase space (x, y, x , y ).Here, we take an example distribution in the 4D transverse phase space as being normally distributed with a diagonal covariance matrix.The standard deviation of the spatial coordinates is 1.0, and for the velocities, it is 0.2.We then take samples from this distribution and shear the radial velocities according to r → r + ar + br 3 for some numbers a and b (here, 0.6 and −0.1) to arrive at an interesting shape for us to estimate.For the density calculation, the samples are transformed according to the inverse of Equations ( 2) and ( 3), and we evaluate a 2D histogram of the coordinates r and r weighted by 1/r.We show this estimate in Figure 2 (labeled "Histogram"), and one can see the same type of artifacts near r = 0 that pollute our result.

The Radial Density
Our goal is to solve the problem of artifacts near r = 0 when estimating the density of axisymmetric beams.Let us start again with the simpler case of only the spatial coordinates (x and y).To avoid the singular weights of points near the origin, we can start by estimating the density of points in 2D space using the typical Gaussian kernel, The resulting density estimate (i.e., using Equation ( 1), it may be evaluated in polar coordinates as ρ(x(r, θ), y(r, θ)).Note that we are not scaling the density by the determinant of the transformation's Jacobian (|J| = r) because we wish to simply find the original density evaluated at a polar coordinate and not the density of the radial coordinates, which would go to zero like ρ(r) ≈ r near the origin.This new function, ρ(r, θ) contains an estimate of the radial part at every value of θ but does not benefit from the known symmetry of the points.To incorporate this, we average over the angular coordinate to obtain ρ(r) = 1/(2π) 2π 0 ρ(r, θ)dθ.The density estimator in Cartesian coordinates transformed to polar coordinates and then averaged over the angle should give us an estimate of the density without using the problematic weighting strategy.This integral can be solved analytically, and we find that the kernel is simply replaced by the expression, where I 0 (rr) is a modified Bessel function of the first kind.This function is plotted in Figure 3a.If we use the Hankel asymptotic expansion for the Bessel function [18], we find that the behavior as rr becomes large is This is the typical Gaussian kernel with the expected weighting of 1/r along r = r to account for the effect of the growing area element in polar coordinates as r becomes large.We plot this asymptotic expression against the kernel itself along the diagonal in Figure 3b.Near the origin, however, we see a different behavior.The density no longer has the pole that caused the artifacts we saw in Figure 1.Additionally, back in Figure 3a, if we look along the kernel at fixed r, we can see that when the sample is near r = 0, there is a boost in the amount of density measured near r = 0 compared to when the sample is further away.This comes from the averaging over θ, which means that, for instance, a sample at x = 1 will also contribute density from the point x = −1 that can "leak" over the origin and contribute to the estimate when both r and r are small.All of these properties, both asymptotic and near the origin, will also show up when we deal with the more complicated case of density estimation for the transverse phase space.
We directly compare density estimates for 10,000 samples drawn from the 2D normal distribution using the weighted histogram described in Section 2 and the axisymmetric kernel density estimator in Figure 4.It is clear that the artifacts near r = 0 have been reduced to provide a more accurate estimate of the density compared with the analytical value.While experimenting with the density estimator, we found it necessary to use the scaled Bessel function exp(−x)I 0 (x) to avoid numerical problems when evaluating the kernel in Equation ( 5).Fast implementations of the scaled Bessel function are available in many popular scientific computing libraries and have been described in, for example, ref. [19].

Density in Phase Space
With a template of how to proceed from the density estimator of the spatial coordinates, we can begin to derive a kernel for use in phase space.Starting with the Gaussian kernel over the four phase space variables (x, y, x , and y ), we can transform them according to Equations ( 2) and ( 3) and average out over the angular variable, θ.This gives us the kernel, with the Bessel function's argument, Oftentimes, however, the angular velocity is uninteresting for the beams used in accelerator physics.The samples may have negligible angular velocity, or it may take on a single value for all of the particles such as for magnetized beams.This motivates us to further reduce the number of dimensions by integrating out the dependence of Equation ( 7) on θ .We find that the integral can be solved analytically and gives us the kernel, where the arguments to I 0 (u, v; e iφ ) are and arctan(y, x) is the two-argument arctangent that gives the angle from the positive x-axis to the point (x, y), and the k i are, A detailed derivation of the kernel is given in Appendix A.
The function I 0 (u, v; e iφ ) is an order zero "modified generalized Bessel function" of the first kind with a single parameter (e.g., defined in refs.[20,21]).The function also takes on the integral form, which can be found by applying the relationship between the modified and unmodified generalized Bessel functions (i.e., Equation (2.22) in ref. [22]) to the typical integral representation (i.e., Equation (2.9) in the same reference).We found that numerically evaluating the logarithm of this integral provides a stable method of calculating its value for use in the kernel.This function's values may also be tabulated before use in density estimation and then looked up with linear interpolation later to achieve acceptable performance when used with many samples.Further work could be put into deriving an efficient numerical expression for the function's value using a method similar to how the regular modified Bessel function is computed to improve the speed of evaluating the kernel.Some examples of the kernel (Equation ( 9)) for different values of the sample's location are shown in Figure 5.When the samples have large radius, the function is visually similar to the Gaussian kernel and also falls off with a 1/r weighting as was previously found for the case of the spatial coordinates.As the samples move closer to the origin, we also find additional density showing up near r = 0.However, unlike in the case of the spatial coordinates, a 180-degree rotation of the sample will take the coordinates to (x, x ) → (−x, −x ), and so we see the density "leaking over" from a point at (−r, −r ) and not just the point reflected across r = 0.The effect of angular velocity is shown in the second row of Figure 5 and tends to distribute density at a much greater than typical spread of radial velocity after the integration.We demonstrate the use of this kernel density estimator on the example described in Section 2 of a beam with some nonlinear correlation to the velocities.A set of 50,000 samples was generated, and the densities are shown in Figure 2 using the weighted histogram method and the axisymmetric density estimator with the kernel in Equation ( 9).This is compared to the analytical value of the density.Visually, it can be seen that the axisymmetric kernel density estimator successfully avoids the problem of artifacts near r = 0 and matches closely with the analytical distribution.To quantify the difference in accuracy between the two methods at a variety of sample sizes, we numerically evaluated the mean integrated square error (MISE) of the two estimators using 32 samples each time and evaluating the densities on a 64 × 64 grid of values over the same region shown in Figure 2. The results are shown in Figure 6 and suggest that the axisymmetric kernel density estimator successfully avoids artifacts present in the weighted histogram and converges more quickly to the analytical density.Note that these results strongly depend on the choice of the histogram bin size and KDE bandwidth, and for the brief comparison here, we simply used "Scott's rule of thumb" [17] in both cases.

Conclusions
In this work, we have derived analytical expressions for two kernel density estimators for use with axisymmetric particle beams: one for the spatial coordinates and one for transverse phase space.These density estimators avoid the problem of artifacts near r = 0 that result from the need to weight samples in polar coordinates according to the shrinking area element when the radius becomes small.By starting with an estimate of the density in Cartesian coordinates and then averaging over the angle in polar coordinates, we can find estimates of the density that do not diverge as the sample approaches the origin.These estimates should be accurate for any distribution of samples, where the original Cartesian kernel density estimator would give an accurate estimate of density.Being analytical, the expressions avoid the performance and convergence penalties that could come from averaging the kernels numerically.We also note that the longitudinal coordinates can be trivially added to both estimators by multiplying by another Gaussian kernel in those dimensions.
Taking advantage of known symmetries is important to improving the accuracy and performance of numerical algorithms.The density estimators presented here are a stable and efficient method of calculating densities for axisymmetric beams and could help accelerator codes take advantage of the symmetries present in, for example, many photoinjectors.By advancing the ways we design and operate these machines, improved numerical tools will help enable the next generation of particle accelerators.

Figure 1 .
Figure 1.Densities of samples drawn from the 2D normal distribution with unit variance; (green) analytical radial part of the density function; (blue) density of the radial coordinate of samples; (orange) estimate of radial density using a weighted histogram.

Figure 2 .
Figure 2. Density of points sampled from a beam with nonlinear correlation estimated with the weighted histogram method and the axisymmetric kernel density estimator.The analytical density is shown to the right.

Figure 3 .
Figure 3. (a) The averaged kernel for evaluating the axisymmetric radial density of the spatial coordinates; (b) the same function evaluated along f (r, r) compared with its asymptotic form (with k = 1/(2π) 3/2 ).

Figure 4 .
Figure 4. Comparison between the axisymmetric kernel density estimator (blue), the weighted histogram method (orange), and the analytical expression for the radial part of the 2D density of normally distributed points.

Figure 5 .
Figure 5.The averaged phase space kernel (Equation (9)) evaluated for different locations of the sample.The sample's location is shown as the red cross in each image, and the coordinates are displayed in the header of each column.The angular velocity of the sample is varied along the rows and shown in each row label.The colormaps are normalized to the maximum of the data in each panel to highlight the shape of the kernels rather than their relative magnitude.

Figure 6 .
Figure 6.Estimates of the mean integrated square error of the two density estimators for a variety of sample sizes.For each point, 32 samples were evaluated, and the integrated error was calculated using a 64 × 64 grid of points.