Implicit temporal discretization and exact energy conservation for particle methods applied to the Poisson-Boltzmann equation

Wereportonanewmultiscalemethodapproachforthestudyofsystemswith wide separation of short-range forces acting on short time scales and long-range forces acting on much slower scales. We consider the case of the Poisson-Boltzmann equation that describes the long-range forces using the Boltzmann formula (i.e. we assume the medium to be in quasi local thermal equilibrium). We developed a new approach where fields and particle information (mediated by the equations for their moments) are solved self-consistently. The new approach is implicit and numerically stable, providing exact energy conservation. We tested different implementations all leading to exact energy conservation. The new method requires the solution of a large set of non-linear equations. We considered three solution strategies: Jacobian Free Newton Krylov, an alternative, called field hiding, based on hiding part of the residual calculation and replacing them with direct solutions and a Direct Newton Schwarz solver that considers simplified single particle-based Jacobian. The field hiding strategy proves to be the most efficient approach.


Introduction
Matter in any form and state is characterized by the presence of particles interacting via fields. At the most fundamental level, a quantum point of view is needed but as larger and larger portions of matter need to be studied, the attention shifts from the quantum level to the particle level (where matter is described as particles interacting via forces), to the macroscopic level (where matter is described as a continuum with properties defined everywhere in space).
Perhaps the key effort in science and in particular in computational science is to design models able to describe or predict the properties and behavior of matter based on the knowledge of its constituents and their interaction: We consider here in particular the particle approach (Hockney and Eastwood, 1988;Frenkel and Smit, 2001). At the core of the method is the ability to use supercomputers to track millions or billions of particles to reproduce the behavior of molecules, proteins, genome and any type of matter. The 1 fundamental complexity and challenge is that the evolution of such systems encompasses many scales (de Borst, 2008).
Most first-principle methods are required to resolve the smallest scales present not because the processes there are important but because otherwise the method would fail due to numerical instabilities (Hockney and Eastwood, 1988). Particle models must account for the presence of long and short-range interaction that acts on different temporal scales. The contribution from short-range forces can be computed for each particle considering only the others within a short distance. But the long-range forces require a global approach for the whole system. Different methods have been designed (Frenkel and Smit, 2001). One class of approaches, called the particle-particle particle-mesh (P3M) (Eastwood et al., 1984) and the more recent Particle-mesh Ewald (PME) summation (Essmann et al., 1995), introduce a mesh for the long-range interaction.
In practical terms the short-range calculation is less demanding as it involves only the particles within a cell selected to meet the short-range forces, but the long-range interaction is a costly mesh operation involving the whole domain and requiring the solution of an elliptic equation of the Poisson-type.
Additionally, short-range calculations can be effectively and easily implemented on modern massively parallel computers and especially on GPU-based computers. Long-range calculations require communication among processors as distant areas of the system still need to exchange information.
For these reasons, new methods have been developed to take advantage of a fundamental difference between short and long range interaction: by virtue of their long range, forces acting over long distances affect the system on slower scales. The reversible reference system propagator algorithms (RESPA) (Tuckerman et al., 1992) use more frequent updates for the short-range forces while the long-range forces can be recomputed only intermittently. These approaches are based on decoupling the force calculation for the different contributions to the interaction (short and long range) from the particle mover: the mover calls the force calculation at different time intervals depending on the time scale typical of the force. Long-range interactions are computed less frequently reducing the cost.
But it is possible to go even further and reduce drastically the frequency of long-range force calculation if the particle mover and the force field computation are not decoupled: using implicit methods (Brackbill and Cohen, 2014). This is the approach we propose here. Our new approach allows us to model the coupled particle-motion equations and long-range Poisson equation within a self-consistent set of mesh equations that can be solved with much longer time steps. The key innovation of the implicit approach is, however, its capability to conserve energy exactly and to machine precision, a property that remains elusive for explicit methods (Salmon and Warren, 1994;Hockney and Eastwood, 1988;Birdsall and Langdon, 2004;Grigoryev et al., 2002).
The downside of implicit method is of course the need for iterative solvers. We consider here the case of solvation (Im et al., 1998) where the system is described by the of Poisson-Boltzmann system (Fogolari et al., 2002), a highly non-linear system of coupled equations. We have to rely on the Newton method for its solution, but we explore two practices to reduce the cost: a method that removes part of the operations (solution of the field equations) from the main non-linear iterations (we refer to this approach as field hiding) and one where the Jacobian of the Newton iteration is drastically simplified (we refer to this approach as Newton-Krylov-Schwarz ).
The method presented is tested in a standard problem where complex physics developed. Energy is observed to remain conserved within arbitrary precision. The different solution strategies all converge to the correct solution but the field hiding approach is observed to be the most cost effective.
The paper is organized as follows. Section 2 describes the model problem we use: implicit solvation, while Section 3 describes its mathematical implementation in the Poisson-Boltzmann system. Section 4 describes two possible temporal discretization: the standard explicit method widely used in the literature and our new implicit approach. Section 5 describes a key novelty of our approach, it ability to conserve energy exactly to machine precision. Section 6 describes he solution strategy for the implicit approach, based on the Newton-Krylov method and on two alternative approaches that can riduce the computational cost. Section 7 present results for a specific case, showing exact energy conservation. Section 8, instead, focuses on the performance of the different solution strategies. Finally Section 9 outlines the conclusions of our work.

Model Problem: Implicit Solvation
Our goal is the application of implicit methods to improve the handling of long-range forces in multiple scales particle simulations. We consider as target problem a system of interacting atoms or ions immersed in a matrix.
One example guided the progress. Atoms, molecules and particles immersed in media, like water, or in other matrices are present in many materials and in particular in all biomaterials. Treating media as a collective entity is a more effective approach than trying to model all atoms in the medium. We based our approach on the Poisson-Boltzmann (PB) equation that treats the electrostatic behavior of the media as a continuum model introducing an electrostatic field solved on a grid (Im et al., 1998;Lu et al., 2008).
In these cases, our approach is to treat the motion of the particles of the solute using a mesh to handle long-range forces: the use of a Poisson-Boltzmann model for long range interactions and the effects of solvation. The particles in the system evolve under action of short-range forces and the long-range effect of a potential field computed from the Poisson equation.
Our approach is different from the existing state of the art based on multiscale RESPA (Tuckerman et al., 1992) particle integrators. The implicit method changes the computational cycle by replacing a Verlet-style explicit alternation of particle motion and force computation with a coupled system of equations for both particles and fields.
In a standard particle-based method, the particles are moved for a short distance in the forces computed at the start of the time step. Then the forces are recomputed and the particles are moved again. This requires the use of very small time steps that resolve the smallest scale present to avoid numerical instability (Birdsall and Langdon, 2004;Frenkel and Smit, 2001). The implicit method, instead, solves the coupled non-linear system for particles and fields self-consistently. This allows one to move particles for longer distances in a single time step, thereby allowing the study of much larger systems for much longer times (Brackbill and Cohen, 1985).
The approach is ideally suited for situations where the long-term evolution is of interest and the short scales can be averaged over. For example, in the case of macromolecules, this would allow to quickly allow structural changes.
Recent work has demonstrated mathematically and in practice that the approach conserves energy exactly, to round off Lapenta and Markidis, 2011;Chen et al., 2011). This is a sore point in particle simulations. Multiscale methods like the widely used RESPA suffer from numerical limitations that can lead to unphysical building up of energy in the system, thereby giving rise to drifts in average properties and inaccurate sampling (Morrone et al., 2010). This effect is completely eliminated by the implicit energy conserving approach proposed here.

The Poisson-Boltzmann model
In our study we use the Poisson-Boltzmann model as basis of our investigations. We briefly review the technique we use to provide the background on the new methods we developed The Poisson Boltzmann equation describes the electrochemical potential of ions immersed in a medium (e.g. water). (1) The charge on the right hand side can come from two different sources: ions immersed in the fluid ρ i and the charges present in the medium itself that responds to the evolving conditions ρ m . A classic example is the dispersion in water that responds by polarizing its molecules.
The ions are described using computational particles. Usually these particles are not infinitesimal in size but rather are assumed to be of finite size, with a prescribed shape.
The medium charge, conversely, is defined by a continuum model. In the case of the Boltzmann model, the response of the medium is based on the Boltzmann factor: where k B is the Boltzmann constant, defined by a temperature T m and an asymptotic density at regions where the potential vanishes is ρ 0 . We can consider a more general non-linear response of the medium and generalize the Poisson-Boltzmann model to: for a generic non-linear function f . For example, a often-used formulation leads to: f (ϕ) = sinh(γϕ) where γ is a normalization parameter (Honig and Nicholls, 1995;Koehl, 2006).
The equation above forms our model for the medium. For ions, the equations of motion for the position x p and velocity v p of each ion are simply given by Newton's equations: where the mass and charge of the ionic species are indicated as usual.
The electric field acting on the ions comes from two sources: the direct interaction among the ions and the interaction mediated by the medium. This is in the spirit of the P 3 M method that splits the short-range and long-range interaction (Eastwood et al., 1984). The direct interaction is coming from particle-particle forces: where V is the inter-particle potential and the summation is to all particles (avoiding self-forces). The interaction mediated by the medium is obtained solving the Poisson equation on a grid of points ϕ g = ϕ(x g ). From the grid potential, we can compute the grid electric field and project it to the particles: where the summation is over all points of the grid. For typical interpolation functions, W (x p − x g ) only few cells contribute to the particles. We use here for interpolation bsplines of order 1 (De Boor et al., 1978). The peculiarity of particle methods is the use of interpolation functions W (x g −x p ) (g the generic grid point, center or vertex) to describe the coupling between particles and fields. We use here the Cloud-in-Cell approach, the interpolation function reads (Hockney and Eastwood, 1988): otherwise.
The equations of particle motion and the Poisson equation for the field are a non-linear coupled system of equations. The goal here is to understand how to deal with this coupling so that energy is conserved.

Explicit and Implicit Discretization of the Poisson-Boltzmann Equation
The set of equations above can be discretized in time explicitly or implicitly. The explicit approach solves in sequence in each time interval ∆t the two equations: first it solves one assuming the other frozen and then vice-versa.
Considering the equations for the computational particles with charge q p and mass m p , the explicit approach uses the so-called leap-frog (or Verlet) algorithm: The new particle position can be directly computed from the old electric field. After moving the particles, the new electric field can then be computed form Poisson's equation.
On the contrary, the implicit method uses a formulation where field and particles are advanced together within an iterative procedure where at each time step the field equation and the particle equations are solved together. The implicit mover used here is: where quantities under bar are averaged between the time step n and n + 1 (e.g.v p = (v n p + v n+1 p )/2). The new velocity at the advanced time is then simply: The electric fieldĒ p is the electric field acting of the computational particle at the mid-timē E p = (E n p + E n+1 p )/2. The electric field is computed from the direct interaction among particles and from the Poisson equation. Focusing on the latter, to reach an exactly energy conserving scheme, we reformulate the equation using explicitly the equation of charge conservation: We take the temporal partial derivative of the Poisson's equation (3) and substitute eq. (11) The last term can then be interpreted as the divergence of the current of charge in the medium: The result above allows us to rewrite the Poisson-Boltzmann model as: that can be solved by numerical discretization: where D is the discretized divergence operator and ∆E g is the variation of E g during the time step. This equation needs to be solved directly in this form. Formally, the divergence operator can be inverted (provided suitable boundary conditions) leading to: This equation is our basis for proving the exact conservation but it is not of direct practical use.
To solve the equation above one needs first to invert eq.(13), a task of comparable complexity to inverting eq. (15). We prefer the latter because it allows more easily to set boundary conditions. The solution for the field equations is then found assuming the electric field can be expressed via the potential using a discretized gradient G: and inverting the system of eqs. (15). The time dependent Poisson formulation is Galilean invariant and does not suffer from the presence of any curl component in the current (Chen et al., 2011). In electrostatic systems, the current cannot develop a curl because such curl would develop a corresponding curl of the electric field, and in consequence electromagnetic effects. The formulation above prevents that occurrence since it is based on the divergence of the current and any curl component of J is eliminated. This is not an issue in 1D but it is central in higher dimensions.

Energy Conserving Fully Implicit Method (ECFIM)
There are several energy channels and they need to balance exactly for energy to be conserved. Let us begin by the particles. Their energy change can be computed easily multiplying the momentum equation by the average between new and old velocity over the time step: For the explicit scheme this leads to: where we can recognize the current as where V g is the volume of the cell. The particle energy balance then becomes: Similarly, for the implicit method, the energy balance is: where the current is computed as: The particle energy balance then becomes: There is a crucial difference between implicit and explicit methods. The change in particle energy in the explicit method is given by the product of the old electric field with the current based on the new particle velocity. That electric field was computed with the current from the old time step. There is then an inconsistency between the current used to advance the field and that coming out of the motion of the particles. there is a temporal delay of one time step. The result is that energy is not conserved. In the implicit method, instead, the electric field and the current are computed at the same time level in both particle equations and field equation. Energy is conserved exactly.
To prove the last point, let us now multiply eq. (16) byĒ g : We recognize in this balance the variation of electric field energy on the left. The right hand side has the energy exchange between particles and fields and between medium and fields.
The requirement of energy conservation is that the energy exchange between particles and fields computed form the particles, eq. (23) exactly equals the exchange between particles and fields computed form the fields, eq. (24). Inspection immediately shows this to be the case and energy is indeed exactly conserved for the implicit scheme.
Note that this conclusion holds with respect to both particle-based and mesh-based force computation. Above we focused on the particle-mesh component. But energy conservation of the particle-particle follows directly form the formulation in terms of inter-particle potential, provided the derivatives of the potential are taken also implicitly. To this end we discretize the gradient operator as follows: where α is the direction in 3 . The anti-symmetry of the expression above ensures energy and momentum conservation for the PP part of the computation (Grigoryev et al., 2002).

Newton-Krylov Solvers
The implicit method described above produces a set of non linear equations, formed by the momentum equation of each particle (eq.(9)) and the Poisson-Boltzmann equation for the variation of the potential (eq.(15)): These two residual equations are supplemented by a number of definitions: the calculation of the current from the particles and the calculation of the particle's electric field form the potential computed on the grid. These steps are part of the residual calculation but are not unknowns proper. The unknowns are ∆ϕ g and v p . All other quantities can be considered as derived constitutive relations that are not part of the unknowns variable of the coupled non linear system. The position can be computed immediately and linearly once the velocity is known. Once the position is known the current and density can be computed directly again linearly. As a consequence the only two sets of equations for the coupled non linear system are that for ϕ g and v p .
To solve this non linear system we use the Jacobian-Free Newton Krylov (JFNK) approach (Kelley, 1995;Knoll and Keyes, 2004). In this approach, the system of non linear equations is solved with the Newton method. Each iteration of the Newton method requires us to solve the linearized problem around the previous iteration for the solution. In JFNK, this step is completed numerically using a Krylov solver (we use GMRES) where the Jacobian is not directly computed but rather only its products with Krylov vectors are computed. In practice, this means that the Jacobian never needs to be formulated or completely computed and only successive realizations of the residual need to be available.
The advantage is that the JFNK method can be used as a black box that takes as input a method that defines the residual and as output it provides the solution. Many effective JFNK packages are available in the literature and most computing environment provide one. We use matlab.
In our case then the two residual equations are that for ϕ g and v p . JFNK provides a sequence of guesses for ϕ g and v p produced by the Newton iteration strategy, the user needs to provide a method for the residual evaluation given the guess. The final output is the converged solution for ϕ g and v p that makes the residual smaller than a prescribed tolerance. The size of the problem treated by JFNK is equal to the number of particle unknowns (3N p one for each velocity component) and of the potential unknowns (N g ). The approach is indicated in fig. 1 The JFNK uses two types of iterations, the inner Krylov iteration for the Jacobian equation and the outer Newton iteration. What counts at the end is how many residual evaluations are required for convergence. The larger the number of evaluations, of course, the larger the computational effort.
Besides the direct implementation described above, in our recent research we have explored other alternatives Siddi et al. (2018), possibly preferable in the strategy to reduce the memory requirements or the computational time. These methods modify the residual equations, pre-computing part of the solution to eliminate part of the complexity of the non linear coupled system so that the JFNK method can converge more easily. Figure 1. JFNK approach: the method interacts with the user provided residual evaluation that must supply to JFNK the residual error in the equations for a given guessed unknown variable. At convergence the residual will be smaller than a prescribed tolerance.
The first approach proposed in the context of implicit particle methods is that of particle hiding (PH) (Kim et al., 2005). In particle hiding, the unknowns of the problem are only the values of the potential on the grid points at the new time level and the JFNK solver computes only the residual of the field equations. The particle equations of motion are calculated by a separate Newton-Raphson method and embedded in the field solver as function evaluations. When the JFNK provides a guess of the new potentials, particle positions and velocities are computed consistently with the electromagnetic field guessed, their current are then passed back to the residual evaluation for computing the field residuals required by JFNK.
In essence the idea is that if the JFNK provides a guess for the electric field, the solution of the particle equations of motion does not require any non linear iterations: the particles can be moved in the guessed fields, no iteration needed. In this sense the particles become a constitutive function evaluation. Each time the JFNK requires a residual evaluation for a guessed potential (and consequently electric field) in each grid point, the particles are first moved, the density and current are interpolated to the grid and the residual equations for the electric field can then be computed. This approach reduces dramatically the number of non linear equations that are solved (only N g potential equations), but at the cost of moving the particles many times, once for each residual evaluation. The main advantage of this method is the reduction of the memory requirements of the Krylov method because the particles have been brought out of the Krylov loop and only the filed quantities matter when computing the memory requirement. This approach is especially suitable to hybrid architectures (Chen et al., 2012) and can be made most competitive when fluid-based preconditioning is used Chen et al. (2014). However, in the present case of a low dimensionality problem run on standard CPU computers, particle hiding is neither needed nor competitive. In a recent study, we proposed an alternative approach: field hiding (FH) Siddi et al. (2018). As the name suggests, the crucial difference is that in FH the JFNK operates directly on the particle mover, making now the field computation part of the residual evaluation. When the Newton iteration produces a guess of the particle velocities, the fields can be immediately computed via an evaluation of the current from the particles and solving the field equations. Given that typically there are at least two orders of magnitude fewer fields than particle unknowns, the cost of a field evaluation is typically very small compared with moving particles.
The advantage of FH is that the JFNK method operates directly on the most sensitive part of the system, the particles. When FH and PH are compared, the most striking difference is the much lower number of Newton iterations needed for FH Siddi et al. (2018) (it should be pointed out that fluid preconditioning can change this result Chen et al. (2014)). The reason for this result is that in PH the JFNK is trying to converge using a much less sensitive leverage. By acting directly on the particles, FH lets JFNK operate on the levers that are most sensitive. The fields at one particle depend only on the fields of the nearby nodes, while the fields in the system depend on all the particles and their motion. In other words, the field equation is elliptic, coupling the whole system. As a consequence, converging a Newton method on the fields requires many more iterations, and in each one the particles need to be moved. For this reason the present study focused on FH.
In Ref. Siddi et al. (2018) a third option is also proposed: replacing the JFNK method for the particle residual in the FH strategy with a Direct Newton-Schwarz (DNS) approach. In this case, the equation for each particle is iterated independently of the others, assuming a Schwarz-like decomposition of the Jacobian. The idea is that particle-particle coupling is of secondary importance to particle-field coupling in terms of convergence of the scheme. In the field-hiding approach the full Jacobian of the residual of particle p depends not only on the particle p itself but also on the others: The dependence of the residual of particle p, R p on the velocity of other particles is mediated by the fields that are hidden in the field solver. They are hidden but are non-zero. In the Direct Newton-Schwarz approach proposed by Ref. Siddi et al. (2018), this coupling is approximated with a Picard iteration and the Jacobian of each particle is approximated as: The formulas above are for the simpler 1D case for simplicity of notation but they are valid in 3D by interpreting v as all components of the velocity. In 3D, the Jacobian of each particle is just a 3x3 matrix and there is no need for Krylov solvers to invert it: it can be inverted directly. For this reason we call the method Direct Newton Schwarz (rather than Newton Krylov Schwarz). The DNS is based on a strong simplification of the Jacobian and it is subject to the risk that the assumption might become invalid and the Picard iteration might be slow or even stall. We will see that indeed this is the case as the simulation size increases.
Regardless of the implementation of the JFNK, all the methods considered conserve energy exactly, to machine precision (provided the Newton method convergence is sufficiently tight) and are absolutely identical in the accuracy of the results produced for the cases tested. They only differ in the computing performance. We focus then on comparing the different implementations in terms of CPU time used.

Results
To test energy conservation and the computational implementation of the JFNK methods, we report the result of one sample problem. We initialize a simulation with two streams of positive ions immersed in a uniform solvating medium of density ρ 0 , behaving according to the Boltzmann factor (for negative electrons of charge −e): (29) f (ϕ) = exp(eϕ/kT ) with uniform medium temperature T (in general different from the kinetic temperature of the ions). In this case, the non linear Botlzmann term gives a current: Two possible options can be implemented. In the first (Method 1), the non-linear term is computed at the middle of the time step, to retain second order accuracy: This approach leads to a non-linear field equation becauseφ g = (ϕ n+1 g − ϕ n g )/2. But this poses no particular complexity since the overall system made by particles and fields is non-linear anyway and this non-linearity is handled by the JFNK. Of course, the extra non-linearity might result in more iterations, and we will see in the results that this is indeed the case.
In the second (Method 2), the non-linearity is simplified with a time decentering at the beginning of the time step: avoiding non-linearity in the field equation for ϕ n+1 g . Both choices conserve energy exactly because they just express differently the energy exchanged with the medium but do not affect the exchange of energy between particles and fields. Method 2 is simpler to compute and as it will be shown below it still retains a comparable accuracy in practice. For that reason it is preferable in the case considered. This conclusion is valid for the present case and it might not hold in other problems. The first approach insuring non-linear consistency to second order might be advantageous in problems where non-linear balance between opposing terms are present (Knoll and Keyes, 2004): this is not the case here.
We present the results using adimensional quantities. The 1D system has size L = 20π run for a total time of ω p T = 250 using a total number of 800 cycles. The system is discretized in 256 cells using 40000 particles. This test is just a proof of principle and it is not meant as a production run for a full code.
The two beams have speed v b = ±0.2 with a thermal spread of V th = 0.01. The evolution leads to the formation of electrostatic shocks (sometimes referred to as double layers) (Forslund and Shonk, 1970). This is a sharp difference from the case of two-stream instabilities of electrons. Here the physics is completely different because the medium described by the Boltzmann factor has a strong impact on the evolution.
For the Boltzmann factor we choose the temperature corresponding to a Debye length of λ D = 0 kT /ne 2 = 10 −3 L (i.e. e/kT = 253.3030). Note that physically this is the electron temperature, in principle completely unrelated to the ion kinetic temperature that determines the thermal speed of the ions. Figure 3-a shows the final state of the run for Method 1. As can be observed, the evolution leads to the formation of double layers (sometimes referred to as electrostatic shocks) (Forslund and Shonk, 1970). Figure 3-b shows the electrostatic potential at the end of the run. Sharp potential jumps are present in correspondence with each shock.
The same case is repeated with Method 2 where the non-linearity of the field equations is simplified by using the potential at the n time level. Figure 4 shows the results. As can be seen, the differences are in panel a where the distribution function is shown are visible only in small details. Similar conclusions can be reached for panels b where the potential is also similar, with the same peaks (although the structure of the peaks varies slightly). There are clearly the same shock and same patterns. The differences however are quite significant in the number of iterations needed: the more non-linear case requires obviously more iterations.
But is the solution correct? We have two indications that it is. First, we have conducted a convergence study varying the number of cells and particles. The results presented are converged in the sense that the location of the features in phase-space does not change. Second, the presence of shock-like features in conditions similar to those reported here also led to similar phase space features (Forslund and Shonk, 1970). We reached therefore confidence that teh solution is correct. Besides the interesting physics accurately resolved, the point of the test is evaluating energy conservation. Energy is being exchanged between particles, medium and fields but the total energy is exactly conserved, see Fig. 3-c and 4-c. The solution requires a tolerance in the iteration of the NK method. We set a tolerance of 10 −14 and we indeed find energy conservation to be within the tolerance set, as shown in Fig. 3-d and Fig. 4-d. If the tolerance level is modified so is the level of confidence on the energy conservation. Both methods considered produce exact energy conservation.
For the two cases reported, the number of iterations at each cycle is shown in Fig. 3-e and Fig. 4-e. The number of iterations has a remarkable increase corresponding to the time when the shocks are first formed. Method 1 with the explicit nonlinearity of the field equation requires more iterations by a factor of approximately 50% more, a non-negligible effect.
In the comparison above of Method 1 and Method 2 we have used the JFNK non-linear solver. The physical results are independent of the non-linear solver but the computational cost of reaching those results depend on the non-linear solver used: finding the most efficient is the target of the next section.
8. Performance of Field-Hiding and Direct Newton Schwarz As described above, the performance of the JFNK might be increased if field-hiding or the DNS method is used. We focus now on Method 2 for two reasons. First, it was shown to be accurate and more computationally efficient in the test above. Second, it is simpler to implement field-hiding when the equations for the fields are formally linear.
The latter point requires clarification. When field-hiding is used, for each Newton iteration applied to the particles, the fields need to be computed using the current Newton iterate of the particle velocity (and indirectly position). This operation can be done with a linear solver if the field equations are linear but it can also be done with another nonlinear solver if the equations are non-linear. The first method nests a linear solver inside the function evaluation for the residual of the particles, the second nests a non-linear (for example another independent JFNK) solver for the field equation. Both can be done but of course the first is simpler.
To evaluate the performance of the three non-linear solvers, we consider a series of progressively larger systems. The following parameters are held fixed in all runs: time step ∆t = 1.25, the grid spacing ∆x = 20π/64 using 10000/64 particles per cell, medium temperature e/kT = 253.3030. We change however the system size by multiples M of L = 20πM . We then use 10000M particles and 64M cells in each run. As M is varied, we compare the number of iterations and the total CPU time for the direct solution with JFNK including both field and particle residual (referred to as vanilla JFNK) with that of field-hiding and DNS. Figure 6 shows the average number of Newton iterations needed for the three solution strategies. The two methods based on the JFNK iteration perform nearly ideally, with no significant variation on the number of Newton iterations needed as M increases. The DNS performs well only for small systems, when the scaling M increases the number of iterations balloons and the method fails. This is due to the slow convergence caused by the approximation of the Jacobian.
The difference in iterations between FH and vJFNK is strong and compelling. The FH solution is clearly preferable.
The number of iterations correspond to a similar nearly ideal performance in CPU time for vJFNK and FH. Again the DNS looses ground on larger systems. FH results in almost an order of magnitude gain in computational time.
All tests were done on a dual Intel(R) Xeon(R) CPU E5-2630 0 at 2.30GHz with cache size 15360 KB with 64GB RAM memory. Note that in the analysis above we used the same tolerance for all three non-linear solvers to enforce the exact same accuracy in all three solutions that are therefore indistinguishable from the physical point of view. The difference is only in the number of iterations and the cost of the simulation. But the accuracy of the end result is the same. This is especially true for the DNS where the Jacobian is approximated. Approximating the Jacobian in the Newton method can slower the rate of convergence and the direction used to compute the next guess might become sub-optimal. However, the solution obtained is still exact, it is just might require more iterations. Figure 6. Scaling study for the number of Newton iterations needed for the vanilla JFNK and the field-hiding strategy.

Conclusions
We extended the recent developments on fully implicit energy conserving particle methods to the case where long-range forces are mediated by a medium with very different time and space scale than the inter-particle forces. This is the case of the Poisson-Boltzmann equation that describes a variety of physical systems.
We designed a new implicit method that is shown to conserve energy exactly to round off. The method is based on the solution with the JFNK method of non-linear system of equations composed by the momentum equation for each particle (Newton's law) and the equation of the medium (Boltzmann factor included in Poisson's equation). The challenge of the new approach is the computational cost. We investigate two strategies for reducing the number of iterations needed by hiding the field solution within the residual evaluation of the particles.
In the FH approach, we still use the JFNK methods but applied only to the particle residual. At each new Newton iteration for the particle velocities provided the fields are computed directly (using a direct solver). This avoids the need to include also the field equations as part of the residual evaluation (a strategy referred to as field hiding).
In the DNS approach, even the JFNK for the particles is removed. We compute analytically the Schwarz decomposed Jacobian of each particles and invert it analytically. The coupling between particles is handled as a Picard iteration. The method does not require any extra storage besides one copy of the particle information. This is a strong reduction from the JFNK approach that needs to keep in memory multiple Krylov vectors with the same dimension of the residual (that is the size of the number of particle unknowns).
The three methods differ only in the solution strategy and the number of iterations but the solution is the same.
We tested our approach in the case of two ion beams moving in a medium of Boltzmanndistributed electrons: a configuration leading to multiple interacting double layers (electrostatic shocks). A very taxing test for the methods. The field equations have been discretized in two ways, one with a stronger non-linearity that assumes the medium response at mid time level (Method 1) and one with weaker non-linearity that treats the medium at the beginning of the time level (Method 2).
We show both to conserve energy and to lead to virtually identical results, but with a distinct advantage for the weaker non-linear discretization that evaluated the medium at the beginning of the time step.
We then focused on the Method 2 and compared the JFNK approach based on including all residuals in the evaluation with FH and DNS. Both methods based on a JFNK approach show a nearly ideal scaling, with the number of iterations remaining independent of the system size and the computational cost increasing linearly. On the other hand, the DNS starts to fail as the system size increases: the number of iterations increase and the CPU time increases superlinearly.