Statistical Error for Cosmic Rays Modulation Evaluated by SDE Backward in Time Method for 1D Model

: The propagation of cosmic rays through the heliosphere has been solved for more than half a century by stochastic methods based on Ito’s lemma. This work presents the estimation of statistical error of solution of Fokker–Planck equation by the 1D backward in time stochastic differential equations method. The error dependence on simulation statistics and energy is presented for different combinations of input parameters. The 1% precision criterion in mean value units of intensity standard deviation is deﬁned as a function of solar wind velocity and diffusion coefﬁcient value.


Introduction
The Sun produces and radiates out a charged particle flow, which is called the solar wind. The solar wind propagates through the solar system and continues to the moment when the solar wind and the interstellar wind pressure are balanced. This spherical region, which has a radius that is equal approximately 100 AU, is called the heliosphere. The socalled solar modulation process begins when the galactic cosmic rays (GCRs) reach the heliosphere's boundary, which presents a decrease of GCR intensity inside the heliosphere (mostly for particles with energies less than 30 GeV). At the time of solar modulation, GCR particles interact with magnetic irregularities in the solar wind. This process can be approximated as a diffusion combined with convection and adiabatic energy losses. The particle randomly walking between these irregularities. These irregularities move with the velocity of the solar wind. Consequently, the GCR's intensity is strongly anticorrelated concerning solar activity, and is also influenced by the interplanetary magnetic field polarity and GCR particle charge sign. Parker [1] introduced a widely-used equation to describe GCR propagation inside the heliosphere. One of the most precise methods to solve this equation is the so-called stochastic integration method. Using this approach, we can evaluate the solution calculated as the associated set of stochastic differential equations (SDEs), "forward-in-time" or "backward-in-time". The stochastic approach to solving the Parker transport equation is based on Ito's lemma; however, first forward-in-time stochastic solutions as [2] were based on [3] (Equation (223)). Backward-in-time was introduced in article [4], and precisely described, for example, in [5]. The backward-in-time SDE method was widely applied in studying galactic cosmic rays modulation [6][7][8][9][10][11][12][13][14]. SDE method is very nicely and deeply explained in Hitch-hiker's Guide to Stochastic Differential Equations [15,16].
This article used the backward-in-time stochastic integration approach for all of the simulations. In this approach, quasi-particle objects were injected at the registration boundary inside the heliosphere. They then move back in time through the heliosphere until they reached the heliosphere outer boundary. Both forward and backward approaches descend from the Kolmogorov forward and Kolmogorov backward equations, which is the reason why they have the same mathematical description. Results of both methods are compared in [16][17][18].
The problem of cosmic ray modulation is intrinsically a time-dependent problem. Hysteresis effect appears in cosmic rays intensities and solar parameters dependencies, as sunspot numbers, 10.7 cm solar flux, and other solar parameters.
In this paper, we evaluate the solution of the Parker equation using the backward-intime stochastic integration approach. We focus on the estimation of statistical error for 1D backward-in-time stochastic differential equations method. Preliminary analysis of statistical error for 1D forward-in-time method we present in article [19]. Scaling study of the SDE approach application to cosmic ray modulations showing the influence of a different number of injected particles on realistic test problem is presented in [20]. In this article, we present an explanation of how statistical error in the backward SDE method appears and show how someone with his own SDE code, could fastly estimate the dependence of statistical error of his own method on a number of injected particles. Additionally, we focus on the development of a method allowing a scan of the parameter space of the model to evaluate statistical error dependence on the model's parameters.
For the sake of clarity, this article has focused on the solution of the 1D Parker equation. The solutions that are presented were evaluated for different sets of input parameters. The injection energy range was taken to be from 1 GeV up to 100 GeV.

Model Description
In 1965, Parker proposed a transport equation that describes the GCR distribution inside the heliosphere [1]. The equation that describes the solar modulation and particle propagation process in the heliosphere can be written through an omnidirectional distribution function f ( x, p), with particle momentum p in the following form: Here V = V sw + V dri f t , V sw is solar wind velocity, V dri f t is the particle magnetic drift velocity, x is the 3-D spatial position in Cartesian coordinates, andK is the diffusion tensor. The differential intensity J is related to f as J = p 2 f .
The stochastic SDE set integrations should be performed in a Euclidean space. The set of the spatial position could be pronounced as a set of Cartesian coordinates [5]. In 2010, Pei et al. found that spherical coordinates can be successfully applied [21]. For 1D representation of the heliosphere in spherical coordinates, and for the case where all parameters depend only on radius and energy, the Parker equations can be simplified to the following form [17,22]: Where the diffusion tensorK was simplified to be a scalar K = K 0 βP, here K 0 is the diffusion parameter, β is the particle velocity in speed of light units, and P is the rigidity in gigavolt units. Given that the magnetic field is assumed to be spherically symmetrical in 1D representation, the magnetic drift velocity in the radial direction is equal to zero. Then drift velocity present in the 3D representation has zero value in the 1D representation. The main difference between 1D and 2D or 3D models is the presence of drift in 2D and 3D models. Particle drift caused by the large-scale heliospheric magnetic field is evaluated from the anti-symmetric component of the diffusion tensor [2,23,24]. Comparison of 1D, 2D, and 3D models could be found in [21].

Forward-in-Time Stochastic Integrations
The stochastic path of a particle in the forward-in-time approach is described by SDE (12) (13) and (14) in [17]. In this approach, a quasi-particle injected at the heliosphere's border (for all of the simulations presented in this work) is spherical with a radius of 100 AU and it does not have any structure (i.e., termination shock, heliosheath, heliopause, or bow shock). For each step of time, the particle lost its energy (momentum) by the value calculated in Equation (13). Every time that the particle crosses a 1 AU registration radius, its actual energy and position are registered to an appropriated energy bin. The position at 0.01 AU was set as an inner reflecting boundary in all of the presented simulations (so-called mirroring). As pointed out by [21], the quasi-particle object is not a real particle but is simply a point in phase space.

Backward-in-Time Stochastic Integrations
The modulated spectra at 1 AU position were evaluated using the procedure described in [17] with LIS from [22], and thus the differential intensity was taken to be J ∝ p(m 2 c 4 + p 2 c 2 ) −1.85 . The time step for all of the backward-in-time stochastic integrations that are presented in this article is constant and taken to be ∆t = 5 s. In the presented article, all backward-in-time stochastic integrations are described by SDE (18) (19) and (20) in [17]. In the case of the backward-in-time approach, quasi-particle starts its way from the registration (target) position (in all simulations that were presented in the article, this position was taken to be 1 AU) and propagate to the heliosphere boundary. Unlike from forward-in-time approach, in backward-in-time case particle at every moment of time gain momentum by values evaluated by Equation (19) in [17].

Statistical Error for Selected Energies, Backward-in-Time Method
The situation in the backward-in-time method is more straightforward than the forward-in-time approach [19]. Every simulated quasiparticle is registered in the backwardin-time approach. It means that every quasiparticle traced back in time from the injection position in the heliosphere eventually reaches the border of the heliosphere.
We simulated ten separate sets of one million quasiparticles with energy 5 GeV at 1 AU to illustrate how an error in the backward in time method change with a number of injected particles. Every trajectory was injected at 1 AU with kinetic energy 5 GeV and propagated through the heliosphere backward-in time till the trajectory cross heliosphere border. The kinetic energy of the quasiparticle at the heliosphere border was recorded. Then, having a big set of simulated trajectories, we evaluated intensity first from a small part of quasiparticle trajectories and then again from a bigger part, and again from an even bigger part, repeating this process for incrementally bigger sets. Specifically, we start with an intensity at 1 AU evaluated from 1000 quasiparticle trajectories (taking the first 1000 quasiparticles from a set of a million simulated quasiparticles), then we evaluated intensity for 2000 quasiparticles (taking the first 2000 simulated quasiparticles), then for 3000 quasiparticles, etc., with step one being 1000 quasiparticles, till one million quasiparticles. We repeat the same procedure for all ten sets with one million quasiparticles. By this, we want to show an example of how evaluated intensity changes with an increasing number of used quasiparticles. Figure 1 shows how evaluated intensity of 5 GeV particles at 1 AU changes with a number of used quasiparticles. Results for every simulated set of million quasiparticles is marked by different collor. The convergence of evaluated intensities with an increasing number of used quasiparticles is clearly seen. The intensity in the figure is normalized to one (i.e., divided by mean intensity for all 10 million quasiparticles). To show statistical error, we divided an obtained set of ten million quasiparticles to subsets with N s quasiparticles. For example, for N s = 1000 we get 10,000 subsets (from 10 million simulated quasiparticles). From every subset with N s = 1000 quasiparticles we evaluate an average value of intensity I Ns . Resulted 10,000 values of I N s has normal distribution as is presented in Figure 2. Red squares shows evaluated histogram of I Ns distribution. Every subset of N s quasiparticles could be considered as random sampling from whole set, because every evaluation of quasiparticles trajectories is independent from each other evaluations.
Normal distribution appears here due to the central limit theorem as a result of averaging intensities of N s quasiparticles in subsets. Intensities of ten million simulated quasiparticles do not have a normal distribution. Only averaged values have a normal distribution. We evaluated mean µ Ns and standard deviation σ Ns for a set of 10,000 values of I Ns to describe obtained normal distribution. Resulted probability distribution function curve (red line in the figure) evaluated from µ Ns and σ Ns , as expected, nicely fit histogram of distribution. Standard deviation σ Ns for this example, for N s = 1000 is 0.014 in µ units i.e., 1.4 percent from µ value. ±1σ range is shown in the figure. This is an example for input model parameters K 0 and V set to approximate AMS-02 measurements in a period of May 2011. General dependency of a statistical error on K 0 and V is presented by us later in the text.
Horizontal dash black lines in Figure 1 show µ ± nσ for n = 1, 2. As expected, almost all results in example in Figure 1 are in 2σ range around µ value (because we have normal distribution here, 95.45% of all cases should be in µ ± 2σ). Figure 2 shows also evaluation for N s = 100 (here we have 10 5 values of I Ns ) shown in blue color. As expected distribution for N s = 100 is wider, and has value σ Ns 0.045 in µ units. The dependency of σ Ns at N s for sets with normal distribution should have power law shape with slope −1/2. To show it we evaluated a σ Ns for set of different N s with logarithmical step. In Figure 3. we show dependency of σ on N s for evaluation of intensities for two energies Tkin equal 5 GeV and 1 GeV. Evaluated σ values in the percent of µ units are presented by points. Linear fit of both dependencies in log-log scale has slope with value very close to expected value −1/2, i.e., σ ∼ N −1/2 s . σ values evaluated for same N s have higher values for lower energies. To reach same statistical error for lower energies, higher number of simulated particles, i.e., higher N s is needed. To investigate how σ depend on energy for wider range of energies we simulated 10 5 quasiparticles for every energy and average them by groups of 100 quasiparticles (N s = 100) to evaluate σ for every energy from one thousand intensities. We realize simulations for energies from 1 GeV to 100 GeV for three combinations of input values of diffusion coefficient K 0 = 3 × 10 18 , 5 × 10 18 , 7 × 10 18 m 2 /s and solar wind velocity V = 400 km/s. Results are presented in Figure 4 where σ values in the percent µ units are shown in red color for K 0 = 3 × 10 18 m 2 /s, for K 0 = 5 × 10 18 m 2 /s in black color and for K 0 = 7 × 10 18 m 2 /s in blue color. Range of diffusion values was selected to cover usual values used to simulate measured spectra at 1 AU by 1D SDE models. The dependency, where σ is bigger for smaller energies presented in Figure 3 is more clearly visible in Figure 4. One could notice that with smaller values of diffusion coefficients are σ values bigger. We could notice, that for example for 1 GeV particles simulation with 100 quasiparticles (N s = 100) has σ value close to 3 percent in µ units for K 0 = 3 × 10 18 m 2 /s (signed by red in figure) and 6 percent for K 0 = 7 × 10 18 m 2 /s (signed by blue color in figure). For 10 GeV particles, we get from simulation with 100 quasiparticles σ close to 2 percent for K 0 = 3 × 10 18 m 2 /s and close to 1 percent for K 0 = 7 × 10 18 m 2 /s. How exactly depend σ on diffusion coefficient, which could be seen in Figure 5 where we show dependencies for selected energies 1, 3, 10, 30 and 100 GeV. Dependencies have power law shape with slopes very close to value −1.0 for higher energies (over the few tens of GeV), i.e., σ ∼ K −1.0 0 . For smaller energies, in a region with significant solar modulation, slope value decreases, for example, for energy 1 GeV is close to −0.8, i.e., σ ∼ K −0.8 0 . With knowledge that σ has power law dependency on N S with slope −1/2 (consequence of central limit theorem and resulted normal distributions of averaged intensities, then σ is standard error of the mean) and with simulation for one N S we could evaluate σ values for other N S values. Thus, we could evaluate how many quasiparticles we need to simulate and reach statistical error at a level of 1 percent. In other words, to reach σ equals 1 percent of mean value µ. Results for three values of diffusion coefficient K 0 = 3 × 10 18 , 5 × 10 18 , 7 × 10 18 m 2 /s, solar wind velocity V = 400 km/s and five selected energies T kin = 1, 5, 10, 20, 50 GeV are shown in Table 1.
We could notice range of needed simulation from a few to a couple of thousands. For diffusion coefficient K 0 = 3 × 10 18 m 2 /s, we need only 12 particles to reach σ equal 1 percent in µ units for particles with energy 50 GeV. However, for energy 1 GeV we need to simulate almost four thousand particles to reach the same σ value. While for diffusion coefficient K 0 = 7 × 10 18 m 2 /s for 50 GeV we need only two particles to have σ at one percent level, and for 1 GeV we need almost one thousand of simulated particles. As a consequence, we could set simple simulation strategies to keep statistical error of simulation limited to σ at a 1 percent level, i.e., get evaluated intensities for all energies with statistical errors σ smaller than one percent. For the case presented in Table 1, i.e., for solar wind velocity 400 km/s it could be, for example, a safe strategy where we simulate for every energy 10 4 /T kin quasiparticles, where T kin is in GeV.
Using this strategy we reach for all energies σ < 1%, thus we will have 68 percent of results in ±1% of µ and 95 percent of results in ±2% of µ. Here, µ practically means value which we get for simulations with very high numbers of quasiparticles.  Both input parameters, diffusion coefficient and solar wind velocity have influence to statistical error expressed by σ. Dependency of σ on solar wind velocity is presented in Figure 6. In this figure we show evaluation for solar wind velocities V = 300, 400, 500 km/s and diffusion coefficient K 0 = 5 × 10 18 m 2 /s. Evaluation of σ was done for the same parameter N S = 100 and same number of simulated quasipatricles, 10 5 , for every energy from range of energies from 1 GeV to 100 GeV with the 1 GeV energy step. σ increases with higher values of solar wind speed. Statistical error is higher, when less particles reach the position at 1 AU. Interestingly, dependence of σ on solar wind velocity at high energies has slope closely equal to 1.0, i.e., σ ∼ V 1.0 as could be seen in Figure 7. For smaller energies, where solar modulation is bigger, for example, for energy 1 Gev slope is close to 0.8, i.e., σ ∼ V 0.8 . As we present previously for Table 1, where we did it for different diffusion coefficients and fixed solar wind velocity, we could evaluate σ values to get N S values to reach statistical error at level of 1 percent of µ for fixed value of diffusion coefficient and varied V value. Here we use simulations with K 0 = 5 × 10 18 m 2 /s and vary solar wind velocities with V = 300, 400, 500 km/s. Used range of solar wind velocities is usual range of averaged values from periods close to one Bartols rotation, for which experimental results are usually published. Results are presented in Table 2. One could notice that for solar wind velocity V = 300 km/s for 50 GeV we need only three particles to have σ at one percent level, and for 1 GeV we need more than one thousand of simulated particles. For solar wind velocity V = 500 km/s for 50 GeV we need eight particles to have σ at one percent level, and for 1 GeV we need two and half thousand of simulated particles. With faster solar wind we need more simulated quasiparticles to reach the same statistical error. Safe simulation strategy for varied diffusion values and V fixed at 400 km/s mentioned previously here, with 10 4 /T kin particles simulated for energy T kin in GeV, stay still valid when we vary solar wind velocity. The combination of solar wind velocity and diffusion coefficient with the biggest statistical error is K 0 = 3 × 10 18 m 2 /s and V = 500 km/s. For this combination, we need 4895 simulated quasiparticles with kinetic energy 1 GeV to reach σ at level one percent from µ. Thus strategy with 10 4 /T kin particles simulated for energy T kin in GeV could be considered as safe.  The turbulence spectra have also an effect on statistical error in backward-in-time SDE method. Diffusion coefficient as a function of particle rigidity presented in the previous part depends on rigidity linearly (K = K 0 βP 1.0 ). However, dependency on rigidity could have a different slope K = K 0 βP α with α bigger or smaller than one. We tested influence of different values of α to statistical error for range from α = 0.6 to α = 1.4. To test influence of these α values to statistical error we used the same method as before and evaluate σ for N S = 100, K 0 = 5 × 10 18 m 2 /s and V = 400 km/s. Results for the range of kinetic energies from 1 Gev to 100 Gev are shown in Figure 8. As inspection of the figure shows, with smaller α values we have bigger statistical error. The difference between σ values for different α raise with kinetic energy i.e., is smallest for small energies. We used results presented in Figure 8 to evaluate number of simulated quasiparticles needed to reach statistical error σ equaling 1 percent from intensity mean value µ at different energies. Results for selected energies are shown in Table 3. for two situations. The first one with typical values of the diffusion coefficient and solar wind velocity, for K 0 = 5 × 10 18 m 2 /s & V = 400 km/s which we call average situation. The second one for combination of input parameters with maximal statistical error, is for K 0 = 3 × 10 18 m 2 /s & V = 500 km/s. In a situation, with a maximum statistical error, we need for kinetic energy 1 GeV ∼six thousand simulated quasiparticles to reach σ at level 1 percent of µ. Recommended strategy with 10 4 /T kin particles simulated for energy T kin in GeV is still safe for parameters evaluated in Table 3. To compare our suggested strategy with other authors we could reference only [25] and the current article [26]. The first author generally mentions that typically a few thousand stochastic processes can yield a result with better than 3% error, without specifying energies or ranges of input parameters used for modeling. In the second authors, they found that the generation of N ∼ 2 × 10 3 pseudoparticles for each energy bin is sufficient for being not dominated by SDE-related uncertainties. Which is generally consistent with our recommended safe simulation strategy.  Let us note, that all previously presented results are independent of the time step of simulation ∆t value when keeping condition for dominancy of diffusion over convection valid. The results were tested in simulations with different time steps and the results were very similar.

Conclusions
The statistical error of the SDE backward-in-time 1D approach of Parker's equation solution was analyzed. We show that intensities evaluated with this approach have, due to central limit theorem, normal distribution. We present that standard deviation σ of intensities normal distribution depend on a number of averaged intensities, i.e., on a number of simulated quasiparticles, with power law dependency with slope −1/2. Dependency of σ on input parameters of the simulation, diffusion coefficient K 0 , and solar wind speed V was studied. Dependency of σ on K 0 has power law shape with slope −1.0 at high energies and higher, but close to −1.0 values for lower energies from the region where solar modulation is maximal, i.e., for 1 GeV slope close to −0.8. Dependency of σ on V has power law shape with slope 1.0 at high energies and smaller, but close to 1.0 values for lower energies, i.e., for 1 GeV slope close to 0.8. Based on obtained results, the safe simulation strategy for spectra evaluation at 1AU was suggested. Simulate for every energy 10 4 /T kin quasiparticles (T kin in GeV). This strategy leads to results where we have for every energy in the evaluated spectrum statistical error σ smaller than 1 percent of µ value for normal distribution of intensities for selected energy.