1. Introduction
High-speed Monte Carlo simulations are used across a broad spectrum of applications, from mathematics to economics. As input for such simulations, the probability distributions are usually generated by pseudo-random number sampling, a method derived from the work of John von Neumann in 1951 [1]. In the era of “big data”, such methods have to be fast and reliable, and a sign of this necessity was the release of Quside’s inaugural processing unit in 2023 [2]. However, these samplings need to be cross-validated by exact methods, and for this, knowledge of the analytical functions that describe the stochastic processes, among them the error function, is of tremendous importance.
By definition, a function is called analytic if it is locally given by a converging Taylor series expansion. Even if a function itself is not found to be analytic, its inverse could be analytic. The error function can be given analytically, and one such analytic expression is the integral representation found by Craig in 1991 [3]. Craig mentioned this representation only briefly and did not provide a derivation of it. Since then, a few derivations of this formula have been published [4,5,6]. In Section 2, we describe an additional one that is based on the same geometric considerations as employed in [7]. In Section 3, we provide the series expansion for Craig’s integral representation and show the rapid convergence of this series.
For the inverse error function, the standard references on special functions (e.g., [8]) do not reveal such an analytic property. Instead, this function has to be approximated. Known approximations date back to the late 1960s and early 1970s [9,10] and include semi-analytical approximations by asymptotic expansion (e.g., [11,12,13,14,15,16]). Using the same geometric considerations, as shown in Section 4, we developed a couple of useful approximations that can easily be implemented in different computer languages, and we report their deviations from an exact treatment. In Section 5, we discuss our results and evaluate the CPU time. Section 6 contains our conclusions.
  2. Derivation of Craig’s Integral Representation
The authors of [
7] provided an approximation for the integral over the Gaussian standard normal distribution that is obtained by geometric considerations and is related to the cumulative distribution function via 
, where 
 is the Laplace function. The same considerations apply to the error function 
 that is related to 
 via
      
Translating the results of [
7] into the error function, we obtained the approximation of order 
p by the following:
      where the 
 values 
 (
) are found in the intervals between 
 and 
. A method for selecting those values was extensively described in [
7], where the authors showed the following:
      for 
. With 
 times larger precision, the following was expressed:
      for 
 and 
. For the parameters 
 of the upper limits of those intervals, we calculated the deviation by the following:
Given the values 
 with 
, with the limit 
, the sum over 
n in Equation (
2) could be replaced by an integral with measure 
 to obtain the following:
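As a numerical illustration of this limit, the sum over n can be compared with the integral it approaches. The following is a minimal sketch, not the authors' code. It assumes the polar-coordinate form erf(t)^2 = 1 - (4/π) ∫_0^{π/4} exp(-t^2/cos^2 φ) dφ for Craig's representation (the notation of Equation (6) may differ), and it uses equally spaced midpoint angles φ_n as a hypothetical choice for the p sample points instead of the values selected in [7]. The function name `erf_squared_sum` is likewise an assumption.

```python
import math


def erf_squared_sum(t: float, p: int) -> float:
    """Approximate erf(t)**2 by an average of p exponentials exp(-k_n * t**2).

    Assumption: k_n = 1 / cos(phi_n)**2 with midpoint angles phi_n in
    [0, pi/4]. This is the midpoint rule for the integral form quoted in
    the lead-in, not the parameter selection described in [7].
    """
    step = (math.pi / 4) / p
    total = 0.0
    for n in range(p):
        phi = (n + 0.5) * step
        k_n = 1.0 / math.cos(phi) ** 2
        total += math.exp(-k_n * t * t)
    return 1.0 - total / p


if __name__ == "__main__":
    t = 1.0
    exact = math.erf(t) ** 2
    for p in (1, 4, 16, 64, 1024):
        approx = erf_squared_sum(t, p)
        print(f"p={p:5d}  approx={approx:.10f}  error={approx - exact:+.2e}")
```

With this choice of angles, every k_n lies between 1 and 2, and the error of the sum shrinks roughly with the square of the step width as p grows.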
  3. Power Series Expansion
The integral in Equation (
6) could be expanded into a power series in 
,
      
      with
      
      where 
. The coefficients 
 could be expressed by the hypergeometric function, 
, also known as Barnes’ extended hypergeometric function. However, we could derive a constraint for the explicit finite series expression for 
 that rendered the series in Equation (7) convergent for all values of t. To keep the presentation self-contained, the intermediate steps that derive this constraint and show the convergence are given below; they require the sum over the rows of Pascal’s triangle:
Returning to Equation (
8), we had 
. Therefore,
      
The result in Equation (
8) led to the following:
      where there exists a real number 
 between 
 and 1 such that 
. We found the following:
Because of 
, there was again a real number 
 in the corresponding open interval so that the following was true:
As the latter was the power series expansion of 
, which was convergent for all values of 
t, the original series was then also convergent and, thus, 
 with the limiting value shown in Equation (
7). A more compact form of the power series expansion was expressed by the following:
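The convergence argument of this section can be illustrated numerically. The sketch below is a reconstruction under stated assumptions, not the paper's own formula. Starting from erf(t)^2 = 1 - (4/π) ∫_0^{π/4} exp(-t^2/cos^2 φ) dφ, expanding the exponential and substituting u = tan φ turns each coefficient into a finite sum over one row of Pascal's triangle, c_N = Σ_{k=0}^{N-1} C(N-1, k)/(2k+1), which gives erf(t)^2 = (4/π) Σ_{N≥1} (-1)^{N+1} c_N t^{2N}/N!. The names `coefficient` and `erf_squared_series` are hypothetical, and the normalization of c_N may differ from that used in Equations (7) and (8).

```python
import math


def coefficient(n: int) -> float:
    """c_n = sum_{k=0}^{n-1} C(n-1, k) / (2k + 1), a sum over row n-1 of Pascal's triangle."""
    return sum(math.comb(n - 1, k) / (2 * k + 1) for k in range(n))


def erf_squared_series(t: float, terms: int = 60) -> float:
    """Partial sum of (4/pi) * sum_{n>=1} (-1)**(n+1) * c_n * t**(2n) / n!."""
    total = 0.0
    power = 1.0  # holds t**(2n) / n! after the update inside the loop
    for n in range(1, terms + 1):
        power *= t * t / n
        sign = 1.0 if n % 2 == 1 else -1.0
        total += sign * coefficient(n) * power
    return 4.0 / math.pi * total


if __name__ == "__main__":
    for t in (0.3, 1.0, 1.5):
        print(f"t={t:.1f}  series={erf_squared_series(t):.12f}  erf^2={math.erf(t) ** 2:.12f}")
```

Because every binomial row sums to 2^{N-1}, the coefficients satisfy c_N ≤ 2^{N-1}, so each term is bounded in magnitude by the corresponding term of the exponential series for exp(2t^2). This is one way to see why such a series converges for all t.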
  4. Approximations for the Inverse Error Function
Based on the geometric approach described in [
7], we were able to describe simple, useful formulas that, when guided by consistently higher orders of the approximation (
2) for the error function, led to consistently more advanced approximations of the inverse error function. The starting point was the degree 
, that is, the approximation in Equation (
3). Inverting 
 led to 
, and using the parameter 
 from Equation (
3) yielded the following:
For , the relative deviation  from the exact value t was less than , and for , the deviation was less than . Therefore, for , a more precise formula had to be used. Because such higher values of E appeared only in  of the cases, this would not significantly influence the CPU demand.
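To illustrate the lowest-order inversion, the sketch below assumes the single-exponential form erf(t)^2 ≈ 1 - exp(-k t^2), which inverts to t ≈ sqrt(-ln(1 - E^2)/k). The value k = 4/π used here is Pólya's classical choice and serves only as a placeholder. It is not the parameter from Equation (3), and the function name `inverse_erf_static` is hypothetical.

```python
import math

# Assumption: Polya's value, used as a stand-in for the paper's parameter.
K_PLACEHOLDER = 4.0 / math.pi


def inverse_erf_static(e: float, k: float = K_PLACEHOLDER) -> float:
    """Closed-form estimate of t with erf(t) = e, for 0 <= e < 1."""
    if not 0.0 <= e < 1.0:
        raise ValueError("e must lie in [0, 1)")
    # log1p(-e*e) evaluates ln(1 - e**2) accurately for small e.
    return math.sqrt(-math.log1p(-e * e) / k)


if __name__ == "__main__":
    for e in (0.1, 0.5, 0.9, 0.99):
        t = inverse_erf_static(e)
        print(f"E={e:.2f}  t={t:.6f}  erf(t)-E={math.erf(t) - e:+.2e}")
```

Passing a different `k` allows the same closed form to be tried with other parameter values.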
Continuing with 
, we inserted 
 into Equation (
2) to obtain the following:
      where 
 and 
 are the same as for Equation (
4). Using the derivative of Equation (
1) and approximating this by the difference quotient, we obtained the following:
      resulting in 
. In this case, for the larger interval 
, the relative deviation 
 was less than 
. Using 
 instead of 
 and inserting 
 instead of 
, we obtained 
 with a relative deviation of maximally 
 for the same interval. The results are shown in 
Figure 1.
The procedure could be optimized by an approach similar to the shooting method for boundary value problems, which adds dynamics to the calculation. Suppose that, following one of the previous methods, for a particular argument 
E, we found an approximation 
 for the value of the inverse error function of this argument. Using 
, we could adjust the improved result
      
      by inserting 
 and calculating 
A for 
. In general, this procedure provided a vanishing deviation close to 
. In this case as well as for 
, in the interval 
, the maximal deviation was slightly larger than 
, while up to 
 the deviation was restricted to 
. A more general ansatz
      
      could be adjusted by inserting 
 for 
 and 
, and yielded the system of equations:
      with 
. Therefore, 
 could be solved for 
A and 
B to obtain the following:
For 
, we obtained a relative deviation of 
. For 
, the maximal deviation was 
. Finally, an adjustment of
      
      led to the following:
      where 
. For 
, the relative deviation was restricted to 
, while up to 
, the maximal relative deviation was 
. The results for the deviations of 
 (
) for linear, quadratic, and cubic dynamical approximation are shown in 
Figure 2.
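The dynamical corrections can be mimicked with a generic local expansion. The sketch below uses a substitute technique, not the system of equations solved in the text: it takes a starting value t0, forms the residual δ = E - erf(t0), and applies linear and quadratic Taylor corrections of the inverse function, using the derivative d erf/dt = (2/√π) exp(-t^2). The starting formula with k = 4/π and the function names `start_value` and `refine` are assumptions made for illustration.

```python
import math


def start_value(e: float) -> float:
    """Assumed static start: invert erf(t)**2 ~ 1 - exp(-4 t**2 / pi)."""
    return math.sqrt(-math.pi / 4.0 * math.log1p(-e * e))


def refine(e: float, t0: float, order: int = 1) -> float:
    """Correct t0 by a Taylor step of the inverse error function around erf(t0).

    order=1 returns t0 + A*delta; order=2 adds B*delta**2, where
    A = sqrt(pi)/2 * exp(t0**2) and B = t0 * A**2 follow from
    differentiating the inverse function once and twice.
    """
    delta = e - math.erf(t0)
    a = 0.5 * math.sqrt(math.pi) * math.exp(t0 * t0)
    t = t0 + a * delta
    if order >= 2:
        t += t0 * a * a * delta * delta
    return t


if __name__ == "__main__":
    for e in (0.5, 0.9, 0.99):
        t0 = start_value(e)
        for label, t in (("static", t0),
                         ("linear", refine(e, t0, 1)),
                         ("quadratic", refine(e, t0, 2))):
            print(f"E={e:.2f}  {label:9s}  t={t:.8f}  erf(t)-E={math.erf(t) - e:+.2e}")
```

Each correction costs one evaluation of the error function and one exponential, so the work per step stays constant while the residual decreases.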
  5. Discussion
To test the feasibility and speed, we coded our algorithm in C under Slackware 15.0 (Linux 5.15.19) on an ordinary HP laptop with an Intel® Core™2 Duo CPU P8600 @ 2.4 GHz, using 3 MB of memory. The dependence of the CPU time for the calculation was estimated by calculating the value 
 times in sequence. The speed of the calculation did not depend on the value of E, as the precision was not optimized; such an optimization would be required for practical applications. Using an arbitrary starting value 
, we performed this test, and the results are shown in 
Table 1. An analysis of this table showed that a further step in the degree 
p doubled the runtime while the dynamics for increasing 
n added a constant value of approximately 
 seconds to the result. Though the increase in the dynamics required the solution of a linear system of equations and the coding of the results, this endeavor was justified, as by using the dynamics, we could increase the precision of the results without sacrificing the computational speed.
The results for the deviations in 
Figure 1 and 
Figure 2 were multiplied by increasing powers of ten to make the results comparable. This indicated that the convergence improved in each of the steps for 
p and 
n, at least by the corresponding inverse power, while the static approximations 
 in 
Figure 1 showed both deviations were close to 
, and for higher values of 
E, the dynamical approximations in 
Figure 2 showed no deviation at 
 and moderate deviations for higher values. However, the cost of an improvement step in either 
p or 
n was, at most, a 2-fold increase in CPU time. This indicated that the calculations and coding of expressions such as Equation (
9) were justified by the increased precision. Given the goals for the precision, the user could decide to which degrees of 
p and 
n the algorithm should be developed. To demonstrate the precision, Table 2 shows the convergence of our procedure for 
 with fixed and increasing values of 
n. The last column shows the CPU times for 
 runs of the algorithm proposed in [
12] with 
N given in the last column of the table in [
12], as coded in C.
  6. Conclusions
In this paper, we developed and described an approximation algorithm for the determination of the error function, which was based on geometric considerations. As demonstrated in this paper, the algorithm can be easily implemented and extended. We showed that each improvement step improved the precision by a factor of ten or more, with an increase in CPU time of at most a factor of two. In addition, we provided a geometric derivation of Craig’s integral representation of the error function and a converging power series expansion for this formula.