A Neural Network Technique for the Derivation of Runge–Kutta Pairs Adjusted for Scalar Autonomous Problems

Abstract: We consider the scalar autonomous initial value problem as solved by an explicit Runge–Kutta pair of orders 6 and 5. We focus on an efficient family of such pairs, which has been studied extensively in previous decades. This family comes with five coefficients that one may select arbitrarily. We set, as a fitness function, a certain efficiency measure, evaluated after running the pair on a couple of relevant problems. Thus, we may adjust the coefficients of the pair, minimizing this fitness function by means of the differential evolution technique. We conclude with a method (i.e., a Runge–Kutta pair) that outperforms other pairs of the same two orders on a variety of scalar autonomous problems.


Introduction
The Initial Value Problem (IVP) is given as

x′ = f(t, x), x(t_0) = x_0,   (1)

with t, t_0 ∈ ℝ, x, x_0 ∈ ℝ^m and f : ℝ × ℝ^m → ℝ^m. Amongst the most celebrated numerical methods for dealing with (1) are Runge–Kutta (RK) pairs. The following Butcher tableau [1,2] characterizes these methods.

a | B
  | c^T
  | ĉ^T
with c^T, ĉ^T, a ∈ ℝ^s and B ∈ ℝ^{s×s}. In the above case, the pair shares s stages, which are evaluated explicitly when B is strictly lower triangular. The approximation of the solution, advanced from (t_n, x_n) to t_{n+1} = t_n + h_n, is furnished by two estimations of x(t_{n+1}). These are x_{n+1} and x̂_{n+1}, which are given by

x_{n+1} = x_n + h_n · ∑_{i=1}^{s} c_i k_i,   x̂_{n+1} = x_n + h_n · ∑_{i=1}^{s} ĉ_i k_i,

and

k_i = f(t_n + a_i h_n, x_n + h_n · ∑_{j=1}^{i−1} b_{ij} k_j), for i = 1, 2, ..., s.

The approximations x_{n+1} and x̂_{n+1} are of algebraic orders p and q < p, respectively. Thus, a local error estimation ε_n = ‖x_{n+1} − x̂_{n+1}‖ is formed at every step. This feeds the following step-changing algorithm,

h_{n+1} = d · h_n · (σ / ε_n)^{1/(q+1)},

where σ is a tolerance chosen by the user and d = 0.9 is a safety factor. In the case that ε_n < σ, the step is accepted and the above formula gives the next step length. On the contrary, we still use the same formula, but the approximate solution is not advanced and h_{n+1} serves as a new version of h_n. Details of this issue can be retrieved from [3] or even [4] (pp. 167–168). These methods are usually abbreviated as RKp(q) pairs. Runge–Kutta methods first appeared in the late nineteenth century [5,6], while pairs were introduced after 1960. The first celebrated such pairs, of orders 5(4), 6(5) and 8(7), were presented by Fehlberg [7,8]. Then, in the early 1980s, Dormand and Prince followed [9,10]. In addition, our research group has derived a number of such pairs [11–14].
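The error estimation and step-changing rule above can be sketched in code. For clarity, the sketch below uses a simple embedded Heun–Euler 2(1) pair instead of an actual 6(5) pair, so the exponent is 1/(q+1) = 1/2 (it would be 1/6 for a 6(5) pair); the function and variable names are ours, not from the paper.

```python
import math

def embedded_step_control(f, t0, x0, t_end, sigma=1e-8, h=0.1, d=0.9):
    """Integrate x' = f(x) with an embedded 2(1) Heun/Euler pair, using
    the step-changing rule h_new = d * h * (sigma / err)**(1/(q+1))."""
    t, x = t0, x0
    while t < t_end:
        h = min(h, t_end - t)
        k1 = f(x)
        k2 = f(x + h * k1)
        x_high = x + h * (k1 + k2) / 2.0     # order p = 2 estimate
        x_low = x + h * k1                   # order q = 1 estimate
        err = abs(x_high - x_low)            # local error estimate eps_n
        h_new = d * h * (sigma / max(err, 1e-16)) ** 0.5  # 1/(q+1) = 1/2
        if err < sigma:
            t, x = t + h, x_high             # accept and advance
        # on rejection, the step is simply retried with the new h
        h = h_new
    return x

# x' = e^{-x} has exact solution x(t) = log(e + t) when x(0) = 1
approx = embedded_step_control(lambda x: math.exp(-x), 0.0, 1.0, 1.0)
```

The same acceptance/rejection logic applies unchanged to the 6(5) pairs of this paper; only the stage computations and the exponent differ.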
Runge–Kutta pairs are well suited for efficiently approximating the solution of non-stiff problems of the form (1). Such problems arise, for example, when creating digital twins of clean-energy production technologies, for which functionally fitted finite element methods, in one-step and multi-step forms and with improved dispersive and dissipative properties, have become critically important [15]. The accuracy demanded by an application dictates the appropriate pair, which explains the wide diversity of pairs. The lower the accuracy demanded, the more efficient the lower-order RK pairs are; conversely, a higher-order pair should be used for stringent accuracies at quadruple precision [16]. The effort to construct better pairs is the subject of current research [17,18].
Here, we focus on RK6(5) pairs, which are used for modest to high accuracies. We are especially interested in problems (1) of the form

x′ = f(x),   (2)

with f : ℝ → ℝ. These problems are called scalar autonomous, and we will derive a particular RK6(5) pair tuned especially for this type of problem.

Theory of Runge-Kutta Pairs of Orders 6(5)
Runge–Kutta pairs of orders six and five are among the most widely used. Their coefficients have to satisfy 54 order conditions, and families of solutions have been discovered through the years. We have selected the Verner–DLMP [19,20] family, which has the advantage of being solvable linearly. This is an FSAL (First Same As Last) family: although s = 9, the pairs use only eight stages per step, since the last (the ninth) stage is reused as the first stage of the next step.
We may then freely choose the coefficients a_2, a_4, a_5, a_6, a_7 and ĉ_9. Pairs from this family have been shown to perform best on a variety of problems [21].
We may proceed by evaluating the remaining coefficients explicitly. In the following algorithm, a^i is the vector with the components of a raised to the i-th power and e = [1, 1, ..., 1]^T ∈ ℝ^s. (B · a)_5 is the fifth component of B · a; see [22,23] for more details. A = diag(a) and I is the identity matrix of the proper dimension. ALGORITHM: The free parameters are a_2, a_4, a_5, a_6, a_7, ĉ_9. It is known that for this family c_2 = c_3 = ĉ_2 = ĉ_3 = 0 and b_{i2} = 0, i = 4, 5, 6, 7, 8. Execute the following instructions in sequence.
All equations are solved explicitly and straightforwardly. No back substitutions or implicit equations are present.
The question now is how to select the free parameters. Traditionally, we attempt to reduce the norm of the principal term of the local truncation error, i.e., the coefficients of h^7 in the residual of the Taylor error expansion corresponding to the sixth-order method of the underlying RK pair [11].
Another choice is examined in [24], where we dealt with a class of seven-stage, FSAL pairs of orders six and four, specially tuned for addressing the problems of interest here. We presented the reduced set of order conditions for the case of interest and then solved it to furnish a certain pair, ST6(4). We did not pursue that approach here, since it offers only two free parameters that are non-linearly dependent, and no serious improvement is to be expected. However, we will include this pair in our numerical tests.

Training the Coefficients
Here, our approach is to train the coefficients of a method. In this view, we say that we are utilizing a Neural Network technique. We consider, as input, the free parameters of the Runge-Kutta pair. Then, the steps taken for solving an Initial Value Problem can be seen as internal layers, while the output is a particular efficiency measure of the results.
We intend to derive a particular RK6(5) pair belonging to the family studied above. The resulting pair should perform best on scalar autonomous problems (2). First, suppose that some pair is run on a certain problem (2) for some tolerance.
We then record the number µ of function evaluations (stages) needed and the global error ε observed over the whole mesh (grid-points) in the interval of integration. Then, we form the efficiency measure r = µ · ε 1/6 .
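As a sketch (the function name is ours), the efficiency measure can be computed directly from the recorded values:

```python
def efficiency_measure(mu, eps):
    """r = mu * eps**(1/6): stages used, weighted by the sixth root of
    the global error, matching the order-6 error scaling."""
    return mu * eps ** (1.0 / 6.0)

# values recorded for DLMP6(5) on the first training problem
r1 = efficiency_measure(369, 1.9e-12)   # close to the reported 4.09
```

A smaller r means fewer stages for the same accuracy, so lower values are better.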
Here we choose the following couple of scalar autonomous problems.
Here we choose the following couple of scalar autonomous problems. 1st problem: x′ = e^{−x}, with theoretical solution x(t) = log(e + t); and 2nd problem: x′ = x^{1/3}, with theoretical solution x(t) = ((2/3) + (2/3) · t)^{3/2}. Using the tolerance σ = 10^{−11}, we ran DLMP6(5) on the above problems. We recorded μ_1 = 369, ε_1 ≈ 1.9 · 10^{−12}, ^1r_{DLMP65} = 4.09 for the first problem, and μ_2 = 433, ε_2 ≈ 7.6 · 10^{−11}, ^2r_{DLMP65} = 8.92 for the second. Suppose that a new pair NEW6(5) furnishes corresponding efficiency measures ^1r_{NEW65} and ^2r_{NEW65} for the same runs. We then form, as a fitness function, the sum r̂ of these measures and try to minimize it. That is, evaluating the compound fitness function requires two complete runs of Initial Value Problems. The value r̂ changes according to the selection of the free parameters a_2, a_4, a_5, a_6, a_7 and ĉ_9.
The original idea is based on [25]. For the minimization of r̂ we used the differential evolution technique [26]. We have already tried this approach and previously obtained some interesting results when producing Numerov-type methods for integrating orbits [27]. In that work, we trained the coefficients of a Numerov-type method on a Kepler orbit. We then observed excellent results over a set of Kepler orbits, as well as other known orbital problems.
The software in [28] was used for implementing our approach. As the objective (i.e., fitness) function we used (4). Actually, ^1r_{DLMP65} and ^2r_{DLMP65} were found in advance; ^1r_{NEW65} and ^2r_{NEW65} were adjusted according to the selection of the free parameters.
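A minimal, self-contained sketch of the differential evolution loop [26] is given below. The toy quadratic fitness merely stands in for the expensive measure r̂, which in our setting is obtained by actually integrating the two training problems with a candidate pair; all names and parameter values here are ours, not from [28].

```python
import random

def differential_evolution(fitness, bounds, pop_size=20, F=0.8, CR=0.9,
                           gens=100, seed=1):
    """Minimal DE/rand/1/bin minimizer over box-constrained parameters,
    as used to tune the six free coefficients of the pair."""
    rng = random.Random(seed)
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    cost = [fitness(ind) for ind in pop]
    for _ in range(gens):
        for i in range(pop_size):
            r1, r2, r3 = rng.sample([j for j in range(pop_size) if j != i], 3)
            jrand = rng.randrange(dim)          # force at least one mutation
            trial = [pop[r1][j] + F * (pop[r2][j] - pop[r3][j])
                     if (rng.random() < CR or j == jrand) else pop[i][j]
                     for j in range(dim)]
            trial = [min(max(v, lo), hi) for v, (lo, hi) in zip(trial, bounds)]
            c = fitness(trial)
            if c < cost[i]:                     # greedy selection
                pop[i], cost[i] = trial, c
    best = min(range(pop_size), key=cost.__getitem__)
    return pop[best], cost[best]

# toy stand-in fitness: a quadratic bowl in six parameters
target = [0.1, 0.35, 0.45, 0.6, 0.85, 0.05]
toy = lambda p: sum((v - t) ** 2 for v, t in zip(p, target))
params, best = differential_evolution(toy, [(0.0, 1.0)] * 6)
```

Because each fitness evaluation means two complete IVP integrations, keeping the population and generation counts modest matters in practice.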
The resulting pair is presented in Table 1.
In conclusion, no extra property seems to hold for the pair given in Table 1: it possesses no extended interval of stability, no minimized truncation error norm, nor any other traditionally desirable characteristic. Judging from these traditional characteristics alone, its special performance would be hard to anticipate.

Numerical Tests
We tested the following pairs chosen from the family studied above.
The problems we tested are listed in Table 2. Some of them describe real-life situations; e.g., problem 1 models radioactive decay. These problems cover various cases. Thus, we have included problems with slowly varying solutions (e.g., problems 4 and 5). On the contrary, problems 7 and 8 have solutions that are clearly and steadily increasing. Problems with periodic solutions also exist, such as problem 9.

Table 2. Problems tested.

[Table 2 lists each problem alongside its theoretical solution; one surviving entry reads x(t) = 2 cot^{−1}(e^{−t} cot 0.05).]

We estimated 54 (i.e., 9 problems times 6 tolerances) efficiency measures for each pair. We set NEW6(5) as the reference pair. Then, we divided each efficiency measure of DLMP6(5) by the corresponding efficiency measure of NEW6(5). The results can be found in Table 3. The figures underlined for problems 5 and 7 are the numbers found in the original training (in the previous section) with tolerance 10^{−11}. It is obvious that the results favor the second pair. On average, we observed a ratio of 1.75, meaning that DLMP6(5) is about 75% more expensive. This is quite remarkable, since much effort has been spent over the years to achieve 10–20% efficiency gains [23,29]. Equivalently, this means that about 6 · log_{10} 1.75 ≈ 1.46 digits were gained on average, for the same cost.

In Table 4, we present the ratios of the efficiency measures of ST6(4) to the corresponding efficiency measures of NEW6(5). On average, we observed a ratio of 1.76, meaning that ST6(4) is about 76% more expensive. Equivalently, about 6 · log_{10} 1.76 ≈ 1.47 digits were gained on average, at the same cost. The result is remarkable, since ST6(4) was specially designed to address problems of the form (2). NEW6(5) outperformed the other pairs even on clearly non-linear problems. Finally, we highlight that we achieved more or less similar results for longer integrations.

For illustration purposes we have included a couple of efficiency plots, for Problems 1 and 6, in Figures 1 and 2, respectively. In these figures, we record the stages used by each pair versus the accuracy achieved. By drawing horizontal lines, we may verify the efficiency ratios for the corresponding problems in the tables above.
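The digits-gained conversion used above can be checked directly: for a sixth-order method, a cost ratio ρ at equal accuracy translates into 6 · log10(ρ) extra correct digits at equal cost.

```python
import math

# accuracy digits gained at equal cost for an order-6 method
digits_dlmp = 6 * math.log10(1.75)   # vs DLMP6(5), about 1.46
digits_st64 = 6 * math.log10(1.76)   # vs ST6(4), about 1.47
```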
The results are very promising. Future research may apply the optimization over a wider range of tolerances and model problems. Perhaps a seven-stage pair, specially constructed after solving the reduced set of order conditions for problems (2) and then trained properly, would furnish even more interesting results.

Conclusions
The training of the coefficients of a Runge–Kutta pair for addressing a particular kind of problem was considered. We concentrated on scalar autonomous problems and on an extensively studied family of Runge–Kutta pairs of orders 6 and 5. After optimizing the free parameters (coefficients) of the pair over a couple of runs on certain scalar autonomous problems, we proposed a new pair. This pair was found to outperform other representatives of this family on a wide range of relevant problems.
The approach presented in this paper extends to many other cases. It can easily be applied to all kinds of Runge–Kutta methods and to many other types of Initial Value Problems, such as Hamiltonian problems, oscillatory problems, and more.