Colebrook’s Flow Friction Explicit Approximations Based on Fixed-Point Iterative Cycles and Symbolic Regression

The logarithmic Colebrook flow friction equation is implicitly given in respect to an unknown flow friction factor. Traditionally, an explicit approximation of the Colebrook equation requires evaluation of computationally demanding transcendental functions, such as logarithmic, exponential, non-integer power, Lambert W and Wright Ω functions. Conversely, we herein present several computationally cheap explicit approximations of the Colebrook equation that require only one logarithmic function in the initial stage, whilst for the remaining iterations the cheap Padé approximant of the first order is used instead. Moreover, symbolic regression was used for the development of a novel starting point, which significantly reduces the error of internal iterations compared with the fixed value staring point. Despite the starting point using a simple rational function, it reduces the relative error of the approximation with one internal cycle from 1.81% to 0.156% (i.e., by a factor of 11.6), whereas the relative error of the approximation with two internal cycles is reduced from 0.317% to 0.0259% (i.e., by a factor of 12.24). This error analysis uses a sample with 2 million quasi-Monte Carlo points and the Sobol sequence.

where f is the Darcy flow friction factor (dimensionless), Re is the Reynolds number (dimensionless) and ε is the relative roughness of inner pipe surface (dimensionless). The Colebrook equation has been known since the end of the 1930s and is based on an experiment by Colebrook and White [16] that had been performed a few years earlier using pipes with different inner surface roughness, from smooth to very rough. As the Colebrook equation is empirical [1,16], its accuracy can be disputed [17,18] (e.g., the new Oregon and Princeton experiment related to pipe friction [18]); nevertheless, the equation is widely accepted as a standard and is in common use in the design of water and gas pipe networks [19]. It can be also adapted for special cases such as air flow through fuel cells [20], water flow in rivers [21,22] and blood flow in blood vessels [23].
Praks and Brkić [24] recently showed a Newton-Raphson iterative solution of the Colebrook equation based on Padé approximants [25][26][27][28][29]. Based on their solution, one simplified approach and a novel starting point that significantly reduces numerical error will be offered herein.
The simplified method is based on the fixed-point method, a variant of the Newton-Raphson method in which the first derivative of the unknown function is equalized to 1 [30]. In the worst case, the demonstrated iterative procedures need up to seven iterations to reach the final accurate solution [10], whereas in this study the explicit approximations derived from the iterative procedure will have relative errors of up to 1.81% and 0.317% in the case of one internal and two internal iterative steps, respectively. Moreover, when the novel simple rational function p 0 of Equation (2) is used as a starting point, the relative errors are significantly reduced to 0.156% and 0.0259%, respectively. Such small values of the relative error are acceptable for engineering applications, given the empirical nature of the Colebrook equation [17,18].
The approach with Padé approximants offers an iterative procedure in which the repetition of the computationally expensive logarithmic function is avoided [24]. Using only one logarithmic function together with simple rational functions and avoiding computationally expensive functions, such as exponential functions or functions with non-integer powers, the presented method saves the processor time [7][8][9][12][13][14], which is essential for effective simulations of large pipe networks [19].
The presented approximations are in a form suitable for computer codes and for application in software packages for everyday engineering use.

A Fixed-Point Iterative Procedure Based on Padé Approximants
The logarithmic function can be accurately approximated in a certain relatively narrow domain using Padé approximants (an approximation of the given function by a rational function of given order) [24]. For example, log 10 (97) can be cheaply approximated knowing that log 10 (100) = 2, and using the fact that log 10 (97) = log 10 100 100 97 = log 10 (100) − log 10 100 97 , where 100 97 ≈ 1.0309, which is close to 1. Consequently, log 10 (1.0309) can be accurately estimated by its Padé approximant at the expansion point x = 1. This simple idea can be used for solving the logarithmic-based Colebrook equation. Starting from the initial point .71 , the next value can be calculated as x 1 = −2· log 10 (y 0 ). The pre-computed numerical value ζ; ζ = log 10 (y 0 ) should be used in all the following iterations [24]. The next value y 1 = 2.51·x 1 Re + ε 3.71 and the previous value y 0 should be used as the argument of the Padé approximant z 01 = y 0 y 1 , where the chosen Padé approximant is . Now, the new value log 10 (y 1 ) = ζ + 0.8686·p 01 should be used for the calculation ln (10) . To solve the Colebrook equation using a numerical fixed-point iterative procedure, the steps shown in Figure 1 should be followed. Figure 1 includes a simplified procedure compared with [24], where the Newton-Raphson iterative procedure is used instead of the herein presented fixed-point iterative procedure.
The initial starting point x 0 = 1 √ f 0 should be chosen from the domain of applicability of the Colebrook equation, which is 0 < ε < 0.05 and 4000 < Re < 10 8 . Every initial starting point chosen from this domain will be suitable, but to reduce the required number of iterations, different strategies are offered [10]. We will demonstrate the approach with the fixed-value initial starting point, but also the approach with the starting point given by the rational function: Equation (2) was found using the symbolic regression tool Eureqa [31,32] following the approach of Gholamy and Kreinovich [33], in which existing absolute error minimizing software was used to minimize the relative error (various artificial intelligence tools have been used recently for the Colebrook equation, such as symbolic regression [34], novel artificial bee colony programming (ABCP) methods [35], artificial neural networks [36], genetic algorithms [37], etc.). The novel starting point given by Equation (2) has a maximal relative error of only 6.7%, whereas the previous version of the starting point [34] had a maximal relative error of 40%. Thus, we can observe that the approach Gholamy and Kreinovich [33] used was very useful for symbolic regression of the Colebrook equation. The initial starting point x = should be chosen from the domain of applicability of the Colebrook equation, which is 0 < ε < 0.05 and 4000 < Re < 10 . Every initial starting point chosen from this domain will be suitable, but to reduce the required number of iterations, different strategies are offered [10]. We will demonstrate the approach with the fixed-value initial starting point, but also the approach with the starting point given by the rational function: p = x = 1 f = 2600 · Re 657.7 · Re + 214600 · Re · ε + 12970000 − 13.58 · ε (2) Calculation of log 10 (y 0 ) should not be repeated in each iteration, but its numerical value ζ should be used in a combination with the Padé approximant in every iteration (with the exception of the first) [24]. used is of the order /1,1/, referring to the polynomial order in the numerator and in the denominator, respectively [24]. As the expansion point z = 1 is the root of ln(z), the accuracy of the Padé approximant decreases. For this reason, setting the OrderMode option in the Matlab Padé command to relative compensates for the decreased accuracy. Consequently, the numerator of the Padé approximant of the order /1,1/ includes the polynomial of the second order.
Possible software implementation of developed algorithms is summarized in Section 4.

Explicit Approximations Based on Padé Approximants
Two explicit approximations of the Colebrook equation based on the algorithm from Figure 1 will be shown here. The approximation with one internal iterative cycle will be shown in Section 3.1, while the approximation with two internal iterative cycles will be presented in Section 3.2. The iterative procedures are sensitive, depending on the chosen initial starting point [38], and therefore carefully conducted numerical tests are provided separately for the approximation with one and two internal iterative cycles. In both cases, fixed numerical values for the initial starting points are chosen in such a way as to reduce the final maximal relative error over the domain of applicability of the Colebrook equation. Alternatively, the novel rational function, Equation (2), is used as the initial starting point. We show that this alternative approach significantly reduces the relative error of approximations.
In order to perform a deeper analysis of the turbulent flow, the emphasis of this section is on the Reynolds number Re between 10 4 and 10 8 (the analysis is compatible with [39]). However, the error aggregates in the zone between 4000 and 10 4 , where it is still possible that a laminar flow can exist.

An Explicit Approximation with One Internal Iterative Cycle and a Numerical Fix-Value Starting Point
After numerous numerical tests, it is clear that the algorithm from Figure 1 gives the smallest relative error in the domain of applicability of the Colebrook equation for the approximation with one internal cycle, Equation (3), if the initial starting point is set as (the numerical value 16.9 of the parameter c in Equation (3) is computed as 2.51 × 6.73306). The maximal relative error of Equation (3) is up to 0.79% and its distribution can be seen in Figure 2.

An Explicit Approximation with One Internal Iterative Cycle and a Starting Point Given by a Rational Function
Using the rational function p 0 , Equation (2) for the initial starting point x 0 and the proposed approximation, Equation (4) has a maximal relative error of 0.101%. The distribution of its relative error is shown in Figure 3. In comparison with the fixed point version, Equation (3)-the novel staring point-Equation (2) reduces the maximal relative error for one internal iterative cycle by a factor of 7.82 (=0.79%/0.101%).
where p is from Equation (2).
where p is from Equation (2).

An Explicit Approximation with Two Internal Iterative Cycles and a Fixed Starting Point
The explicit approximation of the Colebrook formula with two internal iterative cycles is given in Equation (5). The numerical fixed value 18.15 of the parameter c of Equation (5) depends on the initial starting point, which is carefully chosen to reduce the maximal relative error and corresponds to the The maximal relative error of Equation (5) is up to 0.172% and its distribution is shown in Figure 4.

An Explicit Approximation with Two Internal Iterative Cycles and a Fixed Starting Point
The explicit approximation of the Colebrook formula with two internal iterative cycles is given in Equation (5)

An Explicit Approximation with Two Internal Iterative Cycles and a Starting Point Given by a Rational Function
The introduction of a starting point given by the rational function Equation (2) instead of a fixed point significantly reduces the relative error from 0.172% in Equation (5) to no more than 0.0154% in Equation 6 (i.e., by a factor of 11.16). Equation (6) presents the new explicit approximation with two internal iterative cycles and a starting point given by the rational function, Equation (2), while the distribution of its relative error is shown in Figure 5.

An Explicit Approximation with Two Internal Iterative Cycles and a Starting Point Given by a Rational Function
The introduction of a starting point given by the rational function Equation (2) instead of a fixed point significantly reduces the relative error from 0.172% in Equation (5) to no more than 0.0154% in Equation 6 (i.e., by a factor of 11.16). Equation (6) presents the new explicit approximation with two internal iterative cycles and a starting point given by the rational function, Equation (2), while the distribution of its relative error is shown in Figure 5.
where p o is from Equation (2).
where p is from Equation (2).  Table 1 summarize the results of error analysis for 2 million quasi Monte-Carlo points using the Sobol sequence [40,41] related to Equations (3)- (6). The domain for this analysis in respect to the Reynolds number Re is from 4000 to 10 8 , while for the sample of 740 points, which was used for Figures 2-5, the domain is from 10 4 to 10 8 .  Table 1 shows that iterations using the Padé approximant of the first order /1,1/ ln11(z) = (z*(z+4)-5)/(4*z+2) have the same relative error as iterations using the natural logarithm. For the method with  Table 1 summarize the results of error analysis for 2 million quasi Monte-Carlo points using the Sobol sequence [40,41] related to Equations (3)- (6). The domain for this analysis in respect to the Reynolds number Re is from 4000 to 10 8 , while for the sample of 740 points, which was used for Figures 2-5, the domain is from 10 4 to 10 8 . Table 1. Comparison of maximal relative error (in %) for the first and second iteration without the Padé approximant and with a Padé approximant of the order /1,1/ for the Reynolds number Re from 4000 to 10 8 .
Compared with analysis performed in MS Excel where 740 points are used, analysis with 2 million quasi Monte-Carlo points using the Sobol sequence detected additional peaks of error, as shown in Table 2. The same value of the maximal error is detected for even 2000 quasi Monte-Carlo points. Thus, this analysis confirms that the Sobol sequence of 2 million quasi Monte-Carlo points sufficiently covers the whole domain. For example, for Equation (6), the largest relative error is detected for the input pair Re = 5263 and ε = 3.1707 × 10 −7 , which corresponds to the zone with lower Reynolds numbers. Although this input pair is not shown in Figures 2-5, the figures clearly indicate a tendency towards higher relative errors when the Reynolds number Re and the relative roughness of the inner pipe surface ε are smaller. As shown in Figure 6, additional analysis in MS Excel for the zone 4000 < Re < 10 4 shows very good agreement with the results obtained in Matlab.
Compared with analysis performed in MS Excel where 740 points are used, analysis with 2 million quasi Monte-Carlo points using the Sobol sequence detected additional peaks of error, as shown in Table 2. The same value of the maximal error is detected for even 2000 quasi Monte-Carlo points. Thus, this analysis confirms that the Sobol sequence of 2 million quasi Monte-Carlo points sufficiently covers the whole domain. For example, for Equation (6), the largest relative error is detected for the input pair Re = 5263 and = 3.1707 × 10 −7 , which corresponds to the zone with lower Reynolds numbers. Although this input pair is not shown in Figures 2−5, the figures clearly indicate a tendency towards higher relative errors when the Reynolds number and the relative roughness of the inner pipe surface are smaller. As shown in Figure 6, additional analysis in MS Excel for the zone 4000 < Re < 10 4 shows very good agreement with the results obtained in Matlab.

Description of Software
The presented approximations were thoroughly tested at IT4Innovations, National Supercomputing Center, VŠB (Technical University of Ostrava, Czech Republic). For simplicity, the codes were expressed in MS Excel format [42] and in Matlab. The Matlab codes also work in GNU Octave, but they can be easily translated in any programming language.
Software codes in MS Excel are given in Supplementary Materials. However, we provide Matlab codes here, in order to support the benchmark study of the Colebrook equation [35] (in addition these Matlab codes are also given in Supplementary Materials).
Equation (4): An explicit approximation with one internal iterative cycle and a rational starting point given by Equation (2).
Matlab code: For the rational starting point, x is given by Equation (2). The rest of the Matlab code is unchanged.

Conclusions
This article presents simplified, but still precise algorithms for solving the Colebrook equation based on a fixed-point iterative procedure in which the logarithm is replaced by its Padé approximant in all iterations except the first one [24]. Moreover, a novel simple starting point, which was found by symbolic regression, significantly reduces the error of internal iterations. Consequently, a cheap Padé approximant of the first order is suitable for approximations. Depending on the stopping criteria and number of iterations, this algorithm can solve the Colebrook equation with any desired accuracy. As the computationally expensive logarithmic function is used only in the first step, this algorithm can be used efficiently for simulations of a complex pipe network. This is because the carefully selected Padé approximants can save computational time. For a good balance between efficiency and accuracy [9], two explicit approximations have been demonstrated based on a simplified iterative procedure with a maximal relative error of no more than 1.81% and 0.156% for the variant with one internal iterative cycle, and of no more than 0.317% and 0.0259% for two internal cycles (higher error values are for the fixed initial starting point, while lower error values are for the initial starting point given by the rational function, Equation (2), respectively). The value of the relative error, and also the tendency of its distribution, are confirmed by 2 million quasi Monte-Carlo points using the Sobol sequence for 4000 < Re < 10 8 and for 0 < ε < 0.05.
The presented approximations have an acceptable level of accuracy [43], especially in light of the fact that the Colebrook equation is empirical and by its nature not always accurate. We can see that iterations with the cheap Padé approximant of the first order have a nearly identical relative error to iterations with the logarithm. Consequently, the herein presented approximations are favorable, as they do not contain computationally demanding evaluations such as special functions, exponential functions, or non-integer powers.
Funding: This work has been partially funded by the Technology Agency of the Czech Republic, partially by the National Centre for Energy TN01000007 and partially by the project "Energy System for Grids" TK02030039. The work of D.B. has also been supported by the Ministry of Education, Youth, and Sports of the Czech Republic through the National Programme of Sustainability (NPS II) project "IT4Innovations excellence in science" LQ1602 and by the Ministry of Education, Science, and Technological Development of the Republic of Serbia through the project "Development of new information and communication technologies, based on advanced mathematical methods, with applications in medicine, telecommunications, power systems, protection of national heritage and education" iii44006.