Approximate Flow Friction Factor: Estimation of the Accuracy Using Sobol’s Quasi-Random Sampling

The unknown friction factor from the implicit Colebrook equation cannot be expressed explicitly in an analytical way, and therefore to simplify the calculation, many explicit approximations can be used instead. The accuracy of such approximations should be evaluated only throughout the domain of interest in engineering practice where the number of test points can be chosen in many different ways, using uniform, quasi-uniform, random, and quasi-random patterns. To avoid picking points with undetected errors, a sufficient minimal number of such points should be chosen


Introduction
The Colebrook equation, formally established in 1939 for calculating the flow friction through pipes, is empirical but widely accepted in engineering practice as an informal standard [1]. It is based on an experiment from 1937 performed by Colebrook and White, who tested flow through a set of pipes with their inner surfaces ranging from smooth to very rough [2]. The equation was later developed solely by Colebrook [1]; Equation (1). Its graphical interpretation is given by Rouse [3] and was later reevaluated by Moody [4]. Two non-dimensional parameters are used as input, the Reynolds number Re [5] and the relative roughness of the inner pipe surface ε [6]. An unfortunate circumstance is that the Colebrook equation is expressed in an implicit logarithmic form with respect to the unknown Darcy flow friction factor f, which cannot be extracted analytically.

$$\frac{1}{\sqrt{f}} = -2\log_{10}\left(\frac{2.51}{Re\,\sqrt{f}} + \frac{\varepsilon}{3.71}\right) \quad (1)$$

where:
f: turbulent Darcy flow friction factor (dimensionless); index 0 refers to the accurate solution obtained iteratively after a sufficient number of iterations.
Re: Reynolds number (dimensionless); Re = u·D/ν, where u is the flow velocity (in m/s), ν is the kinematic viscosity of the fluid (in m^2/s), and D is the inner diameter of the pipe (in m).
ε: relative roughness of the inner pipe surface (dimensionless); ε = ε*/D, where ε* is the height of the protrusions of the inner surface of the pipe wall above the viscous fluid layer (in m), and D is the inner diameter of the pipe (in m), with ε* << D (ε* typically ranges from 0.0015 mm for PVC pipes to about 3.0 mm for rough concrete pipes [7]).
log10: base-10 logarithm.
ln: natural logarithm.

The Colebrook equation is used in engineering practice for the Reynolds number Re between 4000 and 10^8, and for the relative roughness of the inner pipe surface ε between 0 and 0.05, i.e., for turbulent flow conditions.
The Colebrook equation in its implicitly given native form can be solved only iteratively [8][9][10][11]. Therefore, to simplify the everyday life of engineers, various very accurate explicit approximations have been developed over time [12][13][14][15][16]. The distribution of the relative error δ% over the domain of applicability of the Colebrook equation is uneven and is different for every new approximation [17,18]. This error should be evaluated in a sufficient number of points dispersed over the domain of applicability in engineering practice, and the points should be uniformly, randomly or quasi-randomly distributed. This communication shows how to use Sobol's quasi-random distribution for such purposes. The Colebrook equation is widely used in many scientific disciplines where fluid flow occurs [19][20][21][22], and hence evaluation of the error and its distribution is essential for the ability to check and repeat scientific findings.
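The iterative solution mentioned above can be sketched with a simple fixed-point iteration. This is only an illustration, not the procedure used in [8][9][10][11]: it assumes the standard form of the Colebrook equation with the constants 2.51 and 3.71, iterates on x = 1/sqrt(f), and uses a hypothetical function name, starting value, and tolerance.

```python
import math

def colebrook_f0(re, eps, tol=1e-12, max_iter=200):
    """Solve the Colebrook equation for the Darcy friction factor.

    Assumed form: 1/sqrt(f) = -2*log10(2.51/(Re*sqrt(f)) + eps/3.71).
    Iterates on x = 1/sqrt(f); the start value 7.0 is an arbitrary guess.
    """
    x = 7.0
    for _ in range(max_iter):
        x_new = -2.0 * math.log10(2.51 * x / re + eps / 3.71)
        if abs(x_new - x) < tol:
            x = x_new
            break
        x = x_new
    return 1.0 / (x * x)

# Example: Re = 1e5 and relative roughness 1e-4
f0 = colebrook_f0(1.0e5, 1.0e-4)
```

The iteration is written in terms of x rather than f because the right-hand side is then a slowly varying function of x, so plain substitution settles quickly across the turbulent range considered here.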

Estimation of Error; Testing Patterns and Quantity of Points
This section describes the calculation of the relative error δ % of the chosen approximation of the Colebrook equation and evaluates different quantities of testing points and related patterns, i.e., their distribution over the domain of its applicability in engineering practice.

Relative Error
The relative error δ% of any explicit approximation should be calculated with reference to the solution of the original implicitly given Colebrook equation. Its native implicit form is usually solved iteratively, giving f_0 after a sufficient number of iterations [8][9][10][11] (as the Colebrook equation is empirical, its accuracy can be disputed, but for this study, it is treated as accurate [23][24][25]). The relative error δ% is calculated as $\delta_{\%} = \frac{f - f_0}{f_0} \cdot 100\%$, where f is obtained using the chosen explicit approximation (the approximation tested in this communication is given in Equation (2)). The goal is to find the worst case, represented by the maximum relative error, i.e., to find the combination of input parameters that gives the largest approximation error. For this reason, a sufficiently large number of sample points from the domain of the Colebrook equation has to be chosen. Here those points are chosen using Sobol's sampling, a type of quasi-random sampling that detects peaks of the relative error more efficiently than classical Monte Carlo sampling, because fewer evaluation points are required. Consequently, quasi-Monte Carlo sampling outperforms classical Monte Carlo sampling [26].
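The error measure described above can be written as two small helpers. This is a sketch with hypothetical function names and made-up friction factor values used only to show the arithmetic; taking the absolute value when searching for the worst case is an assumption of this sketch.

```python
def relative_error_percent(f, f0):
    """Signed relative error of an approximation f against the reference f0, in %."""
    return (f - f0) / f0 * 100.0

def max_relative_error_percent(pairs):
    """Largest absolute relative error over (f, f0) pairs, in %."""
    return max(abs(relative_error_percent(f, f0)) for f, f0 in pairs)

# Illustrative numbers only (not taken from the tests in this communication)
pairs = [(0.0202, 0.0200), (0.0297, 0.0300), (0.0400, 0.0400)]
worst = max_relative_error_percent(pairs)
```

Here the first pair overestimates the reference by 1% and the second underestimates it by 1%, so the worst case over the three pairs is about 1%.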
The results do not depend only on the number of testing points, but also on their distribution.

Chosen Approximation for Tests
Explicit approximations of the Colebrook equation should be both accurate and computationally simple (fewer floating-point operations executed by the computer's processor means higher computational efficiency) [27][28][29][30][31][32]. In general, an approximation with fewer logarithmic and exponential functions is more efficient (each non-integer power and each exponential function counts as approximately two logarithmic functions in computational cost [27]).
For the tests performed in this communication, an approximation given by Praks and Brkić [31] is used; Equation (2). It was chosen from a selection of the simplest and most accurate approximations, which included the approximations of Vatankhah [33], Lamri [34], Lamri and Easa [35], etc.
The parameters in Equation (2) are numerically optimized to minimize the maximal relative error δ% [36]. Such variation of the numerical values of the parameters changes not only the value of the maximal relative error δ%, but also the distribution of the error over the domain of applicability of the Colebrook equation in engineering practice [17,18]. Therefore, the results depend on both the number of testing points and on their distribution.
The maximal relative error δ% of Equation (2) is estimated by Praks and Brkić [31] to be around 0.0012% using 2 to 8 million Sobol's quasi-random testing points. Using the same methodology for estimation of the maximal relative error δ% as in Brkić [14], with 740 quasi-uniform testing points, it is estimated to be up to 0.00120421% (it was additionally tested using 740 points and confirmed up to 0.001204% by Brkić and Stajić [15], who used VBA coding for MS Excel).

Distribution of Testing Points
The domain of applicability of the Colebrook equation in engineering practice should be tested using a sufficient number of points. Otherwise, the highest value of the relative error δ% can be overlooked, because it can be located between the chosen testing points. Therefore, the testing points should sufficiently cover the domain of applicability of the Colebrook equation using an appropriate pattern, to avoid such undetected peaks of error between the testing points.
Some authors recommend a few million testing points while others suggest fewer than a thousand, chosen using various patterns such as uniform, quasi-uniform, random, and quasi-random. For such purposes, Yıldırım [13] uses 10 thousand uniformly distributed points, Brkić [14] uses 740 quasi-uniformly distributed points, Shaik et al. [25] one million, while Praks and Brkić [30,31] use as many as 2 to 8 million quasi-Monte Carlo points.
In this communication, results obtained using a random pattern of points for testing are compared with the Sobol quasi-random points [37-41], always using an equal number of testing points.
In the text that follows, the methodology for using Sobol's quasi-random sequence to test approximations of the Colebrook equation is shown. It is compared with quasi-random testing points generated in MS Excel [14,15].

Sobol's Quasi-Random Testing Points
This communication does not describe how the algorithm for generating Sobol sequences works [37][38][39][40][41]. It is focused on how to use it to test the explicit approximations of the Colebrook equation. Compared with random sampling, Sobol numbers offer a lower discrepancy (they fill the space of possibilities more evenly), and because of that ability, they have been chosen for testing.
Sobol quasi-Monte Carlo sampling is a complex procedure, which requires a specialized software tool. For Sobol quasi-Monte Carlo sampling in Matlab, open-source software can be downloaded for free [37], or alternatively, the open-source SciPy library for Python can be used to generate the Sobol sequence [45].
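As an illustration of the SciPy route, the following sketch draws the first 2^n two-dimensional Sobol points with `scipy.stats.qmc.Sobol`. The helper name `sobol_points` is hypothetical, and turning scrambling off is an assumption made here so that the pattern is the same on every run; it is not necessarily the setting used for the results reported in this communication.

```python
from scipy.stats import qmc

def sobol_points(n_exp):
    """Return the first 2**n_exp unscrambled 2-D Sobol points in [0, 1)."""
    sampler = qmc.Sobol(d=2, scramble=False)  # deterministic sequence
    return sampler.random_base2(m=n_exp)      # array of shape (2**n_exp, 2)

points = sobol_points(6)                      # 64 points, as for n = 6
s1, s2 = points[:, 0], points[:, 1]           # first and second dimension
```

Drawing a power-of-two number of points with `random_base2` keeps the balance properties of the sequence, which is why the sample sizes are expressed as 2^n.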
Because the Colebrook equation has two input parameters, i.e., the Reynolds number Re and the relative roughness of the inner pipe surface ε, two-dimensional Sobol sequences [39] are used in the tests performed here. Using Sobol's two-dimensional sequence [S_{1i}, S_{2i}], values of the Reynolds number Re between 4000 and 10^8 can be generated, as well as values of the relative roughness of the inner pipe surface ε between 0 and 0.05. Sobol's numbers are always between 0 and 1; the Reynolds number Re can be generated using the first dimension of the two-dimensional Sobol sequence, S_{1i}, and the relative roughness of the inner pipe surface ε using the second, S_{2i}, as shown in Equation (3):

$$Re = 10^{S_{1i}\cdot\left(\log_{10}(10^{8}) - \log_{10}(4000)\right) + \log_{10}(4000)}, \qquad \varepsilon = 10^{-\left(S_{2i}\cdot\left(6.5 - \log_{10}(1/0.05)\right) + \log_{10}(1/0.05)\right)} \quad (3)$$

The Sobol sequence is defined for values between zero and one. On the other hand, the input parameters of the Colebrook equation cover large intervals: the Reynolds number Re varies from 4000 to 10^8, and ε between 0 and 0.05. To normalize them, Equation (4) is used, where Re_norm and ε_norm represent normalized values, i.e., values between 0 and 1 (for example, a random number or a quasi-random number of the Sobol sequence).

Logarithms and the 10^x function were used in the transformation to sufficiently cover the large interval of input parameters of the Colebrook equation (especially for the Reynolds number). As the Reynolds number Re of the Colebrook equation is between 4000 and 10^8, the procedure for its generation can be expressed as $Re = 10^{Re_{norm}\cdot(Re_{max} - Re_{min}) + Re_{min}}$, where Re_min = log10(4000) and Re_max = log10(10^8) = 8. Consequently, the expression for the generation of the Reynolds number can be approximated as $Re \approx 10^{4.3979\cdot Re_{norm} + 3.6021}$, as $10^{3.6021} \approx 4000$ represents the minimal Reynolds number of the Colebrook equation. Moreover, 4.3979 + 3.6021 = 8, which is the exponent of the maximal Reynolds number 10^8, so that Re_norm = 0 gives Re ≈ 4000 and Re_norm = 1 gives Re ≈ 10^8.

Similarly, the relative roughness ε of the pipeline, between ε_min = log10(3.1808 × 10^−7) and ε_max = log10(0.05), can be generated from ε_norm as $\varepsilon = 10^{\varepsilon_{norm}\cdot(\varepsilon_{max} - \varepsilon_{min}) + \varepsilon_{min}}$, which can be approximated for the Colebrook equation as $\varepsilon \approx 10^{5.1964\cdot\varepsilon_{norm} - 6.4975}$, because ε_norm = 0 gives ε ≈ 3.1808 × 10^−7 ≈ 0 and ε_norm = 1 gives ε ≈ 0.05.
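The transformation of Equation (3) can be sketched in a few lines of Python. The function and constant names are hypothetical; the limits 4000, 10^8, 0.05 and the exponent 6.5 are taken from the text above.

```python
import math

RE_MIN, RE_MAX = 4000.0, 1.0e8            # Reynolds number limits
LOG_INV_EPS_MAX = math.log10(1.0 / 0.05)  # log10(1/0.05), about 1.301
LOG_INV_EPS_MIN = 6.5                     # exponent used in Equation (3)

def to_reynolds(s1):
    """Map s1 in [0, 1] to Re in [4000, 1e8] on a logarithmic scale."""
    lo, hi = math.log10(RE_MIN), math.log10(RE_MAX)
    return 10.0 ** (s1 * (hi - lo) + lo)

def to_roughness(s2):
    """Map s2 in [0, 1] to eps, from 0.05 (s2 = 0) down to 10**-6.5 (s2 = 1)."""
    exponent = s2 * (LOG_INV_EPS_MIN - LOG_INV_EPS_MAX) + LOG_INV_EPS_MAX
    return 10.0 ** -exponent

# End points of the two intervals
re_low, re_high = to_reynolds(0.0), to_reynolds(1.0)
eps_high, eps_low = to_roughness(0.0), to_roughness(1.0)
```

Because the mapping is linear in the exponent, equal steps in s1 or s2 correspond to equal ratios of Re or ε; for instance, s1 = 0.5 maps to the geometric mean of the two Reynolds number limits, roughly 6.3 × 10^5.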
The Sobol sequence is not random, and in our case, the starting pattern is always identical, as in Figure 1, where 64 two-dimensional points are shown. By comparing Figures 1 and 2, it can be seen that such sequences cover the domain of the Colebrook equation more thoroughly than random sampling. Using Sobol's quasi-random tests, the maximal relative error δ% of Equation (2) for n = 6, i.e., 2^n = 64 sampling points, is 0.00120432%; for 740 sampling points the result is the same (the maximal error was already detected in the first 64 samples); and for n = 11, i.e., 2^n = 2048 sampling points, it is up to 0.00120441%.
Compared with the methodology by Brkić [14], in which 740 quasi-uniform testing points gave an estimated error of up to 0.00120421%, Sobol's quasi-random testing captured an even higher error of 0.00120432% with only n = 6, i.e., 2^n = 64 sampling points.
Sobol's points and solutions of Equations (1) and (2) are shown in Table 1. The maximal relative error δ% in all tests for the approximation from Equation (2) is always evaluated to be around 0.0012% using random sampling.
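The whole test procedure can be sketched end to end as follows. Equation (2) is not reproduced in this text, so the sketch substitutes the well-known Haaland approximation as a stand-in; the error values it produces therefore do not correspond to the 0.0012% reported for Equation (2). The function names, the fixed iteration count, the random seed, and the use of NumPy are assumptions of this sketch, and the Colebrook equation is assumed in its standard form with the constants 2.51 and 3.71.

```python
import numpy as np
from scipy.stats import qmc

def colebrook_f0(re, eps, iterations=60):
    """Reference friction factor by fixed-point iteration on x = 1/sqrt(f)."""
    x = np.full_like(re, 7.0)
    for _ in range(iterations):
        x = -2.0 * np.log10(2.51 * x / re + eps / 3.71)
    return 1.0 / x**2

def haaland(re, eps):
    """Stand-in explicit approximation (Haaland), NOT Equation (2)."""
    return 1.0 / (-1.8 * np.log10((eps / 3.7) ** 1.11 + 6.9 / re)) ** 2

def to_domain(unit):
    """Map points of the unit square to (Re, eps) following Equation (3)."""
    lo, hi = np.log10(4000.0), 8.0
    re = 10.0 ** (unit[:, 0] * (hi - lo) + lo)
    log_inv = np.log10(1.0 / 0.05)
    eps = 10.0 ** -(unit[:, 1] * (6.5 - log_inv) + log_inv)
    return re, eps

def max_error_percent(unit):
    """Maximal absolute relative error (in %) over the given test points."""
    re, eps = to_domain(unit)
    f0 = colebrook_f0(re, eps)
    return float(np.max(np.abs(haaland(re, eps) - f0) / f0) * 100.0)

# Equal numbers of quasi-random and random test points (2**11 = 2048)
sobol_pts = qmc.Sobol(d=2, scramble=False).random_base2(m=11)
random_pts = np.random.default_rng(0).random((2048, 2))

sobol_max = max_error_percent(sobol_pts)
random_max = max_error_percent(random_pts)
print(f"Sobol: {sobol_max:.4f}%  random: {random_max:.4f}%")
```

Repeating the random part with different seeds changes the detected maximum from run to run, whereas the unscrambled Sobol result is reproducible, which is the property that makes comparisons between studies easier.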

Conclusions
The Colebrook equation depends on two input parameters: the Reynolds number Re and the relative roughness ε of the pipeline. As the input parameters span a large range of possible values (the Reynolds number Re varies from 4000 to 10^8 and the relative roughness of the inner pipe surface varies from 3.1808 × 10^−7 to 0.05), every new approximation of the Colebrook equation should be developed by evaluating a large number of possible combinations of input parameters. For this reason, a method is required that can identify a limited number of input pairs suitable for building a new approximation. This communication shows that, for the same accuracy of the Colebrook approximation, the Sobol quasi-Monte Carlo method requires a smaller number of evaluations of the Colebrook equation than the classical Monte Carlo method.
The findings of this communication for 2048 and even for 64 quasi-random points are comparable to those obtained with 740 points distributed uniformly on a logarithmic scale [14]. Moreover, Sobol's test points are not random, but quasi-random, and so such an approach is deterministic (which is useful for comparisons of calculations [46][47][48]). Finally, the Sobol quasi-random approach is preferable, as it fills the sampling space more evenly. Thus, the chance of neglecting some parts of the examined domain is minimized with the Sobol quasi-random approach.

Figure 2. Random distribution of testing points; (a) the first example, (b) the second example.

Table 1. Sobol's points and solutions of Equations (1) and (2).

For the random tests, instead of Sobol's sequence [S_{1i}, S_{2i}], Equation (3) used the Excel function "Rand()". This will always generate different testing patterns, as shown in Figure 2.