Efficient Double-Tee Junction Mixing Assessment by Machine Learning

Grbčić, Luka; Kranjčević, Lado; Družeta, Siniša; Lučin, Ivana

doi:10.3390/w12010238


¹ Department of Fluid Mechanics and Computational Engineering, Faculty of Engineering, University of Rijeka, 51000 Rijeka, Croatia

² Center for Advanced Computing and Modeling, University of Rijeka, 51000 Rijeka, Croatia

* Author to whom correspondence should be addressed.

Water 2020, 12(1), 238; https://doi.org/10.3390/w12010238

Submission received: 8 November 2019 / Revised: 28 December 2019 / Accepted: 11 January 2020 / Published: 15 January 2020

(This article belongs to the Special Issue Machine Learning Applied to Hydraulic and Hydrological Modelling)


Abstract:

This paper presents a new machine learning approach to modeling mixing phenomena in double-Tee pipe junctions. Mixing parameters are usually obtained either by experiment or by computational fluid dynamics (CFD) numerical modeling; machine learning offers an efficient alternative way to calculate them. Here, a machine learning approach is used together with a CFD model. The CFD model was calibrated with experimental data from a previous study and served as a generator of input data for the machine learning metamodels: Artificial Neural Network (ANN) and Support Vector Regression (SVR). The metamodel input variables are the inlet pipe flow ratio, the outlet pipe flow ratio, and the distance between the pipe junctions, and the output parameter is the branch pipe outlet to main inlet pipe mixing ratio. A comparison of the ANN and SVR models showed that ANN outperforms SVR in accuracy for the given problem. ANN therefore proved to be a viable way to model mixing phenomena in double-Tee junctions, also because its prediction time is very short compared with CFD run time. Because of its high computational efficiency, the machine learning metamodel can be directly incorporated into pipe network numerical models in future studies.

Keywords:

mixing phenomena; double-Tee junctions; machine learning; artificial neural networks; support vector regression; CFD model

1. Introduction

The mixing of fluids in water distribution networks is a complex phenomenon that has been researched extensively, as it is relevant to several areas of application such as water distribution quality and safety [1,2,3], pollution source detection systems (in both large [4,5] and small networks [6]), and optimal pollution sensor placement in a water distribution network [7,8,9,10]. Water distribution networks are formed of pipes and junctions. When modeling mixing in a complex system, a correct mixing model must be applied to accurately describe contaminant transport through the network, because a wrong solution could present a hazard to a great number of network users.

Usually, mixing in a pipe network is modeled as either complete mixing or bulk mixing. Complete mixing can be described as an even split of contamination at a network junction. Complete mixing models such as the one developed in the hydraulic analysis software EPANET are implemented by calculating the flow-weighted concentrations at the inlet pipes of a junction and then assuming an even split in the outlet pipes. The complete mixing model can be assumed correct only if there is a single outlet at a junction and if the distance between two junctions is great enough. For certain distances [11,12], it does not describe the physical process of mixing correctly.
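The flow-weighted calculation behind complete mixing can be illustrated with a minimal sketch. The function name and the example flows are assumptions made for illustration; this is not EPANET code.

```python
def complete_mixing(flows, concentrations):
    """Flow-weighted mean concentration at a junction (complete mixing).

    Every outlet pipe is assumed to receive this same concentration.
    """
    total_flow = sum(flows)
    if total_flow <= 0:
        raise ValueError("total inflow must be positive")
    weighted = sum(q * c for q, c in zip(flows, concentrations))
    return weighted / total_flow


# Hypothetical case: tracer (c = 1) in inlet 1, clean water (c = 0) in inlet 2,
# equal inflows. Both outlets would then carry the same concentration.
c_out = complete_mixing([1.0, 1.0], [1.0, 0.0])
print(c_out)
```

With unequal inflows the result shifts towards the concentration of the larger stream, but under this model it is still identical in every outlet.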

Bulk mixing is an idealized mixing in which the flows from the two inlet pipes do not interact and the diffusive behavior between the inlet pipe streams is ignored (the streams only touch at an interface). To model bulk mixing, the flow momentum in the inlet pipes is calculated, and it determines how the concentration splits between the outlets. The inlet pipe with the higher momentum will dominate in concentration transfer into the outlet pipe that follows the same direction as the inlet pipe. The inlet pipe with lower flow momentum will only transfer the concentration into its neighboring outlet pipe. This kind of mixing behavior is usually specific to cross junctions. The bulk mixing model is implemented in the software EPANET-BAM [13] (which can be considered an extension of EPANET). If the expected mixing behavior is neither complete nor bulk, an experimentally calibrated mixing model parameter can be used to define the type of mixing.
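One way to picture a calibrated mixing parameter is as a blend between the two idealized limits. The sketch below assumes a linear interpolation with a parameter `s` between 0 (bulk) and 1 (complete); the function name, the linear form, and the sample values are assumptions for illustration and are not taken from this paper.

```python
def blended_concentration(c_bulk, c_complete, s):
    """Interpolate an outlet concentration between bulk and complete mixing.

    s = 0 reproduces the bulk-mixing value, s = 1 the complete-mixing value.
    The linear form is an assumption of this sketch.
    """
    if not 0.0 <= s <= 1.0:
        raise ValueError("s must lie in [0, 1]")
    return c_bulk + s * (c_complete - c_bulk)


# Hypothetical outlet where bulk mixing predicts 1.0 and complete mixing 0.5.
for s in (0.0, 0.5, 1.0):
    print(s, blended_concentration(1.0, 0.5, s))
```

In practice the value of `s` would have to come from experimental calibration for the junction in question.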

In research by McKenna et al. [14], it is shown that, for a cross junction (which is a special case of a double-Tee junction in which the distance between converging and diverging pipes is equal to zero), a strong deviation from complete mixing occurs when the Reynolds number at each inlet and outlet pipe is equal. On the other hand, stronger turbulence can also create considerable unpredictability of mixing in cross junctions [15]. The mixing that occurs at two sequential Tee junctions is especially complex due to increased chaotic eddying of flow, which in turn increases the diffusion between the streams. Several studies [12,16,17] have shown that the length between double-Tee junctions is important since it determines the mixing behavior of the fluids, or rather the transport of contaminant into each diverging pipe. In addition, variation of flows at both inlet and outlet pipes causes the mixing behavior to vary [11,12]. Another property of a pipe system that affects mixing behavior is uneven pipe diameters, where a large difference between pipe diameters can produce a more complete mixing [18]. The orientation of pipes in a double-Tee junction also affects the mixing behavior: mixing behaves differently when the branch pipes are on opposite sides than when they are on the same side [11,12] (assuming a planar configuration). It is also shown that transitional and laminar flows in pipes exhibit a specific mixing behavior [19].

Instead of an experimental approach to studying mixing behavior in double-Tee junctions, computational fluid dynamics (CFD) can also be used. A species transport model coupled with RANS turbulence models yields good results [12,16] that can be used to further enhance simpler 1D mixing models (such as EPANET-BAM). Additionally, a high-fidelity LES turbulence model can be used to study mixing in more detail [20]. The problem with CFD simulations of contaminant transport is that they need to be calibrated, as they depend on the turbulent Schmidt number, the value of which is very much case-specific [21,22] and has been reported differently in several CFD studies of mixing in double-Tee junctions [12,23,24]. Furthermore, CFD simulations are computationally expensive, and producing results for a range of combinations of mixing scenarios (varying inlet and outlet flow, distance between junctions, and even pipe diameters) is quite impractical. Alternatively, mixing behavior can be predicted by data analysis (i.e., interpolation on a fixed, previously produced dataset), as was done in a previous study [25] where Kriging and Delaunay triangulation were combined to describe mixing for different Reynolds numbers.

Recently, machine learning (ML) algorithms used in hydrology and hydraulics include Artificial Neural Networks (ANN) and variations [26,27], Random Forest [28], and Support Vector Machine [29]. ML has been previously successfully implemented in pipe flow systems. A selection of ML algorithms such as Random Forest, Bagging Algorithm, and Regression Tree were used to predict various characteristics of a wastewater pipe system and it was found that Regression Trees are most successful [30]. Several ML algorithms were also used to predict pressure gradients, liquid holdup, and flow pattern identification of multiphase flow on a pipe segment (where data were generated by CFD) and, for some of the characteristics, the Gradient Boosting algorithm and Support Vector Machine yielded best results [31]. Deep learning (i.e., ANN) was successfully applied to produce the best valve scheduling scenario in the case of pipe network contamination instead of searching for a contamination source [32]. Similarly, Support Vector Regression (SVR) was used to detect anomalies in a pipe network system [33] and a modified ANN approach was used for pipe network management [34].

An ML approach can also be used to evaluate the mixing behavior based on previously obtained data (either by experiment or by CFD). In this study, two ML algorithms, SVR and ANN, were tested in predicting mixing behavior for a combination of different inlet and outlet pipe flow ratios and different distances between double-Tee junctions, on one pipe configuration. Training data were generated by 3D CFD that was calibrated on an experiment presented in a previous study [12]. A supercomputer was used for CFD modeling due to its high computational demand and the need to generate as much data as possible for better ML training.

2. Materials and Methods

2.1. Problem Description

Mixing prediction was performed for a previously reported experiment [12], in which the setup includes varying distances between double-Tee junctions and varying inflows in pipes.

Figure 1 shows the pipe configuration used. Branch pipes are on opposite sides and the arrows represent the flow direction. Tap water (T) (which was used as a tracer) enters the main inlet pipe 1 while distilled water (D) enters the branch inlet pipe 2, causing the mixture (M) of tap and distilled water to exit both the main outlet pipe 3 and the branch outlet pipe 4. The electrical conductivity of tap water in the experiment was measured to be around 200 μS/cm while the distilled water conductivity was around 2 μS/cm at pipe inlets 1 and 2, respectively. This difference in electrical conductivity of the fluids was used to determine the mixing process in the double-Tee junctions since they differed by a factor of 100. The electrical conductivity of the mixture that leaves pipe outlets 3 and 4 was measured and was always bounded by the inlet tap water and inlet distilled water electrical conductivity. The rate of mixing, or the electrical conductivity of the mixture at the outlets, depends on the flow conditions in the inlet and outlet pipes.

The measure of mixing taken from experimental data and used for further investigation of mixing was the branch outlet pipe 4 to main inlet pipe 1 (see Figure 1) conductivity ratio, which is defined as

R_{c} = \frac{c_{4}}{c_{1}}

(1)

where $c_{4}$ represents the conductivity at outlet pipe 4, while $c_{1}$ is the conductivity at inlet pipe 1.

This ratio was chosen due to its usage in previous studies [13,35] as it can be used in mixing prediction for simpler 1D models (EPANET-BAM). The inlet pipe flow ratio was also varied in the experiment and it is defined as

R_{q_{i n}} = \frac{Q_{1}}{Q_{2}}

(2)

where $Q_{1}$ defines the flow at inlet pipe 1 and $Q_{2}$ the flow at inlet pipe 2. The values of $Q_{1}$ and $Q_{2}$ were controlled with valves in the experiment until a steady state behavior was achieved and then the measurement of the electrical conductivity at each outlet pipe was made.

The inlet pipe flow ratios $R_{q_{in}}$ used in the experiment were 0.333, 0.5, 1, 2, and 3, while the flow at both outlet pipes was the same. One of the purposes of this study was to examine the behavior of mixing when $R_{q_{in}}$ is not equal to 1 while the flow at both outlet pipes is also varied, since there is a lack of previous research regarding these combinations (which are the most probable in a realistic case). Another important characteristic of the experiment is the distance between the double-Tee junctions, which was varied as 5.6D, 10D, and 15D (where D is the diameter of the pipes). All of the relevant specifications of the experiment are summarized in Table 1.

2.2. CFD Simulations

2.2.1. CFD Model

A CFD approach was used as both a comparison of efficiency with the ML approach and a dataset generator for ML algorithms. The open-source CFD toolbox OpenFOAM [36] was used to model the mixing phenomena as it was previously proven to work well for the given problem. The isothermal 3D steady Reynolds-Averaged Navier–Stokes (RANS) equations with the passive scalar transport model and the k-Epsilon (k-ϵ) turbulence model were solved, where k represents the turbulence kinetic energy and ϵ is the turbulence energy dissipation. The passive scalar transport model is defined as:

\nabla \cdot (v c) = \nabla \cdot (D_{t} \nabla c)

(3)

D_{t} = D_{m o l} + \frac{ν_{t}}{S_{c t}}

(4)

ν_{t} = C_{μ} \frac{k^{2}}{ϵ}

(5)

In Equation (3), parameter c represents a dimensionless scalar value, set as $c = 0$ for distilled water and $c = 1$ for tap water as it represents the tracer in pipes; $D_{t}$ is the turbulent diffusivity defined by Equation (4); and v represents the advection velocity vector. Equation (3) can be physically interpreted as transport of a scalar quantity c by the fluid flow, which is resolved by the coupled RANS/k-ϵ model. The additional passive scalar transport model in Equation (3) was coupled with the steady-state turbulent OpenFOAM solver simpleFoam, making it a new fluid flow solver referred to as tracerFoam.

Equation (4) is essential for modeling mixing phenomena as the turbulent Schmidt number $S_{ct}$ needs to be calibrated. According to the results of the previous study [12], $S_{ct}$ was defined as $S_{ct} = 0.5$. As for other parameters, $D_{mol}$ is the molecular diffusivity of water (set as $D_{mol} = 1.7 \times 10^{-9}$ m²/s [37]), $ν_{t}$ represents the kinematic eddy viscosity, and the constant $C_{μ}$ was set to $C_{μ} = 0.09$.
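Equations (4) and (5) can be evaluated directly once local values of k and ϵ are known. The sketch below uses the constants stated in the text; the sample values of k and ϵ are hypothetical and chosen only to show the order of magnitude of the turbulent term relative to the molecular one.

```python
C_MU = 0.09      # model constant C_mu, from the text
SC_T = 0.5       # calibrated turbulent Schmidt number, from the text
D_MOL = 1.7e-9   # molecular diffusivity in m^2/s, from the text


def eddy_viscosity(k, eps, c_mu=C_MU):
    """Kinematic eddy viscosity nu_t = C_mu * k^2 / eps, Equation (5)."""
    return c_mu * k ** 2 / eps


def turbulent_diffusivity(k, eps, sc_t=SC_T, d_mol=D_MOL):
    """Turbulent diffusivity D_t = D_mol + nu_t / Sc_t, Equation (4)."""
    return d_mol + eddy_viscosity(k, eps) / sc_t


# Hypothetical local turbulence values (not taken from the simulations).
k, eps = 0.01, 0.1   # m^2/s^2 and m^2/s^3
print(eddy_viscosity(k, eps), turbulent_diffusivity(k, eps))
```

For these sample values the turbulent contribution is several orders of magnitude larger than the molecular diffusivity, which is consistent with the result being sensitive to the choice of the turbulent Schmidt number.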

A bounded second-order upwind-biased scheme was used for the divergence terms of the model, while for the gradient terms the central difference scheme was used. The fluid used for all simulation runs was water with kinematic viscosity of $ν = 1 \times 10^{-6}$ m²/s.

A uniform value for velocity $v$ was defined at pipe inlets and outlets and a no-slip boundary condition was defined at the pipe walls. The values of $v$ were set at the pipe inlets and pipe outlets to form an exact ratio of both $R_{q_{in}}$ and $R_{q_{out}}$ for all cases (0.333, 0.5, 1, 2, and 3). Pressure $p = 0$ was defined at outlet pipe 4 and zero pressure gradient in the normal direction was defined on the inlets, outlet pipe 3, and pipe walls. The concentration c was fixed at the inlets ($c = 1$ for inlet pipe 1 and $c = 0$ for inlet pipe 2), while, at the outlets and pipe walls, the derivative of concentration in the normal direction was set to zero. Additionally, the value of c was monitored at outlet pipe 3, and, to achieve a steady state regime, it had a convergence criterion of 10⁻⁵. For the k-ϵ turbulence model, the values of k and ϵ at inlets and outlets were defined the same way as for c except at the pipe walls, where turbulence wall functions were used. Boundary conditions used in the numerical model are summarized in Table 2.
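Setting boundary velocities that form exact flow ratios amounts to splitting one total flow twice, once between the inlets and once between the outlets. The sketch below assumes equal pipe diameters and a prescribed total flow; the function names and the sample numbers are hypothetical and are not the values used in the simulations.

```python
import math


def split_flow(total_flow, ratio):
    """Split total_flow into (Q_a, Q_b) such that Q_a / Q_b equals ratio."""
    q_b = total_flow / (1.0 + ratio)
    return total_flow - q_b, q_b


def boundary_velocities(total_flow, r_q_in, r_q_out, diameter):
    """Mean velocities at pipes 1-4 for given inlet and outlet flow ratios."""
    area = math.pi * diameter ** 2 / 4.0
    q1, q2 = split_flow(total_flow, r_q_in)    # R_q_in  = Q1 / Q2
    q3, q4 = split_flow(total_flow, r_q_out)   # R_q_out = Q3 / Q4
    names = ("v1", "v2", "v3", "v4")
    return {name: q / area for name, q in zip(names, (q1, q2, q3, q4))}


# Hypothetical total flow (m^3/s) and diameter (m), for illustration only.
v = boundary_velocities(total_flow=1.0e-3, r_q_in=2.0, r_q_out=0.5, diameter=0.05)
print(v)
```

Because both splits use the same total flow, the inflow and outflow balance by construction.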

2.2.2. Mesh Independence

Firstly, to use it as a dataset generator, the CFD model was validated with the previously obtained experimental data. A mesh independence study was done to find the optimal mesh sizes for data generation. Each distance between the double-Tee junctions yielded a differently sized mesh, with greater distances requiring more cells. The CFD approach mesh independence was validated with $R_{q_{in}}$ equal to 1 and with equal flow at outlet pipes ($R_{q_{out}}$ equal to 1) for all of the above-mentioned distances (Table 1). The grid convergence residual criterion was set to 10⁻⁵ for Equation (3). Meshes used in the mesh independence study can be seen in Table 3 where the average cell size ranged from 1.2 to 0.5 mm, while the summary of the mesh independence study is presented in Table 4 where $R_{c}$ is the branch outlet pipe to main inlet pipe conductivity ratio. For ML model data generation, the fine mesh was chosen.

2.2.3. CFD Data Generation

After the mesh independence study was completed and the numerical model was validated, CFD was used to generate data for ML. The inlet pipe flow ratio in Equation (2), defined as in the experiment, was varied together with the outlet pipe flow ratio, which is defined in Equation (6), where $Q_{3}$ is the main pipe 3 outlet flow and $Q_{4}$ is the branch pipe 4 outlet flow.

R_{q_{o u t}} = \frac{Q_{3}}{Q_{4}}

(6)

For an array of $R_{q_{in}}$ and $R_{q_{out}}$, the value of $R_{c}$ was obtained ($R_{c_{CFD}}$). The ratios were all possible combinations of 0.333, 0.5, 1, 2, and 3 between the inlet pipes and the outlet pipes (e.g., $R_{q_{in}} = 2$ and $R_{q_{out}} = 0.333$, $R_{q_{in}} = 1$ and $R_{q_{out}} = 2$, or $R_{q_{in}} = 0.333$ and $R_{q_{out}} = 0.5$).

This was conducted for all distances (5.6D, 10D, and 15D), comprising a total of 75 simulations on the fine mesh (25 simulations per distance).
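The simulation matrix described above is a full Cartesian product of the ratio values and the distances. A minimal sketch that enumerates it follows; the dictionary keys are names chosen here for illustration.

```python
from itertools import product

RATIOS = (0.333, 0.5, 1, 2, 3)   # values used for both R_q_in and R_q_out
DISTANCES = (5.6, 10, 15)        # junction spacing in pipe diameters D

# One entry per CFD case: every (distance, R_q_in, R_q_out) combination.
cases = [
    {"L_over_D": d, "R_q_in": r_in, "R_q_out": r_out}
    for d, r_in, r_out in product(DISTANCES, RATIOS, RATIOS)
]

print(len(cases))
print(cases[0])
```

Each distance contributes 5 × 5 cases, which gives the 25 simulations per distance stated in the text.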

2.3. ML Model

2.3.1. Artificial Neural Networks

Both an Artificial Neural Network (ANN) and Support Vector Regression (SVR) were used to train a model which predicts the behavior of mixing in a double-Tee junction case. For both methods, the input and output data were split as 70% for model training and 30% for model accuracy testing.
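A 70/30 split of the simulated cases can be sketched with the standard library alone. Rounding the test share up is an assumption of this sketch, and the function name and seed are hypothetical; the authors' actual splitting routine is not described in the text.

```python
import math
import random


def split_70_30(samples, test_fraction=0.3, seed=0):
    """Shuffle the samples and return (train, test) lists.

    The test size is rounded up, which is an assumption of this sketch.
    """
    rng = random.Random(seed)
    shuffled = list(samples)
    rng.shuffle(shuffled)
    n_test = math.ceil(test_fraction * len(shuffled))
    return shuffled[n_test:], shuffled[:n_test]


# 75 case indices stand in for the 75 CFD simulations.
train_set, test_set = split_70_30(range(75))
print(len(train_set), len(test_set))
```

Changing `seed` produces a different random partition, which is what repeated training runs with randomly selected data would rely on.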

Generally, ANNs are a type of ML algorithm used to model complex phenomena that are not easy to model using conventional data analysis methods. The ANNs used in this study are Feedforward Multilayer Perceptron Neural Networks. This type of ANN is constructed from an input layer, which usually holds the data that drive the modeled phenomenon; one or more hidden layers; and one output layer, which usually holds the outcome of the phenomenon. Layers consist of artificial neurons, and links exist between the neurons of adjacent layers. Training an ANN is the process of fitting the weights of these links so that the network maps each input datum to the desired output. Further information on ANNs can be found in the literature [38].

In this study, three hidden layers were added between the input and output layers and between every layer a dense connection was defined. The input layer consisted of three inputs: the inlet pipe flow ratio $R_{q_{in}}$, the outlet pipe flow ratio $R_{q_{out}}$, and the distance between the double-Tee junctions (5.6D, 10D, and 15D). The first hidden layer consisted of 50 neurons, both the second and third hidden layers consisted of 25 neurons, and all hidden layers used the rectifier activation function. The output layer was the value of Equation (1), or $R_{c_{ANN}}$. The sigmoid activation function was used at the output layer. The layer configuration is summarized in Table 5.
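The layer configuration can be made concrete with a plain-Python forward pass. This is a stand-in that only shows the shapes and activations described above; it is not the Keras implementation used in the study, the weights are random and untrained, and the initialization range is an arbitrary assumption.

```python
import math
import random

LAYER_SIZES = [3, 50, 25, 25, 1]   # inputs, three hidden layers, output


def init_layers(sizes, seed=0):
    """Create (weights, biases) per dense layer with small random weights."""
    rng = random.Random(seed)
    layers = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def forward(layers, x):
    """Dense forward pass: ReLU on hidden layers, sigmoid on the output."""
    a = list(x)
    last = len(layers) - 1
    for i, (weights, biases) in enumerate(layers):
        z = [sum(w * v for w, v in zip(row, a)) + b
             for row, b in zip(weights, biases)]
        if i < last:
            a = [max(0.0, v) for v in z]                   # rectifier
        else:
            a = [1.0 / (1.0 + math.exp(-v)) for v in z]    # sigmoid
    return a[0]


layers = init_layers(LAYER_SIZES)
n_params = sum(len(w) * len(w[0]) + len(b) for w, b in layers)
# Inputs: R_q_in, R_q_out, distance in pipe diameters (untrained output).
r_c = forward(layers, [1.0, 1.0, 10.0])
print(n_params, r_c)
```

The sigmoid output keeps every prediction between 0 and 1, which matches the range of a conductivity ratio bounded by the two inlet values.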

The ANN was trained using the ADAM optimizer [39] for 500 epochs with the binary cross-entropy loss function for error modeling. The initial weights of each layer link were randomized. The ANN was implemented in the Python Neural Network library Keras [40].
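Binary cross-entropy is usually associated with classification, but it remains well defined for a continuous target between 0 and 1, where it is minimized when the prediction equals the target. The sketch below illustrates this property; the function is a generic textbook form written for this example, not a Keras call, and the target value is hypothetical.

```python
import math


def binary_cross_entropy(y_true, y_pred, tiny=1e-12):
    """Cross-entropy for one sample with a target in [0, 1]."""
    p = min(max(y_pred, tiny), 1.0 - tiny)   # avoid log(0)
    return -(y_true * math.log(p) + (1.0 - y_true) * math.log(1.0 - p))


target = 0.7   # hypothetical mixing ratio
for prediction in (0.5, 0.7, 0.9):
    print(prediction, binary_cross_entropy(target, prediction))
```

Note that the minimum value is not zero for a target strictly between 0 and 1, so the loss itself is less directly interpretable than an RMSE.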

2.3.2. Support Vector Regression

SVR can be considered a special case of Support Vector Machine (SVM) models. SVMs can be used for both classification and regression problems, and SVR belongs to the family of algorithms used to create regression models. The SVR model is given training data in the form of inputs and outputs and learns the relationship between them. The inputs are n-dimensional vectors and the outputs are continuous values (unlike in classification problems). The input–output mapping is approximated using kernel functions and weights. As in ANNs, where the link weights are fitted, the weights in SVR models are also fitted. In the process of learning, a hyperplane is created between the n-dimensional data along with error planes. The main goal is to find the data (or the support vectors) that are within the error margin. A further description of SVR can be found in the literature [41].

The data input and output values were the same as the ones for the ANN (defined in Section 2.3.1). A fifth-degree radial basis function (RBF) kernel was used for the model, with a tolerance of 0.001 (which is the model fitting stopping criterion), an ϵ of 0.0001 (which specifies the error margin), and a penalty parameter equal to 3 (which is enforced for values outside the error margin). SVR was implemented in the Python ML library Scikit-learn [42].
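The roles of the kernel and the error margin can be illustrated with two small functions. They are generic textbook definitions written in plain Python, not Scikit-learn calls; the `gamma` value is a hypothetical choice, since the text does not state it. As far as the Scikit-learn documentation indicates, the `degree` setting only affects the polynomial kernel, so it is left out of this RBF sketch.

```python
import math

EPSILON = 0.0001   # error margin, from the text
PENALTY_C = 3.0    # penalty for points outside the margin, from the text


def rbf_kernel(x, y, gamma=1.0):
    """Radial basis function kernel exp(-gamma * ||x - y||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def epsilon_insensitive_loss(y_true, y_pred, epsilon=EPSILON):
    """Zero inside the margin, linear in the excess error outside it."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


# Two hypothetical inputs (R_q_in, R_q_out, distance in diameters).
print(rbf_kernel([1.0, 1.0, 10.0], [1.0, 2.0, 10.0]))
# Penalized error for a prediction that misses the target by 0.1.
print(PENALTY_C * epsilon_insensitive_loss(0.5, 0.6))
```

With a margin as narrow as 0.0001, almost every training residual falls outside it and is penalized, so the fit is driven by nearly all of the training points.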

3. Results and Discussion

3.1. CFD Results and Efficiency

The CFD analysis was done for 75 different scenarios by varying $R_{q_{in}}$, $R_{q_{out}}$, and three distances (5.6D, 10D, and 15D). The CFD simulations with $R_{q_{in}}$ equal to 0.333, 0.5, 2, and 3 and $R_{q_{out}}$ equal to 1 were compared with the experimental values to make sure that the procedure is valid and that a CFD model can be used as a data generator for ML (simulation results for $R_{q_{in}}$ equal to 1 were given previously in Section 2.2.2). The results are summarized in Table 6.

For the distance of 5.6D between double-Tee junctions, the root-mean-square error (RMSE) is 0.0214, with the minimum relative error being +0.3% and the maximum relative error −10% between the experimental and CFD values of $R_{c}$. The 10D distance had an RMSE of 0.0246, a minimum relative error between experimental and CFD $R_{c}$ of −3.05%, and a maximum relative error of +11%. For the 15D distance, the RMSE is 0.013, with a minimum relative error of 0.9% and a maximum relative error of 8.9%. The error results are summarized in Table 7.
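The two error measures used in this comparison can be written out as short functions. The sign convention for the relative error (reference minus prediction, divided by the reference) is an assumption of this sketch, and the sample values are made up for illustration; they are not entries from Table 6.

```python
import math


def rmse(reference, predicted):
    """Root-mean-square error between two equally long sequences."""
    squared = [(r - p) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def relative_error_percent(reference, predicted):
    """Signed relative error in percent (sign convention assumed here)."""
    return 100.0 * (reference - predicted) / reference


# Made-up R_c values standing in for experimental and CFD results.
r_c_experiment = [0.50, 0.60, 0.70, 0.80]
r_c_cfd = [0.52, 0.59, 0.68, 0.81]

print(rmse(r_c_experiment, r_c_cfd))
print([relative_error_percent(e, c) for e, c in zip(r_c_experiment, r_c_cfd)])
```

The RMSE summarizes the overall agreement in the units of $R_{c}$, while the relative errors show how far individual cases deviate.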

It can be seen that the error of the CFD model for the 15D distance is the lowest one. This can be attributed to the larger distance between double-Tees, since the mixture is already averaged enough (and closer to complete mixing) across the pipe cross section before it is transported into the diverging pipes where, arguably, the most complex eddying of flow is happening. At smaller distances, that averaging of concentration before the diverging pipe is less visible. This can be seen in Figure 2, where Figure 2a is the concentration in the pipe cross section at 1.67D before the branch pipe 4 center-line for the 5.6D distance, Figure 2b represents the same for the 10D distance, and Figure 2c shows results for 15D. In Figure 2d, the modeled kinetic energy of turbulence is shown; it is highest at the entrance of branch pipe 4, which could be the cause of a slightly higher error for smaller distances. Overall, the errors are acceptable and CFD proved to be a viable tool for modeling this phenomenon.

Even though the CFD method used provides a good model for mixing phenomena in double-Tee junctions, there is still a problem of computational efficiency. All simulations (for the fine mesh that ranged from approximately 1.8 to 3 million cells) were done using the BURA supercomputer resources at the University of Rijeka. All runs were executed in parallel on 8 Intel E5 nodes with 20 threads per node, resulting in 160 processes in total. The computational run times (until the residual of Equation (3) fell below $10^{-5}$) for each case are summarized in Table 8.

Taking into account that these results were obtained with a supercomputer on 160 Intel E5 cores, this methodology is very inefficient if employed on a personal computer, especially since a lot of computation is necessary for a large variety of possible combinations of inlet pipe flows, outlet pipe flows, and distances between double-Tee junctions.

3.2. ML Results

3.2.1. Models’ Training and Testing

Of the 75 obtained simulation results, 70% were used for model training and 30% for model accuracy testing for both ANN and SVR. This train-to-test ratio was chosen so that the training dataset would include all possible patterns used for defining the problem and extend to the edge of the modeling domain. Both models were tested 10 times with randomly selected input data, and the models with the minimum RMSE on the test data were adopted and used further (the RMSE of both ML models varied by around 0.008 between the minimum and maximum value). The CFD results were taken to be exact (since they exhibit good agreement with the experimental results based on the average RMSE), and the ANN and SVR model results were compared with them to examine their accuracy. For the same data inputs ($R_{q_{in}}$, $R_{q_{out}}$, and distance), the ANN model performed better than the SVR model regarding the accuracy of the testing data (predictions of $R_{c}$). The RMSE of the ANN predictions was 0.0172, while for SVR it was 0.0361, with a maximum error of 0.0427 for ANN and 0.119 for SVR. The results of training data predictions are summarized in Table 9, while a detailed comparison of predicted data can be seen in Table 10.

The values of $R_{c_{SVR}}$ (SVR-predicted $R_{c}$) and $R_{c_{ANN}}$ (ANN-predicted $R_{c}$) in Table 10 were obtained from the 30% test portion of the dataset; the values in bold are those that better predict the CFD-modeled values $R_{c_{CFD}}$, which serve as the correct results of mixing behavior. ANN had a better prediction than SVR for 14 out of 23 input data values, and for the other 9 predictions it was also quite close to the CFD-modeled values. Overall, both models exhibit solid accuracy for predicting the input data, even though ANN is the better choice for further examination.

3.2.2. $R_{c}$ Prediction for Other Double-Tee Distances

Another useful test was done on the ANN and SVR models: the prediction of $R_{c}$ for distances that were not in the input training data. The chosen distances were 7.5D and 12.5D since they are in between the training data values (5.6D, 10D, and 15D). The results of the 7.5D and 12.5D tests can be seen in Table 11, and again the results in bold are the ones that agree better with the CFD prediction.

After testing the prediction on different double-Tee junction distances (which were not included in the training data), it is obvious that ANN greatly outperforms SVR on this task. Hence, further double-Tee mixing ML analysis (of a whole range of $R_{q_{in}}$ and $R_{q_{out}}$ values) was only done using ANN.

3.2.3. $R_{c}$ Prediction for a Range of $R_{q_{i n}}$ and $R_{q_{o u t}}$ with ANN

A prediction over a range of possible combinations of inlet and outlet pipe flow ratios was done, albeit on a range that is still inside the minimum and maximum values of $R_{q_{in}}$ and $R_{q_{out}}$ (from 0.333 to 3) and the distances of the training data; thus, no extrapolation was done. The purpose of this test was to obtain the whole $R_{q_{in}}$–$R_{q_{out}}$ surface for a given distance between the double-Tee junctions. Before extracting the surface data from the ANN model, which show mixing behavior for the whole array of $R_{q_{in}}$ and $R_{q_{out}}$, an additional validity check of the ANN model was performed for $R_{q_{in}} = 1.3$ and $R_{q_{out}} = 1$, since the testing data covered only $R_{q_{in}}$ and $R_{q_{out}}$ values of 0.333, 0.5, 1, 2, and 3. The CFD model result $R_{c_{CFD}}$ was 0.706 while $R_{c_{ANN}}$ was 0.681, which corresponds to an error of +3.54% when $R_{c_{CFD}}$ is considered to be the exact value.
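The quoted error follows from the two reported values when the CFD result is taken as the reference. The arithmetic, with the sign taken as CFD minus ANN, is:

```latex
\frac{R_{c_{CFD}} - R_{c_{ANN}}}{R_{c_{CFD}}} \times 100\%
  = \frac{0.706 - 0.681}{0.706} \times 100\%
  = \frac{0.025}{0.706} \times 100\%
  \approx 3.54\%
```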

Figure 3, Figure 4 and Figure 5 show the values of $R_{c}$ predicted by ANN for the distances 5.6D, 10D, and 15D, respectively. Values of $R_{c_{ANN}}$ are represented by color and iso-lines. All three figures show continuous values of $R_{c_{ANN}}$ instead of the discrete ones obtained by CFD and experiments. Comparing the figures shows that the gradient of $R_{c_{ANN}}$ is smoother (the difference between the minimum and the maximum value is smaller) when the double-Tee distance is increased, which is logical since a much more complete mixing behavior is achieved with an increased distance. It can also be noticed (for all distances) that, when both $R_{q_{in}}$ and $R_{q_{out}}$ are increased, the value of $R_{c_{ANN}}$ tends towards 1. This can be interpreted as follows: when the flow in the main inlet pipe and the flow in the main outlet pipe increase simultaneously, the concentration in the main outlet pipe also increases, and the mixing therefore deviates more from complete mixing. A similar thing happens when both $R_{q_{in}}$ and $R_{q_{out}}$ are decreasing, making the concentration ratio $R_{c_{ANN}}$ lower (it tends towards 0).

Additionally, in Figure 6, a 3D $R_{c_{ANN}}$ plot that contains the surfaces for the 5.6D, 10D, and 15D distances is presented to better depict the gradient or the steepness of the surfaces.
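Extracting such a surface amounts to evaluating a trained model on a regular grid of flow ratios at a fixed distance. The sketch below shows the grid evaluation only; `placeholder_predict` is a hypothetical stand-in that returns a constant and is not the trained ANN, and the grid resolution is an arbitrary choice.

```python
def linspace(start, stop, n):
    """Return n evenly spaced values from start to stop inclusive."""
    step = (stop - start) / (n - 1)
    return [start + i * step for i in range(n)]


def mixing_surface(predict, distance, n=50, lo=0.333, hi=3.0):
    """Evaluate predict(R_q_in, R_q_out, distance) on an n-by-n grid.

    Rows follow R_q_out and columns follow R_q_in, both within [lo, hi],
    so the grid stays inside the range covered by the training data.
    """
    ratios = linspace(lo, hi, n)
    return [[predict(r_in, r_out, distance) for r_in in ratios]
            for r_out in ratios]


def placeholder_predict(r_q_in, r_q_out, distance):
    # Hypothetical stand-in for a trained model; returns a constant.
    return 0.5


surface = mixing_surface(placeholder_predict, distance=10.0, n=5)
print(len(surface), len(surface[0]))
```

The resulting nested list can be passed to any contour or surface plotting routine to produce figures of the kind described above.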

3.2.4. On ANN Efficiency

It is hard to quantify the efficiency of the ANN model since it is dependent on previously generated data (either by CFD or experiment). However, if the task is to generate, with a trained ANN, a whole range of new values of $R_{c}$ from a range of input values, then it can be argued that ANN is extremely efficient, since obtaining $R_{c_{ANN}}$ for 23 input data took 0.18 s.

4. Summary and Conclusions

In this paper, a new methodology for double-Tee junctions mixing modeling is presented based on an experiment from the previous literature. A CFD model was calibrated with the experimental data to serve both as a computational efficiency benchmark and a data-generator for a ML approach for the presented double-Tee junction mixing phenomena assessment problem.

In total, 75 simulations were done by the calibrated CFD RANS k-ϵ turbulence model and the data were used to train two different ML models. The input variables were defined as the inlet pipe flow ratio, the outlet pipe flow ratio, and distance between the double-Tee junctions, while the output was the outlet branch pipe to main inlet pipe conductivity ratio, which is a relevant variable that describes the mixing behavior in a double-Tee junction.

The two tested ML algorithms were Support Vector Regression and Artificial Neural Network (specifically Feedforward Multilayer Perceptron), and, of the two, ANN outperformed SVR in accuracy. Empirical results obtained with the ANN model used show good approximation in predicting the mixing behavior for different distances between double-Tee junctions and inlet/outlet pipe flow ratios.

Besides the accuracy of ANN, it was determined that the computational efficiency of the ANN model is much greater than that of the CFD model, which makes it a great tool for modeling this kind of phenomenon. Since many previous studies produced experimental data for a great number of different combinations (which include different Reynolds number ranges in pipes, varied pipe diameters, distances between double-Tees, pipe configurations, etc.), a further study could be done to create an ANN that takes all of these variables into account and produces all the relevant parameters to accurately model mixing. With its computational efficiency, a fast model such as ANN could be directly incorporated into complex water supply pipe network models (such as EPANET and similar), which would greatly improve their accuracy and in turn make them a more powerful tool for predicting water network pollution dispersion and similar hazardous phenomena.

Author Contributions

Conceptualization, L.G., L.K., S.D., and I.L.; Data curation, L.G. and I.L.; Formal analysis, L.G.; Funding acquisition, L.K.; Investigation, L.G., L.K., and I.L.; Methodology, L.K., S.D., and I.L.; Project administration, L.K. and S.D.; Resources, L.K. and S.D.; Software, L.G.; Supervision, S.D.; Validation, L.G. and I.L.; Writing—original draft, L.G.; and Writing—review and editing, L.G., L.K., S.D., and I.L. All authors have read and agreed to the published version of the manuscript.

Funding

This research was funded by the Center for Advanced Computing and Modeling, University of Rijeka.

Conflicts of Interest

The authors declare no conflict of interest.

References

Di Nardo, A.; Di Natale, M.; Guida, M.; Musmarra, D. Water network protection from intentional contamination by sectorization. Water Resour. Manag. 2013, 27, 1837–1850.
Grayman, W.M.; Murray, R.; Savic, D.A. Effects of redesign of water systems for security and water quality factors. In Proceedings of the World Environmental and Water Resources Congress 2009, Kansas City, MO, USA, 17–21 May 2009; pp. 1–11.
Bello, O.; Abu-Mahfouz, A.M.; Hamam, Y.; Page, P.R.; Adedeji, K.B.; Piller, O. Solving Management Problems in Water Distribution Networks: A Survey of Approaches and Mathematical Models. Water 2019, 11, 562.
Kessler, A.; Ostfeld, A.; Sinai, G. Detecting accidental contaminations in municipal water networks. J. Water Resour. Plan. Manag. 1998, 124, 192–198.
Adedoja, O.; Hamam, Y.; Khalaf, B.; Sadiku, R. Towards development of an optimization model to identify contamination source in a water distribution network. Water 2018, 10, 579.
Kranjčević, L.; Čavrak, M.; Šestan, M. Contamination source detection in water distribution networks. Eng. Rev. 2010, 30, 11–25.
Ciaponi, C.; Creaco, E.; Di Nardo, A.; Di Natale, M.; Giudicianni, C.; Musmarra, D.; Santonastaso, G.F. Reducing Impacts of Contamination in Water Distribution Networks: A Combined Strategy Based on Network Partitioning and Installation of Water Quality Sensors. Water 2019, 11, 1315.
Krause, A.; Leskovec, J.; Guestrin, C.; VanBriesen, J.; Faloutsos, C. Efficient sensor placement optimization for securing large water distribution networks. J. Water Resour. Plan. Manag. 2008, 134, 516–526.
Berry, J.; Hart, W.E.; Phillips, C.A.; Uber, J.G.; Watson, J.P. Sensor placement in municipal water networks with temporal integer programming models. J. Water Resour. Plan. Manag. 2006, 132, 218–224.
Hart, W.E.; Murray, R. Review of sensor placement strategies for contamination warning systems in drinking water distribution systems. J. Water Resour. Plan. Manag. 2010, 136, 611–619.
Song, I.; Romero-Gomez, P.; Andrade, M.A.; Mondaca, M.; Choi, C.Y. Mixing at junctions in water distribution systems: An experimental study. Urban Water J. 2018, 15, 32–38.
Grbčić, L.; Kranjčević, L.; Lučin, I.; Čarija, Z. Experimental and Numerical Investigation of Mixing Phenomena in Double-Tee Junctions. Water 2019, 11, 1198.
Ho, C.K.; O’Rear, L., Jr. Evaluation of solute mixing in water distribution pipe junctions. J. Am. Water Work. Assoc. 2009, 101, 116–127.
McKenna, S.A.; Orear, L.; Wright, J. Experimental determination of solute mixing in pipe joints. In Proceedings of the World Environmental and Water Resources Congress 2007, Tampa, FL, USA, 15–19 May 2007; pp. 1–11.
Austin, R.; Waanders, B.v.B.; McKenna, S.; Choi, C. Mixing at cross junctions in water distribution systems. II: Experimental study. J. Water Resour. Plan. Manag. 2008, 134, 295–302.
Shao, Y.; Yang, Y.J.; Jiang, L.; Yu, T.; Shen, C. Experimental testing and modeling analysis of solute mixing at water distribution pipe junctions. Water Res. 2014, 56, 133–147.
Yu, T.; Tao, L.; Shao, Y.; Zhang, T. Experimental study of solute mixing at double-Tee junctions in water distribution systems. Water Sci. Technol. Water Supply 2014, 15, 474–482.
Yu, T.; Qiu, H.; Yang, J.; Shao, Y.; Tao, L. Mixing at double-Tee junctions with unequal pipe sizes in water distribution systems. Water Sci. Technol. Water Supply 2016, 16, 1595–1602.
Shao, Y.; Zhao, L.; Yang, Y.J.; Zhang, T.; Ye, M. Experimentally Determined Solute Mixing under Laminar and Transitional Flows at Junctions in Water Distribution Systems. Adv. Civ. Eng. 2019, 2019, 3686510.
Webb, S.W.; van Bloemen Waanders, B.G. High fidelity computational fluid dynamics for mixing in water distribution systems. In Proceedings of the Water Distribution Systems Analysis Symposium 2006, Cincinnati, OH, USA, 27–30 August 2006; pp. 1–15.
Gualtieri, C.; Angeloudis, A.; Bombardelli, F.; Jha, S.; Stoesser, T. On the values for the turbulent Schmidt number in environmental flows. Fluids 2017, 2, 17.
Valero, D.; Bung, D.B. Sensitivity of turbulent Schmidt number and turbulence model to simulations of jets in crossflow. Environ. Model. Softw. 2016, 82, 218–228.
Ho, C.K.; Wright, J.L.; McKenna, S.A.; Orear Jr, L. Contaminant Mixing at Pipe Joints: Comparison between Laboratory Flow Experiments and Computational Fluid Dynamics Models; Technical Report; Sandia National Lab. (SNL-NM): Albuquerque, NM, USA, 2006.
Romero-Gomez, P.; Choi, C.; van Bloemen Waanders, B.; McKenna, S. Transport phenomena at intersections of pressurized pipe systems. In Proceedings of the Water Distribution Systems Analysis Symposium 2006, Cincinnati, OH, USA, 27–30 August 2006; pp. 1–20.
Gilbert, D.; Mortazavi, I.; Piller, O.; Ung, H. Low dimensional modeling of Double T-junctions in water distribution networks using Kriging interpolation and Delaunay triangulation. Pac. J. Math. Ind. 2017, 9, 2.
Le, X.H.; Ho, H.V.; Lee, G.; Jung, S. Application of Long Short-Term Memory (LSTM) Neural Network for Flood Forecasting. Water 2019, 11, 1387.
Tian, Y.; Xu, Y.P.; Yang, Z.; Wang, G.; Zhu, Q. Integration of a parsimonious hydrological model with recurrent neural networks for improved streamflow forecasting. Water 2018, 10, 1655.
Saadi, M.; Oudin, L.; Ribstein, P. Random Forest Ability in Regionalizing Hourly Hydrological Model Parameters. Water 2019, 11, 1540.
Gedik, N. Least Squares Support Vector Mechanics to Predict the Stability Number of Rubble-Mound Breakwaters. Water 2018, 10, 1452.
Granata, F.; de Marinis, G. Machine learning methods for wastewater hydraulics. Flow Meas. Instrum. 2017, 57, 1–9.
Kanin, E.; Vainshtein, A.; Osiptsov, A.; Burnaev, E. The Method of Calculation the Pressure Gradient in Multiphase Flow in the Pipe Segment Based on the Machine Learning Algorithms; IOP Conference Series: Earth and Environmental Science; IOP Publishing: Bristol, UK, 2018; Volume 193, p. 012028.
Hu, C.; Cai, J.; Zeng, D.; Yan, X.; Gong, W.; Wang, L. Deep reinforcement learning based valve scheduling for pollution isolation in water distribution network. Math. Biosci. Eng. 2019, 17.
Vries, D.; van den Akker, B.; Vonk, E.; de Jong, W.; van Summeren, J. Application of machine learning techniques to predict anomalies in water supply networks. Water Sci. Technol. Water Supply 2016, 16, 1528–1535.
Sattar, A.M.; Ertuğrul, Ö.F.; Gharabaghi, B.; McBean, E.A.; Cao, J. Extreme learning machine model for water network management. Neural Comput. Appl. 2019, 31, 157–169.
Ho, C.K. Solute mixing models for water-distribution pipe networks. J. Hydraul. Eng. 2008, 134, 1236–1244.
Jasak, H.; Jemcov, A.; Tukovic, Z. OpenFOAM: A C++ library for complex physics simulations. In International Workshop on Coupled Methods in Numerical Dynamics; IUC: Dubrovnik, Croatia, 2007; Volume 1000, pp. 1–20.
Mills, R. Self-diffusion in normal and heavy water in the range 1–45°. J. Phys. Chem. 1973, 77, 685–688.
Nielsen, M.A. Neural Networks and Deep Learning; Determination Press: San Francisco, CA, USA, 2015; Volume 25.
Kingma, D.P.; Ba, J. Adam: A method for stochastic optimization. arXiv 2014, arXiv:1412.6980.
Chollet, F. Keras: The Python Deep Learning Library. 2019. Available online: https://keras.io/ (accessed on 24 October 2019).
Scholkopf, B.; Smola, A.J. Learning with Kernels: Support Vector Machines, Regularization, Optimization, and Beyond; MIT Press: Cambridge, MA, USA, 2001.
Pedregosa, F.; Varoquaux, G.; Gramfort, A.; Michel, V.; Thirion, B.; Grisel, O.; Blondel, M.; Prettenhofer, P.; Weiss, R.; Dubourg, V.; et al. Scikit-learn: Machine learning in Python. J. Mach. Learn. Res. 2011, 12, 2825–2830.

Figure 1. Scheme of the experimental configuration.

Figure 2. Concentration at pipe cross sections (1.67D before the center-line of branch outlet pipe 4 and all cross sections are perpendicular to fluid flow) and turbulence kinetic energy with R_qin and R_qout equal to 1 for all cases: (a) cross section concentration for 5.6D; (b) cross section concentration for 10D; (c) cross section concentration for 15D; and (d) turbulence kinetic energy k (m²/s²) for 5.6D (plan view), where arrows represent the flow direction and a dashed line the position of the cross section of the 5.6D case.

Figure 3. Predicted $R_{c_{ANN}}$ surface values for $R_{q_{in}}$, $R_{q_{out}}$, and 5.6D distance.

Figure 4. Predicted $R_{c_{ANN}}$ surface values for $R_{q_{in}}$, $R_{q_{out}}$, and 10D distance.

Figure 5. Predicted $R_{c_{ANN}}$ surface values for $R_{q_{in}}$, $R_{q_{out}}$, and 15D distance.

Figure 6. Predicted $R_{c_{ANN}}$ surfaces for 5.6D, 10D, and 15D.

Table 1. Experimental data.

Parameter	Value	Unit
Internal pipe diameter (D)	18	mm
Inlet pipes length	$20 D$	mm
Outlet pipes length	$40 D$	mm
Tee distances	$5.6 D, 10 D, 15 D$	mm
Flow in mixing zone	0.08−0.43	l/s
Reynolds number range	5658−30416	-
Inlet flow ratios	0.333, 0.5, 1, 2, 3	-
Kinematic viscosity of water	$1 \times 10^{- 6}$	m²/s

Table 2. Boundary conditions for the passive scalar model.

Variable	Inlets	Outlets	Pipe Walls
$v$	$v$	$v$	$v = 0$
p	$\partial p / \partial n = 0$	p and $\partial p / \partial n = 0$	$\partial p / \partial n = 0$
c	c	$\partial c / \partial n = 0$	$\partial c / \partial n = 0$
k	k	$\partial k / \partial n = 0$	$f_{w a l l}$
$ϵ$	$ϵ$	$\partial ϵ / \partial n = 0$	$f_{w a l l}$

Table 3. Computational mesh sizes (number of cells in thousands) for different distances.

Distance	Coarse	Mid	Fine	Finest
$5.6 D$	196	861	1819	4900
$10 D$	401	1300	2950	5000
$15 D$	538	1960	3000	5200

Table 4. Mesh independence study results (value of $R_{c}$) for different distances.

Distance	Coarse	Mid	Fine	Finest	Experiment
$5.6 D$	0.602	0.607	0.613	0.613	0.610
$10 D$	0.563	0.560	0.552	0.560	0.563
$15 D$	0.541	0.541	0.530	0.530	0.539

Table 5. ANN model definition with layers, activation function and data.

Layer	Neurons	Activation	Data
Input Layer	3	-	$R_{q_{i n}}$ , $R_{q_{o u t}}$ , Distance
Hidden Layer 1	50	Rectifier	-
Hidden Layer 2	25	Rectifier	-
Hidden Layer 3	25	Rectifier	-
Output Layer	1	Sigmoid	$R_{c_{A N N}}$
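
The layer layout in Table 5 can be illustrated with a minimal forward-pass sketch in plain Python. It only mirrors the layer sizes and activations from the table (3 inputs, hidden layers of 50, 25, and 25 rectifier units, one sigmoid output). The weight initialization, the function names, and the use of untrained random weights are assumptions made for illustration, so the output value carries no physical meaning.

```python
import math
import random

# Layer sizes from Table 5: input (3), three hidden layers, output (1).
LAYER_SIZES = [3, 50, 25, 25, 1]


def relu(x):
    """Rectifier activation used in the hidden layers."""
    return max(0.0, x)


def sigmoid(x):
    """Sigmoid activation used in the output layer; maps to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def init_params(sizes, seed=0):
    """Create small random weights and zero biases (hypothetical initialization)."""
    rng = random.Random(seed)
    params = []
    for n_in, n_out in zip(sizes[:-1], sizes[1:]):
        weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
        biases = [0.0] * n_out
        params.append((weights, biases))
    return params


def forward(params, x):
    """Propagate one input vector [R_q_in, R_q_out, distance] through the network."""
    activations = list(x)
    last = len(params) - 1
    for i, (weights, biases) in enumerate(params):
        pre = [sum(w * a for w, a in zip(row, activations)) + b
               for row, b in zip(weights, biases)]
        act = sigmoid if i == last else relu
        activations = [act(v) for v in pre]
    return activations[0]


def count_parameters(params):
    """Total number of trainable weights and biases."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in params)


if __name__ == "__main__":
    params = init_params(LAYER_SIZES)
    print("trainable parameters:", count_parameters(params))
    # Untrained network: the value below is not a mixing prediction.
    print("output for (1, 1, 5.6):", forward(params, [1.0, 1.0, 5.6]))
```

With these layer sizes the network has 2151 trainable parameters, and the sigmoid output keeps the predicted ratio between 0 and 1.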

Table 6. Comparison of CFD and experimental results for different $R_{q_{in}}$ values.

Distance	$R_{q_{in}}$	$R_{c_{CFD}}$	$R_{c}$ (experiment)
$5.6 D$	0.333	0.262	0.239
$5.6 D$	0.5	0.377	0.342
$5.6 D$	2	0.798	0.801
$5.6 D$	3	0.880	0.857
$10 D$	0.333	0.249	0.280
$10 D$	0.5	0.356	0.370
$10 D$	2	0.741	0.719
$10 D$	3	0.825	0.790
$15 D$	0.333	0.246	0.268
$15 D$	0.5	0.343	0.359
$15 D$	2	0.712	0.719
$15 D$	3	0.800	0.809

Table 7. CFD model error summary for all distances and $R_{q_{in}}$ from Table 6.

Distance	RMSE	Min. Error	Max. Error
$5.6 D$	0.0214	+0.3%	−10%
$10 D$	0.0246	−3.05%	+11%
$15 D$	0.013	0.9%	8.9%
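
The minimum and maximum errors in Table 7 can be approximately recomputed from the rows of Table 6. The sketch below assumes a signed relative error defined as (experiment − CFD) / experiment, which is not stated in the text. Under that assumption the 5.6D and 10D entries are recovered to rounding, while the 15D maximum comes out near 8.2% rather than 8.9%, so the convention actually used may differ. The function names are hypothetical.

```python
# (R_q_in, R_c from CFD, R_c from experiment), copied from Table 6.
TABLE_6 = {
    "5.6D": [(0.333, 0.262, 0.239), (0.5, 0.377, 0.342),
             (2, 0.798, 0.801), (3, 0.880, 0.857)],
    "10D": [(0.333, 0.249, 0.280), (0.5, 0.356, 0.370),
            (2, 0.741, 0.719), (3, 0.825, 0.790)],
    "15D": [(0.333, 0.246, 0.268), (0.5, 0.343, 0.359),
            (2, 0.712, 0.719), (3, 0.800, 0.809)],
}


def relative_errors(rows):
    """Signed relative error in percent: (experiment - CFD) / experiment.

    This sign and normalization convention is an assumption.
    """
    return [100.0 * (exp - cfd) / exp for _, cfd, exp in rows]


def error_extremes(table):
    """For each distance, return the errors of smallest and largest magnitude."""
    result = {}
    for distance, rows in table.items():
        errors = relative_errors(rows)
        result[distance] = (min(errors, key=abs), max(errors, key=abs))
    return result


if __name__ == "__main__":
    for distance, (smallest, largest) in error_extremes(TABLE_6).items():
        print(f"{distance}: min {smallest:+.2f}%, max {largest:+.2f}%")
```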

Table 8. Computational run time for each distance on a fine mesh.

Distance	Cell Number	Time
$5.6 D$	1.86 million	56 min
$10 D$	2.95 million	82 min
$15 D$	3.00 million	83 min

Table 9. Accuracy metrics of ANN and SVR models.

Model	RMSE	Max Error
ANN	0.0172	0.0427
SVR	0.0361	0.119

Table 10. Comparison of the testing data (30% of the whole generated data) predictions for SVR and ANN with bold values representing a better prediction.

$R_{q_{in}}$	$R_{q_{out}}$	Distance	$R_{c_{CFD}}$	$R_{c_{SVR}}$	$R_{c_{ANN}}$
1	0.333	$15 D$	0.519	0.516	0.520
0.5	0.333	$15 D$	0.342	0.323	0.329
3	0.333	$5.6 D$	0.812	0.810	0.833
3	3	$15 D$	0.823	0.775	0.850
1	2	$10 D$	0.565	0.577	0.578
1	0.5	$10 D$	0.535	0.546	0.555
2	0.5	$15 D$	0.702	0.699	0.710
1	3	$15 D$	0.540	0.472	0.527
0.5	3	$10 D$	0.360	0.327	0.333
0.5	0.333	$5.6 D$	0.354	0.335	0.362
2	2	$10 D$	0.769	0.807	0.756
0.5	3	$15 D$	0.341	0.290	0.336
2	0.5	$10 D$	0.718	0.714	0.707
2	1	$10 D$	0.741	0.736	0.718
0.5	2	$10 D$	0.360	0.334	0.357
3	1	$15 D$	0.800	0.800	0.807
1	0.333	$10 D$	0.529	0.545	0.546
3	3	$10 D$	0.868	0.749	0.911
0.333	1	$15 D$	0.246	0.267	0.257
0	0.5	$15 D$	0.344	0.326	0.333
0	0.5	$5.6 D$	0.360	0.343	0.361
1	1	$15 D$	0.530	0.545	0.556
2	2	$15 D$	0.730	0.751	0.729
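
The accuracy metrics in Table 9 can be cross-checked against the rows of Table 10. The sketch below copies the three $R_{c}$ columns from Table 10 and computes the RMSE and the maximum absolute error of each metamodel against the CFD values. It assumes that Table 9 was computed on exactly these testing rows, and because the tabulated values are rounded to three decimals, the recomputed RMSE only matches to about three significant digits. The helper names are hypothetical.

```python
import math

# (R_c from CFD, R_c from SVR, R_c from ANN), copied row by row from Table 10.
TABLE_10 = [
    (0.519, 0.516, 0.520), (0.342, 0.323, 0.329), (0.812, 0.810, 0.833),
    (0.823, 0.775, 0.850), (0.565, 0.577, 0.578), (0.535, 0.546, 0.555),
    (0.702, 0.699, 0.710), (0.540, 0.472, 0.527), (0.360, 0.327, 0.333),
    (0.354, 0.335, 0.362), (0.769, 0.807, 0.756), (0.341, 0.290, 0.336),
    (0.718, 0.714, 0.707), (0.741, 0.736, 0.718), (0.360, 0.334, 0.357),
    (0.800, 0.800, 0.807), (0.529, 0.545, 0.546), (0.868, 0.749, 0.911),
    (0.246, 0.267, 0.257), (0.344, 0.326, 0.333), (0.360, 0.343, 0.361),
    (0.530, 0.545, 0.556), (0.730, 0.751, 0.729),
]


def rmse(reference, predicted):
    """Root mean square error between two equally long sequences."""
    squared = [(p - r) ** 2 for r, p in zip(reference, predicted)]
    return math.sqrt(sum(squared) / len(squared))


def max_abs_error(reference, predicted):
    """Largest absolute deviation between prediction and reference."""
    return max(abs(p - r) for r, p in zip(reference, predicted))


def count_better(rows):
    """Count rows where SVR or ANN is closer to the CFD value."""
    svr_wins = sum(1 for cfd, svr, ann in rows if abs(svr - cfd) < abs(ann - cfd))
    ann_wins = sum(1 for cfd, svr, ann in rows if abs(ann - cfd) < abs(svr - cfd))
    return svr_wins, ann_wins


cfd = [row[0] for row in TABLE_10]
svr = [row[1] for row in TABLE_10]
ann = [row[2] for row in TABLE_10]

if __name__ == "__main__":
    print(f"SVR: RMSE {rmse(cfd, svr):.4f}, max error {max_abs_error(cfd, svr):.3f}")
    print(f"ANN: RMSE {rmse(cfd, ann):.4f}, max error {max_abs_error(cfd, ann):.3f}")
    print("rows won (SVR, ANN):", count_better(TABLE_10))
```

The per-row comparison in `count_better` corresponds to the better-prediction marking described in the caption of Table 10.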

Table 11. SVR and ANN model comparison for 7.5D and 12.5D (better prediction values are made bold).

$R_{q_{in}}$	$R_{q_{out}}$	Distance	$R_{c_{CFD}}$	$R_{c_{SVR}}$	$R_{c_{ANN}}$
2	1	$7.5 D$	0.770	0.624	0.742
0.5	3	$7.5 D$	0.362	0.456	0.371
2	1	$12.5 D$	0.728	0.562	0.712
0.333	1	$12.5 D$	0.245	0.447	0.245

© 2020 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).

Share and Cite

MDPI and ACS Style

Grbčić, L.; Kranjčević, L.; Družeta, S.; Lučin, I. Efficient Double-Tee Junction Mixing Assessment by Machine Learning. Water 2020, 12, 238. https://doi.org/10.3390/w12010238

