Viscosity Deviation Modeling for Binary and Ternary Mixtures of Benzyl Alcohol-N-Hexanol-Water

Knowing the thermodynamic and transport properties of liquid systems is very important in engineering for the development of theoretical models and for the design of new technologies. Models that allow accurate predictions of thermodynamic and transport properties are needed in chemical engineering calculations involving fluid, heat, and mass transfer. In this study, the viscosity deviation of binary and ternary systems containing benzyl alcohol, n-hexanol, and water, which are less studied in the literature, was modeled using Redlich and Kister (R-K) models, multiple linear regression (MLR) models, and artificial neural networks (ANN). The viscosity of the binary and ternary systems was experimentally determined at the following temperatures: 293.15, 303.15, 313.15, and 323.15 K. Viscosity deviation was calculated and then correlated with mole fractions, normalized temperature, and refractive index. The neural model that led to the best performance in the testing and validation stages contains 4 neurons in the input layer, 12 neurons in the hidden layer, and one neuron in the output layer. In the testing stage for this model, the standard deviation is 0.0067, and the correlation coefficient is 0.999. In the validation stage, a standard deviation of 0.0226 and a correlation coefficient of 0.996 were obtained. The MLR model led to worse results than those obtained with the neural model and with the R-K models. The standard deviation for this model is 0.099, and the correlation coefficient is 0.898. Its advantage over the R-K type models is that the influence of both composition and temperature is included in a single equation.


Introduction
Many products from the pharmaceutical or cosmetic industries consist of liquid systems in the form of solutions that contain more than one solute. After they are used, they reach the environment and can negatively influence it. Environmental pollution with pharmaceuticals and cosmetics has represented a particularly important global problem since ancient times [1]. The presence of pharmaceutical and cosmetic products in surface waters negatively affects living organisms. Also, if they reach the human body, they can cause various ailments. In order to reduce these negative effects on the environment, numerous groups of researchers from around the world have focused their attention on the behavior of liquid mixtures [2-14]. The pollution caused by the products of the pharmaceutical and cosmetic industries represents an ever-increasing challenge for environmental protection because the use of medicines and personal care products by the human population is increasing systematically [2]. Emerging pollutants reach the environment and pollute wastewater through various pathways. Medicines and pharmaceutical skin products such as creams, ointments, and body lotions
The prediction of density, refractive index, boiling point, dielectric constant, and viscosity for a series of organic derivatives having the general structure X-CH2-CH2-Y was carried out by Cocchi et al. using MLR-type models. The performance of the obtained models was quantified by the standard deviation, and very good correlations were found between the property values calculated with the MLR models and those determined experimentally [16]. Zhao et al. predicted the viscosity of imidazolium-based ionic liquids using the multiple linear regression (MLR) algorithm and also an artificial intelligence tool, namely the support vector machine (SVM) algorithm. They used a consistent database. The average deviation over the entire data set was 24.2% for the MLR model and 3.95% for the SVM [17]. Statistical processing of experimental data with the multiple linear regression (MLR) method was used by our research group to estimate some excess thermodynamic properties (the dependent variable), using the composition of the mixtures, the temperature, and the refractive index as independent variables [18,19].
Artificial neural networks have been used by Mehlman et al. to make density, viscosity, and refractive index predictions for ternary and quaternary solvent mixtures using experimental determinations of the properties of binary systems [20]. Rocabruno-Valdés et al. predicted the density and dynamic viscosity of biodiesel from fatty acid methyl ester composition, number of carbon atoms, number of hydrogen atoms, and temperature, using single-layer neural networks with 1 to 6 hidden neurons. In the validation stage, the performance of the built models was very good, and when comparing the experimental values with the simulated ones, the correlation coefficients were higher than 0.92, and the mean squared error (MSE) was higher than 0.0018 [21]. A correlation coefficient of 0.98 and an average absolute relative deviation on the test set of 6.84% were recently obtained by Yu et al. with a multilayer perceptron neural model for predicting the viscosity of an emerging class of green solvents [22]. Other studies in the literature used neural networks to predict the viscosity of mixtures containing ionic liquids [23,24] or nanofluids [25]. The use of artificial neural networks for viscosity deviation modeling has been studied less in the literature.
The development of models that predict the thermodynamic and transport properties of multicomponent mixtures is extremely important for chemistry and chemical engineering. An important reason for these approaches is economic: they reduce research costs and the time spent on experimental determinations, which are quite laborious and require many repetitions to obtain accurate data.
This paper presents the experimental determination of the viscosity of binary and ternary mixtures of benzyl alcohol, n-hexanol, and water, which are less studied in the literature. Also, with the help of neural networks, models are built that correlate viscosity deviation with normalized temperature, composition, and refractive index. The obtained results are compared with those provided by Redlich and Kister type polynomial models, MLR models, and other regression algorithms.

Materials and Methods
The substances used for the experimental measurements were purchased from specialized companies, and according to the technical sheet, they have a purity level of at least 99%, namely benzyl alcohol (Purity 99.5%, Merck, Kenilworth, NJ, USA) and n-hexanol (Purity 99%, Sigma-Aldrich, St. Louis, MO, USA). Bidistilled water was used to obtain the binary and ternary systems.
The preparation of binary and ternary mixtures using the mentioned liquids was carried out by weighing on an analytical balance type XP105 from Mettler Toledo (Columbus, OH, USA), which ensures a measurement accuracy of ±0.01 mg. The estimation of mole fractions was thus achieved with a precision of ±0.0001, and this will depend on the compound used. The liquid mixtures were made in sealed vials to avoid preferential evaporation.
The viscosities of the pure components and binary and ternary mixtures were determined with a Physica MCR 501 modular rheometer equipped with a CC27 concentric cylinder measuring system. Based on these experimental determinations, the excess thermodynamic properties were evaluated; specifically, the viscosity deviation (∆η) was determined.
To model the viscosity deviation, a database was built containing experimental determinations of viscosity and refractive index for 27 binary and ternary mixtures of benzyl alcohol ($X_1$), n-hexanol ($X_2$), and water ($X_3$) at four temperatures (293.15, 303.15, 313.15, and 323.15 K). Thermodynamic properties were evaluated for 6 benzyl alcohol-water binary mixtures, 9 benzyl alcohol-n-hexanol binary mixtures, 6 n-hexanol-water binary mixtures, and 6 benzyl alcohol-n-hexanol-water ternary mixtures.
The Redlich and Kister model [26] is one of the most widely used models for representing excess thermodynamic properties:

$$\Delta Y = x_1 x_2 \sum_{i=0}^{n} A_i (x_1 - x_2)^i \quad (1)$$

where $x_1$ and $x_2$ are the mole fractions and $A_i$ are the coefficients in the polynomial equation.
The number of parameters needed to represent the experimental data depends on the distribution of the data and is directly related to the complexity of the process, the quality of the data, and the number of available experimental data. Thus, for binary mixtures, the equation has the following form:

$$\Delta Y_{12} = x_1 x_2 \left[ A_{12} + B_{12} (x_1 - x_2) + C_{12} (x_1 - x_2)^2 + \dots \right] \quad (2)$$

where $x_1$ and $x_2$ are the mole fractions and $A_{12}$, $B_{12}$, and $C_{12}$ are the coefficients in the polynomial equation.
Since the mole fractions of a binary mixture are complementary ($x_2 = 1 - x_1$), $x_1 - x_2 = 2x_1 - 1$. With $A_{12} = A_0$, $B_{12} = A_1$, $C_{12} = A_2$, etc., Equation (2) takes the following form:

$$\Delta Y_{12} = x_1 (1 - x_1) \sum_{i=0}^{n} A_i (2x_1 - 1)^i \quad (3)$$

in which the coefficients $A_i$ are obtained through a least squares regression procedure. The number of coefficients used in Equation (3) is chosen by the researcher, taking into account the number of experimental data, their distribution, the purpose for which they are obtained, etc.
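The binary Redlich and Kister form can be illustrated with a minimal Python sketch. It evaluates the polynomial both with the $(x_1 - x_2)$ variable and with the equivalent $(2x_1 - 1)$ variable. The function names and the coefficient values are hypothetical placeholders chosen for illustration; they are not the fitted coefficients of this study.

```python
def redlich_kister_binary(x1, coeffs):
    """Evaluate x1*(1 - x1) * sum_i A_i * (2*x1 - 1)**i for a binary mixture."""
    x2 = 1.0 - x1  # mole fractions of a binary mixture sum to one
    return x1 * x2 * sum(a * (2.0 * x1 - 1.0) ** i for i, a in enumerate(coeffs))


def redlich_kister_binary_x1_x2(x1, x2, coeffs):
    """Same polynomial written with the (x1 - x2) variable."""
    return x1 * x2 * sum(a * (x1 - x2) ** i for i, a in enumerate(coeffs))


# Hypothetical coefficients A0, A1, A2 (illustration only, not fitted values).
example_coeffs = [-1.2, 0.3, -0.1]

for x1 in (0.0, 0.25, 0.5, 0.75, 1.0):
    dy = redlich_kister_binary(x1, example_coeffs)
    print(f"x1 = {x1:.2f}  deviation = {dy:+.4f}")
```

The deviation vanishes for the pure components ($x_1 = 0$ or $x_1 = 1$), and at the equimolar composition only $A_0$ contributes, since $2x_1 - 1 = 0$ there.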
In the case of ternary systems, the Redlich and Kister model uses relation (3), which gives the deviations from ideal behavior for the binary mixtures, together with Equation (4) for the ternary mixtures [26], where $\Delta Y_{1,2}$, $\Delta Y_{1,3}$, and $\Delta Y_{2,3}$ are the deviations for binary mixtures, $x_1$, $x_2$, and $x_3$ are the mole fractions, and $A_{0,123}$, $A_{1,123}$, $A_{2,123}$, and $A_{3,123}$ are the coefficients in the polynomial equation for ternary mixtures.
The relation between viscosity deviation, composition, normalized temperature, and refractive index was evaluated using multiple linear regression (MLR) and artificial neural networks.
The viscosity deviation calculated from the experimental determinations of the viscosity of pure compounds and binary and ternary mixtures was modeled by statistical processing of the experimental data with the multiple linear regression (MLR) method. SigmaPlot 11 software (Systat Software Inc., San Jose, CA, USA) was used to establish the dependence between the viscosity deviation (dependent variable) and the composition ($X_1$ and $X_2$), the normalized temperature (T/273.15), and the refractive index (independent variables).
Modeling with neural networks was carried out with the NeuroSolutions commercial simulator produced by NeuroDimension (NeuroDimension Inc., Boston, MA, USA). This simulator has the advantage of providing users with a set of neuron models, data set interaction models, and training algorithm models. These components are easy to use and have a visual representation that allows the user to easily build a neural network structure. Through a graphic interface, the software provides elementary blocks that, when combined, can generate various neural architectures. In this study, multilayer perceptron neural models were built. The transfer function selected was TanhAxon, and the learning rule was momentum.
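To make the multilayer perceptron structure concrete, the following Python sketch shows a forward pass through a network with 4 inputs, 12 tanh hidden neurons, and 1 output, plus a generic momentum weight update. It is not the NeuroSolutions implementation: the weight initialization, the tanh output neuron, the learning rate, and the momentum value are all assumptions made for illustration.

```python
import math
import random


def init_mlp(n_in=4, n_hidden=12, seed=0):
    """Random small weights for a 4:12:1 perceptron (illustrative only)."""
    rng = random.Random(seed)
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
    b_hidden = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
    b_out = rng.uniform(-0.5, 0.5)
    return w_hidden, b_hidden, w_out, b_out


def forward(params, inputs):
    """Forward pass: tanh hidden layer, then a tanh output neuron (assumed)."""
    w_hidden, b_hidden, w_out, b_out = params
    hidden = [math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return math.tanh(sum(w * h for w, h in zip(w_out, hidden)) + b_out)


def momentum_step(weight, gradient, velocity, learning_rate=0.1, momentum=0.7):
    """One momentum update: the new step blends the previous step with the gradient."""
    velocity = momentum * velocity - learning_rate * gradient
    return weight + velocity, velocity


params = init_mlp()
n_params = 4 * 12 + 12 + 12 + 1  # weights and biases of a 4:12:1 network
# Hypothetical input vector: X1, X2, T/273.15, refractive index.
sample = [0.2, 0.3, 293.15 / 273.15, 1.45]
print(n_params, forward(params, sample))
```

A 4:12:1 network of this kind has 73 adjustable parameters, which is worth keeping in mind when the number of training series is close to one hundred.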
The methodology for modeling the viscosity deviation based on experimental determinations of the refractive index and viscosity at different temperatures and mole fractions of n-hexanol and benzyl alcohol involves following the steps shown in Scheme 1.

Scheme 1. Viscosity deviation modeling methodology.
The performance of the built models was evaluated by calculating the standard deviation with the following relationship:

$$\sigma = \sqrt{\frac{\sum_{i=1}^{n} \left( \Delta Y_{\mathrm{exp},i} - \Delta Y_{\mathrm{calc},i} \right)^2}{n - p}} \quad (5)$$

where $\Delta Y_{\mathrm{exp}}$ and $\Delta Y_{\mathrm{calc}}$ are the experimental and calculated values, n represents the number of experimental data, and p is the number of parameters.
The performance of the built neural models was evaluated based on the mean squared error (MSE) and the correlation coefficient ($r^2$):

$$\mathrm{MSE} = \frac{\sum_{j=1}^{P} \sum_{i=1}^{N} \left( D_{ij} - O_{ij} \right)^2}{N \cdot P} \quad (6)$$

where N is the number of data, P is the number of output quantities (in this case, P = 1), $O_{ij}$ is the network output for data element i at processing element j, and $D_{ij}$ is the desired output for data element i at processing element j. In the expression for $r^2$ (Equation (7)), O denotes the values of the output data, and the subscripts exp and net denote the experimental values and those obtained from the neural models, respectively.
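The two error measures described above can be sketched in Python as follows. This is a minimal illustration assuming the usual form of the standard deviation with n − p in the denominator and a single network output (P = 1) for the MSE; the function names and the numerical values are hypothetical.

```python
import math


def standard_deviation(y_exp, y_calc, n_params):
    """sqrt(sum of squared residuals / (n - p)) for n data points and p parameters."""
    n = len(y_exp)
    if n <= n_params:
        raise ValueError("need more data points than parameters")
    sse = sum((e - c) ** 2 for e, c in zip(y_exp, y_calc))
    return math.sqrt(sse / (n - n_params))


def mean_squared_error(desired, outputs):
    """MSE for a single output (P = 1): mean of the squared differences."""
    return sum((d - o) ** 2 for d, o in zip(desired, outputs)) / len(desired)


# Hypothetical experimental and calculated deviations (illustration only).
y_exp = [0.10, 0.20, 0.30, 0.40]
y_calc = [0.11, 0.19, 0.31, 0.39]
sigma = standard_deviation(y_exp, y_calc, n_params=2)
mse = mean_squared_error(y_exp, y_calc)
print(f"sigma = {sigma:.5f}, MSE = {mse:.6f}")
```

Because the standard deviation divides by n − p rather than n, a model with more adjustable parameters is penalized when the number of data points is small.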
Other regression algorithms were also used for comparison: nearest neighbor (NN), k-nearest neighbor (kNN), K*, support vector regression (SVR), and random forest. The kNN algorithm searches for the closest training examples in the feature space and uses their weighted average as the prediction; k = 3 or 10 neighbors and the inverse distance weighting function $w_i = 1/d_i$ were used, and in the NN algorithm, k = 1 was considered. For the K* algorithm, the global parameter gb = 10, 20, or 50 was considered. This parameter can be considered a sphere of influence that implicitly specifies how many neighbors are significant [27]. Support vector regression (SVR) is a method that seeks to minimize error by finding output values that lie within a given range. Three types of kernels were used: polynomial, radial basis function (RBF), and Pearson universal kernel (PUK), with cost parameters C = 100 or 10,000. This cost parameter controls the strictness of the objective function that is optimized by the algorithm. A larger value of C gives a smaller margin of error but lower generalizability [27]. The random forest algorithm involved 100 or 1000 decision trees. Each decision tree calculates an output value, and the forest ensemble calculates the average of these individual values [27].
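The kNN prediction with inverse distance weights can be sketched in a few lines of Python. This is an illustrative implementation assuming Euclidean distance in the feature space, not the software used in the study; the function name and the toy data are hypothetical.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Inverse-distance-weighted k-nearest-neighbor regression (w_i = 1/d_i)."""
    nearest = sorted(
        (math.dist(x, query), y) for x, y in zip(train_x, train_y)
    )[:k]
    # An exact match gives d = 0; return its target value directly.
    for d, y in nearest:
        if d == 0.0:
            return y
    weights = [1.0 / d for d, _ in nearest]
    return sum(w * y for w, (_, y) in zip(weights, nearest)) / sum(weights)


# Toy one-dimensional data (illustration only).
train_x = [(0.0,), (1.0,), (3.0,)]
train_y = [0.0, 1.0, 3.0]
print(knn_predict(train_x, train_y, (0.25,), k=1))
print(knn_predict(train_x, train_y, (0.25,), k=2))
```

With k = 1 the function reduces to the NN algorithm, since the single nearest neighbor receives all the weight.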


Experimental Determination of Thermodynamic Properties for Binary and Ternary Systems
The ternary graph presented in Figure 1 indicates that experimental determinations were made over the entire range of mole fractions for the benzyl alcohol (1)-n-hexanol (2) binary system, and in the region of dilute solutions for the benzyl alcohol (1)-water (3) and n-hexanol (2)-water (3) binary systems and for the benzyl alcohol (1)-n-hexanol (2)-water (3) ternary system. The viscosity values experimentally measured at different temperatures for pure compounds are presented in Table 1, and those for binary and ternary mixtures are presented in Table 2. Table 1 compares the experimental data from this study with data from the specialized literature [4,9-12,14,28-33]. For the viscosities of the pure compounds, the values are close to those reported by other researchers at the same temperature.
The viscosity of the binary and ternary mixtures, according to the results presented in Table 2, decreases as the temperature increases from 293.15 to 323.15 K. In the case of the benzyl alcohol (1)-water (3) binary system, the viscosity increases slightly with the mole fraction of benzyl alcohol at the same temperature. For the benzyl alcohol (1)-n-hexanol (2) binary system, the viscosity decreases slightly up to a mole fraction $X_1 = 0.3$, and then, due to the intensification of the interactions between the components, increases with the mole fraction of benzyl alcohol at the same temperature. In this binary system, the predominant interactions between the molecules of benzyl alcohol and n-hexanol are of the hydrogen-bond type (O...H-O), but there can also be weaker dipole-dipole interactions [34]. For the n-hexanol-water binary system and the ternary system, the viscosity varies less with composition at the same temperature.
The deviation from ideal behavior is highlighted by the viscosity deviation (∆η) (Figure 2), defined by relation (8):

$$\Delta\eta = \eta - \sum_{i=1}^{n} x_i \eta_i \quad (8)$$

where n is the number of components (2 for binary and 3 for ternary systems), η is the viscosity of the binary or ternary system, $\eta_i$ is the viscosity of pure compound i, and $x_i$ is its mole fraction. According to the viscosity deviation values represented in Figure 2, the binary and ternary systems containing water generally give positive deviations, and the benzyl alcohol (1)-n-hexanol (2) binary system gives negative deviations from ideal behavior. In systems containing water, positive viscosity deviations can be attributed to specific interactions between different molecules in the mixture [35]. In the benzyl alcohol (1)-n-hexanol (2) binary system, negative values appear due to the disruption or breaking of the associative bonds of the molecules [9]. For this binary system (Figure 3), lower and lower values of the viscosity deviation are found as the temperature increases, and a minimum is observed near the equimolar composition.
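The definition of the viscosity deviation translates directly into code. The Python sketch below assumes the mole-fraction-weighted sum of pure-component viscosities as the ideal reference; the function name and the numerical values are hypothetical and only illustrate the arithmetic, in any consistent viscosity unit.

```python
def viscosity_deviation(eta_mix, mole_fractions, eta_pure):
    """Delta eta = eta_mix - sum_i x_i * eta_i for a binary or ternary mixture."""
    if len(mole_fractions) != len(eta_pure):
        raise ValueError("one pure-component viscosity is needed per mole fraction")
    if abs(sum(mole_fractions) - 1.0) > 1e-6:
        raise ValueError("mole fractions must sum to 1")
    ideal = sum(x * eta for x, eta in zip(mole_fractions, eta_pure))
    return eta_mix - ideal


# Hypothetical values for an equimolar binary mixture (illustration only).
delta = viscosity_deviation(eta_mix=4.8, mole_fractions=[0.5, 0.5], eta_pure=[6.0, 5.0])
print(round(delta, 3))  # -0.7
```

A negative result, as in this example, means that the mixture is less viscous than the mole-fraction average of its pure components.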
The refractive index is fairly easy to determine experimentally, with good measurement precision and a small consumption of materials. According to the specialized literature, models can be built that correlate this parameter with other properties that are much more difficult to measure, such as density, surface tension, or viscosity [18,19,36]. For the binary and ternary systems analyzed in this study [36], the refractive index (Figure 4) varies with composition and temperature between 1.3 and 1.6.

Statistical Analysis of Experimental Data and Construction of Training and Validation Sets
The statistical processing of the experimental data was carried out with the specialized program SigmaPlot 11.00. The program allowed the determination of mean values, standard deviation, and standard error of the mean. Also, the confidence interval of the mean, the range, the minimum and maximum values, the median, and the 25% and 75% percentiles of the data were established. The normal distribution of the data was evaluated with the skewness and kurtosis tests, the Kolmogorov-Smirnov test, and the Shapiro-Wilk test, and by calculating the sum of the data and the sum of squares [27]. According to the data presented in Table 3, there are no missing data. The analysis of the distribution of the experimental data (skewness) reveals a positive asymmetry for the mole fractions, the normalized temperature, and the refractive index and a negative one for the viscosity deviation. The Kolmogorov-Smirnov normality test was also applied, which quantifies the degree of overlap between the cumulative distribution of the analyzed variables and the cumulative distribution of a variable that follows the Gaussian curve. The obtained values indicate a normal distribution of the experimental data and the fact that they are statistically significant (p < 0.001), an aspect confirmed by the results obtained with the Shapiro-Wilk normality test.

Modeling the Thermodynamic Properties of Excess with the Redlich and Kister Model
To model the viscosity deviation, the applicability of the Redlich and Kister (R-K) polynomial model, which correlates this thermodynamic quantity with the composition (relations (3) and (4)), was verified. The coefficients $A_0$ ... $A_3$ and the standard deviation for these models are presented in Table 4. Figure 3 compares the values calculated from the experimental determinations for the viscosity deviation with those obtained from the Redlich and Kister (R-K) model for the benzyl alcohol (1)-n-hexanol (2) binary system. The experimental values are very close to those calculated with the R-K model, and the standard deviations are between 0.0069 and 0.0161.
The standard deviations shown in Table 4 indicate very good performance for this type of model for all the evaluated systems. Its disadvantage is that it requires one equation for each temperature, which means many calculations and a longer experimental data processing time. In Table 4, 1 denotes benzyl alcohol, 2 n-hexanol, and 3 water; for the ternary system, the coefficients are $A_{0,123}$ ... $A_{3,123}$ (Equation (4)).

Multiple Linear Regression (MLR) Method
In previous studies [18,19,36], it was demonstrated that good results were obtained for the correlation of excess thermodynamic properties with normalized temperature, refractive index, and mole fractions, with the help of models obtained by the multiple linear regression method.
In this study, an MLR model was built for modeling the viscosity deviation. Its standard deviation is 0.099, its correlation coefficient is 0.898, and most of the values obtained are located within a confidence interval of ±17% (Figure 5). The proposed MLR model is a single equation that expresses the viscosity deviation as a linear function of the mole fractions $X_1$ and $X_2$, the normalized temperature (T/273.15), and the refractive index.
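How such a multiple linear regression is fitted can be sketched in Python with an ordinary least squares solution of the normal equations. The study itself used SigmaPlot; the code below is only an illustration on synthetic data, and the coefficient values, the function names, and the data ranges are assumptions, not results of this work.

```python
import random


def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_mlr(features, targets):
    """Least squares fit of y = b0 + b1*f1 + ... via the normal equations."""
    rows = [[1.0] + list(f) for f in features]  # prepend the intercept column
    p = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    return solve_linear_system(xtx, xty)


# Synthetic data generated from hypothetical coefficients (illustration only).
rng = random.Random(1)
true_coeffs = [0.5, -1.0, 0.8, -0.3, 0.2]  # intercept, X1, X2, T/273.15, n_D
features = [
    (
        rng.uniform(0.0, 0.5),                                   # X1
        rng.uniform(0.0, 0.5),                                   # X2
        rng.choice([293.15, 303.15, 313.15, 323.15]) / 273.15,   # T/273.15
        rng.uniform(1.33, 1.54),                                 # refractive index
    )
    for _ in range(40)
]
targets = [true_coeffs[0] + sum(b * f for b, f in zip(true_coeffs[1:], row))
           for row in features]
fitted = fit_mlr(features, targets)
print([round(c, 4) for c in fitted])
```

Because the synthetic targets are exactly linear in the features, the fit recovers the generating coefficients; with real data, the residuals would feed the standard deviation reported for the MLR model.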


Modeling with Neural Networks
To build the training and validation data sets, the 108 available data series were shuffled. For this purpose, the data were processed in Excel: a random number was generated for each experimental data series, and the series were then sorted in ascending or descending order of these numbers, which mixed them well. The random numbers were generated with the expression "=INT(n*RAND()+1)", where n is a number equal to or greater than the number of experimental data. About 86% of the available data were used for training and about 14% were used in the validation stage.
After shuffling the 108 available experimental data series according to this procedure, 93 were used in the training stage. Neural models were built with four neurons in the input layer, one or two layers of hidden neurons, and one output. To avoid overtraining the neural models, the variation of the mean squared error (MSE) with the number of epochs was followed, and the optimal number of training epochs was found to be 60,000. This result was used for all the neural models, and their performances are presented comparatively in Table 5.
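The shuffling and splitting step can be reproduced outside Excel. The Python sketch below is an equivalent in spirit rather than the authors' spreadsheet procedure: it shuffles 108 stand-in records with a seeded generator and keeps the first 93 for training. The function name and the seed are hypothetical.

```python
import random


def shuffle_split(records, n_train, seed=42):
    """Shuffle a copy of the records and split it into training and validation sets."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


records = list(range(1, 109))  # stand-ins for the 108 data series
train, validation = shuffle_split(records, n_train=93)
print(len(train), len(validation))  # 93 15
```

Fixing the seed makes the split reproducible, which the spreadsheet RAND() function does not guarantee, since it is recalculated whenever the sheet changes.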
The neural model containing 4 neurons in the input layer, 12 neurons in the hidden layer, and one neuron in the output layer led to the best performance in the training stage. Figure 6 compares the experimental data with those provided by the ANN(4:12:1) model in the testing stage. The standard deviation calculated with relation (5) is 0.0067, and the correlation coefficient is 0.999. For the neural model validation stage, 15 series of experimental data were used. According to the representation in Figure 7, very good results are obtained, and the values calculated with the ANN(4:12:1) model for the viscosity deviation are very close to the experimental values. The standard deviation in this validation step is 0.0226, and the correlation coefficient is 0.996. Comparing the results obtained with the ANN(4:12:1) model with those given by the previously presented MLR model shows the clearly superior performance of the neural model. The standard deviation for the neural model is several times smaller than for the MLR model, and the correlation coefficient is higher than 0.99 for the ANN compared to 0.898 for MLR.


Modeling with Regression Methods
On the same database, modeling was also carried out with different regression methods: nearest neighbor (NN), k-nearest neighbor (kNN), K*, support vector regression (SVR), and random forest. The results of the regression with different algorithms and combinations of parameter values are presented in Table 6, which includes both the results obtained in the training stage and those obtained in cross-validation. Results similar to those of the neural models were obtained in the cross-validation stage with two of the SVR models, namely the radial basis function (RBF) kernel with cost parameter C = 100 and the Pearson universal kernel (PUK) with C = 10,000.

Conclusions
The viscosities of the pure compounds and of binary and ternary mixtures of benzyl alcohol, n-hexanol, and water were determined with high-performance equipment, i.e., the Physica MCR 501 modular rheometer equipped with a CC27 concentric cylinder measuring system, at a shear rate of 10 s⁻¹ and at temperatures of 293.15, 303.15, 313.15, and 323.15 K.
The viscosity deviation was calculated; it has negative values for the benzyl alcohol (1)-n-hexanol (2) binary system and generally positive values for the binary and ternary systems containing water. A database was created and statistically processed, and it was used to build models that correlate viscosity deviation with temperature, composition, and refractive index.
Very good results were obtained with the ANN(4:12:1) model. The viscosity deviation values are very close to the experimental ones; the standard deviation in the test stage is 0.0067, and in the validation stage, it is 0.0226. The correlation coefficients are 0.999 in the testing stage and 0.996 in the validation stage. In the cross-validation stage, close results were also obtained with two regression models, namely SVR (C = 100, RBF γ = 1) and SVR (C = 10,000, PUK).
The results obtained with the MLR model are worse than those obtained with the neural model and the Redlich and Kister type model. The standard deviation is 0.099, and the correlation coefficient is 0.898 for the MLR model. Its advantage over the Redlich and Kister type polynomials is that a single equation correlates the viscosity deviation with temperature, composition, and refractive index.