Calculation of the Surface Tension of Ordinary Organic and Ionic Liquids by Means of a Generally Applicable Computer Algorithm Based on the Group-Additivity Method

The calculation of the surface tension of ordinary organic and ionic liquids, based on a computer algorithm applying a refined group-additivity method, is presented. The refinement consists of the complete breakdown of the molecules into their constituting atoms, further distinguishing them by their immediate neighbour atoms and bond constitution. The evaluation of the atom-groups’ contributions was carried out by means of a fast Gauss-Seidel fitting method, founded upon the experimental data of 1893 compounds from literature. The result has been tested for plausibility using a 10-fold cross-validation (cv) procedure. The direct calculation and the cv test proved the applicability of the present method by the close similarity and excellent goodness of fit R² and Q² of 0.9039 and 0.8823, respectively. The respective standard deviations are ±1.99 and ±2.16 dyn/cm. Some correlation peculiarities have been observed in a series of ordinary and ionic liquids with homologous alkyl chains, as well as with di- and trihydroxy-groups-containing liquids, which have been discussed in detail, exhibiting the limit of the present method.


Introduction
Surface tension has received increasing interest in recent years due to its significance in material and environmental science, as well as in chemical separation processes, where it plays a key role in the dispersion and emulsion of immiscible solvent compositions, and adsorption at solid surfaces. A detailed discussion of the forces acting on the "interfacial region" (as it was named in order to include contributions of the second or third layer below the actual surface layer) was given by Fowkes [1]. He explained the surface tension as a result of the net attractive forces from the underlying molecules perpendicular to the surface, which cause a reduction in the number of molecules on the surface, which in turn increases their intermolecular distance. This increase requires work, which is the intrinsic reason for the tension, and is expressed as surface free energy upon its relaxation. The intermolecular attractive forces are separable into the London dispersive, polar, and Lewis acid-base forces, of which the former two are additive and the latter is non-additive, as has been outlined by van Oss et al. [2]. Based on these findings, Freitas et al. [3] developed a linear free energy relationship (LSER) equation, which, besides a constant, encompasses parameters representing the dipolarity, the excess molar refraction, the hydrogen bond acidity and basicity, and the molar volume of an organic liquid. They concluded from the weights of these parameters in the LSER equation that the dominant factors contributing to the surface tension are the dipolarity, the excess molar refraction, and the constant. The latter was interpreted as representing the loss of the dispersive interaction which

General Procedure
The experimental surface-tension data are stored, together with the molecules in their 3D-geometry-optimized structure and further experimental and calculated descriptors, in an object-oriented knowledge database, at present encompassing more than 31,000 records of pharmaceuticals, plant protection products, dyes, ionic liquids, liquid crystals, metal-organics, lab intermediates, and more.
The details of the present atom-groups additivity method have been outlined in [13]. While the definitions and meanings of the atom groups in the following group-parameters table (Table 2) are to be interpreted in the same way as exemplified in Table 1 of [13], the inclusion of the ionic liquids required the addition of a number of further atom groups in order to represent their charged moieties, analogous to those given for the calculation of their viscosity in [15]. An exemplary list of these additional atom groups is collected in Table 1. These groups are treated just like the remaining ones by the computer algorithm.
For practical reasons, and following chemical conventions, the ion charges of the ionic liquids are centred on the atom types of the atom groups in Tables 1 and 2. A certain deviation from this convention has been made for the imidazolium cations, where the conventional notation would imply an asymmetrical charge distribution which, as, e.g., the EHMO calculations indicate (visualized in Figure 1 in [15]), is not the case. Therefore, in this case, the positive charge has been positioned onto the carbon atom at position 2 between the two nitrogen atoms, which are bound to this central carbon atom by aromatic bonds. Accordingly, the carbon atom at position 2 and the nitrogen atoms in the imidazolium ions are represented in Table 1 by the atom groups 7 and 14, respectively. Atom types representing atoms that are immediate neighbours of charged atoms are distinguishable from those without charged neighbours by the added sign (in brackets) in their associated "Neighbours" definition (see examples 4, 6, 8-11, 14, 18, 19, 23, and 24 in Table 1).
[Table 1. Additional atom groups for ionic liquids; recoverable entries: "C aromatic, H:C:N(+), C:CH:N+, C2 in pyridinium" and "7, C(+) aromatic, C:N2, N:".]

The computer algorithm evaluating the atom-group parameters first collects from the database those molecules which fulfil the conditions for their inclusion in the parameters calculation, i.e., it checks the availability of an experimental surface-tension value and ensures that all atom groups in the molecule are present in the group-parameters table, and then carries out the parameters calculation using a fast Gauss-Seidel matrix-diagonalization procedure. Details of this entire algorithm have been outlined in [13]. Once the group parameters have been generated and stored in the parameters table, an immediate test of its predictive quality is carried out, first including all the compounds in the parameters evaluation, followed by a 10-fold cross-validation plausibility test, ensuring that each of the compounds has been introduced alternately as either a test or a training sample, as has been described in detail in Section 2.4 of [13]. These cross-validation calculations, and all the subsequent predictive descriptor calculations, are carried out using Equation (1), where ST is the surface-tension value, $a_i$ and $b_j$ are the contributions, $A_i$ is the number of occurrences of the ith atom group, $B_j$ is the number of occurrences of the jth special group, and C is a constant.

$$ST = \sum_i a_i A_i + \sum_j b_j B_j + C \qquad (1)$$

Yet, there is one further restriction beyond the ones mentioned above, in that for the predictive calculations of the surface tension of the training and test compounds, only those atom groups in the parameters table are considered valid which have been represented in the preceding parametrization process by at least three independent compounds with a known experimental surface-tension value.
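The parametrization and prediction steps described above can be illustrated with a minimal sketch. It is not the authors' implementation from [13]: it assumes that the Gauss-Seidel procedure is applied to the least-squares normal equations, and the function names (`gauss_seidel_fit`, `valid_groups`, `predict_surface_tension`), the toy group-count matrix, and the contribution values are all hypothetical.

```python
import numpy as np

def gauss_seidel_fit(X, y, max_sweeps=5000, tol=1e-10):
    """Least-squares fit of group contributions via Gauss-Seidel sweeps.

    Assumption: the sweeps solve the normal equations (X^T X) p = X^T y,
    which are symmetric positive definite when X has full column rank,
    so the iteration converges.
    """
    A = X.T @ X
    b = X.T @ y
    p = np.zeros(A.shape[0])
    for _ in range(max_sweeps):
        p_old = p.copy()
        for k in range(p.size):
            # Update parameter k using the newest values of all others.
            others = A[k] @ p - A[k, k] * p[k]
            p[k] = (b[k] - others) / A[k, k]
        if np.max(np.abs(p - p_old)) < tol:
            break
    return p

def valid_groups(X, min_compounds=3):
    """Mask of groups (columns) occurring in at least `min_compounds` compounds."""
    return np.count_nonzero(X, axis=0) >= min_compounds

def predict_surface_tension(counts, params):
    """Equation (1): sum of count times contribution; C enters with a count of 1."""
    return float(np.dot(counts, params))

# Hypothetical toy data: columns = [group 1 count, group 2 count, constant].
X = np.array([[2, 0, 1], [1, 1, 1], [0, 3, 1], [3, 1, 1], [1, 2, 1]], dtype=float)
true_params = np.array([3.0, 1.5, 20.0])  # invented contributions and constant
y = X @ true_params                       # noise-free stand-in for experimental values

params = gauss_seidel_fit(X, y)
usable = valid_groups(X)
st_new = predict_surface_tension([2, 2, 1], params)
```

In this sketch the constant C is carried as an extra column of ones, so that the atom groups, the special groups, and the constant are all fitted in the same sweep.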

Results
Since the value of the surface tension is highly sensitive to the experimental temperature conditions, and since several authors applied different temperatures as their own standard, an overall temperature standard was required in order to ensure comparability. The decision to choose 293.15 K as standard resulted from the observation that the majority of the authors referred to this temperature, and that measurements of another molecular physical property, the liquid viscosity (see [15]), also rested upon this standard. Where possible, e.g., if experiments at a series of temperatures have been published, the experimental surface-tension value was linearly inter- or extrapolated to this standard if necessary, provided that the experimental temperature conditions did not deviate too much from it. The most productive source for experimental surface-tension data for ordinary liquid compounds, Jasper's comprehensive paper [16], collecting some 2200 data from the year 1874 until 1969, stated that besides the temperature, other aspects, such as the method of measurement, the purity of the compounds, and even the experience of the investigator, had a major influence on the accuracy of the data. Unfortunately, he did not elaborate on the extent of the data uncertainty resulting from these aspects. This collection has been complemented, and its data compared, by the more recent collective papers [11,12,17,18]. Additionally, surface-tension data have been provided for various alkanes [19-24], alkylbenzenes [25], haloalkanes [26,27], halogenated esters and ethers [28], sulfoxides [29], and siloxanes [30,31]. Of particular interest are surface-tension data for ionic liquids.
A recent comprehensive collection of publications in the supplement of [32], accumulated for the development of a further method for the prediction of the surface tension, based on the density, molar mass, and anion type, provided the source of data for 222 ionic liquids which have been included in the present studies.
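The linear inter- or extrapolation to the 293.15 K standard mentioned above can be sketched as follows. This is only an illustration: the function name, the `max_gap_K` threshold standing in for "did not deviate too much", and the sample measurements are assumptions and are not taken from the paper.

```python
import numpy as np

T_STANDARD = 293.15  # K, the temperature standard chosen in the text

def surface_tension_at_standard(temps_K, st_values, max_gap_K=10.0):
    """Linearly inter- or extrapolate a measured series to T_STANDARD.

    max_gap_K is a hypothetical cut-off: the series is rejected if its
    closest measurement is further than this from the standard temperature.
    """
    temps = np.asarray(temps_K, dtype=float)
    st = np.asarray(st_values, dtype=float)
    if temps.size < 2:
        raise ValueError("need at least two measurements")
    if np.min(np.abs(temps - T_STANDARD)) > max_gap_K:
        raise ValueError("measurements too far from the standard temperature")
    # Straight-line fit ST = slope * T + intercept through all points.
    slope, intercept = np.polyfit(temps, st, 1)
    return slope * T_STANDARD + intercept

# Invented measurement series (K, dyn/cm), decreasing linearly with temperature.
st_293 = surface_tension_at_standard([298.15, 308.15, 318.15], [27.0, 26.0, 25.0])
```

A least-squares line through all available points is used here; with exactly two measurements it reduces to ordinary two-point inter- or extrapolation.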
Table 2 collects the result of the atom-group parameters calculations, based on 1893 molecules, together with a summary of the statistics data at the bottom (rows A to H). Attempts to further improve the result, e.g., by the exclusion of one or both of the special groups "Alkane" and "Unsaturated HC" (olefins and aromatics), yielded slightly lower correlation coefficients and higher standard deviations.
According to the entries A to H in Table 2, 165 (of 221) atom and special groups are valid for predictive calculations, as they are based on at least three independent training molecules. Therefore, the goodness of fit R² of 0.9039 was based on 1833 of the 1893 training compounds, with a standard deviation σ of 1.99 dyn/cm. The average statistics data of the ten 10-fold cross-validation calculations (entries F-H) rested on a total of 1769 compounds, resulting in a cross-validated goodness of fit Q² of 0.8823 and a standard deviation S of 2.16 dyn/cm. The standard deviations σ and S (entries D and H) have been calculated from the training set and the combined test sets of the cross-validation calculations, respectively, using the well-known Equation (2), where SD is the respective standard deviation, x_exp the experimental and x_calc the calculated surface tension of each molecule, and N the number of molecules. (The corresponding average deviations, entries C and G, are the sum of the absolute differences between the experimental and calculated surface tension of all involved compounds, divided by the number of these compounds. Since the standard deviation is more widely used in the examination of the reliability of predictive calculations, corresponding discussions in this paper refer to this value.) The excellent compliance, in most cases, between the black crosses of the training set and the affiliated red circles of the cross-validated set in Figure 1, as well as the close similarity of the goodness-of-fit values R² and Q², confirm the applicability of the present surface-tension prediction method. The corresponding histogram in Figure 2 exhibits a fairly even Gaussian distribution for both the direct and the cross-validated deviations. A list of all the compounds used in this study, their experimental and calculated data and their 3D structures is available online in the supplementary material.
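The statistics discussed above (goodness of fit, standard deviation, average deviation, and their cross-validated counterparts) can be reproduced in principle with the following sketch. Several points are assumptions rather than statements from the paper: R² and Q² are taken as 1 minus the ratio of residual to total sum of squares, the standard deviation divides by N (adjustable through `ddof`), an ordinary least-squares solver replaces the Gauss-Seidel procedure for brevity, and the data are synthetic.

```python
import numpy as np

def fit_statistics(y_exp, y_calc, ddof=0):
    """Return (goodness of fit, standard deviation, average absolute deviation)."""
    y_exp = np.asarray(y_exp, dtype=float)
    resid = y_exp - np.asarray(y_calc, dtype=float)
    ss_res = np.sum(resid ** 2)
    ss_tot = np.sum((y_exp - y_exp.mean()) ** 2)
    goodness = 1.0 - ss_res / ss_tot          # assumed definition of R2 / Q2
    sd = np.sqrt(ss_res / (resid.size - ddof))  # ddof=0 divides by N (assumption)
    mean_abs_dev = np.mean(np.abs(resid))
    return goodness, sd, mean_abs_dev

def cross_validated_predictions(X, y, k=10, seed=0):
    """Predict every compound once from a model trained without its fold."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    y_cv = np.empty(len(y))
    for test_idx in np.array_split(order, k):
        train_idx = np.setdiff1d(order, test_idx)
        # Plain least squares stands in for the Gauss-Seidel fit here.
        params, *_ = np.linalg.lstsq(X[train_idx], y[train_idx], rcond=None)
        y_cv[test_idx] = X[test_idx] @ params
    return y_cv

# Synthetic data: 60 hypothetical compounds, 4 group counts plus a constant column.
rng = np.random.default_rng(1)
counts = rng.integers(0, 5, size=(60, 4))
X = np.column_stack([counts, np.ones(60)])
y = X @ np.array([2.0, -1.0, 0.5, 3.0, 20.0]) + rng.normal(0.0, 1.0, 60)

params_all, *_ = np.linalg.lstsq(X, y, rcond=None)
r2, sd, avg_dev = fit_statistics(y, X @ params_all)
q2, sd_cv, avg_dev_cv = fit_statistics(y, cross_validated_predictions(X, y))
```

With a least-squares fit, the cross-validated residuals cannot be smaller than the training residuals, so Q² stays at or below R² and the cross-validated standard deviation at or above the training one, which matches the ordering of the values reported in Table 2.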
The relatively large standard deviation in relation to the overall data range, however, obscures the otherwise bright picture of the good correlation between the experimental and predicted surface-tension values, in that it hides three important observations. The first observation concerns the reliability of the experimental data for the ionic liquids, a point that has already been referred to in [32]. A typical example is 1,3-dimethylimidazolium bis(trifluoromethylsulfonyl)amide, for which in [5] a surface tension of 39 dyn/cm was given at 298.3 K, whereas in [33] a value of 36.3 dyn/cm at the same temperature was published. In a further example, the surface tensions of each of the complete series of 1-alkyl-3-methylimidazolium bis(trifluoromethylsulfonyl)amide (with alkyl being ethyl to decyl) varied by ca. 1.5 to 2.1 dyn/cm at 298.15 K between the two publications [33] and [34]. Due to their hygroscopicity and high viscosity, a higher uncertainty, and thus scatter, of the experimental values should be expected, as is reflected in Figure 3. As a further consequence, the proportion of ionic-liquid outliers, i.e., compounds whose deviations exceed three times the cross-validated standard deviation, is disproportionately higher (26.6%) than the 4.8% for the normal compounds (see the outliers list in the supplementary material). The small number of ionic liquids compared with the complete set of compounds, however, did not prevent them from remaining included in the parameters calculations without undue deterioration of the result.
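The outlier criterion used above can be written down in a few lines. The sketch assumes that "deviation" means the absolute difference between experimental and calculated surface tension; the function name and the three example value pairs are invented, while 2.16 dyn/cm is the cross-validated standard deviation S quoted in the text.

```python
import numpy as np

def flag_outliers(y_exp, y_calc, sd_cv, factor=3.0):
    """True where |experimental - calculated| exceeds factor * sd_cv."""
    diff = np.abs(np.asarray(y_exp, dtype=float) - np.asarray(y_calc, dtype=float))
    return diff > factor * sd_cv

S_CV = 2.16  # dyn/cm, cross-validated standard deviation from the text

# Invented experimental / calculated pairs in dyn/cm.
is_outlier = flag_outliers([39.0, 60.0, 27.5], [36.0, 48.0, 28.0], S_CV)
```

Only the second pair, with a difference of 12 dyn/cm, lies beyond the limit of 3 × 2.16 = 6.48 dyn/cm.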
The second observation, disguised behind the range of the standard deviation, reveals an important shortcoming of the present prediction method. A small set of compound classes, characterized by the common feature of carrying a homologous sequence of linear methylene chains, exhibits an unexpected deviation of the experimental sequence of surface-tension data from the calculated values, whereas other analogous classes of homologues show fairly normal correlation between experiment and prediction. Typical examples of the latter normal correlation sequence are n-alkylbenzenes [3] (chart a in Figure 4), methyl esters of long-chain carboxylic acids [35] (chart 4b), 1-alkanols [36], and 1-alkylthiols [37], which only deviate from the ideal correlation by slightly differing slopes. In contrast to this, the sequence of the experimental surface tensions in the homologous n-alkane series [38] (chart 4c) is nonlinear and seems to approach a constant maximum with increasing chain length. Analogous nonlinearity with increasing chain length was found for 1-alkenes [38] and 1-bromoalkanes [39] (chart 4d). A nearly linear but inverse correlation was found for a methylene chain homologue substituted at both ends by a nitrate group [40] (chart 4e). This characteristic feature was also found for the homologues of α,ω-dibromo-n-alkane [41] and 1,4-bis(n-alkylcarbonyloxy)-2-butyne [42] (chart 4f). Quite a bizarre surface-tension sequence was revealed by the symmetrical (chart a in Figure 5) and asymmetrical (chart 5b) homologues of the ionic liquids 1,3-bis(n-alkyl)imidazolium and 1-n-alkyl-3-methylimidazolium bis(trifluoromethylsulfonyl)amide [5,33,34], respectively. It is obvious that the present atom-group additivity approach is not able to treat these highly heterogeneous sequences.
The reason behind these deviations has been described by Fowkes [4] as a result of anisotropy on the liquid surface caused by the extensive molecular directional orientation of these compounds, leading to a correlated molecular orientation (CMO). However, Fowkes only related his CMO thesis to linear n-alkanes; its extension to compounds with various substitutions inside or at the end of the methylene chains remains open to further studies. Since the CMO effect is generally small in relation to the other attractive forces on the liquid surface (Fowkes evaluated a range of between 0 for hexane and 2.89 dyn/cm for hexadecane, i.e., ca. 10% of the total force for the largest n-alkane in the series), the maximum range of the surface tension of all these homologous series remained within the deviation limits to allow all of their members to stay included in the parameters calculation. As a consequence, however, the present atom-group additivity method at best provides an average value for the surface tension of these homologues.
The third observation, another form of special intermolecular association, was apparent on comparing the experimental surface tension of di- and tri-hydroxy-group-containing compounds with their calculated values, as the latter systematically underestimated the measured values by far. (An analogous observation was made for hydrazine, ethanolamine, propanolamine, 2-(isopropylamino)ethanol, and ethylenediamine.) Evidently, the excessive increase of the experimental surface tension is caused by an effect that is not captured by the ordinary hydroxy-group parameter (entry 157 in Table 2) and is most probably best described as additional associative intermolecular H-O bond forces. Therefore, a special group (entry 219 in Table 2) has been introduced to take account of the surplus effect of each additional hydroxy group, which indeed improved conformance with the experimental values. Nevertheless, due to the large scatter of the experimental values, which did not indicate any systematic correlation with the corresponding molecular structure, 11 of the 21 examples with available data still exceeded the deviation limits and had to remain in the outliers list. Compare, e.g., the experimental surface tensions of the two closely related outliers 1,2,3-propanetriol and 1,2,6-hexanetriol, showing values of 63.3 and 44.14 dyn/cm [12], respectively, with those of the two structurally very different compounds ethylene glycol and heptaethylene glycol, exhibiting experimental values of 48.43 and 48.39 dyn/cm [16], respectively.

Conclusions
The present results prove the reliable applicability of the atom-group additivity approach to the prediction of the molecular surface tension by simply extending, by a few further lines of control code, the common computer algorithm outlined in [13]. This algorithm has already demonstrated its extraordinary versatility with the trustworthy prediction of 13 further descriptors described in the previous papers [13-15], in a split second and in one single sweep on a desktop computer: the heats of combustion, formation (indirectly), solvation, sublimation and vaporization, the entropy of fusion, the partition coefficient logP(o/w), the solubility logS(water), the refractivity, the polarizability, the toxicity against the protozoan Tetrahymena pyriformis, the liquid viscosity, and the activity coefficient at infinite dilution. In addition, the present method has the advantage of enabling an easily generalizable computer algorithm for the definition of the atom groups, i.e., the atom types and their neighbours. The present work is part of an ongoing project called ChemBrain IXL available from Neuronix Software (www.neuronix.ch, Rudolf Naef, Lupsingen, Switzerland).
Acknowledgments: R. Naef is indebted to the library of the University of Basel for allowing him full and free access to the electronic literature database.
Author Contributions: R. Naef developed project ChemBrain and its software upon which this paper is based, and also fed the database, calculated and analysed the results and wrote the paper. W. E. Acree suggested the extension of ChemBrain's tools to include the presented descriptors and contributed the experimental data and the majority of the literature references. Beyond this, R. Naef is deeply indebted to W. E. Acree for the many valuable discussions.

Conflicts of Interest: The authors declare no conflict of interest.

Barring these special cases, the overwhelming majority of surface-tension data have shown a normal statistical pattern in relation to the predictions, clearly proving the applicability of the present group-additivity approach. But how well does it compare with other published methods? Since to the best of our knowledge the present calculations are founded on the largest set of compounds with experimental surface-tension data, a direct comparison of their reliability with earlier papers, often focusing on only a limited number of closely related compounds, seems of little use. For instance, the most similar concept to the present group-additivity method, published in 2000 [12], yet only applying 12 functional groups, yielded a correlation coefficient R² of 0.754, based on a training set of only 349 compounds of structurally limited extent. The correlation coefficient of 0.995 and average deviation of 1.7% of the ANN method [10] mentioned earlier, on the other hand, are surprisingly good, and questionable, insofar as for a number of compound examples the experimental values, which have been measured by various scientific groups, scatter by far more than 1.7%, as has been demonstrated in the comprehensive paper [16]. Beyond this, any prediction of a property by means of the ANN method is inevitably bound to the computer incorporating the trained artificial network. By contrast, the greatest advantage of the present approach lies in the fact that no computer is required: the prediction of the surface tension of a compound takes only a simple 2D drawing on a sheet of paper to help find all the atom groups, and the parameters of Table 2 to sum up their contributions, as exemplified at the bottom of Section 2.
The large number of valid atom groups in Table 2, currently 165, enables the surface-tension prediction of a wide range of structurally varying molecules. This is evidenced by the surface-tension calculability of 55% of the currently 31,212 compounds in ChemBrain's database, which can be viewed as representative of the entire structural coverage of chemicals.
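The calculability figure quoted above follows from a simple rule: a compound can be predicted only if every atom group it contains is among the valid ones. A minimal sketch of this check, with hypothetical group labels, compound names, and a four-compound stand-in for the database:

```python
def is_calculable(compound_groups, valid_groups):
    """A compound is calculable if all of its atom groups are valid."""
    return set(compound_groups) <= set(valid_groups)

def coverage(database, valid_groups):
    """Fraction of compounds in the database that are calculable."""
    hits = sum(is_calculable(groups, valid_groups) for groups in database.values())
    return hits / len(database)

# Hypothetical group labels and compounds, for illustration only.
valid = {"g1", "g2", "g3"}
database = {
    "compound A": ["g1", "g2"],
    "compound B": ["g1", "g4"],        # g4 is not a valid group
    "compound C": ["g3"],
    "compound D": ["g2", "g5", "g1"],  # g5 is not a valid group
}
fraction = coverage(database, valid)
```

Here two of the four hypothetical compounds contain only valid groups, giving a coverage of 0.5.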

Supplementary Materials: Supplementary materials are available online at http://www.mdpi.com/1420-3049/23/5/1224/s1. The list of compounds, their experimental and calculated data and 3D structures of the surface-tension calculations are available online under the names of "S1. Experimental and Calculated Surface-Tension Data Table.doc" and "S2. Compounds List of Surface-Tension Calculations.sdf". A list of their outliers has been added under the name of "S3. Compounds List of Surface-Tension Outliers.xls". The figures are available as tif files and the tables as doc files under the names given in the text.