Article

Calculation of the Three Partition Coefficients logPow, logKoa and logKaw of Organic Molecules at Standard Conditions at Once by Means of a Generally Applicable Group-Additivity Method

by Rudolf Naef 1,* and William E. Acree, Jr. 2
1 Department of Chemistry, University of Basel, 4003 Basel, Switzerland
2 Department of Chemistry, University of North Texas, Denton, TX 76203, USA
* Author to whom correspondence should be addressed.
Liquids 2024, 4(1), 231-260; https://doi.org/10.3390/liquids4010011
Submission received: 4 December 2023 / Revised: 9 January 2024 / Accepted: 29 January 2024 / Published: 1 March 2024

Abstract

Assessment of the environmental impact of organic chemicals has become an important subject in chemical science. Efficient quantitative descriptors of their impact are their partition coefficients logPow, logKoa and logKaw. We present a group-additivity method, which has already proven its versatility in the reliable prediction of many other molecular descriptors, for the direct calculation of the first two partition coefficients and the indirect calculation of the third, both with high dependability. Based on the experimental logPow data of 3332 molecules and the experimental logKoa data of 1900 molecules at 298.15 K, the respective partition coefficients have been calculated with a cross-validated standard deviation S of only 0.42 and 0.48 log units and a goodness of fit Q2 of 0.9599 and 0.9717, respectively, in a range of ca. 17 log units for both descriptors. The third partition coefficient logKaw has been derived from the calculated values of the former two descriptors and compared with the experimentally determined logKaw values of 1937 molecules, yielding a standard deviation σ of 0.67 log units and a correlation coefficient R2 of 0.9467. This approach enabled the quick calculation of 29,462 logPow, 27,069 logKoa and 26,220 logKaw values for the more than 37,100 molecules of ChemBrain’s database available to the public.

1. Introduction

Environmental considerations of organic molecules as potential contaminants have become an important subject in recent years. Several descriptors have been applied to quantify their impact on the natural environment, among them the octanol/water partition coefficient logPow (more recently named logKow), a standard model for the description of the lipophilicity of drugs in medicinal and agricultural chemistry, whereby octanol is the substitute for the natural organic matter, as well as the octanol/air partition coefficient logKoa and the air/water partition coefficient logKaw, both of which indicate the role of the chemicals for air-breathing organisms [1,2,3]. In view of the time and cost of their experimental determination, fast mathematical methods for the prediction of their values have been developed. An excellent comprehensive overview of the various methods for the prediction of logKow (among many other descriptors) is given by Nieto-Draghi et al. [4]. Cappelli et al. [5] analysed a series of free programs based on atom/fragment contributions, hydrophobicity contributions of atoms, the number of carbon atoms and heteroatoms as well as Monte Carlo methods to calculate logPow and found correlation coefficients R2 of between 0.7 and 0.8 and root mean square errors (RMSE) from 0.8 to 1.5. A number of authors [6,7,8,9,10,11,12,13] have successfully carried out logPow calculations for a large variety of compounds based on various group-additivity methods. Plante and Werner [14] presented a logPow prediction method based on the combination of the calculated data of the four different open-source group-additivity calculation methods AlogP, XlogP2, SlogP and XlogP3 into a single model, providing a best RMSE of 0.63. Ulrich et al. [15] used deep neural networks (DNNs) for the logPow calculations, based on ca. 14,000 different SMILES representations of molecules including potential tautomers, whereby, however, a substantial number of compounds might have been presented as duplicates and triplicates to the DNNs. Their best prediction performance yielded an RMSE of 0.47. Recently, an entirely different path was followed by Sun et al. [16]: since logPow is proportional to the Gibbs free energy of the transfer from one solvent to another, it can be calculated using the free solvation energy in these solvents. Sun et al. used the molecular mechanics–Poisson Boltzmann surface area (MM-PBSA) method for the determination of the free energies of solvation. Their best RMSE for the 707 compounds test set was 0.91.
Many publications [17,18,19,20,21,22,23,24,25,26,27,28] dealing with the prediction of the coefficient logKoa, based on various QSPR methods, are limited to specific chemical families and thus lack general applicability. Li et al. [29] used a group-additivity method based on five fragment constants and one structural correction factor for the evaluation of logKoa, limited to halogenated aromatic pollutants. Recently, Ebert et al. [30] suggested a general-purpose fragment model for the calculation of the air/water partition coefficient logKaw resembling the atom group-additivity method presented in one of our earlier papers [13] for the calculation of the octanol/water partition coefficient logPow, among several further descriptors.
The goal of the present paper was to extend a simple tool, which has already served well for the prediction of the octanol/water coefficient logPow described in [13], so that it calculates all three mentioned partition coefficients at once by means of a uniform computer algorithm based on the atom group-additivity method detailed in [13]. Under common standard conditions, the third partition coefficient can be calculated directly from the other two if we neglect the mutual contamination of water and octanol, which influences the determination of the logPow values and will be addressed later on. It therefore made sense to select the two coefficients for which the group parameters could be founded on the most reliable as well as the largest number of experimental data. It turned out that the experimental data for the partition coefficients logPow and logKoa provided excellent basis sets for the evaluation of their respective tables of atom and special group parameters. Accordingly, from the subsequently calculated values of a molecule’s logPow and logKoa, its air/water partition coefficient logKaw is easily evaluated using the equation logKaw ≈ logPow − logKoa.
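The relation logKaw ≈ logPow − logKoa follows from a simple thermodynamic cycle. The short derivation below is a sketch that assumes dilute solutions and an identical octanol phase in both equilibria; the equilibrium concentrations C in each phase are symbols introduced here for illustration only.

```latex
\begin{align}
K_{ow} &= \frac{C_{\mathrm{oct}}}{C_{\mathrm{water}}}, \qquad
K_{oa} = \frac{C_{\mathrm{oct}}}{C_{\mathrm{air}}}, \qquad
K_{aw} = \frac{C_{\mathrm{air}}}{C_{\mathrm{water}}} \\
K_{aw} &= \frac{K_{ow}}{K_{oa}}
\quad\Longrightarrow\quad
\log K_{aw} = \log P_{ow} - \log K_{oa}
\end{align}
```

The equality becomes an approximation as soon as the octanol phase of the octanol/water system differs from that of the octanol/air system.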

2. Method

The calculation method is based on a regularly updated object-oriented database of more than 37,100 compounds stored in their geometry-optimised 3D structure, encompassing pharmaceuticals, herbicides, pesticides, fungicides, textile and other dyes, ionic liquids, liquid crystals, metal–organics, lab intermediates and many more. Among further experimental and calculated molecular descriptors, the database collects a large set of experimental logPow, logKoa and logKaw data, outlined in the respective sections below. It should be stressed that the 3D geometry-optimised form of the compounds is not required for the calculation of the partition coefficients, except for the algorithm-based determination of intramolecular hydrogen bridges, the impact of which will be discussed further down. In order to avoid structural ambiguities in the presentation of the chemical structures to the computer algorithm defining the molecules’ atom groups, a special algorithm ensured at the time of the input of a new compound that any six-membered aromatic ring system is defined by six aromatic bonds instead of alternating single and double bonds.

2.1. Definition of the Atom and Special Groups

The details of the atom group-additivity model applied in the present study have been outlined in [13]. Accordingly, the definition of the atom types and their immediate atomic neighbourhood and meaning are retained as described in Table 1 of [13] and are also valid for both the logPow and logKoa descriptors. However, since these atom groups are not able to cover certain additional structural effects such as intramolecular hydrogen-bridge bonds and the influence of saturated cyclic compared to saturated noncyclic systems, a number of additional special groups had to be introduced. In a paper applying a different group-additivity method for the calculation of logPow, Klopman et al. [8] discovered that the inclusion of a correction value per carbon atom in pure saturated and unsaturated hydrocarbons improved compliance with the experiment. This has indeed been confirmed in the present study.
In order to take account of these and further potential structure-related peculiarities, the list of atom groups has been extended by “special groups” for which the column-title terms “atom type” and “neighbours” in the subsequent tables should not be taken literally, but which the computer algorithm treats in the same way as ordinary atom groups. In Table 1, the respective special groups, their nomenclature and meaning are detailed. In order to enable a future comparison of the contributions of the special group parameter sets within this study, the same special groups have been applied for the calculation of both descriptors logPow and logKoa.
At present, the list of elements is limited to H, B, C, N, O, P, S, Si and halogen, but an extension is always possible, provided that corresponding molecules with experimental descriptor data are available.

2.2. Calculation of the Atom and Special Group Contributions

Since the algorithm for the evaluation of the parameter values of the atom groups has been outlined in detail in [13], its four steps are only summarised here. The first step selects all the compounds from the database of, at present, more than 37,100 compounds for which the experimental descriptor data in question are known and stores them in a temporary compound list. In the second step, the molecules in the temporary list are broken down into their constituting atom groups, whereby their central atoms, called “backbone atoms”, are characterised in that they are bound to at least two covalently bound neighbour atoms. The atom groups’ atom types and neighbour terms are generated according to the rules described in [13] and their occurrence is registered. Any molecule carrying an atom group that is not found in the pre-defined group parameters table is discarded from the temporary compound list. The third step generates an M × (N + 1) matrix, wherein M is the number of molecules, N + 1 is the number of pre-defined atom groups plus the container for the molecule’s descriptor value, and the matrix element (i,j) contains the number of registered occurrences of the jth atom group in the ith molecule. Atom groups that are not present in any molecule of the temporary list are removed from the M × (N + 1) matrix together with their related jth column. In the final step, this adjusted matrix is normalised into an Ax = B matrix, followed by its balancing by means of a fast Gauss–Seidel calculus [31] to obtain the atom and special group parameters x. These parameters are then added to their related atom and special group in the corresponding parameter table assigned to the specific descriptor.
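The third and fourth steps can be illustrated with a minimal sketch. It assumes that "normalising into an Ax = B matrix" means forming the least-squares normal equations, which may differ from the actual ChemBrain implementation. The function name and the input layout are hypothetical.

```python
def fit_group_parameters(counts, values, sweeps=5000, tol=1e-12):
    """Least-squares fit of group contributions via Gauss-Seidel iteration.

    counts: one row per molecule with the occurrences of each group
    values: experimental descriptor value of each molecule
    Returns the group contributions followed by the constant term.
    """
    # Append a column of ones so that the constant is fitted as well.
    rows = [list(r) + [1.0] for r in counts]
    n = len(rows[0])
    # Assumed normalisation: normal equations (A^T A) x = A^T b. The matrix
    # is symmetric positive definite when the columns are independent,
    # which guarantees convergence of Gauss-Seidel.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    atb = [sum(r[i] * v for r, v in zip(rows, values)) for i in range(n)]
    x = [0.0] * n
    for _ in range(sweeps):
        largest_change = 0.0
        for i in range(n):
            # Gauss-Seidel update: use the newest values of all other unknowns.
            s = sum(ata[i][j] * x[j] for j in range(n) if j != i)
            new = (atb[i] - s) / ata[i][i]
            largest_change = max(largest_change, abs(new - x[i]))
            x[i] = new
        if largest_change < tol:
            break
    return x
```

Columns of groups that occur in no molecule must be removed beforehand, as described in the third step; otherwise a diagonal element of the system is zero and the update divides by zero.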
The group parameter calculation is then immediately followed by the computation of each molecule’s descriptor value in question, on the basis of these group parameters according to Equation (1) outlined in the next section, and compared with its experimental value to receive the related statistics data, which are finally added at the bottom of the parameter table. Following the above-mentioned procedure resulted in the two parameter sets in Table 2 and Table 3, designed for the calculation of the molecules’ logPow and logKoa values, respectively.

2.3. Calculation of Descriptors logPow and logKoa

Based on the aforementioned respective atom-group-parameter tables, the descriptors logPow and logKoa of a molecule can now easily be calculated by simply summing up the contribution of each atom and special group occurring in the molecule, following Equation (1), wherein the indices i and j run over the atom groups and special groups, respectively, Ai and Bj are the numbers of occurrences of the ith atom group and the jth special group in the molecule, ai is the contribution of atom group Ai, bj is the contribution of special group Bj and c is the constant listed at the top of the respective parameter tables.
$$\mathrm{Descriptor}_{\mathrm{calc}} = \sum_i a_i A_i + \sum_j b_j B_j + c \qquad (1)$$
In Table 4, a typical example is presented with endosulfan sulphate (Figure 1), demonstrating the ease of the calculation of logKoa for which the experimental value was 9.68 [30]. Note that the term “endocyclic bonds” only concerns C-C single bonds.
Evidently, the group-additivity method can only calculate a molecule’s logPow or logKoa if a parameter value in the respective Table 2 or Table 3 is known for each atom group that is found in the molecule. Beyond this, since the reliability of these parameter values increases with the number of independent molecules upon which their calculation is based, the lowest reliability limit for these parameters was set to three molecules, which, as a consequence, excluded any atom group based on fewer than three molecules from further calculations. Accordingly, only atom groups for which the number of molecules is three or more, shown in the rightmost columns of Table 2 and Table 3, have been accepted as “valid” for descriptor calculations. This explains why the number of molecules for which logPow and logKoa have been calculated (lines B, C and D in Table 2 and Table 3) is lower than the number upon which the evaluation of the complete set of parameters is based (line A).
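The summation of Equation (1), together with the "valid group" rule, can be sketched in a few lines. The data layout, the function name and the group names used in the usage example are hypothetical and do not reproduce entries of Table 2 or Table 3.

```python
MIN_MOLECULES = 3  # reliability limit stated in the text


def calc_descriptor(group_counts, params, constant):
    """Sum the group contributions; return None if the molecule is not calculable.

    group_counts: {group name: number of occurrences in the molecule}
    params: {group name: (contribution, number of molecules it is based on)}
    constant: the constant c of the parameter table
    """
    total = constant
    for group, occurrences in group_counts.items():
        entry = params.get(group)
        # A group that is missing or based on fewer than three molecules
        # makes the whole molecule non-calculable.
        if entry is None or entry[1] < MIN_MOLECULES:
            return None
        total += entry[0] * occurrences
    return total


# Usage with made-up parameters:
demo_params = {"group_x": (0.5, 120), "group_y": (-0.2, 2)}
print(calc_descriptor({"group_x": 2}, demo_params, 1.0))
print(calc_descriptor({"group_x": 1, "group_y": 1}, demo_params, 1.0))
```

The second call returns None because the hypothetical "group_y" is based on only two molecules.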

2.4. Cross-Validation Calculations

The plausibility of the group-parameter calculations was immediately checked by applying a 10-fold cross-validation algorithm, which comprises 10 recalculations of the complete set of group parameters. Before each recalculation, a different set comprising every 10th compound of the total compounds’ list is temporarily removed from the calculation and separated into a test list, thus ensuring that each molecule has played the role of a test sample exactly once. The combined test data were then statistically worked up and their results added to Table 2 and Table 3 at the bottom in lines E, F, G and H. It may be noticed that the total number of test compounds shown in the right-most column of the statistics lines is lower than that in the training set in lines B, C and D. This is a consequence of the requirement that only “valid” atom groups are to be used for descriptor calculations: due to the 10% lower number of training samples in each recalculation, the number of “valid” atom groups (as defined in the prior section) tends to decrease to an unpredictable degree. Atom groups that are represented by fewer than three molecules, as shown in the right-most column, and are thus not “valid” for descriptor calculations, are therefore remnants of the parameter calculation based on the complete compound set (line A in Table 2 and Table 3). Nevertheless, they have deliberately been left in Table 2 and Table 3 for use in future calculations with additional molecules potentially carrying under-represented atom groups in this ongoing project.
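The splitting scheme can be sketched as follows. The `fit` and `predict` callables are hypothetical placeholders for the parameter evaluation and the descriptor calculation. The formulas used for Q2 (1 − PRESS/SS) and S (root mean square of the test residuals) are common conventions assumed here, not taken from the paper.

```python
import math


def ten_fold_cv(compounds, fit, predict, folds=10):
    """Return (Q2, S) computed from held-out predictions only.

    compounds: list of (features, experimental value)
    fit: callable(training list) -> model
    predict: callable(model, features) -> value, or None if not calculable
    """
    exp, calc = [], []
    for k in range(folds):
        test = compounds[k::folds]  # every 10th compound, shifted by k
        train = [c for i, c in enumerate(compounds) if i % folds != k]
        model = fit(train)
        for features, value in test:
            p = predict(model, features)
            if p is not None:  # skip molecules with non-"valid" groups
                exp.append(value)
                calc.append(p)
    mean = sum(exp) / len(exp)
    press = sum((e - c) ** 2 for e, c in zip(exp, calc))
    ss_tot = sum((e - mean) ** 2 for e in exp)
    return 1.0 - press / ss_tot, math.sqrt(press / len(exp))
```

Because `predict` may return None when a group loses its "valid" status in a reduced training set, the number of evaluated test compounds can be smaller than the number of training compounds, which mirrors the effect described above.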

3. Sources

3.1. Sources of logPow Values

The majority of the experimental logPow data originates from the comprehensive collection of Klopman et al. [8], supplemented by works of Sangster [32] and Lipinski et al. [33], already cited in [13]. Additional data have been provided for unsubstituted and substituted, saturated and unsaturated hydrocarbons, alcohols and esters in the works of Tewari et al. [34]; for heterocycles, hetarenes and carboxylic acids by Ghose et al. [6,7]; complemented for amines, amides and nitro derivatives by Leo [10]. Further data for the aforementioned compound classes have been found in papers by Abraham et al. [35], for certain drugs by Hou and Xu [12] and Wang et al. [11], for organophosphorus derivatives by Czerwinski et al. [36], for the specific energetic compound 2,4-dinitroanisole by Boddu et al. [37], for a number of fluorobenzenes, -anilines and -phenols by Li et al. [38] and finally for a number of pesticides and oil constituents in a paper by Saranjampour et al. [39].

3.2. Sources of logKoa Values

Recently, Ebert et al. [40] published a comprehensive collection of more than 2000 experimental logKoa values upon which the present study is essentially based. This set of data has been complemented with data for 75 chloronaphthalene derivatives by Puzyn et al. [41], for 14 PAHs by Odabasi et al. [42], for some methylsiloxanes and dimethylsilanol by Xu and Kropscott [43] and for ethyl nitrate by Easterbrook et al. [44].

3.3. Sources of logKaw Values

Ebert’s paper [30], cited in the introductory section, presented in their supplementary information a large collection of experimental logKaw data, which served as reference values for the calculated data. Sander [45] provided an extensive library of Henry’s law constants for more than 2600 compounds, which, after translation into logKaw values at 298.15 K, complemented Ebert’s data set.
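The translation of a Henry's law solubility constant into logKaw can be sketched as below. It assumes that Hscp is given in mol/(m³·Pa) and that the gas phase behaves ideally; the function name is hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J/(mol K)
T = 298.15       # temperature, K


def log_kaw_from_hscp(hscp):
    """Convert Hscp in mol/(m^3 Pa) to the dimensionless log10 Kaw."""
    # Dimensionless solubility: ratio of aqueous to gas-phase concentration.
    hscc = hscp * R * T
    # Kaw is the inverse ratio (air over water).
    return -math.log10(hscc)
```

A more soluble compound has a larger Hscp and therefore a lower logKaw.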

4. Results

4.1. Partition Coefficient logPow

As shown at the bottom of Table 2, the number of molecules upon which the present group parameter set is based is 3332, substantially larger than the 2780 samples in our earlier paper [13]. Beyond this, the significantly better statistical results in Table 2 (lines B to H), e.g., a cross-validated (cv) standard deviation S of 0.42 (line H) vs. the earlier value of 0.51, are the result of the removal of molecules from the parameter computation for which the experimental value deviates from the calculated one by more than three times the value of S. The 122 molecules thus removed (3.5% of the total set) have been collected in an outlier list, available in the Supplementary Materials. The larger number of compounds for the group parameter computation not only significantly improved the statistical results but also enlarged the list of “valid” atom groups from 195 to 214, enabling the calculation of the logPow value of at present 29,462 molecules (79.4% of the total dataset). The correlation coefficient R2 of 0.9648, the (cross-validated) Q2 of 0.9599 and the cv standard error S of 0.42, based on 3246 training and 3164 test molecules, respectively, are significantly better than in our earlier paper [13] and compare very well not only with Klopman’s [8] results, which are based on a group-additivity method comparable to ours and have R2 and Q2 values of 0.93 and 0.926, respectively, but also with the statistical results of more elaborate calculation methods published recently [4,5,14,15,16]. As shown in the correlation diagram of Figure 2 and the histogram in Figure 3, the experimental logPow values range from −4.6 to +12.53 with a fairly even Gaussian error distribution.
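The exclusion criterion can be written as a simple filter. The sketch below applies it in a single pass with a given S; whether the original procedure iterates the criterion is not stated, so this is an assumption, and the record layout is hypothetical.

```python
def split_outliers(records, s):
    """Separate records that deviate by more than three times s.

    records: list of (name, experimental value, calculated value)
    Returns (kept, outliers).
    """
    kept, outliers = [], []
    for name, exp, calc in records:
        target = outliers if abs(exp - calc) > 3.0 * s else kept
        target.append((name, exp, calc))
    return kept, outliers
```

With S = 0.42, a compound is moved to the outlier list when its experimental and calculated values differ by more than 1.26 log units.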
It is worth mentioning that the observation discussed in our earlier paper (see Table 9 in [13]) concerning the two forms of amino acids (nonionic or zwitterionic) is not only confirmed by the new and extended group parameter set of Table 2 but also that the logPow differences in nearly all cases even more clearly distinguish the two forms. On the other hand, the ambiguous results concerning the keto/enol forms of the compounds listed in Table 10 in [13] could not be lifted by the new parameter set, which is not surprising in view of the sometimes strong solvent dependence of the equilibrium, as exemplified with acetylacetone [46]. In view of the discussion of certain particularities concerning the subsequent calculation of the third partition coefficient logKaw in Section 4.3, it should be stressed at this point that the calculated logPow values for the hydrocarbons do not show any abnormal or systematic deviations from experimental values.

4.2. Partition Coefficient logKoa

The calculation of the group parameter set of Table 3 used for the prediction of the logKoa values is essentially based on the curated data set provided in Ebert’s paper [40], whereby compounds with just one “backbone atom” such as halomethanes or hydrocyanide had to be omitted as they are obviously not calculable using the present method. After the removal of another 129 compounds as outliers (6.36% of the total), following the same exclusion criterion as in the previous section, 1900 samples with their experimental data (line A in Table 3) remained for the computation of the group parameter values. Again, the outliers have been collected in a separate list available in the Supplementary Materials for readers who might want to re-evaluate their logKoa values.
The subsequent calculation of the logKoa values of 1829 training and 1765 test molecules based on 167 “valid” atom and special groups (line A) revealed excellent statistical results with a correlation coefficient R2 of 0.9765, a standard deviation s of 0.44 (lines B and D) and a cross-validated Q2 of 0.9717 with a corresponding S of 0.48 (lines F and H), visualised in the correlation diagram in Figure 4 and the histogram in Figure 5. These statistical data even outperform those given in Ebert’s paper and thus also the competing methods mentioned therein such as COSMOtherm [47] and EPI-Suite KOAWIN [48], confirming not only the versatility but also the reliability of the present group-additivity approach, which allowed the calculation of the logKoa value for 27,044 molecules (72.9% of the entire database). Again, it should be kept in mind that, just as in Section 4.1, no particularly large or systematic deviations between the experimental and calculated logKoa values for the hydrocarbons could be observed.

4.3. Partition Coefficient logKaw

Once the partition coefficients logPow and logKoa were calculated by means of the group-additivity method based on Table 2 and Table 3, respectively, it was easy to determine the logKaw values, applying Equation (2) on each molecule in the database for which both descriptors had been calculated, adding up to 26,220 molecules. In order to assess the quality of the logKaw values, it is important to recognise the flaws of this approach: while the logPow values were experimentally measured in a mixture of water-saturated octanol and octanol-saturated water, the logKoa measurements occurred in dry octanol, an aspect that has been discussed in detail by Ebert et al. [40]. Hence, Equation (2) serves only as an approximation. In addition, since both descriptors on the right side of the equation appear with their own standard error, the error-propagation rule stipulates a standard error of logKaw that is clearly larger than either of the two constituting descriptors. Entering the standard errors S for the test molecules of 0.42 (for logPow) and 0.48 (for logKoa) into an error-propagation calculation, the expected standard error S for logKaw is 0.638.
$$\log K_{aw}(\mathrm{calc}) = \log P_{ow}(\mathrm{calc}) - \log K_{oa}(\mathrm{calc}) \qquad (2)$$
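The expected standard error of 0.638 can be reproduced with the usual error-propagation rule for a difference of two quantities, assuming that the errors of logPow and logKoa are independent:

```latex
\begin{equation}
S_{\log K_{aw}} = \sqrt{S_{\log P_{ow}}^{2} + S_{\log K_{oa}}^{2}}
= \sqrt{0.42^{2} + 0.48^{2}} = \sqrt{0.4068} \approx 0.638
\end{equation}
```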
In order to test the reliability of the thus-calculated logKaw values, a representative number of experimentally determined logKaw data, extracted from the comprehensive databases of Ebert et al. [30] and Sander [45], were added to the database. In the latter case, the Henry’s law solubility constants Hscp were translated into the corresponding logKaw values at 298.15 K. The comparison of the calculated with the experimental logKaw values is visualised in the correlation diagram of Figure 6 and the histogram in Figure 7.
The complete set of experimental data was separated from the outliers, applying the same exclusion conditions as for the logPow and logKoa values, and the outliers were collected in a corresponding list, available in the Supplementary Materials. Comparison of the remaining dataset with the calculated values yielded a standard error of 0.67, slightly higher than that predicted by the error-propagation calculation. A detailed analysis of the experimental data revealed two potential explanations for the inordinate scatter: (1) Within a series of substitution isomers, e.g., the tetra- or hexachlorobiphenyls, the tri- or pentachlorodiphenyl ethers or the dichloroanisoles, the experimental logKaw values varied in a range of up to and over 1 unit, which was hard to assign to the specific positioning of the substituents. At any rate, the group-additivity-based calculation of the logPow and logKoa values was not able to distinguish between these substitution isomers. (2) Sander’s comprehensive database of Henry’s law constants [45], listing the experimental Hscp values for a compound originating from various authors, showed for many compounds large differences between their Hscp values, in some cases exceeding one unit after translation into logKaw, e.g., for undecane, acetylacetone or anthraquinone.
A thorough analysis of the correlation diagram in Figure 6 and the histogram in Figure 7 revealed an interesting peculiarity, visible as an indentation at the upper end of the correlation diagram and as a weak hump on the right side of the histogram: except for some siloxanes with experimental logKaw values above 1.6 and normal scatter about calculated values, the predicted logKaw for the remaining compounds with experimental logKaw values above −1.0 were nearly systematically too low by ca. 0.5–1 units. It turned out that they were all pure hydrocarbons, in particular alkanes, alkenes and alkynes. The correlation diagram of the logKaw data in Figure 8, focussing on these hydrocarbons, confirms this observation.
Since, as mentioned in Section 4.1 and Section 4.2, no particularly large or systematic deviations between the experimental and calculated logPow and logKoa data for the hydrocarbons could be detected, a potential explanation for this peculiarity might be based on the experimental conditions for the determination of the logPow values as mentioned by Ebert et al. [40]: since water-saturated octanol is a more polar solvent than pure octanol, while octanol-saturated water is less polar than pure water, the experimental logPow values, measured in an octanol/water mixture, tend to be shifted to smaller absolute values than theory would predict. While this is true for all measured solutes, it is possibly most effective for the least polar solutes such as the mentioned hydrocarbons, thus leading to experimental logPow values that are particularly low for the hydrocarbons. As a consequence, their calculation based on the group-additivity method predicts equally low logPow values, which again lead to low logKaw data when Equation (2) is applied and when compared with experimental logKaw values that are determined under pure air/water conditions.

4.4. Interpretation of the Special Groups’ Contributions to logPow and logKoa, and Ultimately for logKaw

While the atom group parameters are descriptor-specific and their comparison between descriptors does not make sense, special groups serve as differentiators of molecules that carry these groups from those that do not. Their meaning therefore carries over from one descriptor to the other; their values, however, must be viewed in the context of the value range of the descriptors. In the present case, the value ranges of logPow and logKoa are similar (ca. 17 log units) and in the same area, and thus, a direct comparison of the special group contributions in Table 2 and Table 3 is permissible and leads to a few interesting observations. While the groups “(COH)n”, “Alkane”, “Unsaturated HC” and “Endocyclic bonds” contribute only to a minor degree in both tables (but nevertheless improve the statistical results) and consequently show only minor differences between the two tables, a significant difference was found for the groups “H/H Acceptor” and “(COOH)n”. The former special group, taking account of intramolecular hydrogen bridges, indicates a small but clear shift of a compound carrying an intramolecular H-bridge towards the octanol side in an octanol/water mixture, thus raising the logPow value. In contrast, the same H-bridge-carrying molecule is significantly shifted towards the air side in an octanol/air environment compared to a molecule without H-bridges, expressed in its lower logKoa value. The reason may be found in the lower solvent–solute interaction caused by the H-bridge being bound intramolecularly, leading in both cases to a preference for the less polar of the respective two media. A typical example is the compound couple 2- and 3-nitroaniline, sampled in Table 5, where the former molecule carries a H-bridge between an amino-H and an oxygen of the nitro group.
An inverse effect can be found with molecules carrying two or more carboxylic acid functions: While the additional contribution of a second or third COOH function shows little effect in an octanol/water environment with a slightly increased shift towards water, leading to a lower logPow value, in an octanol/air environment, each additional COOH group drastically tilts the equilibrium towards the octanol side, thus strongly raising the logKoa value. This may be demonstrated by the couple of hexanoic/1,6-hexanedioic acid, where both have the same carbon-chain length but where the second molecule carries two carboxylic acid functions, which tilts the octanol/air equilibrium by a factor of more than 10,000 towards the octanol side as shown in Table 6. Now, it is well known that monocarboxylic acids usually exist as dimeric associates in all three aggregate states. This association effect on the solubility is inherently taken into account in the atom group parameter evaluation of the COOH function. On the other hand, dicarboxylic acids do not only form dimers but also cyclical and linear oligomeric associates, with drastic consequences on their solubility in the various solvents. It is these additional associations that are considered by the special group “(COOH)n”.
As a consequence, solutes with a low tendency to interact with solvents, either inherent or induced by intramolecular hydrogen bridges, show a trend to higher logKaw values; the additional intermolecular association of di- and tricarboxylic acids, on the other hand, results in a significantly lower logKaw value, as exemplified in Table 7, where the respective calculated data in Table 5 and Table 6 have been applied in Equation (2). The experimental logKaw values have been extracted from Ebert et al. [30].

5. Conclusions

The present study, which is part of an ongoing project, applies a common group-additivity method to the simple and reliable calculation of the two partition coefficients logPow and logKoa. The same method has already proven its versatility in the equally reliable prediction of what are now up to 19 physical, thermodynamic, solubility-, optics-, charge- and environment-related molecular descriptors [13,49,50,51,52,53,54,55]. The large databases of more than 3300 and 1900 experimental values, respectively, on which the group parameters for the logPow and logKoa calculations are founded enabled their prediction for nearly 29,500 and more than 27,000 molecules, respectively, of the more than 37,100 compounds currently in ChemBrain’s database. In addition, these results allowed the calculation of the third partition coefficient logKaw for more than 26,000 compounds. A major advantage of the present approach is its ease of use: the parameters of the atoms and groups found in a particular molecule, which are listed in Table 2 and Table 3, respectively, can simply be added up with paper and pencil.
The project’s software, ChemBrain IXL (version 5.9.70.1), is available from Neuronix Software (www.neuronix.ch (accessed on 27 November 2023), Rudolf Naef, Lupsingen, Switzerland).

Supplementary Materials

The following supporting information can be downloaded at https://www.mdpi.com/article/10.3390/liquids4010011/s1: The lists of compounds used in the present work, collected in their 3D structure together with their experimental data, are available as standard SDF files for use in external chemical software under the names of “S01. Compounds List for logPow-Parameters Calculations.sdf”, “S02. Compounds List for logKoa-Parameters Calculations.sdf” and “S03. Compounds List with exp logKaw Data”. The compounds used in the correlation diagrams and histograms are listed with their names and experimental and calculated data under the respective names of “S04. Compounds with Experimental vs. Calculated logPow Values.doc”, “S05. Compounds with Experimental vs. Calculated logKoa Values.doc”, “S06. Compounds with Experimental vs. Calculated logKaw Values.doc” and “S07. Alkanes, Alkenes and Alkynes with Exp. vs. Calc. logKaw Values.doc”. In addition, for each of the three partition coefficients, a list of their outliers has been added under the names of “S08. Outliers of logPow.doc”, “S09. Outliers of logKoa.doc” and “S10. Outliers of logKaw.doc”. Beyond this, the Supplementary Materials encompass all the figures and tables cited in the text as .tif and .doc files, respectively.

Author Contributions

R.N. developed the project ChemBrain and the software on which this paper is based, populated the database, calculated and analysed the results and wrote the paper. W.E.A.J. suggested the extension of ChemBrain’s tool and contributed experimental data and the great majority of the literature references. Beyond this, R.N. is indebted to W.E.A.J. for many valuable discussions. All authors have read and agreed to the published version of the manuscript.

Funding

This research received no external funding.

Data Availability Statement

The data presented in this study are available in the Supplementary Materials.

Acknowledgments

R.N. is indebted to the library of the University of Basel for allowing him full and free access to the electronic literature database.

Conflicts of Interest

The authors declare no conflicts of interest.

References

  1. Simonich, S.L.; Hites, R.A. Organic Pollutant Accumulation in Vegetation. Environ. Sci. Technol. 1995, 29, 2905–2914.
  2. McLachlan, M.S. Bioaccumulation of Hydrophobic Chemicals in Agricultural Food Chains. Environ. Sci. Technol. 1996, 30, 252–259.
  3. Doucette, W.J.; Shunthirsasingham, C.; Dettenmaier, E.M.; Zaleski, R.T.; Fantke, P.; Arnot, J.A. A Review of Measured Bioaccumulation Data on Terrestrial Plants for Organic Chemicals: Metrics, Variability, and the Need for Standardized Measurement Protocols. Environ. Toxicol. Chem. 2018, 37, 21–33.
  4. Nieto-Draghi, C.; Fayet, G.; Creton, B.; Rozanska, X.; Rotureau, P.; de Hemptinne, J.-C.; Ungerer, P.; Rousseau, B.; Adamo, C. A General Guidebook for the Theoretical Prediction of Physicochemical Properties of Chemicals for Regulatory Purposes. Chem. Rev. 2015, 115, 13093–13164.
  5. Cappelli, C.I.; Benfenati, E.; Cester, J. Evaluation of QSAR models for predicting the partition coefficient (log P) of chemicals under the REACH regulation. Environ. Res. 2015, 43, 26–32.
  6. Ghose, A.K.; Crippen, G.M. Atomic physicochemical parameters for three-dimensional structure-directed quantitative structure-activity relationships I. Partition coefficients as a measure of hydrophobicity. J. Comput. Chem. 1986, 7, 565–577.
  7. Ghose, A.K.; Pritchett, A.; Crippen, G.M. Atomic physicochemical parameters for three dimensional structure directed quantitative structure-activity relationships III: Modeling hydrophobic interactions. J. Comput. Chem. 1988, 9, 80–90.
  8. Klopman, G.; Li, J.-Y.; Wang, S.; Dimayuga, M. Computer automated log P calculations based on an extended group contribution approach. J. Chem. Inf. Comput. Sci. 1994, 34, 752–781.
  9. Viswanadhan, V.N.; Ghose, A.K.; Revankar, G.R.; Robins, R.K. Atomic physicochemical parameters for three dimensional structure directed quantitative structure-activity relationships. 4. Additional parameters for hydrophobic and dispersive interactions and their application for an automated superposition of certain naturally occurring nucleoside antibiotics. J. Chem. Inf. Comput. Sci. 1989, 29, 163–172.
  10. Leo, A.J. Calculating log Poct from structures. Chem. Rev. 1993, 93, 1281–1306.
  11. Wang, R.; Fu, Y.; Lai, L. A new atom-additive method for calculating partition coefficients. J. Chem. Inf. Comput. Sci. 1997, 37, 615–621.
  12. Hou, T.J.; Xu, X.J. ADME evaluation in drug discovery. 2. Prediction of partition coefficient by atom-additive approach based on atom-weighted solvent accessible surface areas. J. Chem. Inf. Comput. Sci. 2003, 43, 1058–1067.
  13. Naef, R. A Generally Applicable Computer Algorithm Based on the Group Additivity Method for the Calculation of Seven Molecular Descriptors: Heat of Combustion, LogPO/W, LogS, Refractivity, Polarizability, Toxicity and LogBB of Organic Compounds; Scope and Limits of Applicability. Molecules 2015, 20, 18279–18351.
  14. Plante, J.; Werner, S. JPlogP: An improved logP predictor trained using predicted data. J. Cheminform. 2018, 10, 61.
  15. Ulrich, N.; Goss, K.-U.; Ebert, A. Exploring the octanol–water partition coefficient dataset using deep learning techniques and data augmentation. Commun. Chem. 2021, 4, 90.
  16. Sun, Y.; Hou, T.; He, X.; Man, V.H.; Wang, J. Development and test of highly accurate endpoint free energy methods. 2: Prediction of logarithm of n-octanol–water partition coefficient (logP) for druglike molecules using MM-PBSA method. J. Comput. Chem. 2023, 44, 1300–1311.
  17. Chen, J.; Harner, T.; Schramm, K.W.; Quan, X.; Xue, X.; Wu, W.; Kettrup, A. Quantitative relationships between molecular structures, environmental temperatures and octanol/air partition coefficients of PCDD/Fs. Sci. Total Environ. 2002, 300, 155–166.
  18. Chen, J.; Harner, T.; Yang, P.; Quan, X.; Chen, S.; Schramm, K.W.; Kettrup, A. Quantitative predictive models for octanol/air partition coefficients of polybrominated diphenyl ethers at different temperatures. Chemosphere 2003, 51, 577–584.
  19. Chen, J.; Harner, T.; Schramm, K.W.; Quan, X.; Xue, X.; Kettrup, A. Quantitative relationships between molecular structures, environmental temperatures and octanol/air partition coefficients of polychlorinated biphenyls. Comput. Biol. Chem. 2003, 27, 405–421.
  20. Zhao, H.; Chen, J.; Quan, X.; Qu, B.; Liang, X. Octanol/air partition coefficients of polybrominated biphenyls. Chemosphere 2009, 74, 1490–1494.
  21. Staikova, M.; Wania, F.; Donaldson, D. Molecular polarizability as a single parameter predictor of vapour pressures and octanol–air partitioning coefficients of non-polar compounds: A priori approach and results. Atmos. Environ. 2004, 38, 213–225.
  22. Zhao, H.; Zhang, Q.; Chen, J.; Xue, X.; Liang, X. Prediction of octanol/air partition coefficients of semivolatile organic compounds based on molecular connectivity index. Chemosphere 2005, 59, 1421–1426.
  23. Zeng, X.L.; Zhang, X.L.; Wang, Y. QSPR modeling of n-octanol/air partition coefficients and liquid vapor pressures of polychlorinated dibenzo-p-dioxins. Chemosphere 2013, 91, 229–232.
  24. Liu, H.; Shi, J.; Liu, H.; Wang, Z. Improved 3D-QSPR analysis of the predictive octanol/air partition coefficients of hydroxylated and methoxylated polybrominated diphenyl ethers. Atmos. Environ. 2013, 77, 840–845.
  25. Jiao, L.; Gao, M.; Wang, X.; Li, H. QSPR study on the octanol/air partition coefficient of polybrominated diphenyl ethers by using molecular distance-edge vector index. Chem. Cent. J. 2014, 8, 36.
  26. Chen, Y.; Cai, X.; Jiang, L.; Li, Y. Prediction of octanol-air partition coefficients for polychlorinated biphenyls (PCBs) using 3D-QSAR models. Ecotoxicol. Environ. Saf. 2016, 124, 202–212.
  27. Fu, Z.; Chen, J.; Li, X.; Wang, Y.; Yu, H. Comparison of prediction methods for octanol-air partition coefficients of diverse organic compounds. Chemosphere 2016, 148, 118–125.
  28. Jin, X.; Fu, Z.; Li, X.; Chen, J. Development of polyparameter linear free energy relationship models for octanol/air partition coefficients of diverse chemicals. Environ. Sci. Process. Impacts 2017, 19, 300–306.
  29. Li, X.; Chen, J.; Zhang, L.; Qiao, X.; Huang, L. The fragment constant method for predicting octanol/air partition coefficients of persistent organic pollutants at different temperatures. J. Phys. Chem. Ref. Data 2006, 35, 1365–1384.
  30. Ebert, R.-U.; Kühne, R.; Schüürmann, G. Henry’s Law Constant—A General-Purpose Fragment Model to Predict Log Kaw from Molecular Structure. Environ. Sci. Technol. 2023, 57, 160–167.
  31. Hardtwig, E. Fehler- und Ausgleichsrechnung; Bibliographisches Institut AG: Mannheim, Germany, 1968.
  32. Sangster, J. Octanol-water partition coefficients of simple organic compounds. J. Phys. Chem. Ref. Data 1989, 18, 1111–1229.
  33. Lipinski, C.A.; Lombardo, F.; Dominy, B.W.; Feeney, P.J. Experimental and computational approaches to estimate solubility and permeability in drug discovery and development settings. Adv. Drug Deliv. Rev. 1997, 23, 3–25.
  34. Tewari, Y.B.; Miller, M.M.; Wasik, S.P.; Martire, D.E. Aqueous Solubility and Octanol/Water Partition Coefficient of Organic Compounds at 25.0 °C. J. Chem. Eng. Data 1982, 27, 451–454.
  35. Abraham, M.H.; Chadha, H.S.; Whiting, G.S.; Mitchell, R.C. Hydrogen Bonding. 32. An Analysis of Water-Octanol and Water-Alkane Partitioning and the Δlog P Parameter of Seiler. J. Pharm. Sci. 1994, 83, 1085–1100.
  36. Czerwinski, S.E.; Skvorak, J.P.; Maxwell, D.M.; Lenz, D.E.; Baskin, S.I. Organophosphorus Compounds on Biodistribution and Percutaneous Toxicity. J. Biochem. Mol. Tox. 2006, 20, 241–246.
  37. Boddu, V.M.; Abburi, K.; Maloney, S.W.; Damavarapu, R. Thermophysical Properties of an Insensitive Munitions Compound, 2,4-Dinitroanisole. J. Chem. Eng. Data 2008, 53, 1120–1125.
  38. Li, X.-J.; Shan, G.; Liu, H.; Wang, Z.-Y. Determination of lgKow and QSPR Study on Some Fluorobenzene Derivatives. Chin. J. Struct. Chem. 2009, 28, 1236–1241.
  39. Saranjampour, P.; Vebrosky, E.N.; Armbrust, K.L. Salinity Impacts on Water Solubility and n-Octanol/Water Partition Coefficients of Selected Pesticides and Oil Constituents. Environ. Toxicol. Chem. 2017, 36, 2274–2280.
  40. Ebert, R.-U.; Kühne, R.; Schüürmann, G. Octanol/Air Partition Coefficient. A General-Purpose Fragment Model to Predict Log Koa from Molecular Structure. Environ. Sci. Technol. 2023, 57, 976–984.
  41. Puzyn, T.; Falandysz, J.; Rostkowski, P.; Piliszek, S.; Wilczynska, A. Computational estimation of logarithm of octanol/air partition coefficients and subcooled vapour pressures for each of 75 chloronaphthalene congeners. Phys.-Chem. Prop. Distr. Model. Organohal. Compds. 2004, 66, 2354–2360.
  42. Odabasi, M.; Cetin, E.; Sofuoglu, A. Determination of octanol–air partition coefficients and supercooled liquid vapor pressures of PAHs as a function of temperature: Application to gas–particle partitioning in an urban atmosphere. Atmos. Environ. 2006, 40, 6615–6625.
  43. Xu, S.; Kropscott, B. Method for Simultaneous Determination of Partition Coefficients for Cyclic Volatile Methylsiloxanes and Dimethylsilanediol. Anal. Chem. 2012, 84, 1948–1955.
  44. Easterbrook, K.D.; Vona, M.A.; Osthoff, H.D. Measurement of Henry’s law constants of ethyl nitrate in deionized water, synthetic sea salt solutions, and n-octanol. Chemosphere 2024, 346, 140482.
  45. Sander, R. Compilation of Henry’s law constants (version 5.0.0) for water as solvent. Atmos. Chem. Phys. 2023, 23, 10901–12440.
  46. Allen, G.; Dwek, R.A. An n.m.r. study of keto-enol tautomerism in β-diketones. J. Chem. Soc. B 1966, 1966, 161–163.
  47. COSMOlogic GmbH Co. KG. A Dassault Systemes Company, Version 19.0.4, COSMOthermX. 2019. Available online: www.cosmologic.de (accessed on 4 December 2023).
  48. US EPA. Estimation Programs Interface Suite for Microsoft Windows; Version 4.11, Module KOAWIN v. 1.11; United States Environmental Protection Agency: Washington, DC, USA, 2015.
  49. Naef, R.; Acree, W.E., Jr. Calculation of Five Thermodynamic Molecular Descriptors by Means of a General Computer Algorithm Based on the Group-Additivity Method: Standard Enthalpies of Vaporization, Sublimation and Solvation, and Entropy of Fusion of Ordinary Organic Molecules and Total Phase-Change Entropy of Liquid Crystals. Molecules 2017, 22, 1059.
  50. Naef, R.; Acree, W.E., Jr. Application of a General Computer Algorithm Based on the Group-Additivity Method for the Calculation of Two Molecular Descriptors at Both Ends of Dilution: Liquid Viscosity and Activity Coefficient in Water at Infinite Dilution. Molecules 2018, 23, 5.
  51. Naef, R.; Acree, W.E., Jr. Calculation of the Surface Tension of Ordinary Organic and Ionic Liquids by Means of a Generally Applicable Computer Algorithm Based on the Group-Additivity Method. Molecules 2018, 23, 1224.
  52. Naef, R. Calculation of the Isobaric Heat Capacities of the Liquid and Solid Phase of Organic Compounds at 298.15K by Means of the Group-Additivity Method. Molecules 2020, 25, 1147.
  53. Naef, R.; Acree, W.E., Jr. Calculation of the Vapour Pressure of Organic Molecules by Means of a Group-Additivity Method and Their Resultant Gibbs Free Energy and Entropy of Vaporization at 298.15 K. Molecules 2021, 26, 1045.
  54. Naef, R.; Acree, W.E., Jr. Revision and Extension of a Generally Applicable Group-Additivity Method for the Calculation of the Standard Heat of Combustion and Formation of Organic Molecules. Molecules 2021, 26, 6101.
  55. Naef, R.; Acree, W.E., Jr. Revision and Extension of a Generally Applicable Group Additivity Method for the Calculation of the Refractivity and Polarizability of Organic Molecules at 298.15 K. Liquids 2022, 2, 327–377.
Figure 1. Endosulfan sulphate (graphics by ChemBrain IXL).
Figure 2. Correlation diagram of the logPow data. Cross-validation data are superpositioned as red circles (10-fold cross-valid.: N = 3246; Q2 = 0.9599; regression line: intercept = 0.1052; slope = 0.9636).
Figure 3. Histogram of the logPow data. Cross-validation data are superpositioned as red bars. (σ = 0.39; S = 0.42; experimental values range from −4.6 to +12.53).
Figure 4. Correlation diagram of the logKoa data. Cross-validation data are superpositioned as red circles (10-fold cross-valid.: N = 1829; Q2 = 0.9717; regression line: intercept = 0.1997; slope = 0.9729; MAPD = 6.39%).
Figure 5. Histogram of the logKoa data. Cross-validation data are superpositioned as red bars (σ = 0.44; S = 0.48; experimental values range from 0.28 to 17.15).
Figure 6. Correlation diagram of the logKaw data (N = 1937; R2 = 0.9467; regression line: intercept = −0.4196; slope = 0.9044).
Figure 7. Histogram of the logKaw data (S = 0.67; experimental values range from −17.99 to +3.71).
Figure 8. Correlation diagram of the logKaw data for alkanes, alkenes and alkynes (N = 170).
Table 1. Special groups and their meaning.
Atom Type | Neighbours | Meaning
H | H Acceptor | Correction value for intramolecular H bridge between acidic H (on O, N or S) and basic acceptor (O, N or F)
(COH)n | n > 1 | Correction value for each additional hydroxy group
(COOH)n | n > 1 | Correction value for each additional carboxylic acid group
Alkane | No. of C atoms | Correction value for each C atom in a pure alkane
Unsaturated HC | No. of C atoms | Correction value for each C atom in an aromatic hydrocarbon
Endocyclic bonds | No. of single bonds | Correction value for each single endocyclic bond
Table 2. Atom and special groups and their contribution in logPow calculations.
EntryAtom TypeNeighboursContributionOccurrencesMolecules
1Const 0.7333323332
2B(−)F42.711010
3C sp3H3C0.2726141498
4C sp3H3N0.14457320
5C sp3H3N(+)−1.3522
6C sp3H3O−0.26375285
7C sp3H3P−0.344
8C sp3H3S−0.346153
9C sp3H3Si0.76445
10C sp3H2C20.4432621046
11C sp3H2CN0.42741429
12C sp3H2CN(+)−0.863225
13C sp3H2CO−0.1799604
14C sp3H2CS−0.339769
15C sp3H2CF−0.2955
16C sp3H2CCl0.338467
17C sp3H2CBr0.415448
18C sp3H2CJ1.0866
19C sp3H2CP2.7711
20C sp3H2N22.0533
21C sp3H2NO0.4644
22C sp3H2NS0.7233
23C sp3H2O2−0.1766
24C sp3H2S2−0.8666
25C sp3HC30.45417269
26C sp3HC2N0.58200157
27C sp3HC2N(+)−0.732524
28C sp3HC2O0.1383241
29C sp3HC2S−0.2188
30C sp3HC2F−0.3622
31C sp3HC2Cl0.696422
32C sp3HC2Br0.812622
33C sp3HCN21.265
34C sp3HCNO1.151717
35C sp3HCNS0.92525
36C sp3HCO2−0.023122
37C sp3HCOS0.633
38C sp3HCOCl0.1931
39C sp3HCOBr1.0311
40C sp3HCOP0.3111
41C sp3HCF2−0.0222
42C sp3HCCl20.931312
43C sp3HOF2−0.0411
44C sp3C40.54144111
45C sp3C3N0.713736
46C sp3C3N(+)−0.4366
47C sp3C3O0.045452
48C sp3C3S−0.11717
49C sp3C3F0.9444
50C sp3C3Cl0.8218
51C sp3C3Br0.5954
52C sp3C2N2−1.1711
53C sp3C2NO0.5255
54C sp3C2O21.6555
55C sp3C2F20.6722
56C sp3C2Cl20.8499
57C sp3CNO21.4611
58C sp3CF30.868076
59C sp3CF2Cl1.132
60C sp3CFCl21.132
61C sp3CCl31.62321
62C sp3CCl2Br011
63C sp3CBr32.4411
64C sp3OF30.822
65C sp3SF31.0488
66C sp3SFCl21.911
67C sp3SCl30.7633
68C sp2H2=C0.259787
69C sp2H2=N−0.6211
70C sp2HC=C0.24449285
71C sp2HC=N−1.981818
72C sp2HC=N(+)0.941010
73C sp2H=CN−0.08146109
74C sp2H=CN(+)−0.61818
75C sp2HC=O−0.734545
76C sp2H=CO0.321413
77C sp2H=CS0.021716
78C sp2H=CCl0.5186
79C sp2H=CBr0.5911
80C sp2HN=N−0.066552
81C sp2HN=O−0.631615
82C sp2HO=O−0.41010
83C sp2H=NS−0.5144
84C sp2C2=C0.38160133
85C sp2C2=N−0.25105102
86C sp2C2=N(+)2.4511
87C sp2C2=O−0.86242194
88C sp2C=CN0.767664
89C sp2C=CN(+)−0.5633
90C sp2C=CO0.644136
91C sp2C=CS−0.161715
92C sp2C=CF−0.0133
93C sp2C=CCl0.813121
94C sp2C=CBr0.9444
95C sp2C=CJ0.8911
96C sp2C=CP011
97C sp2=CN21.361919
98C sp2=CN2(+)0.741111
99C sp2CN=N0.246763
100C sp2CN=N(+)−0.6711
101C sp2CN=O−0.69449364
102C sp2C=NO−0.7611
103C sp2=CNO−0.0144
104C sp2=CNO(+)−0.3722
105C sp2CN=S−0.3688
106C sp2C=NS0.0754
107C sp2=CNS0.3744
108C sp2=CNCl1.9411
109C sp2=CNBr0.753
110C sp2C=NCl1.7511
111C sp2CO=O−0.13700613
112C sp2CO=O(-)−2.163535
113C sp2C=OS−0.9944
114C sp2C=OCl0.2844
115C sp2=COCl1.2711
116C sp2=CS2033
117C sp2=CSBr−2.4111
118C sp2=CF20.2611
119C sp2=CCl21.211210
120C sp2=CBr21.3611
121C sp2N2=N0.792625
122C sp2N2=N(+)0.7411
123C sp2N2=O0.07135134
124C sp2N=NO0.1111
125C sp2N2=S0.1198
126C sp2N=NS0.242524
127C sp2N=NCl1.1333
128C sp2N=NBr0.2432
129C sp2NO=O0.2117114
130C sp2=NOS−0.1911
131C sp2N=OS0.0577
132C sp2NO=S0.9711
133C sp2=NS2−1.6522
134C sp2NS=S−1.0253
135C sp2=NSCl1.1711
136C sp2O2=O033
137C sp2O=OCl−0.1333
138C aromaticH:C20.2599632133
139C aromaticH:C:N−0.49283193
140C aromaticH:C:N(+)0.223327
141C aromaticH:N2−0.9199
142C aromatic:C30.25389170
143C aromaticC:C20.3220231351
144C aromaticC:C:N−0.387462
145C aromaticC:C:N(+)−3.2943
146C aromatic:C2N0.39653534
147C aromatic:C2N(+)−0.15194161
148C aromatic:C2:N−0.099372
149C aromatic:C2:N(+)−3.541919
150C aromatic:C2O0.571076742
151C aromatic:C2S0.08208170
152C aromatic:C2F0.2712686
153C aromatic:C2Cl0.781718565
154C aromatic:C2Br0.9248111
155C aromatic:C2J1.265034
156C aromatic:C2P1.0811
157C aromaticC:N2−1.8199
158C aromatic:C:N2−0.1311
159C aromatic:CN:N0.493834
160C aromatic:CN:N(+)−0.8311
161C aromatic:C:NO0.972115
162C aromatic:C:NS−0.1655
163C aromatic:C:NF−0.2343
164C aromatic:C:NCl0.161816
165C aromatic:C:NBr0.0611
166C aromaticN:N2−0.055141
167C aromaticN:N2(+)011
168C aromatic:N2O1.5388
169C aromatic:N2S0.833
170C aromatic:N2Cl0.8966
171C(+) aromaticH:N20.212525
172C spH#C−0.272828
173C spC#C0.28657
174C spC#N−0.7136130
175C spN#N0.0433
176C sp#NS−0.5955
177C sp=N=O0.6444
178C sp=N=S1.532726
179N sp3H2C−1.578684
180N sp3H2C(pi)−1.05326292
181N sp3H2N−0.852020
182N sp3H2S−1.553434
183N sp3HC2−1.37473
184N sp3HC2(pi)−0.93225203
185N sp3HC2(2pi)−0.47311272
186N sp3HCN−1.143
187N sp3HCN(pi)−0.491413
188N sp3HCN(2pi)1.654242
189N sp3HCO(pi)−1.3299
190N sp3HCS−1.6944
191N sp3HCS(pi)−0.984747
192N sp3HCP−1.7833
193N sp3HCP(pi)−0.4111
194N sp3C3−1.03122108
195N sp3C3(pi)−0.73153138
196N sp3C3(2pi)−0.72149136
197N sp3C3(3pi)−0.752323
198N sp3C2N−1.5711
199N sp3C2N(pi)−1.413128
200N sp3C2N(2pi)−0.675147
201N sp3C2N(3pi)−0.441010
202N sp3C2O(pi)−0.3155
203N sp3C2S−1.4255
204N sp3C2S(pi)0.0376
205N sp3C2S(2pi)0.7622
206N sp3C2P−0.3353
207N sp3CN2(2pi)1.3611
208N sp3CS20.2711
209N sp3CS2(pi)−0.2911
210N sp2H=C−0.671211
211N sp2C=C−0.72200180
212N sp2C=N0.011312
213N sp2=CN0.499678
214N sp2C=N(+)−6.6111
215N sp2=CN(+)−1.0222
216N sp2=CO−0.644741
217N sp2C=O−1.0522
218N sp2=CS−1.4454
219N sp2N=N−0.782518
220N sp2N=O0.164037
221N aromaticC2:C(+)05025
222N aromatic:C20.38354258
223N aromatic:C:N−0.3542
224N(+) sp3H3C−1.032626
225N(+) sp3H2C21.255
226N(+) sp3HC32.6811
227N(+) sp3C43.0311
228N(+) sp2C=CO(−)−2.31010
229N(+) sp2CO=O(−)0.27235198
230N(+) sp2NO=O(−)−0.1922
231N(+) sp2O2=O(−)0.445529
232N(+) aromaticH:C22.533
233N(+) aromaticC:C2−0.4876
234N(+) aromatic:C2O(−)1.731919
235N(+) sp=C=N(−)1.811
236N(+) sp=N2(−)011
237OHC−0.96481344
238OHC(pi)−0.72627557
239OHN−0.151111
240OHN(pi)−0.2466
241OC20.06156115
242OC2(pi)−0.13726588
243OC2(2pi)−0.51301280
244OCN0.433
245OCN(pi)0.8244
246OCN(+)(pi)0.015529
247OCN(2pi)0.531312
248OCS−0.13138
249OCS(pi)−0.133
250OCP0.2313268
251OCP(pi)−0.493626
252OCSi−0.1582
253ON2(2pi)1.9155
254ONP(pi)−1.951414
255OSi20.09184
256S2HC0.651412
257S2HC(pi)0.143131
258S2C21.394845
259S2C2(pi)0.986863
260S2C2(2pi)0.985554
261S2CN033
262S2CN(2pi)2.311
263S2CS0.8721
264S2CS(pi)1.9742
265S2CP1.121715
266S2CP(pi)0.4832
267S2N2−2.222
268S2N2(2pi)5.9611
269S4C2=O−1.131111
270S4C2=O2−0.51616
271S4CO=O2−0.4821
272S4CN=O2−0.058580
273S4C=O2F0.2422
274S4NO=O2033
275S4N2=O20.7755
276S4O2=O0.8322
277S4O2=O20.522
278S4O2=O2(−)−1.1433
279P4CO2=O−1.1122
280P4CO2=S0.2611
281P4CO=OS−2.5811
282P4CO=OF−0.8833
283P4COS=S−2.0411
284P4O3=O−0.562929
285P4O3=S1.121818
286P4O2S=S0.71211
287P4O=OS2−0.5422
288P4N3=O−0.3111
289P4N2O=O0.2422
290P4NO=OS−1.522
291SiC4−0.5111
292SiC3O−1.721
293SiC2O20.13174
294SiO4022
295Halide 1.12019
296HH Acceptor0.51164154
297(COH)nn > 10.2613774
298(COOH)nn > 1−0.152625
299AlkaneNo. of C atoms0.0929032
300Unsaturated HCNo. of C atoms0.021584135
301Endocyclic bondsNo. of single bds−0.142338384
ABased onValid groups214 3332
BGoodness of fitR20.9648 3246
CDeviationAverage0.31 3246
DDeviationStandard0.39 3246
EK-fold cvK10 3164
FGoodness of fitQ20.9599 3164
GDeviationAverage (cv)0.33 3164
HDeviationStandard (cv)0.42 3164
Table 3. Atom and special groups and their contribution in logKoa calculations.
EntryAtom TypeNeighboursContributionOccurrencesMolecules
1Const 1.4619001900
2C sp3H3C−0.071800875
3C sp3H3N3.4213187
4C sp3H3N(+)1.4211
5C sp3H3O2.24292219
6C sp3H3S1.513026
7C sp3H3P−0.4233
8C sp3H3Si0.426811
9C sp3H2C20.431732538
10C sp3H2CN3.91191129
11C sp3H2CN(+)1.6465
12C sp3H2CO2.61535342
13C sp3H2CS1.765744
14C sp3H2CP2.5833
15C sp3H2CF−0.7733
16C sp3H2CCl0.717556
17C sp3H2CBr1.052318
18C sp3H2CJ1.1355
19C sp3H2CSi2.9144
20C sp3H2N24.8983
21C sp3H2NO5.6598
22C sp3H2NS4.6755
23C sp3H2O24.7864
24C sp3H2S23.7444
25C sp3HC30.64268180
26C sp3HC2N4.086453
27C sp3HC2N(+)2.0511
28C sp3HC2O2.86169135
29C sp3HC2S1.7697
30C sp3HC2F−1.6611
31C sp3HC2Cl1.214317
32C sp3HC2Br1.31149
33C sp3HC2J1.9511
34C sp3HCNO8.1833
35C sp3HCNS2.0811
36C sp3HCO25.7366
37C sp3HCF2−0.1877
38C sp3HCFCl0.0222
39C sp3HCCl21.181514
40C sp3HCClBr0.7711
41C sp3HOF21.7933
42C sp3C40.739884
43C sp3C3N4.11313
44C sp3C3O3.114037
45C sp3C3S2.633
46C sp3C3Cl0.873715
47C sp3C2NO5.9411
48C sp3C2O25.9466
49C sp3C2F20.235810
50C sp3C2Cl21.241817
51C sp3CNO29.5611
52C sp3COF23.0633
53C sp3CF3−0.065551
54C sp3CF2Cl−0.0243
55C sp3CFCl20.3732
56C sp3CCl31.621716
57C sp3CBr30.5711
58C sp3O2F26.8511
59C sp3OF31.8633
60C sp2H2=C−0.198876
61C sp2HC=C0.34233141
62C sp2HC=N0.8588
63C sp2HC=O12727
64C sp2H=CN1.11913
65C sp2H=CO0.481514
66C sp2H=CS−1.0897
67C sp2H=CCl0.441210
68C sp2H=CBr0.632
69C sp2H=CSi2.1711
70C sp2HN=N1.735330
71C sp2HN=O2.2733
72C sp2HO=O0.9244
73C sp2H=NS2.8811
74C sp2C2=C0.810379
75C sp2C2=N1.623430
76C sp2C=CN1.61916
77C sp2C2=O1.088775
78C sp2C=CO1.222726
79C sp2C=CP−0.0911
80C sp2C=CS−0.411410
81C sp2C=CCl0.623924
82C sp2C=CBr1.01125
83C sp2=CN22.9822
84C sp2CN=N2.7577
85C sp2CN=O2.649388
86C sp2C=NO1.2655
87C sp2=CNO−1.2433
88C sp2C=NS0.4666
89C sp2=CNCl3.3563
90C sp2CO=O1.73244210
91C sp2C=OS−0.6132
92C sp2=CS2−0.6611
93C sp2=CF2−1.1411
94C sp2=CCl21.141614
95C sp2N2=N3.3699
96C sp2N2=O3.654340
97C sp2N=NO2.6444
98C sp2N=NS0.7177
99C sp2NO=O2.813836
100C sp2N=OS0.931717
101C sp2NO=S4.2611
102C sp2=NOS0.6833
103C sp2NS=S6.0332
104C sp2=NSCl−5.4422
105C sp2O2=O2.5633
106C aromaticH:C20.3154361136
107C aromaticH:C:N0.538149
108C aromaticH:N20.1766
109C aromatic:C30.89441148
110C aromaticC:C20.791163657
111C aromaticC:C:N0.684230
112C aromatic:C2N1.35164146
113C aromatic:C2N(+)2.099669
114C aromatic:C2:N1.011310
115C aromatic:C2O1.27769453
116C aromatic:C2P3.5353
117C aromatic:C2S−0.193833
118C aromatic:C2Si−0.2511
119C aromatic:C2F0.139941
120C aromatic:C2Cl0.911844550
121C aromatic:C2Br1.24391143
122C aromatic:C2J2.14109
123C aromaticC:N20.771110
124C aromatic:CN:N0.844
125C aromatic:C:NO1.22824
126C aromatic:C:NCl0.91412
127C aromaticN:N21.186036
128C aromatic:N2O1.151111
129C aromatic:N2S−0.688
130C aromatic:N2Cl0.4398
131C spH#C−0.451817
132C spC#C0.671817
133C spC#N0.734643
134C spN#N5.3211
135C sp#NP−5.5811
136C sp=N=S−0.1322
137N sp3H2C−2.181716
138N sp3H2C(pi)1.025753
139N sp3H2N3.5755
140N sp3H2S1.8111
141N sp3HC2−5.941211
142N sp3HC2(pi)−2.389370
143N sp3HC2(2pi)0.086556
144N sp3HCN(pi)0.0254
145N sp3HCN(2pi)1.244
146N sp3HCO(pi)1.1311
147N sp3HCP−4.133
148N sp3HCP(pi)1.5111
149N sp3HCS(pi)−1.5488
150N sp3C3−9.441717
151N sp3C3(pi)−6.395855
152N sp3C3(2pi)−4.824945
153N sp3C3(3pi)−3.6199
154N sp3C2N−5.1211
155N sp3C2N(pi)−2.541514
156N sp3C2N(+)(pi)−1.9372
157N sp3C2N(2pi)−3.843636
158N sp3C2N(3pi)−0.651312
159N sp3C2P011
160N sp3C2P(pi)−2.9711
161N sp3C2P(2pi)−4.0711
162N sp2H=C0.5111
163N sp2C=C−0.975448
164N sp2C=N0.6164
165N sp2=CN0.035449
166N sp2=CN(+)9.7422
167N sp2=CO−3.653026
168N sp2N=N−1.343
169N sp2N=O−2.021313
170N aromatic:C20.54194109
171N aromatic:C:N0.4741
172N(+) sp2CO=O(−)−0.3610476
173N(+) sp2NO=O(−)094
174N(+) sp2O2=O(−)−1.096335
175OHC−0.66143121
176OHC(pi)1.39175159
177OHN(pi)4.1822
178OHP2.1142
179OHSi1.9132
180OC2−4.17139105
181OC2(pi)−2.68392317
182OC2(2pi)−0.92255228
183OCN(pi)0.512016
184OCN(+)(pi)0.16335
185OCN(2pi)3.0788
186OCO(pi)−1.0321
187OCS−0.88116
188OCP−1.218393
189OCP(pi)−0.017054
190OCSi−2.3893
191ONP(pi)4.6511
192OP21.711
193OSi20216
194P4C3=O−5.711
195P4CNO=O1.211
196P4CO2=O1.4733
197P4CO2=S−1.533
198P4CO=OS1.9911
199P4CO=OF1.9411
200P4COS=S−0.8611
201P4NO2=O3.4211
202P4NO2=S1.8833
203P4NO=OS1.222
204P4O3=O0.092929
205P4O3=S−0.43230
206P4O2=OS0.4355
207P4O2=OF−0.1711
208P4O=OS21.5833
209P4O2S=S−0.271817
210P4=OS31.4611
211S2HC−1.0822
212S2HC(pi)1.5411
213S2C2−1.51414
214S2C2(pi)0.44139
215S2C2(2pi)2.822423
216S2CS−0.5842
217S2CS(pi)−2.9821
218S2CP−0.123328
219S2CP(pi)1.7832
220S4C2=O0.622
221S4C2=O22.1333
222S4CN=O23.2699
223S4CO=O2−0.0511
224S4O2=O−0.3522
225S4O2=O20.2433
226S6CF51.9233
227SiC4−1.3333
228SiC3O−0.6574
229SiC2O20.1196
230SiCO3033
231HH Acceptor−1.514745
232(COH)nn > 10.062215
233(COOH)nn > 11.266
234AlkaneNo. of C atoms−0.0526834
235Unsaturated HCNo. of C atoms−0.031512140
236Endocyclic bondsNo. of single bds−0.111109210
ABased onValid groups167 1900
BGoodness of fitR20.9765 1829
CDeviationAverage0.34 1829
DDeviationStandard0.44 1829
EK-fold cvK10 1765
FGoodness of fitQ20.9717 1765
GDeviationAverage (cv)0.37 1765
HDeviationStandard (cv)0.48 1765
Table 4. Example calculation of the logKoa of endosulfan sulphate.
Atom Type | C sp3 | C sp3 | C sp3 | C sp3 | C sp2 | O | S4 | Endocycl. Bonds | Const | Sum
Neighbors | H2CO | HC3 | C3Cl | C2Cl2 | C=CCl | CS | O2=O2 | n C-C | |
Contribution | 2.61 | 0.64 | 0.87 | 1.24 | 0.62 | −0.88 | 0.24 | −0.11 | 1.46 |
n Groups | 2 | 2 | 2 | 1 | 2 | 2 | 1 | 9 | |
n × Contribution | 5.22 | 1.28 | 1.74 | 1.24 | 1.24 | −1.76 | 0.24 | −0.99 | 1.46 | 9.67
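The paper-and-pencil summation of Table 4 can also be written as a short script. This is merely an illustrative sketch of the additive scheme using the contributions and group counts of Table 4; the names `GROUPS`, `CONST` and `group_additivity` are hypothetical and do not reflect how ChemBrain IXL is implemented.

```python
# (atom type, neighbours, contribution, number of groups), copied from Table 4.
GROUPS = [
    ("C sp3", "H2CO", 2.61, 2),
    ("C sp3", "HC3", 0.64, 2),
    ("C sp3", "C3Cl", 0.87, 2),
    ("C sp3", "C2Cl2", 1.24, 1),
    ("C sp2", "C=CCl", 0.62, 2),
    ("O", "CS", -0.88, 2),
    ("S4", "O2=O2", 0.24, 1),
    ("Endocyclic bonds", "single C-C bonds", -0.11, 9),
]
CONST = 1.46  # constant term of the logKoa parameter set (Table 3, entry 1)


def group_additivity(groups, const):
    """Sum n x contribution over all groups and add the constant."""
    return const + sum(contribution * n for _, _, contribution, n in groups)


log_koa = round(group_additivity(GROUPS, CONST), 2)
print(f"logKoa(calc) of endosulfan sulphate = {log_koa:.2f}")
```

The rounded result corresponds to the entry in the Sum column of Table 4. The same function would apply to logPow if the contributions and the constant were taken from Table 2 instead.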
Table 5. Experimental (calculated) logPow and logKoa values of 2- and 3-nitroaniline.
Descriptor | 2-Nitroaniline | 3-Nitroaniline
logPow | 1.85 (1.70) | 1.37 (1.19)
logKoa | 6.46 (5.29) | 7.62 (6.80)
Table 6. Experimental (calculated) logPow and logKoa of hexanoic and 1,6-hexanedioic acid.
Descriptor | Hexanoic Acid | 1,6-Hexanedioic Acid
logPow | 1.92 (1.91) | 0.08 (0.64)
logKoa | 6.31 (6.23) | 10.74 (10.62)
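The statement that the second COOH group tilts the octanol/air equilibrium by a factor of more than 10,000 follows directly from the experimental logKoa values in Table 6. The short check below converts their difference from log units into a ratio of partition coefficients; the variable names are illustrative only.

```python
# Experimental logKoa values from Table 6.
log_koa_hexanoic = 6.31
log_koa_hexanedioic = 10.74

# A difference in log units corresponds to a ratio of 10**difference.
delta_log = log_koa_hexanedioic - log_koa_hexanoic
factor = 10 ** delta_log

print(f"delta logKoa = {delta_log:.2f} log units")
print(f"Koa ratio (diacid / monoacid) = {factor:,.0f}")
```

A difference of 4 log units already corresponds to a factor of 10,000, so the observed difference of about 4.4 log units lies well above that threshold.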
Table 7. Experimental and calculated logKaw of some examples.
Compound | logKaw Exp | logKaw Calc
2-Nitroaniline | −4.77 | −3.59
3-Nitroaniline | −6.49 | −5.61
Hexanoic Acid | −4.531 | −4.32
1,6-Hexanedioic Acid | −11.15 | −9.98
Disclaimer/Publisher’s Note: The statements, opinions and data contained in all publications are solely those of the individual author(s) and contributor(s) and not of MDPI and/or the editor(s). MDPI and/or the editor(s) disclaim responsibility for any injury to people or property resulting from any ideas, methods, instructions or products referred to in the content.

Share and Cite

MDPI and ACS Style

Naef, R.; Acree, W.E., Jr. Calculation of the Three Partition Coefficients logPow, logKoa and logKaw of Organic Molecules at Standard Conditions at Once by Means of a Generally Applicable Group-Additivity Method. Liquids 2024, 4, 231-260. https://doi.org/10.3390/liquids4010011
